ulisp-esp
ulisp-esp copied to clipboard
suggestion to simplify gfuns
I was looking at the code and and noticed that all of the gfun_t's (at least the ones it is common to (read) from) have this block of code:
if (LastChar) {
char temp = LastChar;
LastChar = 0;
return temp;
}
I implemented my own gfun_t to read hard-coded data on startup and noticed that it caused reader errors when I commented out the above block in the gfun. (So, I'm guessing your parser would be classified as LL(1).)
To eliminate the need to include this block in each and every gfun_t, why not replace it with something like this to "wrap" an arbitrary gfun:
int mygetch (gfun_t g) {
if (LastChar) {
char temp = LastChar;
LastChar = 0;
return temp;
}
return g();
}
That way, it would allow you to read from any available gfun, including I2C and SPI (which currently don't include the LastChar lookahead-enable block). Then you could read uLisp code off a raw SPI or I2C flash chip.
I'm guessing your parser would be classified as LL(1)
That terminology is new to me. Is it equivalent to saying that the parser can look ahead one character?
Great idea to provide a general solution to this; I'll think about whether this has any downsides.
By the way, would this solve the first bug described by @jimsbugs here?
http://forum.ulisp.com/t/two-bugs-serial-stream-input-lognot-documentation-string/1180
Also, thinking about it a bit more - shouldn't there be a unique LastChar for each stream you're reading from, in case you're reading from two concurrently?
By the way, would this solve the first bug described by @jimsbugs here?
Probably, because all he was adding to Serial1, Serial2, and Serial3 functions was the if (LastChar)... block I noted above.
That terminology is new to me. Is it equivalent to saying that the parser can look ahead one character?
Technically one token, but since the tokens the reader is concerned about are only one character, there isn't much difference. See Wikipedia.
Also, thinking about it a bit more - shouldn't there be a unique LastChar for each stream you're reading from, in case you're reading from two concurrently?
Probably yes, but I'm not sure how to do that concisely. You can't even use a C++ closure here, because it can only be cast to a gfun_t if it doesn't close over any variables which ends up defeating the purpose (to close over its own unique LastChar).
On second thought you could define a typedef struct { gfun_t gfun; char LastChar; } stream_t; and pass that around to mygetch() (and maybe even an extended version of that as a pointer in the cdr cell of a stream object instead of just a number which has to be looked up by two functions to get the gfun and pfun).
typedef const struct {
gfun_t gfun;
pfun_t pfun;
const char* streamname PROGMEM;
union {
int address;
int port;
const char* filename PROGMEM;
};
} stream_t;
stream_t serialstream = {
.pfun = pserial;
.gfun = gserial;
.streamname = "serialstream";
};
// ETC...
Or to have multiple files open at once
typedef int (*pfun_t)(stream_t, char);
typedef void (*gfun_t)(stream_t);
// pass in the stream_t so you can generalize the function to multiple files
// the stream_t can live on the stack in the with-XXX forms because those don't
// support tail recursion
// and in
struct sobject {
..... // in cdr
stream_t stream;
.....
}
// maybe also:
typedef int (*sfun_t)(stream_t, bool, int); // to seek the stream
typedef bool (*cfun_t)(stream_t); // to close the stream
something á la funopen(3) but pared down to fit on low-memory microcontrollers
That all looks very clever. I think I'll go with your first suggestion for the time being, and think about the advanced solution later.
The first solution is an elegant refactoring that, as far as I can tell, fixes the problem I encountered. While I am not overly familiar with the uLisp implementation, I suspect the main problem with a single occurrence of the lastChar global is the possibility of error handling causing a non-local jump from within a with-XXX stream read to outside the stream's scope and leaving cruft in the lastChar that contaminates a subsequent read from a different stream. Unfortunately, I don't have time just now to try tracing the possible paths through the code to see if this can actually happen.
Unfortunately, I don't have time just now to try tracing the possible paths through the code to see if this can actually happen.
TBH I think that any such case where you'd get part of one stream and part of another via the LastChar would be in the case of malformed syntax in both streams, so behavior does't really matter at that point because you'd get an error anyway.
I think it would be a problem in this case:
Suppose stream1 contained "12 34 56" and stream2 contained "(12 23)(34 45)"
You do (read) from stream1. LastChar is now ' ' (space). You do (read) from stream2. It gets " (12 23)" and LastChar is now '('. You do (read) again from stream1. It will effectively read "(34 ..." and get stuck.
(I haven't tested this - thought experiment).
Ah - good point. It looks like the following sequence causes problems. Assuming your sample input:
(read stream1) - LastChar is now ' ' (space)
(read-byte stream2) - ' ' (space) from stream1 is returned, rather than ( from stream2
The data from the streams has gotten mixed up. (Also not tested.)
Tested my theory - the sequence above does not fail. Reading the code, I found a place where a close paren can get stored in LastChar. The following example causes data to be read from the wrong stream:
22217> (defun show-file (f) (with-sd-card (str f) (loop (let ((z (read-line str))) (if z (format t "~a~%" z) (return))))))
show-file
22217> (with-sd-card (str "f1" 2) (format str "(11111)~%"))
nil
22217> (show-file "f1")
(11111)
nil
22217> (with-sd-card (str "f2" 2) (format str "22222~%"))
nil
22217> (show-file "f2")
22222
nil
22217> (progn (with-sd-card (s1 "f1") (print (read-byte s1)) (print (read s1))) (with-sd-card (s2 "f2") (print (read s2))) nil)
40
11111
Error: 'read' incomplete list
22217> (progn (with-sd-card (s1 "f1") (print (read-byte s1)) (print (read s1))) (with-sd-card (s2 "f2") (print (read-byte s2))) nil)
40
11111
41
nil
22217>
Note that the second last form fails because it encounters the close paren left over from s1. This is shown more clearly in the final form where the first character encountered from s2 is 41 - a close paren - when it should be a 2 (50).
On a related note, with-sd-card in the current implementation can only have one read stream and one write stream open at a given point of execution because the global variables SDpfile and SDgfile can each only reference a single file at a time. Specifically:
(with-sd-card (s1 "f1") (with-sd-card s2 "f2" (read-byte s1)))
returns the first byte of the file f2, not the first byte of f1. As far as I can see, this restriction is not documented.
I think it's reasonable for uLisp to impose limitations, such as "only one read stream at a time", but it should give an error message rather than garbage output; for example "error: too many read streams open". What do you think?
Similarly for your point about with-sd-card.
I think I am fairly close to solving the "only one read stream at a time" problem using @dragoncoder047's wrapper suggestion. Instead of using a lambda, the idea is to use a class that overloads function calls. The current implementation of the class looks like this (it's a little messy because I'm still debugging it):
class gfun_t {
int (*next)();
bool haveLastChar = false;
int lastChar = 0;
public:
gfun_t(int (*n)()) : next(n) { }
int operator() () {
int ch;
if (haveLastChar) {
haveLastChar = false;
ch = lastChar;
}
else {
ch = next();
}
/* Serial.print("stream returning '");
Serial.print(char(ch));
Serial.println("'");*/
return ch;
}
void ungetc(int last) {
/* if (haveLastChar) signal error;*/
/* Serial.print("stream ungetc '");
Serial.print(char(last));
Serial.println("'");*/
lastChar = last;
haveLastChar = true;
}
};
It's a bit of a hack, but simple cases are working. I ran into a bit of difficulty with gserial which is sometimes called directly, and sometimes called through a gfun_t - I think it needs to be handled as a singleton to work properly. Similarly, gfun_t parameters need to be passed by reference, but I'm not sure I have fixed all of them. I'm pretty sure I'm going to have to start from scratch and make the changes again now that I have some experience with it. Does this approach look okay to you?
Overall, I agree that uLisp should impose some limitations. The limit of one with-sd-card read stream and one with-sd-card write stream seems reasonable to me. Having appropriate error messages would help.
This is a good solution, but I've been thinking of an alternative which might be simpler in the long run.
The Arduino stream functions, like Serial, SPI, and I2C, all inherit from the Stream class that supports a peek() function. The parser could use peek() to look ahead, rather than LastChar to backtrack, and each peek() would then automatically be distinct for each open stream.
It would involve totally rewriting the parser to use peek() rather than LastChar, and I would need to provide peek() for streams generated in uLisp, such as for read-from-string, but I think it should be possible.
Ah - if rewriting the parser is an option, then I'd be inclined to use peek(). Presumably peek() on a string stream would, more or less, be just be the initial part of a read() call that doesn't bump the index along(?).
I had another idea:
Instead of special forms (with-spi), (with-i2c), (with-serial), etc, I propose the interfaces be accessed using Unix-style device pseudo-files/folders that can be used to access the parameters.
For example,
(with-i2c (str 1 #x71 t) ...)-->(with-stream (str "/dev/i2c1" #x71 t) ...)(with-spi (str 2) ...)-->(with-stream (str "/dev/spi" 2) ...)(with-serial (str 3 96) ...)-->(with-stream (str "/dev/serial3" 96) ...)(with-sd-card (str "foo.txt" 2) ...)-->(with-stream (str "/sd/foo.txt" 2) ...)(with-gfx (str) ...)-->(with-stream (str "/dev/display") ...)(with-client (str "www.example.org" 80) ...)-->(with-stream (str "http://www.example.org:80") ...)
Each stream type could be registered with a prefix and a getter function, that returns a stream object which would be self contained as discussed above.
#define MAXSTREAMS 32
typedef object* (*openfun_t)(const char*, object*);
struct {
PGM_P prefix;
openfun_t opener;
} Openers[MAXSTREAMS];
int NumStreams = 0;
void registerstream (PGM_P prefix, openfun_t opener) {
if (NumStreams > MAXSTREAMS - 1) error2(PSTR("too many streams"));
Openers[NumStreams].prefix = prefix;
Openers[NumStreams].opener = opener;
NumStreams++;
}
object* getstream (object* args) {
char pathchars[BUFFERSIZE];
object* path = checkstring(first(args));
object* params = rest(args);
cstring(path, pathchars, BUFFERSIZE);
for (int i = 0; i < NumStreams; i++) {
PGM_P pr = Openers[i].prefix;
if (strncmp_P(pathchars, pr, strlen_P(pr)) == 0) return Openers[i].opener(pathchars, params);
}
error(PSTR("no such stream"), path);
return NULL;
}
// Full disclosure I have not tried to compile this
// then register all the streams
registerstream(PSTR("/dev/i2c"), [](const char* foo, object* params) -> object* {
// insert the code from (with-i2c)
// also bail if i2c is already open on that port
// streams can use sscanf()/strncpy()/atoi()/etc to extract parameters from the stream
});
// ETC
And that way, users could register their own streams for extensions too (and then #68 could be provided as an extension) in the same sort of way I provided addtable() in my fork.
Hmmm ... thanks, but I'll need to think about this.
Once the new streams API is implemented, I think (peek-char) (using the new equivalent of LastChar) would be a good addition as it allows people to implement their own LL(1) parsers - which would be necessary for a JSON parser. Lots of HTTP protocols use JSON to communicate but uLisp can't parse a JSON stream easily without using lookahead.
Yes, good idea.
How's the new stream thing coming? Still pre-pre-pre-pre-pre-pre-alpha stage, or is it more in progress?
Thanks for the reminder! Still on my to-do list, but low priority.
One other thing: I was wondering if something like CL's :if-does-not-exist argument to (open) could be mirrored in uLisp. It would be helpful in something like (with-i2c), because if you are implementing an I2C scanner, it would need :if-does-not-exist nil mode, but in other cases where you are expecting a device at that address and need to communicate with it, :if-does-not-exist :error would be a more appropriate mode.