portable `(save-image)`
Currently save-image just compacts the used cells into the smallest contiguous block of memory possible and then dumps the workspace as a binary blob. If you save this blob to an SD card and try to use it on multiple platforms running different revisions of uLisp, it won't work (memory/workspace locations, sizeof(void*) differences, builtin symbol indexes, etc.).
Could a portable format for storing "compressed" images be developed? Then if the save-image target is the SD card, the image could be written in the portable format. That way, if I develop a program on one microcontroller, get it working, and then transfer the image to another microcontroller with a different feature set, it will still work.
My idea for a portable format would be to save the bit size of the microcontroller, so pointers can be resized appropriately, and to save only the offset of each object from the beginning of the array, so the image can be loaded regardless of the value of &Workspace[0] (i.e. instead of writing (uintptr_t)obj, write ((uintptr_t)obj - (uintptr_t)&Workspace[0]) / sizeof(struct sobject)).
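Here is a minimal sketch of that offset encoding, just to make the idea concrete. The `object` struct is a simplified stand-in for uLisp's cell type, and `ptr_to_index`, `index_to_ptr` and the choice of reserving index 0 for nil are my own assumptions, not anything in the uLisp source. It also assumes every field being converted really is a pointer into the workspace; immediate values such as numbers or symbols would need their own handling.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for a uLisp cell: two pointer-sized fields. */
typedef struct sobject {
  struct sobject *car;
  struct sobject *cdr;
} object;

#define WORKSPACESIZE 64
static object Workspace[WORKSPACESIZE];

/* Convert a workspace pointer to a position-independent index.
   Index 0 is reserved for nil (assumption), so cell i is stored as i + 1. */
uint32_t ptr_to_index(const object *obj) {
  if (obj == NULL) return 0;
  /* Pointer subtraction already yields a count of cells, not bytes. */
  ptrdiff_t i = obj - &Workspace[0];
  assert(i >= 0 && i < WORKSPACESIZE);
  return (uint32_t)i + 1;
}

/* Convert a stored index back to a pointer on the loading platform. */
object *index_to_ptr(uint32_t index) {
  if (index == 0) return NULL;
  assert(index <= WORKSPACESIZE);
  return &Workspace[index - 1];
}
```

Because the stored value is a cell count rather than a byte offset, the same image can be read on a platform where sizeof(struct sobject) differs, as long as the loader knows how wide each stored index is.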
The other problem is it will still clobber over existing code when it is loaded. If only there were a way to compile a module file on a SD card, and then only load the compiled binary image of the module without clobbering the existing code!
Hi, good to hear from you again.
This is a great idea, but I can think of a number of things that would make it very hard to implement, some of which you've identified. One is that the builtin symbols are different between different versions and revisions of uLisp. I wonder if it's worth the effort?
An alternative would be to do (pprintall) to an SD card, and then eval it all back in when you want to load it. That also has the advantage that it doesn't clobber existing code (assuming no name clashes). This could be made simpler with a couple of extra functions.
What I was going for is to be able to only store the (compiled, compressed) images on the SD card, and not have to store the source code. Kind of the same way Python reads .py files and writes the compiled bytecode to .pyc files which it uses if the original source hasn't changed. A quirk of this is you can distribute the .pyc file only and it will run fine!
The problem of the builtin symbols being different can be handled by storing every symbol as a non-builtin symbol, i.e. a base40 or long symbol. The loader can then check each symbol and update it if it's a builtin on the target platform. That way, it can properly report Error: undefined: some-system-specific-thing when the code is moved to a platform that doesn't have some-system-specific-thing built in.
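Roughly, the loader-side check could look like the sketch below. The table contents and the names `builtin_names` and `resolve_builtin` are hypothetical; the real builtin table in uLisp is laid out differently and varies per platform, which is exactly why the image would store names rather than indexes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical builtin table for the platform doing the loading. */
static const char *const builtin_names[] = { "nil", "t", "car", "cdr", "cons" };
enum { BUILTIN_COUNT = sizeof builtin_names / sizeof builtin_names[0] };

/* Return this platform's builtin index for a symbol name read from the
   image, or -1 if the name is not a builtin here. In the -1 case the
   loader would keep it as an ordinary symbol, so evaluating it later
   reports it as undefined. */
int resolve_builtin(const char *name) {
  assert(name != NULL);
  for (int i = 0; i < BUILTIN_COUNT; i++) {
    if (strcmp(name, builtin_names[i]) == 0) return i;
  }
  return -1;
}
```

With this table, resolve_builtin("car") gives the local index for car, while a name that only exists on the saving platform falls through to -1.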
To be able to handle loading a file and not clobbering the existing workspace, it could also be serialized only in reference to itself. You could probably use something like cl-conspack as a starting point...
If you feel like tackling it I think it would be a great addition to uLisp! I'm currently preoccupied with designing a good Lisp screen editor for the T-Deck: LilyGO T-Deck uLisp Machine.
Okay, here's a draft:
object *resurrect(gfun_t gfun) {
  unsigned char op = gfun();
  switch (op) {
    case PAIR: {
      // Read the car before the cdr explicitly: C does not specify
      // the order in which function arguments are evaluated.
      object *first = resurrect(gfun);
      object *rest = resurrect(gfun);
      return cons(first, rest);
    }
    case STRING: return readstring('\0', gfun);
    case NUMBER: return number(/* read number of bytes of a float */);
    case 0: return nil;
    // add more here etc.
    default: error2(PSTR("corrupted object dump"));
  }
}
The dump function would basically be the reverse. Unfortunately, it would recurse infinitely on a circular object, so I'm going to have to think about how to detect that and either bail out or properly set the reference.
Create circular objects at your own peril!
I think I figured out how to automatically bail on circular objects:
bool dump(object *obj, pfun_t pfun) {
  if (obj == nil) {
    pfun(0);
    return true;
  }
  // Already marked means we are revisiting an object on the current path
  if (marked(obj)) return false;
  // Read type and car before mark(), which alters the first word of the cell
  int type = obj->type;
  object *aaa = car(obj);
  mark(obj);
  switch (type) {
    case STRING:
      pfun(STRING);
      unmark(obj);
      prin1object(obj, pfun);
      break;
    // etc.
    default:
      pfun(PAIR);
      if (!dump(aaa, pfun)) goto error;
      if (!dump(cdr(obj), pfun)) goto error;
      break;
  }
  unmark(obj);
  return true;
error:
  // Unmark on the way out so the workspace is left clean
  unmark(obj);
  return false;
}
It would function like the existing markobject() and stop recursing and return false if it finds a circular reference. It can't throw an error because that would longjmp out and the objects wouldn't get unmarked.
Great!
Looking at other object-serialization libraries that support circular objects, I think I came up with a solution that CAN correctly serialize circular objects.
First, a function `size_t pointers2(object *obj)` that scans the Workspace and counts the number of pointers that point to obj.
The serializer would do a couple of extra checks when it serializes a cons pair:
- If the object is in the already-seen assoc list (i.e. `assoc(obj, seen)` is not nil), it would emit a `BACKREF` opcode with the key value in the list and stop. (The key can just be the memory address of the object itself, to guarantee it's unique when the deserializer sees it.)
- If it is not in the seen list, it would check whether the object has multiple pointers to it (i.e. `pointers2(obj) > 1`), and if so it would put it in the list and emit a `CIRCULAR_PAIR` opcode instead of the usual `PAIR`.
The good part about this is that it also preserves object identity: even if an object is not circular, it can still be referenced multiple times. For example, the list '(#1=(a) #1#) doesn't point to itself and will normally be printed as ((a) (a)), but you'll still get ((b) (b)) if you do (setf (caar x) 'b), and this sharing will be preserved when the object is deserialized.
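To check that the back-reference idea holds together, here is a small standalone sketch outside of uLisp. It simplifies the scheme above in two ways, both of which are my assumptions: every pair is entered in the seen table (so there is no pointers2 scan and no separate CIRCULAR_PAIR opcode), and the back-reference key is the pair's position in the table rather than its memory address. The `cell` and `stream` types and all function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum { OP_NIL, OP_ATOM, OP_PAIR, OP_BACKREF };
enum { POOL = 32, MAXSEEN = 16, BUFSIZE = 64 };

/* atom != 0 means a one-character leaf; otherwise the cell is a pair. */
typedef struct cell { char atom; struct cell *car, *cdr; } cell;

static cell pool[POOL];
static int used;

cell *alloc_cell(char atom, cell *car, cell *cdr) {
  assert(used < POOL);
  cell *p = &pool[used++];
  p->atom = atom; p->car = car; p->cdr = cdr;
  return p;
}

typedef struct {
  unsigned char buf[BUFSIZE];
  int len, pos;
  cell *seen[MAXSEEN];   /* pairs in the order they were first visited */
  int nseen;
} stream;

void put(stream *s, unsigned char b) { assert(s->len < BUFSIZE); s->buf[s->len++] = b; }
unsigned char get(stream *s) { assert(s->pos < s->len); return s->buf[s->pos++]; }
void rewind_stream(stream *s) { s->pos = 0; s->nseen = 0; }

void dump(stream *s, cell *obj) {
  if (obj == NULL) { put(s, OP_NIL); return; }
  if (obj->atom) { put(s, OP_ATOM); put(s, (unsigned char)obj->atom); return; }
  /* A pair seen before is written as a reference to its table slot. */
  for (int i = 0; i < s->nseen; i++) {
    if (s->seen[i] == obj) { put(s, OP_BACKREF); put(s, (unsigned char)i); return; }
  }
  assert(s->nseen < MAXSEEN);
  s->seen[s->nseen++] = obj;   /* register before recursing, so cycles terminate */
  put(s, OP_PAIR);
  dump(s, obj->car);
  dump(s, obj->cdr);
}

cell *load(stream *s) {
  switch (get(s)) {
    case OP_NIL: return NULL;
    case OP_ATOM: return alloc_cell((char)get(s), NULL, NULL);
    case OP_BACKREF: {
      unsigned char i = get(s);
      assert(i < s->nseen);
      return s->seen[i];
    }
    case OP_PAIR: {
      /* Register the new pair in the same order dump() did, before
         reading its fields, so a BACKREF inside it can resolve. */
      cell *p = alloc_cell(0, NULL, NULL);
      assert(s->nseen < MAXSEEN);
      s->seen[s->nseen++] = p;
      p->car = load(s);
      p->cdr = load(s);
      return p;
    }
    default: assert(!"corrupted object dump"); return NULL;
  }
}
```

Dumping the equivalent of '(#1=(a) #1#), calling rewind_stream, and loading it back gives a new list whose first and second elements are the same cell, so changing the atom through one is visible through the other. Registering every pair costs table space, which is where the pointers2 check would help on a small microcontroller.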