ulisp-esp icon indicating copy to clipboard operation
ulisp-esp copied to clipboard

Error in `markobject` causing segfault?

Open dragoncoder047 opened this issue 1 year ago • 32 comments

I'm using uLisp with a few custom functions and patches provided by other people and myself. I ran some code and got a Guru Meditation Error, and the backtrace pointed to the obj = cdr(obj) line in markobject that handles strings and longsymbols:

void markobject (object *obj) {
  MARK:
  if (obj == NULL) return;
  if (marked(obj)) return;

  object* arg = car(obj);
  unsigned int type = obj->type;
  mark(obj);

  // snip for brevity

  if ((type == STRING) || (type == SYMBOL && longsymbolp(obj))) {
/*Crashed here>>>>>>>*/    obj = cdr(obj);
/*Guru Meditation Error: Core  1 panic'ed (LoadProhibited). Exception was unhandled.*/
    while (obj != NULL) {
      arg = car(obj);
      mark(obj);
      obj = arg;
    }
  }
}

Do you know what is going on here? It may be related to the custom patches I added so if that is where the problem lies I must have made a mistake somewhere applying the patches.

The fact that it is crashing getting the cdr of an object means the object's cdr is probably null and it tried to dereference a null pointer. Any other ideas? How do I fix it?

dragoncoder047 avatar Aug 15 '22 13:08 dragoncoder047

Version 4.0 of uLisp made quite a substantial change to the way that symbols are implemented, using the same representation for long symbols as for strings, making the symbol table unnecessary.

The line where you're getting the error:

if ((type == STRING) || (type == SYMBOL && longsymbolp(obj))) {

is checking for a string, or a symbol that is using the new long symbol representation.

Are you sure that the patches you've incorporated are compatible with Version 4.0 of uLisp? If you send me the additions I can have a quick look at them to see if there's anything obvious that might be causing the problem.

David

technoblogy avatar Aug 15 '22 15:08 technoblogy

Some of the patches were for an older version of uLisp, and I did have to make some judicious changes to the code to make it work. I figure I probably made an error there. I'll work on getting a fork of ulisp-esp and pasting in my changes so you can look at the diff.

Also, the error was not on the if line, it was the line immediately following that, obj = cdr(obj).

dragoncoder047 avatar Aug 15 '22 15:08 dragoncoder047

Have a fork now: https://github.com/dragoncoder047/ulisp-esp32

Specifically what is happening to cause the crash is I go here and try to run the code Dave Astels provided with his macro functions. The actual definitions run fine, but when it gets to (logger:define-level :emerg 1) things start getting weird. When it runs it successfully finishes and echoes log:emerg to the screen, but then repl does a garbage collect and suddenly it crashes. That's my error.

dragoncoder047 avatar Aug 15 '22 15:08 dragoncoder047

I would suspect the Dave Astels code, as it was written for a much earlier version of uLisp.

If the crash only occurs on a garbage collection, the most likely explanation is that some code is creating a temporary Lisp structure in memory that gets garbage collected. The solution is to push a pointer to the temporary structure on GCStack, and then pop it off again later. For examples see functions such as fn_sort().

technoblogy avatar Aug 15 '22 16:08 technoblogy

aha. I will check for memory un-leaks and push to the GCStack. (Did uLisp 3 not have a GCStack?)

dragoncoder047 avatar Aug 15 '22 17:08 dragoncoder047

GCStack has been there from the beginning, but perhaps a problem has arisen due to the change to long symbols.

technoblogy avatar Aug 15 '22 17:08 technoblogy

I found some recursive calls and added a push and pop pair around them. Testing now...

Also, how judicious do I need to be with the GCStack? (i.e. When/where would it be necessary to push an object to it?)

dragoncoder047 avatar Aug 15 '22 17:08 dragoncoder047

arduino IDE is having issues right now connecting to my esp32. Pushing changes anyway so you can have a look.

dragoncoder047 avatar Aug 15 '22 17:08 dragoncoder047

Also, how judicious do I need to be with the GCStack? (i.e. When/where would it be necessary to push an object to it?)

You need to push an object to GCStack if the garbage collector wouldn't otherwise find it and mark it, and if you are calling eval() while the object is being created, because eval() might invoke the garbage collector.

technoblogy avatar Aug 15 '22 19:08 technoblogy

Well then, I guess I hit everything. I don't know whether it works because my ESP32 is still having issues connecting, but I suppose it will.

Another unrelated idea: Another program I use (Golly) has dynamic garbage collection, and instead of using a GCStack to prevent temporary values from being garbage collected, it has a global flag that when set, prevents garbage collection altogether (i.e. this). This might help in time-critical operations (such as (time) which shouldn't be affected by a random garbage collection).

dragoncoder047 avatar Aug 15 '22 21:08 dragoncoder047

I got the arduino IDE going again and made a few more changes... pushed them a few minutes ago. Still get the same error during a gc. Do you see any more places to put pushes to GCStack, or is this actually a bug in the garbage collector?

dragoncoder047 avatar Aug 18 '22 17:08 dragoncoder047

Are you getting errors during GC with a vanilla copy of ESP uLisp Version 4.1, or only when you add third-party extensions?

is this actually a bug in the garbage collector?

I'm not aware of any bug in the garbage collector, and I've done extensive testing.

technoblogy avatar Aug 18 '22 18:08 technoblogy

Are you getting errors during GC with a vanilla copy of ESP uLisp Version 4.1, or only when you add third-party extensions?

No, vanilla uLisp is bug-free. This issue originally started because I naively assumed that all the third-party extensions included the proper garbage collection code, and it's obvious now they don't. Either Dave Astels didn't put in the proper garbage collection code in the first place, or something else has changed. I'm inclined to assume the latter.

I should probably change the title of this issue now that I know it's a bug in my code and not in yours. I don't know what to change it to, so if you do, please change it.

dragoncoder047 avatar Aug 18 '22 19:08 dragoncoder047

I've added a question mark.

technoblogy avatar Aug 18 '22 20:08 technoblogy

Now I'm hesitant to ask you this, but I still have no idea what is going wrong. I have simplified Dave Astels' code down to this:

(defmacro foo (aa bb) `(defmacro ,aa () `(princ ,,bb)))
(foo 'bar "baz") ; << crashes here
(bar) ; << Should print "baz"

which produces the same error. It appears to be due to the double nested quasiquotes, so I added GCStack pushes to the recursive calls in process_quasiquoted but to no avail.

Even from looking at fn_sort it's unclear how GCStack should be handled with recursive functions such as process_quasiquoted and I know I missed something, but for the life of me I cannot figure out what. Would you mind taking a look at the modifications I have made an see if you can spot where I am missing GCStack pushes? You're more experienced at knowing when and when not this would be necessary than I am.

dragoncoder047 avatar Aug 18 '22 23:08 dragoncoder047

Are you sure it's garbage collection that's at fault, and not the code itself. Disable the garbage collector by commenting out, in repl():

//gc(NULL, env);

and in eval():

//if (Freespace <= WORKSPACESIZE>>4) gc(form, env);

and run the code with plenty of memory to make sure it works OK then.

technoblogy avatar Aug 19 '22 06:08 technoblogy

Are you sure it's garbage collection that's at fault, and not the code itself?

I'm pretty sure it is, considering that my 'minimal example' above prints out bar and then crashes, and the backtrace always points to that same line in markobject() I mentioned in the OP, but what the heck, I'll try that.

dragoncoder047 avatar Aug 19 '22 13:08 dragoncoder047

I'm not very good at debugging other people's code, and I don't really understand Dave Astel's routines, but I had a look at it and couldn't see anything obviously wrong.

technoblogy avatar Aug 19 '22 13:08 technoblogy

Aha! Maybe is something else. I also uncommented the Serial.println debugging statements and they produced this output:

8990> (defmacro foo (aa bb) `(defmacro ,aa () `(princ ,,bb)))
foo

8956> (foo 'bar "baz")
**** Processing quasiquote of : (defmacro (unquote aa) nil (quasiquote (princ (unquote (unquote bb)))))
**** at level 1
Processing something else
**** At level 1
**** Processing quasiquote of : defmacro
**** at level 1
**** Processing quasiquote of : (unquote aa)
**** at level 1
**** Processing UNQUOTE
**** At level 1
**** Processing quasiquote of : aa
**** at level 1
**** Result: bar
**** Processing quasiquote of : nil
**** at level 1
**** Processing quasiquote of : (quasiquote (princ (unquote (unquote bb))))
**** at level 1
nested quasiquote
**** Processing quasiquote of : (princ (unquote (unquote bb)))
**** at level 2
Processing something else
**** At level 2
**** Processing quasiquote of : princ
**** at level 2
**** Processing quasiquote of : (unquote (unquote bb))
**** at level 2
**** Processing UNQUOTE
**** At level 2
**** Processing quasiquote of : (unquote bb)
**** at level 1
**** Processing UNQUOTE
**** At level 1
**** Processing quasiquote of : bb
**** at level 1
**** Result: "baz"
**** parts: (((Guru Meditation Error: Core  1 panic'ed (LoadProhibited). Exception was unhandled.

Backtrace now points to plist>printobject>plist>printobject>printsymbol>psymbol>this line in plispstr:

void plispstr (symbol_t name, pfun_t pfun) {
  object *form = (object *)name;
  while (form != NULL) { // <<<<<<<<<<<<<<<<<<<<<<<<<<<<<< This line
    int chars = form->chars;
    for (int i=(sizeof(int)-1)*8; i>=0; i=i-8) {
      char ch = chars>>i & 0xFF;
      if (tstflag(PRINTREADABLY) && (ch == '"' || ch == '\\')) pfun('\\');
      if (ch) pfun(ch);
    }
    form = car(form);
  }
}

Well, now there's either two bugs or something isn't being initialized properly.

dragoncoder047 avatar Aug 19 '22 13:08 dragoncoder047

I'm not very good at debugging other people's code, and I don't really understand Dave Astel's routines, but I had a look at it and couldn't see anything obviously wrong.

Now that you've said that, maybe @dastels can lend a little help porting his macro code to uLisp 4 because he'd be the one who knows what's going on.

dragoncoder047 avatar Aug 19 '22 13:08 dragoncoder047

And I also forgot to mention this earlier -- @technoblogy if it is possible (I can't) this issue should probably be moved to my repository as opposed to this one because it's obviously a bug in my code and not yours.

dragoncoder047 avatar Aug 19 '22 14:08 dragoncoder047

this issue should probably be moved to my repository

Sorry, I'm not sure how to do that in GitHub.

technoblogy avatar Aug 19 '22 14:08 technoblogy

Sorry, I'm not sure how to do that in GitHub.

I have been able to move issues between other repositories I own (there would be a "-> Transfer this issue to another repository" option at the top right of the thread) but you may have to own or have write access to the repository you are trying to transfer it to. I wouldn't know.

dragoncoder047 avatar Aug 19 '22 14:08 dragoncoder047

I took another look and noticed some missing GCStack pushes in the section added to eval() that handles macros, and yet it still produces this bug. @dastels, would you mind lending some insight as to why your macro functionality which worked in uLisp 3 suddenly doesn't in uLisp 4, and how I may be able to fix it?

dragoncoder047 avatar Aug 20 '22 16:08 dragoncoder047

GCStack has been there from the beginning, but perhaps a problem has arisen due to the change to long symbols.

Now that I think of it, long symbols existed in @dastels's code because his examples used symbols longer than 3 characters. I think I found a mandelbug.

dragoncoder047 avatar Sep 22 '22 21:09 dragoncoder047

From Version 1.8 onwards, platforms with more than 2Kbytes of RAM could have long symbols, using a symbol table. From Version 4.0 onwards I eliminated the need for the symbol table by using uLisp's string handling to store the long symbols in the workspace. It's the Version 4.0 change that I think may have broken @dastel's code.

Are you sure that @dastel's code was finished and fully functional? Do you have examples showing it working on an earlier version of uLisp? If not, I'm not sure it's worth the effort to try and get it working.

PS I don't believe in mandelbugs!

technoblogy avatar Sep 23 '22 07:09 technoblogy

Are you sure that @dastel's code was finished and fully functional? Do you have examples showing it working on an earlier version of uLisp? If not, I'm not sure it's worth the effort to try and get it working.

According to what he said on the forum, yes it did work on the older version. My "pared-down example" above is virtually the same -- the gist of it is it has two nested quasiquotes. I think the problem is somewhere in there.

On second thought, and considering that, again, my example prints bar successfully and then crashes, might mean the bug may not be in process_quasiquoted at all.

dragoncoder047 avatar Sep 23 '22 11:09 dragoncoder047

I don't understand that example, and it gives an error on LispWorks Common Lisp.

technoblogy avatar Sep 23 '22 13:09 technoblogy

it gives an error on LispWorks Common Lisp.

Probably because:

  • he used @ as a replacement for ,@ since he didn't bother with the the LastChar jiggery needed for that;
  • Full Common Lisp thinks it's trying to write to a package logger: which doesn't exist.

I tried my minimal example on an online GNU Clisp runner, and it said The name of a macro must be a symbol, not 'BAR. Perhaps Dave's code is bugged in this regard?

dragoncoder047 avatar Sep 23 '22 13:09 dragoncoder047

I don't quite understand why you're keen to get this working.

technoblogy avatar Sep 23 '22 13:09 technoblogy