Carp icon indicating copy to clipboard operation
Carp copied to clipboard

Allocation made in match-ref branch do not get freed properly

Open TimDeve opened this issue 5 years ago • 13 comments

When we create a value in the branch of a match-ref, delete is not called on that value.

Example with a string:

(deftype Sum One Two)

(defn main []
    (let [state @"Ok" sumt &(Sum.One)]
     (match-ref sumt
      Sum.One (println* &@&state)
      Sum.Two ())))
Emitted C code
int main(int argc, char** argv) {
    carp_init_globals(argc, argv);
    /* let */ {
        static String _6 = "Ok";
        String *_6_ref = &_6;
        String _7 = String_copy(_6_ref);
        String state = _7;
        Sum _11 = Sum_One();
        Sum* _12 = &_11; // ref
        Sum* sumt = _12;
        if(sumt->_tag == Sum_One_tag) {
            Sum* _15_temp = sumt;
            // Case expr:
            String* _24 = &state; // ref
            String _25 = String_copy(_24);
            String* _26 = &_25; // ref
            String _27 = String_str(_26);
            String* _28 = &_27; // ref
            IO_println(_28);
        }
        else if(sumt->_tag == Sum_Two_tag) {
            Sum* _15_temp = sumt;
            // Case expr:
            /* () */
        }
        else {
          // This will not be needed with static exhaustiveness checking in 'match' expressions:
          fprintf(stderr, "Unhandled case in 'match' expression at line 7, column 6 in '/home/tim/dev/sandbox/matc/main.carp'\n");
          exit(1);
        }
        String_delete(state);
        Sum_delete(_11);
    }
    return 0;
}

Example with a lambda:

(deftype Sum One Two)

(defn main []
  (let [state 0]
   (match-ref &(Sum.One)
    Sum.One (println* ((fn [] @&state)))
    Sum.Two ())))
Emitted C code
int main(int argc, char** argv) {
    carp_init_globals(argc, argv);
    /* let */ {
        int state = 0;
        Sum _10 = Sum_One();
        Sum* _11 = &_10; // ref
        if(_11->_tag == Sum_One_tag) {
            Sum* _11_temp = _11;
            // Case expr:
            // This lambda captures 1 variables: state
            _Lambda_NAKED_LAMBDA_23_env *_23_env = CARP_MALLOC(sizeof(_Lambda_NAKED_LAMBDA_23_env));
            _23_env->state = state;
            Lambda _23 = {
              .callback = (void*)_Lambda_NAKED_LAMBDA_23,
              .env = _23_env,
              .delete = (void*)_Lambda_NAKED_LAMBDA_23_env_delete,
              .copy = (void*)_Lambda_NAKED_LAMBDA_23_env_copy
            };
            int _24 = _23.env ? ((int(*)(LambdaEnv))_23.callback)(_23.env) : ((int(*)())_23.callback)();
            String _25 = Int_str(_24);
            String* _26 = &_25; // ref
            IO_println(_26);
        }
        else if(_11->_tag == Sum_Two_tag) {
            Sum* _11_temp = _11;
            // Case expr:
            /* () */
        }
        else {
          // This will not be needed with static exhaustiveness checking in 'match' expressions:
          fprintf(stderr, "Unhandled case in 'match' expression at line 7, column 4 in '/home/tim/dev/sandbox/matc/main.carp'\n");
          exit(1);
        }
        Sum_delete(_10);
    }
    return 0;
}

TimDeve avatar Dec 11 '20 23:12 TimDeve

Might be caused by #876

TimDeve avatar Dec 11 '20 23:12 TimDeve

Sorry! I think you are right about #876. I added these 2 lines to emitCaseEnd: https://github.com/leblowl/Carp/blob/59758c33f82c9a6c8cf81fece3caaa1adfc0d64b/src/Emit.hs#L374-L375 which causes it to only delete variables when we match by value. But I don't think that's the only time we want to delete variables in a case expression. I think we want to delete variables in a case expression when we match by value AND by ref. However, addressing #843, we do not want to delete a reference passed to the match. I hope that makes sense and my understanding is correct.

Looking at the code more, I am wondering why the reference (e.g. in #843) is added to the deleters in the first place. Maybe the issue for #843 is somewhere in here: https://github.com/carp-lang/Carp/blob/master/src/Memory.hs#L334

leblowl avatar Sep 21 '21 05:09 leblowl

Yeah, you're right - we need to clean up in the end of the branch no matter if it's a match/match-ref. So we need some more clever fix to the original bug it seems.

eriksvedang avatar Sep 21 '21 08:09 eriksvedang

And yeah, seems like #843 should be solved by not adding the deleter in the first place, I think you're definitely on the right track.

eriksvedang avatar Sep 21 '21 08:09 eriksvedang

So I think I have a solution to this problem and #843 ... after removing the broken code from #876, I've added this: https://github.com/leblowl/Carp/commit/fa28bef6d1342243ebbdd26cd3b29b630c8ad35c ... basically, when managing memory, in a match case, if the symbol on the left hand side has a name that starts with "wildcard_" then we skip memory management for that symbol. This follows a similar pattern already in place for references:

visitCaseLhs (XObj Ref _ _) =
  pure (Right [])

I don't really know my way around the codebase much yet or understand much of the underlying design, so I am guessing a bit. But another solution might have to do with isVarName, used in another existing visitCaseLhs on this line: https://github.com/carp-lang/Carp/blob/bd553fb78e6bfe1def7976176cfff889306b6e20/src/Memory.hs#L437

Perhaps isVarName could return false when given a wildcard name? I don't know if that makes sense semantically. The docstring for isVarName states "Is this symbol name appropriate for a normal variable (i.e. NOT a type name or sumtype tag)" (https://github.com/carp-lang/Carp/blob/bd553fb78e6bfe1def7976176cfff889306b6e20/src/Obj.hs#L1077). Is a wildcard name appropriate for a normal variable?

If isVarName returned false when given a wildcard name, then no new visitCaseLhs would need to be added specifically to handle the wildcard, which might be cleaner/simpler.

Also this is probably a silly question, but this is the first compiler I've worked on and I am wondering, why are we creating a wildcard symbol in the first place? What purpose does it serve?

Here is the xobj for the wildcard symbol for reference:

XObj {xobjObj = Sym wildcard_30 (LookupLocal NoCapture),
      xobjInfo = Just (Info {infoLine = 8, infoColumn = 6, infoFile = "REPL",
                             infoDelete = Set {unSet = fromList []},
                             infoIdentifier = 30}),
      xobjTy = Just Example}

Also, the way this issue manifests in the match expression is pretty interesting, due to this piece of code in Emit.hs: https://github.com/carp-lang/Carp/blob/bd553fb78e6bfe1def7976176cfff889306b6e20/src/Emit.hs#L432-L439

Given this code:

(deftype Example
 One
 (Two [String]))

(defn main []
  (let-do [ex [(Example.Two @"OKOK")]]
    (match-ref (Array.unsafe-nth &ex 0)
     (Example.Two s) (println* s)
     _ ())))

It emits the following for the wildcard case:

...
         else if(true) {
             Example* _19_temp = _19;
             Example* wildcard_30 = _19_temp;
             /* () */
             Example_delete(wildcard_30);
         }

So it assigns a reference of the match variable to the wildcard variable and then deletes the wildcard variable, thus deleting the match variable. I wonder what purpose these lines of code serve. But that might be off topic for this issue.

leblowl avatar Oct 03 '21 04:10 leblowl

After thinking about it more, I think I've answered a few of my questions. I think the wildcard symbol is a valid symbol and can be used just like any other symbol, right? And it makes sense that we are assigning the match variable to the wildcard variable, since the wildcard represents the match variable, that's just how a match expression works.

Also it seems like there may be times when we want to delete the wildcard, if it represents a value, not a ref, right?

So maybe the solution would be to only add the deleter for the wildcard symbol if it's not a ref? And in that case, I am not totally sure how that would look. The xobj for the wildcard says it's a Sym not a Ref. So the xobj itself doesn't tell us that information. Should it? Or maybe we look at the Match variable when visiting the match itself and pass that to visitCaseLhs so we can tell whether the the match variable is a ref or not... if it's not, we delete the wildcard.

Looking at the full emitted code for the example in my previous comment:

9206 int main(int argc, char** argv) {
9207     carp_init_globals(argc, argv);                                                                                                                                                                                                                                         9208     /* let */ {
9209         Array _10 = { .len = 1, .capacity = 1, .data = CARP_MALLOC(sizeof(Example) * 1) };
9210         static String _7 = "OKOK";
9211         String *_7_ref = &_7;
9212         String _8 = String_copy(_7_ref);
9213         Example _9 = Example_Two(_8);
9214         ((Example*)_10.data)[0] = _9;
9215         Array__Example ex = _10;
9216         Array__Example* _17 = &ex; // ref
9217         Example* _19 = Array_unsafe_MINUS_nth__Example(_17, 0);
9218         if(_19->_tag == Example_Two_tag) {
9219             Example* _19_temp = _19;
9220             String* s = &_19_temp->Two.member0;
9221             // Case expr:
9222             String _27 = String_str(s);
9223             String* _28 = &_27; // ref
9224             IO_println(_28);
9225             String_delete(_27);
9226         }
9227         else if(true) {
9228             Example* _19_temp = _19;
9229             Example* wildcard_30 = _19_temp;
9230             /* () */
9231             Example_delete(wildcard_30);
9232         }
9233         else {
9234           // This will not be needed with static exhaustiveness checking in 'match' expressions:
9235           fprintf(stderr, "Unhandled case in 'match' expression at line 7, column 5 in '/mnt/c/Users/tym0/dev/sandbox/match-ref/main.carp'\n");
9236           exit(1);
9237         }
9238         Array_delete__Example(ex);
9239     }
9240 }

what should this look like? Of course we don't want the wildcard deletion there, but what about the other case. Should the string delete on line 9225 be there? Is that deleting a string member of _19? Or is the string copied? I am wondering if this is a general issue with case symbols and ref matches, or just an issue with wildcards and ref matches.

leblowl avatar Oct 03 '21 05:10 leblowl

Sorry for the verbosity, maybe I should use the gitter chat? But thinking about this more, it seems like a more general issue than just with wildcards. Looking at the example in the language guide:

(match-ref &might-be-a-string
  (Just s) (IO.println s)
  Nothing (IO.println "Got nothing"))

We wouldn't want to delete s in this case either, right? So I think the solution would be: if we are matching on a ref then do not delete any left-hand-side symbols. Does this sound right?

What this solution might look like is this: https://github.com/leblowl/Carp/commit/979c8c546675a8f67106bfa8e5c5056a4b753ac4

Here's what it looks like with various test cases:

(deftype Example
 One
 (Two [String]))

(defn main []
  (let-do [ex [(Example.Two @"OKOK")]]
    (match-ref (Array.unsafe-nth &ex 0)
     (Example.Two s) (println* s)
     _ ())))
Emitted C code
int main(int argc, char** argv) {
    carp_init_globals(argc, argv);
    /* let */ {
        Array _10 = { .len = 1, .capacity = 1, .data = CARP_MALLOC(sizeof(Example) * 1) };
        static String _7 = "OKOK";
        String *_7_ref = &_7;
        String _8 = String_copy(_7_ref);
        Example _9 = Example_Two(_8);
        ((Example*)_10.data)[0] = _9;
        Array__Example ex = _10;
        Array__Example* _17 = &ex; // ref
        Example* _19 = Array_unsafe_MINUS_nth__Example(_17, 0);
        if(_19->_tag == Example_Two_tag) {
            Example* _19_temp = _19;
            String* s = &_19_temp->u.Two.member0;
            // Case expr:
            String _27 = String_str(s);
            String* _28 = &_27; // ref
            IO_println(_28);
            String_delete(_27);
        }
        else if(true) {
            Example* _19_temp = _19;
            Example* wildcard_30 = _19_temp;
            /* () */
        }
        else UNHANDLED("REPL", 6);
        Array_delete__Example(ex);
    }
    return 0;
}
(deftype Sum One Two)

(defn main []
    (let [state @"Ok" sumt &(Sum.One)]
     (match-ref sumt
      Sum.One (println* &@&state)
      Sum.Two ())))
Emitted C code
int main(int argc, char** argv) {
    carp_init_globals(argc, argv);
    /* let */ {
        static String _6 = "Ok";
        String *_6_ref = &_6;
        String _7 = String_copy(_6_ref);
        String state = _7;
        Sum _11 = Sum_One();
        Sum* _12 = &_11; // ref
        Sum* sumt = _12;
        if(sumt->_tag == Sum_One_tag) {
            Sum* _15_temp = sumt;
            // Case expr:
            String* _25 = &state; // ref
            String _26 = String_copy(_25);
            String* _27 = &_26; // ref
            String _28 = String_str(_27);
            String* _29 = &_28; // ref
            IO_println(_29);
            String_delete(_26);
            String_delete(_28);
        }
        else if(sumt->_tag == Sum_Two_tag) {
            Sum* _15_temp = sumt;
            // Case expr:
            /* () */
        }
        else UNHANDLED("REPL", 12);
        String_delete(state);
        Sum_delete(_11);
    }
    return 0;
}
(deftype Sum One Two)

(defn main []
  (let [state 0]
   (match-ref &(Sum.One)
    Sum.One (println* ((fn [] @&state)))
    Sum.Two ())))
Emitted C code
int main(int argc, char** argv) {
    carp_init_globals(argc, argv);
    /* let */ {
        int state = 0;
        Sum _10 = Sum_One();
        Sum* _11 = &_10; // ref
        if(_11->_tag == Sum_One_tag) {
            Sum* _11_temp = _11;
            // Case expr:
            // This lambda captures 1 variables: state
            _Lambda_NAKED_LAMBDA_24_env_ty *_24_env = CARP_MALLOC(sizeof(_Lambda_NAKED_LAMBDA_24_env_ty));
            _24_env->state = state;
            Lambda _24 = {
              .callback = (void*)_Lambda_NAKED_LAMBDA_24_env,
              .env = _24_env,
              .delete = (void*)_Lambda_NAKED_LAMBDA_24_env_ty_delete,
              .copy = (void*)_Lambda_NAKED_LAMBDA_24_env_ty_copy
            };
            int _25 = _24.env ? ((int(*)(LambdaEnv))_24.callback)(_24.env) : ((int(*)())_24.callback)();
            String _26 = Int_str(_25);
            String* _27 = &_26; // ref
            IO_println(_27);
            Function_delete__int(_24);
            String_delete(_26);
        }
        else if(_11->_tag == Sum_Two_tag) {
            Sum* _11_temp = _11;
            // Case expr:
            /* () */
        }
        else UNHANDLED("REPL", 19);
        Sum_delete(_10);
    }
    return 0;
}
(deftype Example
 One
  (Two [(Array String)]))

(defn main []
  (match-ref &(Just (Example.Two [@"OKOK"]))
   (Just s) ()
   _ ())
  )
Emitted C code
int main(int argc, char** argv) {
    carp_init_globals(argc, argv);
    Array _10 = { .len = 1, .capacity = 1, .data = CARP_MALLOC(sizeof(String) * 1) };
    static String _8 = "OKOK";
    String *_8_ref = &_8;
    String _9 = String_copy(_8_ref);
    ((String*)_10.data)[0] = _9;
    Example _11 = Example_Two(_10);
    Maybe__Example _12 = Maybe_Just__Example(_11);
    Maybe__Example* _13 = &_12; // ref
    if(_13->_tag == Maybe__Example_Just_tag) {
        Maybe__Example* _13_temp = _13;
        Example* s = &_13_temp->u.Just.member0;
        // Case expr:
        /* () */
    }
    else if(true) {
        Maybe__Example* _13_temp = _13;
        Maybe__Example* wildcard_18 = _13_temp;
        /* () */
    }
    else UNHANDLED("REPL", 27);
    Maybe_delete__Example(_12);
    return 0;
}
(deftype Example
 One
  (Two [(Array String)]))

(defn main []
  (match (Just (Example.Two [@"OKOK"]))
   (Just s) ()
   _ ())
  )
Emitted C code
int main(int argc, char** argv) {
    carp_init_globals(argc, argv);
    Array _9 = { .len = 1, .capacity = 1, .data = CARP_MALLOC(sizeof(String) * 1) };
    static String _7 = "OKOK";
    String *_7_ref = &_7;
    String _8 = String_copy(_7_ref);
    ((String*)_9.data)[0] = _8;
    Example _10 = Example_Two(_9);
    Maybe__Example _11 = Maybe_Just__Example(_10);
    if(_11._tag == Maybe__Example_Just_tag) {
        Maybe__Example _11_temp = _11;
        Example s = _11_temp.u.Just.member0;
        // Case expr:
        /* () */
        Example_delete(s);
    }
    else if(true) {
        Maybe__Example _11_temp = _11;
        Maybe__Example wildcard_16 = _11_temp;
        /* () */
        Maybe_delete__Example(wildcard_16);
    }
    else UNHANDLED("REPL", 36);
    return 0;
}

leblowl avatar Oct 03 '21 05:10 leblowl

So I think the solution would be: if we are matching on a ref then do not delete any left-hand-side symbols. Does this sound right?

I think this is the right way to think about it, yes.

If you see the whole match as a function, you want it to consume its argument fully, and leave it completely untouched when it's a match-ref.

Since match will "split up" the value, it needs to make sure each such sub-element is consumed/deleted within its scope.

I do not have time to read through the C code in the final post right now, but will try to do that later today!

eriksvedang avatar Oct 04 '21 07:10 eriksvedang

@leblowl I've read through the emitted code now and it seems correct. There is one situation that would be good to check with your branch, something like this:

(deftype Example (A [String String]))

(defn f []
  (let-do [e (Example.A @"x" @"y")]
    (match e
      (Example.A a b) (Unsafe.leak a))
    ;; <*>
  ))

This should delete b but leave a since it has already been consumed by leak. Also, using e at <*> should be an error (e.g. (println* &e)) Both of these things currently hold in master.

eriksvedang avatar Oct 05 '21 13:10 eriksvedang

@eriksvedang thanks! I was wondering if there are still some test cases I haven't considered.

(deftype Example (A [String String]))

(defn f []
  (let-do [e (Example.A @"x" @"y")]
    (match e
      (Example.A a b) (Unsafe.leak a))
    ;; <*>
  ))
Emitted C code
void f() {
    /* let */ {
        static String _7 = "x";
        String *_7_ref = &_7;
        String _8 = String_copy(_7_ref);
        static String _10 = "y";
        String *_10_ref = &_10;
        String _11 = String_copy(_10_ref);
        Example _12 = Example_A(_8, _11);
        Example e = _12;
        if(e._tag == Example_A_tag) {
            Example _16_temp = e;
            String a = _16_temp.u.A.member0;
            String b = _16_temp.u.A.member1;
            // Case expr:
            Unsafe_leak__String(a);
            String_delete(b);
        }
        else UNHANDLED("REPL", 46);
    }
}

That looks good to me.

And the error still holds:

> (defn f []
(  (let-do [e (Example.B [@"x"])]
((    (match e
(((      (Example.A a b) (Unsafe.leak a))
((    (println* &e)
((    ;; <*>
((  ))
You’re referencing a given-away value `e` at line 65, column 16 in 'REPL'
XObj {xobjObj = Sym e (LookupLocal NoCapture), xobjInfo = Just (Info {infoLine = 65, infoColumn = 16, infoFile = "REPL", infoDelete = Set {unSet = fromList []}, infoIdentifier = 27}), xobjTy = Just Example}

You’ll have to copy the value using `@`. at REPL:61:2.

Traceback:
  (defn f [] (let-do [e (Example.B [(copy "x")])] (match e (Ex...) at REPL:61:1.

Exploring a bit, I also tried:

(deftype Example
  (A [String String])
  (B [(Array String)]))

(defn f []
  (let-do [e (Example.B [@"x"])]
    (match e
      (Example.A a b) (Unsafe.leak a))
    ;; <*>
  ))

Which surprisingly doesn't cause a compile error and appears to leak:

Emitted C code
void f() {
    /* let */ {
        Array _9 = { .len = 1, .capacity = 1, .data = CARP_MALLOC(sizeof(String) * 1) };
        static String _7 = "x";
        String *_7_ref = &_7;
        String _8 = String_copy(_7_ref);
        ((String*)_9.data)[0] = _8;
        Example _10 = Example_B(_9);
        Example e = _10;
        if(e._tag == Example_A_tag) {
            Example _14_temp = e;
            String a = _14_temp.u.A.member0;
            String b = _14_temp.u.A.member1;
            // Case expr:
            Unsafe_leak__String(a);
            String_delete(b);
        }
        else UNHANDLED("REPL", 56);
    }
}

but this is might be considered a different bug.

And then I was wondering if there is anyway to destructure nested arrays, but I guess the match destructuring is only for sum types right now?

And then I was also wondering if there is any leaking in the sum type struct itself. I couldn't figure out how to view the emitted C code for a deftype expression, but it doesn't look like there is anything malloc'd in this code: https://github.com/carp-lang/Carp/blob/bd553fb78e6bfe1def7976176cfff889306b6e20/src/Emit.hs#L842

leblowl avatar Oct 05 '21 16:10 leblowl

That example will crash with a failed match so that doesn't really qualify as a leak, right? Or am I missing something :)

Destructruring arrays (and other things) would be great but should probably be added later when this is 100% solid for sum types.

It shouldn't be necessary to free anything of the struct itself, it's just the members and the tag (char). It's a little weird that it just is implicitly destroyed though. In that sense you can use match as a deleter, kinda!

eriksvedang avatar Oct 05 '21 17:10 eriksvedang

Oh, I didn't actually run the function:

> (f)
Unhandled case in 'match' expression at REPL:6
[RUNTIME ERROR] 'out/Untitled' exited with return value -6.

Would it still leak inside the REPL?

leblowl avatar Oct 05 '21 17:10 leblowl

Would it still leak inside the REPL?

Since the leaking program is run in a sub process, the OS will clean it up. So for now, we're good! If we ever have in-process compilation (e.g. through LLVM) this would become a problem though.

eriksvedang avatar Oct 05 '21 18:10 eriksvedang