k-legacy icon indicating copy to clipboard operation
k-legacy copied to clipboard

out of memory error in parsing

Open daejunpark opened this issue 8 years ago • 8 comments

Parsing doesn't scale well for Bag, Set, and Map. For example, a bag with only 16 elements requires ~6GB memory to be parsed!

To reproduce:

run the following command:

K_OPTS=-Xmx6g kompile -v --debug --syntax-module A a.k 

against the following K definition:

module A

configuration <T>
  <k> @InitConfig </k>
  <objs>
    <obj multiplicity="*"> .K </obj>
  </objs>
</T>

syntax KItem ::= "@InitConfig"
rule <k> @InitConfig => . ... </k>
     <objs>
       (.Bag =>
            <obj> 1 </obj>
            <obj> 2 </obj>
            <obj> 3 </obj>
            <obj> 4 </obj>
            <obj> 5 </obj>
            <obj> 6 </obj>
            <obj> 7 </obj>
            <obj> 8 </obj>
            <obj> 9 </obj>
            <obj> 10 </obj>
            <obj> 11 </obj>
            <obj> 12 </obj>
            <obj> 13 </obj>
            <obj> 14 </obj>  // 2.7GB  10sec
            <obj> 15 </obj>  // 3.4GB  20sec
            <obj> 16 </obj>  // 5.6GB  60sec
            <obj> 17 </obj>  // java.lang.OutOfMemoryError
       )
     </objs>

endmodule

The comments in the side shows how much memory was used to parse <obj>s up to that point. Memory usage seems to blow up over the number of <obj>s.

K version of the commit de831309c0ef78138aecb896e382e3d095188283 is used. My machine is Intel Xeon CPU 3.40GHz and DDR3 RAM 8GB 1600MHz, Ubuntu 12.04.4 LTS, Java 1.8.0_60, 64bit.

daejunpark avatar Jun 09 '16 21:06 daejunpark

@radumereuta any idea?

daejunpark avatar Jun 09 '16 21:06 daejunpark

This problem frustrates me in porting JavaScript semantics into K4.0, especially because I cannot initialize the configuration with a large number of built-in JavaScript objects.

daejunpark avatar Jun 09 '16 21:06 daejunpark

This is almost certainly because of lack of support for priorities in the parser. As a result, I would be highly interested in a solution to this issue, because it will likely have significant implication for the speed of the parser as well.

dwightguth avatar Jun 09 '16 21:06 dwightguth

I would be interested to know whether you still get this behavior in the runtimeverification version of k though, because there are several major optimizations to parsing that were included in that version that have not been integrated into this version, at least one of which might very well affect memory usage.

dwightguth avatar Jun 09 '16 21:06 dwightguth

okay, I can confirm that this issue affects rv-k as well. please tag me in the pull request if one gets made.

dwightguth avatar Jun 09 '16 21:06 dwightguth

@dwightguth yes, it still happens in rv-k:

$ kompile --version
K framework version 1.0-SNAPSHOT
Git revision: 9f04b6c
Git branch: master
Build date: Thu Jun 09 16:39:42 CDT 2016

daejunpark avatar Jun 09 '16 21:06 daejunpark

A possible workaround is to use explicit klabel, i.e., #cells for Bag, _Map_ and _Set_ for Map and Set, like:

rule <k> @InitConfig => . ... </k>
     <objs>
       (.Bag =>
#cells(            <obj> 1 </obj> ,
#cells(            <obj> 2 </obj> ,
#cells(            <obj> 3 </obj> ,
#cells(            <obj> 4 </obj> ,
#cells(            <obj> 5 </obj> ,
#cells(            <obj> 6 </obj> ,
#cells(            <obj> 7 </obj> ,
#cells(            <obj> 8 </obj> ,
#cells(            <obj> 9 </obj> ,
#cells(            <obj> 10 </obj> ,
#cells(            <obj> 11 </obj> ,
#cells(            <obj> 12 </obj> ,
#cells(            <obj> 13 </obj> ,
#cells(            <obj> 14 </obj> ,
#cells(            <obj> 15 </obj> ,
#cells(            <obj> 16 </obj> ,
                   <obj> 17 </obj> ))))))))))))))))
       )
     </objs>

But it requires me to put a lot of manual efforts. But, depending on how long it will be taken to fix this issue, it might be worth doing it to make a progress. So, @radumereuta I need your input. How much is it hard to fix this?

daejunpark avatar Jun 09 '16 22:06 daejunpark

Somehow this escaped my attention at the time. It seems that for some reason, the length of that list is a problem with the way we're parsing cells in rules. As a quick fix, you can put a few parenthesis around the list: (1 2 3 4)(5 6 7 8..). This dropped the total kompile time to 4.6s.

Next time when you don't get a response from me, send me a message on hangouts: "hey, look at this please". @dwightguth suggested correctly that this may be related to the lack of native support in the parser kernel, but I will look to see if I am doing something else wrong since the performance decrease in such a case shouldn't be that dramatic.

radumereuta avatar Aug 22 '16 14:08 radumereuta