bnfc icon indicating copy to clipboard operation
bnfc copied to clipboard

Support "integer" category to bind to `Long` instead of `Integer`

Open nobeh opened this issue 10 years ago • 8 comments

I just created an issue at jflex-de/jflex#191 as I initially thought I am wrongly using the tool chain.

I tried to

$ bnfc -m --java -o src/main/java --jflex -p bnfc src/main/resources/abs.cf

and then I tried to tweak a bit the generated Yylex as:

$ sed -i -e 's/new Integer(yytext())/new Long(yytext())/g' src/main/java/bnfc/abs/Yylex

before continuing with JFlex and CUP runtime. However it seems that the above is not enough and all the "integer" category are used as java.lang.Integer.

In my language when I have

Integer s = 10000000000;

Now I get a ClassCastException based on the change I did to Yylex:

java.lang.ClassCastException: java.lang.Long cannot be cast to java.lang.Integer
    at bnfc.abs.CUP$parser$actions.CUP$parser$do_action(parser.java:1443)
    at bnfc.abs.parser.do_action(parser.java:1107)
    at java_cup.runtime.lr_parser.parse(lr_parser.java:584)
    at jabsc.Compiler.parseSource(Compiler.java:175)
    at jabsc.Compiler.compile(Compiler.java:115)
    at jabsc.Compiler.compile(Compiler.java:86)
    at jabsc.cli.CompilerCommand.run(CompilerCommand.java:30)
    at jabsc.cli.Bootstrap.main(Bootstrap.java:24)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:293)
    at java.lang.Thread.run(Thread.java:745)

Is there a way I can use bnfc to bind "integer" category to java.lang.Long in Java mode?

Thanks.

nobeh avatar Aug 20 '15 07:08 nobeh

I've been able to so far to have a workaround after running bnfc by:

$ sed -i -e 's/new Integer(yytext())/new Long(yytext())/g' /path/to/Yylex
$ sed -i -e 's/terminal Integer/terminal Long/g' /path/to/LANG.cup
$ sed -i -e 's/Integer /Long /g' /path/to/GeneratedJavaSourcesUsingInteger.java

This is a workaround and not sure if it is at all maintainable.

nobeh avatar Aug 20 '15 14:08 nobeh

I think your trying to make the change in the wrong place. Seems to me if you want to do this you do it in the grammar. There is nothing that says you can't do something like this:

Pdecl. Decl ::= Type Vdef; Vdec. Vdef ::= Ident ;

TInt. Basetype ::= "int" ; TDouble. Basetype ::= "float" ; TString. Basetype ::= "string" ;

Then When you write your Expressions, Just don't insert Integer in the expressions. You could even write your grammar so that all variables are of type String underneath.. Implementation is where you really define how it uses these types. You could also avoid using String, Integer, Double and even Ident, and write your own expressions for those as well. They are just provided because they are pretty popular and this makes it easy if you want to use the prebuilt shortcuts.

I tried to download your abs.cf file to look at it but the link didn't work.

Kent

On Thu, Aug 20, 2015 at 9:02 AM, Behrooz Nobakht [email protected] wrote:

I've been able to so far to have a workaround after running bnfc by:

$ sed -i -e 's/new Integer(yytext())/new Long(yytext())/g' /path/to/Yylex $ sed -i -e 's/terminal Integer/terminal Long/g' /path/to/LANG.cup $ sed -i -e 's/Integer /Long /g' /path/to/GeneratedJavaSourcesUsingInteger.java

This is a workaround and not sure if it is at all maintainable.

— Reply to this email directly or view it on GitHub https://github.com/BNFC/bnfc/issues/153#issuecomment-133019543.

sirdude avatar Aug 21 '15 20:08 sirdude

Thanks for looking into this.

The ABS.cf can be found at https://github.com/CrispOSS/jabsc/blob/master/src/main/resources/abs.cf

I am quite at beginner level with BNFC and the BNF grammar that it supports. I will try to better understand your explanation and apply it in our context.

nobeh avatar Aug 21 '15 21:08 nobeh

@nobeh , I believe I bumped sometimes in your same problem. I guess this is way too late, but anyway let me spend some words about this issue:

BNFC translates its built-in Integer token type as a java.lang.Integer object in Java, and this is a problem if the parsed token is actually larger than 32 bits. You could define your own "numeric" token:

token Long digit+;

but this will come at a price: 1) this will shadow the Integer built-in token, so if you are going to use the Integer token somewhere, the lexer will just not find it as they both have the same definition, so depending on where one is defined in the lexer input, the other will never be found. You can see this by running bnfc on the following grammar:

LongLiteral    . Exp ::= "long" Long;
IntegerLiteral . Exp ::= "int" Integer;
token Long digit+;

2) Even if you manage to have an Integer-free grammar, you are bound to manually convert the text of the token to a long, as BNFC gives you back the character that matched the token without further processing.

The "further processing" in the case of the built-in token Integer is calling the method java.lang.Integer.parseInt(<the digits matched by the BNFC Integer token >).

Your workaround is probably the best strategy if you cannot get rid of using the Integer built-in token in your grammar. Otherwise, define your own token and use it instead of Integer, but you will have convert it whenever you need the numeric value of the string matching the token.

gapag avatar Oct 30 '15 18:10 gapag

From the point of view of how to address this issue in the next releases of BNFC, I believe the only reasonable solution is to convert the Double and Integer tokens in, respectively, java.math.BigDecimal and java.math.BigInteger; this removes point 1) and mitigates somehow 2) since the APIs of such classes allow to convert the numeric value to the fixed width primitive types.

gapag avatar Oct 30 '15 22:10 gapag

I implemented the solution above in the bigInteger branch of my fork.

gapag avatar Nov 02 '15 15:11 gapag

@gapag Thanks for the explanations and the patch for this. I am starting to better understand BNFC side of the problem here and see the challenge. As a Java programmer, I just want to suggest generally avoid translating to BigDecimal and BigInteger unless there's specific reason. Simply, primitive types perform much better in this area.

As you said and would be my conclusion as well, if BNFC is used for Java as a target, this assumption is fair that there should be enough Java knowledge to work around or use the primitive translation scheme. It's not worth to trade performance off in many cases for just being able to use >32b integers.

nobeh avatar Nov 02 '15 16:11 nobeh

@nobeh Yes, with these non-fixed precision types it can get very expensive and most of all annoying when you have to make operations with such numbers. I must also say that I realized that I ignored or refused to solve the issue you pointed out... I now pushed a commit in bigInteger which supports all the 4 binding combinations (Integer|Long)(Float|Double) plus the variable size arithmetic types. The cup backend was also not supported, now I fixed it.

gapag avatar Nov 02 '15 23:11 gapag