antlr4 icon indicating copy to clipboard operation
antlr4 copied to clipboard

[WIP]Typescript target

Open ericvergnaud opened this issue 1 year ago • 15 comments

All ts tests pass locally. Submitting a PR to ensure nothing else is broken, and try ts CI. Also fixes #3868 tsc generates tons of warnings which I need to deal with

ericvergnaud avatar Sep 07 '22 02:09 ericvergnaud

mmm... not sure why my commits aren't signed

ericvergnaud avatar Sep 07 '22 02:09 ericvergnaud

@parrt hey, here's a beta of the typescript target Also fixes #3868 the CI works on Linux but needs to be fixed for Mac/Windows beyond my own beta-testing, I think it would be great to bring in beta-testers, but that involves a 'sensitive' announcement so let me know what you think ?

ericvergnaud avatar Sep 07 '22 10:09 ericvergnaud

mmm... not sure why my commits aren't signed

Every commit should have a sign, for example:

Signed-off-by: Ivan Kochurkin <[email protected]>

Also, is it possible to squash most part of commits? Commits like

renamed for clarity
fix issues
progressing
...

are not useful for reviewers and git history since they are not informative.

KvanTTT avatar Sep 07 '22 10:09 KvanTTT

mmm... not sure why my commits aren't signed

Every commit should have a sign, for example:

Signed-off-by: Ivan Kochurkin <[email protected]>

Also, is it possible to squash most part of commits? Commits like

renamed for clarity
fix issues
progressing
...

are not useful for reviewers and git history since they are not informative.

As written, I know what DCO expects, but I had automated this on my end, and it seems to no longer work..
This is WIP, so no need to review/squash yet.

ericvergnaud avatar Sep 07 '22 10:09 ericvergnaud

I had automated this on my end, and it seems to no longer work..

Looks like so because not every commit has a sign.

KvanTTT avatar Sep 07 '22 11:09 KvanTTT

@parrt hooray! all tests successful across all targets. Typescript CI is very slow, but we can look into it later (it doesn't affect the runtime). Otherwise, this is almost ready to go in beta, just missing code from #3878, and squashing the commits (some of them are unsigned due a config issue at my end).

ericvergnaud avatar Sep 09 '22 17:09 ericvergnaud

Hooray! Lemme look at #3878 tomorrow and then take a look at this.

parrt avatar Sep 09 '22 17:09 parrt

Typescript CI is very slow, but we can look into it later (it doesn't affect the runtime).

Now I see a lot of

java.io.IOException: Failed to delete file: C:\Users\RUNNER~1\AppData\Local\Temp\TscRunner-ForkJoinPool-1-worker-3-1662740528124

on CI: https://github.com/antlr/antlr4/runs/8274285204?check_suite_focus=true#step:22:3510 It should be fixed before merging, otherwise typescript tests are useless.

KvanTTT avatar Sep 09 '22 18:09 KvanTTT

Same happens I suspect with JavaScript. Not sure why it fails but it does not affect the runtime nor the tests

Envoyé de mon iPhone

Le 9 sept. 2022 à 20:10, Ivan Kochurkin @.***> a écrit :

 Typescript CI is very slow, but we can look into it later (it doesn't affect the runtime).

Now I see a lot of

java.io.IOException: Failed to delete file: C:\Users\RUNNER~1\AppData\Local\Temp\TscRunner-ForkJoinPool-1-worker-3-1662740528124 on CI: https://github.com/antlr/antlr4/runs/8274285204?check_suite_focus=true#step:22:3510 It should be fixed before merging, otherwise typescript tests are useless.

— Reply to this email directly, view it on GitHub, or unsubscribe. You are receiving this because you authored the thread.

ericvergnaud avatar Sep 10 '22 19:09 ericvergnaud

I don't see such errors in dev branch for JavaScript: https://github.com/antlr/antlr4/runs/8288355910?check_suite_focus=true It affects CI servers since test data is not being removed and may take a lot of resource.

KvanTTT avatar Sep 11 '22 13:09 KvanTTT

I don't see such errors in dev branch for JavaScript: https://github.com/antlr/antlr4/runs/8288355910?check_suite_focus=true It affects CI servers since test data is not being removed and may take a lot of resource.

Only the top folder encounters this error, the contents are cleared

ericvergnaud avatar Sep 12 '22 07:09 ericvergnaud

@parrt thanks for merging #3878 (which was missing some tweaks, my bad), however I noticed you merged it into master rather than dev ? So I had to merge back from master which brings into this PR a few unrelated files, I guess they'll fade away once you merge master onto dev. Runtime wise, as far as I'm concerned, this is ready to go. However not sure about the API (the d.ts files) since we don't know what people need. So my preferred path would be to publish a beta tool and js/ts runtime only, and call for testers. I expect these will come primarily from antlr4ts, so hopefully they will have mature requirements. I'll also be doing my own testing, but that may take a while since it requires migrating my code from js to ts. I expect the beta to last 2-3 months. Let me know what you think.

ericvergnaud avatar Sep 12 '22 09:09 ericvergnaud

It’s an incorrect implementation therefore it’s a design bug. I’m not really interested in arguing about code style. Your suggestions are welcome but they should remain suggestions.

Envoyé de mon iPhone

Le 13 sept. 2022 à 23:13, Ivan Kochurkin @.***> a écrit :

 @KvanTTT commented on this pull request.

In tool/src/org/antlr/v4/codegen/target/TypeScriptTarget.java:

  • public String getTargetStringLiteralFromANTLRStringLiteral(
  • CodeGenerator generator,
    
  • String literal, boolean addQuotes)
    
  • {
  • StringBuilder sb = new StringBuilder();
    
  • String is = literal;
    
  • if ( addQuotes ) sb.append('"');
    
  • for (int i = 1; i < is.length() - 1; ) {
    
  • 	int codePoint = is.codePointAt(i);
    
  • 	int toAdvance = Character.charCount(codePoint);
    
  • 	if  (codePoint == '\\') {
    
  • 		// Anything escaped is what it is! We assume that
    
  • 		// people know how to escape characters correctly. However
    
  • 		// we catch anything that does not need an escape in Java (which
    
  • 		// is what the default implementation is dealing with and remove
    
  • 		// the escape. The C target does this for instance.
    
  • 		//
    
  • 		int escapedCodePoint = is.codePointAt(i + toAdvance);
    
  • 		toAdvance++;
    
  • 		switch (escapedCodePoint) {
    
  • 			// Pass through any escapes that Java also needs
    
  • 			//
    
  • 			case    'n':
    
  • 			case    'r':
    
  • 			case    't':
    
  • 			case    'b':
    
  • 			case    'f':
    
  • 			case    '\\':
    
  • 				// Pass the escape through
    
  • 				sb.append('\\');
    
  • 				sb.appendCodePoint(escapedCodePoint);
    
  • 				break;
    
  • 			case    'u':    // Either unnnn or u{nnnnnn}
    
  • 				if (is.charAt(i + toAdvance) == '{') {
    
  • 					while (is.charAt(i + toAdvance) != '}') {
    
  • 						toAdvance++;
    
  • 					}
    
  • 					toAdvance++;
    
  • 				} else {
    
  • 					toAdvance += 4;
    
  • 				}
    
  • 				if (i + toAdvance <= is.length()) {
    
  • 					// we might have an invalid \\uAB or something
    
  • 					String fullEscape = is.substring(i, i + toAdvance);
    
  • 					appendUnicodeEscapedCodePoint(
    
  • 						CharSupport.getCharValueFromCharInGrammarLiteral(fullEscape),
    
  • 						sb);
    
  • 				}
    
  • 				break;
    
  • 			default:
    
  • 				if (shouldUseUnicodeEscapeForCodePointInDoubleQuotedString(escapedCodePoint)) {
    
  • 					appendUnicodeEscapedCodePoint(escapedCodePoint, sb);
    
  • 				} else {
    
  • 					sb.appendCodePoint(escapedCodePoint);
    
  • 				}
    
  • 				break;
    
  • 		}
    
  • 		// Go past the \ character
    
  • 		i++;
    
  • 	} else {
    
  • 		if (codePoint == 0x22) {
    
  • 			// ANTLR doesn't escape " in literal strings, but every other language needs to do so.
    
  • 			sb.append("\\\"");
    
  • 		} else if (shouldUseUnicodeEscapeForCodePointInDoubleQuotedString(codePoint)) {
    
  • 			appendUnicodeEscapedCodePoint(codePoint, sb);
    
  • 		} else {
    
  • 			sb.appendCodePoint(codePoint);
    
  • 		}
    
  • 	}
    
  • 	i += toAdvance;
    
  • }
    
  • if ( addQuotes ) sb.append('"');
    
  • return sb.toString();
    
  • } IMHO it's a bug to have just one translation method for all targets because inevitably users will complain, and we need to safely fix the translation on a per target basis.

Firstly, it's not bug, but maybe incorrect implementation. Secondly, IMHO it's not optimal to create duplicated functionality because of slightly different things. They can be encapsulated to separated methods according to OOP. The less code we have, the less potential errors we have. Moreover, this method not only escapes chars but also joins them. Joining logic should be the same. For that matter, charCount, codePointAt and other char methods are Java-specific, they should not be used for TypeScript or other target literals, but actually it doesn't matter.

— Reply to this email directly, view it on GitHub, or unsubscribe. You are receiving this because you authored the thread.

ericvergnaud avatar Sep 13 '22 22:09 ericvergnaud

beta

Well, sounds reasonable to me but I need to evaluate the merits of either approach discussed in https://github.com/antlr/antlr4/discussions/3867

What's involved in creating a suitable beta? A zip? A branch?

parrt avatar Sep 16 '22 00:09 parrt

It involves publishing a beta in maven central and npm

ericvergnaud avatar Sep 16 '22 06:09 ericvergnaud