pyre-check icon indicating copy to clipboard operation
pyre-check copied to clipboard

Adds line number support to taint.config parser

Open abishekvashok opened this issue 2 years ago • 15 comments

Adds line number support to taint.config parser to report line numbers along with the validation errors and hence make the errors typed. YoJSON, the currently used JSON persing module doesn't support line number parsing, hence resorted to using another low level module - JSONM.

However, JSONM lacks the high-level features YoJSON provides and is pretty bare-bone. It would be a mundane task to rewrite code to include only JSONM and ditch YoJSON. Hence, using YoJSON along with JSONM to derive line numbers. Instead of storing the line numbers along during parsing, we deligate the validation step to JSONM.

The validation step required a pass through the entire taint.config files already, so migrating it to JSONM and providing only line numbers for some errors won't be a big performance overhead.

Further, this approach enables us to use the 31% speed advantage YoJSON provides over JSONM (line number parsing takes time) and by storing line numbers only during errors, we can make efficient use of memory by storing only the things we want!

A downside of this approach is that we would indeed be using two JSON parsers!

Working preview: Screenshot 2021-12-10 at 12 19 39 AM

Test Plan:

  • add duplicate sinks/sources/features to a taint.config file or spread across multiple files.
  • pull, recompile, and run pyre. See the new errors pop-up

Next plans:

  • change the logic of exiting when the first hash_set hit is found, but continue on and exit only after the complete iteration.

Fixes part of https://github.com/MLH-Fellowship/pyre-check/issues/82 Signed-off-by: Abishek V Ashok [email protected]

abishekvashok avatar Dec 01 '21 05:12 abishekvashok

Facing these weird compile time errors:

dune build @install -j auto --profile dev
File "interprocedural_analyses/taint/taintConfiguration.ml", line 560, characters 10-12:
560 |           in
                ^^
Error: Syntax error: ')' expected
File "interprocedural_analyses/taint/taintConfiguration.ml", line 554, characters 33-34:
554 |         | location_list::rest -> (
                                       ^
  This '(' might be unmatched
make: *** [dev] Error 1 

abishekvashok avatar Dec 01 '21 05:12 abishekvashok

@onionymous has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

facebook-github-bot avatar Dec 01 '21 21:12 facebook-github-bot

@abishekvashok has updated the pull request. You must reimport the pull request before landing.

facebook-github-bot avatar Dec 02 '21 10:12 facebook-github-bot

@abishekvashok has updated the pull request. You must reimport the pull request before landing.

facebook-github-bot avatar Dec 02 '21 11:12 facebook-github-bot

Making slow but definite progress :) @onionymous Now the below line seems to cause

validate configuration (List.map ~f:(Json.Util.to_string) configurations);

this error:

Pyre encountered an internal exception: ("Yojson.Safe.Util.Type_error(\"Expected string, got object\", _)")

Is this in some way related because configurations is piped through Option.some ?

abishekvashok avatar Dec 02 '21 11:12 abishekvashok

@onionymous has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

facebook-github-bot avatar Dec 03 '21 16:12 facebook-github-bot

Making slow but definite progress :) @onionymous Now the below line seems to cause

validate configuration (List.map ~f:(Json.Util.to_string) configurations);

this error:

Pyre encountered an internal exception: ("Yojson.Safe.Util.Type_error(\"Expected string, got object\", _)")

Is this in some way related because configurations is piped through Option.some ?

think we resolved this in OH last week - let me know if you have more issues! :)

onionymous avatar Dec 06 '21 21:12 onionymous

@onionymous thank you. Yes we resolved it in the last OH. I am now a facing a bit of trouble in the logic of the inner begin keywords. But I will try to resolve it asap and will reach out if I get stuck :)

abishekvashok avatar Dec 07 '21 05:12 abishekvashok

@abishekvashok has updated the pull request. You must reimport the pull request before landing.

facebook-github-bot avatar Dec 07 '21 13:12 facebook-github-bot

@onionymous almost done! Now I am getting the only the first occurrence of the duplicate entry.

abishekvashok avatar Dec 07 '21 13:12 abishekvashok

@onionymous has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

facebook-github-bot avatar Dec 09 '21 16:12 facebook-github-bot

@abishekvashok has updated the pull request. You must reimport the pull request before landing.

facebook-github-bot avatar Dec 09 '21 18:12 facebook-github-bot

@onionymous @0xedward thanks for all the help in this PR :)

abishekvashok avatar Dec 09 '21 18:12 abishekvashok

@0xedward has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

facebook-github-bot avatar Dec 09 '21 21:12 facebook-github-bot

@arthaud thanks for looking into this. I took a short break - sorry for the late reply.

Also, it looks like Jsonm.decode only does parsing into tokens. What happens if a label or section appears twice in the json? We would probably error on the wrong line?

That's why I implemented a counter inside which keeps track of preceding tokens. But I think it's hacky, and I trying to find a better way to do it and update this PR :)

abishekvashok avatar Jan 04 '22 05:01 abishekvashok

Closing in favour of https://github.com/facebook/pyre-check/pull/732

Also, it looks like Jsonm.decode only does parsing into tokens. What happens if a label or section appears twice in the json? We would probably error on the wrong line?

Fixed this in the new PR by changing the approach we took

abishekvashok avatar Apr 24 '23 09:04 abishekvashok