PigeonJS

PigeonJS is a tool that demonstrates the path-extraction process described in the paper "A General Path-Based Representation for Predicting Program Properties" (PLDI 2018): https://arxiv.org/pdf/1803.09544.pdf

PigeonJS is based on UnuglifyJS.

Requirements

  • Node.js (http://nodejs.org/)
  • npm (https://www.npmjs.com/)

On Debian or Ubuntu, both can be installed with:

sudo apt-get install nodejs npm

Setup

git clone https://github.com/urialon/PigeonJS
cd PigeonJS
sudo npm install

Path-Extraction

bin/unuglifyjs uri.js --nice_formatting --extract_features --no_hash --max_path_length=<max_length> --max_path_width=<max_width>

This extracts paths between variables and the other elements in the file uri.js. Available options:

  • omitting --no_hash: each path is hashed, which lowers memory consumption
  • --semi_paths: extract paths from variables to their non-leaf ancestor nodes
  • --include_giv_giv: include paths between AST terminals that are not variables, such as constants
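To make the length and width limits concrete, here is a minimal sketch of leaf-to-leaf path extraction on a toy tree. It is not the PigeonJS implementation. It assumes that path length counts edges, that path width is the child-index gap at the top node of the path, and that every pair of leaves is considered (no distinction between variables and other terminals). The Node class, the node kind names, the "^" (up) and "_" (down) separators, and the CRC32 hash are all illustrative choices.

```python
import zlib
from itertools import combinations


class Node:
    """Toy AST node; leaves have no children and carry a value."""

    def __init__(self, kind, children=(), value=None):
        self.kind = kind
        self.children = list(children)
        self.value = value


def leaves_with_routes(node, route=()):
    """Yield (leaf, route); route lists (ancestor, child_index) steps from the root."""
    if not node.children:
        yield node, route
        return
    for index, child in enumerate(node.children):
        yield from leaves_with_routes(child, route + ((node, index),))


def extract_paths(root, max_length, max_width, hash_paths=False):
    """Return (source_value, path, target_value) for each leaf pair within the limits."""
    results = []
    for (src, up), (dst, down) in combinations(leaves_with_routes(root), 2):
        # Count the steps shared by both root-to-leaf routes. Two distinct
        # leaves always diverge before either route ends, so this terminates.
        shared = 0
        while up[shared] == down[shared]:
            shared += 1
        # Assumed definitions: edges travelled up plus edges travelled down,
        # and the gap between the two child indices taken at the top node.
        length = (len(up) - shared) + (len(down) - shared)
        width = abs(up[shared][1] - down[shared][1])
        if length > max_length or width > max_width:
            continue
        up_kinds = [src.kind] + [n.kind for n, _ in reversed(up[shared:])]
        down_kinds = [n.kind for n, _ in down[shared + 1:]] + [dst.kind]
        path = "^".join(up_kinds) + "_" + "_".join(down_kinds)
        if hash_paths:
            # Stand-in hash: a fixed-size integer replaces the path string.
            path = zlib.crc32(path.encode("utf-8"))
        results.append((src.value, path, dst.value))
    return results


# Toy tree for the statement: d = a + 1
tree = Node("Assign", [
    Node("SymbolRef", value="d"),
    Node("Binary", [Node("SymbolRef", value="a"), Node("Number", value="1")]),
])

for triple in extract_paths(tree, max_length=3, max_width=1):
    print(triple)
```

Under these assumptions, lowering max_length to 2 keeps only the pair (a, 1), because both paths that start at d have to climb to the root and descend two levels.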

python extract_features.py --dir <training_dir> --max_path_length <max_length> --max_path_width <max_width> > training 2> out.err

This command runs the Node.js scripts on <training_dir> using multiple processes, which is much faster for large datasets on a machine with many cores. Standard output is redirected to the file training and standard error to out.err.
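The general pattern of running one extractor process per file in parallel can be sketched as follows. This is not the code of extract_features.py. The helper names (build_command, run_extractor, extract_directory), the "{file}" placeholder, the "*.js" file filter, the timeout, and the worker count are all hypothetical. The sketch uses a thread pool in which each thread waits on its own child process, so the extraction work itself still runs in separate processes.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def build_command(max_length, max_width):
    """Command template using the flags shown above; "{file}" is filled in per file."""
    return [
        "bin/unuglifyjs", "{file}", "--extract_features", "--no_hash",
        f"--max_path_length={max_length}", f"--max_path_width={max_width}",
    ]


def run_extractor(command_template, source_file, timeout_seconds=60):
    """Run one child process on one file; return its stdout, or "" on failure."""
    command = [str(source_file) if part == "{file}" else part
               for part in command_template]
    try:
        completed = subprocess.run(
            command, capture_output=True, text=True, timeout=timeout_seconds
        )
    except (OSError, subprocess.TimeoutExpired):
        # Missing executable or a file that took too long: skip it.
        return ""
    return completed.stdout if completed.returncode == 0 else ""


def extract_directory(command_template, directory, workers=4):
    """Process every .js file under directory, with up to `workers` child processes at once."""
    files = sorted(Path(directory).rglob("*.js"))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        outputs = list(pool.map(
            lambda source_file: run_extractor(command_template, source_file),
            files,
        ))
    # Drop files that failed or produced no output.
    return [output for output in outputs if output]
```

Joining the returned strings and writing them to a file would mirror the "> training" redirection in the command above. Skipping failed files silently is a simplification; a real run would probably want to log them.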

Nice2Predict

To install the Nice2Predict framework, follow the instructions at https://github.com/eth-srl/Nice2Predict.

Data

JavaScript dataset: https://www.dropbox.com/s/nynvowu8wobdagw/pldi18_js.tar.gz?dl=0

Python dataset: https://www.dropbox.com/s/4q08j78f7hdbiob/python50starsplus.tar.gz?dl=0

Trained Code Embeddings

The trained embeddings can be downloaded from: https://s3.amazonaws.com/pigeon-pldi18/js_vectors_dim150.tar.gz