
Help wanted

Open Divine-sh opened this issue 1 year ago • 1 comments

Hello, I noticed that the file train_external_function_name_vocab.jsonl is needed when training the model. Is it generated by 1BuildExternalVocab.py? If so, could you please provide this file? If not, could you share some details on how to generate train_external_function_name_vocab.jsonl? Thank you.

Divine-sh avatar Jun 29 '23 09:06 Divine-sh

As we described in Section IV.A.2, each node representing an external function in the FCG is one-hot encoded based on its function name, and we limit the vocabulary of external functions to the 10,000 names most frequently used in the training dataset. Therefore, it is quite easy to obtain train_external_function_name_vocab.jsonl: simply count the frequency of external function names across all samples in the training set, rank them, and select the top-K names.

ryderling avatar Jun 29 '23 09:06 ryderling
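
The counting procedure described in the reply above could be sketched roughly as follows. This is only an illustration, not the repository's actual script: the helper names `build_external_vocab` and `write_vocab_jsonl`, the input shape (one iterable of external function names per training sample), and the output layout (one JSON object per line with `index` and `name` keys) are all assumptions, so the real train_external_function_name_vocab.jsonl may need a different schema. The sketch also counts every occurrence of a name; counting each name once per sample would be an equally plausible reading of "frequency".

```python
import json
from collections import Counter
from typing import Iterable, List


def build_external_vocab(samples: Iterable[Iterable[str]], top_k: int = 10000) -> List[str]:
    """Return the top_k most frequent external function names.

    `samples` is assumed (hypothetically) to yield, for each training
    sample, the external function names found in its FCG.
    """
    counts = Counter()
    for names in samples:
        # Every occurrence is counted; wrap `names` in set() to count
        # each name at most once per sample instead.
        counts.update(names)
    # most_common() sorts by descending count.
    return [name for name, _ in counts.most_common(top_k)]


def write_vocab_jsonl(vocab: List[str], path: str) -> None:
    """Write one JSON object per line (the schema here is an assumption)."""
    with open(path, "w", encoding="utf-8") as f:
        for index, name in enumerate(vocab):
            f.write(json.dumps({"index": index, "name": name}) + "\n")


if __name__ == "__main__":
    # Toy data standing in for names extracted from the training set.
    toy_samples = [
        ["CreateFileA", "ReadFile"],
        ["ReadFile", "CloseHandle"],
        ["ReadFile", "CreateFileA"],
    ]
    vocab = build_external_vocab(toy_samples, top_k=2)
    write_vocab_jsonl(vocab, "train_external_function_name_vocab.jsonl")
```

With the paper's setting, `top_k` would be 10,000; how the external function names are extracted from each sample is left out here because it depends on the dataset's format.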