PyCG
<Performance Bug>: It costs too much time to analyze DL packages (e.g., numpy)
Hello, it costs me too much time to analyze deep learning packages (numpy, tensorflow, etc.), which is unacceptable. Do you have any ideas on how to optimize PyCG to reduce its running time?
As you mentioned here, it seems possible to improve the complexity. Can you give me some suggestions on how to optimize `DefinitionManager.complete_definitions()`? (I don't mind sacrificing some precision in exchange for significantly less execution time.)
I'm looking forward to receiving your constructive suggestions. Thanks!
Optimizing this part is a work in progress. It basically implements a transitive closure of the assignment graph. However, this can be implemented in a lazy manner -- i.e. whenever we look for the functions that can be pointed to by a certain identifier, we can update the assignment graph with new edges towards the results.
I have also added the `--max-iterations` CLI argument, which limits the fix-point computation to a certain number of iterations. The quickest way to improve performance with a very small sacrifice in precision & recall would be to use this argument with a small numerical value (e.g. `--max-iterations 1`).
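For illustration only, here is a minimal sketch of the two ideas above: a fix-point transitive closure over a points-to/assignment map that can be capped by a `max_iterations` argument, and a lazy variant that only resolves an identifier when it is looked up and caches the newly discovered edges. This is not PyCG's actual implementation; all names (`points_to`, `complete_definitions_fixpoint`, `resolve_lazy`) are hypothetical.

```python
def complete_definitions_fixpoint(points_to, max_iterations=None):
    """Fix-point transitive closure over an assignment graph (sketch).

    `points_to` maps each identifier to the set of names it may point to.
    Edges are propagated until nothing changes, or until `max_iterations`
    passes have been performed (trading precision/recall for speed).
    """
    iteration = 0
    changed = True
    while changed and (max_iterations is None or iteration < max_iterations):
        changed = False
        iteration += 1
        for name, targets in list(points_to.items()):
            # Pull in everything the current targets point to.
            new_targets = set(targets)
            for target in targets:
                new_targets |= points_to.get(target, set())
            if new_targets != targets:
                points_to[name] = new_targets
                changed = True
    return points_to


def resolve_lazy(points_to, name, _visiting=None):
    """Lazy alternative: resolve a single identifier on demand and cache the
    discovered edges, instead of closing the whole graph up front.
    (Cycle handling is deliberately simplified for this sketch.)"""
    if _visiting is None:
        _visiting = set()
    if name in _visiting:
        return set()
    _visiting.add(name)
    resolved = set()
    for target in points_to.get(name, set()):
        resolved.add(target)
        resolved |= resolve_lazy(points_to, target, _visiting)
    points_to[name] = resolved  # cache the new edges for later lookups
    return resolved
```

With a cap of one iteration the closure may stop early on long assignment chains, which corresponds to the small loss in precision & recall mentioned above.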
Thanks a lot! Your reply is quick and quite useful!
Hi,
When I set `--max-iterations` to 1, the execution time is much lower, and I appreciate it very much!
However, consider the following code:
```python
from sklearn.preprocessing import LabelBinarizer

label_binarizer = LabelBinarizer()
image_labels = label_binarizer.fit_transform(label_list)
```
PyCG can only extract `sklearn.preprocessing.LabelBinarizer`, while `LabelBinarizer.fit_transform` cannot be extracted.
I'm curious about the reason: does the small value of `--max-iterations` contribute to this, or is PyCG unable to extract this call chain? @vitsalis
Closing due to archival of repository.