Tianle (Tim) Li
At the moment, I read the cache file and write the output file as JSONL, unless there is a more effective way of writing to a JSON file row by row.
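For reference, a minimal sketch of the row-by-row JSONL pattern described above. This is not the code in this PR: the helper names `load_cache` and `append_row` and the `question_id` key are hypothetical, chosen only for illustration.

```python
import json
from pathlib import Path


def load_cache(path, key="question_id"):
    """Collect the keys of rows already present in a JSONL cache file.

    `key` is a hypothetical field name; use whatever uniquely identifies a row.
    """
    seen = set()
    cache = Path(path)
    if not cache.exists():
        return seen
    with cache.open(encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            try:
                seen.add(json.loads(line)[key])
            except (json.JSONDecodeError, KeyError):
                # Skip a partially written or malformed line, e.g. after an interrupted run.
                continue
    return seen


def append_row(path, row):
    """Append one record as a single JSON line, so earlier rows are never rewritten."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(row, ensure_ascii=False) + "\n")
```

Appending one line per record means an interrupted run loses at most the row being written, and a restart can skip every key that `load_cache` returns.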
Also make sure to set `ulimit -n 16384` (or another suitable limit) when running with higher parallelism.
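If setting the limit in the shell is inconvenient, the same soft limit can probably be raised from inside the process on Unix-like systems with the standard `resource` module. The sketch below is an assumption about how one might do that, not part of this PR; `raise_open_file_limit` is a hypothetical helper name.

```python
import resource


def raise_open_file_limit(target=16384):
    """Try to raise the soft open-file limit to `target`; return the resulting soft limit."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # An unprivileged process cannot raise the soft limit above the hard limit.
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)
    # Nothing to do if the current soft limit is already high enough.
    if soft == resource.RLIM_INFINITY or soft >= target:
        return soft
    try:
        resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    except (ValueError, OSError):
        # Some platforms reject the request; keep the existing limit in that case.
        pass
    return resource.getrlimit(resource.RLIMIT_NOFILE)[0]
```

Calling `raise_open_file_limit()` once at startup, before spawning the parallel workers, would have roughly the effect of `ulimit -n 16384` for that process and its children.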
@andrewwan0131 @PranavB-11 I resolved the old comments because they are no longer relevant. We can start commenting on this new code, as it is pretty different from before. The pdfchat is...
71 files changed??
This PR was transferred to the internal repo.
@derixu Could you fix the formatting check error?
Overall, great PR! Very clean code. A few fixes and I will poke around more here and there and we should be good to go!
@derixu Thanks for the new changes! I found a very weird bug: when I 1. Begin running the labeler. 2. Stop (Ctrl+C) the labeling processes halfway through...
I ran a couple of tests; the classifier code should be good to go now.
@infwinston This PR is good to go! Thanks for the great work, @derixu!!