FActScore
About the enwiki-20230401
After downloading the data and setting up the environment, I ran this command:
python -m factscore.factscorer --input_path "/root/FNDLLM/test.jsonl" --model_name "retrieval+llama+npm" --use_atomic_facts --data_dir '/root/.cache/factscore/'
and got this error:
File "/root/anaconda3/envs/factstore/lib/python3.7/site-packages/factscore/retrieval.py", line 57, in build_db
    with open(data_path, "r") as f:
FileNotFoundError: [Errno 2] No such file or directory: '/root/.cache/factscore/enwiki-20230401.jsonl'
I could not find enwiki-20230401.jsonl in the downloaded data. Where is it?
Hi @Toblame, thanks for your interest in our work. What command did you use to download the data?
The cache is stored by default in the folder where you ran the download command, see https://github.com/shmsw25/FActScore/blob/main/factscore/download_data.py#L119
Can you confirm that the other cache files are present in /root/.cache for you?
Thank you, I have solved this problem. However, I ran into another problem: 'AssertionError: topic in your data (topic) is likely to be not a valid title in the DB.' This happened with both my own data and the FActScore labeled data.
Hi @Toblame,
How did you solve this problem? The download_data.py script only downloads an enwiki-20230401.db file; I cannot find a .jsonl file in the cache. TIA
I just started over, checked where the cache files were located, and ran the command again. However, I still have the other problem described above.
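For anyone hitting the same FileNotFoundError, here is a minimal sketch of that location check. It assumes the directory passed via --data_dir should contain enwiki-20230401.db (the file name mentioned earlier in this thread). The helper name missing_cache_files is hypothetical and not part of FActScore.

```python
from pathlib import Path


def missing_cache_files(data_dir, expected=("enwiki-20230401.db",)):
    """Return the expected file names that are absent from data_dir.

    The default file name is taken from this thread; it is an assumption
    about what the scorer looks for, so adjust it to your setup.
    """
    root = Path(data_dir).expanduser()
    return [name for name in expected if not (root / name).is_file()]


if __name__ == "__main__":
    # Path taken from the command in this thread.
    missing = missing_cache_files("/root/.cache/factscore")
    print("missing:", missing if missing else "nothing")
```

If the .db file is reported missing, the download probably landed in a different folder than the one given to --data_dir.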
Hi @Toblame,
Thank you and I have solve this problem, however I meet another problem 'AssertionError: topic in your data (topic) is likely to be not a valid title in the DB.'
You are likely getting this error because you have set topic in some rows of the input JSONL file to the string "topic". For this to work, topic must be equal to some article title (like "Billy Conigliaro") which is present in the database.
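To check this before running the scorer, a rough sketch like the following may help. It assumes the .db file is a SQLite database with a table named documents that has a title column, and that every input row carries a topic field. The schema and the helper name invalid_topics are assumptions, so verify them against your copy of the database.

```python
import json
import sqlite3


def invalid_topics(jsonl_path, db_path):
    """Return topics from the JSONL file that have no matching title in the DB.

    Assumes a SQLite table `documents` with a `title` column (not confirmed
    in this thread) and a `topic` key in each JSONL row.
    """
    conn = sqlite3.connect(db_path)
    try:
        bad = []
        with open(jsonl_path, encoding="utf-8") as f:
            for line in f:
                if not line.strip():
                    continue  # skip blank lines
                topic = json.loads(line).get("topic")
                row = conn.execute(
                    "SELECT 1 FROM documents WHERE title = ? LIMIT 1", (topic,)
                ).fetchone()
                if row is None:
                    bad.append(topic)
        return bad
    finally:
        conn.close()
```

Any value this returns, such as the literal string "topic", would need to be replaced with an exact article title from the database.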