Simone Persiani

Results: 3 issues by Simone Persiani

Hi @Harshdeep1996! I recently discovered some annoying problems in the output dataset, which is stored in the Parquet format.

### First problem

The 'metadata_file' column is stored as a...

bug

Hi @Harshdeep1996, I'm working on the parent dataset (the **'citations_from_wikipedia.zip'** file [available on Zenodo](https://zenodo.org/record/3940692#.YBgwQZeg_ct)). I found some duplicated rows (approximately 2,000 in each Parquet partition file), meaning that...

bug