Dataset Inconsistency: Missing Tweets and Users Information
Hi there,
I appreciate the open-access dataset provided for the research paper on depression detection through social media. However, the dataset I downloaded does not match its description in the paper and in this repository. Here’s a summary of the issue:
Expected Data Volume: According to the paper and GitHub description, the dataset should contain a substantial number of tweets and user information spanning from 2009 to 2016 for D1, as well as comprehensive data for D2 and D3.
Observed Data Volume: In the downloaded dataset, the number of tweets and user records does not match the numbers stated in the documentation. Many anchor tweets have no corresponding user information, or no accompanying tweets from within one month of the anchor tweet.
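For reference, the kind of check behind this observation can be sketched roughly as follows. This is only an illustration under assumptions: the field names (`tweet_id`, `user_id`, `created_at`), the in-memory structures, and the 30-day window before the anchor are hypothetical and are not the dataset's actual schema or the paper's exact definition.

```python
from datetime import datetime, timedelta


def find_incomplete_anchors(anchors, users, timelines, window_days=30):
    """Return (missing_user, missing_recent) lists of anchor tweet ids.

    anchors:   list of dicts with hypothetical keys tweet_id, user_id, created_at
    users:     dict mapping user_id -> user record
    timelines: dict mapping user_id -> list of tweet datetimes
    """
    window = timedelta(days=window_days)
    missing_user, missing_recent = [], []
    for anchor in anchors:
        uid = anchor["user_id"]
        # Anchor tweet whose author has no user record at all.
        if uid not in users:
            missing_user.append(anchor["tweet_id"])
        # Anchor tweet with no other tweet in the window before it.
        start = anchor["created_at"] - window
        has_recent = any(
            start <= t < anchor["created_at"] for t in timelines.get(uid, [])
        )
        if not has_recent:
            missing_recent.append(anchor["tweet_id"])
    return missing_user, missing_recent


# Toy data, made up purely for illustration.
anchors = [
    {"tweet_id": 1, "user_id": "a", "created_at": datetime(2016, 1, 31)},
    {"tweet_id": 2, "user_id": "b", "created_at": datetime(2016, 1, 31)},
]
users = {"a": {}}
timelines = {"a": [datetime(2016, 1, 15)], "b": [datetime(2015, 1, 1)]}

missing_user, missing_recent = find_incomplete_anchors(anchors, users, timelines)
print(missing_user, missing_recent)
```

If the maintainers use a different definition of the one-month window, the `window_days` argument and the comparison would need to be adjusted accordingly.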
Could you please clarify whether the full dataset is available? If this is a reduced version of the dataset, it would be helpful to state that in the GitHub README or dataset description. Additionally, if the complete dataset can be accessed in some other way, or if specific steps are required to obtain it, I would greatly appreciate any guidance.
Thank you very much for your time and for making this valuable dataset available. I look forward to your response.
Best regards,