dblp
Parse the dblp data into a structured format for experimentation.
I am facing a problem while running pipeline.py; the problem is in the config module.
When I tried to run the pipeline, paper.csv was generated from Miner-Papertxt (about 2.2G). The paper.csv file grew too large (it exceeded 1.7T), but my computer has only about 2T...
Could you please add descriptions for each file in the repdocs module? I'm trying to use this parser for my projects and am unclear what all the files contain and...
I was able to run the whole project, but I am not sure where to get the author ID to author name mapping.
It appears that `Task`s with no output are supposed to implement a custom `complete()` method, since completion normally means all the output files exist. We should either make the outputs...
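To make the convention concrete, here is a minimal sketch of the pattern. It assumes a hypothetical `Task` base class whose default `complete()` checks that every output file exists; the classes `WriteReport` and `CleanUp` are invented for illustration and are not part of this package or of the framework it uses.

```python
import os
import tempfile


class Task:
    """Hypothetical stand-in for the pipeline's task base class."""

    def output(self):
        # Paths of the files this task produces; empty means "no output".
        return []

    def complete(self):
        # Default rule: the task is complete when all output files exist.
        outputs = self.output()
        if not outputs:
            # With nothing to check, the task can never look finished.
            return False
        return all(os.path.exists(path) for path in outputs)

    def run(self):
        raise NotImplementedError


class WriteReport(Task):
    """Task with one output file; the default complete() is enough."""

    def __init__(self, path):
        self.path = path

    def output(self):
        return [self.path]

    def run(self):
        with open(self.path, "w") as handle:
            handle.write("done\n")


class CleanUp(Task):
    """Task with no output files; it tracks completion itself."""

    def __init__(self):
        self._done = False

    def run(self):
        self._done = True

    def complete(self):
        # Custom rule replacing the file-existence check.
        return self._done


report = WriteReport(os.path.join(tempfile.mkdtemp(), "report.txt"))
cleanup = CleanUp()
for task in (report, cleanup):
    if not task.complete():
        task.run()
```

The in-memory flag in `CleanUp` only lasts for one process. A task that must stay complete across runs would need some persistent marker, which is effectively the same as giving it an output.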
From the data, it appears the AMiner group did not perform any name disambiguation. This has led to a dataset with quite a few duplicate author records. This package currently...
While the AMiner group already has a co-authorship network provided, it unfortunately does not allow for filtering by year ranges, which is a key feature of this library. Therefore it...
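As a rough illustration of year-range filtering, the sketch below builds weighted co-authorship edges from paper records. The record layout (dicts with `year` and `authors` keys) and the function name `coauthor_edges` are assumptions made for this example, not the package's actual data model or API.

```python
from collections import Counter
from itertools import combinations


def coauthor_edges(papers, start_year, end_year):
    """Count co-authored papers per author pair within [start_year, end_year]."""
    edges = Counter()
    for paper in papers:
        if start_year <= paper["year"] <= end_year:
            # Sort so each undirected pair has one canonical key.
            authors = sorted(set(paper["authors"]))
            for pair in combinations(authors, 2):
                edges[pair] += 1
    return edges


# Toy records with hypothetical author IDs.
papers = [
    {"year": 2001, "authors": ["a1", "a2"]},
    {"year": 2005, "authors": ["a1", "a2", "a3"]},
    {"year": 2010, "authors": ["a2", "a3"]},
]
edges = coauthor_edges(papers, 2000, 2005)
for (left, right), weight in sorted(edges.items()):
    print(left, right, weight)
```

The edge weight here is the number of papers the pair co-authored in the range; the 2010 paper is excluded, so it does not contribute to the `a2`/`a3` edge.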
Tests should rely on a small, representative portion of the real dataset.
For a complete dataset, generate a summary of salient characteristics, such as:
- number of nodes and edges for each graph, diameter, avg. degree
- number of documents, terms, and...
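The graph statistics named above (node count, edge count, average degree, diameter) could be computed along these lines. This is a sketch under assumptions: the graph is an undirected edge list without self-loops, `graph_summary` is a hypothetical name, and the diameter is taken as the longest shortest path within any connected component.

```python
from collections import deque


def graph_summary(edges):
    """Summarize an undirected graph given as (u, v) pairs without self-loops."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)

    num_nodes = len(adjacency)
    # Each undirected edge appears in two neighbor sets.
    num_edges = sum(len(neighbors) for neighbors in adjacency.values()) // 2
    avg_degree = 2 * num_edges / num_nodes if num_nodes else 0.0

    def eccentricity(source):
        # Breadth-first search: distance to the farthest reachable node.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbor in adjacency[node]:
                if neighbor not in dist:
                    dist[neighbor] = dist[node] + 1
                    queue.append(neighbor)
        return max(dist.values())

    diameter = max((eccentricity(node) for node in adjacency), default=0)
    return {
        "nodes": num_nodes,
        "edges": num_edges,
        "avg_degree": avg_degree,
        "diameter": diameter,
    }


# Toy path graph a - b - c - d.
summary = graph_summary([("a", "b"), ("b", "c"), ("c", "d")])
print(summary)
```

Running one breadth-first search per node is quadratic in the worst case, so this exact approach only suits a small sample; a full dataset would likely need a graph library or an approximate diameter.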