
source{d} datasets ("big code") for source code analysis and machine learning on source code

Results 27 datasets issues

The link to the Programming Language Identifiers dataset directs me to a broken URL: https://github.com/src-d/datasets/tree/master/Identifiers -> https://drive.google.com/open?id=1wZR5zF1GL1fVcA1gZuAN_9rSLd5ssqKV

Hello Everyone, I am trying to access PGA's assembly files in order to train my language models. However, when I run "pga list siva" to see what is in the...

Hello, for about a month now I have been getting "connection refused" when using the tool. For example, the command `pga list siva` results in the following error message: `WARN[0001] could not...

When trying to download the dataset, clicking on the link led me to an empty Google Drive folder...

Update `go-git` and `core-retrieval` dependencies on pga to make it work with the latest borges version (database models).

Requires #117
* [x] create batch job that uses pga-create container to index repositories
* [x] run indexing on the repositories
* [ ] add job description to charts repository...

- [ ] Remove pga.sourced.tech (it is a nightmare to maintain yet another site)
- [ ] Remove web folder
- [ ] Remove examples folder (examples can be part...

enhancement

The current PGA index format does not make it possible to tell under which references a given URL is written. For example, `tensorflow/tensorflow` belongs to 2 siva files, the first has two...

The go-git version in use has a bug decoding PGP signatures and hangs indefinitely. This affects the indexing process. ``` goroutine 1943 [runnable]: github.com/src-d/datasets/PublicGitArchive/pga-create/vendor/gopkg.in/src-d/go-git.v4/plumbing/object.(*Commit).Decode(0xc12d0944b0, 0x1477540, 0xc07ca89ec0, 0x0, 0x0) /go/src/github.com/src-d/datasets/PublicGitArchive/pga-create/vendor/gopkg.in/src-d/go-git.v4/plumbing/object/commit.go:191 +0x3d3 github.com/src-d/datasets/PublicGitArchive/pga-create/vendor/gopkg.in/src-d/go-git.v4/plumbing/object.DecodeCommit(0x7faa2f8f2110, 0xc01a3500c0,...

bug

The PGA tool used to download dataset repositories is currently incompatible with Windows. One problem is #100, which is related to the Windows path separator. After fixing this problem all its...
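
For readers unfamiliar with this class of bug, the following is a minimal Go sketch of how path-separator problems are commonly avoided. It is not taken from the pga source: the helper names `toSlashPath` and `remoteKey` and the example path are hypothetical, and the actual fix for #100 may look different. The idea is that slash-separated keys (for example, paths on a remote server) are built with the standard `path` package, which always uses `/`, instead of `path/filepath`, which uses the host OS separator.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// toSlashPath is a hypothetical helper. It rewrites Windows-style
// backslash separators as forward slashes, regardless of the host OS.
func toSlashPath(p string) string {
	return strings.ReplaceAll(p, `\`, "/")
}

// remoteKey is a hypothetical helper that builds a slash-separated key.
// It relies on package path (always "/") rather than path/filepath,
// whose separator depends on the operating system.
func remoteKey(elems ...string) string {
	cleaned := make([]string, 0, len(elems))
	for _, e := range elems {
		cleaned = append(cleaned, toSlashPath(e))
	}
	return path.Join(cleaned...)
}

func main() {
	// The input mixes a backslash-separated prefix with a file name.
	fmt.Println(remoteKey(`dir\sub`, "file.siva")) // prints "dir/sub/file.siva"
}
```

Local file system paths would still go through `path/filepath`, so that the same code produces valid paths on both Windows and Unix-like systems.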