
Functional data science

Results 24 dh-core issues

It's a way of emulating not-quite-quad-precision numbers using two doubles. The algorithm is interesting in itself and could have a few uses. But I think its main value is providing example...

help wanted
R&D: library
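To make the two-doubles idea concrete, here is a minimal sketch of double-double addition built on Knuth's error-free TwoSum. The names `DD`, `twoSum`, `fromDouble` and `addDD` are hypothetical and are not taken from dh-core; the addition shown is the simple variant, not a fully accurate one.

```haskell
-- A value represented as the unevaluated sum ddHi + ddLo of two Doubles.
-- All names here are illustrative assumptions, not dh-core's API.
data DD = DD { ddHi :: !Double, ddLo :: !Double }
  deriving (Show, Eq)

-- Knuth's TwoSum: s is the rounded sum and e the rounding error,
-- so that a + b == s + e holds exactly (barring overflow).
twoSum :: Double -> Double -> (Double, Double)
twoSum a b = (s, e)
  where
    s  = a + b
    bb = s - a
    e  = (a - (s - bb)) + (b - bb)

-- Lift a plain Double into the double-double representation.
fromDouble :: Double -> DD
fromDouble x = DD x 0

-- Simple double-double addition: add the high parts exactly,
-- fold the low parts into the error term, then renormalise.
addDD :: DD -> DD -> DD
addDD (DD ah al) (DD bh bl) = DD hi lo
  where
    (s, e)   = twoSum ah bh
    (hi, lo) = twoSum s (e + al + bl)
```

For example, `(1 + 1e-20) - 1` evaluates to `0` in plain `Double` arithmetic, whereas `addDD (addDD (fromDouble 1) (fromDouble 1e-20)) (fromDouble (-1))` keeps the small term and yields `DD 1.0e-20 0.0`.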

I'm working on a hasktorch example with fashion-mnist, and @stites suggested adding the dataset to `datasets`, which I think is pretty useful! Referring to this issue over at `hasktorch` :...

enhancement
R&D: applications
R&D: library

When running `stack bench` I get

```
bench: /Users/ocramz/.cache/datasets-hs/cifar-10-imagefolder/Truck: getDirectoryContents:openDirStream: does not exist (No such file or directory)
```

I guess it's a matter of copying the test data in...

bug
infrastructure

Unify dense and sparse linear algebra, for a given underlying vector type, under a single interface. Blocked by #1 and #3.

enhancement
R&D: library
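One way such a shared interface might look is sketched below. This is a toy under stated assumptions: the `Vec` class, the `Dense` and `Sparse` types, and the choice of lists and `IntMap` as backing stores are all hypothetical, and nothing here reflects the design actually discussed in the issue.

```haskell
import qualified Data.IntMap.Strict as IM

-- Hypothetical common interface: every vector exposes its dimension
-- and its nonzero entries as (index, value) pairs in ascending index order.
class Vec v where
  dim      :: v -> Int
  nonzeros :: v -> [(Int, Double)]

-- Dense vector backed by a list (an assumption made for brevity).
newtype Dense = Dense [Double]

-- Sparse vector: declared dimension plus a map from index to value.
data Sparse = Sparse Int (IM.IntMap Double)

instance Vec Dense where
  dim (Dense xs)      = length xs
  nonzeros (Dense xs) = [ (i, x) | (i, x) <- zip [0 ..] xs, x /= 0 ]

instance Vec Sparse where
  dim (Sparse n _)      = n
  nonzeros (Sparse _ m) = [ p | p@(_, x) <- IM.toAscList m, x /= 0 ]

-- Dot product written once against the interface; it merges the two
-- ascending index streams, so dense and sparse operands can be mixed.
dot :: (Vec u, Vec v) => u -> v -> Double
dot u v = go (nonzeros u) (nonzeros v)
  where
    go xs@((i, x) : xs') ys@((j, y) : ys')
      | i == j    = x * y + go xs' ys'
      | i < j     = go xs' ys
      | otherwise = go xs ys'
    go _ _ = 0

-- Euclidean norm, derived from dot and therefore representation-agnostic.
norm2 :: Vec v => v -> Double
norm2 v = sqrt (dot v v)
```

With these definitions, `dot (Dense [1, 0, 2]) (Sparse 3 (IM.fromList [(0, 3), (2, 4)]))` mixes both representations in one call. A real design would likely abstract over the element type and use an unboxed vector type rather than lists.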

Possibly a binary in the app/ folder with an end-to-end workflow. Then we can split anything good that comes out of this back into the main library.

enhancement
help wanted
good first issue
R&D: applications
documentation

Looking over the `Dataloader` code, I immediately thought about integrating a private dataset to play with some Haskell code. This made me wonder if anyone has thought about adding a...

enhancement
help wanted
R&D: library

Medium-long term: the loading/parsing machinery is growing in size and scope (see #22, #29), so those functions and types could be gathered in a separate `datasets-core` package....

enhancement
help wanted
R&D: library
low priority

The datasets downloader could use the above improvements: verifying downloads with hashes, and multithreading large downloads. I've written a version of the first feature in the `Setup.ht` for a personal...

enhancement
help wanted
R&D: library

The Netflix Prize dataset uses a custom parser because a single data example does not fit into one dataset row (as it would in CSV data) but instead has a custom "stanza-based" format....

enhancement
help wanted
good first issue
R&D: library
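As an illustration of what parsing a stanza-based format involves, the sketch below assumes each stanza is a header line of the form `movieId:` followed by rows of `customerId,rating,date`. That layout, and every name in the code (`Stanza`, `Rating`, `parseStanzas`, and so on), is an assumption for illustration and is not the parser in `Numeric.Datasets.Netflix`.

```haskell
import Data.Char (isDigit)
import Data.Maybe (isJust)

-- One rating row; the date is kept as raw text in this sketch.
data Rating = Rating { userId :: Int, stars :: Int, date :: String }
  deriving (Show, Eq)

-- One stanza: a movie id followed by all of its rating rows.
data Stanza = Stanza { movieId :: Int, ratings :: [Rating] }
  deriving (Show, Eq)

-- Split a string on every occurrence of a separator character.
splitOn :: Char -> String -> [String]
splitOn c s = case break (== c) s of
  (a, [])       -> [a]
  (a, _ : rest) -> a : splitOn c rest

-- A header line is one or more digits followed by a single colon.
parseHeader :: String -> Maybe Int
parseHeader l = case span isDigit l of
  (ds@(_ : _), ":") -> Just (read ds)
  _                 -> Nothing

-- A rating row has exactly three comma-separated fields.
parseRating :: String -> Maybe Rating
parseRating l = case splitOn ',' l of
  [u, r, d]
    | not (null u), all isDigit u, not (null r), all isDigit r
    -> Just (Rating (read u) (read r) d)
  _ -> Nothing

-- Group lines into stanzas: each header owns the rows up to the next header.
parseStanzas :: String -> Either String [Stanza]
parseStanzas = go . filter (not . null) . lines
  where
    go []       = Right []
    go (l : ls) = case parseHeader l of
      Nothing -> Left ("expected a header line, got: " ++ l)
      Just m  -> do
        let (body, rest) = break (isJust . parseHeader) ls
        rs <- traverse parseRow body
        (Stanza m rs :) <$> go rest
    parseRow r = maybe (Left ("bad rating line: " ++ r)) Right (parseRating r)
```

The point of the sketch is that one logical example spans several physical lines, which is why a row-oriented CSV parser cannot be reused directly.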

The Netflix dataset still seems to be publicly available via Kaggle: https://www.kaggle.com/netflix-inc/netflix-prize-data , contrary to the comment in the corresponding data loader: https://github.com/DataHaskell/dh-core/blob/bd06214e092bc53e5bb8b05afc8bc3420ff96886/datasets/src/Numeric/Datasets/Netflix.hs#L9