R Max Espinoza comments

Results: 56 comments by R Max Espinoza

possible inspirations?

> Your resources/recipes seem more geared towards volume than specificity...

My big fail on documenting the design. My goal was a tool to find and download datasets with a very...

Dataset Request: Mendeley datasets

Example: https://data.mendeley.com/datasets/c693yzczts/1

Dataset Request: Pokemon dataset

There is pokeapi too https://pokeapi.co/

Dataset Request: Pokemon dataset

Pokeapi uses data from veekun: https://github.com/veekun/pokedex/tree/master/pokedex/data/csv

Dataset Request: US GOV Datasets

This is a huge one, though. At this point I wonder if it may be better to use a micro-service for the search rather than local indexing and search capabilities....

Dataset Request: US GOV Datasets

We could follow the `brew tap` approach and move these big dataset aggregators to their own recipes repository, so users can choose whether to use it. A drawback is that the...

Dataset Request: R-datasets

The datasets' HTML description files can be linked via GitHub Pages, e.g.: http://vincentarelbundock.github.io/Rdatasets/doc/plm/EmplUK.html
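A minimal sketch of how such links might be built, assuming the `doc/<package>/<item>.html` layout seen in the single example URL holds for other datasets. The helper name `rdatasets_doc_url` is hypothetical and not part of any existing tool.

```python
# Hypothetical helper: the path layout is inferred from one example URL
# and may not hold for every dataset in the collection.
RDATASETS_DOC_BASE = "http://vincentarelbundock.github.io/Rdatasets/doc"


def rdatasets_doc_url(package: str, item: str) -> str:
    """Return the assumed GitHub Pages URL of a dataset's description page."""
    return f"{RDATASETS_DOC_BASE}/{package}/{item}.html"


print(rdatasets_doc_url("plm", "EmplUK"))
```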

Integrate darkrho/scrapy-inline-requests

> it's hacky, debugging it is kind of hard

I totally agree with that. :)

S3FilesStore can use a lot of memory

I use `CONCURRENT_ITEMS = 1` in these cases. I haven't verified how much it improves memory usage, though.
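For reference, a sketch of how that might look in a project's `settings.py`. `CONCURRENT_ITEMS` is a standard Scrapy setting (default 100) that caps how many items are processed in parallel in the item pipelines. The effect on S3FilesStore memory is an assumption, not something measured here.

```python
# settings.py (sketch): limit item pipeline concurrency.
# Scrapy's default is 100. With 1, items go through the pipelines one at a
# time, which may reduce how many file uploads are held in memory at once
# (assumption, not measured).
CONCURRENT_ITEMS = 1
```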

TypeError: init() got an unexpected keyword argument 'server'

Looks like you are missing this setting:

```
# Ensure all spiders share same duplicates filter through redis.
DUPEFILTER_CLASS = "scrapy_redis.dupefilter.RFPDupeFilter"
```