Ed Summers

206 comments by Ed Summers

Yes, I was thinking the same thing :-D I guess that would be a separate issue. @nuest, were you thinking of a bagit library for node like [bagit-js](https://www.versioneye.com/nodejs/bagit-js/1.0.0)?

@singh-95 The search API returns retweets that match a query, but the web-accessible search interface does not, at least to my knowledge.

Do you have the JSON for the tweet?

Interesting @nemobis! Have you compared the results of each approach at all?

I think there are (at least) two approaches to this: 1. mailbagit could use [py-wacz](https://github.com/webrecorder/py-wacz) to bundle the WARC files in a WACZ file, and then add each mailto: URI...

I'm interested in hearing where the need for this optimization arose. Was it a problem generating the CSV, or reading the generated CSV in another application? It sounds like the...

Would being able to write to parquet help in situations like that?

Being able to output as parquet would be nice too--even if it's called twarc-csv :-)

I was going to say that pandas has many output formats. It might not be hard to add parquet, pickle, hdf, sql, excel, json, html, feather, latex, stata, gbq, markdown,...

Hi @cKlee, I'm really glad you like the idea. I'm thinking of adding support for marcspec to pymarc, and this would help considerably I think. Thanks for the excellent comments,...