bigbang
Collect data from Google Groups
Most of the 'mailing lists' I am looking into for my analysis unfortunately use Google Groups. See specifically [email protected] https://groups.google.com/forum/?utm_source=digest&utm_medium=email#!forum/alaveteli-users
and https://groups.google.com/forum/#!forum/alaveteli-dev
Is it possible to make this work?
I am trying to get this to work: https://gist.github.com/punchagan/7947337
Would it make sense to integrate it with collect_mail.py?
(It is a browser-based scraper, though.)
So far, I've been unable to find a workable scraper for Google Groups.
When I've studied Google Groups, I've found somebody who is a long-time member of the list who is willing to export the archive from their inbox,
for example with Google Takeout https://takeout.google.com/settings/takeout
I think if there were a good scraper, like that punchagan script, it would certainly make sense to integrate it with BigBang.
I found this script: https://github.com/icy/google-group-crawler . It appears to download the messages from a Google Group, but I'm not able to import them as mbox into Thunderbird.
(In reply to Sebastian Benthall's comment of 8 February 2017 at 23:36, above.)
-- Harsh Sent from a GNU/Linux
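If the crawler leaves one raw message per file, one possible workaround for the Thunderbird import problem is to pack those files into a single mbox with Python's standard `mailbox` module. This is only a sketch under assumptions: that each downloaded file is a complete RFC 822 message, and that the directory layout is as shown. The function name `files_to_mbox` and the paths in the usage note are hypothetical and are not part of google-group-crawler or BigBang.

```python
import mailbox
from email import message_from_bytes
from pathlib import Path


def files_to_mbox(message_dir, mbox_path):
    """Append every raw message file under message_dir to an mbox file.

    Assumes one complete RFC 822 message per file (an assumption about
    the crawler's output). Returns the number of messages written.
    """
    # mailbox.mbox creates the file if it does not exist yet.
    box = mailbox.mbox(str(mbox_path))
    count = 0
    try:
        for path in sorted(Path(message_dir).rglob("*")):
            if not path.is_file():
                continue
            msg = message_from_bytes(path.read_bytes())
            # Files that yield no headers at all are probably not email.
            if not msg.keys():
                continue
            # add() writes the "From " separator line that mbox readers expect.
            box.add(msg)
            count += 1
    finally:
        # close() flushes pending changes to disk.
        box.close()
    return count
```

Usage would look like `files_to_mbox("downloaded-group/", "group.mbox")`, with the output file kept outside the input directory so it is not picked up as a message. Whether Thunderbird accepts the result depends on the downloaded files really being raw messages.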
Hey guys, thanks so much for looking into this, and apologies in advance for my non-techie contributions and questions ;) I could export the mentioned group traffic with Takeout. I am wondering, though, whether that export supports historical analysis of the entire group's traffic, or whether it only covers what I have in my own inbox history?
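One way to check how far back an export actually goes is to look at the message count and the earliest and latest `Date` headers in the mbox file. The sketch below uses only the Python standard library. The function name `mbox_date_range` and the file name in the usage note are hypothetical, and it assumes the export is an mbox file. Messages with a missing or unparseable date are counted but ignored for the range.

```python
import mailbox
from datetime import timezone
from email.utils import parsedate_to_datetime


def mbox_date_range(mbox_path):
    """Return (message_count, earliest, latest) for an mbox file.

    earliest and latest are timezone-aware datetimes, or None when no
    message has a parseable Date header.
    """
    box = mailbox.mbox(str(mbox_path), create=False)
    dates = []
    count = 0
    try:
        for msg in box:
            count += 1
            raw = msg.get("Date")
            if raw is None:
                continue
            try:
                parsed = parsedate_to_datetime(str(raw))
            except (TypeError, ValueError):
                # Malformed Date header: skip it for the range.
                continue
            if parsed.tzinfo is None:
                # Treat dates without a usable offset as UTC so that
                # all values can be compared with each other.
                parsed = parsed.replace(tzinfo=timezone.utc)
            dates.append(parsed)
    finally:
        box.close()
    if not dates:
        return count, None, None
    return count, min(dates), max(dates)
```

For example, `count, first, last = mbox_date_range("takeout-export.mbox")` followed by `print(count, first, last)`. If `first` is close to the date you joined the list rather than the date the group was created, the export most likely reflects only your own inbox.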
I found someone who built a scraper http://saturnboy.com/2010/03/scraping-google-groups/
Is this of any use for you guys or is there anything I can concretely do with that?
Also, I just used Google Takeout, but:
- it only gives you data for the Google Groups you are the owner of
- it only seems to provide contact data