collect data from google group

Open Kerstiru opened this issue 7 years ago • 5 comments

Most of the 'mailing lists' I am looking into for my analysis unfortunately use Google Groups. See specifically [email protected] https://groups.google.com/forum/?utm_source=digest&utm_medium=email#!forum/alaveteli-users

and https://groups.google.com/forum/#!forum/alaveteli-dev

Is it possible to make this work?

Feb 08 '17 10:02 Kerstiru

I'm trying to get this to work: https://gist.github.com/punchagan/7947337

Would it make sense to integrate it with collect_mail.py?

(It is a browser-based scraper, though.)

Feb 08 '17 10:02 davidberra

So far, I've been unable to find a workable scraper for Google Groups.

When I've studied Google Groups, I've found somebody who is a long-time member of the list and who is willing to export the archive from their inbox, for example with Google Takeout: https://takeout.google.com/settings/takeout

I think if there were a good scraper, like that punchagan script, it would certainly make sense to integrate it with BigBang.

Feb 08 '17 18:02 sbenthall

I found this script: https://github.com/icy/google-group-crawler. It appears to download the messages from a Google Group, but I'm not able to import them into Thunderbird as an mbox.
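If the downloaded files turn out to be individual raw messages rather than a single mbox file, one thing to try is packing them into an mbox with Python's standard `mailbox` module. This is only a minimal sketch: it assumes a directory holding one raw RFC 822 message per file, which may not match what the crawler actually produces, and `build_mbox` and both paths are hypothetical names used for illustration.

```python
import mailbox
from email import message_from_bytes
from pathlib import Path


def build_mbox(message_dir, mbox_path):
    """Pack every file in message_dir into a single mbox file.

    Assumption: each file holds exactly one raw RFC 822 message.
    Write the mbox outside message_dir so it is not picked up as input.
    Returns the number of messages added.
    """
    box = mailbox.mbox(str(mbox_path))  # created if it does not exist yet
    added = 0
    try:
        box.lock()
        for path in sorted(Path(message_dir).iterdir()):
            if not path.is_file():
                continue
            msg = message_from_bytes(path.read_bytes())
            # Skip files that do not look like mail at all.
            if msg.get("From") is None and msg.get("Date") is None:
                continue
            # mailbox writes the "From " separator line for each message.
            box.add(msg)
            added += 1
    finally:
        box.close()  # flushes to disk and releases the lock
    return added
```

Calling `build_mbox("downloaded_messages", "group.mbox")` would then give a single file to try importing into Thunderbird or passing to other mbox-based tools.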

Feb 08 '17 18:02 hargup

Hey guys, thanks so much for looking into this, and apologies upfront for my non-techie contributions and questions ;) I could export the mentioned group traffic with Takeout. I'm wondering, though, whether that allows for historical analysis of the entire group traffic, or only of what I have in my inbox history?
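One rough way to check how much history an export covers is to count its messages and look at the earliest and latest `Date` headers. The sketch below uses only Python's standard library and assumes the export is a single mbox file. `archive_span` is a hypothetical helper name, and messages without a parseable date are counted but ignored for the date range.

```python
import mailbox
from datetime import timezone
from email.utils import parsedate_to_datetime


def archive_span(mbox_path):
    """Return (message_count, earliest_date, latest_date) for an mbox file."""
    dates = []
    count = 0
    box = mailbox.mbox(str(mbox_path), create=False)
    try:
        for msg in box:
            count += 1
            raw = msg.get("Date")
            if raw is None:
                continue
            try:
                when = parsedate_to_datetime(str(raw))
            except (TypeError, ValueError):
                continue  # unparseable Date header
            if when.tzinfo is None:
                # Treat dates without a usable offset as UTC so they can be compared.
                when = when.replace(tzinfo=timezone.utc)
            dates.append(when)
    finally:
        box.close()
    if not dates:
        return count, None, None
    return count, min(dates), max(dates)
```

If the earliest date is close to when you joined the list, the export most likely reflects only your own inbox rather than the group's full archive.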

Feb 08 '17 21:02 Kerstiru

I found someone who built a scraper: http://saturnboy.com/2010/03/scraping-google-groups/

Is this of any use for you guys, or is there anything concrete I can do with it?

Also, I just tried Google Takeout, but:

  • it only gives you data for the Google Groups you own
  • it only seems to provide contact data

Feb 09 '17 15:02 Kerstiru