bigbang issues

Results 110 bigbang issues

document date format in tenure.py

As noted in #355, we just need to update the script's help text to state that the date format is ordinal.

documentation
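To illustrate what an ordinal date looks like, here is a minimal sketch. It assumes that "ordinal" means the proleptic Gregorian day count used by Python's `datetime.date.toordinal`; the issue text does not confirm which convention tenure.py expects.

```python
from datetime import date

# Assumption: "ordinal" means the proleptic Gregorian ordinal,
# where 0001-01-01 is day 1. tenure.py may use another convention.
d = date(2016, 6, 22)
ordinal = d.toordinal()

# Converting the integer back recovers the same calendar date.
round_trip = date.fromordinal(ordinal)
print(ordinal, round_trip.isoformat())
```

If this assumption holds, the help text could show such an integer next to its calendar date so users know what to pass.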

Better support for CSV export

Some users want to extract data from mailing lists for graphing and analysis in some other program, like Excel. We should make it clearer how to export data to CSV...

sbenthall
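As a rough sketch of what CSV export could look like, assuming the mailing list data is held in a pandas DataFrame: the frame and its column names below are hypothetical stand-ins, not BigBang's actual schema.

```python
import io

import pandas as pd

# Hypothetical stand-in for mailing list data; column names are illustrative.
messages = pd.DataFrame({
    "From": ["alice@example.org", "bob@example.org"],
    "Date": pd.to_datetime(["2016-06-01", "2016-06-02"]),
    "Subject": ["hello", "re: hello"],
})

# index=False leaves the row index out of the file, so a spreadsheet
# program sees only the data columns.
buffer = io.StringIO()
messages.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
print(csv_text)
```

Passing a file path to `to_csv` instead of a buffer would write a file that Excel can open directly.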

document that *.txt for file-based collect_mail usage points to archive pages

See https://github.com/nllz/bigbang/issues/10#issuecomment-227876957 for background. The documentation on using collect_mail.py with file import is not clear enough. URLs need to point specifically to the archive page of each list.

sbenthall

consolidate entity resolution scripts into single module

There are multiple "entity resolution" functions for mailing lists currently in BigBang. These should be consolidated into a single module with the relevant differences documented.

sbenthall

enhancement

prototypefund

bad column name on repo.commit_data dataframe

```
from bigbang import repo_loader  # The file that handles most loading

repo = repo_loader.get_repo("numpy", in_type="name")
repo.commit_data[:5]
```

The dataframe's first column is named "Unnamed: 0". It...

sbenthall

Git Repo
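One common way a column named "Unnamed: 0" appears in pandas is a CSV round trip that writes the index but does not read it back as one. The sketch below reproduces that behaviour. Whether this is what happens inside the repo loader is an assumption, and the column names are hypothetical.

```python
import io

import pandas as pd

# Hypothetical commit data; column names are illustrative only.
commits = pd.DataFrame({"HEXSHA": ["abc123"], "Committer Name": ["alice"]})

# to_csv writes the index by default, as a first column with an empty header.
cached = commits.to_csv()

# Reading it back without index_col exposes that column as "Unnamed: 0".
reloaded = pd.read_csv(io.StringIO(cached))
print(list(reloaded.columns))

# Treating the first column as the index on read restores the original columns.
fixed = pd.read_csv(io.StringIO(cached), index_col=0)
print(list(fixed.columns))
```

Writing with `index=False` would avoid the extra column as well.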

global Repo not defined

In git_repo.py, when I call get_repo(url, "remote"), cache = "none", so the function calls self.repo = Repo(url), but Repo is undefined. I can't figure out which class is meant here.

emilienschultz

Git Repo

activity() doesn't exist in notebook example Analyze Senders

process.activity() doesn't exist. This seems to be because the Archive wrapper class is not used, so we have to wrap the dataframe and use get_activity(). I will fix it later.

emilienschultz

Examples

None body in parsed mails

I tried to use the Single Word Trend notebook example and got an error when iterating over the archive. Some mails (apparently multipart mails) have a None body...

emilienschultz
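Until the parser is fixed, code that iterates over bodies can guard against missing values. This is a minimal sketch using hypothetical data and a hypothetical helper, `count_word`. It is not the notebook's actual code.

```python
# Hypothetical message bodies; None stands in for a mail whose body
# was not extracted by the parser.
bodies = ["release planned for June", None, "June release delayed"]


def count_word(bodies, word):
    """Count messages whose body contains `word`, skipping missing bodies."""
    word = word.lower()
    return sum(
        1 for body in bodies
        if body is not None and word in body.lower()
    )


print(count_word(bodies, "june"))
```

Skipping None bodies keeps the loop from raising, but the affected mails are left out of the count.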

Add Comprehensive Tests to Git Loading Scheme

Currently, the repo loader files and their logic are too complicated and definitely contain bugs. I need to make sure they will be able to handle loading many different...

falahat

Git Repo

Archive 'Date' column sometimes is string not datetime type

Sometimes when an Archive is loaded from data on the file system, the "Date" column is of type string rather than datetime. This irregularity makes it hard to compare mailing...

sbenthall
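One possible way to normalise such a column after loading is `pandas.to_datetime`. The sketch below uses a hypothetical DataFrame. Where this conversion would belong in BigBang's loading code is not stated in the issue.

```python
import pandas as pd

# Hypothetical archive data whose "Date" column was loaded as strings.
data = pd.DataFrame({"Date": ["2016-06-01 10:00:00", "not a date"]})

# errors="coerce" turns unparseable values into NaT instead of raising,
# so the whole column ends up with a datetime dtype.
data["Date"] = pd.to_datetime(data["Date"], errors="coerce")
print(data["Date"].dtype)
```

Coercing hides malformed dates as NaT, so it may be worth counting them after conversion rather than dropping them silently.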
