pyglmnet
[MRG] Improve test coverage
Let's see if this helps with coverage ...
Codecov Report
Merging #253 into master will decrease coverage by 0.65%. The diff coverage is n/a.
@@ Coverage Diff @@
## master #253 +/- ##
==========================================
- Coverage 75.48% 74.82% -0.66%
==========================================
Files 4 5 +1
Lines 673 719 +46
Branches 148 130 -18
==========================================
+ Hits 508 538 +30
- Misses 128 140 +12
- Partials 37 41 +4
Impacted Files | Coverage Δ |
---|---|
pyglmnet/utils.py | 32.55% <0%> (-10.58%) :arrow_down: |
pyglmnet/pyglmnet.py | 79.67% <0%> (-1.41%) :arrow_down: |
pyglmnet/datasets.py | 81.37% <0%> (ø) |
pyglmnet/base.py | 48.48% <0%> (+3.32%) :arrow_up: |
Continue to review the full report at Codecov.
Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f170e00...85de164.
hmm, looks like it didn't budge. merge or close?
neither. Wait a bit. I'll give another try in a day or two :)
@pavanramkumar looks like the dataset fetchers don't work on python 3.5. Do you want to iterate on top of my pull request here? You can push directly to my branch if you want.
@pavanramkumar let's merge this? It will improve coverage a little ...
@jasmainak it's really strange that the dataset fetcher doesn't work with travis. when i run it in my local py35 environment, it works fine. perhaps a miniconda dependency issue?
py.test --cov=pyglmnet tests/test_pyglmnet.py -k 'test_fetch_datasets'
===================================================================== test session starts =====================================================================
platform darwin -- Python 3.5.4, pytest-3.2.1, py-1.4.34, pluggy-0.4.0
rootdir: /Users/pavanramkumar/Projects/pyglmnet, inifile:
plugins: cov-2.6.0
collected 8 items
tests/test_pyglmnet.py .
---------- coverage: platform darwin, python 3.5.4-final-0 -----------
Name Stmts Miss Branch BrPart Cover
--------------------------------------------------------
pyglmnet/__init__.py 4 0 0 0 100%
pyglmnet/base.py 66 56 34 0 10%
pyglmnet/metrics.py 21 21 6 0 0%
pyglmnet/pyglmnet.py 488 442 202 0 7%
pyglmnet/utils.py 43 34 10 0 17%
--------------------------------------------------------
TOTAL 622 553 252 0 8%
===================================================================== 7 tests deselected ======================================================================
=========================================================== 1 passed, 7 deselected in 2.87 seconds ============================================================
That's weird, I am able to reproduce the Travis issue though. Are you sure you are on the right branch? In your codecov output I don't see datasets.py:
(py35) mainak@mainak-ThinkPad-W540 ~/Desktop/projects/github_repos/pyglmnet $ pytest tests/test_pyglmnet.py -k 'test_fetch_datasets'
============================================================== test session starts ===============================================================
platform linux -- Python 3.5.3, pytest-3.0.7, py-1.4.33, pluggy-0.4.0
rootdir: /home/mainak/Desktop/projects/github_repos/pyglmnet, inifile:
collected 9 items
tests/test_pyglmnet.py F
==================================================================== FAILURES ====================================================================
______________________________________________________________ test_fetch_datasets _______________________________________________________________
def test_fetch_datasets():
"""Test fetching datasets."""
datasets.fetch_community_crime_data('/tmp/glm-tools')
> datasets.fetch_group_lasso_datasets()
tests/test_pyglmnet.py:348:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
def fetch_group_lasso_datasets():
"""
Downloads and formats data needed for the group lasso example.
Returns:
--------
design_matrix: pandas.DataFrame
pandas dataframe with formatted data and labels
groups: list
list of group indicies, the value of the ith position in the list
is the group number for the ith regression coefficient
"""
try:
import pandas as pd
except ImportError:
raise ImportError('The pandas module is required for the '
'group lasso dataset')
# helper functions
def find_interaction_index(seq, subseq,
alphabet="ATGC",
all_possible_len_n_interactions=None):
n = len(subseq)
alphabet_interactions = \
[set(p) for
p in list(itertools.combinations_with_replacement(alphabet, n))]
num_interactions = len(alphabet_interactions)
if all_possible_len_n_interactions is None:
all_possible_len_n_interactions = \
[set(interaction) for
interaction in
list(itertools.combinations_with_replacement(seq, n))]
subseq = set(subseq)
group_index = num_interactions * \
all_possible_len_n_interactions.index(subseq)
value_index = alphabet_interactions.index(subseq)
final_index = group_index + value_index
return final_index
def create_group_indicies_list(seqlength=7,
alphabet="ATGC",
interactions=[1, 2, 3],
include_extra=True):
alphabet_length = len(alphabet)
index_groups = []
if include_extra:
index_groups.append(0)
group_count = 1
for inter in interactions:
n_interactions = comb(seqlength, inter)
n_alphabet_combos = comb(alphabet_length,
inter,
repetition=True)
for x1 in range(int(n_interactions)):
for x2 in range(int(n_alphabet_combos)):
index_groups.append(int(group_count))
group_count += 1
return index_groups
def create_feature_vector_for_sequence(seq,
alphabet="ATGC",
interactions=[1, 2, 3]):
feature_vector_length = \
sum([comb(len(seq), inter) *
comb(len(alphabet), inter, repetition=True)
for inter in interactions]) + 1
feature_vector = np.zeros(int(feature_vector_length))
feature_vector[0] = 1.0
for inter in interactions:
# interactions at the current level
cur_interactions = \
[set(p) for p in list(itertools.combinations(seq, inter))]
interaction_idxs = \
[find_interaction_index(
seq, cur_inter,
all_possible_len_n_interactions=cur_interactions) + 1
for cur_inter in cur_interactions]
feature_vector[interaction_idxs] = 1.0
return feature_vector
positive_url = \
"http://genes.mit.edu/burgelab/maxent/ssdata/MEMset/train5_hs"
negative_url = \
"http://genes.mit.edu/burgelab/maxent/ssdata/MEMset/train0_5_hs"
> pos_file = tempfile.NamedTemporaryFile(bufsize=0)
E TypeError: NamedTemporaryFile() got an unexpected keyword argument 'bufsize'
pyglmnet/datasets.py:203: TypeError
-------------------------------------------------------------- Captured stdout call --------------------------------------------------------------
...99%, 1 MB
...100%, 1 MB
=============================================================== 8 tests deselected ===============================================================
===================================================== 1 failed, 8 deselected in 2.85 seconds =====================================================
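The `TypeError` in the traceback is a Python 2/3 incompatibility: `tempfile.NamedTemporaryFile` accepted a `bufsize` keyword on Python 2, but Python 3 renamed it to `buffering`. A minimal sketch of a version-compatible fix (the helper name is hypothetical, not the actual patch to `datasets.py`):

```python
import tempfile

def unbuffered_tempfile():
    # Python 3 renamed the `bufsize` keyword to `buffering`.
    # buffering=0 requires binary mode, which is the default ('w+b').
    try:
        return tempfile.NamedTemporaryFile(buffering=0)  # Python 3
    except TypeError:
        return tempfile.NamedTemporaryFile(bufsize=0)    # Python 2

with unbuffered_tempfile() as pos_file:
    pos_file.write(b"ATGC")  # unbuffered: bytes hit the file immediately
    pos_file.seek(0)
    print(pos_file.read())
```

Alternatively, since `bufsize=0` was only there to force unbuffered writes, dropping the keyword entirely and calling `.flush()` after writing would also work on both versions.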
arfff ... now the website hosting the community crime data seems to be down. So Travis won't pass ...
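Since these tests download from external hosts, one option is to skip (rather than fail) when the host is unreachable, so Travis stays green during outages. A sketch, assuming a small reachability helper added to the test module (not part of the existing code):

```python
from urllib.request import urlopen

def is_reachable(url, timeout=5.0):
    """Return True if `url` answers within `timeout` seconds."""
    try:
        urlopen(url, timeout=timeout)
        return True
    except OSError:
        # URLError, timeouts, and connection errors all subclass OSError
        return False

# sketch of how the test could guard itself:
# if not is_reachable("http://genes.mit.edu"):
#     pytest.skip("dataset host unreachable, skipping fetcher test")
```

`pytest.skip` inside a test body marks it as skipped instead of failed, so a flaky upstream host no longer blocks the build.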