Prohibition for env_package should apply to samples in prep not total list of samples

Open ackermag opened this issue 5 years ago • 2 comments

Approval for studies is requested per prep, so the prohibition applicable to env_package should be applied only to the samples in the prep for which approval is being requested, not to the entire list of samples. Many samples in the full list were entered erroneously or were never sequenced, and they are not visible in the public study.
I think the warning on the sample info page is sufficient for people planning to use sandbox studies for analysis.

ackermag avatar Jun 12 '19 16:06 ackermag

The warning message should be reworded because it implies that all values are incorrect. Change to:

Sample Info has invalid values: ", Unspecified, LabControl test, Not applicable, None", valid values are: "air, built environment, host-associated, human-associated, human-skin, human-oral, human-gut, human-vaginal, microbial mat/biofilm, misc environment, plant-associated, sediment, soil, wastewater/sludge, water"

Currently reads: Sample Info has a no valid values: ", Unspecified, LabControl test, Not applicable, None", valid values are: "air, built environment, host-associated, human-associated, human-skin, human-oral, human-gut, human-vaginal, microbial mat/biofilm, misc environment, plant-associated, sediment, soil, wastewater/sludge, water"
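To make the request concrete, here is a minimal sketch of the check restricted to the samples in one prep, together with the reworded message. It is not Qiita's actual code: the function names, the dict-based `sample_info` structure, and the choice to treat a missing value as an empty string are all assumptions made for illustration. Only the list of valid values and the message wording come from this issue.

```python
# Hypothetical sketch; these names are illustrative and not part of Qiita's API.
# The valid values are copied from the warning message quoted in this issue.
VALID_ENV_PACKAGES = (
    "air", "built environment", "host-associated", "human-associated",
    "human-skin", "human-oral", "human-gut", "human-vaginal",
    "microbial mat/biofilm", "misc environment", "plant-associated",
    "sediment", "soil", "wastewater/sludge", "water",
)


def invalid_env_packages(sample_info, prep_sample_ids):
    """Return the sorted invalid env_package values among the prep's samples.

    sample_info: dict mapping sample id -> dict of metadata values (assumed).
    prep_sample_ids: ids of the samples in the prep under review.
    Samples outside the prep are never inspected.
    """
    found = set()
    for sample_id in prep_sample_ids:
        # Assumption: a missing sample or missing column counts as "".
        value = sample_info.get(sample_id, {}).get("env_package", "")
        if value not in VALID_ENV_PACKAGES:
            found.add(value)
    return sorted(found)


def build_warning(invalid):
    """Format the reworded message, or return None when nothing is invalid."""
    if not invalid:
        return None
    return 'Sample Info has invalid values: "%s", valid values are: "%s"' % (
        ", ".join(invalid), ", ".join(VALID_ENV_PACKAGES))
```

With this shape, a prep containing only correctly annotated samples would pass even if other samples in the full list still carry values such as Unspecified or None.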

ackermag avatar Jun 13 '19 14:06 ackermag

I was taking a look at what it will take to solve this issue, and it's not as trivial as I thought. The main issue is that we need to check which metadata columns (categories) are present in the info file and which are required, and the method we use to get those categories looks at the sample_values->>'columns' row (as this is much faster than checking each sample). Thus, to make this change, we will need to create a new method that returns the existing categories for a group of samples (which might be too slow if the prep has too many samples, i.e. it needs to be benchmarked).
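As a rough illustration of what such a method might look like, here is an in-memory sketch. It does not touch the database and is not Qiita's implementation: `categories_for_samples`, `missing_required`, and `benchmark` are hypothetical names, and the per-sample dicts only stand in for the stored sample values. It also assumes a category counts as present if any sample in the group has it.

```python
import timeit


def categories_for_samples(sample_values, sample_ids):
    """Return the set of categories found on the given group of samples.

    sample_values: dict mapping sample id -> dict of category -> value (assumed).
    A category counts as present if any sample in the group has it.
    """
    categories = set()
    for sample_id in sample_ids:
        # Updating a set from a dict adds the dict's keys.
        categories.update(sample_values.get(sample_id, {}))
    return categories


def missing_required(sample_values, sample_ids, required):
    """Return the required categories that no sample in the group provides."""
    return set(required) - categories_for_samples(sample_values, sample_ids)


def benchmark(n_samples, repeats=5):
    """Time the per-sample scan on synthetic data; returns total seconds."""
    data = {"s%d" % i: {"env_package": "soil", "ph": "7"}
            for i in range(n_samples)}
    ids = list(data)
    return timeit.timeit(lambda: categories_for_samples(data, ids),
                         number=repeats)
```

Calling `benchmark` with increasing sample counts gives a first feel for how the per-sample scan scales, though a real benchmark would have to run against the database with realistic prep sizes.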

antgonza avatar Jul 03 '19 12:07 antgonza