nupic.audio
CinC/Physionet PCG/ECG challenge 2016
CinC challenge
https://physionet.org/challenge/2016/
A prestigious challenge/conference with nice data!
:fire: UPDATE: game's still ON! :guitar:
Looking for hackers to help me set something up, if it's feasible. Then there will be the whole summer to tune the app.
Blocked by: Add encoders #22
Plan of attack
- [ ] audio
- [x] for now use `wav2vect` from Matlab
- [ ] implement `wavEncoder` - IN PROGRESS #26
- [ ] evaluate if functionality of the WAVEncoder (internal scipy) is the same as matlab's
- [ ] try `Cochlea` encoder
- [ ] implement sound encoders for nupic.audio #22
- [ ] training
- records are Normal/Anomaly/Unknown
- [ ] aggregate all NORMAL records to a 2 column file (reset, PCG)
- [x] how radical a subsampling? nupic is too slow to process the whole dataset at the full rate, but we can only go down to 1000 Hz (from 2000 Hz) because of the Sampling Theorem (Fs >= 2*F)
- [ ] commit the training data files (bcs the preprocessing takes long)
- [ ] train a HTM model + serialize it
- [ ] try param swarming
- [ ] evaluation
- [ ] load the model, disable learning
- [ ] a `description.py` with 2 tasks?, or some other way to train/load/eval a model on datasets
- [ ] compute average anomaly score for all datapoints of a record
- [ ] implement the anomaly metric in nupic
- [ ] create a model (for nupic?) that does this classification based on avg. anomaly scores?
- [ ] threshold to Normal/Anomaly/Unknown
- [ ] submission
- [ ] modify examples `sample2016*`
- [ ] nupic is installed, so `setup` will just source a virtualenv
- [ ] each evaluation in `next` will call matlab (wav2csv), python (writes anomaly scores to CSV), matlab again (loads anomalies and decides classification) - this is problematic, better go full-python if possible!
- [ ] improvements:
- [ ] try bag (multi model) voting
- [ ] model trained on full normal data
- [ ] model on FHS parts
- [ ] model on anomalous data
- [ ] model pretrained on ECG data from other sources! https://github.com/breznak/nupic.biodat
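The preprocessing steps from the training part of the plan (halve the sampling rate, then aggregate all records into one two-column `reset, PCG` file) could look roughly like the sketch below. It is plain Python 3 and does not touch nupic; the function names are hypothetical, and the pair-averaging filter is only a crude stand-in for a proper anti-aliasing low-pass (with SciPy available, `scipy.signal.decimate` would be the more appropriate choice). The single header row is also an assumption, not necessarily the format the final pipeline needs.

```python
import csv
import io


def downsample_by_two(samples):
    """Halve the sampling rate by averaging adjacent pairs.

    Averaging acts as a crude low-pass filter before dropping samples;
    a trailing unpaired sample is discarded.
    """
    return [(samples[i] + samples[i + 1]) / 2.0
            for i in range(0, len(samples) - 1, 2)]


def write_training_csv(records, out):
    """Write all records as (reset, PCG) rows to a file-like object.

    reset is 1 on the first row of each record and 0 elsewhere, so the
    model can tell where a new sequence starts.
    """
    writer = csv.writer(out)
    writer.writerow(["reset", "PCG"])  # assumed header, one row only
    for samples in records:
        for i, value in enumerate(downsample_by_two(samples)):
            writer.writerow([1 if i == 0 else 0, value])


def build_demo_csv():
    """Run the sketch on two tiny made-up records and return the CSV text."""
    records = [[0.0, 2.0, 4.0, 6.0], [1.0, 1.0, 3.0, 5.0, 9.0]]
    buf = io.StringIO()
    write_training_csv(records, buf)
    return buf.getvalue()
```

For real data, `records` would be the list of Normal recordings and `out` an open file, so the slow preprocessing runs once and the result can be committed.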
Working plan to get some validation results ASAP:
- [ ] training data
- will train only on `Normal` data and select (FHS) subsequences of it
- data extracted from Matlab, @breznak will do that
- [ ] train HTM model
- on the provided data
- just one HTM model (with RDSE? encoder, what best settings? probably no time to swarm)
- able to `serialize the model` and `load` to run on eval. data (learning off)
- the approach with OPF is not reliably working, can someone post code to do that? (@rhyolight or someone..?)
- [ ] write simple classification function: `classify(anScores[])`
- should decide classification from the anomaly scores for the whole sequence/sample
- can be something like `avg` and Normal iff < 0.4; Unknown iff [0.4...0.7]; Anomaly iff > 0.7; ETA ~10mins
- [ ] score
- process validation data (@breznak will commit a file)
- classify & compute score -> submit! :pray:
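A minimal sketch of the `classify` step described above, assuming the anomaly scores for one record arrive as a plain list of floats in [0, 1]. The parameter names and the handling of an empty record are assumptions; the 0.4 and 0.7 thresholds are the tentative values from the plan and would need tuning on the validation data.

```python
def classify(an_scores, normal_below=0.4, anomaly_above=0.7):
    """Label one record from the average of its anomaly scores.

    Returns "Normal" if the average is below normal_below, "Anomaly" if
    it is above anomaly_above, and "Unknown" for anything in between.
    """
    if not an_scores:
        # Assumption: a record with no scores cannot be judged.
        return "Unknown"
    avg = sum(an_scores) / len(an_scores)
    if avg < normal_below:
        return "Normal"
    if avg > anomaly_above:
        return "Anomaly"
    return "Unknown"
```

Keeping the thresholds as keyword arguments makes it easy to sweep them later without changing the function.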