healthcareai-py
Initial Neural Net - DO NOT MERGE
This requires substantial review and discussion before merging.
Although you removed the ctg dataset in the end, I can still see it at ab5b130. Dunno if it matters though.
create_nn:
- Maybe give the user an option to create a deeper network? It looks like the network has fixed layer sizes.
- Last layer activation: Maybe have a choice for sigmoid for multi-label classification?
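To make the two suggestions above concrete, here is a rough, framework-agnostic sketch. It is not the `create_nn` code from this PR: `build_layer_specs`, its parameters, and the ReLU hidden activation are all hypothetical, chosen only for illustration. It shows the depth coming from a user-supplied list and the last-layer activation switching between softmax (one class per sample) and sigmoid (multi-label).

```python
def build_layer_specs(n_inputs, hidden_sizes, n_outputs, multilabel=False):
    """Describe a feed-forward classifier as a list of layer dicts.

    Hypothetical helper: the depth is len(hidden_sizes) + 1, so the caller
    decides how deep the network is instead of it being hard-coded.
    """
    if n_inputs < 1 or n_outputs < 1:
        raise ValueError("n_inputs and n_outputs must be positive")
    if any(size < 1 for size in hidden_sizes):
        raise ValueError("hidden layer sizes must be positive")

    specs = []
    previous = n_inputs
    for size in hidden_sizes:
        # ReLU is an assumption made for this sketch, not taken from the PR.
        specs.append({"inputs": previous, "units": size, "activation": "relu"})
        previous = size

    # Softmax makes the outputs compete (exactly one class per sample);
    # sigmoid scores each label independently (multi-label).
    output_activation = "sigmoid" if multilabel else "softmax"
    specs.append(
        {"inputs": previous, "units": n_outputs, "activation": output_activation}
    )
    return specs


if __name__ == "__main__":
    for layer in build_layer_specs(10, [64, 32, 16], 3, multilabel=True):
        print(layer)
```

A list like this could then be translated into whichever framework the project settles on.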
Just my two cents.
@mxlei01 I agree with your comments - Thank you for checking it out! It is important to note that this is only the first step toward getting neural nets into healthcare.ai. I may pull pieces of this in slowly (for example, the multiclass support) as we decide how we want to handle nets.
@Aylr Regarding the neural network that Healthcare-AI would use: would you rather use TensorFlow directly, or a high-level tool like Keras? I have done a little research on Keras vs. TensorFlow. I'm not sure what your deep learning training flow looks like, but the recommended TensorFlow input pipeline used to be multi-threading plus queues, and is now the Dataset API. With Keras batch training fed by a Python generator, you would get only a portion of the throughput of pure TensorFlow, partly because of the overhead of switching between the underlying C++ code and Python. Keras with an MXNet backend seems to be a good alternative with high training throughput, although I'm not sure how its performance compares with pure TensorFlow.
Good GPUs are expensive and training times are long, so we would want to squeeze every bit of performance out of a GPU.
With deep learning, we would also want to batch data for training, but right now we actually read in the whole dataset for training.
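As a rough illustration of the batching idea, here is a minimal sketch using only the standard library. `iter_batches` is a hypothetical name, not something in healthcareai-py; it pulls rows lazily from any iterable so only one batch needs to be held at a time.

```python
from itertools import islice


def iter_batches(rows, batch_size):
    """Yield lists of up to batch_size rows from any iterable.

    Hypothetical helper: rows are consumed lazily, so the source can be a
    file reader or database cursor rather than a fully loaded dataset.
    The size check runs when iteration starts, as with any generator.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return  # source exhausted
        yield batch


if __name__ == "__main__":
    for batch in iter_batches(range(5), 2):
        print(batch)
```

The last batch may be shorter than `batch_size`; whether to keep or drop it would be a design decision for the real pipeline.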
Would we want to make a scalable version of our data pipeline? We don't need to replace the whole thing; it could be selected through a user setting, for example pipeline='TensorFlow'.
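One way such a setting could be wired up, purely as a sketch: a small registry maps the option value to a batching strategy. Everything here (`PIPELINES`, `training_batches`, the `'in_memory'` and `'batched'` names) is hypothetical. A TensorFlow-backed entry could be registered the same way under `'TensorFlow'`; it is left out so the sketch has no dependencies.

```python
def _whole_dataset(rows, batch_size):
    # Mirrors the behaviour described in this thread: read everything at once.
    yield list(rows)


def _mini_batches(rows, batch_size):
    # Accumulate rows and emit a batch each time it fills up.
    batch = []
    for row in rows:
        batch.append(row)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # final, possibly shorter, batch


# Hypothetical registry: option value -> batching strategy.
PIPELINES = {
    "in_memory": _whole_dataset,
    "batched": _mini_batches,
}


def training_batches(rows, pipeline="in_memory", batch_size=32):
    """Return an iterator of batches chosen by the user-facing setting."""
    try:
        make_batches = PIPELINES[pipeline]
    except KeyError:
        known = ", ".join(sorted(PIPELINES))
        raise ValueError(
            f"unknown pipeline {pipeline!r}; expected one of: {known}"
        ) from None
    return make_batches(rows, batch_size)


if __name__ == "__main__":
    print(list(training_batches(range(5))))
    print(list(training_batches(range(5), pipeline="batched", batch_size=2)))
```

Keeping the existing behaviour as the default means current users would see no change unless they opt in.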
I could invest some time playing with TensorFlow to see how we could integrate it into healthcareai-py.
Finally, I might be wrong, so I'm throwing this out there to see if anyone corrects my statements.