
Adding a custom model for one of the newly added datasets in MLDatasets.jl

Open arcAman07 opened this issue 3 years ago • 8 comments

Proposing a customized (low-level) but easy-to-use and easy-to-understand model for the beginner-friendly "Titanic Dataset", which can help machine learning beginners get started with this package. If it should be added, I would love to work on the PR.

arcAman07 avatar Feb 19 '22 07:02 arcAman07

A PR would be good, it would be awesome to have a very straightforward implementation focussing only on getting a dataset from that package suitable for use with Flux.

DhairyaLGandhi avatar Feb 19 '22 08:02 DhairyaLGandhi

Cool, I have already made the model. I will make a couple of changes so it is less complex and easy for beginners to comprehend (it can serve as a great starting point for them). I will work on the PR and try to do it ASAP.

arcAman07 avatar Feb 19 '22 11:02 arcAman07

Part of the problem I just encountered is that I am unable to actually load the Titanic data from the MLDatasets library (I have raised an issue). Should I just read the data from a CSV file (as I did when adding it to that library) and then create the complete model for use?

arcAman07 avatar Feb 19 '22 16:02 arcAman07

I don't think there's any time crunch on this, so fixing the titanic dataset loading should be done first.

ToucheSir avatar Feb 19 '22 16:02 ToucheSir

Great, will look into it. I had tested it locally and it was working then. After the release was created, I wasn't able to load it.

arcAman07 avatar Feb 19 '22 16:02 arcAman07

The issue has been solved; it was a mistake on my end while loading it. I will create the model using it and send the PR ASAP. Thanks @ToucheSir

arcAman07 avatar Feb 19 '22 17:02 arcAman07

Having some trouble training the model for this dataset. After thorough EDA, feature-importance analysis, and data manipulation, the testing accuracy is stuck at 0.6 using simple logistic regression built on a Flux Dense layer. I have not been able to improve accuracy on the test set after trying various combinations of models. (The mix of data types among the input features makes a neural network a tougher choice than other algorithms.) If accuracy is not the most important thing, and the goal is to help users understand how to get the data, train on it with Flux.jl, and test the result, then I can do the PR.
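
For reference, a minimal sketch of the kind of model described above: logistic regression as a single Flux Dense layer. It is not the code from this issue. It assumes a recent Flux release with the explicit `Flux.setup` / `Flux.update!` API (newer than what was current when this thread was written), and it uses a synthetic, already-numeric feature matrix as a hypothetical stand-in for the preprocessed Titanic columns. The names `X`, `y`, `w_true`, and `Xn` are illustrative only. Standardizing the features is included because unscaled inputs of mixed magnitude are one plausible reason for a model like this to stall, though nothing in the thread confirms that was the cause here.

```julia
using Flux, Statistics, Random

Random.seed!(0)

# Hypothetical stand-in for preprocessed Titanic features:
# rows are features, columns are passengers (the layout Flux expects).
n_features, n_samples = 4, 200
X = randn(Float32, n_features, n_samples)

# Synthetic binary labels from a made-up linear rule, shaped 1×n_samples.
w_true = Float32[1.5, -2.0, 0.5, 0.0]
y = Float32.(reshape(w_true' * X .> 0, 1, :))

# Standardize each feature to zero mean and unit variance.
μ = mean(X; dims=2)
σ = std(X; dims=2)
Xn = (X .- μ) ./ σ

# Logistic regression: one Dense layer producing a single logit.
model = Dense(n_features => 1)

# The sigmoid is folded into the loss for numerical stability.
loss(m, x, y) = Flux.logitbinarycrossentropy(m(x), y)

opt_state = Flux.setup(Adam(0.05), model)
for epoch in 1:200
    grads = Flux.gradient(m -> loss(m, Xn, y), model)
    Flux.update!(opt_state, model, grads[1])
end

# A logit above 0 corresponds to a predicted probability above 0.5.
accuracy = mean((model(Xn) .> 0) .== (y .> 0.5))
println("training accuracy: ", accuracy)
```

With real data, the same loop would be run on a training split and `accuracy` computed on a held-out test split, reusing the `μ` and `σ` computed from the training set.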

arcAman07 avatar Feb 22 '22 11:02 arcAman07

File a PR; maybe there is something wrong with your code.

CarloLucibello avatar Feb 22 '22 12:02 CarloLucibello