Unreproducible: accuracy for the porn class in the nsfw_data_scraper dataset
I tried to test the model against all the data from https://github.com/alexkimxyz/nsfw_data_scraper. It turns out the model's accuracy was pretty low (0.84) for the porn class. I checked several of the false negatives, and they clearly should belong to the porn class. Any suggestions? Thanks
Below are detailed results for porn images:
porn     0.8378566785677277
sexy     0.09817904345614349
neutral  0.028440081768767722
hentai   0.033282149350465834
drawings 0.0022420468568952363
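For reference, here is a minimal sketch of how such a per-class breakdown could be computed. This is not the project's script; the `data/train/porn` path, the [0, 1] rescaling, and the alphabetical class order are all assumptions on my part.

```python
# Sketch: predicted-class breakdown for one labelled folder.
# Assumptions (not confirmed in this thread): 299x299 RGB inputs scaled to
# [0, 1], and an output order matching the alphabetical folder names.
import os
import numpy as np
from tensorflow.keras.models import load_model
from tensorflow.keras.preprocessing import image

CLASS_NAMES = ["drawings", "hentai", "neutral", "porn", "sexy"]  # assumed order

model = load_model("nsfw.299x299.h5")  # downloaded from the S3 link below

def class_fractions(folder):
    counts = np.zeros(len(CLASS_NAMES), dtype=int)
    for name in os.listdir(folder):
        path = os.path.join(folder, name)
        try:
            img = image.load_img(path, target_size=(299, 299))
        except Exception:
            continue  # skip unreadable/corrupt downloads
        x = image.img_to_array(img) / 255.0  # assumed preprocessing
        probs = model.predict(np.expand_dims(x, axis=0), verbose=0)[0]
        counts[int(np.argmax(probs))] += 1
    return counts / counts.sum()

for cls, frac in zip(CLASS_NAMES, class_fractions("data/train/porn")):
    print(f"{cls}: {frac:.4f}")
```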
I haven't had such poor results. How clean is your data, and how sure are you of it?
Also, which model did you use? Keras or TF?
I do not have such poor results on my machine, so I'm trying to gather the details.
I used this Keras model: https://s3.amazonaws.com/nsfwdetector/nsfw.299x299.h5. I basically ran the scripts in https://github.com/alexkimxyz/nsfw_data_scraper to generate the training dataset, then used that training dataset to test it. 106153 porn images were tested with the model. What are your suggestions around this test? Thanks
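In case it helps reproduce the numbers, here is a sketch of a batch evaluation over the whole scraped directory tree. The `data/train` root, batch size, and the 1/255 rescale are assumptions, not details confirmed in this thread; corrupt downloads from the scraper may need to be removed before this runs cleanly.

```python
# Sketch: evaluate the downloaded Keras model over the scraped dataset.
import numpy as np
from tensorflow.keras.models import load_model
from tensorflow.keras.preprocessing.image import ImageDataGenerator

model = load_model("nsfw.299x299.h5")

gen = ImageDataGenerator(rescale=1.0 / 255).flow_from_directory(
    "data/train",              # hypothetical root with one sub-folder per class
    target_size=(299, 299),
    batch_size=64,
    class_mode="categorical",
    shuffle=False,             # keep predictions aligned with gen.classes
)

probs = model.predict(gen, verbose=1)
pred = np.argmax(probs, axis=1)
acc = float(np.mean(pred == gen.classes))
print(f"overall accuracy: {acc:.4f} over {gen.samples} images")
```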
Wow. That result sucks! It sounds like I need to re-run the scraper, get some more data, and see if I can retrain the model. It's doing amazingly well with my local dataset, which was pulled not long ago.
Feel free to see if training on your dataset improves performance.
@GantMan thanks, I will try to do that when I have time. Meanwhile, if you get a chance to re-train, please let me know. Thanks
Also, if you run the "self cleanse" script, can you let me know if you have a lot of errors in your dataset?
Mine was pretty clean. From what I recall, a fresh pull of nsfw_data_scraper can be pretty off.
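As I understand the "self cleanse" idea from this thread, it amounts to running the model over its own labelled folders and flagging images whose prediction disagrees with the folder name (likely scraper mislabels). Below is only an illustration of that idea, not the project's actual script; the paths, class order, and the 0.5 confidence threshold are made up.

```python
# Sketch: flag images whose folder label disagrees with the model's prediction.
import os
import numpy as np
from tensorflow.keras.models import load_model
from tensorflow.keras.preprocessing import image

CLASS_NAMES = ["drawings", "hentai", "neutral", "porn", "sexy"]  # assumed order
model = load_model("nsfw.299x299.h5")

def flag_suspect_images(root="data/train"):
    suspects = []
    for label in CLASS_NAMES:
        folder = os.path.join(root, label)
        for name in os.listdir(folder):
            path = os.path.join(folder, name)
            try:
                img = image.load_img(path, target_size=(299, 299))
            except Exception:
                suspects.append((path, "unreadable"))
                continue
            x = np.expand_dims(image.img_to_array(img) / 255.0, axis=0)
            probs = model.predict(x, verbose=0)[0]
            top = CLASS_NAMES[int(np.argmax(probs))]
            if top != label and probs.max() > 0.5:  # arbitrary threshold
                suspects.append((path, f"labelled {label}, predicted {top}"))
    return suspects

for path, reason in flag_suspect_images():
    print(path, "->", reason)
```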
@TechnikEmpire what's the advantage of SSD for this simple classification?
Hrmmm. That came off kinda harsh. This project is about everything and everyone. Would you like to try again? I think this is a great chance for you to practice sharing your research in a friendly yet challenging way.