
Request for help.

Open DxsSucuk opened this issue 3 years ago • 51 comments

Information

Hey there!

I wanted to ask for help since I couldn't figure out how to fix a certain bug, and because I did not get an answer in 5 days on the DJL Slack channel.

Related Slack Related Exception

Issue

The issue I am having is that the BinaryImageTranslator I created fails on any image given to it. I could not figure out why exactly, but it looks like there is an issue with the resizing/reshaping of the input image.

DxsSucuk avatar May 19 '22 09:05 DxsSucuk

@DxsSucuk

Sorry for the delay. We were busy with the 0.17.0 release. @zachgk can you follow up on this issue?

frankfliu avatar May 19 '22 15:05 frankfliu

All good, I understand, especially on a project like this, which I'm pretty sure takes a lot of "big brain".

DxsSucuk avatar May 19 '22 16:05 DxsSucuk

Hi @DxsSucuk, if you don't mind, can you send the code and model (if it is open sourced) to us and we will take a look. Also feel free to grab me (Lanking) for a Slack call, happy to hop in and help :)

lanking520 avatar May 19 '22 18:05 lanking520

Sure thing! The source is on my GitHub account here

DxsSucuk avatar May 19 '22 21:05 DxsSucuk

Any idea of what could be broken?

DxsSucuk avatar May 22 '22 09:05 DxsSucuk

@DxsSucuk here is the solution:

In line: https://github.com/DxsSucuk/PissAI/blob/master/src/main/java/de/presti/pissai/trainer/BinaryImageTranslator.java#L24-L32

Change them to:

    @Override
    public Batchifier getBatchifier() {
        return null;
    }

    @Override
    public Float processOutput(TranslatorContext ctx, NDList list) {
        return list.singletonOrThrow().toFloatArray()[0];
    }

to make the run pass. The reason the error happened:

  1. You use StackBatchifier, which tries to unbatchify the output at the start, and you don't need that here.

  2. The output is an NDArray with an empty Shape() containing a single value, so you should get the float array and extract the first value from it.

lanking520 avatar May 22 '22 18:05 lanking520

Thanks @lanking520 !

That fixed the issue, but now it returns a number that is quite strange: -7.2680035, which is quite confusing.

DxsSucuk avatar May 22 '22 19:05 DxsSucuk

@lanking520 Any idea why that could be?

DxsSucuk avatar May 25 '22 10:05 DxsSucuk

In my mind it could be caused by not having applied a softmax function to make the result fall between 0 and 1. What is the expected number for you? @zachgk

lanking520 avatar May 25 '22 17:05 lanking520

In my mind it could be caused by not having applied a softmax function to make the result fall between 0 and 1. What is the expected number for you? @zachgk

For me it would be a number between 0 and 1, like a percentage. Meaning 0.7 = 70% similar. @lanking520

DxsSucuk avatar May 25 '22 18:05 DxsSucuk

@lanking520 Just tried using softmax to fix it, but now it always returns 1.0

DxsSucuk avatar May 25 '22 20:05 DxsSucuk

You want sigmoid, not softmax. Softmax is for multi-class while sigmoid is for binary. Also, the sigmoid is automatically applied by the SigmoidBinaryCrossEntropyLoss, so you don't need to add it to your model. You can just think of it as the sigmoid(modelOutput) would be the probability you are looking for.

You can also make sense of the pre-sigmoid result. Maybe look at the graph on Wikipedia. More positive means higher probability, more negative means lower probability, and 0 means 50-50.

Or, sigmoid(-7.2680035) = 0.0006970170 = .06970170%

zachgk avatar May 25 '22 23:05 zachgk
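As a side note, the numbers above can be checked with a quick stdlib-only sketch (this is not DJL code; the class and method names here are made up for illustration). It also shows why applying softmax to a single-output binary model always produced 1.0:

```java
// Stdlib-only check of the sigmoid/softmax discussion above.
public class SigmoidCheck {

    // sigmoid(x) = 1 / (1 + e^(-x))
    static double sigmoid(double x) {
        return 1.0 / (1.0 + Math.exp(-x));
    }

    // Softmax over a single-element output: e^x / e^x == 1.0.
    // This is why softmax on a binary model's lone output always
    // returns 1.0, regardless of the input.
    static double softmaxOfSingleton(double x) {
        return Math.exp(x) / Math.exp(x);
    }

    public static void main(String[] args) {
        System.out.println(softmaxOfSingleton(-7.2680035)); // always 1.0
        System.out.println(sigmoid(-7.2680035));            // ~0.000697
    }
}
```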

Is there a math equation to "convert" it into a percentage? @zachgk

DxsSucuk avatar May 26 '22 18:05 DxsSucuk

Yeah. You can use the DJL operator Activation.sigmoid(array). The actual equation for sigmoid is $sigmoid(x) = \frac{1}{1 + e^{-x}}$. This gives you the decimal form, and then you can multiply by 100 to get the percentage if you prefer that.

zachgk avatar May 26 '22 18:05 zachgk
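In plain Java (without an NDArray), the conversion described above boils down to a one-liner; the helper name here is hypothetical:

```java
// Plain-Java sketch of the logit-to-percentage conversion; in DJL the
// NDArray equivalent would use Activation.sigmoid(array).
public class PercentConverter {

    // Applies sigmoid to a raw model output (logit), then scales to %.
    static double logitToPercent(double logit) {
        double probability = 1.0 / (1.0 + Math.exp(-logit)); // sigmoid
        return probability * 100.0;
    }

    public static void main(String[] args) {
        // The raw output from the thread above:
        System.out.println(logitToPercent(-7.2680035)); // ~0.0697 %
    }
}
```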

Thank you! It really helped. Now the only thing is that it says it is only about 6% similar, which it shouldn't be, since I used an image it was trained with. @zachgk

DxsSucuk avatar May 26 '22 19:05 DxsSucuk

What are you getting for your training/validation accuracy? Are they increasing as you train?

zachgk avatar May 26 '22 23:05 zachgk

I did not set up anything to show me that information, as far as I know. Any tips on how I can do that?

DxsSucuk avatar May 27 '22 08:05 DxsSucuk

You have the logging training listeners (https://github.com/DxsSucuk/PissAI/blob/master/src/main/java/de/presti/pissai/trainer/ModelTrainer.java#L114), so it should be printing the loss and accuracy to the stdout

zachgk avatar May 27 '22 20:05 zachgk

It actually does not. Maybe a logger configuration issue?

DxsSucuk avatar May 27 '22 21:05 DxsSucuk
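For what it's worth, DJL's logging training listeners log through SLF4J, so with Logback on the classpath a minimal logback.xml like the sketch below (an assumed configuration, not taken from the project) should surface the training metrics on the console:

```xml
<!-- Hypothetical minimal logback.xml; assumes DJL's listeners
     log through SLF4J at INFO level. -->
<configuration>
  <appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender">
    <encoder>
      <pattern>%d{HH:mm:ss} %-5level %logger{36} - %msg%n</pattern>
    </encoder>
  </appender>
  <root level="info">
    <appender-ref ref="STDOUT"/>
  </root>
</configuration>
```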

I checked and tried using a Logback configuration that I use on another project to store debug data in a separate file, but I can't seem to find anything. @zachgk

DxsSucuk avatar May 29 '22 19:05 DxsSucuk

After reading a lot of the documentation, I found that the example class for MNIST already does what is needed, so I checked it out. This should be the result you are searching for, right? @zachgk

DxsSucuk avatar May 30 '22 07:05 DxsSucuk

After training with more images, it seems like it got worse? I tried to understand more about training with the "E-Book" on the website, and it helped a bit, so I thought it could be a sample issue, since I don't have a lot of images for the algorithm to train well. Training Paste.gg I hope this helps @zachgk

DxsSucuk avatar May 31 '22 12:05 DxsSucuk

Yeah, that log is what I am looking for. The training does seem to be working because the loss is going down. The training accuracy is 100%, which means it should be classifying all the training images correctly. Can you double check that you don't have it backwards and the 6% should actually be a 94% probability?

But in general, you probably want more images than you have. Once you have more images, you can also move to a better model. Also, I am looking at your image folder, and it looks like all of the images are of dream. If you don't have any images of not-dream, the model would be trained to think everything is dream. Maybe try mixing your dataset with another dataset like Cifar10, so it could be trained as a combined 11-class dataset with the 10 classes of Cifar10 plus the dream class.

zachgk avatar May 31 '22 18:05 zachgk

But the model is trained to be binary? Wouldn't the correct way be to let it only know how dream looks, so it can determine whether a new image fits the known pattern or not?

DxsSucuk avatar May 31 '22 20:05 DxsSucuk

And I just checked again with the same image and got this; either my percentage calculation is wrong, or something else in my code is. And yes, I used the exact same image I used as a training image. When I try the same with the model that has 41 epochs, I get 3%.

DxsSucuk avatar May 31 '22 20:05 DxsSucuk

Think about it like this. The model is trying to determine rules that help explain what we are asking of it. And, it could learn the rule isDream(image) = true. The rule is easy to learn and based on the dataset it is 100% accurate. So there's no reason not to do it.

That's why we need negative examples. Then, it would learn that the always true rule is not very good and it would work to come up with better rules.

zachgk avatar May 31 '22 20:05 zachgk
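The point about the always-true rule can be illustrated with a toy calculation (plain Java; the class, method, and label names here are made up for illustration). A classifier that always answers "dream" scores 100% on a positives-only dataset but only 50% on a balanced one:

```java
// Toy illustration of why negative examples matter: the trivial rule
// isDream(image) = true is perfect on a positives-only dataset.
public class AlwaysTrueDemo {

    // Accuracy of a "model" that always predicts "dream" (true).
    static double accuracyOfAlwaysTrue(boolean[] labels) {
        int correct = 0;
        for (boolean isDream : labels) {
            if (isDream) correct++; // prediction "true" matches only positives
        }
        return (double) correct / labels.length;
    }

    public static void main(String[] args) {
        boolean[] onlyDream = {true, true, true, true};   // no negatives
        boolean[] balanced  = {true, true, false, false}; // with negatives
        System.out.println(accuracyOfAlwaysTrue(onlyDream)); // 1.0
        System.out.println(accuracyOfAlwaysTrue(balanced));  // 0.5
    }
}
```

Once negatives are in the dataset, the trivial rule stops being 100% accurate, so training has a reason to learn real distinguishing features.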

Aah, basically a way to prevent the model from just always saying "Yeah, it is dream", since that would be true on the training dataset. How would I fix this exactly? Just add another folder to the ImageFolder with a lot of images of different stuff? Or?

DxsSucuk avatar May 31 '22 20:05 DxsSucuk

Yeah, adding another folder with a variety of random stuff would work. Especially consider examples that you specifically want it not to misclassify.

zachgk avatar May 31 '22 21:05 zachgk

Hey, I wanted to ask for help, since adding new images to the model broke something? Logs @zachgk

DxsSucuk avatar Jun 08 '22 08:06 DxsSucuk

Any idea? @zachgk

DxsSucuk avatar Jun 20 '22 07:06 DxsSucuk