knowledge-tracing-collection-pytorch

Question regarding predictions

Open Enrique-Val opened this issue 3 years ago • 3 comments

I have a question regarding your implementation. I'm doing some introspection in order to get some meaningful results regarding prediction in DKT. A modified version of the function `main` in the file `train.py` gives me a trained model, which I (originally) named `model`. To get my predictions, I use the following code:

`model.eval()` followed by `output = model(q_seq, a_seq)`, where the inputs are sequences of $N$ questions and $N$ answers, respectively.

I struggle to fully understand the output, which is a 2D array of shape $N \times M$, where $M$ is the number of skills. I guess that for each $i \in \{1, \dots, N\}$, we get the probability of correctly answering a question related to each of the $M$ skills in the next interaction $i+1$, given the $i$-th and past interactions with the system. Is this correct? If so, I noticed that these probabilities hover around 0.5 (i.e., randomness), even though I get a good AUC in training.
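Assuming the interpretation above is right, reading the output reduces to indexing into the $N \times M$ array. A minimal sketch with made-up numbers (the values and skill IDs here are illustrative, not from the actual model):

```python
# preds[i][j] = predicted probability of answering a question on skill j
# correctly at step i+1, given interactions 1..i (N = 2 steps, M = 3 skills).
preds = [
    [0.52, 0.48, 0.55],  # after interaction 1
    [0.61, 0.45, 0.58],  # after interaction 2
]

next_q_skill = 2                      # skill of the upcoming question
p_correct = preds[-1][next_q_skill]   # prob. of a correct answer next step
```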

As the implementation currently stands, is it possible to generate a sequence of user answers given only a sequence of questions? I guess the answer is no, so I programmed a function that does this myself, in case you want to add it. Assuming that my interpretation of the output (above) is correct, I think my function works. The likelihood of the generated sequences is low, mainly because the probabilities hover around 0.5, as mentioned.
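For reference, the generation idea can be sketched roughly as follows. Here `predict_fn` and the constant-probability `toy_predict` are hypothetical stand-ins for a call to the trained DKT model, not functions from the repository:

```python
import random

def generate_answers(q_seq, predict_fn, rng):
    """Sample an answer sequence for a given question (skill) sequence.

    predict_fn(q_hist, a_hist) should return per-skill probabilities of a
    correct answer at the next step; a real model call would replace it.
    """
    a_seq = []
    for t, q in enumerate(q_seq):
        probs = predict_fn(q_seq[:t], a_seq)
        p = probs[q]                              # prob. for the asked skill
        a_seq.append(1 if rng.random() < p else 0)  # Bernoulli sample
    return a_seq

# Toy stand-in for the trained model: probability 0.7 on every skill.
toy_predict = lambda q_hist, a_hist: [0.7, 0.7, 0.7]

rng = random.Random(0)
answers = generate_answers([0, 2, 1, 0], toy_predict, rng)
```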

Enrique-Val avatar Sep 07 '22 09:09 Enrique-Val

I understand your question to be about the shape of the output sequence. Its shape should be [batch_size, sequence_length]. Also, as you mentioned, the knowledge tracing problem tries to infer the user's response to the $(i+1)$-th question given the user's interactions $1$ through $i$.

In the AUC metric, binary classification is performed: the classifier concludes positive if the probability is higher than 0.5 and negative otherwise. Note, however, that AUC depends only on the ranking of the predicted scores, not on their absolute values, so the model can achieve a high AUC even when its predicted probabilities stay close to 0.5.
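One way to see why probabilities near 0.5 are compatible with a good AUC is that AUC only looks at ranking. A small pure-Python check using the pairwise-ranking definition of AUC (toy labels and scores, chosen for illustration):

```python
def auc(y_true, y_score):
    """Rank-based AUC: fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

y = [0, 0, 1, 1, 0, 1]
sharp     = [0.10, 0.20, 0.90, 0.80, 0.300, 0.700]   # well-separated scores
near_half = [0.49, 0.50, 0.53, 0.52, 0.505, 0.515]   # all hugging 0.5

# Same ordering of scores -> same AUC, despite very different calibration.
```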

Finally, it is theoretically possible to generate a sequence of user responses given a sequence of questions.

hcnoh avatar Sep 08 '22 02:09 hcnoh

Thanks for your very fast response. I think I understand why our output shapes are "different": I'm inputting a single vector, so the shape of my output is [sequence_length, n_skills], whereas if you input a batch, your output will be of shape [batch_size, sequence_length, n_skills]. Thank you for clearing up this issue.

Are the numbers in the output actual probabilities of correctly answering a question, or do they have another interpretation? If they are probabilities, I can share my code for generating sequences with you, in case you want to include it.

Sorry about the AUC question; I'm quite new to neural networks. Have you studied how to train a model that outputs more discriminative numbers (closer to 0 or 1 rather than to 0.5)? Do you have a preliminary idea of which parameters would train a model with such characteristics?

Best regards, Enrique

Enrique-Val avatar Sep 08 '22 02:09 Enrique-Val

Sorry for the late response. I think the output shape corresponding to the input shape of [sequence_length, n_skills] would also be [sequence_length, n_skills]. Maybe it was implemented to do that. (It should be checked.)

Also, I would be very happy if you shared your code for generating sequences. I will check the code and give you feedback. (Or you could contribute it to this repository.)

Finally, if you want the output of the neural network to get closer to 0 or 1, you can use the concept of temperature from the Boltzmann distribution. Let me give you an example. Let $\sigma$ be the sigmoid function. Then $\sigma(\frac{x}{T})$ gets closer to 0.5 as we increase $T$, and conversely closer to 0 or 1 as we decrease $T$ below 1.
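The temperature effect is easy to verify numerically. A self-contained sketch (the values of $x$ and $T$ are arbitrary, chosen only to show both directions):

```python
import math

def sigmoid_with_temperature(x, T):
    """sigma(x / T): T > 1 flattens outputs toward 0.5, T < 1 sharpens them."""
    return 1.0 / (1.0 + math.exp(-x / T))

x = 2.0
flat  = sigmoid_with_temperature(x, 10.0)  # high T: near 0.5
sharp = sigmoid_with_temperature(x, 0.25)  # low T:  near 1
```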

hcnoh avatar Sep 14 '22 04:09 hcnoh