DKVMN

Which assistment 2009?

Open clara2911 opened this issue 4 years ago • 4 comments

Which assistment 2009 dataset are you using? According to this link (https://sites.google.com/site/assistmentsdata/home/assistment-2009-2010-data/skill-builder-data-2009-2010), there are 2 versions, both released after the duplicate-row problem was detected: version (1) with one row per student-problem-skill, and version (2) with one row per student-problem. In version (2), if a problem has multiple skills, they are given as skill1_skill2 (see image below).

Are you using version 1 or version 2? Thank you for your help!

image
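
To make the difference between the two layouts concrete, here is a minimal sketch of converting between them. It assumes each row is a dict with `user_id`, `order_id`, and `skill_id` fields, and that multiple skills are joined with `_` as described above. The function names and the choice of `(user_id, order_id)` as the key identifying one student-problem attempt are hypothetical and not taken from the DKVMN code.

```python
def split_skills(rows, skill_col="skill_id", sep="_"):
    """Version (2) -> version (1): emit one row per student-problem-skill."""
    expanded = []
    for row in rows:
        # A joined value such as "10_12" becomes two rows, one per skill.
        for skill in str(row[skill_col]).split(sep):
            new_row = dict(row)
            new_row[skill_col] = skill
            expanded.append(new_row)
    return expanded


def join_skills(rows, key_cols=("user_id", "order_id"), skill_col="skill_id", sep="_"):
    """Version (1) -> version (2): merge rows that share the same key columns."""
    merged = {}
    for row in rows:
        # Assumption: key_cols together identify a single student-problem attempt.
        key = tuple(row[col] for col in key_cols)
        if key in merged:
            merged[key][skill_col] += sep + str(row[skill_col])
        else:
            new_row = dict(row)
            new_row[skill_col] = str(row[skill_col])
            merged[key] = new_row
    return list(merged.values())
```

Note that the two layouts give different sequence lengths per student, so a model trained on one is not directly comparable with results reported on the other.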

clara2911, Jun 02 '20 10:06

@clara2911 I think neither version is used. It seems there are a lot of versions of assistment2009 (do they keep updating this dataset?). I downloaded this version https://drive.google.com/file/d/0B3f_gAH-MpBmUmNJQ3RycGpJM0k/view?usp=sharing a few weeks ago and tested it on a DKT model, but I could only get an AUC of around 0.74. (My DKT model can reproduce the results from this paper https://files.eric.ed.gov/fulltext/ED592679.pdf, so I guess the problem is not in my model.)

dxywill, Jun 10 '20 20:06

Thanks a lot for your answer. Indeed, there are a lot of different assistment2009 versions. I will try the version you mentioned.

Still, it would be great if the authors of this paper could give a definitive answer on which version they used.

clara2911, Jun 11 '20 06:06

Sorry for the late reply. I have no idea which version we used. But I remember that in our experiments one exercise maps to one skill, so it may be the version with one row per student-problem-skill. Alternatively, could you use the dataset we preprocessed in the Data folder? Thanks.
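
One way to test a candidate download against the "one exercise maps to one skill" recollection is to count the distinct skills attached to each problem. The snippet below is only a sketch under assumptions: rows are dicts with `problem_id` and `skill_id` fields, joined skills use `_` as the separator, and the function names are made up for illustration.

```python
from collections import defaultdict


def skills_per_problem(rows, problem_col="problem_id", skill_col="skill_id", sep="_"):
    """Map each problem to the set of distinct skills seen for it."""
    mapping = defaultdict(set)
    for row in rows:
        # Splitting on sep handles both layouts: "10" and "10_12".
        for skill in str(row[skill_col]).split(sep):
            if skill:  # skip empty skill values
                mapping[row[problem_col]].add(skill)
    return dict(mapping)


def multi_skill_problems(rows, **kwargs):
    """Return only the problems that are tagged with more than one skill."""
    per_problem = skills_per_problem(rows, **kwargs)
    return {p: s for p, s in per_problem.items() if len(s) > 1}
```

If `multi_skill_problems` returns an empty dict for a file, that file is at least consistent with a one-skill-per-exercise setup. It does not prove that it is the same file the authors used.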

jennyzhang0215, Jun 11 '20 08:06

@clara2911 I implemented a PyTorch version (https://github.com/dxywill/pytorch_dkvmn) that directly uses the file https://drive.google.com/file/d/0B3f_gAH-MpBmUmNJQ3RycGpJM0k/view?usp=sharing, if you are interested.

dxywill, Jun 11 '20 12:06