getting multiple marks from the same question

Open kiran-taylor opened this issue 5 years ago • 4 comments

Hi @rbaron. In the image "answered-sheet-photo-result.png", question 6 has two options marked, Option B and Option D. Because two options are marked, the program skips the question. Is it possible to get multiple marks from the same question?

kiran-taylor avatar Dec 24 '20 08:12 kiran-taylor

Hi @kiran-taylor,

To get multiple marked alternatives for a single question, you can tweak the get_marked_alternative function. Currently it identifies the marked alternative by comparing it with the other ones:

# Simple heuristic
if sorted_means[0]/sorted_means[1] > .7:
    return None

This doesn't work if you want to consider multiple marked alternatives. The first thing I would try is using an absolute threshold instead of comparing the darkest alternative with the other ones. In this approach, you would look at every entry in the means list individually and consider it marked if its value is below a threshold. Finding a suitable threshold will likely require some trial and error.
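As a rough illustration of why the ratio heuristic rejects a doubly marked question, here is a minimal sketch with made-up mean intensities. The `ratio_heuristic` helper, its `max_ratio` parameter, the final `argmin` return, and all the numbers are assumptions for illustration, not the project's actual code or values from a real scan:

```python
import numpy as np


def ratio_heuristic(means, max_ratio=0.7):
    """Return the index of the darkest alternative, or None if ambiguous.

    Hypothetical stand-in built around the simple heuristic quoted above:
    the darkest mean must be clearly darker than the second darkest one.
    """
    sorted_means = sorted(means)
    if sorted_means[0] / sorted_means[1] > max_ratio:
        return None
    # Assumption: when the check passes, the darkest alternative is returned.
    return int(np.argmin(means))


# Made-up mean intensities (0 = black, 255 = white) for options A to E.
single_mark = [230.0, 90.0, 225.0, 228.0, 232.0]  # only B is marked
double_mark = [230.0, 90.0, 225.0, 95.0, 232.0]   # B and D are marked

print(ratio_heuristic(single_mark))                 # B stands out clearly
print(ratio_heuristic(double_mark))                 # B and D are too similar
print(ratio_heuristic(double_mark, max_ratio=0.9))  # still too similar
```

With two similarly dark marks, the ratio between the darkest and second darkest mean stays close to 1, so raising the ratio limit does not help much, and even if the check passed, only one index would come back.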

rbaron avatar Dec 24 '20 08:12 rbaron

Yes, I tried increasing the value:

# Simple heuristic
if sorted_means[0]/sorted_means[1] > .9:
    return None 

If a single question has multiple marks, it still compares the alternatives with each other and picks only one of them, leaving the other out.

kiran-taylor avatar Dec 24 '20 08:12 kiran-taylor

I think you would need to modify get_marked_alternative a little more. For instance, you would need to return multiple values from it, instead of just a single marked alternative. Additionally, comparing the "darkest" one with the second darkest one is not useful anymore, since both now could be legitimately marked.

The approach I suggested is something like this:

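import numpy as np

# THRESHOLD must be defined before calling this function; a good value
# has to be found experimentally.

def get_marked_alternative(alternative_patches):
    # Mean pixel intensity of each alternative's patch (darker means lower).
    means = list(map(np.mean, alternative_patches))

    # Here, we return all alternatives that we consider marked.
    return [
        alternative_index
        for alternative_index, mean in enumerate(means)
        if mean < THRESHOLD
    ]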

You would need to experiment with a few values for THRESHOLD in order to figure out a good one that separates marked and unmarked alternatives well.
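One possible way to start that experiment is to look at the mean intensity of each alternative on a sheet whose answers you already know, and pick a value in the gap between the marked and unmarked groups. The sketch below uses synthetic patches; the `suggest_threshold` helper and the `threshold` parameter are hypothetical and not part of the project, and real scans will be noisier:

```python
import numpy as np


def get_marked_alternatives(alternative_patches, threshold):
    """Return the indices of all patches whose mean is below threshold."""
    means = [float(np.mean(patch)) for patch in alternative_patches]
    return [index for index, mean in enumerate(means) if mean < threshold]


def suggest_threshold(marked_means, unmarked_means):
    """Hypothetical helper: midpoint of the gap between the two groups."""
    lightest_marked = max(marked_means)
    darkest_unmarked = min(unmarked_means)
    if lightest_marked >= darkest_unmarked:
        raise ValueError("marked and unmarked means overlap")
    return (lightest_marked + darkest_unmarked) / 2


# Synthetic 10x10 grayscale patches: dark = marked, light = unmarked.
patches = [
    np.full((10, 10), value, dtype=np.uint8)
    for value in (230, 90, 225, 95, 232)
]

# Means taken from a sheet with known answers (made-up numbers here).
threshold = suggest_threshold(
    marked_means=[90, 95],
    unmarked_means=[230, 225, 232],
)
print(threshold)
print(get_marked_alternatives(patches, threshold))
```

If the two groups overlap on real photos, a single global threshold may not be enough, and lighting or scan quality would need attention first.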

rbaron avatar Dec 24 '20 09:12 rbaron

thanks, will try it out.

kiran-taylor avatar Dec 24 '20 09:12 kiran-taylor