[Feature Request]: Mode for scale variables with oldschool AND newschool options

Open bluezone1202-ui opened this issue 4 months ago • 7 comments

JASP Version

0.95.1

Commit ID

No response

JASP Module

Descriptives

What analysis are you seeing the problem on?

Descriptives

What OS are you seeing the problem on?

Other

Bug Description

I'm getting reports from multiple students that the mode is incorrect when they run descriptives. In addition, I just had a student report that the PDF export shows different numbers than the JASP file exported.

Image Image

I'll add some screenshots from two of the students, but there have been about 4-5. Thank you!!

Expected Behaviour

The mode should be 4.0

Steps to Reproduce

The students are importing a .csv. Then they run descriptives for two variables. The variables are set to "scale".

Log (if any)

I don't have the log files because the screenshots are coming from students who are using a different (new) version than me (and my laptop won't download the new version without updating my OS).

More Debug Information

No response

Final Checklist

  • [x] I have included a screenshot showcasing the issue, if possible.
  • [x] I have included a JASP file (zipped) or data file that causes the crash/bug, if applicable.
  • [x] I have accurately described the bug, and steps to reproduce it.

bluezone1202-ui avatar Sep 06 '25 23:09 bluezone1202-ui

I think these are two separate issues:

  1. The computation of the mode. As an aside, if the data are continuous then the mode will be based on a density estimate. When the data are discrete, the mode will be the value with the highest frequency. If the latter result is desired, you have to change the measurement level of the variable. Does this address the issue?
  2. It would be mysterious if JASP would display different results than are in the pdf, since the pdf is based directly on the output. Can you confirm this?
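
To make the distinction in point 1 concrete, here is a minimal sketch of the two definitions in plain Python. The bandwidth rule (a normal-reference rule of thumb, 1.06 · sd · n^(-1/5)), the grid size, and the `scores` data are assumptions for illustration only; this is not JASP's actual implementation, and its density estimate may use different settings.

```python
import math
from collections import Counter
from statistics import stdev


def frequency_mode(values):
    """Most frequently occurring value; ties go to the smallest value."""
    counts = Counter(values)
    top = max(counts.values())
    return min(v for v, c in counts.items() if c == top)


def kde_mode(values, grid_points=512):
    """Location of the peak of a Gaussian kernel density estimate.

    Bandwidth: normal-reference rule of thumb (an assumption, not
    necessarily what JASP uses).
    """
    n = len(values)
    h = 1.06 * stdev(values) * n ** (-1 / 5)
    lo, hi = min(values), max(values)
    grid = [lo + (hi - lo) * i / (grid_points - 1) for i in range(grid_points)]

    def density(x):
        # Unnormalised sum of Gaussian kernels; the constant factor
        # does not change where the peak is.
        return sum(math.exp(-0.5 * ((x - v) / h) ** 2) for v in values)

    return max(grid, key=density)


# Hypothetical whole-number scores, typed as a "scale" variable.
scores = [1, 2, 3, 4, 4, 4, 5, 5, 6, 7, 8]

print(frequency_mode(scores))  # 4
print(kde_mode(scores))        # a non-integer value between 4 and 5
```

With these made-up scores the most frequent value is 4, while the peak of the smoothed density is pulled slightly to the right of 4 by the longer upper tail. That is the same kind of discrepancy as the 4.0 versus 4.150 reported above.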

EJWagenmakers avatar Sep 08 '25 06:09 EJWagenmakers

Hi @bluezone1202-ui,

In addition, could you share the data file (.csv or .jasp; if GitHub won't let you upload it, you can change the extension to .zip)? That would give some more insight into what could be happening with the results.

Cheers, Johnny

JohnnyDoorn avatar Sep 08 '25 09:09 JohnnyDoorn

Recent versions of JASP (0.95.0, or possibly earlier) calculate the mode exactly as described by EJW above, point (1): "if the data are continuous then the mode will be based on a density estimate".

However, this does not match the definition given by most old-fashioned introductory stats textbooks. According to this common definition, the mode computed in the post by [bluezone1202] is correct -- 4.0 (not 4.150).

EARLIER VERSIONS OF JASP COMPUTED THINGS IN THE OLD-FASHIONED WAY. I am sorry to report that my professorial skills are too low to be able to communicate the new-fashioned meaning to my students, so I would like to ask (probably in agreement with bluezone1202?) that JASP revert to the old-fashioned method.

I understand that this can be accomplished, as EJW pointed out, if you "change the measurement level of the variable" from Scale to Ordinal. Again, my students rebelled when I explained this to them; they pointed again to the very clear definition in their (not very old-fashioned) textbook. And they also complained that after such a change, JASP will no longer provide a mean!

Could we have a checkbox to click in the Descriptives panel to tell JASP to calculate a Scale variable's mode in the old-fashioned way (namely, not based on a density estimate), and to preserve its ability to calculate the variable's mean and standard deviation?

Thanks!

--Jim Weinrich Grossmont College

JimWeinrich avatar Oct 15 '25 03:10 JimWeinrich

Here is a PDF showing how earlier versions of JASP used to calculate the mode in the old-fashioned way:

JASP inconsistencies.pdf

JimWeinrich avatar Oct 15 '25 03:10 JimWeinrich

Sounds like a reasonable request to me. While it is good for students to learn that nothing is written in stone (not even textbooks), it is also good to have options.

Another example: I also always struggle a bit to explain why Q3 and Q1 in the boxplot are sometimes not exactly the same as what my students calculate on paper with the oldschool Tukey method. So I can relate. Explaining such things is necessary but hard. And it is good then to show the students that there are many truths, even for the same scale. It is just a question of definition and perspective. And there are many.
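
To illustrate the quartile point, here is a small sketch comparing Tukey's hinges (the paper-and-pencil method) with one common interpolating definition, using only Python's standard library. The helper `tukey_hinges` and the data are hypothetical, and this does not claim to reproduce what JASP's boxplot does internally.

```python
from statistics import median, quantiles


def tukey_hinges(values):
    """Lower and upper hinge: medians of the lower and upper halves.

    When n is odd, the overall median is included in both halves.
    """
    s = sorted(values)
    half = (len(s) + 1) // 2
    return median(s[:half]), median(s[-half:])


data = [1, 2, 3, 4, 5, 6]

print(tukey_hinges(data))                         # (2, 5)
print(quantiles(data, n=4, method="inclusive"))   # [2.25, 3.5, 4.75]
```

Both answers are defensible; they follow different definitions of a quartile, so small differences between software output and hand calculation are expected.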

tomtomme avatar Oct 15 '25 06:10 tomtomme

Hi Jim,

There are two issues here as well:

  1. JASP does not provide a mean for ordinal data. This is because for ordinal data (say a Likert scale), the mean is not defined. However, we realize that people still want to see the mean, and we will make that possible again in the next version of JASP (we are still debating how best to solve this -- tagging @vandenman and @JohnnyDoorn).
  2. About the "old-fashioned" definition of the mode. Presumably you refer to the definition that the mode is the most frequently occurring value in the sample. This definition becomes unusable for continuous data, where every value is unique (up to the accuracy of the measurement instrument). This is stressed and very clearly explained in the Wikipedia entry for mode: https://en.wikipedia.org/wiki/Mode_(statistics) For instance: "For a sample from a continuous distribution, such as [0.935..., 1.211..., 2.430..., 3.668..., 3.874...], the concept is unusable in its raw form, since no two values will be exactly the same, so each value will occur precisely once. In order to estimate the mode of the underlying distribution, the usual practice is to discretize the data by assigning frequency values to intervals of equal distance, as for making a histogram, effectively replacing the values by the midpoints of the intervals they are assigned to. The mode is then the value where the histogram reaches its peak. For small or middle-sized samples the outcome of this procedure is sensitive to the choice of interval width if chosen too narrow or too wide; typically one should have a sizable fraction of the data concentrated in a relatively small number of intervals (5 to 10), while the fraction of the data falling outside these intervals is also sizable. An alternate approach is kernel density estimation, which essentially blurs point samples to produce a continuous estimate of the probability density function which can provide an estimate of the mode."

JASP now uses the kernel density approach. This is not just us being pedantic about stats -- at first we used the "old-fashioned definition" and then found that for continuous data it yielded anomalous results. @vandenman can you still find these data, or construct data to show the anomaly? It would be nice for teaching and/or a blog post I think.
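
As a rough illustration of why the raw definition breaks down for continuous data, the sketch below draws simulated values (not the data referred to above), shows that every value occurs exactly once, and then applies the histogram approach from the quoted passage. The seed, the sample, the bin widths, and the helper `histogram_mode` are all assumptions made for the example.

```python
import math
import random
from collections import Counter

random.seed(1)
# Simulated continuous measurements: 200 draws from a normal distribution.
sample = [random.gauss(50, 10) for _ in range(200)]

# Every value is unique, so "the most frequent value" is not informative:
# all 200 values tie with a count of 1.
print(max(Counter(sample).values()))  # 1


def histogram_mode(values, bin_width):
    """Midpoint of the fullest equal-width bin (ties: first bin encountered)."""
    bins = Counter(math.floor(v / bin_width) for v in values)
    top_bin = max(bins, key=bins.get)
    return (top_bin + 0.5) * bin_width


# The estimate depends on the chosen bin width.
for width in (2, 5, 10):
    print(width, histogram_mode(sample, width))
```

The histogram estimate lands near the centre of the distribution, but it can shift as the bin width changes, which is the sensitivity the quoted passage warns about and one motivation for a kernel density estimate.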

If your textbooks do not mention this complication for continuous data then it is the textbooks that are at fault or at least incomplete. I checked our own textbook with Andy Field, and it is not mentioned there either! @JohnnyDoorn let's put this on the list of improvements for the next version. So your students have the opportunity to learn something that most textbooks (probably -- I did not do a systematic search) gloss over...

Cheers, E.J.

EJWagenmakers avatar Oct 15 '25 06:10 EJWagenmakers

I think for continuous scale data there is no debate. The "new way" in JASP is the way to go. I think the confusion arises with more discrete scale data like counts or data with only a single decimal place. And JASP does not differentiate between the two. It never did.

tomtomme avatar Oct 15 '25 07:10 tomtomme