scaladoc sorts in ASCII order

Open martijnhoekstra opened this issue 5 years ago • 12 comments

reproduction steps

https://www.scala-lang.org/api/current/scala/collection/mutable/Stack.html sorts top after toVector as reported on https://contributors.scala-lang.org/t/scaladoc-ordering/4503

problem

This is sorted in ASCII order rather than alphabetical order. I suspect a quick fix is possible at https://github.com/scala/scala/blob/26dd17aebc988ba84c243e6f23796680df0d4a26/src/scaladoc/scala/tools/nsc/doc/model/Entity.scala#L79 with a toLowerCase
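
A minimal sketch of the difference, using plain strings rather than scaladoc's own entity model. The object and value names here are illustrative assumptions, not the code at the linked line:

```scala
// Illustrative only: plain strings stand in for scaladoc member names.
object SortOrderDemo {
  val names = List("toVector", "top", "toList")

  // The default String ordering compares UTF-16 code units,
  // so uppercase 'L' and 'V' sort before lowercase 'p'.
  val asciiOrder: List[String] = names.sorted
  // List("toList", "toVector", "top")

  // Sorting on a lowercased key gives the case-insensitive order.
  // Locale.ROOT avoids locale-specific case mappings.
  val alphabeticalOrder: List[String] =
    names.sortBy(_.toLowerCase(java.util.Locale.ROOT))
  // List("toList", "top", "toVector")
}
```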

martijnhoekstra avatar Sep 10 '20 15:09 martijnhoekstra

Can someone explain to me why case-insensitive sort is better? Strings are almost always sorted this way.

nafg avatar Sep 10 '20 19:09 nafg

fwiw I tried javadoc on:

package foo;

public interface J {
  void toList();
  void toVector();
  void top();
}

and the output has the case-insensitive sort order:

Modifier and Type    Method and Description
void                 toList()
void                 top()
void                 toVector()

SethTisue avatar Sep 10 '20 19:09 SethTisue

I replied with the "I'm confused now" confused smiley, not the "I don't think this is a good idea but I'm not about to write that out in words" confused smiley. It can be difficult to spot the difference.

martijnhoekstra avatar Sep 11 '20 15:09 martijnhoekstra

personally, I like toList and toVector being grouped, as they're semantically closer? and my intuition says that that holds for methods in general, but I can't actually prove that

NthPortal avatar Sep 12 '20 15:09 NthPortal

I think it's good practice to make human-facing text alphabetical rather than, what I've found out is called, "ASCIIbetical" order. But when the human is a developer and the case is significant, then ASCIIbetical might be the better choice.

dwijnand avatar Sep 14 '20 08:09 dwijnand

If toList and toVector should be grouped, shouldn't we just use a @group for it?

martijnhoekstra avatar Sep 15 '20 12:09 martijnhoekstra

I just mean that semantically, toList and toVector are closer than either is to top

NthPortal avatar Sep 15 '20 16:09 NthPortal

sortWith
sorted

might be an interesting case to ponder.

Also suppose we have:

sortForList
sortForVector
sortForall
sortForeach
sortWith
sorted

ASCII ordering + camel casing creates an interesting effect as if the methods are sorted by camel-cased words, but it could require multiple scanning of A-Z, a-z trying to look for sortForeach if you can't remember the exact casing (after all you're in the Scaladoc).

Case insensitive sorting would do:

sorted
sortForall
sortForeach
sortForList
sortForVector
sortWith

eed3si9n avatar Sep 15 '20 20:09 eed3si9n

but it could require multiple scanning of A-Z, a-z trying to look for [...] if you can't remember the exact casing

that's a very good point, and I think a more compelling one

NthPortal avatar Sep 15 '20 20:09 NthPortal

Maybe instead of sorting by the name as a single word, it should sort by words, case-insensitive. Something like:

def symbolToWords(sym: String): Seq[String] = ...
symbols.sortBy(symbolToWords)(Ordering.Iterable(Ordering.comparatorToOrdering(String.CASE_INSENSITIVE_ORDER)))
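
One way this idea could be fleshed out, as a rough sketch. The splitting rule (break before each uppercase letter) and the hand-written Ordering are assumptions made for illustration; the snippet above leaves symbolToWords unspecified:

```scala
object WordwiseSort {
  // Hypothetical splitter: break a camel-cased name before each uppercase letter,
  // e.g. "sortForList" becomes Seq("sort", "For", "List").
  def symbolToWords(sym: String): Seq[String] =
    sym.split("(?=[A-Z])").toSeq

  // Compare word lists element by element, ignoring case.
  // If one list is a prefix of the other, the shorter list sorts first.
  val byWords: Ordering[Seq[String]] = new Ordering[Seq[String]] {
    def compare(a: Seq[String], b: Seq[String]): Int = {
      val firstDiff = a.iterator.zip(b.iterator)
        .map { case (x, y) => String.CASE_INSENSITIVE_ORDER.compare(x, y) }
        .find(_ != 0)
      firstDiff.getOrElse(Integer.compare(a.length, b.length))
    }
  }

  val symbols = List("sorted", "sortWith", "sortForeach", "sortForall", "sortForVector", "sortForList")

  val sortedByWords: List[String] = symbols.sortBy(symbolToWords)(byWords)
  // List("sortForList", "sortForVector", "sortForall", "sortForeach", "sortWith", "sorted")
}
```

With this splitting rule, the word "sort" sorts before "sorted", so every sortX name lands ahead of sorted. For this particular list the result happens to match the ASCII order shown earlier in the thread.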

nafg avatar Sep 15 '20 21:09 nafg

but it could require multiple scanning of A-Z, a-z trying to look for [...] if you can't remember the exact casing

that's a very good point, and I think a more compelling one

I agree it makes a compelling argument but

multiple scanning of A-Z, a-z trying to look for sortForeach if you can't remember the exact casing

implies (to me) a lack of naming convention or at least a lack of consistency. When a convention is applied consistently, as in the top/toList/toVector case, ASCII order gives better results.

So I think I still favour ASCIIbetical ordering, as if you're looking specifically for "sortforeach" the search functionality should be case-insensitive and find you sortForeach.
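
A sketch of what such a case-insensitive lookup could look like. This is not scaladoc's actual search implementation; the object, the member list, and the search helper are all hypothetical:

```scala
object CaseInsensitiveSearch {
  // Illustrative member names, taken from the example earlier in the thread.
  val members = List("sortForList", "sortForVector", "sortForall", "sortForeach", "sortWith", "sorted")

  // Hypothetical helper: return the member names that contain the query, ignoring case.
  def search(query: String): List[String] = {
    val q = query.toLowerCase(java.util.Locale.ROOT)
    members.filter(_.toLowerCase(java.util.Locale.ROOT).contains(q))
  }
}
```

Here CaseInsensitiveSearch.search("sortforeach") finds sortForeach regardless of how the list itself is ordered.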

dwijnand avatar Sep 16 '20 07:09 dwijnand

MLA citation style says alpha order letter by letter, but I had a notion that

sortBy
sortWith
sorted

is a desired ordering because sort precedes sorted (not that B precedes e).

If there were a method sortaImmutable, it would fall between sortX and sorted.

Similarly for grouping all toX before tokenize, or what have you.

I don't seem to have a book that demonstrates citations with shared prefixes, but this hymnal sorts everything with "O" ("O Beautiful for Spacious Skies") before everything starting "Of" or "On", and so on. (But it is not citation order, which drops initial articles, such as in "The First Nowell".) There must be a word for "sort by complete word prefix".

Just noticed the sort order

scala
scalaDoc
scalac

som-snytt avatar Apr 06 '24 16:04 som-snytt