Regroup filetypes by letter

Open techee opened this issue 1 year ago • 4 comments

This patch converts the currently used groups like "Programming languages", "Scripting languages", etc. to groups based on the starting letter of the language only. There are two main reasons for this change:

  1. Some languages are hard to categorize under a semantic group name, and the current group names don't really fit. In addition, the group name "Programming languages" isn't very good because "Scripting languages" are also a subset of programming languages. On the other hand, it's hard to find a good substitute for "Programming languages": these are mostly "Compiled languages", but not always, and some languages can be both interpreted and compiled, which complicates the situation.
  2. The "Programming languages" group is too big, and the menu is so long that it doesn't fit on smaller screens; one has to scroll the menu to get to the right item, which isn't user friendly. Things will only get worse, as there are still many "Programming languages" that Geany does not support yet and that might be added to the editor in the future.

The newly introduced alphabetic groups are:

A-B
C
D-E-F
G-H-I
J-K-L
M-N-O
P-Q
R-S
T-U-V-W
X-Y-Z

These allow roughly even distribution of existing languages into smaller groups with enough space for possible future language additions. While it would be possible to make the group names more symmetrical, e.g. by having "R-S-T", "U-V-W", I found that the asymmetry helps quicker navigation as one remembers the group with his favorite language is e.g. "the one before the long group" without thinking where exactly in the alphabet the letter is.
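
As a rough illustration of the grouping described above, here is a minimal sketch of mapping a filetype name to one of the listed groups by its first letter. The type `LetterGroup`, the table `letter_groups` and the function `letter_group_for_name()` are hypothetical names made up for this example and are not taken from the patch; only the group labels come from the list above.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical table: each entry pairs a group label from the list above
 * with the inclusive letter range it covers. Not the patch's actual code. */
typedef struct {
    const char *label;
    char first;
    char last;
} LetterGroup;

static const LetterGroup letter_groups[] = {
    { "A-B",     'A', 'B' },
    { "C",       'C', 'C' },
    { "D-E-F",   'D', 'F' },
    { "G-H-I",   'G', 'I' },
    { "J-K-L",   'J', 'L' },
    { "M-N-O",   'M', 'O' },
    { "P-Q",     'P', 'Q' },
    { "R-S",     'R', 'S' },
    { "T-U-V-W", 'T', 'W' },
    { "X-Y-Z",   'X', 'Z' },
};

/* Returns the group label for a filetype name, or NULL when the name is
 * empty or does not start with an ASCII letter. */
static const char *letter_group_for_name(const char *name)
{
    if (name == NULL || name[0] == '\0')
        return NULL;

    unsigned char first = (unsigned char) name[0];
    if (first >= 0x80 || !isalpha(first))
        return NULL;

    char upper = (char) toupper(first);
    for (size_t i = 0; i < sizeof letter_groups / sizeof letter_groups[0]; i++) {
        if (upper >= letter_groups[i].first && upper <= letter_groups[i].last)
            return letter_groups[i].label;
    }
    return NULL;
}
```

With this table, "Python" lands in "P-Q" and "Vala" in "T-U-V-W". Keeping the ranges in a hard-coded table matches the "dumb and straightforward" approach: adding a language never reshuffles the groups.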

Some notes to the implementation:

  1. It mostly follows the existing implementation trying to do minimal changes and doing things in a "dumb and straightforward way". This means that group names are hard coded (they could also be autogenerated, possibly auto-attempting to distribute languages into evenly sized groups).
  2. Technically this change breaks API as it modifies GeanyFiletypeGroupID which is used for the group member of GeanyFiletype which is accessible to plugins. However, this member isn't documented to plugins and no existing plugin from geany-plugins uses it so probably not a big problem.
  3. Because grouping happens automatically now, the [Groups] section from filetype_extensions.conf can be removed and is not read any more.
  4. Because grouping happens automatically now, the [5] argument from FT_INIT() can be removed.
  5. In addition, this patch also removes the [4] argument from FT_INIT() which determined the suffix in the filetype menu like "C++ source file" - IMO the "source file", "file", etc. suffix for all the languages in the menu just introduced visual clutter and made legibility worse. In addition, with the removal of [Groups] from filetype_extensions.conf in (3), it would not be possible to determine the right suffix for custom file types.
  6. The newly introduced groups are untranslatable strings - there should be no need to translate those.

For some more context, see https://github.com/geany/geany/issues/3938#issuecomment-2394477313 and below.

A few screenshots with the new grouping:

[Three screenshots, taken 2024-10-06 at 14:35:12, 14:36:18 and 14:37:18]

techee avatar Oct 06 '24 12:10 techee

Will hopefully inspect and maybe even try "soon" but am re-building my development machine after an SSD death, and then will have to catch up the lost time first, but a couple of comments.

As said elsewhere (many times over the years), these days the distinctions of programming vs script vs AOT vs JIT vs interpreter are meaningless for the language; they apply to the implementation, and in some cases there are multiple implementations, so those categories in the menu are meaningless for the filetype.

  1. agree the menu should be hard coded. Some automajik method that totally reorganised the menu just because a user tried a custom filetype would be very annoying.
  2. As the structure member is not individually documented it's not in the API, so any plugin that uses it has only itself to blame, and so long as the new member is the same size it won't change the ABI either, and so no problems.
  3. agree
  4. agree
  5. personally I don't care about "source file" etc. I only care about the filetype name. Geany does not edit anything but (UTF-8) text and it won't ever edit anything but a "file", so never e.g. "C++ object file", so why say "C++ source file"? Let's see if anyone else who cares can make a cogent argument for keeping the extra text.
  6. agree
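
To make the ABI remark in point 2 concrete, here is a minimal sketch of a compile-time check that swapping one enum for another leaves a struct's size and layout unchanged. All type and member names below are hypothetical stand-ins, not Geany's real declarations, and C11 `_Static_assert` is assumed to be available.

```c
#include <stddef.h>

/* Hypothetical "old" and "new" group enums; the real enumerators differ. */
typedef enum { OLD_GROUP_NONE, OLD_GROUP_COMPILED, OLD_GROUP_SCRIPT,
               OLD_GROUP_COUNT } OldGroupID;
typedef enum { NEW_GROUP_NONE, NEW_GROUP_A_B, NEW_GROUP_C, NEW_GROUP_D_E_F,
               NEW_GROUP_OTHER, NEW_GROUP_COUNT } NewGroupID;

/* Hypothetical structs that differ only in the type of the group member. */
typedef struct { int id; OldGroupID group; char *name; } OldFiletype;
typedef struct { int id; NewGroupID group; char *name; } NewFiletype;

/* If any of these fail, the build stops: the member's size changed, or a
 * later member moved, which would break already-compiled plugins. */
_Static_assert(sizeof(OldGroupID) == sizeof(NewGroupID),
               "group enum changed size");
_Static_assert(offsetof(OldFiletype, name) == offsetof(NewFiletype, name),
               "members after group moved");
_Static_assert(sizeof(OldFiletype) == sizeof(NewFiletype),
               "struct changed size");
```

The size of an enum type is implementation-defined in C, so a check like this only says something about the compiler and target it is built with.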

Speculatively, perhaps there should be an "Other" for non-ASCII names, maybe the new language is named "Åland" for example.

elextr avatar Oct 07 '24 02:10 elextr

Speculatively, perhaps there should be an "Other" for non-ASCII names, maybe the new language is named "Åland" for example.

What happens right now is that languages starting with a non-ASCII letter are placed in the top-level menu and not within the A-Z submenus. But, yes, some "Other" group would probably be better.
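
For illustration only, a minimal sketch of how a name could be routed to an "Other" group when it does not start with an ASCII letter. It assumes names are UTF-8 encoded, so any first byte of 0x80 or above belongs to a non-ASCII character. The function names and the exact "Other" label are assumptions for this example, and sending names that start with a digit or symbol to "Other" is this sketch's choice, not something settled in the discussion.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* True when the UTF-8 string starts with an ASCII letter (A-Z or a-z). */
static bool starts_with_ascii_letter(const char *name)
{
    if (name == NULL || name[0] == '\0')
        return false;

    unsigned char first = (unsigned char) name[0];
    /* Bytes >= 0x80 are lead or continuation bytes of a multi-byte
     * UTF-8 sequence, i.e. the first character is not ASCII. */
    return first < 0x80 && isalpha(first) != 0;
}

/* Returns "Other" for names that cannot go into an A-Z submenu,
 * or NULL when the regular letter-based grouping applies. */
static const char *fallback_group_for_name(const char *name)
{
    return starts_with_ascii_letter(name) ? NULL : "Other";
}
```

Here "Åland" (first byte 0xC3 in UTF-8) falls back to "Other", while "Python" does not.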

techee avatar Oct 11 '24 16:10 techee

There's also some more discussion in https://github.com/geany/geany/issues/2087 which I just found. I still think alphabetical sorting is the easiest to understand rather than some "Pascal-like" or "Python-like" groups.

techee avatar Oct 16 '24 11:10 techee

Agree that alphabetic is the only sensible default menu division. To illustrate categories, look at the Wikipedia categorical section, and look at how many categories each language appears in. It is meaningless. The Geany team should be humble enough to not be programming language wizards and enforce some categorisation, just use alphabetic.

If somebody is soooo convinced that they need non-alphabetical they can make a separate PR built on this that reads a conf file to replace alphabetic and they can do whatever they want.

elextr avatar Oct 16 '24 21:10 elextr

Speculatively, perhaps there should be an "Other" for non-ASCII names, maybe the new language is named "Åland" for example.

Done in the latest commit.

techee avatar Nov 02 '24 22:11 techee

LGBI, will try to make time to test in a few days, but if others test it don't wait for me.

elextr avatar Nov 02 '24 23:11 elextr

Looks good. I kinda liked the idea of separating languages by categories but honestly that simply wasn't feasible.

Question. How is this handled in different locales? Will the types be in the English placement, or in the localized one?

By removing the "source file" suffix you avoided most of the trouble this could cause (e.g. in Spanish nearly all languages are called "Archivo de fuente <type>" or "Archivo <type>", and they'd all be filed under A), but there are still a few localized names (e.g. "Cascading Stylesheet" = "Hoja de estilo en cascada"). Probably the most reasonable approach is to not translate the names, and to use "short names" (e.g. "CSS" rather than "Cascading Stylesheet") in all locales.

cousteaulecommandant avatar Nov 17 '24 14:11 cousteaulecommandant

Question. How is this handled in different locales? Will the types be in the English placement, or in the localized one?

It will be the localized one.

By removing the "source file" suffix you avoided most of the trouble this could cause (e.g. in Spanish nearly all languages are called "Archivo de fuente <type>" or "Archivo <type>", and they'd all be filed under A)

Even if the prefix/suffix stayed there, it would be grouped by <type>.

but there are still a few localized names (e.g. "Cascading Stylesheet" = "Hoja de estilo en cascada"). Probably the most reasonable approach is to not translate the names, and to use "short names" (e.g. "CSS" rather than "Cascading Stylesheet") in all locales.

Yes, I was thinking about this too. The following language names are translatable now:

  • Shell (possibly keep untranslated)
  • Makefile (possibly keep untranslated)
  • Cascading Stylesheet (could become CSS)
  • Config (could become Ini/conf)
  • Gettext translation (???)

So it means these could appear under different letters depending on the locale in use. And as you said, at least for CSS, it would be better to use "CSS" (not sure how common it is to translate Shell or Makefile).

techee avatar Nov 17 '24 17:11 techee

I would've thought "Assembly" would've been translatable too ("Ensamblador" in Spanish), but at least in Geany it's not translated. I suppose "Gettext translation" is translatable because of the "translation" part in the name. No idea about "Makefile". (And I don't think "shell" is meant to be translated as "cáscara" in Spanish) :)

Cascading Stylesheet should probably be left as "CSS", and Gettext translation as "Gettext". Just the language. (About "config"… yeah that's hard since I'm not even sure it has a standard name, and there are billions of slightly different implementations.)

Even if the prefix/suffix stayed there, it would be grouped by <type>.

OK, I probably confused "internal name" and "name displayed on the menu". (Or they're grouped by the untranslated name, even if they're displayed by the translated one, which is what I meant with my first question.)

cousteaulecommandant avatar Nov 17 '24 18:11 cousteaulecommandant

(Or they're grouped by the untranslated name, even if they're displayed by the translated one, which is what I meant with my first question.)

There are 2 types of translatable strings. One denoting the kind of file

  • _("%s source file")
  • _("%s file")
  • _("%s script")
  • _("%s document")

and then the actual translation of the language which is inserted inside one of the above 4 strings (i.e. those 5 languages I mentioned above).

The placement into the alphabetic groups is based on the translation of the language, not the translation of the kind of file (which is gone with this patch anyway)
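
A small sketch of the consequence, using the Spanish example from earlier in the thread: when the grouping letter is taken from the translated name, the same filetype can land under a different letter per locale. `fake_translate_es()` is a hypothetical stand-in for the gettext `_()` call and knows only this one string; `grouping_letter()` is likewise made up for the example.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical stand-in for gettext's _() in a Spanish locale. It only
 * translates the single name used in this example. */
static const char *fake_translate_es(const char *name)
{
    if (strcmp(name, "Cascading Stylesheet") == 0)
        return "Hoja de estilo en cascada";
    return name; /* untranslated names pass through unchanged */
}

/* The letter that decides the group: the first character of whatever
 * name ends up being displayed, uppercased. */
static char grouping_letter(const char *display_name)
{
    return (char) toupper((unsigned char) display_name[0]);
}
```

`grouping_letter("Cascading Stylesheet")` gives 'C', but `grouping_letter(fake_translate_es("Cascading Stylesheet"))` gives 'H', so the entry would move from the "C" group to "G-H-I". An untranslated short name such as "CSS" stays under 'C' in every locale.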

techee avatar Nov 17 '24 19:11 techee

Looks good to me, tested with English and German locale (for German, "Config" is actually translated to "Konfigurationsdatei").

I noticed one issue in the Tools->Configuration Files->Filetype Configuration menu: filetypes.conf is listed under K, which might be unexpected.

So, I'd also vote for making those five filetype names untranslatable and renaming them according to the suggestion in https://github.com/geany/geany/pull/3977#issuecomment-2481384893.

eht16 avatar Nov 18 '24 12:11 eht16

I noticed one issue in the Tools->Configuration Files->Filetype Configuration menu: filetypes.conf is listed under K, which might be unexpected.

Yeah, good point, done.

Also, the human-readable name (of those several filetypes that have one) should start with the same letter as the filetype name, for the very same reason. So

  • for "Po" I used "Po (Gettext)"
  • for "Conf" I used "Conf/Ini"
  • and changed "(O)Caml" to "Caml/OCaml" (thanks to which I could remove a special hack used to handle the braces)
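
The rule above can be expressed as a simple check. This is only a sketch: `title_matches_name_letter()` is a hypothetical helper, not a function from the patch, and it compares just the first byte case-insensitively.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* True when the human-readable title starts with the same letter as the
 * filetype name, ignoring case. Empty or NULL strings never match. */
static bool title_matches_name_letter(const char *name, const char *title)
{
    if (name == NULL || title == NULL || name[0] == '\0' || title[0] == '\0')
        return false;

    return toupper((unsigned char) name[0]) ==
           toupper((unsigned char) title[0]);
}
```

The three renamed titles pass this check ("Po" with "Po (Gettext)", "Conf" with "Conf/Ini", "Caml" with "Caml/OCaml"), whereas the previous "(O)Caml" title and the German "Konfigurationsdatei" for "Conf" would not.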

techee avatar Nov 18 '24 18:11 techee

I've also updated the documentation (i.e. removed stuff related to filetype group configuration). I hope I haven't missed anything. There don't seem to be any screenshots showing the groups so no need to update those.

techee avatar Nov 18 '24 18:11 techee

The latest changes look good and especially Conf/Ini works well.

After resolving the merge conflict, I think we are good to merge.

eht16 avatar Nov 19 '24 14:11 eht16

After resolving the merge conflict, I think we are good to merge.

Done. For merging, I'd just squash the commits into one.

techee avatar Nov 19 '24 22:11 techee