Set which OS to sort by in `os_sorted`

Open PhillipMaire opened this issue 2 years ago • 8 comments

Describe the feature or enhancement: Add an optional argument to `os_sorted` that selects which operating system's sort order to apply, even when you are not running on that operating system. For example, `os_sorted(my_list, force_os='windows')` would sort the way a Windows machine does, even on Unix/macOS.

Provide a concrete example of how the feature or enhancement will improve natsort: This would be useful for anyone who needs to replicate, on one operating system, results produced by code that calls `os_sorted` on another. I have a project where I work on cloud drives across machines and happen to use `os_sorted`, and I would love to replace some code that already integrates `os_sorted`. Alternatively, could you show me how to do this using `os_sort_keygen` or some other method?

Would you be willing to submit a Pull Request for this feature? I don't have the experience to do this, so I am sorry, but I can't help here.

Thank you for your help and the useful package.

PhillipMaire • Apr 24 '22 22:04

This is what I had wanted to do originally, and why I sat on #41 for so many years. Unfortunately, as far as I can tell this is not possible.

To do the OS sorting on Windows, natsort literally loads the low-level Windows function that is responsible for deciding how to sort directory components. This function is simply not available on non-Windows operating systems. It also appears that the sort order of Windows Explorer is proprietary, so I do not have a way to re-implement it.

If you can find anything that refutes either of my findings then I would love to have this implemented.

SethMMorton • Apr 25 '22 03:04

I see. No, I don't have anything to refute your findings, but have you seen this post? (I assume the answer is yes, but just in case.)

Explorer uses the API StrCmpLogicalW() for this kind of sorting (called 'natural sort order').

You don't need to write your own comparison function, just use the one that already exists.

A good explanation can be found here.

That being said, would it be possible to reverse engineer it? You would have to want it badly enough, haha, but in theory, if you have a Windows system and you generate files, you could use a classifier to build your own algorithm that effectively sorts the same way Windows does. Just a thought, though.

PhillipMaire • Apr 25 '22 05:04

StrCmpLogicalW() is what natsort uses under the hood on Windows.

It certainly could be reverse engineered, but unless I am getting paid to do it I am not interested in spending my nights and weekends doing so (especially given that I don't readily have access to a Windows machine).

I would happily accept a PR from someone else who does want to spend their free time doing that.

SethMMorton • Apr 25 '22 16:04
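For readers following along, the StrCmpLogicalW() approach discussed above can be sketched in Python with `ctypes`. This is a minimal sketch rather than natsort's actual source, and the helper names `make_explorer_sort_key` and `explorer_sorted` are hypothetical. It illustrates why the comparison cannot simply be called elsewhere: the function lives in `shlwapi.dll`, which only exists on Windows.

```python
import sys
from functools import cmp_to_key


def make_explorer_sort_key():
    """Return a sort key backed by StrCmpLogicalW, or None when not on Windows.

    Hypothetical helper for illustration only.
    """
    if sys.platform != "win32":
        return None  # shlwapi.dll (and ctypes.windll) only exist on Windows

    import ctypes
    from ctypes import wintypes

    # StrCmpLogicalW(a, b) returns a negative, zero, or positive int,
    # like a classic comparison function.
    compare = ctypes.windll.shlwapi.StrCmpLogicalW
    compare.argtypes = [wintypes.LPCWSTR, wintypes.LPCWSTR]
    compare.restype = ctypes.c_int

    # sorted() wants a key function, so wrap the comparison function.
    return cmp_to_key(compare)


def explorer_sorted(names):
    """Sort names the way the Windows comparison function orders them."""
    key = make_explorer_sort_key()
    if key is None:
        raise OSError("StrCmpLogicalW is only available on Windows")
    return sorted(names, key=key)
```

On a non-Windows machine `explorer_sorted` raises `OSError`, which is exactly the limitation described in this thread.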

Totally get that! Did you see this comment on issue 41? It seems that WINE implements a version of StrCmpLogicalW meant for UNIX.

PhillipMaire • Apr 25 '22 19:04

I went down the rabbit hole a bit, and found that there is quite a bit of code behind that collation function.

If you go down the rabbit hole for CompareStringW (which is what we care about) you end up at https://github.com/wine-mirror/wine/blob/e909986e6ea5ecd49b2b847f321ad89b2ae4f6f1/dlls/kernelbase/locale.c#L2495 which has a fair bit of logic in it. One of the bits of logic involves using bitshifts to access the collation table at https://github.com/wine-mirror/wine/blob/e909986e6ea5ecd49b2b847f321ad89b2ae4f6f1/dlls/kernelbase/collation.c... that's all pretty messy.

Though, the function at https://github.com/wine-mirror/wine/blob/e909986e6ea5ecd49b2b847f321ad89b2ae4f6f1/dlls/kernelbase/locale.c#L2131 looks more promising in terms of being useful in the scope of natsort. Not sure how easily that could be ported. It would still involve that 11K element collation array...

SethMMorton • Apr 25 '22 22:04

I see, I see. Thanks for the effort! I had never heard of bitshifts before, interesting!

OK, well, I will leave this request open in case you or anyone else finds it and wants to implement it, but I won't expect it, for the reasons you mentioned above. Thanks again for your package; it is appreciated.

PhillipMaire • Apr 25 '22 23:04

Oh, one more thing, just a thought: someone could also implement a Windows-ish and a UNIX-ish sorting that would give users the desired results in most cases (excluding some edge cases), with the method hardcoded so that it is reproducible across operating systems.

PhillipMaire • Apr 25 '22 23:04
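To make the "hardcoded" idea concrete, here is a minimal standard-library sketch of an OS-independent natural sort key. It is an assumption-laden illustration, not Windows Explorer's algorithm and not natsort's implementation; the name `portable_natural_key` is hypothetical, and it only handles case-insensitive text with runs of ASCII digits.

```python
import re

# One capture group, so re.split keeps the digit runs in the result.
_DIGIT_RUNS = re.compile(r"([0-9]+)")


def portable_natural_key(name):
    """Hypothetical hard-coded key: case-insensitive text, numeric digit runs."""
    parts = _DIGIT_RUNS.split(name.casefold())
    # The split result alternates text, digits, text, ... so odd indexes
    # are always digit runs; comparing keys never mixes str with int.
    return tuple(int(p) if i % 2 else p for i, p in enumerate(parts))


names = ["file10.txt", "File2.txt", "file1.txt"]
print(sorted(names, key=portable_natural_key))
# ['file1.txt', 'File2.txt', 'file10.txt']
```

Because nothing here consults the operating system or the locale, the same input gives the same order on every machine, at the cost of ignoring the edge cases a real OS sort handles.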

> Oh, one more thing, just a thought: someone could also implement a Windows-ish and a UNIX-ish sorting that would give users the desired results in most cases (excluding some edge cases), with the method hardcoded so that it is reproducible across operating systems.

If you just use `natsorted` with `alg=ns.PATH`, you will get Windows-ish and UNIX-ish results for >90% of the data you would want to sort, and it will be reproducible. Tossing in `ns.LOCALE` brings that closer to >95% of the data.

SethMMorton • Apr 25 '22 23:04
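The suggestion above can be sketched as follows. `natsorted`, `ns.PATH`, and `ns.LOCALE` are natsort's own names; the wrapper `reproducible_path_sort` and its `use_locale` parameter are hypothetical and exist only for illustration. The sketch assumes natsort is installed (`pip install natsort`).

```python
def reproducible_path_sort(paths, use_locale=False):
    """Sort paths with natsort's PATH algorithm instead of os_sorted.

    Hypothetical wrapper; only natsorted and ns come from natsort itself.
    """
    # natsort is a third-party package, imported lazily so that this
    # definition loads even where natsort is not installed.
    from natsort import natsorted, ns

    alg = ns.PATH  # split on path separators and file extensions
    if use_locale:
        # ns.LOCALE follows the active locale: call
        # locale.setlocale(locale.LC_ALL, "") beforehand, and expect
        # results to vary between machines with different locale settings.
        alg = alg | ns.LOCALE
    return natsorted(paths, alg=alg)
```

For example, `reproducible_path_sort(["b/2.txt", "a/10.txt", "a/9.txt"])` returns `["a/9.txt", "a/10.txt", "b/2.txt"]` on any operating system, since `ns.PATH` on its own never calls into OS-specific sorting.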