stellarium icon indicating copy to clipboard operation
stellarium copied to clipboard

Use fast_float library

Open gzotti opened this issue 1 year ago • 9 comments

Is your feature request related to a problem? Please describe.

Sometimes loading large files, esp. large landscape OBJ in scenery3D, takes a long time.

Describe the solution you'd like

The fast_float library https://github.com/fastfloat/fast_float provides float/double parsing with much higher throughput than stdlib. We should incorporate that library and change the OBJ loader to use it.

Describe alternatives you've considered

Currently we just have to sit and wait.

Additional context

Maybe also ssystem_minor.ini parsing would become faster, to be felt esp. with a large number of objects.

gzotti avatar Feb 02 '24 10:02 gzotti

Hello @gzotti!

Thank you for suggesting this enhancement.

github-actions[bot] avatar Feb 02 '24 10:02 github-actions[bot]

Before adding yet another dependency, I suggest that you try std::from_chars, which was added in C++17 exactly for deserialization.

10110111 avatar Feb 02 '24 10:02 10110111

OK, I tried to change parseVec3 and others, but it seems the conversion from the text file to QString (16-bit text) and back to QByteArray with toLocal8Bit(), from where the char* is then available for consumption by std::from_chars is very wasteful to begin with. The whole file should probably be processed with std::ifstream, std::getline, std::string, contradicting our own "use Qt classes" rule.

gzotti avatar Feb 20 '24 17:02 gzotti

with toLocal8Bit()

This is a bad idea in any case, since, as the docs say, "On Windows the system's current code page is used." This implies that you'll get e.g. CP1252 with all its shortcomings. This function should be replaced with QString::toUtf8() if you are going to do the conversions.

The whole file should probably be processed with std::ifstream, std::getline, std::string, contradicting our own "use Qt classes" rule.

If you go this route, don't forget that std::ifstream must get an explicit UTF-8 path, otherwise you'll have problems on Windows with non-ASCII paths (this may happen even in the installation prefix!). To do this, use std::filesystem::u8path. Also, don't use std::ifstream for parsing the numbers, because you'll waste time on handling locale this way, and maybe get into weird performance problems similar to this.

OTOH, you can still use QFile to get all the file contents as QByteArray by QFile::readAll(), or get the text line by line into a char array by QFile::readLine(char*,qint64). The latter will require you to somehow know the size of the buffer in advance, or read in multiple attempts, which may be an uglier way than using std::getline on an ifstream.

10110111 avatar Feb 20 '24 19:02 10110111

Don't we set the C locale somewhere to avoid any such stupid issues? The OBJ text format should contain only ASCII characters (v, vt, vn, f, o, names and numbers with dot decimal separator) in any case. What I think now that should be avoided is reading the textfile into 16bit QString (and split/process into QStringView, as convenient as this path is) when std::from_chars() reads 8bit char* only. The current mixed attempt (I did not make a branch yet) is even a bit slower than the purely Qt solution, so I am not sure if the number parsing or the string mangling is slowing that whole OBJ input so badly.

gzotti avatar Feb 20 '24 21:02 gzotti

Don't we set the C locale somewhere to avoid any such stupid issues?

Regardless of this, iostream's parser may spend some time in the logic around this.

The current mixed attempt (I did not make a branch yet) is even a bit slower than the purely Qt solution, so I am not sure if the number parsing or the string mangling is slowing that whole OBJ input so badly.

Depending on how you implemented this, the problem may be in excessive (re)allocations. You should try profiling the code.

10110111 avatar Feb 21 '24 06:02 10110111

If loading OBJ files is so time-consuming, it may make sense to switch to a binary format like glTF.

10110111 avatar Feb 21 '24 06:02 10110111

I have learned about glTF about two weeks ago ;-) The big advantage of OBJ is that it's so simple and human-readable (so that some problems can be fixed with scripts in Python or even sed/awk). In addition, we have (at least technically, if only applied a few times) extended the MTL format for special purposes. Of course, using some offline preprocessor/converter (which must be free, must remain available, must not fail with tens of millions of triangles, and should be available on all platforms) and then loading glTF as alternative to OBJ usable in many cases would be possible.

gzotti avatar Feb 21 '24 09:02 gzotti

The big advantage of OBJ is that it's so simple and human-readable

glTF binary blob can be extracted into a JSON and a bunch of binary buffers and textures using e.g. gltf-pipeline script. The same utility can convert these items back into a single .glb.

10110111 avatar Feb 21 '24 09:02 10110111

Not fast_float, but another bottleneck was identified.

gzotti avatar Mar 03 '24 16:03 gzotti

Hello @gzotti!

Please check the fresh version (development snapshot) of Stellarium: https://github.com/Stellarium/stellarium-data/releases/tag/weekly-snapshot

github-actions[bot] avatar Mar 10 '24 13:03 github-actions[bot]

Hello @gzotti!

Please check the latest stable version of Stellarium: https://github.com/Stellarium/stellarium/releases/latest

github-actions[bot] avatar Mar 26 '24 06:03 github-actions[bot]