stellarium icon indicating copy to clipboard operation
stellarium copied to clipboard

Faster OBJ import

Open gzotti opened this issue 11 months ago • 6 comments

Description

Loading large OBJ files takes too long, easily more than a minute. I hope with this change loading can be improved.

The current little change does not help yet. We need to change the entire file parsing and avoid all QString-based type conversions. However, this deviates from our rule to use Qt classes where possible.

Fixes #3620

Screenshots (if appropriate):

Type of change

  • [ ] Bug fix (non-breaking change which fixes an issue)
  • [ ] New feature (non-breaking change which adds functionality)
  • [x] Breaking change (fix or feature that would cause existing functionality to change)
  • [ ] This change requires a documentation update
  • [ ] Housekeeping

How Has This Been Tested?

Load a large scene, compare loading times.

Test Configuration:

  • Operating system: Windows 10/11, Ubuntu 22.04LTS
  • Graphics Card: irrelevant

Checklist:

  • [ ] My code follows the code style of this project.
  • [ ] I have performed a self-review of my own code
  • [ ] I have commented my code, particularly in hard-to-understand areas
  • [ ] I have made corresponding changes to the documentation (header file)
  • [ ] I have updated the respective chapter in the Stellarium User Guide
  • [ ] My changes generate no new warnings
  • [ ] I have added tests that prove my fix is effective or that my feature works
  • [ ] New and existing unit tests pass locally with my changes
  • [ ] Any dependent changes have been merged and published in downstream modules

gzotti avatar Feb 26 '24 14:02 gzotti

Great PR! Please pay attention to the following items before merging:

Files matching src/**/*.cpp:

  • [ ] Are possibly unused includes removed?

This is an automatically generated QA checklist based on modified files.

github-actions[bot] avatar Feb 26 '24 14:02 github-actions[bot]

Do you have an example of a large file to load?

10110111 avatar Feb 26 '24 14:02 10110111

I am involved with callanish3d.org. My other large ones are unfortunately not free for distribution.

gzotti avatar Feb 26 '24 15:02 gzotti

OK, from my experiments with Callanish's OBJ file (and Qt 5.15), here's what I got on this branch:

  • Full file loads in 40 s
  • Stubbing out parseVec3 from the "v" handling reduces reading time to 34 seconds
  • Stubbing out parseFace from the "f" handling in addition to parseVec3 from the "v" handling reduces reading time to 24 seconds
  • Resetting the code and stubbing out everything after QString line = stream.readLine().trimmed() reduces reading time to 1.6 s
  • Allowing line.splitRef(separator, Qt::SkipEmptyParts) to run and stubbing everything after it bumps reading time to 24 s.
  • Replacing the separator regex with a QChar(' ') in addition to the above stubbing reduced reading time to 3.8 s.
  • Doing only the replacement of the separator after code reset results in total loading time of 16 s, which alone is quite an improvement over the initial 40 s.

So, looks like the first thing to optimize is the regular expression. Particularly, the separator currently used, \s, is too general for the OBJ file. Just a simple space with skipping of empty parts should be sufficient:

diff --git a/src/core/StelOBJ.cpp b/src/core/StelOBJ.cpp
index fb8f19627d..f0fda5eb54 100644
--- a/src/core/StelOBJ.cpp
+++ b/src/core/StelOBJ.cpp
@@ -814,8 +814,7 @@ bool StelOBJ::load(QIODevice& device, const QString &basePath, const VertexOrder
 
     VertexCache vertCache;
     CurrentParserState state = CurrentParserState();
-    static const QRegularExpression separator("\\s");
-    separator.optimize();
+    static const QChar separator(' ');
 
     int lineNr=0;
 

The parsing of the numbers may actually not even be so important, the real bottlenecks may be elsewhere.

10110111 avatar Feb 26 '24 17:02 10110111

Great! Thanks for experiments. My second idea (after this) was to skip the high-level line splitting entirely and read the lines into char* directly. Probably the new "space only" regexp can go into master for immediate implosion in 24.1. I will test with some of my large scenes.

gzotti avatar Feb 26 '24 19:02 gzotti

skip the high-level line splitting entirely and read the lines into char* directly

I'm afraid the next major bottleneck is not in file IO. And the splitting here is done via QStringRef, which doesn't allocate anything in the heap.

the new "space only" regexp

The point is that it's no longer a regexp, just a fixed char (see the diff above).

10110111 avatar Feb 26 '24 20:02 10110111

I pushed the regex-to-char change.

I also added one more commit, which is not quite complete—in a way. Namely, it improves loading of vertex-only faces while keeping the other combinations of vertex, texture, normal unchanged. Ideally, these should also be changed.

And again, this shows that real bottlenecks are not in number parsing, so I think we can safely remove your first commit.

10110111 avatar Feb 27 '24 07:02 10110111

Now I've pushed the full handling of all cases. Please check how it behaves with textured and normal-mapped models if you have any.

10110111 avatar Feb 27 '24 12:02 10110111

Hopefully later this week, thank you!

gzotti avatar Feb 27 '24 12:02 gzotti

BTW, the vertCache is another source of slowdown. I don't think it should even exist in this code, because normally the model should already be optimized to avoid duplication of vertices, and if it does duplicate them, this may have been done on purpose by the exporting software (e.g. to optimize for GPU cache locality).

10110111 avatar Feb 27 '24 15:02 10110111

Next short view. Hm, the QStringRef is part of Qt5compat which I was happy to get rid of before 1.0. But Qtdocs recommend it explicitly for speeding up parsers etc, just this case. I hope this module will not fall away in 2 years. Have you tried also with QStringView? It compiles with Qt6.6, but I did not install Qt5compat to make comparisons against QStringRef so far.

gzotti avatar Feb 27 '24 21:02 gzotti

the QStringRef is part of Qt5compat which I was happy to get rid of before 1.0

Well, QString::splitRef is also part of this API, and the calls to it are still present in current master. This is why I continued using it. Ideally, indeed, I'd just use QStringView, which is basically the same as QStringRef.

10110111 avatar Feb 28 '24 04:02 10110111

Oh, actually I'm wrong: this API is only used in the Qt5-specific branches. OK, I'll try to switch to QStringView in Qt6 parts, though I really dislike this dependence on Qt version in such a platform-independent code.

10110111 avatar Feb 28 '24 05:02 10110111

Did you notice any performance change? If not, probably all occurrences of QStringRef can be switched to QStringView (to remove the #ifdef Qt6/Qt5 branching), but I think in my edits towards Qt6 I came across some cases where I needed to even move to QString. And probably we should use QStringViews more frequently in many other cases where the string content doesn't change.

gzotti avatar Feb 28 '24 10:02 gzotti

probably all occurrences of QStringRef can be switched to QStringView

Unfortunately, QStringView in Qt5 lacks some methods that are present in QStringRef, so we'll have to keep this conditional compilation as is for now.

10110111 avatar Feb 28 '24 10:02 10110111

Did you notice any performance change?

No, these two classes are the same in principle. It's just that in Qt5 QStringView is incomplete.

10110111 avatar Feb 28 '24 12:02 10110111

OK, I hoped for that. Just the description of QStringRef was so focused on highlighting the use for more efficient parsing.

gzotti avatar Feb 28 '24 12:02 gzotti

Current state: 42s vs. 70s when loading another large scene. Already a big win.

Loading Callanish fresh after program start: 23.4: 34s This: 22s This with from_chars changes reverted: 21s.

OK, it seems I should revert my attempts, and we could finish this already for 24.1.

gzotti avatar Mar 03 '24 11:03 gzotti

Did you check textured scenes?

10110111 avatar Mar 03 '24 11:03 10110111

Yes, with textures and with simple materials. Looks identical to the previous.

gzotti avatar Mar 03 '24 11:03 gzotti

I've pushed the relevant changes to master, reshaping the commits to make the history cleaner. So I suppose this PR can be closed.

What I've noticed just now is that in the Qt6 version loading takes 3× as long as in the Qt5 one. In both cases my changes improve the times, but still Qt6 version appears to be similarly slower.

10110111 avatar Mar 03 '24 16:03 10110111

收到!

ElonVampire avatar Mar 03 '24 16:03 ElonVampire

OMG! What is it then? QVector/QList were more or less merged in Qt6. Does that make a difference? OK, we can call the branch closed, but if you have an idea, please feel free to further improve.

gzotti avatar Mar 03 '24 16:03 gzotti