Too many threads?

Open robspearman opened this issue 3 years ago • 9 comments

My application has a handful of concurrent osgEarth instances and I am updating it from 2.10 to use the osgEarth master branch.

Performance is a big problem, particularly approaching a body surface. I am noticing up to 2400 threads at a single time and very high CPU load. Are that many threads really expected?

I also get a lot of repeating "Error 4": "too many open files" for even small 8k x 4k layer textures (geotiff).

robspearman avatar May 31 '21 00:05 robspearman

2400 threads is definitely too many. What platform are you running on and what does your earth file look like?

jasonbeverage avatar May 31 '21 00:05 jasonbeverage

Linux, 8 core CPU. Example earth file (layers are added programmatically):

<map name="Mercury" type="geocentric" version="2"> 
   <options> 
      <profile> 
        <srs>+proj=latlong +a=2439700 +b=2439700</srs> 
      </profile> 
      <terrain driver="rex" 
       lighting="false" 
       cluster_culling="true" 
       tile_pixel_size="135" 
       range_mode="PIXEL_SIZE_ON_SCREEN" 
       morph_imagery="true" 
       morph_terrain="true" 
       normal_maps="true" 
       high_resolution_first="false" 
       first_lod="2" 
       skirt_ratio="0.1" 
       tile_size="17" 
       progressive="true"/> 
   </options> 
</map>

robspearman avatar May 31 '21 14:05 robspearman

What do the layers look like, though? It sounds like it might have something to do with the type and quantity of files you are using. For example, if you are using jp2 files, I know that the default OpenJPEG driver in GDAL wants to kick off multiple threads just for a single read. Couple that with our threading plus OSG's threading and you might well end up seeing a bunch of threads. I would doubt 2400, and in theory they should be short-lived, but that might be what you are seeing.

jasonbeverage avatar May 31 '21 15:05 jasonbeverage

These are just test layers, so all are geotiffs under 16k x 8k. Some are 8k jpegs. I have 8 osgEarth instances loaded, but only one is prominently visible at a time.

With osgEarth 2.10 and the same instances and textures and gdal version I didn't notice an issue. Is your 3.x threading much different? It's completely possible I broke something in the migration.

robspearman avatar May 31 '21 15:05 robspearman

Something odd is that view->getDatabasePager()->getNumDatabaseThreads() is showing zero database threads and the OSG stats display does not show any database pager info.

I checked back to my working osgEarth 2.10 branch and it is showing the same there since upgrading to OSG 3.6.5.  However, it does show database pager info in the stats.

Does this raise a red flag?

Rob

robspearman avatar Jun 01 '21 01:06 robspearman

That's not surprising. osgEarth master uses a new task-based multithreading mechanism and no longer uses the database pager for multithreading.

jasonbeverage avatar Jun 01 '21 02:06 jasonbeverage

I can fix the too many open files errors by changing my ulimit (ulimit -n 4096).
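For readers hitting the same error, the `ulimit -n 4096` workaround can also be applied from inside the process at startup. The following is only a sketch, assuming a Linux/POSIX system; `raiseOpenFileLimit` is a hypothetical helper name and not part of osgEarth or OSG. It raises the soft limit on open files, capped at the hard limit, and never lowers a limit that is already higher.

```cpp
#include <sys/resource.h>
#include <algorithm>

// Hypothetical helper (not an osgEarth/OSG API): raise the soft
// RLIMIT_NOFILE toward `wanted`, capped at the hard limit.
// Returns the soft limit in effect afterwards, or 0 on failure.
rlim_t raiseOpenFileLimit(rlim_t wanted)
{
    struct rlimit lim{};
    if (getrlimit(RLIMIT_NOFILE, &lim) != 0)
        return 0;

    // Never lower a soft limit that is already higher than requested.
    rlim_t target = std::max(lim.rlim_cur, wanted);

    // An unprivileged process cannot exceed the hard limit.
    if (lim.rlim_max != RLIM_INFINITY)
        target = std::min(target, lim.rlim_max);

    if (target != lim.rlim_cur) {
        lim.rlim_cur = target;
        if (setrlimit(RLIMIT_NOFILE, &lim) != 0)
            return 0;
    }
    return lim.rlim_cur;
}
```

Calling `raiseOpenFileLimit(4096)` before any map is created would mirror the shell command. As with the shell command, this treats the symptom and does not explain why so many descriptors are open in the first place.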

What used to happen is that osgEarth instances initialized at startup.  Now they do not initialize until they LOD in the first time, at which point there is an unacceptable multi-second hang.

Once that is done the thread count is minimal, but performance is unusable at < 10 fps near the surface.  These are not even large textures.

Can I force earlier initialization?

How can I debug the poor performance?

robspearman avatar Jun 16 '21 18:06 robspearman

Rob, you might need to send Glenn or me the data to take a look at. You shouldn't have to raise your ulimit (you're not actually using over 4096 files, right?). I suspect this is related to the giant number of threads that are getting initialized for whatever reason. The GDAL driver has changed to open a copy of the GDAL dataset per thread rather than sharing a single one, which is great for performance but will result in more file handles than you had before. But that doesn't explain why it's starting an insane number of threads to begin with.
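One way to check whether the process really approaches the descriptor limit is to count its open file descriptors while the layers load. A minimal sketch, assuming Linux (it relies on `/proc/self/fd`); `countOpenFileDescriptors` is a hypothetical helper name, not an osgEarth or GDAL call.

```cpp
#include <dirent.h>

// Hypothetical helper: count this process's open file descriptors
// by listing /proc/self/fd (Linux only). Returns -1 if unreadable.
long countOpenFileDescriptors()
{
    DIR* dir = opendir("/proc/self/fd");
    if (!dir)
        return -1;

    long count = 0;
    while (struct dirent* entry = readdir(dir)) {
        // Skip the "." and ".." entries.
        if (entry->d_name[0] != '.')
            ++count;
    }
    closedir(dir);

    // opendir() itself held one descriptor during the scan.
    return count - 1;
}
```

Logging this value once per second next to the thread count would show whether the descriptor count tracks the number of threads, as a per-thread dataset copy would suggest.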

If you run osgearth_viewer with a simple earth file do you see the large number of threads as well or is this only in your app?

jasonbeverage avatar Jun 16 '21 19:06 jasonbeverage

osgearth_viewer loads the same textures instantaneously, with no noticeable delay.  However, the thread count is huge.  984 threads for one body, but eventually drops down to only 9.
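Thread counts like these can also be sampled from inside the application, rather than from an external tool, to see when the spike occurs. A minimal sketch, assuming Linux: it reads the `Threads:` field from `/proc/self/status`. `currentThreadCount` is a hypothetical helper name, not part of osgEarth or OSG.

```cpp
#include <fstream>
#include <limits>
#include <string>

// Hypothetical helper: return the kernel's thread count for this
// process from /proc/self/status (Linux only), or -1 if unavailable.
int currentThreadCount()
{
    std::ifstream status("/proc/self/status");
    std::string key;
    while (status >> key) {
        if (key == "Threads:") {
            int n = 0;
            if (status >> n)
                return n;
            return -1;
        }
        // Not the field we want: skip the rest of this line.
        status.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
    }
    return -1;
}
```

Calling this from the frame loop and logging the value when it changes would show whether the count peaks during the first load of a body and then settles, as described above.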

I left my application the same and just deleted the texture files.  Frame rate was poor still at ~20fps near a surface. Camera stats if it helps:

cull: 23 ms

draw: 14 ms

GPU: 12 ms

robspearman avatar Jun 16 '21 19:06 robspearman