MSSQL query is really slow because of ORDER BY
Expected behavior and actual behavior.
I'd like to upgrade from MapServer 7.6.0 to MapServer 8.0.0, but the MSSQL layer queries are becoming really slow. I'm using the msplugin_mssql2008.dll provided with the binaries of MapServer.
This is the connection string I'm using under both versions:
CONNECTION "Driver={ODBC Driver 17 for SQL Server};Server=(local);Database=dbname;trusted_connection=yes;"
DATA "ogr_geometry from clu USING UNIQUE ogr_fid USING SRID=3857"
By upgrading the version from 7.6.0 to 8.0.0, the SQL query has an added ORDER BY [ogr_fid] statement.
Without the ORDER BY statement the query finishes in less then a second. When the ORDER BY statement is added by 8.0.0, the query takes over a minute to finish.
The table I'm querying has 39 169 857 records. I've never had an issue with the query performance in years, but this is blocking me from upgrading.
Is there any way to bypass that ORDER BY statement without building a custom version of MapServer?
MapServer version and installation method
I'm switching from 7.6.0 version from MS4W to the 8.0.0 version from GISInternals
@petrovicz - I'm using MapServer 8.0 and don't see this behaviour with a WMS request. What type of request are you using?
I'm using a WFS request.
This is the request URL I'm using:
http://localhost/cgi-bin/mapserv.exe?map=clu.map&service=WFS&version=1.1.0&request=GetFeature&OUTPUTFORMAT=GML2&typename=clu&SRS=EPSG:3857&BBOX=-11333373.461429717,4035961.002164016,-11317919.711706003,4045672.897209224
@petrovicz - the ORDER BY ensures that WFS requests return the same data, otherwise operations such as paging won't work correctly. Assuming the BBOX returns a lot less than 40 million records you may want to check indexes on your query, and create a default ORDER BY with a clustered index (on [ogr_fid]), so performance should be similar to before.
As a workaround you could try adding ORDER BY (SELECT NULL) which should avoid any ordering:
DATA "ogr_geometry from clu USING UNIQUE ogr_fid USING SRID=3857 ORDER BY (SELECT NULL)"
@geographika I'm just thinking, that if just nextShape is used by the WFS query, we should not necessary to force an ordered query since we use the same result set. However the divers may not be notified well enough, that just a forward only query is taking place. We could also use scrollable cursors in some cases, but that would be less efficient I think.
@szekerest - just to note I've run into issues if caching is used with no order.
For example when using wfs_features_cache_count initial results are saved in the first pass query, and then a second query is made (in the same MapServer request) to get any additional records. The second query returns different results which can mean duplicates with the first query causing strange results.
Having said that I'm not against removing a default ORDER BY (or switching to ORDER BY (SELECT NULL)). WFS paging etc. would then be broken without explicitly adding an ORDER BY, so either option has drawbacks.
Thank you all for the tips!
After I added an index on the ogr_fid with a default ordering, I got the fast speeds back. However when I add the maxFeatures CGI variable to the request it slows down again.
I can skip the maxFeatures variable for now and restrict the maximum size of the BBOX as an alternative.
Assigned to @szekerest or should this just be closed?
@sdlime That will require a bit more investigation.