Examine icon indicating copy to clipboard operation
Examine copied to clipboard

NativeQuery performance CPU usage

Open Sven883 opened this issue 1 year ago • 4 comments
trafficstars

Hello @Shazwazza !

We're using Lucene.Net v3.0.3 with Examine v1.2.2 in an umbraco v8.14.1 installation. We're also using https://github.com/skttl/umbraco-fulltextsearch8 package v1.3.0.

Our externalindex has a document count of 8590 items with fieldcount of 109. We've exluded a lot of itemTypes and only have necessary document types added to the externalIndex.

We've altered the externalindex by using this code:

public class CustomizeExternalIndexComposer : IUserComposer
{
    public void Compose(Composition composition)
    {
        composition.RegisterUnique<IUmbracoIndexConfig, CustomIndexConfig>();
    }
}

public class CustomIndexConfig : UmbracoIndexConfig, IUmbracoIndexConfig
{
    private IScopeProvider _scopeProvider;

    public CustomIndexConfig(IPublicAccessService publicAccessService, IScopeProvider scopeProvider) : base(publicAccessService)
    {
        _scopeProvider = scopeProvider;
    }

    IContentValueSetValidator IUmbracoIndexConfig.GetPublishedContentValueSetValidator()
    {
        var excludeItemTypes = Current.Services.ContentTypeService.GetAll().Where(x => !x.IsElement && !x.PropertyTypeExists("allowSearching"))?.Select(x => x.Alias) ?? null;

        return new ContentValueSetValidator(true, false, PublicAccessService, _scopeProvider, null, null, excludeItemTypes);
    }
}

In our backoffice we have a composition with a property 'allowSearching' that is added to each document type that should be in the externalIndex.

As already mentioned, we're using the fulltextsearch package and we notice that our CPU is spiking when performing the nativeQuery code: return searcher.CreateQuery().NativeQuery(query.ToString()).Execute(_search.PageLength * _currentPage);

Example of a query:

(( 
    ((nodeName:"examplesearchquery"^20 OR nodeName_nl:"examplesearchquery"^20) 
      (FullTextContent:"examplesearchquery"^2 OR FullTextContent_nl:"examplesearchquery"^2)
    )  
    OR  
    (  
        (nodeName:examplesearchquery^10 OR nodeName_nl:examplesearchquery^10) 
        (nodeName:examplesearchquery*^5 OR nodeName_nl:examplesearchquery*^5) 
        (FullTextContent:examplesearchquery OR FullTextContent_nl:examplesearchquery) 
        (FullTextContent:examplesearchquery*^0.5 OR FullTextContent_nl:examplesearchquery*^0.5)
    ) 
)) 
AND 
(__NodeTypeAlias:faqItem) 
AND 
(__IndexType:content AND ((__VariesByCulture:y AND __Published_nl:y) OR (__VariesByCulture:n AND __Published:y))) 
AND 
-(templateID:0)

This is an example where we would only search for one specific contentType. Another example is our 'findasyoutype' functionality that searches through some more contentTypes at once:

(
    (
        (nodeName:examplesearchquery^10 OR nodeName_nl:examplesearchquery^10)
        (nodeName:examplesearchquery*^5 OR nodeName_nl:examplesearchquery*^5)
        (FullTextContent:examplesearchquery OR FullTextContent_nl:examplesearchquery)
        (FullTextContent:examplesearchquery*^0.5 OR FullTextContent_nl:examplesearchquery*^0.5)
    )
)
AND
(
    __NodeTypeAlias:contentPage OR
    __NodeTypeAlias:labItem OR
    __NodeTypeAlias:newsItem OR
    __NodeTypeAlias:project OR
    __NodeTypeAlias:technicalCommittee OR
    __NodeTypeAlias:standardsAntennaItem OR
    __NodeTypeAlias:jobItem OR
    __NodeTypeAlias:tool OR
    __NodeTypeAlias:contact OR
    __NodeTypeAlias:technologiesDetail
)
AND
(__IndexType:content AND ((__VariesByCulture:y AND __Published_nl:y) OR (__VariesByCulture:n AND __Published:y)))
AND
-(templateID:0)

Any idea why the nativeQuery exeuction is causing high CPU usage of 100%?

edit: I've noticed that my output window shows this when performing the nativeQuery. Maybe this rings a bell? :-)

9:56:30:515	Exception thrown: 'System.NotSupportedException' in Lucene.Net.dll
09:56:42:306	Exception thrown: 'Lucene.Net.QueryParsers.QueryParser.LookaheadSuccess' in Lucene.Net.dll
09:56:42:306	Exception thrown: 'System.IO.IOException' in Lucene.Net.dll
09:56:42:306	Exception thrown: 'System.IO.IOException' in Lucene.Net.dll
09:56:42:306	Exception thrown: 'Lucene.Net.QueryParsers.QueryParser.LookaheadSuccess' in Lucene.Net.dll
09:56:42:306	Exception thrown: 'Lucene.Net.QueryParsers.QueryParser.LookaheadSuccess' in Lucene.Net.dll
09:56:42:556	Exception thrown: 'Lucene.Net.QueryParsers.QueryParser.LookaheadSuccess' in Lucene.Net.dll
09:56:42:556	Exception thrown: 'Lucene.Net.QueryParsers.QueryParser.LookaheadSuccess' in Lucene.Net.dll
09:56:42:556	Exception thrown: 'Lucene.Net.QueryParsers.QueryParser.LookaheadSuccess' in Lucene.Net.dll
09:56:42:556	Exception thrown: 'Lucene.Net.QueryParsers.QueryParser.LookaheadSuccess' in Lucene.Net.dll
09:56:42:556	Exception thrown: 'Lucene.Net.QueryParsers.QueryParser.LookaheadSuccess' in Lucene.Net.dll
09:56:42:556	Exception thrown: 'Lucene.Net.QueryParsers.QueryParser.LookaheadSuccess' in Lucene.Net.dll
09:56:42:805	Exception thrown: 'Lucene.Net.QueryParsers.QueryParser.LookaheadSuccess' in Lucene.Net.dll
09:56:42:805	Exception thrown: 'Lucene.Net.QueryParsers.QueryParser.LookaheadSuccess' in Lucene.Net.dll
09:56:42:805	Exception thrown: 'Lucene.Net.QueryParsers.QueryParser.LookaheadSuccess' in Lucene.Net.dll
09:56:42:805	Exception thrown: 'Lucene.Net.QueryParsers.QueryParser.LookaheadSuccess' in Lucene.Net.dll
09:56:42:805	Exception thrown: 'Lucene.Net.QueryParsers.QueryParser.LookaheadSuccess' in Lucene.Net.dll
09:56:42:805	Exception thrown: 'Lucene.Net.QueryParsers.QueryParser.LookaheadSuccess' in Lucene.Net.dll
09:56:43:052	Exception thrown: 'Lucene.Net.QueryParsers.QueryParser.LookaheadSuccess' in Lucene.Net.dll
09:56:43:052	Exception thrown: 'Lucene.Net.QueryParsers.QueryParser.LookaheadSuccess' in Lucene.Net.dll
09:56:43:052	Exception thrown: 'Lucene.Net.QueryParsers.QueryParser.LookaheadSuccess' in Lucene.Net.dll
09:56:43:052	Exception thrown: 'Lucene.Net.QueryParsers.QueryParser.LookaheadSuccess' in Lucene.Net.dll
09:56:43:052	Exception thrown: 'Lucene.Net.QueryParsers.QueryParser.LookaheadSuccess' in Lucene.Net.dll
09:56:43:052	Exception thrown: 'Lucene.Net.QueryParsers.QueryParser.LookaheadSuccess' in Lucene.Net.dll
09:56:43:052	Exception thrown: 'Lucene.Net.QueryParsers.QueryParser.LookaheadSuccess' in Lucene.Net.dll
09:56:43:318	Exception thrown: 'Lucene.Net.QueryParsers.QueryParser.LookaheadSuccess' in Lucene.Net.dll
09:56:43:318	Exception thrown: 'Lucene.Net.QueryParsers.QueryParser.LookaheadSuccess' in Lucene.Net.dll
09:56:43:318	Exception thrown: 'Lucene.Net.QueryParsers.QueryParser.LookaheadSuccess' in Lucene.Net.dll
09:56:43:318	Exception thrown: 'Lucene.Net.QueryParsers.QueryParser.LookaheadSuccess' in Lucene.Net.dll
09:56:43:318	Exception thrown: 'Lucene.Net.QueryParsers.QueryParser.LookaheadSuccess' in Lucene.Net.dll
09:56:43:318	Exception thrown: 'Lucene.Net.QueryParsers.QueryParser.LookaheadSuccess' in Lucene.Net.dll
09:56:43:605	Exception thrown: 'Lucene.Net.QueryParsers.QueryParser.LookaheadSuccess' in Lucene.Net.dll
09:56:43:605	Exception thrown: 'Lucene.Net.QueryParsers.QueryParser.LookaheadSuccess' in Lucene.Net.dll
09:56:43:605	Exception thrown: 'System.IO.IOException' in Lucene.Net.dll
09:56:43:605	Exception thrown: 'Lucene.Net.QueryParsers.QueryParser.LookaheadSuccess' in Lucene.Net.dll
09:56:43:605	Exception thrown: 'System.IO.IOException' in Lucene.Net.dll
09:56:43:605	Exception thrown: 'System.IO.IOException' in Lucene.Net.dll

Any help would be greatly apreciated!

Thanks!

Sven883 avatar Jan 23 '24 07:01 Sven883

@Shazwazza i've done some extra investigation and tried to simplify the above query. I've changed my code a bit and my query now looks like this:

+(FullTextContent_nl:ventilatie* FullTextContent:ventilatie* nodeName_nl:ventilatie* nodeName:ventilatie*) +(__NodeTypeAlias:contentpage __NodeTypeAlias:labitem __NodeTypeAlias:newsitem __NodeTypeAlias:project __NodeTypeAlias:technicalcommittee __NodeTypeAlias:standardsantennaitem __NodeTypeAlias:jobitem __NodeTypeAlias:tool __NodeTypeAlias:contact __NodeTypeAlias:technologiesdetail)

I've noticed that our CPU usage is also spiking when using this query in de examine management tab in our externalIndex to about 80-90%.

Sven883 avatar Jan 23 '24 11:01 Sven883

Hi, you'll need to do some more investigating on this one to determine the underlying problem.

  • What happens when you only search non 'FullTextContent' fields?
  • What happens when you only do a very simple search?
  • Is the problem directly related to the FullTextContent or not?
  • Is the problem related to building the query or executing the query?
  • Can you create a dump file for analysis of where the CPU bottleneck is?
  • Can you replicate this from scratch with a simple setup?

Shazwazza avatar Jan 23 '24 17:01 Shazwazza

Hi @Shazwazza ,

I've done some more investigation. We've got some pages with a lot of content (like a lot lot :-) ).

The FullTextSearch package creates an extra property in the externalIndex with a string that is generated based on the content of that page.

We notice that some pages have a fullTextSearch value in examine of 1 million+. I'm guessing that this can cause some serious issues in performance when quering the fullTextSearch property with an examine query?

And if so - What would be the ideal max length (if there is any) that a examine property value can have?

Thanks a lot!

Sven883 avatar Jan 25 '24 10:01 Sven883

You should also check if the fulltextsearch project specifies LuceneSearchOptions.AllowLeadingWildcard = true. If that is the case, this is most likely the cause for the performance degredation. Lucene specifically says that enabling this

//     When set, * or ? are allowed as the first character of a PrefixQuery and WildcardQuery.
//     Note that this can produce very slow queries on big indexes.

Shazwazza avatar Jan 26 '24 19:01 Shazwazza

I will close this issue since I don't think this is an issue with Examine but more the usage of Examine via the FullTextSearch package. As for what is an ideal max length in a field, you'll need to investigate lucene docs/forums regarding that. I assume there is some performance penalties for indexing values of such huge sizes.

Shazwazza avatar Jun 14 '24 15:06 Shazwazza