PetaPoco icon indicating copy to clipboard operation
PetaPoco copied to clipboard

Page query is very slow when use a long sql statement

Open kuangyunsheng opened this issue 9 years ago • 9 comments

When call db.Page<T> method with a long sql statement (e.g. contains more than 400 charactors), it is very slow.

I find it caused by the regular expression "RegexOrderBy".

In BuildPageQuery function, it try to use this regular expression to process the sql statement: "parts.SqlSelectRemoved = helper.RegexOrderBy.Replace(parts.SqlSelectRemoved, "", 1);"

But I don't konw how to optimize it.

kuangyunsheng avatar Feb 19 '16 09:02 kuangyunsheng

I write code to test the regular expression:

Regex RegexOrderBy = new Regex( @"\bORDER\s+BY\s+(?!.*?(?:\)|\s+)AS\s)(?:\((?>\((?<depth>)|\)(?<-depth>)|.?)*(?(depth)(?!))\)|[\[\]""\w().])+(?:\s+(?:ASC|DESC))?(?:\s_,\s_(?:((?>((?)|)(?<-depth>)|.?)(?(depth)(?!)))|[[]`""\w().])+(?:\s+(?:ASC|DESC))?)", RegexOptions.RightToLeft | RegexOptions.IgnoreCase | RegexOptions.Multiline | RegexOptions.Singleline | RegexOptions.Compiled);

string s = "[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd] order by a,b,c";
TestReplace(s);
s = "[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd],[ab],[cd]";
        TestReplace(s);

void TestReplace(string s) { DateTime d1 = DateTime.Now; RegexOrderBy.Replace(s, "", 1); Console.Write("replace "{0}" \r\ncost {1} seconds\r\n\r\n", s, (DateTime.Now - d1).TotalSeconds); } ` when the sql contains "order by", it's fast,if no "order by" it becomes very slowly.

see the test result: image

and, if no "order by", it becomes more and more slowly when sql statement becomes longer.

kuangyunsheng avatar Feb 22 '16 02:02 kuangyunsheng

So, following is my suggestion:

change the function BuildPageQuery of SqlServerDatabaseProvider

`

Regex SimpleRegexOrderBy =
            new Regex(
                @"\bORDER\s+BY\s+",
                RegexOptions.RightToLeft | RegexOptions.IgnoreCase | RegexOptions.Multiline | RegexOptions.Singleline | RegexOptions.Compiled);
public override string BuildPageQuery(long skip, long take, SQLParts parts, ref object[] args)
        {
            var helper = (PagingHelper) PagingUtility;
            //when contains no "order by", its is very slow
            //parts.SqlSelectRemoved = helper.RegexOrderBy.Replace(parts.SqlSelectRemoved, "", 1);
            //use a simple regular expression to find whether contains "order by"
            if(SimpleRegexOrderBy.IsMatch(parts.SqlSelectRemoved))
            {
                parts.SqlSelectRemoved = helper.RegexOrderBy.Replace(parts.SqlSelectRemoved, "", 1);
            }
            if (helper.RegexDistinct.IsMatch(parts.SqlSelectRemoved))
            {
                parts.SqlSelectRemoved = "peta_inner.* FROM (SELECT " + parts.SqlSelectRemoved + ") peta_inner";
            }
            var sqlPage = string.Format("SELECT * FROM (SELECT ROW_NUMBER() OVER ({0}) peta_rn, {1}) peta_paged WHERE peta_rn>@{2} AND peta_rn<=@{3}",
                parts.SqlOrderBy == null ? "ORDER BY (SELECT NULL)" : parts.SqlOrderBy, parts.SqlSelectRemoved, args.Length, args.Length + 1);
            args = args.Concat(new object[] {skip, skip + take}).ToArray();

            return sqlPage;
        }

kuangyunsheng avatar Feb 22 '16 02:02 kuangyunsheng

Thanks @kuangyunsheng I'll have a look at this soon

pleb avatar Feb 22 '16 07:02 pleb

Thanks a lot @kuangyunsheng it work! You are my God!!!!

FabioParigi avatar Apr 20 '16 13:04 FabioParigi

Thanks a lot @kuangyunsheng it work!Prefect!!

dongfangzhizhu avatar Jun 13 '16 08:06 dongfangzhizhu

Thank you, @kuangyunsheng , you saved my days

phamtienhung avatar Jun 17 '16 04:06 phamtienhung

Hello, the same problem is present on the SplitSQL method of PagingHelper class. Should it be fixed there too ?

AlbertoCe avatar May 07 '17 08:05 AlbertoCe

@AlbertoCe did you want to submit a PR?

pleb avatar May 09 '17 09:05 pleb

Ok, I'll do a PR as soon as possible (today or tomorrow). I found that SplitSQL and BuildPageQuery methods needs to be slightly modified or, for complex queries, they could became very very slow.

AlbertoCe avatar May 09 '17 09:05 AlbertoCe