active-record icon indicating copy to clipboard operation
active-record copied to clipboard

ActiveQuery::one() limiting database records

Open hvta opened this issue 7 years ago • 33 comments

What steps will reproduce the problem?

ActiveQuery::one() doesn't seem to limit records in SQL.

What is the expected result?

Queries created with ActiveQuery::one() should have a LIMIT 1 clause in the prepared SQL.

What do you get instead?

The function selects all rows from a table resulting in excess memory consumption.

Additional info

Q A
Yii version 2.0.?
PHP version
Operating system

Funding

  • You can sponsor this specific effort via a Polar.sh pledge below
  • We receive the pledge once the issue is completed & verified
Fund with Polar

hvta avatar Mar 29 '17 13:03 hvta

See #9578

rob006 avatar Mar 29 '17 20:03 rob006

@samdark Maybe it is worth add some shortcut for that, like ActiveQuery::first()? Just see how many duplicates have #4348 or #9578.

rob006 avatar Mar 29 '17 23:03 rob006

Yeah... maybe.

samdark avatar Mar 29 '17 23:03 samdark

As I see there were some attempts: #12563

MysteryDragon avatar Mar 30 '17 06:03 MysteryDragon

I looked into some of the old issues but couldnt find the answer to this question: What's the exact use case for selecting one record without a limit?

SamMousa avatar Mar 30 '17 07:03 SamMousa

I think this is the potential problem - https://github.com/yiisoft/yii2/issues/4348#issuecomment-49405392

bizley avatar Mar 30 '17 07:03 bizley

So can we create a specific test case to see if that's still true anno 2017?

Basically if that happens it is a bug in MySQL. The whole point of a dbms is to tell it what I want using SQL not how I want it...

If I only want one record I should tell the database and it is its responsibility to figure out the most efficient way to do that.

SamMousa avatar Mar 30 '17 07:03 SamMousa

Pay attention on https://github.com/yiisoft/yii2/issues/4348#issuecomment-49419664 also. (And I'm not expert in MSSQL/Oracle at all.)

MysteryDragon avatar Mar 30 '17 07:03 MysteryDragon

@SamMousa a test would be helpful.

samdark avatar Mar 30 '17 14:03 samdark

Looking into that now. It could have been this bug: https://bugs.mysql.com/bug.php?id=78325

I'm unable to get mysql to pick the wrong index when using 2 tables with 50 million records, then joining them with a limit of 1 and different sorting (both on index and non-index columns).

I imagine that if your limit is high and you use a sort / group at some point mysql decides that it will be too big to store in memory and uses on disk storage. This is however not related to the limit clause but to the sorting.

@samdark It's impossible to prove a negative so I won't try ;-)

SamMousa avatar Mar 30 '17 14:03 SamMousa

MSSQL:

mssql_select

mssql_select_limit

samdark avatar Mar 30 '17 17:03 samdark

So it seems it doesn't matter for MSSQL... @sergeymakinen you're good with MSSQL. Any idea?

samdark avatar Mar 30 '17 18:03 samdark

Most of the time (99%) it’s not an issue for MSSQL anymore. Especially in the yii\db\mssql\QueryBuilder::newBuildOrderByAndLimit() case. I consider limit(1) safe to be used in one(). I can neither confirm nor deny the same for Oracle but in recent versions (10.2 or higher) it should not be an issue.

sergeymakinen avatar Mar 30 '17 19:03 sergeymakinen

For mysql, you can have some performance issues by combining limit and order by: https://github.com/yiisoft/yii2/issues/10869

I suggest to put ->one(true|false) to add the limit or make a global parameter.

ghost avatar Apr 03 '17 18:04 ghost

@juanmacias How is that related to limit? The linked issue seems to be related to index hinting..

SamMousa avatar Apr 03 '17 18:04 SamMousa

When using limit and order by, MySQL some times don't use the appropiate index, so the query will kill the server.

For example Product::find()->orderBy('name')->limit(1)->one();

MySql will do in some cases a full scan.

So, if you want to introduce limit 1, you have to allow index hinting.

In my case, I moved to PostGreSQL tired of this....

ghost avatar Apr 03 '17 18:04 ghost

Do you have a sample SQL file? I know this occurs but it's hard to recreate consistently... Therefore it's hard to check if it still happens in the latest versions

On Apr 3, 2017 8:44 PM, "Juan Macias" [email protected] wrote:

When using limit and order by, MySQL some times don't use the appropiate index, so the query will kill the server.

For example Product::find()->orderBy('name')->limit(1)->one();

MySql will do in some cases a full scan.

So, if you want to introduce limit 1, you have to allow index hinting.

In my case, I moved to PostGreSQL tired of this....

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/yiisoft/yii2/issues/13875#issuecomment-291235593, or mute the thread https://github.com/notifications/unsubscribe-auth/AAhYzSaHxyZ1ZWnpSpPrqWEMm9d8yUoWks5rsT37gaJpZM4MtAv_ .

SamMousa avatar Apr 03 '17 18:04 SamMousa

It's a known issue with a lot of complains, stackoverflow is full of samples.

https://www.percona.com/blog/2006/09/01/mysql-order-by-limit-performance-optimization/

But this not only occurs with limit and order by, also with left join + where, and other querys.

If you have to deal with large tables and complex querys, force index is a must.

ghost avatar Apr 03 '17 18:04 ghost

@juanmacias that article is from 2006. do you have more recent resources?

I did not find performance related things but there is this section from the mysql docs:

One manifestation of this behavior is that an ORDER BY query with and without LIMIT may return rows in different order, as described later in this section.

https://dev.mysql.com/doc/refman/5.7/en/limit-optimization.html

That means that automatically applying limit would result in behavior change in some cases.

cebe avatar Apr 04 '17 09:04 cebe

Still present in MySQL 5.6 version (I have a lot of issues), more issues in stackoverflow:

http://stackoverflow.com/questions/9641463/mysql-not-using-index-for-order-by http://stackoverflow.com/questions/39954181/mysql-order-by-takes-a-very-long-time-even-if-i-have-indexes

As I told before, MySQL execution plan is not good in some stupid queries, that's the reason they implemented Force Index.

On Tue, Apr 4, 2017 at 11:04 AM, Carsten Brandt [email protected] wrote:

@juanmacias https://github.com/juanmacias that article is from 2006. do you have more recent resources?

I did not find performance related things but there is this section from the mysql docs:

One manifestation of this behavior is that an ORDER BY query with and without LIMIT may return rows in different order, as described later in this section.

https://dev.mysql.com/doc/refman/5.7/en/limit-optimization.html

That means that automatically applying limit would result in behavior change in some cases.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/yiisoft/yii2/issues/13875#issuecomment-291438789, or mute the thread https://github.com/notifications/unsubscribe-auth/AEk1ZeOQsICs5BdYSQddrnLXAp77gVI0ks5rsge1gaJpZM4MtAv_ .

-- Juan Macias CEO QaShops.com 659203561

ghost avatar Apr 04 '17 09:04 ghost

Still open?

pvolyntsev avatar Apr 01 '18 10:04 pvolyntsev

I think so....

ghost avatar Apr 01 '18 11:04 ghost

It is. It looks like a simple thing at the first sight but in reality it isn't. See comments above...

samdark avatar Apr 01 '18 11:04 samdark

It's quite simple as I see

yiisoft\yii2\db\ActiveQuery.php

namespace yii\db;

class ActiveQuery
{
    /**
     * Executes query and returns a single row of result.
     * @param Connection|null $db the DB connection used to create the DB command.
     * If `null`, the DB connection returned by [[modelClass]] will be used.
     * @return ActiveRecord|array|null a single row of query result. Depending on the setting of [[asArray]],
     * the query result may be either an array or an ActiveRecord object. `null` will be returned
     * if the query results in nothing.
     */
    public function one($db = null)
    {
        $this->limit(1); // limit the query
        $row = parent::one($db);
        if ($row !== false) {
            $models = $this->populate([$row]);
            return reset($models) ?: null;
        }

        return null;
    }
}

pvolyntsev avatar Apr 03 '18 01:04 pvolyntsev

@pvolyntsev it's not about code implementation, it's about pitfalls.

MysteryDragon avatar Apr 03 '18 03:04 MysteryDragon

The question remains if this is really still an issue anno 2018; since this is not a yii bug but a supposed DBMS..

SamMousa avatar Apr 03 '18 06:04 SamMousa

I really like @rob006's proposal: https://github.com/yiisoft/yii2/issues/13875#issuecomment-290261535 Let's make two separate methods:

  1. ActiveQuery::first()
  2. ActiveQuery::one()

Does anyone mind? And if so, why?

rugabarbo avatar Apr 03 '18 08:04 rugabarbo

I don't think the difference is clear from the method name. Note that this is all about clarity; currently the workaround is:

$query->limit(1)->one()

Which is not that much longer than $query->first(). The issue arises from the fact that people expect one() to be memory efficient.

Adding a new function will not remove any unclarity. If we really want to add another function i'd go with something like ActiveQuery::oneWithLimit(), that way you get to see it when using code completion and looking for one() and it conveys the meaning more clearly as well.

That said, I still think we should investigate if the underlying issue is still valid; I have not been able to reproduce it. Also DBMSes have improved over the years and their query optimizers I'd assume are less likely to make these kinds of stupid mistakes...

SamMousa avatar Apr 03 '18 09:04 SamMousa

In Mysql:

Don't use this:

// select * from order where user_id=:user_id
$order = Order::findOne(['user_id'=>$user_id])

Must use this:

// select * from order where user_id=:user_id limit 1
$order = Order::find(['user_id'=>$user_id])->limit(1)->one()

c4ys avatar Apr 03 '18 10:04 c4ys

Note that in most cases you'll query using the primary key or another unique set when using findOne(), so then it doesn't really matter..

SamMousa avatar Apr 03 '18 11:04 SamMousa