yii2 icon indicating copy to clipboard operation
yii2 copied to clipboard

Cannot get DISTINCT query on one column

Open kkursor opened this issue 4 years ago • 13 comments

What steps will reproduce the problem?

I have a table which looks like this:

CREATE TABLE materials(
id serial,
short_name text,
full_name text
);

Actually it has much more fields, but it doesn't matter. I need to get results DISTINCT by short_name - PostgreSQL supports this. In SQL the query looks like:

SELECT DISTINCT ON (short_name) short_name, id, full_name FROM materials;

DISTINCT clause should go first.

When I try to get rows like

Materials::find()->select(['short_name'])->distinct()->addSelect(['*'])

it gets DISTINCT on all columns (which obviously doesn't make sense).

I tried to modify Materials::select field manually to make it select what I need like this:

         $new_select = [];
            $new_select['short_name'] = "DISTINCT ON (short_name)";

            foreach ($query->select as $key=>$value) {
                if ('material.short_name' === $key) continue;

                $new_select[$key] = $value;
            }

            $query->select = $new_select;

            $query->select(['distinct on (short_name) short_name_uq', '*'])
                ->andWhere(['<>', 'material_condition', 'perspective'])
                ->andWhere(['is_archive' => false]);

but it generates alias for short_name like SELECT DISTINCT ON (short_name) AS short_name (but AS is syntax error in PostgreSQL).

I even tried to replace raw SQL query in ActiveQuery like this:

            $rawSql = str_ireplace("DISTINCT ON (short_name) AS", "DISTINCT ON (short_name)", $query->createCommand()->rawSql);
            $query = Materials::findBySql($rawSql);

but it somewhy runs out of memory (though the query itself is ok).

What is the expected result?

I want to get data from table selected with DISTINCT clause applied to one column.

What do you get instead?

I cannot :(

Additional info

Q A
Yii version 2.0.38
PHP version 7.0.27
Operating system AstraLinux 1.6

kkursor avatar Nov 13 '20 09:11 kkursor

In your example you are building select array and then calling select(['distinct on (short_name) short_name_uq', '*']) - isn't the last part all what you need? Could you try with only that?

bizley avatar Nov 13 '20 09:11 bizley

It should be what I need if it didn't generate alias for this condition. It generates SELECT DISTINCT ON(short_name) AS short_name, * FROM materials. AS here is syntax error in PostgreSQL. It should be SELECT DISTINCT ON(short_name) short_name, * FROM materials

kkursor avatar Nov 13 '20 09:11 kkursor

I was asking because I see short_name_uq as alias in your example and yet you state it's totally ignored on output.

bizley avatar Nov 13 '20 09:11 bizley

That's full output of this example.

Exception (Database Exception) 'yii\db\Exception' with message 'SQLSTATE[42601]: Syntax error: 7 ERROR:  syntax error at or near "AS"
LINE 1: SELECT distinct on (short_name) AS "short_name_uq", * FROM "...
                                        ^
The SQL being executed was: SELECT distinct on (short_name) AS "short_name_uq", * FROM "materials_schema"."material_list" "material" WHERE ("is_archive"=FALSE) AND (NOT ("material_condition"='perspective')) ORDER BY "id", "short_name", "full_name" LIMIT 20' 

AS before short_name is the issue. I can't find out how to strip it

kkursor avatar Nov 13 '20 09:11 kkursor

Does it work when you do it like that?

->select(['short_name_uq' => 'distinct on (short_name) short_name', '*'])

bizley avatar Nov 13 '20 10:11 bizley

It seems to work better. Though it doesn't distinct (row count is the same as without distinct condition), there is no syntax error. BTW select(['distinct on (short_name) *']) works too. Actually I don't need short_name_uq

kkursor avatar Nov 13 '20 10:11 kkursor

Soo, can we close it or...?

bizley avatar Nov 13 '20 10:11 bizley

maybe after I'll find a complete working solution? I'll post it here for somebody googling around the same trouble

kkursor avatar Nov 13 '20 10:11 kkursor

Please state your problem then

bizley avatar Nov 13 '20 10:11 bizley

Hm.

I have code:

            $query->select(['distinct on (short_name) *'])
                ->andWhere(['<>', 'material_condition', 'perspective'])
                ->andWhere(['is_archive' => false]);

Then

$query->createCommand()->rawSql

reads

SELECT distinct on (short_name) * FROM "materials_schema"."material_list" "material" WHERE ("material_condition" <> 'perspective') AND ("is_archive"=FALSE)

When I execute this query on DB manually, I get 1110 rows as I expect and they are actually distinct. But when I get $query->count(), I get 2553 rows - as if there were no distinct at all.

Strange behaviour.

kkursor avatar Nov 13 '20 13:11 kkursor

I worked it around with a view in database. No other way worked

kkursor avatar Nov 15 '20 20:11 kkursor

How about something like this?

$totalUniquePatients = $user->findClaimQuery($params)
    ->select([Claim::tableName() . '.member_id'])
    ->distinct()
    ->count();

kjusupov avatar May 10 '21 08:05 kjusupov

@kjusupov I think this will return only member_id field, not fully populated model instances.

kkursor avatar May 13 '21 10:05 kkursor