orphaned treatments?
This search shows that there treatment_ids 1-10 exist in management_treatments, but 2-10 do not have referents in traits or yields.
select distinct treatment_id from managements_treatments order by treatment_id limit 10
select distinct treatment_id from yields order by treatment_id limit 10
select distinct treatment_id from managements_treatments order by treatment_id limit 10
select distinct treatment_id from traits_and_yields_view_private order by treatment_id limit 10
But if I try to identify treatment_ids that exist in managments_treatments but not in other tables,
select treatment_id from managements_treatments where treatment_id not in (select distinct treatment_id from traits_and_yields_view_private)
select treatment_id from managements_treatments where treatment_id not in (select distinct treatment_id from traits_and_yields_view_private)
select treatment_id from managements_treatments where treatment_id not in (select distinct id from treatments)
I tried disabling constraints but that did not help.
alter table managements disable trigger all
alter table treatments disable trigger all
alter table managements_treatments disable trigger all
alter table yields disable trigger all
alter table traits disable trigger all
I'm not sure exactly why the subselects aren't working the way one would expect (though I'm sure constraints have nothing to do with it). For now, you can use a join instead, like this:
select distinct mt.treatment_id from managements_treatments mt
left join traits_and_yields_view_private tyvp on mt.treatment_id = tyvp.treatment_id
where tyvp.id is null order by mt.treatment_id;
It's probably a bit more intuitive to do things this way:
SELECT DISTINCT treatment_id FROM managements_treatments AS mt
WHERE NOT EXISTS(
SELECT 1 FROM traits_and_yields_view_private AS tyvp
WHERE tyvp.treatment_id = mt.treatment_id
) ORDER BY treatment_id;
(The "AS" keyword here is optional--you can leave them out and get the same effect. And in this case at least, you don't really need to alias the table names at all except to save some typing.)
This is even simpler (and perhaps faster):
SELECT DISTINCT treatment_id FROM managements_treatments
EXCEPT
SELECT treatment_id FROM traits_and_yields_view_private
ORDER BY treatment_id;
Mystery Solved!
It appears that your original query didn't work because of SQL's perverse 3-valued logic.
SELECT treatment_id FROM managements_treatments
WHERE treatment_id NOT IN
(SELECT DISTINCT treatment_id FROM traits_and_yields_view_private);
would have worked as expected if the traits_and_yields_view_private table didn't have any NULLs in the treatment_id column. But it does, so the WHERE clause evaluates either to true [correction: false; see following note] (if the given treatment_id is found) or NULL (if it isn't). If you think of NULL as standing for an unknown value, then the WHERE clause never returns false [correction: never returns true; see following note] when the table has a NULL because we can't be sure that that NULL isn't a stand-in for the value we're looking for. So unless and until we constrain the treatment_id column of the traits_and_yields_view_private table to be non-NULL, we have to use this query instead:
SELECT treatment_id FROM managements_treatments
WHERE treatment_id NOT IN
(SELECT DISTINCT treatment_id FROM traits_and_yields_view_private
WHERE treatment_id IS NOT NULL);
(By the way, you probably want the DISTINCT next to the first occurrance of treatment_id, not in the subselect).
Sorry--I got it backwards: x IN list is never false if list contains a NULL, so x NOT IN list is never true if list contains a NULL. This means the overall SELECT statement will return zero rows, as we have seen, since the WHERE clause never evaluates to true.
To summarize:
x IN list IS TRUE if x occurs in list, regardless of whether or not NULL occurs in list
IS NULL if x does not occur in list but NULL does
IS FALSE if neither x nor NULL occur in list
x NOT IN list IS FALSE if x occurs in list, regardless of whether or not NULL occurs in list
IS NULL if x does not occur in list but NULL does
IS TRUE if neither x nor NULL occur in list
I'm not sure why you closed this. This query shows 575 treatments are not referred to by any trait or yield:
SELECT DISTINCT treatment_id FROM managements_treatments
WHERE treatment_id NOT IN
(SELECT treatment_id FROM traits_and_yields_view_private
WHERE treatment_id IS NOT NULL) ORDER BY treatment_id;
If you don't care about this or if this issue is duplicated in some other issue, then I guess you can close it again.
These records have been orphaned and can be deleted; please delete them.
I think this would be
delete from treatments where id in (SELECT DISTINCT treatment_id FROM managements_treatments
WHERE treatment_id NOT IN
(SELECT treatment_id FROM traits_and_yields_view_private
WHERE treatment_id IS NOT NULL));
Then delete the orphaned managements
delete from managements_treatments where treatment_id NOT IN
(SELECT treatment_id FROM traits_and_yields_view_private
WHERE treatment_id IS NOT NULL);
delete from managements where id NOT IN
(SELECT management_id FROM managements_treatments);