mathesar
make row order deterministic in more cases
Fixes #1786
Now, we automatically append a last-step ordering by the primary key field(s) of a table when getting its records.
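As a rough illustration of the idea (not the code in this PR), appending a primary-key tie-break to a requested ordering might look like the sketch below. The function name `with_pk_tiebreak` and the `{"field": ..., "direction": ...}` spec shape are assumptions made for the example.

```python
def with_pk_tiebreak(order_by, pk_columns):
    """Return order_by plus any primary key columns it does not already sort on.

    Hypothetical helper: the name and the spec shape are illustrative only.
    """
    already_sorted = {spec["field"] for spec in order_by}
    tiebreak = [
        {"field": col, "direction": "asc"}
        for col in pk_columns
        if col not in already_sorted
    ]
    # The primary key goes last, so it only decides between rows that
    # compare equal on everything the user asked for.
    return list(order_by) + tiebreak


requested = [{"field": "title", "direction": "desc"}]
print(with_pk_tiebreak(requested, ["id"]))
```

Because primary key values are unique, the combined ordering leaves no ties, and the user's requested ordering still takes precedence.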
Technical details
This does not handle the case where both are true:
- The user does not specify a fully-determined order, and
- there is no primary key.
It would be possible to eke out a bit more ordering by appending all non-included columns to the requested `order_by` clause, but I chose not to: sorting by non-indexed columns would have a performance impact, and any table created through Mathesar will have a primary key anyway.
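For reference, the alternative described above (which this PR does not implement) could be sketched roughly as follows. The helper name `with_all_columns_tiebreak` and the spec shape are hypothetical.

```python
def with_all_columns_tiebreak(order_by, all_columns):
    """Append every column not already in order_by as an ascending tie-break.

    Hypothetical sketch of the rejected approach, for tables with no primary key.
    """
    requested = {spec["field"] for spec in order_by}
    remaining = [col for col in all_columns if col not in requested]
    # Each appended column is one more sort key, and these columns are
    # typically not indexed, which is where the performance cost comes from.
    return list(order_by) + [
        {"field": col, "direction": "asc"} for col in remaining
    ]


print(with_all_columns_tiebreak(
    [{"field": "title", "direction": "asc"}],
    ["title", "author", "year"],
))
```

Even with every column in the sort, fully duplicated rows remain tied, but since such rows are indistinguishable, their relative order has no visible effect.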
Checklist
- [X] My pull request has a descriptive title (not a vague title like `Update index.md`).
- [X] My pull request targets the `master` branch of the repository
- [X] My commit messages follow best practices.
- [X] My code follows the established code style of the repository.
- [ ] I added tests for the changes I made (if applicable).
- [X] I added or updated documentation (if applicable).
- [X] I tried running the project locally and verified that there are no visible errors.
Developer Certificate of Origin
Version 1.1
Copyright (C) 2004, 2006 The Linux Foundation and its contributors.
1 Letterman Drive
Suite D4700
San Francisco, CA, 94129
Everyone is permitted to copy and distribute verbatim copies of this
license document, but changing it is not allowed.
Developer's Certificate of Origin 1.1
By making a contribution to this project, I certify that:
(a) The contribution was created in whole or in part by me and I
have the right to submit it under the open source license
indicated in the file; or
(b) The contribution is based upon previous work that, to the best
of my knowledge, is covered under an appropriate open source
license and I have the right under that license to submit that
work with modifications, whether created in whole or in part
by me, under the same open source license (unless I am
permitted to submit under a different license), as indicated
in the file; or
(c) The contribution was provided directly to me by some other
person who certified (a), (b) or (c) and I have not modified
it.
(d) I understand and agree that this project and the contribution
are public and that a record of the contribution (including all
personal information I submit with it, including my sign-off) is
maintained indefinitely and may be redistributed consistent with
this project or the open source license(s) involved.
@seancolsen The only review I'm requesting from you is that this solves the problem you described from the user perspective.
Codecov Report
Base: 92.44% // Head: 92.48% // Increases project coverage by +0.03% :tada:
Coverage data is based on head (`8148e52`) compared to base (`72cfaff`). Patch coverage: 94.64% of modified lines in pull request are covered.
Additional details and impacted files
@@ Coverage Diff @@
## master #1810 +/- ##
==========================================
+ Coverage 92.44% 92.48% +0.03%
==========================================
Files 146 147 +1
Lines 7122 7155 +33
==========================================
+ Hits 6584 6617 +33
Misses 538 538
Flag | Coverage Δ | |
---|---|---|
pytest-backend | 92.48% <94.64%> (+0.03%) | :arrow_up: |

Flags with carried forward coverage won't be shown.
Impacted Files | Coverage Δ | |
---|---|---|
mathesar/api/db/viewsets/records.py | 97.05% <ø> (-0.03%) | :arrow_down: |
mathesar/models/base.py | 92.87% <ø> (ø) | |
db/transforms/base.py | 93.70% <66.66%> (-0.84%) | :arrow_down: |
db/records/operations/sort.py | 95.65% <95.65%> (ø) | |
db/queries/base.py | 98.71% <100.00%> (ø) | |
db/records/exceptions.py | 100.00% <100.00%> (ø) | |
db/records/operations/select.py | 97.56% <100.00%> (+4.70%) | :arrow_up: |
Fascinating. I clicked on a couple tables to make sure the order looked good, but didn't notice anything. Now, with more info, I've found that I clicked the wrong tables to check. I've determined that:
- Authors, Publications, and Publishers are all consistently out of order on my machine.
- Checkouts, Items, and Patrons are all consistently properly ordered on my machine.
@seancolsen Is this the case on your machine as well?
Moving to draft status while I figure out why the fix isn't working properly.
@mathemancer
> - Authors, Publications, and Publishers are all consistently out of order on my machine.
> - Checkouts, Items, and Patrons are all consistently properly ordered on my machine.
>
> Is this the case on your machine as well?
No. Currently all of that is true for me except that Items is out of order. I have wiped out my `.volumes` and rebuilt Mathesar a number of times since we began using this Library Management schema and I'm about 80% confident that I've observed different sorting behavior after rebuilding. Although I only have a rudimentary understanding of the inner workings of Postgres, I would expect to observe these ordering inconsistencies, given this excerpt from the Postgres docs (emphasis added):
> The actual order in that case will depend on the scan and join plan types and the order on disk, but it must not be relied on.
My hunch is that when we load the library data, it gets placed on disk with an ordering subject to the fragmentation of other data on disk at that point, though I understand even less about how SSDs work nowadays. Just a hunch. I'm about 60% confident that ordering is consistent after I re-build, but re-building seems to shuffle the ordering somewhat.
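To make the point from the docs excerpt concrete, here is a small self-contained sketch. It uses Python's built-in `sqlite3` as a stand-in for Postgres, and the `authors` table and its rows are invented for illustration. The rows are inserted out of key order, and the query breaks ties on `last_name` with the primary key, so only one result order is valid regardless of how the rows are stored.

```python
import sqlite3

# In-memory SQLite database used purely as a stand-in for Postgres.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE authors (id INTEGER PRIMARY KEY, last_name TEXT)")
conn.executemany(
    "INSERT INTO authors (id, last_name) VALUES (?, ?)",
    [(3, "Smith"), (1, "Smith"), (2, "Jones")],
)

# Sorting by last_name alone would leave the two "Smith" rows in an
# unspecified relative order. Adding the primary key as the last sort
# key makes the ordering total.
rows = conn.execute(
    "SELECT id, last_name FROM authors ORDER BY last_name, id"
).fetchall()
print(rows)  # [(2, 'Jones'), (1, 'Smith'), (3, 'Smith')]
```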
> I'm about 60% confident that ordering is consistent after I re-build, but re-building seems to shuffle the ordering somewhat.
This is my experience as well. I finally figured out the problem. Unbeknownst to me, we'd added another way to get records from a table so we can join previews in. I fixed the ordering (and the problem you'd noticed was already fixed) on a method for getting table records that's not used in most cases any more. The downside is, I'm now trying to solve a much more complicated problem of providing a default ordering on query results (where a primary key is often not present). The plus side is, when I succeed, query results will also be consistently ordered in the data explorer.
@seancolsen @dmos62 This needs another review, since I had to make some major changes.
As noted above, we're using the querying infrastructure for many (but not all) table row requests now, and I'd neglected that part of things. Unfortunately, adding default sorting to general queries makes this PR quite a bit more complicated.
This time, I checked all tables to make sure they were properly ordered.