Deduplication on queries without presence of primary key
Please answer these questions before submitting your issue. Thanks!
- What version of Python are you using (
python --version)?
3.7.7
- What operating system and processor architecture are you using (
python -c 'import platform; print(platform.platform())')?
Darwin-19.4.0-x86_64-i386-64bit
- What are the component versions in the environment (
pip list)?
adal 1.2.2
alembic 1.2.1
appdirs 1.4.3
asn1crypto 1.2.0
astroid 2.3.2
atomicwrites 1.3.0
attrs 19.3.0
azure 4.0.0
azure-applicationinsights 0.1.0
azure-batch 4.1.3
azure-common 1.1.23
azure-cosmosdb-nspkg 2.0.2
azure-cosmosdb-table 1.0.5
azure-datalake-store 0.0.47
azure-eventgrid 1.3.0
azure-graphrbac 0.40.0
azure-keyvault 1.1.0
azure-loganalytics 0.1.0
azure-mgmt 4.0.0
azure-mgmt-advisor 1.0.1
azure-mgmt-applicationinsights 0.1.1
azure-mgmt-authorization 0.50.0
azure-mgmt-batch 5.0.1
azure-mgmt-batchai 2.0.0
azure-mgmt-billing 0.2.0
azure-mgmt-cdn 3.1.0
azure-mgmt-cognitiveservices 3.0.0
azure-mgmt-commerce 1.0.1
azure-mgmt-compute 4.6.2
azure-mgmt-consumption 2.0.0
azure-mgmt-containerinstance 1.5.0
azure-mgmt-containerregistry 2.8.0
azure-mgmt-containerservice 4.4.0
azure-mgmt-cosmosdb 0.4.1
azure-mgmt-datafactory 0.6.0
azure-mgmt-datalake-analytics 0.6.0
azure-mgmt-datalake-nspkg 3.0.1
azure-mgmt-datalake-store 0.5.0
azure-mgmt-datamigration 1.0.0
azure-mgmt-devspaces 0.1.0
azure-mgmt-devtestlabs 2.2.0
azure-mgmt-dns 2.1.0
azure-mgmt-eventgrid 1.0.0
azure-mgmt-eventhub 2.6.0
azure-mgmt-hanaonazure 0.1.1
azure-mgmt-iotcentral 0.1.0
azure-mgmt-iothub 0.5.0
azure-mgmt-iothubprovisioningservices 0.2.0
azure-mgmt-keyvault 1.1.0
azure-mgmt-loganalytics 0.2.0
azure-mgmt-logic 3.0.0
azure-mgmt-machinelearningcompute 0.4.1
azure-mgmt-managementgroups 0.1.0
azure-mgmt-managementpartner 0.1.1
azure-mgmt-maps 0.1.0
azure-mgmt-marketplaceordering 0.1.0
azure-mgmt-media 1.0.0
azure-mgmt-monitor 0.5.2
azure-mgmt-msi 0.2.0
azure-mgmt-network 2.7.0
azure-mgmt-notificationhubs 2.1.0
azure-mgmt-nspkg 3.0.2
azure-mgmt-policyinsights 0.1.0
azure-mgmt-powerbiembedded 2.0.0
azure-mgmt-rdbms 1.9.0
azure-mgmt-recoveryservices 0.3.0
azure-mgmt-recoveryservicesbackup 0.3.0
azure-mgmt-redis 5.0.0
azure-mgmt-relay 0.1.0
azure-mgmt-reservations 0.2.1
azure-mgmt-resource 2.2.0
azure-mgmt-scheduler 2.0.0
azure-mgmt-search 2.1.0
azure-mgmt-servicebus 0.5.3
azure-mgmt-servicefabric 0.2.0
azure-mgmt-signalr 0.1.1
azure-mgmt-sql 0.9.1
azure-mgmt-storage 2.0.0
azure-mgmt-subscription 0.2.0
azure-mgmt-trafficmanager 0.50.0
azure-mgmt-web 0.35.0
azure-nspkg 3.0.2
azure-servicebus 0.21.1
azure-servicefabric 6.3.0.0
azure-servicemanagement-legacy 0.20.6
azure-storage-blob 1.5.0
azure-storage-common 1.4.2
azure-storage-file 1.4.0
azure-storage-queue 1.4.0
bandit 1.6.2
black 19.3b0
boto3 1.9.253
botocore 1.12.253
certifi 2019.9.11
cffi 1.13.1
chardet 3.0.4
Click 7.0
coverage 5.0a8
cryptography 2.8
docutils 0.15.2
future 0.18.1
gitdb2 2.0.6
GitPython 3.0.4
idna 2.8
ijson 2.5.1
importlib-metadata 0.23
isodate 0.6.0
isort 4.3.21
jmespath 0.9.4
lazy-object-proxy 1.4.2
Mako 1.1.0
MarkupSafe 1.1.1
mccabe 0.6.1
more-itertools 7.2.0
msrest 0.6.10
msrestazure 0.6.2
oauthlib 3.1.0
oscrypto 1.1.0
packaging 19.2
pbr 5.4.3
pip 20.1
pluggy 0.13.0
psycopg2 2.8.4
py 1.8.0
pycparser 2.19
pycryptodomex 3.9.0
PyJWT 1.7.1
pylint 2.4.3
pyOpenSSL 19.0.0
pyparsing 2.4.2
pytest 5.2.1
pytest-cov 2.8.1
python-dateutil 2.8.0
python-editor 1.0.4
pytz 2019.3
PyYAML 5.1.2
requests 2.22.0
requests-oauthlib 1.2.0
s3transfer 0.2.1
setuptools 46.1.3
six 1.12.0
smmap2 2.0.5
snowflake-connector-python 2.0.2
snowflake-sqlalchemy 1.1.16
SQLAlchemy 1.3.10
stevedore 1.31.0
toml 0.10.0
typed-ast 1.4.0
urllib3 1.25.6
wcwidth 0.1.7
wheel 0.34.2
wrapt 1.11.2
zipp 0.6.0
- What did you do?
See https://docs.sqlalchemy.org/en/13/faq/sessions.html#my-query-does-not-return-the-same-number-of-objects-as-query-count-tells-me-why
If I run a query to return full entities q = session.query(User).outerjoin(User.addresses).filter(User.name == 'jack')
then q.count() returns multiple objects while there is deduplication with q.all(). But there is no primary key set in this table so this is unexpected. The SQL query executed in Snowflake does no deduplication so it is at SQLAlchemy side, but since other engines do not have this, reporting to you.
- What did you expect to see?
The q.all() above returns all objects in the database.
- What did you see instead?
A single object is returned.
- Can you set logging to DEBUG and collect the logs?
I generated and reviewed these logs and nothing seems relevant, let me know if otherwise and I will go through the effort of anonymising them.