databases icon indicating copy to clipboard operation
databases copied to clipboard

scrambled data in returned values

Open sloev opened this issue 5 years ago • 1 comments

Scrambled data in simultaneous SQL requests

The following has been anonymized

in our sanic app we have the following three pieces of db logic. All of them are wrapped in a thin api request handler that provides them with correctly typed args.

get_impressions_and_views

async def get_impressions_and_views(business_unit_id:str, from_date:datetime.date, to_date:datetime.datetime):
   query = """
        SELECT
            COALESCE(SUM(impressions), 0) AS impressions,
            COALESCE(SUM(views), 0) AS views
        FROM
            some_impressions
        WHERE
            business_unit_id = :business_unit_id
        AND
            day >= :from_date AND day < :to_date
    """

    result = await database.fetch_one(
        query=query,
        values=dict(
            business_unit_id=business_unit_id, from_date=from_date, to_date=to_date
        ),
    ) or dict(impressions=0, views=0)

    return dict(result)

invitations overview

async def invitation_overview(business_unit_id:str, from_date:datetime.date):
    query = """
        SELECT (
                SELECT score
                FROM historical
                WHERE calculationdate < :from_date
                AND businessunitid = :business_unit_id
                ORDER BY calculationdate DESC
                LIMIT 1
            ) AS latest_score_before_period,
            (
                SELECT score
                FROM score_historical
                WHERE businessunitid = :business_unit_id
                ORDER BY calculationdate DESC
                LIMIT 1
            ) AS current_score
    """
       
    result = await database.fetch_one(
        query=query,
        values=dict(
            business_unit_id=business_unit_id,
            from_date=two_months_ago,
        ),
    )


    return {
        "last_score": result["latest_score_before_period"] or 0,
        "current_score": result["current_score"] or 0,
    }

field usage

async def get_field_usage(business_unit_id: str):
    query_template = """
        SELECT DISTINCT template_id as template
        FROM  some_table
        WHERE business_unit_id = :business_unit_id
        and template_id != ""
        and template_id IS NOT NULL
    """

    query_language = """
        SELECT DISTINCT language
        FROM  some_table
        WHERE business_unit_id = :business_unit_id
        and language != ""
        and language IS NOT NULL
    """

    query_source = """
        SELECT DISTINCT source
        FROM  some_table
        WHERE business_unit_id = :business_unit_id
        and source != ""
        and source IS NOT NULL
    """

    templates = (
        await database.fetch_all(
            query=query_template, values=dict(business_unit_id=business_unit_id)
        )
        or []
    )

    sources = (
        await database.fetch_all(
            query=query_source, values=dict(business_unit_id=business_unit_id)
        )
        or []
    )

    languages = (
        await database.fetch_all(
            query=query_language, values=dict(business_unit_id=business_unit_id)
        )
        or []
    )

    return dict(
        templates=[dict(d) for d in templates],
        sources=[dict(d) for d in sources],
        languages=[dict(d) for d in languages],
    )

our db is setup like this:

import logging

from databases import Database
from src import settings

database = Database(settings.DB_CONNECTION_STRING)


async def setup_session(app, loop):
    await database.connect()


async def teardown_session(app, loop):
    await database.disconnect()

The issue

today we saw the following returned by our api call to "field-usage" endpoint:

{
    "languages": {"latest_score_before_period": 45, "current_score": 47},
    "sources: [{impressions: 0, views: 0}]",
    "templates": []
}

That is:

  1. the outer structure is from the field_usage call being a json with keys templates, languages, sources
  2. inside the languages key is a dict resembling the returned SQL result of the overview SQL request
  3. inside the sources key is a dict resembling the returned SQL of the get_impressions_and_views SQL call

if you look at the field-usage endpoint you can see that passes directly the output of the SQL fetch_one into the individual keys, no checks.

It looks like the sql results got scrambled and somehow the wrong result ended up in the wrong coroutine future or something

sloev avatar Apr 17 '20 12:04 sloev

i am trying to write a script that demonstrates this is isolation, but currently i've had no luck yet. thats why i posted id here in the mean time

sloev avatar Apr 17 '20 12:04 sloev