Add ability to disable Decimal usage for DynamoDB number type.

Open jonapich opened this issue 9 years ago • 116 comments

Hi,

Is there any strong reason why a DynamoDB Table resource converts the number type "N" to a Decimal() object?

Shouldn't it try looking up the right python type, such as int or float or long?

I am trying to unpack a record's data (a mapping) into a function call, specifically: next_execution = now() + datetime.timedelta(**dynamo_record['frequency']). However, datetime.timedelta will not accept Decimal objects as arguments, although it does accept long and float.

TypeError: unsupported type for timedelta days component: Decimal
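
For illustration, one way around the error is to convert the values before unpacking them. This is a minimal Python 3 sketch, not something boto3 does for you; the `dynamo_record` contents are made up to stand in for a real item, and converting to float gives up Decimal precision.

```python
import datetime
from decimal import Decimal

# Stand-in for an item returned by the Table resource (illustrative values only).
dynamo_record = {"frequency": {"days": Decimal("1"), "hours": Decimal("1.5")}}

# Convert every Decimal to float so timedelta accepts the keyword arguments.
frequency = {unit: float(value) for unit, value in dynamo_record["frequency"].items()}

delta = datetime.timedelta(**frequency)
next_execution = datetime.datetime.now() + delta
```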

jonapich avatar Nov 16 '15 15:11 jonapich

This was one of the lessons learned from boto. If I remember correctly, there were issues with round-tripping float values: the built-in float() type cannot hold the 38 digits of precision supported by DynamoDB's numeric types. This would result in not being able to delete items in the table:

>>> d = Decimal('1234567890123.12345678901234567890')
>>> d
Decimal('1234567890123.12345678901234567890')
>>> float(d)
1234567890123.1235
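
To make the delete problem concrete, here is a small sketch (Python 3) of the round trip. The assumption is that a key read as a float and sent back would no longer match the stored value, which is presumably why the delete fails.

```python
from decimal import Decimal

stored = Decimal("1234567890123.12345678901234567890")

# Simulate reading the value as a float and sending it back as a number.
round_tripped = Decimal(repr(float(stored)))

# The trailing digits are gone, so the two values no longer compare equal.
keys_match = round_tripped == stored
print(round_tripped, keys_match)
```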

More background info:

https://github.com/boto/boto/issues/873

PR: https://github.com/boto/boto/pull/1183

Perhaps we could support returning int if the number has no fractional part. We would need to investigate what the impact of that would be. Would that help in your scenario?

jamesls avatar Nov 16 '15 22:11 jamesls

After a trip to python's doc on floating point limitations, I can see where this came from. It's a shame that the standard library doesn't support Decimal() in place of a float.

To answer your question specifically, it will not help my scenario. The timedelta supports floats, it's perfectly valid to do a timedelta(hours=1.5). I think it would only add to the confusion if boto3 would change between int and Decimal on a per-record basis as its default behavior.

Perhaps a small option somewhere to tell boto we don't really care about the added precision, so it can return floats and ints across the board?

jonapich avatar Nov 16 '15 22:11 jonapich

FWIW, I use this little function to recurse into Python objects returned by the boto3 DynamoDB resource layer and convert any Decimal values to int or float. It is by no means foolproof and doesn't in any way solve the problem of lack of precision in Python's float type but it solves my problem which is mainly to turn the data into something that can be returned to API Gateway via a Python Lambda function.

import decimal

def replace_decimals(obj):
    if isinstance(obj, list):
        for i in xrange(len(obj)):
            obj[i] = replace_decimals(obj[i])
        return obj
    elif isinstance(obj, dict):
        for k in obj.iterkeys():
            obj[k] = replace_decimals(obj[k])
        return obj
    elif isinstance(obj, decimal.Decimal):
        if obj % 1 == 0:
            return int(obj)
        else:
            return float(obj)
    else:
        return obj
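
If the goal is only to produce JSON (for example, a response body for API Gateway), an alternative sketch is to leave the data untouched and pass a `default` hook to `json.dumps`. This is Python 3, the name `decimal_default` is made up for the example, and it has the same float precision caveat as the function above.

```python
import decimal
import json


def decimal_default(obj):
    """Hypothetical json.dumps hook: int for whole numbers, float otherwise."""
    if isinstance(obj, decimal.Decimal):
        return int(obj) if obj % 1 == 0 else float(obj)
    # Anything else that json cannot handle is still an error.
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


# Illustrative item shaped like something the resource layer might return.
item = {"count": decimal.Decimal("3"), "ratio": decimal.Decimal("0.25")}
body = json.dumps(item, default=decimal_default)
```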

garnaat avatar Nov 16 '15 23:11 garnaat

Thanks for the code @garnaat, that certainly works.

This issue is a big PITA.

astewart-twist avatar Nov 18 '15 00:11 astewart-twist

Would adding some sort of use_decimal=False option to the config object when creating clients/resources be helpful?

jamesls avatar Nov 18 '15 01:11 jamesls

Of course that would be useful :)

jonapich avatar Nov 18 '15 17:11 jonapich

Ok, let's mark this as a feature request. I'll update the title.

jamesls avatar Nov 18 '15 23:11 jamesls

:+1: for this feature

niallrobinson avatar Feb 10 '16 15:02 niallrobinson

This particular problem likes to creep into my code in the most unusual places. Converting from Decimal to int/float is a thing, but it seems boto won't take my python floats anymore (did it take them before? i'm not sure) so I created the following function to prepare all of my data before sending it to dynamodb (kinda like @garnaat's method, but the other way around):

from collections import Mapping, Sequence, Set  # Python 2.7
from decimal import Decimal


def _sanitize(data):
    """ Sanitizes an object so it can be updated to dynamodb (recursive) """
    if not data and isinstance(data, (basestring, Set)):
        new_data = None  # empty strings/sets are forbidden by dynamodb
    elif isinstance(data, (basestring, bool)):
        new_data = data  # important to handle these before sequence and int!
    elif isinstance(data, Mapping):
        new_data = {key: _sanitize(data[key]) for key in data}
    elif isinstance(data, Sequence):
        new_data = [_sanitize(item) for item in data]
    elif isinstance(data, Set):
        new_data = {_sanitize(item) for item in data}
    elif isinstance(data, (float, int, long)):
        # complex is left out: Decimal() cannot convert complex numbers
        new_data = Decimal(data)
    else:
        new_data = data
    return new_data

jonapich avatar Mar 11 '16 18:03 jonapich

+1 for this

mr337 avatar Apr 08 '16 22:04 mr337

+1 for this feature request, I've now run into a similar problem to the one @jonapich experienced.

jonathanwcrane avatar May 24 '16 14:05 jonathanwcrane

#665

jonapich avatar May 31 '16 21:05 jonapich

+1 for this feature request

sridharrajagopal avatar Jun 01 '16 03:06 sridharrajagopal

+1

ustroetz avatar Jun 15 '16 09:06 ustroetz

+1

josepvalls avatar Aug 02 '16 05:08 josepvalls

any update on this ?

ghost avatar Oct 03 '16 22:10 ghost

+1

jong-eatsa avatar Nov 11 '16 17:11 jong-eatsa

+1

yit-b avatar Nov 23 '16 16:11 yit-b

+1

johnjjung avatar Dec 14 '16 19:12 johnjjung

+1

blieber avatar Jan 23 '17 08:01 blieber

+1

itssiva avatar Feb 08 '17 19:02 itssiva

+1

pmranade avatar Feb 10 '17 19:02 pmranade

+1

mwada avatar Feb 13 '17 12:02 mwada

+1

cdmbr avatar Feb 13 '17 12:02 cdmbr

Seriously, this is not okay. Using python3.6 I can store math.floor(time.time()) in DynamoDB. Using python2.7 I cannot. A database is expected to be able to receive numbers. That's pretty basic.
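
A likely explanation, offered as an assumption rather than something confirmed in this thread: math.floor() returns an int on Python 3 but a float on Python 2, and the resource layer rejects floats. A sketch that behaves the same on both versions is to truncate with int() instead:

```python
import math
import time
from decimal import Decimal

# On Python 3 math.floor() returns int; on Python 2 it returns float.
floored = math.floor(time.time())

# int() yields an integer on both versions, which can be stored as a number.
timestamp = int(time.time())

# Wrapping in Decimal is also an option if Decimal values are used elsewhere.
timestamp_decimal = Decimal(timestamp)
```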

RichardBronosky avatar Mar 01 '17 06:03 RichardBronosky

+1

Manelmc avatar May 04 '17 08:05 Manelmc

Just used the code by @garnaat and updated it to Python 3.6:

import decimal

def replace_decimals(obj):
    if isinstance(obj, list):
        for i in range(len(obj)):
            obj[i] = replace_decimals(obj[i])
        return obj
    elif isinstance(obj, dict):
        for k, v in obj.items():
            obj[k] = replace_decimals(v)
        return obj
    elif isinstance(obj, decimal.Decimal):
        if obj % 1 == 0:
            return int(obj)
        else:
            return float(obj)
    else:
        return obj

flomotlik avatar May 17 '17 16:05 flomotlik

+1

sillmnvg avatar Aug 21 '17 13:08 sillmnvg

+1

lkuffo avatar Sep 12 '17 23:09 lkuffo

The code below is for saving to DynamoDB; use @flomotlik's code in https://github.com/boto/boto3/issues/369#issuecomment-302137290 to load floats from DynamoDB.

To allow rounding and inexact values and still prevent over/underflow and clamping, I'd recommend using a decimal.Context such as the one in boto3/dynamodb/types.py but dropping the decimal.Inexact and decimal.Rounded traps. I'd also use numbers.Number for the type check in the sanitizer instead of just checking for Decimal or float. The following serializer should be a bit more robust:

from collections.abc import Iterable, Mapping, Set
import numbers
import decimal

# Inexact and Rounded are not trapped, so long floats are rounded to 38 digits
context = decimal.Context(
    Emin=-128, Emax=126, rounding=None, prec=38,
    traps=[decimal.Clamped, decimal.Overflow, decimal.Underflow]
)


def dump_to_dynamodb(item):
    # don't catch str/bytes with Iterable check below;
    # don't catch bool with numbers.Number
    # (bytes/bytearray stand in for the deprecated collections.abc.ByteString)
    if isinstance(item, (str, bytes, bytearray, bool)):
        return item

    # ignore inexact, rounding errors
    if isinstance(item, numbers.Number):
        return context.create_decimal(item)

    # mappings are also Iterable
    elif isinstance(item, Mapping):
        return {
            key: dump_to_dynamodb(value)
            for key, value in item.items()
        }

    # boto3's dynamodb.TypeSerializer checks isinstance(o, Set)
    # so we can't handle this as a list
    elif isinstance(item, Set):
        return set(map(dump_to_dynamodb, item))

    # may not be a literal instance of list
    elif isinstance(item, Iterable):
        return list(map(dump_to_dynamodb, item))

    # datetime, custom object, None
    return item
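
To show what dropping the traps changes, here is a small comparison sketch. The `strict` context is my assumption of what a context that also traps Inexact and Rounded looks like (I have not copied it from boto3); `relaxed` matches the traps used above.

```python
import decimal

# Assumed strict setup: also traps Inexact and Rounded (not copied from boto3).
strict = decimal.Context(
    Emin=-128, Emax=126, prec=38,
    traps=[decimal.Clamped, decimal.Overflow, decimal.Inexact,
           decimal.Rounded, decimal.Underflow],
)
# Relaxed setup: rounding and inexact results are allowed.
relaxed = decimal.Context(
    Emin=-128, Emax=126, prec=38,
    traps=[decimal.Clamped, decimal.Overflow, decimal.Underflow],
)

# The exact binary value of 0.1 has more than 38 digits, so the strict
# context refuses it while the relaxed one rounds it.
try:
    strict.create_decimal(0.1)
    strict_accepts_float = True
except decimal.DecimalException:
    strict_accepts_float = False

rounded = relaxed.create_decimal(0.1)
print(strict_accepts_float, rounded)
```

The rounded value is no longer the exact binary expansion of the float, but it converts back to the same float, which is usually all that matters if you were storing floats in the first place.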

<shameless plug> I wrote bloop to be a simpler interface to DynamoDB. It's overkill if float handling is the only thing you want to solve. It's good if you want more ergonomic systems for consuming streams, writing conditions and managing optimistic concurrency, sharing tables and simpler query projections. There is a pattern for a Float type which links to this issue. </shameless plug>

numberoverzero avatar Sep 18 '17 06:09 numberoverzero