PynamoDB Multiple Models on top of One DynamoDB Table

Hi All,

The DynamoDB best practices suggest that "most well designed applications require only one table". Has anyone tried to fit all relational entities into one DynamoDB table and access them through PynamoDB?

Please share your experience if you have tried to follow this best practice.

Thanks, Akash

Sep 01 '19 09:09 akashg90

I've done this using PynamoDB, but you have to be very conscious of how you manage your model's Meta classes, hash keys etc (if you implement inheritance).

The main thing to be careful of is how you manage your composite keys (assuming you're using composite keys for either your hash or range keys). I've found that using a custom Attribute for hash/range keys helps a lot and this strategy can enable some really powerful schemas on top of a single DynamoDB table. Especially now that Transactions are a thing.

Note too that you will likely have to override the query/get class methods to ensure your composite keys are serialised (for example, by prefixing a static value to the dynamic portion of the composite key) before calling the underlying query/get class methods.
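A minimal, library-agnostic sketch of that prefixing idea follows. It does not use PynamoDB at all: `PrefixedKey` and `UserStore` are hypothetical names, a plain dict stands in for the table, and the `#` delimiter is an assumption. The point is only that callers pass the dynamic part of the key and the wrapper serialises it before the lookup.

```python
from typing import Dict, Optional, Tuple


class PrefixedKey:
    """Adds a static prefix to the dynamic part of a key and strips it on read."""

    def __init__(self, prefix: str, delimiter: str = "#") -> None:
        self.prefix = prefix
        self.delimiter = delimiter

    def serialize(self, value: str) -> str:
        return f"{self.prefix}{self.delimiter}{value}"

    def deserialize(self, raw: str) -> str:
        head, sep, tail = raw.partition(self.delimiter)
        if not sep or head != self.prefix:
            raise ValueError(f"{raw!r} does not start with {self.prefix!r}")
        return tail


class UserStore:
    """Hypothetical stand-in for a model class; a dict plays the table."""

    hash_key = PrefixedKey("user")

    def __init__(self) -> None:
        self._table: Dict[Tuple[str, str], dict] = {}

    def put(self, user_id: str, range_key: str, item: dict) -> None:
        self._table[(self.hash_key.serialize(user_id), range_key)] = item

    def get(self, user_id: str, range_key: str) -> Optional[dict]:
        # Callers pass only the dynamic portion; the prefix is added here,
        # before the lookup reaches the underlying store.
        return self._table.get((self.hash_key.serialize(user_id), range_key))
```

In a real model the same serialisation step would sit in front of the library's own query/get calls rather than a dict lookup.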

I must admit that PynamoDB doesn't ~make this very easy~ encourage this strategy, but it is possible as long as you're careful.

Sep 20 '19 07:09 grvhi

You might be able to use a UnicodeDelimitedTupleAttribute to somehow formalize that "composite key" approach: https://github.com/lyft/pynamodb-attributes/blob/master/pynamodb_attributes/unicode_delimited_tuple.py
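To make the "delimited tuple" idea concrete, here is a rough standard-library sketch of what such an attribute does conceptually. It is not the pynamodb-attributes implementation: `join_key`, `split_key`, `CommentKey` and the `#` delimiter are all assumptions made for illustration.

```python
from typing import NamedTuple, get_type_hints


def join_key(parts: tuple, delimiter: str = "#") -> str:
    """Serialise a tuple of key parts into a single delimited string."""
    strings = [str(part) for part in parts]
    for text in strings:
        if delimiter in text:
            raise ValueError(f"key part {text!r} contains the delimiter")
    return delimiter.join(strings)


def split_key(raw: str, tuple_type, delimiter: str = "#"):
    """Parse a delimited string back into the given NamedTuple type."""
    fields = raw.split(delimiter)
    hints = get_type_hints(tuple_type)  # field name -> declared type, in order
    if len(fields) != len(hints):
        raise ValueError(f"expected {len(hints)} parts, got {len(fields)}")
    # Convert each text field back to its declared type (e.g. int).
    return tuple_type(*(typ(text) for typ, text in zip(hints.values(), fields)))


class CommentKey(NamedTuple):
    """Hypothetical composite range key: an item kind plus a timestamp."""
    kind: str
    created_dt: int
```

With these helpers, `join_key(CommentKey("comments", 1569038771))` gives a single string key, and `split_key` turns it back into a typed tuple in application code.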

It's a fair thing to say that if your datastore is schema-less, though you might have a schema in the application layer, you don't need to involve your datastore in that by naming your tables after your schemas (or even having a table per schema). I'll go ahead and say that's not how we've been using DynamoDB here at Lyft, but I'd be curious to see examples of this design approach applied to some canonical app example, e.g. the design of a blogging system with users, posts, analytics, etc.

Sep 21 '19 02:09 ikonst

Thanks @ikonst - the UnicodeDelimitedTupleAttribute is definitely a good way to manage composite keys; it's very similar to the approach I've taken on my projects.

On the back of the custom in-application work I've been doing, I've started working (very casually) on a library which uses PynamoDB's models and some Attribute classes, but encourages the concept of modelling your data as DynamoDB partitions rather than tables (i.e. one table, many models).

The potential danger of this approach is hot partitions on DynamoDB. But as long as your data either a) will fit inside one DynamoDB partition, or b) can use a significant spread of hash (partition) key values, then it should be easy to overcome.

Your blogging system example, using the concept of "models-as-partitions" (one table, many models), could yield data in DynamoDB which looks something like this:

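| hash_key | range_key | created_dt | content | username | value |
|---|---|---|---|---|---|
| posts | post1 | 1569038270 | Some post | | |
| posts | post2 | 1569038270 | Another post | | |
| posts | post3 | 1569038270 | Third post | | |
| users | user1 | 1569038270 | | Bob | |
| users | user2 | 1569038270 | | Roger | |
| users | user3 | 1569038270 | | Tim | |
| counters | post1 | 1569038270 | | | 23 |
| counters | post2 | 1569038270 | | | 44 |
| counters | post3 | 1569038270 | | | 64 |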

Your Python models could then define a single base class for each logical partition, i.e. a Posts model, a Users model, etc.

However, given the small number of hash_key values, you'd need to make sure all your data would fit within one DynamoDB partition (per a) above).

Note though that time-based analytics would not be a good use-case for a single-table pattern (per DynamoDB's best practices docs).

A (somewhat contrived) example of a data schema which I feel avoids hot partitions is as follows:

| hash_key | range_key | created_dt | username | full_name | content | owner_id | post_id | primary_email |
|---|---|---|---|---|---|---|---|---|
| ff1c09eb-2e67-4773-a8ae-6d9e48055034#user | meta | 1569038771 | bob1 | Bob Richards | | ff1c09eb-2e67-4773-a8ae-6d9e48055034 | | |
| ff1c09eb-2e67-4773-a8ae-6d9e48055034#user | email_data | 1569038771 | | | | ff1c09eb-2e67-4773-a8ae-6d9e48055034 | | [email protected] |
| ff1c09eb-2e67-4773-a8ae-6d9e48055034#user | comments#1569038771 | 1569038771 | | | This is bob's comment | ff1c09eb-2e67-4773-a8ae-6d9e48055034 | 841f97a1-2004-4ee5-907d-cc24f928f8cb | |
| 17a4b8d2-79fa-4564-ab2c-bdc9dd4ce4b5#user | meta | 1569038771 | tim2 | Tim Nice | | 17a4b8d2-79fa-4564-ab2c-bdc9dd4ce4b5 | | |
| 35c208e9-6057-40c8-b062-1f722f07d5f6#user | meta | 1569038771 | roger3 | Roger Dodger | | 35c208e9-6057-40c8-b062-1f722f07d5f6 | | |
| 841f97a1-2004-4ee5-907d-cc24f928f8cb#post | posts | 1569039058 | | | This is a post body | | 841f97a1-2004-4ee5-907d-cc24f928f8cb | |

You could add GSIs for owner_id and post_id to allow for retrieval of all data belonging to a user or post.
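As a rough illustration of what those index lookups would return, the sketch below filters a few of the example rows in memory. It does not talk to DynamoDB: `query_index` is a hypothetical helper, and the only DynamoDB behaviour it mimics is that an item without the index key attribute does not appear in that index.

```python
from typing import Dict, List

USER = "ff1c09eb-2e67-4773-a8ae-6d9e48055034"
POST = "841f97a1-2004-4ee5-907d-cc24f928f8cb"

# A subset of the example rows; absent attributes are simply left out.
ITEMS: List[Dict[str, str]] = [
    {"hash_key": f"{USER}#user", "range_key": "meta",
     "username": "bob1", "owner_id": USER},
    {"hash_key": f"{USER}#user", "range_key": "comments#1569038771",
     "content": "This is bob's comment", "owner_id": USER, "post_id": POST},
    {"hash_key": f"{POST}#post", "range_key": "posts",
     "content": "This is a post body", "post_id": POST},
]


def query_index(items: List[Dict[str, str]], index_key: str, value: str) -> List[Dict[str, str]]:
    """Return the items an index keyed on `index_key` would hold for `value`."""
    return [item for item in items if index_key in item and item[index_key] == value]


# Everything owned by the user, regardless of which "model" the row belongs to.
owned = query_index(ITEMS, "owner_id", USER)
# The post itself plus the comment that references it.
about_post = query_index(ITEMS, "post_id", POST)
```

Here `owned` spans the user's meta row and comment, while `about_post` spans two different hash keys, which is the cross-partition retrieval the GSIs are meant to provide.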

Sep 21 '19 04:09 grvhi

I'm working on a project that is following the linked best practise for data storage. As it stands, there doesn't seem to be any documented support for this access and storage pattern. Is there any goal to eventually include this as part of the documentation?

@grvhi Do you have any further reading on your approach?

In the meantime, I'm going to make a small attempt at this and report back on my findings for your proposed approach using UnicodeDelimitedTupleAttribute.

Sep 23 '19 10:09 curlywurlycraig

@curlywurlycraig - I don't have any documentation or currently-available reference implementation. However, I will say that the UnicodeDelimitedTupleAttribute is definitely a good starting point, based on what I've come across going down this road, so far.

The library I'm working on (which I refer to briefly above) has implemented a class I've called Partition which uses PynamoDB's MetaModel and AttributeContainer to offer functionality similar to PynamoDB's Model, but with some workarounds for inheriting the Meta class and correctly serialising composite keys. However, the main focus of the library is to tightly couple graphql-core-next, a "query planner" and results caching. It's very (very) early days at this stage, but I could share it if you think it will be useful.

Sep 23 '19 14:09 grvhi

That could be pretty helpful for me. I'm currently having a crack at creating a very minimal similar thing for our use cases, so having a reference would be helpful if you're interested in sharing the source.

Sep 23 '19 14:09 curlywurlycraig

@curlywurlycraig - here's a gist: https://gist.github.com/grvhi/78889f32b3701c421ef30e72aebc7f69

I've tried to remove anything which related specifically to the integration of graphql-core-next; hopefully I've neither removed too much, nor too little! Happy to address comments/questions (if you have any) on the gist itself.

Note that there's still a lot of work to be done here and I'm very much open to the idea that I've gone down the wrong path! Would be interested to hear your thoughts.

Sep 23 '19 14:09 grvhi

@grvhi Thank you! I'll take a look at this. I have my own piece of code at the moment and will compare. Your approach looks more comprehensive than mine.

Sep 23 '19 14:09 curlywurlycraig

How have you guys progressed in using PynamoDB for single-table design? I'm working on a new project, and after reading lots of AWS documentation, blogs, etc., it's clear that they promote 1-app/1-table. It's not clear how to implement this in PynamoDB, or whether native support will eventually land.

Dec 11 '19 22:12 erichaus

Seems like even AWS's own internal team didn't implement adjacency lists when doing this for AppSync.

Dec 12 '19 19:12 ricky-sb

@ricky-sb I've had an AWS engineer tell me single table makes m2m hard at scale, but publicly they are recommending single-table design: https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/bp-general-nosql-design.html#bp-general-nosql-design-concepts

> You should maintain as few tables as possible in a DynamoDB application. Most well designed applications require only one table.

Dec 12 '19 21:12 erichaus

I've built a Python library called ~Dokklib-DB~ for the DynamoDB single table pattern.

~Dokklib-DB~ takes a different philosophy from PynamoDB in that it follows the DynamoDB API closely, so if you've used Boto3 before, it will feel familiar.

Features:

- Simple, Pythonic query interface on top of Boto3. No more nested dict literals!
- Type safety for primary keys and indices (for documentation and data integrity).
- Easy error handling.
- Full type hint & unit test coverage + integration testing.

~Docs: https://dokklib.com/libs/db/~

Update 2022-03-11: The project is now archived, but the source is still available at https://github.com/dokklib/dokklib-db

Feb 25 '20 19:02 agostbiro

@abiro this looks really nice! great work! hm now I want to give it a try... using multi-table atm, def bookmarking, thank you

Feb 25 '20 20:02 erichaus

We're working on a fork(ish) of PynamoDB that's built for single table design. Same interface, but works with multiple models on a single table: https://github.com/3mcloud/falcano

Aug 13 '20 18:08 erictwalker18

If the ability to query multiple polymorphic models at the same time was added, then I think the single-table use case would be more-or-less covered. E.g:

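from pynamodb.attributes import DiscriminatorAttribute, UnicodeAttribute
from pynamodb.models import Model

class ParentModel(Model):
    class Meta:
        table_name = 'polymorphic_table'
    id = UnicodeAttribute(hash_key=True)
    sort = UnicodeAttribute(range_key=True)
    cls = DiscriminatorAttribute()  # records which subclass an item belongs to

class FooModel(ParentModel, discriminator='Foo'):
    foo = UnicodeAttribute()

class BarModel(ParentModel, discriminator='Bar'):
    bar = UnicodeAttribute()

# Proposed behaviour: one query against the parent returns mixed subclasses.
items = ParentModel.query('some id')
# items would contain instances of both FooModel and BarModel, depending on the discriminator attribute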

All models on the same table would be sub-classes of a single parent model.
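The dispatch step this proposal relies on can be sketched without PynamoDB. In the snippet below, `Parent`, `Foo`, `Bar`, `from_item` and `query` are hypothetical stand-ins, raw items are plain dicts, and the discriminator is assumed to be stored under a `cls` key; it illustrates the idea only and is not how the library implements it.

```python
from typing import Dict, List, Type


class Parent:
    """Base class that keeps a registry of discriminator value -> subclass."""

    _registry: Dict[str, Type["Parent"]] = {}
    discriminator: str = ""

    def __init_subclass__(cls, discriminator: str = "", **kwargs) -> None:
        super().__init_subclass__(**kwargs)
        if discriminator:
            cls.discriminator = discriminator
            Parent._registry[discriminator] = cls

    def __init__(self, **attrs) -> None:
        self.attrs = attrs

    @classmethod
    def from_item(cls, item: dict) -> "Parent":
        # Pick the subclass named by the stored discriminator value.
        attrs = dict(item)
        target = cls._registry[attrs.pop("cls")]
        return target(**attrs)


class Foo(Parent, discriminator="Foo"):
    pass


class Bar(Parent, discriminator="Bar"):
    pass


def query(items: List[dict], hash_value: str) -> List[Parent]:
    """Return every item under one hash key, each as its own subclass."""
    return [Parent.from_item(item) for item in items if item["id"] == hash_value]
```

A single call such as `query(rows, 'some id')` then yields a mixed list of `Foo` and `Bar` instances, which is the behaviour being asked for from `ParentModel.query`.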

Feb 04 '21 18:02 wisaac407

#1004 with this issue present, polymorphic models are out of the choices for me right now 😢

Feb 03 '22 08:02 mrsakkaro

It would be really nice to have support for single-table design for DynamoDB at hash level, without using additional columns (like typedorm). I believe this is how most people are using it; at least, that's what most of the examples I see do, and it's what we are using. I'm guessing PynamoDB would be even more useful in these cases for querying with single-table design, and it would gain wider adoption. :)

Feb 08 '22 16:02 bafonso

@bafonso what do you mean by "at hash level"?

Feb 14 '22 18:02 erichaus

> @bafonso what do you mean by "at hash level"?

Sorry, my terminology is not very accurate. I basically meant support for prefixing keys with the type, i.e. USER#, PRODUCT#<product_id>, etc.

Mar 18 '22 14:03 bafonso

> If the ability to query multiple polymorphic models at the same time was added, then I think the single-table use case would be more-or-less covered. E.g:
>
>     class ParentModel(Model):
>         class Meta:
>             table_name = 'polymorphic_table'
>         id = UnicodeAttribute(hash_key=True)
>         sort = UnicodeAttribute(range_key=True)
>         cls = DiscriminatorAttribute()
>
>     class FooModel(ParentModel, discriminator='Foo'):
>         foo = UnicodeAttribute()
>
>     class BarModel(ParentModel, discriminator='Bar'):
>         bar = UnicodeAttribute()
>
>     items = ParentModel.query('some id')
>     # Items contains instances of both FooModel, and BarModel depending on the discriminator property
>
> All models on the same table would be sub-classes of a single parent model.

Agree completely with this approach.

Jul 08 '22 13:07 OGoodness
