feat: index projection & index scanning

Open anatolzak opened this issue 6 months ago • 15 comments

Closes https://github.com/tywalch/electrodb/issues/508, https://github.com/tywalch/electrodb/issues/507

Currently, users who want to use DynamoDB indexes with the INCLUDE projection type do not receive any type safety from ElectroDB.

These indexes are highly beneficial for access patterns that require multiple optional filters, which a standard index cannot accommodate. By creating an index with only the attributes needed for the multi-filter access pattern, you can achieve much faster queries while scanning less data, resulting in lower costs.

It is also sometimes useful to scan a specific index when it is sparse and you are only interested in the items present in that index.

At present, ElectroDB does not offer a built-in type-safe method to scan an index.

This PR introduces two new features:

  • A new option to specify the projected attributes of an index
  • The ability to scan a specific index

The two can be combined: you can scan a sparse index that projects only specific attributes.

Here is an example of how to define projected attributes.

indexes: {
    statusIndex: {
        index: 'gsi1pk-gsi1sk-index',
        projection: ['name', 'status', 'createdAt'],
        pk: {
            field: 'gsi1pk',
            composite: ['status'],
        },
        sk: {
            field: 'gsi1sk',
            composite: ['createdAt'],
        }
    }
}
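
For context, the table itself needs a matching index with an INCLUDE projection. Here is a rough sketch of that index in the shape DynamoDB's CreateTable API uses for GlobalSecondaryIndexes. It assumes each attribute is stored under a field of the same name and that the table uses on-demand capacity; neither is stated in this PR.

```typescript
// Sketch of a GSI definition in the shape of DynamoDB's CreateTable
// `GlobalSecondaryIndexes` entries. Index and key names mirror the example
// above. ASSUMPTION: attribute names equal their stored field names.
// ASSUMPTION: on-demand table, so no ProvisionedThroughput is given.
const statusIndexDefinition = {
  IndexName: "gsi1pk-gsi1sk-index",
  KeySchema: [
    { AttributeName: "gsi1pk", KeyType: "HASH" },
    { AttributeName: "gsi1sk", KeyType: "RANGE" },
  ],
  Projection: {
    ProjectionType: "INCLUDE",
    // Key attributes are always projected, so only non-key attributes
    // are listed here.
    NonKeyAttributes: ["name", "status", "createdAt"],
  },
};
```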

Here is an example of how to scan a specific index.

await StoreLocations.scan.units.go();

Most of the changes in this PR are at the type level, to improve type safety. The resulting type behavior is:

  1. Scanning or querying an index with limited projected attributes results in the output type containing only those projected attributes.
  2. When scanning or querying an index with limited projected attributes, using the attributes argument in the go method allows you to specify only the projected attributes.
  3. If you use the hydrate option while scanning or querying an index with limited projected attributes, the output type will include all of the entity’s attributes.
  4. When using the hydrate option, you can specify any attributes of the entity in the attributes parameter while scanning or querying an index with limited projected attributes.
  5. When scanning or querying an index with limited projected attributes, you can only filter by the projected attributes, regardless of whether you use the hydrate option.
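
Rules 1 and 3 can be illustrated with a small standalone sketch. The Task shape, the IndexResult type, and the pickProjected helper are all hypothetical and exist only for illustration; they are not ElectroDB's types or API.

```typescript
// Hypothetical item shape and projection, for illustration only.
type Task = {
  id: string;
  name: string;
  status: string;
  createdAt: number;
  notes: string;
};
type StatusIndexProjection = "name" | "status" | "createdAt";

// Rules 1 and 3: the result is narrowed to the projected attributes unless
// hydrate is true, in which case the full item type comes back.
type IndexResult<T, P extends keyof T, Hydrate extends boolean = false> =
  Hydrate extends true ? T : Pick<T, P>;

type Narrowed = IndexResult<Task, StatusIndexProjection>;
type Hydrated = IndexResult<Task, StatusIndexProjection, true>;

// Runtime counterpart of the narrowing: copy only the projected keys.
function pickProjected<T extends object, P extends keyof T>(
  item: T,
  projection: readonly P[],
): Pick<T, P> {
  const result = {} as Pick<T, P>;
  for (const key of projection) {
    result[key] = item[key];
  }
  return result;
}

const task: Task = {
  id: "t1",
  name: "Write docs",
  status: "open",
  createdAt: 1752242820,
  notes: "draft",
};

const narrowed: Narrowed = pickProjected(task, ["name", "status", "createdAt"]);
const hydrated: Hydrated = task;
// Accessing narrowed.notes would be a compile-time error: "notes" is not
// part of the projection.
```

In this sketch, asking for an attribute outside the projection fails at compile time rather than returning undefined at runtime, which is the kind of guarantee the rules above describe.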

Implementation Overview

I introduced a new generic type P to the Entity and Schema to capture the type of the projected attributes. To ensure backward compatibility, the generic type has a default value of string, and the new generic is added as the last one (after S).

This change allows users to migrate to the new version of the library without encountering type errors when using the Schema or Entity types.

To ensure that the output type of queries and index scans with limited projected attributes contains only the projected attributes, we forward the "access pattern" key to types such as QueryBranches and GoQueryTerminalOptions. This allows us to implement the conditional types that provide the fine-grained type safety described above.
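
The two ideas above, a trailing generic with a default and a forwarded index key driving a conditional type, can be sketched in isolation. Every name below (IndexSketch, SchemaSketch, ProjectedKeys, projectedAttributes) is hypothetical, and the types are far simpler than the real definitions in index.d.ts.

```typescript
// Illustration-only types; these names are hypothetical.
type IndexSketch<P extends string> = { projection?: readonly P[] };

// P comes last and defaults to string, so code written against the older
// single-generic form keeps compiling.
type SchemaSketch<A extends string, P extends string = string> = {
  attributes: Record<A, "string" | "number">;
  indexes: Record<string, IndexSketch<P>>;
};

// Still valid without supplying P.
type LegacySchema = SchemaSketch<"id" | "name">;

// Forwarding the index ("access pattern") key lets a conditional type look
// up that index's projection, falling back to every attribute if none is set.
type ProjectedKeys<S, I extends string> =
  S extends { indexes: Record<I, { projection: readonly (infer P)[] }> }
    ? P
    : S extends { attributes: infer Attrs }
      ? keyof Attrs
      : never;

const schema = {
  attributes: {
    id: "string",
    name: "string",
    status: "string",
    createdAt: "number",
    notes: "string",
  },
  indexes: {
    primary: {},
    statusIndex: { projection: ["name", "status", "createdAt"] },
  },
} as const;

type StatusKeys = ProjectedKeys<typeof schema, "statusIndex">;
type PrimaryKeys = ProjectedKeys<typeof schema, "primary">;

const statusKey: StatusKeys = "status"; // "notes" would not compile here
const primaryKey: PrimaryKeys = "notes"; // every attribute is allowed

// Runtime counterpart: the projection if present, otherwise all attributes.
function projectedAttributes(
  s: SchemaSketch<string>,
  index: string,
): readonly string[] {
  return s.indexes[index]?.projection ?? Object.keys(s.attributes);
}
```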

Notes

I added documentation, numerous tests, and type tests to validate all this functionality.

anatolzak avatar Jul 11 '25 14:07 anatolzak

Deploy Preview for electrodb-dev canceled.

Name Link
Latest commit 38ba77ab24a9186916ea13ea6318bd196df07753
Latest deploy log https://app.netlify.com/projects/electrodb-dev/deploys/68fcd7f144484f00085b2485

netlify[bot] avatar Jul 11 '25 14:07 netlify[bot]

Hey @anatolzak 👋

Firstly, wow, just wow. This is easily the most extensive feature PR this project has ever received (excluding the time someone submitted a PR for electrodb.dev). I will need some time to review this, but thank you for including both code and type tests with your PR. Two things that may come up in review (note: I have not read through this PR yet):

  1. Splitting the PR into a "scan index" PR and an "index project" PR
  2. Changes to avoid breaking existing and/or exported types

tywalch avatar Jul 11 '25 19:07 tywalch

@tywalch, first, thank you for the kind words! :)

Regarding the two points you mentioned:

  1. I considered splitting this PR into two separate ones. However, the "index project" PR required modifications that depend on the "index scan" already being implemented. It was easier to combine the two features. Let me know if you'd like me to split it after you review the code, and I will be happy to do so.
  2. I did my best to avoid breaking the most important exported types, such as Schema and Entity. I am happy to address any other types that you think should be handled better.

Looking forward to your feedback! Thanks a lot and thanks for creating such an amazing library! :)

anatolzak avatar Jul 11 '25 20:07 anatolzak

I'm starting to review this today. You'll find I am very cautious around new changes and I won't be rushing to get this merged, but at the outset I am very impressed with all the considerations I am seeing you make in your tests 💪

tywalch avatar Jul 15 '25 14:07 tywalch

@anatolzak I made some progress on this today and two things shook out:

  1. Instead of going down a path that would likely have a lot of back and forth, I created a branch that adds projection validation to Entities and Services. It also renames project to projection. The branch is called tywalch/feat-index-projection-extension. If you pull it to your PR branch, we can retain your contributions on merge. My branch should be based on your latest commit, but please double-check that it's still true when you pull.
  2. Can you take a look at adding this narrowed typing to Service collections? Let me know if it's a heavy lift, and we can consider postponing it.

tywalch avatar Jul 16 '25 18:07 tywalch

@tywalch

  1. makes perfect sense
  2. no problem, I glanced at the service collections code, and it doesn't seem too difficult

anatolzak avatar Jul 17 '25 16:07 anatolzak

@tywalch I finished adding the type narrowing for service collections based on the DynamoDB index projection type, along with many type tests.

a few notes:

  1. Let's say you have a collection with entity A, which has attributes attr1 and attr2, and another entity B with attributes attr3 and attr4. If you query the collection and specify that you want to return the attributes attr1 and attr3, ElectroDB will throw an error: ElectroError: Unknown attributes provided in query options. I am unsure about the best way to handle this. Should we ignore this check for collection queries, or maybe perform the attribute check based on all the valid attributes of the entities in the collection?
  2. I don't believe this is critical for this PR, but you will notice in my last commit that I had to extract the generic A type from the Schema in a few places instead of inferring it from the Entity. It seems that inferring the A generic type based on the Entity does not work. see a playground example

anatolzak avatar Jul 19 '25 16:07 anatolzak

@tywalch I finished adding the type narrowing for service collections based on the DynamoDB index projection type, along with many type tests.

Awesome! I'll take a look tomorrow to familiarize myself with everything 💪

  1. Let's say you have a collection with entity A, which has attributes attr1 and attr2, and another entity B with attributes attr3 and attr4. If you query the collection and specify that you want to return the attributes attr1 and attr3, ElectroDB will throw an error: ElectroError: Unknown attributes provided in query options. I am unsure about the best way to handle this. Should we ignore this check for collection queries, or maybe perform the attribute check based on all the valid attributes of the entities in the collection?

Yes, likely the latter (perform the attribute check based on all the valid attributes of the entities in the collection). I'd like to take a look at this tomorrow as well. Services delegate queries to a member entity and then use execution options to change the behavior of the Entity. It's all very hacky and I want to see if I can stumble into a fast solution there.
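
As a rough illustration of that second option, the check could validate requested attributes against the union of attributes across all entities in the collection. The EntitySketch shape and the findUnknownAttributes helper are hypothetical; ElectroDB's internal validation is structured differently.

```typescript
// Hypothetical shape, for illustration only.
type EntitySketch = { name: string; attributes: readonly string[] };

// Returns the requested attributes that no entity in the collection defines.
function findUnknownAttributes(
  requested: readonly string[],
  entities: readonly EntitySketch[],
): string[] {
  const valid = new Set<string>();
  for (const entity of entities) {
    for (const attribute of entity.attributes) {
      valid.add(attribute);
    }
  }
  return requested.filter((attribute) => !valid.has(attribute));
}

const entityA: EntitySketch = { name: "A", attributes: ["attr1", "attr2"] };
const entityB: EntitySketch = { name: "B", attributes: ["attr3", "attr4"] };

// attr1 belongs to A and attr3 to B, so nothing is reported as unknown.
const mixed = findUnknownAttributes(["attr1", "attr3"], [entityA, entityB]);

// attr5 is defined by neither entity, so it is reported.
const typo = findUnknownAttributes(["attr1", "attr5"], [entityA, entityB]);
```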

  2. I don't believe this is critical for this PR, but you will notice in my last commit that I had to extract the generic A type from the Schema in a few places instead of inferring it from the Entity. It seems that inferring the A generic type based on the Entity does not work. see a playground example

This makes sense; I will be the first to say index.d.ts is desperate for optimization 😭

tywalch avatar Jul 19 '25 20:07 tywalch

  1. Let's say you have a collection with entity A, which has attributes attr1 and attr2, and another entity B with attributes attr3 and attr4. If you query the collection and specify that you want to return the attributes attr1 and attr3, ElectroDB will throw an error: ElectroError: Unknown attributes provided in query options. I am unsure about the best way to handle this. Should we ignore this check for collection queries, or maybe perform the attribute check based on all the valid attributes of the entities in the collection?

An update on this front: I have a working local branch with these changes. I will be working on additional tests for it tomorrow and hope to have it ready for you to merge with your PR.

tywalch avatar Jul 22 '25 21:07 tywalch

An update on this front: I have a working local branch with these changes. I will be working on additional tests for it tomorrow and hope to have it ready for you to merge with your PR.

That's awesome! Thanks so much for doing this! 🙏

anatolzak avatar Jul 23 '25 18:07 anatolzak

hey @tywalch 👋

let me know if there is anything I can do to help!

anatolzak avatar Jul 31 '25 15:07 anatolzak

An update on this front: I have a working local branch with these changes. I will be working on additional tests for it tomorrow and hope to have it ready for you to merge with your PR.

Hey @tywalch 🙂

Just checking in to see how things are going with the review and the additional changes you mentioned (collection attribute checks, ownership checks, etc.). Is there anything I can do to help move this PR forward?

Thanks!

anatolzak avatar Aug 15 '25 15:08 anatolzak

Hey @tywalch!

I went ahead and updated the PR to make it merge-ready. Previously, there were two outstanding issues:

  1. Entity identifier attributes in INCLUDE projections – Users had to manually add ElectroDB’s internal entity identifier attributes to the projected attributes of the index, which isn’t a great DX.
  2. Specifying attributes to return when querying a collection – This PR originally added types for specifying attributes to return in collection queries, but that doesn’t currently work at runtime.

Here’s how I addressed them:

  1. I updated the PR so that ignoreOwnership is set to true by default when querying an INCLUDE or KEYS_ONLY index. I also added tests for this and updated the docs. This way, users don’t need to add the entity identifier attributes, while still getting ElectroDB’s guarantees (especially if they’re not using the template feature). From what I see, ElectroDB will still validate if the identifier attributes are present, and if not, it falls back to checking PK/SK prefixes. The risk of using template without unique prefixes per entity exists outside of this PR as well.
  2. Since runtime support for specifying attributes to return in collection queries is not yet available, I commented out the type definitions I had added. This will simplify reintroducing them once we have runtime support. Let me know if you would prefer a different approach. Nevertheless, this PR still provides significant type safety for collections that use indexes with limited projected attributes.
    • .where() filters are aware of which attributes are projected
    • The result type from queries only contains the projected attributes
    • Using the hydrate option alters the result type to include all attributes
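
The ignoreOwnership default described in point 1 can be sketched as a small decision function. The resolveIgnoreOwnership helper is hypothetical and not part of ElectroDB. The sketch also assumes that an explicitly passed value overrides the default, which "by default" suggests but which is not spelled out here.

```typescript
// DynamoDB's three index projection types.
type ProjectionType = "ALL" | "KEYS_ONLY" | "INCLUDE";

// Hypothetical helper: decides whether ownership checks are skipped.
// ASSUMPTION: an explicit user-supplied value always wins over the default.
function resolveIgnoreOwnership(
  projectionType: ProjectionType,
  explicit?: boolean,
): boolean {
  if (explicit !== undefined) {
    return explicit;
  }
  // Indexes that do not project every attribute may lack the entity
  // identifier attributes, so the default flips to true for them.
  return projectionType === "INCLUDE" || projectionType === "KEYS_ONLY";
}

const includeDefault = resolveIgnoreOwnership("INCLUDE");
const keysOnlyDefault = resolveIgnoreOwnership("KEYS_ONLY");
const allDefault = resolveIgnoreOwnership("ALL");
const includeOverride = resolveIgnoreOwnership("INCLUDE", false);
```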

With these changes, I think the PR is in a good place to merge. Let me know what you think, or if there’s anything else I can do to help.

Thanks! 🙏

anatolzak avatar Sep 05 '25 12:09 anatolzak

Hey @tywalch! Just a quick nudge to take a look at this PR when you get a chance. I think it should be good to merge. Thanks so much!

anatolzak avatar Oct 18 '25 15:10 anatolzak

Hey @tywalch! Just checking in on this PR. Happy to adjust anything that doesn’t look right or make any changes you’d like. Let me know if there’s anything I can improve. Thank you!

anatolzak avatar Nov 29 '25 08:11 anatolzak