arbor icon indicating copy to clipboard operation
arbor copied to clipboard

Support dynamic switching of schema prefixes

Open devonestes opened this issue 6 years ago • 15 comments

Howdy!

So right now when we're using Arbor, the fragment/1 call in ancestors/1 and descendants/1 are essentially hard coded at compile time because of the restrictions of fragment/1.

However, it would be really nice if they could use the prefix in the given's struct's :__meta__.prefix field so we're always searching for ancestors and descendants in the same schema as the given struct.

Since all of these strings used in fragment/1 need to be compiled and not interpolated, my initial thought is to allow users to configure a number of prefixes that they want to compile functions for, sort of like this:

for prefix <- opts[:prefixes] do
  def descendants(%{__meta__: %{prefix: unquote(prefix)}} = struct, depth \\ 2_147_483_647) do
    from(
      t in unquote(definition),
      where:
        t.unquote(opts[:primary_key]) in fragment(
          unquote("""
          WITH RECURSIVE #{opts[:tree_name]} AS (
            SELECT #{opts[:primary_key]},
                    0 AS depth
            FROM #{prefix}.#{opts[:table_name]}
            WHERE #{opts[:foreign_key]} = ?
          UNION ALL
            SELECT #{opts[:table_name]}.#{opts[:primary_key]},
                    #{opts[:tree_name]}.depth + 1
            FROM #{prefix}.#{opts[:table_name]}
              JOIN #{opts[:tree_name]}
              ON #{opts[:table_name]}.#{opts[:foreign_key]} = #{opts[:tree_name]}.#{
            opts[:primary_key]
          }
            WHERE #{opts[:tree_name]}.depth + 1 < ?
          )
          SELECT #{opts[:primary_key]} FROM #{opts[:tree_name]}
          """),
          type(^struct.unquote(opts[:primary_key]), unquote(opts[:foreign_key_type])),
          type(^depth, :integer)
        )
    )
  end
end

Then there could be at the end the same definition that you have now with no pattern matching.

So, sound like something that makes sense to you? If so, I'll send along a PR.

devonestes avatar Feb 22 '19 15:02 devonestes

fragment/1 does support positional interpolation. I am not positive if postgres supports a relation name/prefix in a prepared statement.

coryodaniel avatar Feb 27 '19 23:02 coryodaniel

It is not supported by postgres to do that. I'm going to put a comment on slack to get some opinions on how best to handle this from the Ecto team.

Ref: https://github.com/coryodaniel/arbor/issues/12

coryodaniel avatar Feb 27 '19 23:02 coryodaniel

@coryodaniel Yeah, I realized while I was implementing the solution I came up with that interpolation wouldn't work. I opened up #24 with a solution that's working for us in production. It requires folks to designate the prefixes that they're going to compile functions for in their configuration, so it's not totally dynamic, but it's at least solving the current issue that I encountered at work which led me down this path.

If there's a different way that the Ecto team can think of to solve this that didn't require listing your prefixes at compile time that would be great!

devonestes avatar Feb 28 '19 10:02 devonestes

In your app do you have s fixed number of prefixes with the same schema?

I’m working on a multi-tenant app (prefix per customer) now that is going to need this library soon. I’d like to figure out if there is a way to configure this at run time.

I wonder if there is a route down the raw SQL path without exposing injection opportunities.

Thanks,

Cory O’Daniel

On Feb 28, 2019, at 2:49 AM, Devon Estes [email protected] wrote:

@coryodaniel Yeah, I realized while I was implementing the solution I came up with that interpolation wouldn't work. I opened up #24 with a solution that's working for us in production. It requires folks to designate the prefixes that they're going to compile functions for in their configuration, so it's not totally dynamic, but it's at least solving the current issue that I encountered at work which led me down this path.

If there's a different way that the Ecto team can think of to solve this that didn't require listing your prefixes at compile time that would be great!

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or mute the thread.

coryodaniel avatar Feb 28 '19 14:02 coryodaniel

So I might have an idea how this could work.

If the fragment’s string was built at run time, the SQL identifiers could be interpolated into the elixir string and the arguments (record ID) could still use prepared statement placeholders.

This would require the tree functions not being macros, and calling a function on the module to get arbor options at runtime to interpolate into the elixir string.

If we wanted to keep the functions as macros we could add a shadow function that does the elixir interpolation and return it to the macro so the macro doesn’t have to fetch the options from some place on each call.

Update: this doesn't work. Some magic macro stuff happens in Ecto to see if a fragment's string has been interpolated at runtime, and emits a SQL injection warning. :(

coryodaniel avatar Feb 28 '19 15:02 coryodaniel

The issue is only with the fragment functions, correct? roots, parent, children, and siblings work as expected?

coryodaniel avatar Feb 28 '19 15:02 coryodaniel

Was trying to be slick... This didnt work. :( Invalid syntax using a query as a prepared statement I suppose. Happens @ (?)

        table_name_with_prefix = Arbor.CTE.prefixed_table_name(
          struct.__meta__, unquote(opts[:source])
        )

        local_opts = unquote(opts)

        query1 = from(table_name_with_prefix,
          select: fragment("?, ?, 0 AS depth", ^local_opts[:primary_key], ^local_opts[:foreign_key]),
          where: ^[
            {local_opts[:primary_key], struct.unquote(opts[:primary_key])}
          ]
        )

        query2 = from(table_name_with_prefix,
          select: fragment("?.?, ?.?, ?.depth + 1",
            ^table_name_with_prefix, ^local_opts[:primary_key],
            ^table_name_with_prefix, ^local_opts[:foreign_key],
            ^local_opts[:tree_name]
          ),
          join: ^local_opts[:tree_name],
          on: fragment("? = ?", ^"#{local_opts[:tree_name]}.#{local_opts[:foreign_key]}", ^"#{table_name_with_prefix}.#{local_opts[:primary_key]}")
        )

        recursive_union = union_all(query1, ^query2)

        from(t in unquote(definition),
          join:
            g in fragment(
              unquote("""
              (
                WITH RECURSIVE #{opts[:tree_name]} AS (?)
                SELECT *
                FROM #{opts[:tree_name]}
              )
              """),
              ^recursive_union
            ),
          on: t.unquote(opts[:primary_key]) == g.unquote(opts[:foreign_key])
        )

coryodaniel avatar Feb 28 '19 23:02 coryodaniel

Yes, we do have a fixed number of schemas, which is why #24 works for us.

That PR most likely won't work for a multi-tenant app since (in my experience) most of those tenants are generated dynamically during the execution of the application.

The only way I know of to do this dynamically at runtime is to use something like Ecto.Adapters.SQL.query/4, but that won't give users the composibility that's so nice with the current implementation.

The issue isn't just with the fragments, actually, because when you do from(t in unquote(definition), #...) that will set the prefix if there's a @schema_prefix defined on that schema, and once it's set it cannot be overridden later on. The prefix needs to be set at the first time you use from or join if there's a @schema_prefix on the schema.

devonestes avatar Mar 01 '19 10:03 devonestes

Yeah I was down the raw query route, but didnt want to give up composing.

Are you using a fork in your code now? I'd hate to block you.

I'll probably end up merging by the end of this weekend. I'm going to take a swing to see if I can work something crazy up with windows since it seems like thats supported by ecto.

If not I'll merge and see if I can add CTE support to ecto and circle back to the runtime solution later.

I was able to get all of the functions working at runtime except the CTE ones!

coryodaniel avatar Mar 01 '19 15:03 coryodaniel

Yeah, we're using that fork for now, so no worries about blocking.

Window functions might work here - good thinking!

devonestes avatar Mar 01 '19 15:03 devonestes

https://github.com/elixir-ecto/ecto/pull/2757

CTE support is coming!

I’m going to start a refactor using this branch. Macro API will remain the same.

coryodaniel avatar Mar 09 '19 16:03 coryodaniel

CTE has landed in dev: https://github.com/elixir-ecto/ecto/pull/2757

coryodaniel avatar Jun 21 '19 13:06 coryodaniel

Hi folks. Bumping this discussion as I'm interested in how to get this library working with Triplex?

glennr avatar Apr 17 '20 12:04 glennr

I started a branch to refactor using CTEs, but have had zero time outside the office for work lately.

Note, I’ve only refactored two functions. I can try to finish it up this weekend, depends how babynaps go!

https://github.com/coryodaniel/arbor/pull/31

coryodaniel avatar Apr 17 '20 14:04 coryodaniel

+1 for this issue. Can I help somehow? My app is SaaS PG schema-based I need this too much

sveredyuk avatar Mar 22 '22 17:03 sveredyuk