quarkus icon indicating copy to clipboard operation
quarkus copied to clipboard

Provide read-only transactions.

Open loicmathieu opened this issue 5 years ago • 26 comments

Description Provides performance optimized read only transactions inside Hibernate ORM.

Some discussions about it has been made on #2997 and #6095.

Implementation ideas We can add a readOnly attribute to the @TransactionConfiguration annotation that will set the underlying Hibernate session to readOnly and the flush type to FlushType.MANUAL.

In our current implementation, this annotation will only be used if the user also annotate it's method with @Transactional as the interceptor that manages it is on Narayana. But as Narayana is not aware of the Hibernate session we will need to implements some cooperation between Narayana and Hibernate ...

An other solution could be to add an interceptor on Hibernate extension side to deals with the readOnly attribute. Another one to use JPA hints.

loicmathieu avatar Jan 06 '20 10:01 loicmathieu

This has alread been discussed on the mailing list: https://groups.google.com/d/msg/quarkus-dev/01GSKgzQ9dA/GjgE2c7nAwAJ

loicmathieu avatar Jan 13 '20 14:01 loicmathieu

Perhaps check how SpringBoot does that. If I remember correctly, they somehow push this info all the way to the underlying JDBC connection ...

alesj avatar Feb 19 '20 11:02 alesj

@alesj it needs to be pushed to the EntityManager as it's an hibernate feature.

I have a working prototype using a CDI interceptor

loicmathieu avatar Feb 19 '20 11:02 loicmathieu

Hi, i'm using quarkus with hiberante orm with panache and multitenancy. I've configured my datasource with connector/j replication feature because we have an Aurora RDS Cluster like this:

// application.yml
datasource:
    db1:
        ...
        url: jdbc:mysql:replication://${DB_01_RW_HOST}:3306,${DB_01_RO_HOST}:3306/DBNAME

The purpose is to route SELECT to read-replicas, i'm able to route the query to the replica endpoints if i set the connection.setReadOnly(true).

I was able to try this, setting the connection readOnly directly in the tenant resolver class:

// CustomTenantConnectionResolver Class

 @Override
    public ConnectionProvider resolve(String tenantId) {
 
        log.debug("resolve((persistenceUnitName={}, tenantIdentifier={})", this.persistenceUnitName, tenantId);

        AgroalDataSource dataSource = getDataSourceByTenantId(tenantId);

        if (dataSource == null) {
            throw new IllegalStateException(String.format("No instance of datasource found for persistence unit '%1$s' and tenant '%2$s'", this.persistenceUnitName, tenantId));
        } else {
            return new CustomConnectionProvider(tenantId, dbName, dataSource);
        }
    }


private static class CustomConnectionProvider extends QuarkusConnectionProvider {
        private String tenantId;

        public ArgoConnectionProvider(String tenantId, AgroalDataSource dataSource) {
            super(dataSource);
            this.tenantId = tenantId;
        }

        @Override
        public Connection getConnection() throws SQLException {
            Connection conn = super.getConnection();
            conn.setCatalog(dbName);

//            conn.setReadOnly(true);
//            conn.setAutoCommit(false);

            log.debug("Getting connection tenant={} read-only={}", tenantId, conn.isReadOnly());
            return conn;
        }
    }

There's a way to workaround this, until this pr is merged? There is a better way to route traffic for SELECT operations to read-replicas?

Thanks

stefanorg avatar Mar 11 '21 09:03 stefanorg

@loicmathieu any hint?

stefanorg avatar Mar 18 '21 16:03 stefanorg

@stefanorg the PR is for readonly hibernate session (called readonly transaction by some other framework), it's a different thing as readonly connection to route to a different tenant. You should open a dedicated issue for this.

loicmathieu avatar Mar 18 '21 17:03 loicmathieu

Hi @loicmathieu, I went through the links you mentioned and I have to say I do not understand why Narayana is reported in this issue.

The read-only behaviour you want to develop cannot be provided with Narayana (even as a “hint” for Resource Managers (RM)). In fact, from Narayana’s point of view, ReadOnly is only used when a RM enlisted in a transaction votes ReadOnly during the prepare phase. In this context, ReadOnly means that ”[...] no persistent data associated with the resource has been modified by the transaction” (from spec). This comes from the “ReadOnly optimisation protocol” that states that “a RM can respond to the Transaction Manager’s prepare request by asserting that the RM was not asked to update shared resources in this transaction branch. This response concludes the RM’s involvement in the transaction; the Phase 2 dialogue between the Transaction Manager (TM) and this RM does not occur [...]” (from the same spec). So, as I understand it, the ReadOnly protocol is a bottom-up approach (i.e. the RM tells the TM that no persistent data has been modified) rather than a top-down approach (i.e. the TM propagates the “hint” to run a ReadOnly transaction down to the RM).

I hope that this clarifies why Narayana is not the right place to develop a “ReadOnly” behaviour. If this makes sense, could you please remove the label Narayana from this issue? Thanks.

jmfinelli avatar Jun 07 '22 15:06 jmfinelli

@jmfinelli it has nothing to do with Narayana itself but as it'll need coordination between the transaction manager and the entity manager it will impact the Narayana extension. The transactional CDI interceptor is inside the Narayana transaction and it would need to be updated to allow read-only transactions. You can check the prototype I made here https://github.com/quarkusio/quarkus/pull/10077 a long time ago, there was no agreement on it so it was closed.

loicmathieu avatar Jun 08 '22 06:06 loicmathieu

As #10077 was closed, I think that also this issue should be closed. What do other contributors think?

jmfinelli avatar Jun 13 '22 10:06 jmfinelli

@jmfinelli no, the proposed implementation was refused but we keep the issue open for feedback and may propose something at some time.

loicmathieu avatar Jun 14 '22 08:06 loicmathieu

@loicmathieu ReadOnly has already a defined semantic (as described in my comment above). Having said that, as your prototype is not strictly related to the Narayana extension, could we remove the Narayana label from this issue?

jmfinelli avatar Jun 15 '22 12:06 jmfinelli

@jmfinelli I removed the lable.

loicmathieu avatar Jun 15 '22 13:06 loicmathieu

Thanks @loicmathieu

jmfinelli avatar Jun 15 '22 13:06 jmfinelli

This topic was... kind of revived in https://github.com/quarkusio/quarkus/pull/40365#discussion_r1586209813, where we discuss the ability for Narayana to detect that a non-XA datasource is readonly, so that Narayana can relax some contraints and allow multiple non-XA datasources (all but one readonly) in the same transaction.

Since we now have another use case involving Narayana, and it's no longer just about performance (some JDBC drivers just don't support XA)... I wonder if a new feature in Narayana would make sense?

I realize a "generic" solution would take a long time to see the light, given Jakarta Transaction changes would be involved. But in the specific case of Agroal + Narayana + Quarkus... maybe?

Let's say:

  • Users (somehow) mark a datasource usage as read-only: @ReadOnly on the datasource/Session injection point, readOnly flag on QuarkusTransaction, ... we'll probably need multiple ways of doing it depending on the use case.
  • Quarkus/Hibernate/Agroal take advantage of java.sql.Connection#setReadOnly to make the JDBC driver aware of this, enabling potential performance optimizations in the JDBC driver (feature 1)
  • Quarkus/Agroal take advantage of java.sql.Connection#isReadOnly to (somehow) make Narayana aware that a particular non-XA resource is readonly
  • Narayana (or a custom component in the Quarkus Narayana extension) takes the knowledge of a datasource being read-only into account to determine whether it can be combined with other non-XA datasources (feature 2)

A limitation in all this: the "read-onlyness" would absolutely need to be defined before the corresponding datasource gets involved in the transaction -- that "read-onlyness" wouldn't be able to change during the transaction, due to JDBC constraints (can't call setReadOnly() during a transaction). That makes accessing the same datasource from multiple, independent sections of code quite dodgy. Hopefully that's still a valuable feature?

So... would such a new feature make sense in Agroal/Narayana, @barreiro @mmusgrov?

cc @shawkins

yrodiere avatar May 02 '24 13:05 yrodiere

@yrodiere thank you for the proposal. The current Keycloak logic directly uses the jta TransactionManager to control transaction boundaries - but we can of course adapt as needed.

A base level ask mentioned on the pr was for a heurstic mixed exception that would help users understand if something unsafe has happened during a commit.

Beyond that maybe some of our concern could be addressed by tweaking the meaning of the transactions property. That is if you specify - quarkus.datasource.name.jdbc.transactions=enabled - then it should behave like agroal 2.1 meaning best effort 1pc. That would provide backwards compatibility.

If you specify - quarkus.datasource.name.jdbc.transactions=xa - and only a non-XA driver is available or specified, then rather than there being a configuration error it should behave like agroal 2.3. That is create a local xa resrouce, but allow only 1 of them per transaction. If this is too subtle of a change, another property value could be introduced, maybe something like "safe"?

The docs could then explain why "enabled" is just best effort and potentially unsafe with multiple datasources.

On top of that some consideration of readonly would be great to have - it would either refine either the commit exception handling for the enabled case (we'd know whether it was a heurstic mixed situation), or otherwise allow for multiple non-xa resources even though safe handling is expected.

shawkins avatar May 02 '24 19:05 shawkins

Okay, a lot to unpack here... And it's not only about read-only mode.

I thought #40365 was good enough, which is why I suggested to move the discussion about additional "read-only" features here. But if #40365 is not good enough, please: let's continue the discussion directly on #40365.

Let's get back to #40365 for everything unrelated to read-only then.

See https://github.com/quarkusio/quarkus/pull/40365#discussion_r1588854130

yrodiere avatar May 03 '24 07:05 yrodiere

+1 to have "real" Read-Only connections, seems to be highly requested and we have now some real needs in the form of the Keycloack request.

This would need to go beyond Hibernate ORM though, which was seemingly implied by the original issue description.

We'll need:

  • connection pool support to open such a connection
  • Hibernate ORM support to be aware of it being run in such mode
  • Narayana support, my (naive) understanding is that it could take this into account when deciding the order of operations to coordinate. e.g. it could allow for multiple single-phase resources to be coordinated as long as only the last one is in write mode.
  • public APIs to define how this is to be used.

I think this could be done safely, but I'll defer to @mmusgrov and @tomjenkinson to think carefully about it. For this reason it might be wise to start this in Narayana? Assuming the Narayana team is willing to give it a shot.

The public API might also deserve some upfront experimentation. @shawkins perhaps could lead this aspect based on the practical needs you have?

To be clear, I can't volunteer for this. Just trying to help by summarizing as there was a lot of context spread over various discussions.

Sanne avatar May 03 '24 15:05 Sanne

@Sanne the practical need for Keycloak is that we're supporting at least 1 non-XA datasource type and users are free to create extensions also targeting that source, so we actully need to support the case where 2 (or more) non-XA datasources are involved in the same transaction. Having some way to take advantage of a read-only feature would allow us to minimize unnecessary warnings and/or make extension developers better aware of potential heurstic situations.

Narayana support, my (naive) understanding is that it could take this into account when deciding the order of operations to coordinate. e.g. it could allow for multiple single-phase resources to be coordinated as long as only the last one is in write mode.

This matches my expectation as well. What doesn't work well is that at the time we're starting the transaction we don't know which of the datasources will be the writable one. Also not all drivers, such as SQLServer, support Connection.setReadOnly so enforcement would likely be best effort (at the orm level?) and that flag maintainted by the pool or narayana, rather than relying upon the actual Connection. So I'm struggling to come up with a workable simple proposal.

shawkins avatar May 06 '24 13:05 shawkins

I think this could be done safely, but I'll defer to @mmusgrov and @tomjenkinson to think carefully about it. For this reason it might be wise to start this in Narayana? Assuming the Narayana team is willing to give it a shot.

I'm not sure how safe it would be since unless we are sure the connection is only ever used for read only operations it would be similar to the current problem of using multiple XA unaware resources - what is the guarantee that the connection was only used for read-only operations and how would we detect that it had some write operations?

mmusgrov avatar May 07 '24 08:05 mmusgrov

I tend to agree this feature is worthless unless it's safe by default, so personally I would stick to supporting it only for JDBC drivers on our "nice list" that are known to implement Connection.setReadOnly with actual checks (e.g. org.postgresql:jdbc), throw an exception when attempting read-only for other drivers (e.g. H2 where setReadOnly is a no-op), and leave it at that.

Note the JDBC spec apparently does not mandate that Connection.setReadyOnly leads to actually enforcing read-only-ness, it just allows for optimizations. So alternatively we could implement the behavior in Agroal itself: it would issue database-specific SQL statements to start a transaction as read-only, e.g. for PostgreSQL SET SESSION CHARACTERISTICS AS TRANSACTION READ ONLY, or throw an exception where unsupported. But I suspect we'd get more or less the same result as when using Connection.setReadOnly.

Maybe we could add another flag in the spirit of quarkus.datasource.jdbc.readonly = yolo for drivers/databases that are known not to support read-only transactions, to mimic support while not enforcing anything, but frankly I don't think it's worth the trouble; people using these drivers/databases might as well just set quarkus.transaction-manager.unsafe-multiple-last-resources=allow and accept that they live their life dangerously: we can't help them.

yrodiere avatar May 07 '24 08:05 yrodiere

Maybe we could add another flag in the spirit of quarkus.datasource.jdbc.readonly = yolo for drivers/databases that are known not to support read-only transactions, to mimic support while not enforcing anything, but frankly I don't think it's worth the trouble; people using these drivers/databases might as well just set quarkus.transaction-manager.unsafe-multiple-last-resources=allow and accept that they live their life dangerously: we can't help them.

That an interesting view given the work involved in adding the feature vs. teams' other priorities - especially given the current changes going on in our industry where resources are being stretched.

I spoke to my colleagues about this and they had a similar view about it "... not being worth the effort ..." for the reasons you gave although in the end there was a discussion around jta-dev spec issue 58 concerning "support explicit ordering of commits for XAResources enlisted in a transaction" where the ordering could help with both enlistment decisions depending upon which others are already enlisted and at prepare/commit time (read only before others, last resource after others). In he discussion it was also mentioned that the feature is potentially handy for deployments that have read replica databases and want to offload some reporting queries to the replica whilst writes go to the primary.

mmusgrov avatar May 07 '24 09:05 mmusgrov

I was saying that mimicking support while not enforcing anything for JDBC drivers that don't support read-only transactions is not worth the effort.

The work you'd be doing in Narayana to add support for read-only datasources (and everything this implies for multiple non-XA resources) is very much worth the effort IMO.

EDIT: Though I can't talk about the Narayana team's priorities, obviously.

yrodiere avatar May 07 '24 09:05 yrodiere

So let's get the current set of problems sorted after which we (Agroal/Quarkus/Narayana) can look into read-only resources.

mmusgrov avatar May 07 '24 10:05 mmusgrov

I was saying that mimicking support while not enforcing anything for JDBC drivers that don't support read-only transactions is not worth the effort.

The work you'd be doing in Narayana to add support for read-only datasources (and everything this implies for multiple non-XA resources) is very much worth the effort IMO.

EDIT: Though I can't talk about the Narayana team's priorities, obviously.

I agree with Yoann. There's no point in trying to provide a "poor man's" read-only API in Hibernate ORM if the underlying middleware can't support it properly, all the way. We need consistency guarantess for this to be safe.

The fact that this can't be done on certain databases is not our problem; if any, providing support on the databases which do support this would:

  • allow users that run such databases to benefit from it
  • put pressure on whoever chooses the database: change if it's important for you
  • ultimately, put pressure on the database vendors to support such aspects better

Sanne avatar May 07 '24 14:05 Sanne

What doesn't work well is that at the time we're starting the transaction we don't know which of the datasources will be the writable one.

This concerns me, it doesn't feel right. I suppose one could experiment with trying to detect which resources are being used to only read but I don't think it could be done reliably, for example to classify the execution of a stored procedure one might need to know more about it or classify them all as having potentially side-effects. I might be wrong, didn't think about it deeply but I thought I'd flag this criticality already - so as to not waste time on this unless it's doable.

Unless I'm proven wrong and someone can put down a persuasive proposal to detect the safe usage reliably I believe we'd need an explicit API, and getting back to my earlier comment I believe we should try to see how such an API would look like. It's not at all straight forward to try patching the connection pool for such a feature, and apparently it wouldn't help for @shawkins 's use case.

Sanne avatar May 07 '24 14:05 Sanne

I took the liberty and created an issue on the JTA issue tracker for this, since Jakarta EE 12 planning is starting: https://github.com/jakartaee/transactions/issues/220

beikov avatar Feb 25 '25 16:02 beikov

Reassigning to @beikov , who is trying to get the Jakarta Transaction spec change through. Once that's done, we can start working on integrations in e.g. Agroal.

yrodiere avatar Jun 26 '25 13:06 yrodiere

FWIW the spec has been updated, and includes a workaround directly in the transaction manager for datasources that don't support read-only transactions. So we'll need to wait for the spec and Narayana to be released, then upgrade Narayana in Quarkus, then we can see what happens and consider deeper integration in Agroal/Hibernate.

yrodiere avatar Nov 03 '25 14:11 yrodiere

If anyone is up to the task, @jhalliday suggested that we could start experimenting with read-only transactions straight away, before the spec or Narayana being ready, by leveraging TransactionSynchronizationRegistry to transmit the "this is read-only" information from the API level (CDI interceptor) to Hibernate and the connection pool.

API-wise, we should probably start with that Quarkus-specific transaction annotation suggested in #50119? Or if we feel the feature really needs to be used for experiments only, we can add a separate @ReadOnlyExperiment annotation and call it a day.

yrodiere avatar Nov 26 '25 10:11 yrodiere