
declare DataSources in persistence.xml

Open gavinking opened this issue 3 weeks ago • 12 comments

I would like to add the option of declaring a DataSource in persistence.xml. Currently, an XML-based DataSource declaration is only allowed in application.xml, application-client.xml, web.xml, and ejb-jar.xml. Those descriptors are tied to the Jakarta Platform, whereas Jakarta Persistence features a standalone mode. It would be very nice to have a uniform way to declare a DataSource that works both inside and outside the container.

For example:

<persistence xmlns="https://jakarta.ee/xml/ns/persistence"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="https://jakarta.ee/xml/ns/persistence
                                 https://jakarta.ee/xml/ns/persistence/persistence_4_0.xsd"
             version="4.0">

    <persistence-unit name="Library" transaction-type="JTA">
        <description>My example persistence unit</description>
        <jta-data-source>jdbc/library</jta-data-source>
        <class>org.example.Book</class>
        <class>org.example.Author</class>
        <class>org.example.Publisher</class>
        <exclude-unlisted-classes/>
        <default-fetch-type>LAZY</default-fetch-type>
        <shared-cache-mode>ENABLE_SELECTIVE</shared-cache-mode>
    </persistence-unit>

    <data-source>
        <name>jdbc/library</name>
        <description>My example persistence datasource</description>
        <class-name>org.postgresql.Driver</class-name>
        <url>jdbc:postgresql://localhost/library</url>
        <user>gavin</user>
        <password>p0ny</password>
        <max-pool-size>100</max-pool-size>
        <max-idle-time>10000</max-idle-time>
    </data-source>
</persistence>

Or, in a slightly streamlined form:

<persistence xmlns="https://jakarta.ee/xml/ns/persistence"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="https://jakarta.ee/xml/ns/persistence
                                 https://jakarta.ee/xml/ns/persistence/persistence_4_0.xsd"
             version="4.0">

    <persistence-unit name="Library" transaction-type="JTA">
        <description>My example persistence unit</description>
        <data-source>
            <class-name>org.postgresql.Driver</class-name>
            <url>jdbc:postgresql://localhost/library</url>
            <user>gavin</user>
            <password>p0ny</password>
            <max-pool-size>100</max-pool-size>
            <max-idle-time>10000</max-idle-time>
        </data-source>
        <class>org.example.Book</class>
        <class>org.example.Author</class>
        <class>org.example.Publisher</class>
        <exclude-unlisted-classes/>
        <default-fetch-type>LAZY</default-fetch-type>
        <shared-cache-mode>ENABLE_SELECTIVE</shared-cache-mode>
    </persistence-unit>
</persistence>

Now, the immediate objection to this is that such a file is almost (but not quite) as "static" as putting a @DataSourceDefinition annotation in the code, since META-INF/persistence.xml is delivered in the same archive as the compiled code.

That's partly true, but persistence.xml already features a well-defined way to override information specified via attributes and elements of <persistence-unit> using deployment-specified properties, and we could extend that approach to the elements of <jta-data-source> and <non-jta-data-source>. That is, we could define properties like jakarta.persistence.jta-datasource.url, jakarta.persistence.jta-datasource.user, etc.
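To make the override idea a bit more concrete, here is a rough sketch of how descriptor values and deployment-specified properties might be merged. Everything in it is an assumption for illustration: the class and method names are made up, the property prefix simply follows the names floated above, and db.example.org is a placeholder host. It is not how any provider actually resolves these settings.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class DataSourceOverrides {
    // Hypothetical prefix, taken from the property names suggested above.
    static final String PREFIX = "jakarta.persistence.jta-datasource.";

    /** Returns the descriptor settings with matching deployment properties applied on top. */
    static Map<String, String> resolve(Map<String, String> fromDescriptor,
                                       Map<String, String> deploymentProperties) {
        Map<String, String> resolved = new LinkedHashMap<>(fromDescriptor);
        deploymentProperties.forEach((key, value) -> {
            // Only properties under the data source prefix are treated as overrides.
            if (key.startsWith(PREFIX)) {
                resolved.put(key.substring(PREFIX.length()), value);
            }
        });
        return resolved;
    }

    public static void main(String[] args) {
        Map<String, String> descriptor = Map.of(
                "url", "jdbc:postgresql://localhost/library",
                "user", "gavin");
        Map<String, String> deployment = Map.of(
                PREFIX + "url", "jdbc:postgresql://db.example.org/library");
        // The url comes from the deployment properties, the user from the descriptor.
        System.out.println(resolve(descriptor, deployment));
    }
}
```

The point of the sketch is only that the descriptor supplies defaults and the deployment wins wherever it says something, which is the same rule that already applies to the rest of <persistence-unit>.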

What I do still need to think through a bit more carefully is whether this should impact PersistenceConfiguration / Persistence / PersistenceProvider. In principle, I believe we could leave all that stuff alone:

  • PersistenceProvider.createEntityManagerFactory(name) is responsible for parsing persistence.xml, creating a DataSource in standalone mode, and wiring it up with the EMF, while
  • PersistenceProvider.createContainerEntityManagerFactory(PersistenceUnitInfo) never even sees the persistence.xml, and the container is responsible for creating the DataSource.
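For the standalone path, the parsing step could look roughly like the sketch below, which uses only the JDK's DOM parser to pull the children of a nested <data-source> element out of the descriptor. The class and method names are hypothetical, the <data-source> element is the one proposed in this issue rather than anything in the current schema, and a real provider would go on to build a pooled DataSource from these values, which is not shown.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

class DataSourceElementReader {
    /** Collects the child elements of the first <data-source> element as name/value pairs. */
    static Map<String, String> readDataSource(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            // "*" matches the element in any namespace.
            NodeList sources = doc.getElementsByTagNameNS("*", "data-source");
            Map<String, String> settings = new LinkedHashMap<>();
            if (sources.getLength() == 0) {
                return settings; // no declaration present
            }
            NodeList children = sources.item(0).getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                if (child instanceof Element) {
                    settings.put(child.getLocalName(), child.getTextContent().trim());
                }
            }
            return settings;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse descriptor", e);
        }
    }

    public static void main(String[] args) {
        String xml = """
                <persistence xmlns="https://jakarta.ee/xml/ns/persistence" version="4.0">
                    <persistence-unit name="Library">
                        <data-source>
                            <class-name>org.postgresql.Driver</class-name>
                            <url>jdbc:postgresql://localhost/library</url>
                            <max-pool-size>100</max-pool-size>
                        </data-source>
                    </persistence-unit>
                </persistence>
                """;
        System.out.println(readDataSource(xml));
    }
}
```

In the container path none of this would run in the provider, since the container reads the descriptor and hands over a PersistenceUnitInfo.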

On the other hand, it might be nice for a standalone client to be able to create a DataSource programmatically:

  1. We could introduce DataSourceConfiguration and Persistence.createDataSource(DataSourceConfiguration). That would be a whole new API.
  2. Or maybe we could somehow "denormalize" the configuration of the DataSource into PersistenceConfiguration.

I'm not sure what's better there. I'm scared that option 2 grows the API of PersistenceConfiguration into something very confusing. Perhaps we could do something like this:

var config = new PersistenceConfiguration("Library");
List.of(Book.class, Author.class, Publisher.class)
        .forEach(config::managedClass);
config.dataSource(new DataSourceConfiguration("jdbc/Library")
        .className("org.postgresql.Driver")
        .user(user)
        .password(pass)
        .url("jdbc:postgresql://localhost/library")
        .maxPoolSize(10));
config.schemaManagementDatabaseAction(VALIDATE);
var factory = config.createEntityManagerFactory();

That looks pretty reasonable, I suppose. You would not be able to create a DataSource independently of creating an EMF, but I guess that's fine or perhaps even desirable.
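For what it's worth, the DataSourceConfiguration used above could be little more than a fluent value holder. The sketch below is purely hypothetical: no such class exists in Jakarta Persistence today, and the accessor names and the -1 "not specified" default are my own assumptions, chosen only to match the calls in the snippet.

```java
import java.util.Objects;

/** Hypothetical sketch only; not part of any existing Jakarta Persistence API. */
class DataSourceConfiguration {
    private final String name;
    private String className;
    private String url;
    private String user;
    private String password;
    private int maxPoolSize = -1; // -1 means "not specified" (an assumption)

    DataSourceConfiguration(String name) {
        this.name = Objects.requireNonNull(name, "name");
    }

    // Fluent setters return this so that calls can be chained.
    DataSourceConfiguration className(String className) { this.className = className; return this; }
    DataSourceConfiguration url(String url) { this.url = url; return this; }
    DataSourceConfiguration user(String user) { this.user = user; return this; }
    DataSourceConfiguration password(String password) { this.password = password; return this; }
    DataSourceConfiguration maxPoolSize(int maxPoolSize) { this.maxPoolSize = maxPoolSize; return this; }

    // Read accessors that a provider could use when building the actual DataSource.
    String name() { return name; }
    String className() { return className; }
    String url() { return url; }
    String user() { return user; }
    String password() { return password; }
    int maxPoolSize() { return maxPoolSize; }

    public static void main(String[] args) {
        DataSourceConfiguration ds = new DataSourceConfiguration("jdbc/Library")
                .className("org.postgresql.Driver")
                .url("jdbc:postgresql://localhost/library")
                .maxPoolSize(10);
        System.out.println(ds.name() + " -> " + ds.url());
    }
}
```

Keeping it a plain holder like this would leave all the pooling and driver loading to the provider, which fits the idea that the DataSource only comes into existence together with the EMF.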

gavinking, Dec 27 '25 10:12