eclipselink icon indicating copy to clipboard operation
eclipselink copied to clipboard

Find all query with full qualified class name is recognized as ReportQuery

Open TJDev opened this issue 4 years ago • 14 comments

EclipseLink Version: 2.6.8.WAS

I have tested the BatchFetch feature with our current implementation and got a NullPointerException when executing a query like the following:

select e from my.package.Entity e

Caused by: java.lang.NullPointerException at org.eclipse.persistence.queries.BatchFetchPolicy.getDataResults(BatchFetchPolicy.java:251) at org.eclipse.persistence.mappings.ForeignReferenceMapping.extractResultFromBatchQuery(ForeignReferenceMapping.java:562) at org.eclipse.persistence.internal.indirection.NoIndirectionPolicy.valueFromBatchQuery(NoIndirectionPolicy.java:297) at org.eclipse.persistence.mappings.ForeignReferenceMapping.batchedValueFromRow(ForeignReferenceMapping.java:275) at org.eclipse.persistence.mappings.ForeignReferenceMapping.valueFromRow(ForeignReferenceMapping.java:2176) at org.eclipse.persistence.mappings.ForeignReferenceMapping.buildCloneFromRow(ForeignReferenceMapping.java:336) at org.eclipse.persistence.internal.descriptors.ObjectBuilder.buildAttributesIntoWorkingCopyClone(ObjectBuilder.java:2002) at org.eclipse.persistence.internal.descriptors.ObjectBuilder.buildWorkingCopyCloneFromRow(ObjectBuilder.java:2255) at org.eclipse.persistence.internal.descriptors.ObjectBuilder.buildObjectInUnitOfWork(ObjectBuilder.java:853) at org.eclipse.persistence.internal.descriptors.ObjectBuilder.buildObject(ObjectBuilder.java:740) at org.eclipse.persistence.internal.descriptors.ObjectBuilder.buildObject(ObjectBuilder.java:694) at org.eclipse.persistence.queries.ReportQueryResult.processItem(ReportQueryResult.java:251)

For my understanding the exception is thrown because the query is recognized as a ReportQuery and not a ReadAllQuery. If I delete the package definition, the query is recognized as ReadAllQuery.

select e from Entity e

My question is: Is this an bug of EclipseLink or should a query with package and class name be a ReportQuery?

Regards, Thomas

TJDev avatar Aug 19 '21 10:08 TJDev

Hello @TJDev !

Thank you for reporting this. I have a couple questions if you could please answer:

What database are you running against? It's unlikely to matter, but I'd still like to know for completeness. Can you provide the code you are using to create your JPQL query? And the Entity class code as well? I assume this is all as simple as it sounds, but I want to double check. Is this while running on Open Liberty? If so, can you tell me what Liberty version you are running on? Can you reproduce this failure reliably? Can you reproduce with a simple JSE usecase or only within an EE application environment?

Running my own simple test, I am seeing some interesting differences in trace.

Test 1:

    Query q = em.createQuery("SELECT e FROM Employee e");
    List<?> ret = q.getResultList();

Trace:

[EL Finest]: query: 2021-08-20 14:23:59.208--UnitOfWork(1120670624)--Thread(Thread[main,5,main])--Execute query ReadAllQuery(referenceClass=Employee sql="SELECT ID FROM EMPLOYEE")

Test 2:

    Query q = em.createQuery("SELECT e FROM model.Employee e");
    List<?> ret = q.getResultList();

Trace:

[EL Finest]: query: 2021-08-20 14:26:34.898--UnitOfWork(955443582)--Thread(Thread[main,5,main])--Execute query ReportQuery(referenceClass=Employee sql="SELECT ID FROM EMPLOYEE")

That being said, while I am not sure at the moment why the query is running with different EclipseLink query types, I am also not getting a failing test. Both tests return the expected results and neither throws a NPE

Also, I was able to verify that this difference is not specific to 2.6_WAS version of EclipseLink. Both 2.7 and 3.0 versions also use ReadAllQuerys for queries without a full package and use ReportQuerys for queries with a full package.

dazey3 avatar Aug 20 '21 19:08 dazey3

The issue starts in org.eclipse.persistence.internal.jpa.jpql.ReadAllQueryBuilder.visit(IdentificationVariable expression):

    if (queryContext.isRangeIdentificationVariable(variableName)) {
        if (selectStatement.hasGroupByClause() || selectStatement.hasHavingClause()  ||
			    variableName != queryContext.getFirstDeclaration().getVariableName()) {
            initializeReportQuery();
        } else {
            initializeReadAllQuery();
        }
    } else {
        initializeReportQuery();
    }

Test 1 creates a ReadAllQuery because org.eclipse.persistence.internal.jpa.jpql.JPQLQueryContext.isRangeIdentificationVariable(String variableName) returns true, but on Test 2, isRangeIdentificationVariable(String variableName) returns false, causing a ReportQuery to be created instead.

The "problem" seems to be in org.eclipse.persistence.internal.jpa.jpql.DeclarationResolver. When DeclarationResolver is created for Test 1, DeclarationResolver.declarations gets the following value:

    RangeDeclaration {
        identificationVariable	"e"
        rootPath	"Employee"
        type	null
    }

When DeclarationResolver is created for Test 2, DeclarationResolver.declarations gets the following value:

    RangeDeclaration {
        identificationVariable	"e"
        rootPath	"model.Employee"
        type	Class<T> (model.Employee)
    }

This causes org.eclipse.persistence.internal.jpa.jpql.DeclarationResolver.isRangeIdentificationVariableImp(String variableName) to behave differently as org.eclipse.persistence.internal.jpa.jpql.RangeDeclaration.getType() returns Type.CLASS_NAME when type is not NULL and Type.RANGE when it is NULL.

Interestingly, org.eclipse.persistence.internal.jpa.jpql.RangeDeclaration.type has the following comment that seems to validate this behavior difference by indicating that using fully qualified class names vs entity class names is an expected usecase:

    /**
     * If the "root" path is the fully qualified class name, then this will be set.
     */
    public Class<?> type;

    /**
     * Determines whether the "root" path is either the abstract schema name (entity name) or the
     * abstract schema name.
     *
     * @return <code>true</code> if it is the fully qualified class name; <code>false</code> if it
     * is the entity name
     */
    public boolean isFullyQualifiedClassName() {
        return type != null;
    }

I'm not sure yet WHY the use of a fully qualified class name difference is being tracked in the FROM clause. It would make some sense for the SELECT clause, possibly to allow for Constructor usage, but I'm not quite sure why it's important to note this difference in the FROM clause

dazey3 avatar Aug 20 '21 20:08 dazey3

As to the reported NPE though, it looks like maybe org.eclipse.persistence.queries.BatchFetchPolicy.dataResults is just written badly. Every other method that accesses BatchFetchPolicy.dataResults seems to check if == null. The NPE is occurring because org.eclipse.persistence.queries.BatchFetchPolicy.dataResults is null and there was no check before trying to get a value from it. I think a safe fix for now would be to patch that NPE issue and you can test if that resolves your usecase; regardless if a ReportQuery or ReadAllQuery is run, it shouldnt fail with a NPE.

dazey3 avatar Aug 20 '21 20:08 dazey3

@dazey3 Anserwing your questions in first comment: DB: IBM DB2 11.5 LUW Application Server: WAS 9.0.5.7 Code for creating JPQL (simplified class):

public abstract class GenericJpaDao<T extends Entity> {
    protected Class<T> type;

    public GenericJpaDao(Class<T> type) {
        this.type = type;
    }

    @PersistenceContext
    protected EntityManager em;

    public List<T> findAll() {
        return em.createQuery("select e from " + type.getName() + " e", type).getResultList();
    }
}

public class EmployeeJpaDaoGen extends GenericJpaDao<Employee> {
    public EmployeeJpaDaoGen() {
        super(Employee.class);
    }
}

Unfortunately I will not be able to test the fix because I will leave my company this thursday. Maybe some colleague can follow-up here.

TJDev avatar Aug 23 '21 11:08 TJDev

@TJDev Thank you for the information! Can you supply the persistence properties for the persistence unit? You can find these in the persistence.xml.

dazey3 avatar Aug 23 '21 14:08 dazey3

@dazey3 The only property we have specified is eclipselink.session.customizer for checking the database integrity with the following class.

public class EnableJpaIntegrityChecker implements SessionCustomizer {
   @Override
   public void customize(Session session) throws Exception {
      session.getIntegrityChecker().checkDatabase();
      session.getIntegrityChecker().setShouldCatchExceptions(true);
   }  
}

TJDev avatar Aug 23 '21 14:08 TJDev

It appears that the fix I introduced for the NullPointerException issue introduced a memory leak in BatchFetchPolicy.clone()

public BatchFetchPolicy clone() {
        BatchFetchPolicy clone = null;
        try {
            clone = (BatchFetchPolicy)super.clone();
        } catch (CloneNotSupportedException error) {
            throw new InternalError(error.getMessage());
        }
        if (clone.dataResults != null) {
            clone.dataResults.put(clone, clone.dataResults.get(this));
        }
        return clone;
    }

The issue with this implementation is that super.clone() will copy the Map reference for this.dataResults -> clone.dataResults. Then, clone.dataResults adds to that Map reference the clone reference with the ArrayList reference. This means that every time a BatchFetchPolicy is cloned, the Map of dataResults grows by 1 in size, all sharing the same ArrayList reference.

The reason this poorly implemented clone() method wasn't a problem before was because the initialization of BatchFetchPolicy.dataResults was done during BatchFetchPolicy.addDataResults(). Calling addDataResults() was done AFTER the clone() was already created and on the clone! The clone is then disposed at the end of query execution, so the original BatchFetchPolicy never had dataResults initialized!

With my fix change, now the BatchFetchPolicy constructor initializes BatchFetchPolicy.dataResults so now the original BatchFetchPolicy.dataResults != null. When the original policy is cloned, now the non-null dataResults is copied by reference to the clone AND the original.dataResults gets the clone reference added.

dazey3 avatar May 23 '22 22:05 dazey3

@dazey3 EclipseLink Version: 2.7.10 I have an entity as follows: class User { private Long id; private List<Address> addresses; } When I use the JPQL "select e from com.yaaidoata.User", the result is User{id=1, addresses=[]}. But, the real data is User{id=1, addresses=[{city=changsha, region=hunan}]}. I notice that you have fix the related NPE bug in PR#1045. I think the root cause may be when EntityMananger.createQuery(), and the query should be ReadAllQuery instead of ReportQuery. So can you help me to solve this problem. And if you want to know more details, I'm glad to provide more info. Thank you.

yaaidota avatar May 26 '22 02:05 yaaidota

@yaaidota

When I use the JPQL "select e from com.yaaidoata.User", the result is User{id=1, addresses=[]}. But, the real data is User{id=1, addresses=[{city=changsha, region=hunan}]}.

Can you provide a test that reproduces the failure? I am not able to reproduce this behavior locally.

EclipseLink Version: 2.7.10

Where you getting a NPE before 2.7.10? Are you reporting a regression in behavior on 2.7.10?

dazey3 avatar May 26 '22 15:05 dazey3

@dazey3

Can you provide a test that reproduces the failure? I am not able to reproduce this behavior locally.

Thank you for your reply, and I have reproduces the failure. The info is below:

  1. maven pom dependency: <groupId>org.eclipse.persistence</groupId> <artifactId>eclipselink</artifactId> 2.7.10

  2. persistence.xml

<persistence version="2.1" xmlns="http://xmlns.jcp.org/xml/ns/persistence" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://xmlns.jcp.org/xml/ns/persistence http://xmlns.jcp.org/xml/ns/persistence/persistence_2_1.xsd"> org.eclipse.persistence.jpa.PersistenceProvider false

    <properties>
        <property name="javax.persistence.jdbc.driver" value="com.mysql.jdbc.Driver"/>
        <property name="javax.persistence.jdbc.url" value="jdbc:mysql://localhost:3306/admin"/>
        <property name="javax.persistence.jdbc.user" value="root"/>
        <property name="javax.persistence.jdbc.password" value="123456"/>

    </properties>
</persistence-unit>
  1. Entities: @Data @Entity @Table(name = "test_user") public class TestUser {

    @Id private Long id;

    @OneToMany(fetch = FetchType.EAGER, cascade = CascadeType.ALL, orphanRemoval = true) @JoinColumn(name = "user_seq_id", referencedColumnName = "id") @BatchFetch(BatchFetchType.IN) private List<Address> addresses; }

@Data @Entity @Table(name = "address") public class Address {

@Id
private Long id;

@Column(name = "user_seq_id")
private Long userSeqId;

private String city;

private String region;

}

  1. Test result: 4.1 use simple name: public static void main(String[] args) { EntityManagerFactory emf = Persistence.createEntityManagerFactory("user-test"); EntityManager eman = emf.createEntityManager();

     try {
    
         String sql = "SELECT e FROM TestUser e";
    
         Query query = eman.createQuery(sql);
         List<TestUser> users = query.getResultList();
    
         System.out.println(users);
    
     } finally {
    
         eman.close();
         emf.close();
     }
    

    }

output: [TestUser(id=1, addresses=[Address(id=1, userSeqId=1, city=changsha, region=hunan)])]

4.2 user full name: public static void main(String[] args) { EntityManagerFactory emf = Persistence.createEntityManagerFactory("user-test"); EntityManager eman = emf.createEntityManager();

    try {

        String sql = "SELECT e FROM com.xx.xx.TestUser e";

        Query query = eman.createQuery(sql);
        List<TestUser> users = query.getResultList();

        System.out.println(users);

    } finally {

        eman.close();
        emf.close();
    }
}

output: [TestUser(id=1, addresses=[])]

yaaidota avatar May 26 '22 17:05 yaaidota

First off, EclipseLink @BatchFetch(BatchFetchType.IN) is EclipseLink API documented here: https://www.eclipse.org/eclipselink/documentation/2.7/jpa/extensions/annotations_ref.htm#CHDCCIDA

The issue is that when using

    String sql = "SELECT e FROM model.TestUser e";

EclipseLink creates a org.eclipse.persistence.queries.ReportQuery and when using

    String sql = "SELECT e FROM TestUser e";

EclipseLink creates a org.eclipse.persistence.queries.ReadAllQuery

That decision is made here: https://github.com/eclipse-ee4j/eclipselink/blob/f905f7c7c7a8c6234313fcc320db27933f5f308b/foundation/org.eclipse.persistence.core/src/org/eclipse/persistence/internal/jpa/jpql/ReadAllQueryBuilder.java#L111-L125

and that decision is influenced by https://github.com/eclipse-ee4j/eclipselink/blob/f905f7c7c7a8c6234313fcc320db27933f5f308b/foundation/org.eclipse.persistence.core/src/org/eclipse/persistence/internal/jpa/jpql/RangeDeclaration.java#L46-L49

https://github.com/eclipse-ee4j/eclipselink/blob/f905f7c7c7a8c6234313fcc320db27933f5f308b/foundation/org.eclipse.persistence.core/src/org/eclipse/persistence/internal/jpa/jpql/RangeDeclaration.java#L137-L151


Why is this ReadAllQuery vs ReportQuery making behavior change for BatchFetchPolicy?

The reason why this changes behavior can be traced to ReportQuery.executeDatabaseQuery() vs ReadAllQuery.executeDatabaseQuery()

The call to ReadAllQuery.executeDatabaseQuery() eventually leads to this: https://github.com/eclipse-ee4j/eclipselink/blob/f905f7c7c7a8c6234313fcc320db27933f5f308b/foundation/org.eclipse.persistence.core/src/org/eclipse/persistence/queries/ReadAllQuery.java#L591-L594

This this.batchFetchPolicy.setDataResults(rows); is extremely important much, much later in https://github.com/eclipse-ee4j/eclipselink/blob/f905f7c7c7a8c6234313fcc320db27933f5f308b/foundation/org.eclipse.persistence.core/src/org/eclipse/persistence/mappings/ForeignReferenceMapping.java#L565-L574

For ReadAllQuery, there are results returned from originalPolicy.getDataResults(this);. This causes EclipseLink to execute a followup query on the ADDRESS table and everything is fine. For ReportQuery, this.batchFetchPolicy.setDataResults(rows); was never called! So there are no results here and an empty List gets set for TestUser.addresses.


So what's the fix? I honestly don't know.

The comments on RangeDeclaration and logic in ReadAllQueryBuilder indicate a strong reason for ReadAllQuery vs ReportQuery! It seems unlikely that making a change there is correct. Perhaps ReportQuery.executeDatabaseQuery() needs

    // Batch fetching in IN requires access to the rows to build the id array.
    if ((this.batchFetchPolicy != null) && this.batchFetchPolicy.isIN()) {
        this.batchFetchPolicy.setDataResults(rows);
    }

added somewhere in the chain? Perhaps, but I don't know where or how.

At the moment, the work around of "don't use absolute, fully qualified names in your JPQL" seems like a reasonable solution. Also, worth noting is that this is a bug in EclipseLink's thirdparty BatchFetch API, so the JPA specification says nothing on this usecase.

dazey3 avatar May 27 '22 00:05 dazey3

Yes, I total agree with you about the cause chain analysis of the abnormal behavior when I use full qualified names in JPQL. And I now use simple name in my application. But, should we doubt the logic in this method? `public void visit(IdentificationVariable expression) {

    // Use ReadAllQuery if the variable of the SELECT clause expression is the base variable
    // Example: ReadAllQuery = SELECT e FROM Employee e
    // Example: ReportQuery  = SELECT e FROM Department d JOIN d.employees e
    String variableName = expression.getVariableName();

    if (queryContext.isRangeIdentificationVariable(variableName)) {

        if (selectStatement.hasGroupByClause() ||
            selectStatement.hasHavingClause()  ||
            variableName != queryContext.getFirstDeclaration().getVariableName()) {

            initializeReportQuery();
        }
        else {
            initializeReadAllQuery();
        }
    }
    else {
        initializeReportQuery();
    }
}`

As the notes there, when use "SELECT e FROM Employee e", the method of initializeReadAllQuery() should be invoked. But, actually, when use "SELECT e FROM xx.xx.x.Employee e", initializeReportQuery() is invoked. As you mentioned before, the reason is that the related Type is not null. Then, it rises another greater question, why the type should influence the behavior there? And whether the logic is reasonable?

yaaidota avatar May 27 '22 01:05 yaaidota

why the type should influence the behavior there? And whether the logic is reasonable?

I don't know why the code makes this decision in the first place. However, it seems to be intentionally set this way (for an unknown reason), which makes me loath to change it.

https://github.com/eclipse-ee4j/eclipselink/blob/f905f7c7c7a8c6234313fcc320db27933f5f308b/foundation/org.eclipse.persistence.core/src/org/eclipse/persistence/internal/jpa/jpql/RangeDeclaration.java#L46-L49

It seems like the code is intending to set the type if the fully qualified path is used. Why does the Range type end up influencing Query type?? No idea, but obviously intentional.

dazey3 avatar May 27 '22 16:05 dazey3

Yes, I also agree that the type is intentionally set to CLASS_NAME. But, I notice that the type of CLASS_NAME is not range, which indicate that this type is not a range variable declaration. I have referred Java Persistence Chapter in Java EE Tutorial (https://docs.oracle.com/cd/E19226-01/820-7627/bnbum/index.html), which there has a following description: Range Variable Declarations To declare an identification variable as an abstract schema type, you specify a range variable declaration. In other words, an identification variable can range over the abstract schema type of an entity According to this specification, I think the type of CLASS_NAME in eclipse-link is a very special type, which cannot represent the abstract schema of the entity. So, it seems that use full qualified class name in eclipse-link is inappropriate. While, I have to admit that when I do not use BatchFetchType.IN, the application work well. And in Java Persistence Specifications, there have no description to indicate whether we should use simple name or full qualified name.

yaaidota avatar May 27 '22 17:05 yaaidota