spring-data-mongodb icon indicating copy to clipboard operation
spring-data-mongodb copied to clipboard

org.springframework.data.mongodb.core.MongoTemplate#executeFindMultiInternal should be optimized

Open constanine opened this issue 3 years ago • 1 comments

In my case, when pagequery the collections in mongodb, which size more than 1000, such as mongoTemplate.find(new Query().limit(1000),xxx.class) the cost time of querying is too much, and then I find some reason in the code as:

  private <T> List<T> executeFindMultiInternal(CollectionCallback<FindIterable<Document>> collectionCallback,
      CursorPreparer preparer, DocumentCallback<T> documentCallback, String collectionName) {
    try {
      try (MongoCursor<Document> cursor = preparer
          .initiateFind(getAndPrepareCollection(doGetDatabase(), collectionName), collectionCallback::doInCollection)
          .iterator()) {
        List<T> result = new ArrayList<>();
        while (cursor.hasNext()) {
          Document object = cursor.next();
          result.add(documentCallback.doWith(object));
        }
        return result;
      }
    } catch (RuntimeException e) {
      throw potentiallyConvertRuntimeException(e, exceptionTranslator);
    }
  }

The point is new ArrayList() ?! , which length of the list is 1000,this means the list will be copyed 18 times, I think the funtion getAndPrepareCollection() which one returns the bean of MongoCollection.java,and this bean has a funtion called countDocuments(), so how about code like this:

  private <T> List<T> executeFindMultiInternal(CollectionCallback<FindIterable<Document>> collectionCallback,
      CursorPreparer preparer, DocumentCallback<T> documentCallback, String collectionName) {
    try {
	  MongoCollection collections = getAndPrepareCollection(doGetDatabase(), collectionName)
      try (MongoCursor<Document> cursor = preparer
          .initiateFind(collections, collectionCallback::doInCollection)
          .iterator()) {
        List<T> result = new ArrayList<>(collections.countDocuments());
        while (cursor.hasNext()) {
          Document object = cursor.next();
          result.add(documentCallback.doWith(object));
        }
        return result;
      }
    } catch (RuntimeException e) {
      throw potentiallyConvertRuntimeException(e, exceptionTranslator);
    }
  }

The code would reduce the funtcion:java.util.ArrayList#grow be called and save time

constanine avatar Jan 19 '22 08:01 constanine

Introducing a call to collections.countDocuments() introduces another I/O operation that outweighs its benefit if the result is empty or contains a small number of documents. Why don't you use the stream() method that bypasses ArrayList creation entirely?

mp911de avatar Jan 24 '22 07:01 mp911de