Mongo.Migration

Performance concern in StartUpDocumentMigrationRunner.cs

Open iamzhaoxu opened this issue 4 years ago • 1 comments

Hi,

I have a concern about the code below in StartUpDocumentMigrationRunner.cs. As you can see, we load all the documents that require migration through a cursor, which is good. However, after building each ReplaceOneModel, the code pushes every WriteModel<BsonDocument> into a list variable, "bulk".

If the number of documents is small, this is fine. But if we want to migrate millions of documents, won't memory become a concern, since millions of write models will be sitting in memory until the single BulkWrite call at the end?

        public void RunAll()
        {
            var locations = _collectionLocator.GetLocatesOrEmpty();

            foreach (var locate in locations)
            {
                var information = locate.Value;
                var type = locate.Key;
                var databaseName = GetDatabaseOrDefault(information);
                var collectionVersion = _documentVersionService.GetCollectionVersion(type);

                var collection = _client.GetDatabase(databaseName)
                    .GetCollection<BsonDocument>(information.Collection);

                var bulk = new List<WriteModel<BsonDocument>>();

                var query = CreateQueryForRelevantDocuments(type);

                using (var cursor = collection.FindSync(query))
                {
                    while (cursor.MoveNext())
                    {
                        var batch = cursor.Current;
                        foreach (var document in batch)
                        {
                            _migrationRunner.Run(type, document, collectionVersion);

                            var update = new ReplaceOneModel<BsonDocument>(
                                new BsonDocument {{"_id", document["_id"]}},
                                document
                            );

                            bulk.Add(update);
                        }
                    }
                }

                if (bulk.Count > 0) collection.BulkWrite(bulk);
            }
        }
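
To make the concern concrete, here is a minimal sketch of one way the list could be bounded: flush "bulk" every N documents instead of once at the end. This is not code from Mongo.Migration. The class name BatchedMigration, the migrate delegate (standing in for _migrationRunner.Run) and the flushSize parameter are all hypothetical, and 1000 is an arbitrary default.

```csharp
using System;
using System.Collections.Generic;
using MongoDB.Bson;
using MongoDB.Driver;

// Hypothetical helper, not part of Mongo.Migration.
public static class BatchedMigration
{
    // Migrates every document matching `query` and writes the results back
    // in chunks of at most `flushSize`, so the pending list stays bounded.
    // Returns the number of replace operations submitted.
    public static long Run(
        IMongoCollection<BsonDocument> collection,
        FilterDefinition<BsonDocument> query,
        Action<BsonDocument> migrate,   // assumed to mutate the document in place
        int flushSize = 1000)           // arbitrary default, tune as needed
    {
        if (flushSize <= 0) throw new ArgumentOutOfRangeException(nameof(flushSize));

        var bulk = new List<WriteModel<BsonDocument>>(flushSize);
        long submitted = 0;

        using (var cursor = collection.FindSync(query))
        {
            while (cursor.MoveNext())
            {
                foreach (var document in cursor.Current)
                {
                    migrate(document);

                    bulk.Add(new ReplaceOneModel<BsonDocument>(
                        new BsonDocument { { "_id", document["_id"] } },
                        document));

                    // Flush as soon as the chunk is full, then reuse the list.
                    if (bulk.Count >= flushSize)
                    {
                        collection.BulkWrite(bulk);
                        submitted += bulk.Count;
                        bulk.Clear();
                    }
                }
            }
        }

        // Write whatever is left over from the last partial chunk.
        if (bulk.Count > 0)
        {
            collection.BulkWrite(bulk);
            submitted += bulk.Count;
        }

        return submitted;
    }
}
```

With this shape, at most flushSize write models are held at any time. One trade-off to be aware of: writes now happen while the cursor is still open, so this is only safe if a migrated document no longer matches the query, or if reprocessing a document is harmless.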

iamzhaoxu avatar Oct 21 '21 23:10 iamzhaoxu

Is there any update on this topic? Is there a preferred design that avoids holding all the documents in memory?

Is it possible to inject our own StartUpDocumentMigrationRunner?

rpallares avatar Oct 26 '23 14:10 rpallares