meteor-file-collection icon indicating copy to clipboard operation
meteor-file-collection copied to clipboard

Working on multiple instances?

Open dnish opened this issue 9 years ago • 38 comments

Hey, we are using "CollectionFS" at the moment and having big issues with server crashes because we are running multiple Meteor instances to scale our application. Is File-Collection ready for multiple instances or will it have the same problems like CollectionFS?

dnish avatar Oct 29 '15 16:10 dnish

File-collection is much simpler than CollectionFS, as it is built directly on top of gridFS. You don't mention what kinds of crashes you are encountering, or even if you are using CollectionFS with gridFS as the backing store. If you are running into concurrency issues (gridFS is not concurrency friendly out of the box) then you may be in luck, because file-collection uses a gridFS locking package to ensure robust concurrent access. Not knowing anything about your app, I don't know if that may be part of the issues you are seeing or not. Basically it's impossible for me to answer your question because I don't know what your application does, how it works or how you've engineered the scalability/multiple instance support internally (or how you have your mongo server(s) configured...). What I can say is that in its simplicity (assuming it meets your other needs) file-collection reduces the number of variables you need to consider, and explicitly anticipates concurrent access issues that CollectionFS doesn't (yet).

But the only way for you to definitively answer this question is to try it with your app in your environment and beat on it to see what happens.

To that end, you may find this discussion helpful: https://github.com/vsivsi/meteor-file-collection/issues/62

vsivsi avatar Oct 29 '15 17:10 vsivsi

Hey, thank you for your answer. I've just asked because I thought that maybe someone already has a productional app and is running multiple instances with file-collection. We use meteor-cluster for our productional environment and actually running 3 servers (instances) with a MongoDB replica set.

The CollectionFS problem is this one https://github.com/CollectionFS/Meteor-CollectionFS/issues/731. We upload profile images to our app and store them in our GridFS collection. I've just seen that some guy did a "file-queue-worker" plugin for CollectionFS to make it work with multiple instances, but it isn't officially supported.

If I understand it right, CollectionFS works with observers to maintain the uploads, and that's the problem with multiple instances. So I just want to know if File-Collection uses another architecture (just saw your package job-collection) to maintain the uploading process - but as you already said, the best way will be to test it in our environment.

dnish avatar Oct 29 '15 18:10 dnish

I've never seen those types of errors using file-collection. It uses a different upload architecture from collectionFS. If I recall correctly, collectionFS streams uploaded data (chunks) into the local file system, and then when the upload is complete, it copies (streams) the chunks into a single gridFS file in the correct order.

file-collection works in one of two ways. If you use the built-in resumable.js support, then chunks are uploaded and stored as temporary files within gridFS. A constraint is that the resumable.js chunk size and the gridFS chunk size must be equal. Then, once all chunks have been received, file-collection simply creates a zero length file (under write lock) and re-numbers all of the resumable.js chunks to become the chunks of the whole file. So no copying of any kind happens (although the entire file is read by mongo to calculate the correct new md5 sum).

The other way is if you don't use resumable.js, then the uploaded file streams directly from the http connection socket into the gridFS file (under write lock). The file shows up as zero length until the transfer is complete and the lock is released.

So yeah, file-collection and collectionFS do upload completely differently. I wrote file-collection because collectionFS was too slow and sometimes unstable when dealing with large (multi-GB) files that I was using. I tried working on their gridFS backend, but the lack of locking for concurrency and overall complexity of the package (and its dozens of subpackages) convinced me that it would be easier to start from scratch and build a simpler, gridFS focused package. I hope it works for you!

vsivsi avatar Oct 29 '15 20:10 vsivsi

BTW, file-collection has quite a few users, and I've never had a single multiple instance issue reported. So either there aren't any obvious problems, or nobody has tried it (which seems unlikely). I'll leave this issue open. Please report back here with your findings so that I can have a place to send people with similar questions in the future! Thanks.

vsivsi avatar Oct 29 '15 20:10 vsivsi

Sorry to ask you again, but is there an "optimal/integrated" way to upload data uris? Our users select a picture which is converted into a base64 data uri. So we don't use any drag&drop or form elements in our application (Cordova). After some research I would have to do it like this way:

Convert data uri into blob & adding it to a hidden form element: http://stackoverflow.com/questions/4998908/convert-data-uri-to-file-then-append-to-formdata

Then I've to add it via myFiles.resumable.addFile(file);

dnish avatar Oct 29 '15 21:10 dnish

Look in here: https://github.com/vsivsi/meteor-file-collection/issues/29

vsivsi avatar Oct 29 '15 23:10 vsivsi

@dnish, I also had instability issues, a few server crashes due to CollectionFS and the problem of not being able to run multiple server instances. Replaced CFS in production (internal corporate app) and it is now running without a glitch. Faster uploads. Lower server memory usage. If the only thing you used CFS for is to upload files into gridFS, them my opinion it to not look back. And if you give the collection name the same name as the current CFS name (i.e., "cfs_gridfs.name") then you don't even have to worry about your current data. After replacing you can just delete the cfs gridfs collections not required by file-collection.

Many thanks to Vaughn.

hluz avatar Oct 30 '15 00:10 hluz

Thank you for your answers. Was Cordova support merged to the main package (@riaan53)? My files are all empty when I try to upload with a Cordova app. This is the url he calls when starting upload:

http://meteor.local/gridfs/sentfiles2/_resumable?resumableChunkNumber=1&resumableChunkSize=2096128&resumableCurrentChunkSize=33080&resumableTotalSize=33080&resumableType=image%2Fjpeg&resumableIdentifier=dcd1e1279320899e1801f307&resumableFilename=sentImage.jpg&resumableRelativePath=sentImage.jpg&resumableTotalChunks=1

It responses with the whole Meteor head/body HTML document.

dnish avatar Nov 02 '15 15:11 dnish

This may be related... https://github.com/23/resumable.js/issues/252

vsivsi avatar Nov 02 '15 20:11 vsivsi

I'll tomorrow take a look into the source code any try to implement some Cordova settings. Will send you a PR if it works without problems.

dnish avatar Nov 02 '15 21:11 dnish

At the moment I'm facing the following problem, if I change my base url to this:

resumable_client.coffee

url = "#{@baseURL}/_resumable"
url = Meteor.absoluteUrl(url) if Meteor.isCordova
r = new Resumable
 target: url
 ...

the uploaded file get's added to my "sentfiles2" collection. But I get a 404 at this link:

Request URL: http://192.168.178.34:3000/gridfs/sentfiles2/_resumable?resumableChunkNumber=1&resumableChunkSize=2096128&resumableCurrentChunkSize=42448&resumableTotalSize=42448&resumableType=image%2Fjpeg&resumableIdentifier=5a0a5fc5b8acb9987c45dda4&resumableFilename=sentImage.jpg&resumableRelativePath=sentImage.jpg&resumableTotalChunks=1

He throws the 404 in http_access_server.coffee at 289:

# Perform the collection query
                  req.gridFS = @findOne lookup
                  unless req.gridFS
                     res.writeHead(404, {'Content-Type':'text/plain','Access-Control-Allow-Origin': 'http://meteor.local'})

                     res.end()
                     return

At this time, lookup has the value {md5:'_resumable'}. Shouldn't it have the _id value of the uploaded document?

This is my client side code:

  SentFiles2.resumable.on('fileAdded', function (file) {


            SentFiles2.insert({
                    _id: file.uniqueIdentifier,  // This is the ID resumable will use
                    filename: file.fileName,
                    contentType: file.file.type
                },
                function (err, _id) {  // Callback to .insert
                    if (err) { return console.error("File creation failed!", err); }

                    window.setTimeout(function() {
                        SentFiles2.resumable.upload(); //Files is created, timeout was for testing sync problems
                    },10000);
                }
            );
        });

dnish avatar Nov 03 '15 16:11 dnish

I think you almost have it. You need to add the suffix /_resumable to your target setting. Like is done here: https://github.com/vsivsi/meteor-file-collection/blob/master/src/resumable_client.coffee#L29

Without that, the resumable chunk requests are being routed to the wrong request handlers.

vsivsi avatar Nov 03 '15 17:11 vsivsi

@vsivsi Uff...sorry, I forgot to post the value of "url" (now edited). It's already added, but gives me that 404 error.

dnish avatar Nov 03 '15 17:11 dnish

In fact, if you are configuring your own Resumable object, it needs to match all of the options as the one being declared at the link above...

vsivsi avatar Nov 03 '15 17:11 vsivsi

It's the same object like the original one, except that I've added the "target" parameter to it. The only changes I did so far are adding the access headers and setting the target url.

dnish avatar Nov 03 '15 17:11 dnish

And you're setting resumable: true on the server-side file collection?

vsivsi avatar Nov 03 '15 17:11 vsivsi

Yep, this are my collection settings:

SentFiles2 = new FileCollection('sentfiles2',
    { resumable: true,   // Enable built-in resumable.js upload support
        baseURL:"/gridfs/sentfiles2",
        http: [
            { method: 'get',
                path: '/:md5',  // this will be at route "/gridfs/myFiles/:md5"
                lookup: function (params, query) {  // uses express style url params
                    return { md5: params.md5 };       // a query mapping url to myFiles
                }
            }
        ]
    }
);

dnish avatar Nov 03 '15 17:11 dnish

What is the value of urlthat you are using to set the target in your Resumable object?

vsivsi avatar Nov 03 '15 17:11 vsivsi

It's the following:

url = "#{@baseURL}/_resumable"
url = Meteor.absoluteUrl(url) if Meteor.isCordova

So I get http://192.168.178.34:3000/gridfs/sentfiles2/_resumable as target url on my Cordova device (it's linked to my dev version on the Mac).

dnish avatar Nov 03 '15 17:11 dnish

It should be http://192.168.178.34:3000/gridfs/sentfiles2/_resumable

The problem you are having is that the resumable requests are being routed to the md5 handler you are declaring rather than matching the resumable route defined here:

https://github.com/vsivsi/meteor-file-collection/blob/master/src/resumable_server.coffee#L242

And added to the route list (before any user defined routes) here: https://github.com/vsivsi/meteor-file-collection/blob/master/src/http_access_server.coffee#L369

So either the client side route that is defined for resumable is still wrong somehow. Or the server side route is getting screwed up somehow, but it doesn't sound like you've made any server changes...

vsivsi avatar Nov 03 '15 17:11 vsivsi

Argh sorry, but yeah I missed the _resumable part here, so it is in the target url. I've just seen that he adds this to my sentfiles2 collection, when I start uploading a file:

{
    "_id": ObjectId('c36c874a4afa0f8e40952d59'),
    "length": 0,
    "md5": "d41d8cd98f00b204e9800998ecf8427e",
    "uploadDate": ISODate('2015-11-03T17:44:27.866Z'),
    "chunkSize": 2096128,
    "filename": "sentImage.jpg",
    "metadata": {},
    "aliases": [],
    "contentType": "image/jpeg"
}

The interesting here is, that all of my files have length 0 and all the same md5 value (but they are different images). Every file has a different ResumableFile object like this (every file has another size, so I'm wondering why md5 and length all the same in my collection)

{
  size: 25689,
  file: Blob,
  fileName: 'sentImage.jpeg',
  isUploading: function(),
  [...]
}

dnish avatar Nov 03 '15 17:11 dnish

When you insert the file into the collection on the client side, it is created as a zero length file on the server, pending the completion of the resumable chunk uploads. All zero length files have the same md5 sum. When the resumable chunks are uploaded, they are placed in temporary gridFS files, and once they all arrive, the chunks are renumbered and assigned to the real file and the md5 sum is updated.

The issue is that the chunk uploads are failing because they aren't ending up on the proper route server side.

vsivsi avatar Nov 03 '15 18:11 vsivsi

Just to be clear, the file document you are seeing is coming from the SentFiles2.insert(...) not from the resumable upload. This is correct behavior.

vsivsi avatar Nov 03 '15 18:11 vsivsi

When your resumable chunks are being successfully uploaded, you will see files named _Resumable_<a bunch of numbers> in your collection until the upload of all chunks completes. You aren't getting that far yet, it seems.

vsivsi avatar Nov 03 '15 18:11 vsivsi

I've just logged the file object of SentFiles2.resumable.on('fileAdded', function (file) { ... });

After some correction, my 404 status code is now changed to 204

Remote Address:192.168.178.34:3000
Request URL:http://192.168.178.34:3000/gridfs/sentfiles2/_resumable?resumableChunkNumber=1&resumableChunkSize=2096128&resumableCurrentChunkSize=42448&resumableTotalSize=42448&resumableType=image%2Fjpeg&resumableIdentifier=c41cca8fdd527858d74a26b8&resumableFilename=sentImage.jpg&resumableRelativePath=sentImage.jpg&resumableTotalChunks=1
Request Method:HEAD
Status Code:204 No Content

After this request, I get 10 requests with a 404 error on

Remote Address:192.168.178.34:3000
Request URL:http://192.168.178.34:3000/gridfs/sentfiles2/_resumable
Request Method:OPTIONS
Status Code:404 Not Found

It's in http_access_server.coffee (line 320):

     @router.route('/*')
         .all (req, res, next) ->  # Make sure a file has been selected by some rule
            unless req.gridFS
               res.writeHead(404, {'Content-Type':'text/plain','Access-Control-Allow-Origin': 'http://meteor.local'})
               res.end()
               return
            next()

I'll check the req object tomorrow.

dnish avatar Nov 03 '15 18:11 dnish

The 204 responses are correct! That is the resumable client asking the server if it has that chunk yet, and the server responding: No.

vsivsi avatar Nov 03 '15 18:11 vsivsi

https://github.com/vsivsi/meteor-file-collection/blob/master/src/resumable_server.coffee#L237

vsivsi avatar Nov 03 '15 18:11 vsivsi

I don't know anything about that OPTIONS request. Never seen it before from resumable. Is that related to CORS support or something? There is no handler for that in my server-side resumable support, so that may be an issue.

vsivsi avatar Nov 03 '15 18:11 vsivsi

Yeah correct, it is related to CORS. Think I've to response with "200" to make it work. The browser checks if the server allows CORS.

In my example, I got the 204 response as I've changed the lookup function

  lookup: function (params, query) {  
                   //Just for a test
                    return { md5: "d41d8cd98f00b204e9800998ecf8427e" };  //this MD5 exists in the collection.
                }

dnish avatar Nov 03 '15 18:11 dnish

Hi there,

Sorry been busy with lots of other things and havnt got to the pr for cordova support. Will try to find it, its not ready for a merge and still on a older code base.

Just remember that doing this in js on cordova is very cpu intensive for the mobile device. But you can write a plugin or modify the Cordova file transfer one to do it natively that works with this package if you want later.

Regards, Riaan

riaan53 avatar Nov 03 '15 18:11 riaan53