meteor-file-collection
Working on multiple instances?
Hey, we are using "CollectionFS" at the moment and are having big issues with server crashes because we run multiple Meteor instances to scale our application. Is File-Collection ready for multiple instances, or will it have the same problems as CollectionFS?
File-collection is much simpler than CollectionFS, as it is built directly on top of gridFS. You don't mention what kinds of crashes you are encountering, or even if you are using CollectionFS with gridFS as the backing store. If you are running into concurrency issues (gridFS is not concurrency friendly out of the box) then you may be in luck, because file-collection uses a gridFS locking package to ensure robust concurrent access. Not knowing anything about your app, I don't know if that may be part of the issues you are seeing or not. Basically it's impossible for me to answer your question because I don't know what your application does, how it works or how you've engineered the scalability/multiple instance support internally (or how you have your mongo server(s) configured...). What I can say is that in its simplicity (assuming it meets your other needs) file-collection reduces the number of variables you need to consider, and explicitly anticipates concurrent access issues that CollectionFS doesn't (yet).
But the only way for you to definitively answer this question is to try it with your app in your environment and beat on it to see what happens.
To that end, you may find this discussion helpful: https://github.com/vsivsi/meteor-file-collection/issues/62
Hey,
thank you for your answer. I've just asked because I thought that maybe someone already has a production app and is running multiple instances with file-collection. We use meteor-cluster for our production environment and are currently running 3 servers (instances) with a MongoDB replica set.
The CollectionFS problem is this one https://github.com/CollectionFS/Meteor-CollectionFS/issues/731. We upload profile images to our app and store them in our GridFS collection. I've just seen that some guy did a "file-queue-worker" plugin for CollectionFS to make it work with multiple instances, but it isn't officially supported.
If I understand it right, CollectionFS works with observers to maintain the uploads, and that's the problem with multiple instances. So I just want to know if File-Collection uses another architecture (just saw your package job-collection) to maintain the uploading process - but as you already said, the best way will be to test it in our environment.
I've never seen those types of errors using file-collection. It uses a different upload architecture from collectionFS. If I recall correctly, collectionFS streams uploaded data (chunks) into the local file system, and then when the upload is complete, it copies (streams) the chunks into a single gridFS file in the correct order.
file-collection works in one of two ways. If you use the built-in resumable.js support, then chunks are uploaded and stored as temporary files within gridFS. A constraint is that the resumable.js chunk size and the gridFS chunk size must be equal. Then, once all chunks have been received, file-collection simply creates a zero length file (under write lock) and re-numbers all of the resumable.js chunks to become the chunks of the whole file. So no copying of any kind happens (although the entire file is read by mongo to calculate the correct new md5 sum).
The other way is if you don't use resumable.js, then the uploaded file streams directly from the http connection socket into the gridFS file (under write lock). The file shows up as zero length until the transfer is complete and the lock is released.
So yeah, file-collection and collectionFS do upload completely differently. I wrote file-collection because collectionFS was too slow and sometimes unstable when dealing with large (multi-GB) files that I was using. I tried working on their gridFS backend, but the lack of locking for concurrency and overall complexity of the package (and its dozens of subpackages) convinced me that it would be easier to start from scratch and build a simpler, gridFS focused package. I hope it works for you!
BTW, file-collection has quite a few users, and I've never had a single multiple instance issue reported. So either there aren't any obvious problems, or nobody has tried it (which seems unlikely). I'll leave this issue open. Please report back here with your findings so that I can have a place to send people with similar questions in the future! Thanks.
Sorry to ask you again, but is there an "optimal/integrated" way to upload data URIs? Our users select a picture, which is converted into a base64 data URI. So we don't use any drag & drop or form elements in our application (Cordova). After some research, it seems I would have to do it this way:
Convert the data URI into a blob and add it to a hidden form element: http://stackoverflow.com/questions/4998908/convert-data-uri-to-file-then-append-to-formdata
Then I have to add it via myFiles.resumable.addFile(file);
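For reference, the conversion step described above might look roughly like the following. This is only a sketch under assumptions: the helper names decodeBase64DataURI and dataURIToBlob are made up for illustration, only base64 data URIs are handled, and the resulting Blob would still need to be handed to resumable (for example via the addFile call mentioned above), which is not shown here.

```javascript
// Hypothetical helper: split a base64 data URI into its MIME type and raw bytes.
function decodeBase64DataURI(dataURI) {
  const match = /^data:([^,]*?);base64,(.*)$/.exec(dataURI);
  if (!match) throw new Error('Expected a base64 data URI');
  // Drop any parameters such as ";charset=utf-8" from the media type.
  const mimeType = match[1].split(';')[0] || 'text/plain';
  const binary = atob(match[2]); // one character per byte
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return { mimeType, bytes };
}

// Hypothetical helper: wrap the decoded bytes in a Blob with the right type.
function dataURIToBlob(dataURI) {
  const { mimeType, bytes } = decodeBase64DataURI(dataURI);
  return new Blob([bytes], { type: mimeType });
}
```

Splitting the decoding from the Blob construction keeps the byte handling easy to check on its own, without a browser or device.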
Look in here: https://github.com/vsivsi/meteor-file-collection/issues/29
@dnish, I also had instability issues, a few server crashes due to CollectionFS, and the problem of not being able to run multiple server instances. I replaced CFS in production (internal corporate app) and it is now running without a glitch. Faster uploads. Lower server memory usage. If the only thing you used CFS for is to upload files into gridFS, then my opinion is to not look back. And if you give the collection the same name as the current CFS name (i.e., "cfs_gridfs.name") then you don't even have to worry about your current data. After replacing, you can just delete the cfs gridfs collections not required by file-collection.
Many thanks to Vaughn.
Thank you for your answers. Was Cordova support merged into the main package (@riaan53)? My files are all empty when I try to upload with a Cordova app. This is the URL it calls when starting the upload:
http://meteor.local/gridfs/sentfiles2/_resumable?resumableChunkNumber=1&resumableChunkSize=2096128&resumableCurrentChunkSize=33080&resumableTotalSize=33080&resumableType=image%2Fjpeg&resumableIdentifier=dcd1e1279320899e1801f307&resumableFilename=sentImage.jpg&resumableRelativePath=sentImage.jpg&resumableTotalChunks=1
It responds with the whole Meteor head/body HTML document.
This may be related... https://github.com/23/resumable.js/issues/252
Tomorrow I'll take a look at the source code and try to implement some Cordova settings. I will send you a PR if it works without problems.
At the moment I'm facing the following problem, if I change my base url to this:
resumable_client.coffee
    url = "#{@baseURL}/_resumable"
    url = Meteor.absoluteUrl(url) if Meteor.isCordova
    r = new Resumable
      target: url
      ...
the uploaded file gets added to my "sentfiles2" collection. But I get a 404 at this link:
Request URL: http://192.168.178.34:3000/gridfs/sentfiles2/_resumable?resumableChunkNumber=1&resumableChunkSize=2096128&resumableCurrentChunkSize=42448&resumableTotalSize=42448&resumableType=image%2Fjpeg&resumableIdentifier=5a0a5fc5b8acb9987c45dda4&resumableFilename=sentImage.jpg&resumableRelativePath=sentImage.jpg&resumableTotalChunks=1
The 404 is thrown in http_access_server.coffee at line 289:
    # Perform the collection query
    req.gridFS = @findOne lookup
    unless req.gridFS
      res.writeHead(404, {'Content-Type':'text/plain','Access-Control-Allow-Origin': 'http://meteor.local'})
      res.end()
      return
At this time, lookup has the value {md5:'_resumable'}. Shouldn't it have the _id value of the uploaded document?
This is my client side code:
    SentFiles2.resumable.on('fileAdded', function (file) {
      SentFiles2.insert({
          _id: file.uniqueIdentifier, // This is the ID resumable will use
          filename: file.fileName,
          contentType: file.file.type
        },
        function (err, _id) { // Callback to .insert
          if (err) { return console.error("File creation failed!", err); }
          window.setTimeout(function() {
            SentFiles2.resumable.upload(); // File is created, timeout was for testing sync problems
          }, 10000);
        }
      );
    });
I think you almost have it. You need to add the suffix /_resumable to your target setting, as is done here: https://github.com/vsivsi/meteor-file-collection/blob/master/src/resumable_client.coffee#L29
Without that, the resumable chunk requests are being routed to the wrong request handlers.
@vsivsi Uff...sorry, I forgot to post the value of "url" (now edited). It's already added, but gives me that 404 error.
In fact, if you are configuring your own Resumable object, it needs to match all of the options of the one declared at the link above...
It's the same object as the original one, except that I've added the "target" parameter to it. The only changes I've made so far are adding the access headers and setting the target url.
And you're setting resumable: true on the server-side file collection?
Yep, these are my collection settings:
    SentFiles2 = new FileCollection('sentfiles2',
      { resumable: true, // Enable built-in resumable.js upload support
        baseURL: "/gridfs/sentfiles2",
        http: [
          { method: 'get',
            path: '/:md5', // this will be at route "/gridfs/myFiles/:md5"
            lookup: function (params, query) { // uses express style url params
              return { md5: params.md5 }; // a query mapping url to myFiles
            }
          }
        ]
      }
    );
What is the value of url that you are using to set the target in your Resumable object?
It's the following:
url = "#{@baseURL}/_resumable"
url = Meteor.absoluteUrl(url) if Meteor.isCordova
So I get http://192.168.178.34:3000/gridfs/sentfiles2/_resumable as target url on my Cordova device (it's linked to my dev version on the Mac).
It should be http://192.168.178.34:3000/gridfs/sentfiles2/_resumable
The problem you are having is that the resumable requests are being routed to the md5 handler you are declaring rather than matching the resumable route defined here:
https://github.com/vsivsi/meteor-file-collection/blob/master/src/resumable_server.coffee#L242
And added to the route list (before any user defined routes) here: https://github.com/vsivsi/meteor-file-collection/blob/master/src/http_access_server.coffee#L369
So either the client side route that is defined for resumable is still wrong somehow. Or the server side route is getting screwed up somehow, but it doesn't sound like you've made any server changes...
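To illustrate why a misrouted chunk request ends up with a lookup of {md5:'_resumable'}, here is a minimal sketch of express-style path matching. It is not the router file-collection actually uses; matchRoute is a made-up helper that only shows how a pattern like '/:md5' also matches the path '/_resumable' when no earlier route claims the request.

```javascript
// Hypothetical, simplified express-style matcher (illustration only).
// Returns the captured params if the path matches the pattern, else null.
function matchRoute(pattern, path) {
  const patternParts = pattern.split('/').filter(Boolean);
  const pathParts = path.split('/').filter(Boolean);
  if (patternParts.length !== pathParts.length) return null;
  const params = {};
  for (let i = 0; i < patternParts.length; i++) {
    if (patternParts[i].startsWith(':')) {
      // A ":name" segment captures whatever is in that position.
      params[patternParts[i].slice(1)] = decodeURIComponent(pathParts[i]);
    } else if (patternParts[i] !== pathParts[i]) {
      return null;
    }
  }
  return params;
}

// The user-defined lookup from the collection settings in this thread.
function lookup(params, query) {
  return { md5: params.md5 };
}

// A chunk request that reaches the '/:md5' route is treated as a file lookup.
const params = matchRoute('/:md5', '/_resumable');
const query = lookup(params, {});
console.log(query);
```

A query for a file whose md5 is the literal string '_resumable' finds nothing, which is consistent with the 404 reported above.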
Argh sorry, but yeah, I missed the _resumable part here, so it is in the target url. I've just seen that it adds this to my sentfiles2 collection when I start uploading a file:
    {
      "_id": ObjectId('c36c874a4afa0f8e40952d59'),
      "length": 0,
      "md5": "d41d8cd98f00b204e9800998ecf8427e",
      "uploadDate": ISODate('2015-11-03T17:44:27.866Z'),
      "chunkSize": 2096128,
      "filename": "sentImage.jpg",
      "metadata": {},
      "aliases": [],
      "contentType": "image/jpeg"
    }
The interesting thing here is that all of my files have length 0 and the same md5 value (but they are different images). Every file has a different ResumableFile object like this one (each file has a different size, so I'm wondering why md5 and length are all the same in my collection):
    {
      size: 25689,
      file: Blob,
      fileName: 'sentImage.jpeg',
      isUploading: function(),
      [...]
    }
When you insert the file into the collection on the client side, it is created as a zero length file on the server, pending the completion of the resumable chunk uploads. All zero length files have the same md5 sum. When the resumable chunks are uploaded, they are placed in temporary gridFS files, and once they all arrive, the chunks are renumbered and assigned to the real file and the md5 sum is updated.
The issue is that the chunk uploads are failing because they aren't ending up on the proper route server side.
Just to be clear, the file document you are seeing is coming from the SentFiles2.insert(...), not from the resumable upload. This is correct behavior.
When your resumable chunks are being successfully uploaded, you will see files named _Resumable_<a bunch of numbers> in your collection until the upload of all chunks completes. You aren't getting that far yet, it seems.
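The point that every zero length file has the same md5 sum can be checked independently of file-collection. A minimal sketch using Node's built-in crypto module (the md5Hex helper name is just for illustration):

```javascript
const crypto = require('crypto');

// Hypothetical helper: hex-encoded MD5 digest of a buffer or string.
function md5Hex(data) {
  return crypto.createHash('md5').update(data).digest('hex');
}

// MD5 of zero bytes, i.e. what any empty file hashes to.
const emptyMd5 = md5Hex(Buffer.alloc(0));
console.log(emptyMd5); // d41d8cd98f00b204e9800998ecf8427e
```

That is the same value shown in the "md5" field of the zero length document above, so seeing it on every file only means that none of the uploads has completed yet.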
I've just logged the file object of SentFiles2.resumable.on('fileAdded', function (file) { ... });
After some correction, my 404 status code is now changed to 204
Remote Address:192.168.178.34:3000
Request URL:http://192.168.178.34:3000/gridfs/sentfiles2/_resumable?resumableChunkNumber=1&resumableChunkSize=2096128&resumableCurrentChunkSize=42448&resumableTotalSize=42448&resumableType=image%2Fjpeg&resumableIdentifier=c41cca8fdd527858d74a26b8&resumableFilename=sentImage.jpg&resumableRelativePath=sentImage.jpg&resumableTotalChunks=1
Request Method:HEAD
Status Code:204 No Content
After this request, I get 10 requests with a 404 error on
Remote Address:192.168.178.34:3000
Request URL:http://192.168.178.34:3000/gridfs/sentfiles2/_resumable
Request Method:OPTIONS
Status Code:404 Not Found
It's in http_access_server.coffee (line 320):
    @router.route('/*')
      .all (req, res, next) ->  # Make sure a file has been selected by some rule
        unless req.gridFS
          res.writeHead(404, {'Content-Type':'text/plain','Access-Control-Allow-Origin': 'http://meteor.local'})
          res.end()
          return
        next()
I'll check the req object tomorrow.
The 204 responses are correct! That is the resumable client asking the server if it has that chunk yet, and the server responding: No.
https://github.com/vsivsi/meteor-file-collection/blob/master/src/resumable_server.coffee#L237
I don't know anything about that OPTIONS request. Never seen it before from resumable. Is that related to CORS support or something? There is no handler for that in my server-side resumable support, so that may be an issue.
Yeah, correct, it is related to CORS. I think I have to respond with "200" to make it work. The browser checks whether the server allows CORS.
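Roughly, answering that preflight check could look like the sketch below. This is an assumption about one possible approach, not something file-collection provides: handlePreflight is a made-up function, the allowed methods and headers are guesses for this use case, and the origin 'http://meteor.local' is simply the one already used in the headers I added.

```javascript
// Hypothetical preflight handler (not part of file-collection).
// Answers OPTIONS requests with CORS headers and reports whether it handled the request.
function handlePreflight(req, res, allowedOrigin) {
  if (req.method !== 'OPTIONS') {
    return false; // not a preflight, let the normal handlers run
  }
  res.writeHead(200, {
    'Access-Control-Allow-Origin': allowedOrigin,
    // Assumed set of methods; adjust to what the upload client really sends.
    'Access-Control-Allow-Methods': 'GET, HEAD, POST, OPTIONS',
    // Echo the requested headers if the browser listed any.
    'Access-Control-Allow-Headers':
      req.headers['access-control-request-headers'] || 'Content-Type'
  });
  res.end();
  return true;
}
```

It would have to run before the catch-all route that returns the 404, for example as a Connect-style middleware that calls next() whenever handlePreflight returns false.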
In my example, I got the 204 response because I changed the lookup function:
    lookup: function (params, query) {
      // Just for a test
      return { md5: "d41d8cd98f00b204e9800998ecf8427e" }; // this MD5 exists in the collection.
    }
Hi there,
Sorry, I've been busy with lots of other things and haven't got to the PR for Cordova support. I will try to find it; it's not ready for a merge and is still on an older code base.
Just remember that doing this in JS on Cordova is very CPU intensive for the mobile device. But later, if you want, you can write a plugin or modify the Cordova file transfer one to do it natively in a way that works with this package.
Regards, Riaan