Meteor-CollectionFS
Get file's binary data from GridFS
Hi, how do I get the binary data of a given file from GridFS? I need to send it to another server using an HTTP.post() request.
I have a similar problem that I've been racking my brain over: getting the dataUri of a file stored using the FileSystem store. I need to read the file from the FileSystem, encrypt it (with something that doesn't support streams), and then write it to S3 using the cfs-s3 store.
Any help would be appreciated.
FS.Collection internally uses cfs:data-man to read and transform one data format into another, but I don't know if that's the right track. Moreover, there are createReadStream and createWriteStream functions. I was looking for some kind of object that could attach to a ReadStream and read binary data out of it, but without success. The problem is still unresolved.
You might want to look through the cfs:file tests, which do some binary manipulation. Also, fileObj.createReadStream(storeName) gets you a stream from the store (see the API), and you can use the data-man package to convert that read stream to some other binary format. There are also quite a few Node packages on npm that deal with stream/buffer/binary conversion.
If you figure out anything, please post some code for the benefit of others.
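As an untested sketch of that idea with plain Node.js streams (fileObj and 'gridfs' here are placeholders for your FS.File and store name):
var readStream = fileObj.createReadStream('gridfs');
var chunks = [];
readStream.on('data', function (chunk) {
  // collect each Buffer chunk as it arrives from the store
  chunks.push(chunk);
});
readStream.on('end', function () {
  // join everything into a single Buffer once the stream is done
  var buffer = Buffer.concat(chunks);
  console.log(buffer.length);
});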
@aldeed, but how do I convert a ReadStream to a Buffer using the data-man package? There is no code doing that in the cfs:file tests. You only get data from a temp file, which is not the case when you are using GridFS.
I could only get a Buffer using this code:
var dataMan = new DataMan('http://localhost:3000' + thumbnail.url(), thumbnail.type());
dataMan.getBuffer(function (err, buffer) {
  console.log(buffer);
});
(You need to install cfs:data-man before using that code.)
But it's one of the worst solutions I could think of:
- It makes an HTTP request even though I have direct access to GridFS or the local FileSystem,
- It's asynchronous and useless in Meteor methods (of course I could wrap it with Meteor.wrapAsync, but that's not the point),
- The HTTP request needs the full URL (not just the URI) of the file, so I have to make it configurable depending on which server I'm running on (development/production).
@jagi, see the data-man API here. Callbacks are optional on the server. Take a look at the data-man tests, too; more of the conversion tests are there.
I didn't test it, but I think this will work:
var buffer = new DataMan(thumbnail.createReadStream(storeName), thumbnail.type({store: storeName})).getBuffer();
@aldeed When I'm using GridFS with the above code, I get the error Error: DataMan constructor received data that it doesn't support. When I use FileSystem, I don't get any error, but nothing happens either. Any idea how to make it work?
And right, new DataMan().getBuffer() works synchronously. Thanks :).
And I have one more question: does the HTTP module in FS support sending files over HTTP like the standard Node.js request package with FormData?
Just a note: nope, we haven't added FormData, but it's actually not too hard to add, so it could be added at some point.
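For reference, a rough sketch of what that could look like with the npm request package (untested; the URL, field name, and file metadata below are made up):
var request = Npm.require('request'); // assumes the request npm package is available
request.post({
  url: 'http://example.com/upload', // hypothetical endpoint
  formData: {
    // request accepts Buffers and streams as multipart form fields
    file: {
      value: buffer,
      options: { filename: 'thumbnail.png', contentType: 'image/png' }
    }
  }
}, function (err, httpResponse, body) {
  if (err) console.error('upload failed:', err);
});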
OK, nice. For now I'm using my custom implementation of it to make the HTTP request synchronous, but it's not perfect.
DataMan apparently doesn't detect the ReadStream as a Stream.Readable, so I tried a little workaround and executed the DataMan code manually:
var readStream = thumbnail.createReadStream();
var dataMan = new DataMan.ReadStream(readStream, thumbnail.type());
var buffer = Meteor._wrapAsync(dataMan.getBuffer.bind(dataMan))();
It doesn't show any error, but it gets stuck on the last line. I waited a few minutes without any effect. The same thing happened with FileSystem storage.
I think it's not even implemented: https://github.com/CollectionFS/Meteor-data-man/blob/master/server/data-man-readstream.js#L24
Am I right?
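If it really is missing, a naive implementation could look roughly like this (just a sketch; the property holding the underlying stream in data-man may be named differently):
DataMan.ReadStream.prototype.getBuffer = function (callback) {
  var chunks = [];
  this.stream.on('data', function (chunk) {
    chunks.push(chunk);
  });
  this.stream.on('error', callback);
  this.stream.on('end', function () {
    // hand the assembled Buffer to the callback
    callback(null, Buffer.concat(chunks));
  });
};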
Got it working with basic Node.js streams on the server:
var doc = new FS.File(collection.findOne({_id: id}));
var readStream = doc.createReadStream();
var buffer = new Buffer(0);
readStream.on('readable', function () {
  buffer = Buffer.concat([buffer, readStream.read()]);
});
readStream.on('end', function () {
  console.log(buffer.toString('base64'));
});
It would be much better to have this working in the API directly.
A better implementation that can be reused and called synchronously on the server:
var getBase64Data = function (doc, callback) {
  // callback has the form function (err, res) {}
  var buffer = new Buffer(0);
  var readStream = doc.createReadStream();
  readStream.on('readable', function () {
    buffer = Buffer.concat([buffer, readStream.read()]);
  });
  readStream.on('error', function (err) {
    callback(err, null);
  });
  readStream.on('end', function () {
    // done
    callback(null, buffer.toString('base64'));
  });
};
var getBase64DataSync = Meteor.wrapAsync(getBase64Data);
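For example, such a helper can then be called synchronously inside a Meteor method (a sketch; Files and the method name are placeholders):
Meteor.methods({
  getFileAsBase64: function (fileId) {
    var doc = Files.findOne({_id: fileId});
    // Meteor.wrapAsync lets this run synchronously inside the method
    return getBase64DataSync(doc);
  }
});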
Oh geez, yeah, I guess we never implemented that piece. If you guys figure it out and want to do a pull request, feel free. Something like @icellan's last post is probably pretty close to what's needed.
@icellan It doesn't work for me. First I tried your approach with wrapAsync, but that returned an error. Next, I tried the asynchronous version just to check whether it works at all. The readable event doesn't fire at all, and the buffer in the end event is empty. Do you have any idea why that is? Are you using your function in a Meteor method?
OK, it works, but I had to use the data event instead of readable.
@jagi, that's right, it should be the data event. Just a typo in @icellan's post.
@jagi @aldeed Sorry for not getting back earlier, but I'm still using "readable" in my app and it's working with no problems. The initial code came from the Node.js stream handbook, which says not to use "data":
Note that whenever you register a "data" listener, you put the stream into compatibility mode so you lose the benefits of the new streams2 api.
You should pretty much never register "data" and "end" handlers yourself anymore. If you need to interact with legacy streams, use libraries that you can .pipe() to instead where possible.
https://github.com/substack/stream-handbook
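For what it's worth, with the streams2 API a 'readable' handler is supposed to drain the stream in a loop, because more than one chunk can be buffered and read() returns null when nothing is left. A sketch of such a guard:
readStream.on('readable', function () {
  var chunk;
  // keep calling read() until the internal buffer is drained
  while ((chunk = readStream.read()) !== null) {
    buffer = Buffer.concat([buffer, chunk]);
  }
});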
@icellan @jagi Hi, I'm trying your code in a Tinytest, but nothing happens. It stays blocked as if the stream was not readable. I'm using [email protected] and [email protected] on a Meteor 1.1.0.2 platform.
// creation of collections
var imageStore = new FS.Store.GridFS(prefix + "-myfiles", {});
MyFilesCol = new FS.Collection(prefix + "-myfiles", {
  stores: [imageStore]
});

// your solution
getBase64Data = function (docId, callback) {
  var doc = new FS.File(MyFilesCol.findOne({_id: docId}));
  var readStream = doc.createReadStream();
  var buffer = new Buffer(0);
  readStream.on('readable', function () {
    console.log('readable');
    buffer = Buffer.concat([buffer, readStream.read()]);
  });
  readStream.on('end', function () {
    console.log(buffer.toString('base64'));
    callback(null, buffer.toString('base64'));
  });
};
getBase64DataSync = Meteor.wrapAsync(getBase64Data);

Tinytest.add('try to download an image and save it in db using CFS:GRIDFS', function (test) {
  // clean collections
  _.each([MyFilesCol], function (col) {
    col.remove({});
  });
  // get an image from a url and store it manually using a dataUri
  var response = HTTP.get('https://www.hyundaicanada.com/content/Assets/2016/360/ELANTRA/Exterior/Shimmering_Silver/en/01.jpg', {
      "npmRequestOptions": {
        "encoding": null
      }
    }),
    getDataUri = function (response) {
      var contentType = response.headers && response.headers['content-type'] ? response.headers['content-type'] : 'image/jpeg',
        base64Content = response.content.toString('base64');
      return "data:" + contentType + ";base64," + base64Content;
    };
  var insertedImage = MyFilesCol.insert(getDataUri(response));
  // trying to retrieve the image binary data
  var data = getBase64DataSync(insertedImage._id);
  test.equal(typeof data, 'undefined');
});
OK, so after reading a large part of the source code, it seems that the best way to write a file is:
var docCol = MyCollection.findOne(),
    doc = new FS.File(docCol),
    readable = doc.createReadStream(prefix + "-myfiles"),
    writeable = Npm.require('fs').createWriteStream('/tmp/stream.jpg'),
    buffer = [];
readable.on('data', function (buf) {
  buffer.push(buf);
});
readable.on('end', function () {
  console.log('readable end', buffer.concat().toString('base64'));
});
readable.pipe(writeable);
Really simple in fact, except that figuring it out took some Sherlock Holmes work.
Thanks @Rebolon for your code! On my side, it looks like I had to use buffer.concat()[0].toString('base64') to get the base64 data (meaning, taking the first element of the concatenated array). Here's what I used in the end:
// Helper function to retrieve the binary content of a CFS file with base64 encoding
var getBase64Data = function (file, callback) {
  // callback has the form function (err, res) {}
  var readStream = file.createReadStream();
  var buffer = [];
  readStream.on('data', function (chunk) {
    buffer.push(chunk);
  });
  readStream.on('error', function (err) {
    callback(err, null);
  });
  readStream.on('end', function () {
    callback(null, buffer.concat()[0].toString('base64'));
  });
};
getBase64DataSync = Meteor.wrapAsync(getBase64Data);
Thanks a lot guys.
Finally it works for my project with the code below:
var getBase64Data = function (doc, callback) {
  // callback has the form function (err, res) {}
  var buffer = new Buffer(0);
  var readStream = doc.createReadStream();
  readStream.on('data', function (chunk) {
    buffer = Buffer.concat([buffer, chunk]);
  });
  readStream.on('error', function (err) {
    callback(err, null);
  });
  readStream.on('end', function () {
    // done
    callback(null, buffer.toString('base64'));
  });
};
var getBase64DataSync = Meteor.wrapAsync(getBase64Data);
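To close the loop on the original question, the result can then be sent to another server, for example (a sketch; the target URL is a placeholder):
var base64Data = getBase64DataSync(doc);
// hypothetical endpoint, just to illustrate the original HTTP.post() use case
HTTP.post('http://other-server.example.com/files', {
  data: {
    name: doc.name(),
    content: base64Data
  }
});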
Hi, has there been any solution for this? I have tried all the approaches stipulated above, and my web app always crashes with STDERR. I can't seem to get the stream/binary out of the FS.Collection.
I've literally copy-pasted the snippet of code I use in my repo, and it works. Not sure what's going wrong on your side.
Thanks everyone for the discussion. And @a-becker42, your snippet worked perfectly for me as well! Some of the other proposals would probably work too, but it might be useful for others to know that this one also worked for me.
Thanks guys, the code works fine. I initially struggled to get my stream out of the FS.Collection, but now all is fine. Thanks once again for sharing.
Glad it helped. Again, as mentioned by @aldeed, this should really be implemented in the package itself.
+1
@a-becker42 After some more testing, I discovered that the piece of code you posted here only works for files smaller than 64 KB. At least in my case it only returns the first chunk. Maybe the chunk size could be tweaked.
The reason for this is that concat() is used the wrong way. As the documentation mentions, Array.prototype.concat returns the array it is called on merged with the arrays you pass as arguments, so calling it with no arguments just returns a copy. Some of the other code snippets call concat on Buffer itself (Buffer.concat), which is a different function. Here, the concat() call does nothing, and afterwards you are just taking the first chunk from the original array.
@Rebolon Your second post seems vulnerable to this issue as well.
This means that for me only @dappl's solution works. If you are using one of the other snippets, I highly recommend switching, because sooner or later you will encounter a bigger file that will leave you puzzled.
It might be better to postpone the concatenation to the end of the process, but I could not get it working immediately by just replacing concat() with join('').
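For completeness, deferring the concatenation would mean collecting the chunks in an array and calling Buffer.concat (the class method, not Array.prototype.concat) once in the end handler. A sketch:
var chunks = [];
readStream.on('data', function (chunk) {
  chunks.push(chunk);
});
readStream.on('end', function () {
  // Buffer.concat joins an array of Buffers into one Buffer
  callback(null, Buffer.concat(chunks).toString('base64'));
});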
Anyway, thanks again everyone for this thread, it has been really valuable for me.
Thanks @icellan :+1: