piscina
CPU intensive processing and writing result to files
Imagine I have an array with 1 million filenames: ['1.txt', '2.txt', ..., '1000000.txt'], and for each file I need to do heavy processing and then write the result to that file.
What would be an efficient way to use all the CPU cores, spreading the CPU-intensive processing for different files across different cores?
Normally I would use this:
const fs = require('fs')
const async = require('async')
const heavyProcessing = require('./heavyProcessing.js')

const files = ['1.txt', '2.txt', ..., '1000000.txt']

async.each(files, function (file, cb) {
  fs.writeFile(file, heavyProcessing(file), function (err) {
    // err is already an Error (or null), so pass it through as-is
    if (err) cb(err); else cb()
  })
}, function (err) {
  if (err) console.error(err)
})
How can I use Piscina here? Can I just submit the task 1 million times with no concern for memory?
What do you think of this approach?
main.js
const path = require('path')
const async = require('async')
const Piscina = require('piscina')

const piscina = new Piscina({
  filename: path.resolve(__dirname, 'worker.js')
})

const files = ['1.txt', '2.txt', ..., '1000000.txt']

async.each(files, function (file, cb) {
  (async function () {
    try {
      // the worker returns null on success, or an Error on failure
      const err = await piscina.run(file)
      if (err) cb(err); else cb()
    } catch (err) {
      cb(err)
    }
  })()
}, function (err) {
  if (err) console.error(err)
})
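On the memory question: every task submitted to the pool keeps a pending promise (and a queue entry) alive until it finishes, so submitting all 1 million at once may be costly. A minimal sketch of one way to bound that, assuming you want at most `limit` tasks in flight at a time (`runTask` below is a hypothetical stand-in for something like `piscina.run`, so the sketch is self-contained):

```javascript
// Run an async task over `items`, keeping at most `limit` tasks in flight.
async function eachWithLimit (items, limit, runTask) {
  let next = 0
  const errors = []
  // Start `limit` independent "lanes"; each lane pulls the next item
  // as soon as its previous task finishes. Reading and incrementing
  // `next` is safe here because JS runs this code single-threaded.
  const lanes = Array.from({ length: Math.min(limit, items.length) }, async () => {
    while (next < items.length) {
      const item = items[next++]
      try {
        await runTask(item)
      } catch (err) {
        errors.push(err)
      }
    }
  })
  await Promise.all(lanes)
  return errors
}

// Example usage with a dummy async task in place of piscina.run:
async function demo () {
  const items = Array.from({ length: 10 }, (_, i) => `${i + 1}.txt`)
  const done = []
  await eachWithLimit(items, 3, async (file) => {
    await new Promise(resolve => setImmediate(resolve))
    done.push(file)
  })
  console.log(done.length) // 10
}
demo()
```

Note that the `async` library you are already using has `async.eachLimit(files, limit, iteratee, callback)`, which achieves the same effect without hand-rolling the limiter.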
worker.js
const fs = require('fs')
const heavyProcessing = require('./heavyProcessing.js')

module.exports = (file) => {
  try {
    fs.writeFileSync(file, heavyProcessing(file))
    return null
  } catch (e) {
    return e // e is already an Error; no need to wrap it again
  }
}
Just to let everyone know: this indeed works.
I created a gist explaining everything: https://gist.github.com/jfoclpf/325bb925fedf50a9cf96bd00d99e2243