node-persist icon indicating copy to clipboard operation
node-persist copied to clipboard

Concurrent `setItem` and `getItem` can lead into an unahdled `does not look like a valid storage file`

Open tndev opened this issue 7 years ago • 16 comments

Concureent setItem and getItem can lead into an unahdled does not look like a valid storage file:

parse error:  {} for:
     Error: [node-persist][readFile] .node-persist/storage/37b51d194a7513e45b56f6524f2d51f2 does not look like a validstorage file!
      at fs.readFile (node_modules/node-persist/src/local-storage.js:278:66)
      at FSReqWrap.readFileAfterClose [as oncomplete] (fs.js:525:3)

The code used to reproduce create that error was:

const localdb = require('node-persist')
const storage = localdb.create({ ttl: true, logging: true })

async function test2 () {
  await storage.setItem('bar', { 'foo': 'Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet. Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet.' })
  await storage.getItem('bar')
  await storage.setItem('bar', { 'foo': 'Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet. Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet.' })
  await storage.getItem('bar')
  await storage.setItem('bar', { 'foo': 'Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet. Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet.' })
  await storage.getItem('bar')
}

async function test3 () {
  await storage.getItem('bar')
  await storage.getItem('bar')
  await storage.getItem('bar')
  await storage.getItem('bar')
  await storage.getItem('bar')
}

async function main () {
  await storage.init()

  try {
    await Promise.all([
      test2(),
      test3(),
      test2(),
      test3(),
      test2()
    ])
  } catch (err) {
    console.error(err)
  }
}

main().catch(err => {
  console.error(err)
})

The problem cannot be realiable be reproduces.

tndev avatar Oct 01 '18 11:10 tndev

we could use file locking i guess, maybe try https://github.com/moxystudio/node-proper-lockfile

akhoury avatar Oct 02 '18 10:10 akhoury

Yes that might be a solution. I honestly don't use your library so I don't have much knowledge about how it works, and if and what kind of bottelneck that would introduce.

I just stumbled across a question on stackoverflow regarding your library, and figured out that problem while I tried to give an solution to the question.

tndev avatar Oct 02 '18 10:10 tndev

can you link the question please?

akhoury avatar Oct 02 '18 20:10 akhoury

This is the SO question https://stackoverflow.com/questions/52582180/node-persist-on-heavy-node-js-app-not-returning-values/52632704#52632704

biswanaths avatar Oct 03 '18 17:10 biswanaths

I think the better approach would be to serialise read and writes based on the queue. One approach as mentioned in my SO answer(see the gist) above is to chain promises. I think we can enhance it to simply chain promises per key rather than whole universe of read and write through the library.

Please see the basic idea here : https://gist.github.com/biswanaths/b8999c9e4e5a9e979dcde061074528f6

I believe whatever solution to this problem should live in library. The end user does not need to know and work around whether the library is storing kv per file or using a single file.

biswanaths avatar Oct 04 '18 11:10 biswanaths

well, what if it's a different process writing to the same file?

akhoury avatar Oct 06 '18 17:10 akhoury

@akhoury , Oh, yes. This obviously does will not work of multiprocess I don't know where node-persist, use case falls, like single-process, multi-process or may be getting used as part of a distributed application.

Also going through the old case, I think Issue 31 more or less deals with the same issue.

biswanaths avatar Oct 08 '18 08:10 biswanaths

I've also hit this problem, but in my case, I'm not performing asynchronous get/sets but a series of ~10 sets asynchronously using Promise.all(array.map([function that creates promises])). The library hits on the same file name a number of times and ends up writing n values to the same file, thus ending up with a corrupt file.

Given that asynchronicity is commonplace in nodejs, it would be a lot better if this was handled by the library using file locks.

DragonEmbers avatar Sep 04 '19 17:09 DragonEmbers

This problem is still happening. What is the status of the development about this topic ?

Weedoze avatar Jun 24 '20 05:06 Weedoze

When storage.init (storage being node-persist) encounters a corrupt file it throws this error.

I tried using try/catch and .catch() callbacks to handle this error gracefully, but neither work because the error is not bubbling up to where I can handle it.

For those looking for a drop-in replacement for this library, I've switched to https://www.npmjs.com/package/node-localstorage and it seems to be working fine. The api is basically similar with getItem() and setItem()

robbie-cahill avatar Jul 16 '20 11:07 robbie-cahill

node-localstorage doesn't appear to be async, or offer an async api.

Regarding this plugin's error - I have a .catch around every line of code related to it (init, getItem, setItem, removeItem) and it's still throwing the uncaught exception. Even more confusing, it happens at a periodic interval - which would be explained by the fact that my code periodically polls and uses the plugin for reading/writing, but I actually removed all references in the code to the corrupted file in question. It's unfortunate as this is otherwise a fine plugin.

ryanleesmith avatar Oct 11 '20 21:10 ryanleesmith

Ok, figured it out - the interval for expiring keys does a scan, which is where the uncaught exception is originating from, and why you can't catch it. Passing in expiredInterval: 0 as an option in the init resolves the issue for me. If I understand correctly, this means keys will not be auto-expired, but getItem will still honor the TTL. Depending on your usage, this may or may not be an issue.

ryanleesmith avatar Oct 11 '20 22:10 ryanleesmith

Wait, wouldn't this kind of invalidate the library for use as a server-side application that's handling requests? Would it work to wrap getItem and setItem with something that tracks files being modified, or should I just jump to something else?

SamNChiet avatar Mar 27 '21 21:03 SamNChiet

I think the better approach would be to serialise read and writes based on the queue. One approach as mentioned in my SO answer(see the gist) above is to chain promises. I think we can enhance it to simply chain promises per key rather than whole universe of read and write through the library.

Please see the basic idea here : https://gist.github.com/biswanaths/b8999c9e4e5a9e979dcde061074528f6

I believe whatever solution to this problem should live in library. The end user does not need to know and work around whether the library is storing kv per file or using a single file.

this worked a treat - it seems like the library author has robustness concerns, but honestly? I'm not using this because it's the most robust database solution out there, accounting for external multiprocess writes and all that - I'm using it because it's a dang nice interface for serializing bits of data for fun projects.

I still think locking files on the OS level would be the best solution, but even this simple queue basically solves my problem. Thanks!

SamNChiet avatar Mar 27 '21 21:03 SamNChiet

Not that I'm contributing much, but I've also run into this. It's an extremely bizarre error because this error can happen whether or not you're already viewing the file. EG: if you pick up the habit of looking at the files as they're being modified all goes well until you write code that apparently creates this issue, then suddenly the file is no longer being written to, and you're left with the strange error which seems to imply JSON.parse() or something like it failed. My accidental solution was to debounce requests and make sure they execute on one second boundaries. It's effectively an extremely dumb queue, because it can drop data (but in my case the data's not critical anyway).

Andersama avatar Aug 25 '21 20:08 Andersama

I've made everything work, tests were running fine, time to prod: same errors. Pretty unexpected that it won't be able to handle multiple requests. I quickly re-implemented the whole feature in Redis, but it caused a 4hr downtime to our service.

wintercounter avatar Jan 23 '22 15:01 wintercounter

Bro how is this still a bug

ayy1337 avatar Jan 23 '23 01:01 ayy1337

hey, sorry for the serious delay on this, life happened.

[email protected] should fix this problem within the same process, and I think that's only fair, but handling a cluster or multi "machines" writing on the same disk is not really this library's job. If you find yourself using a multi machines services then this is not the library for you. In order for me to solve this across a multi process/machines environment, I would need some external running background or network service, i.e. redis or the likes of it.

Copying OP's test case, note the writeQueue option comment.

const localdb = require('node-persist');
const storage = localdb.create({ 
    ttl: true, 
    writeQueue: true, // turn this to false to disable write queue and you will see the initial error "does not look like a valid storage file"
    logging: false 
})

async function test2 () {
  await storage.setItem('bar', { 'foo': 'Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet. Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet.' })
  await storage.getItem('bar')
  await storage.setItem('bar', { 'foo': 'Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet. Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet.' })
  await storage.getItem('bar')
  await storage.setItem('bar', { 'foo': 'Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet. Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet.' })
  await storage.getItem('bar')
}

async function test3 () {
  await storage.getItem('bar')
  await storage.getItem('bar')
  await storage.getItem('bar')
  await storage.getItem('bar')
  await storage.getItem('bar')
}

async function main () {
  await storage.init()

  try {
    await Promise.all([
      test2(),
      test3(),
      test2(),
      test3(),
      test2()
    ])
  } catch (err) {
    console.error(err)
  }
}

main().catch(err => {
  console.error(err)
})

akhoury avatar Feb 23 '23 19:02 akhoury

Thanks for your work Aziz! I was wondering if this latest update would fix this issue that we have been getting intermittently: Error: EMFILE: too many open files. Or should I open a seperate issue? Thanks!

mdodge-ecgrow avatar Feb 24 '23 15:02 mdodge-ecgrow

@akhoury Hello Aziz! I just wanted to let you know that I found this project and issue, as your last commit fd0d7cf for version 3.1.1 yesterday seems to have a syntax error, so this broke our project when I ran it today:

/var/lib/jenkins/workspace/sonatype-oss-index/node_modules/node-persist/src/local-storage.js:367
			if (!this.q[file]?.length) {
			                  ^

SyntaxError: Unexpected token '.'

audiokan avatar Feb 24 '23 16:02 audiokan

That's not a syntax error, it's called optional chaining and you need at least Node 14 to make it work.

wintercounter avatar Feb 25 '23 00:02 wintercounter

@wintercounter is correct, but to be fair package.json node engine says >=10.12.0 so I just removed that optional chaining in 3.1.3

akhoury avatar Feb 25 '23 18:02 akhoury

Thanks, this indeed fixed it! As I am also not too familiar with node and JavaScript itself (just maintaining the Jenkins server) I did not know this, thanks for the info! And yes, as I did not receive any version dependency conflicts it did not occur that this might be the problem instead of it being a syntax error.

So thank you all for your work, I appreciate it, and have a nice weekend!

audiokan avatar Feb 25 '23 18:02 audiokan