make parser 70x faster?

Open rkeytacked opened this issue 1 year ago • 3 comments

Hi, I have been working for some time on a really fast ISBN parser/formatter in Java. For this Java library (https://github.com/creativecouple/isbn-validation-java/) I recently found a way to speed up the parsing throughput from 6k to 13k ops/millisecond (meaning roughly 75 nanoseconds per parse operation).

When trying out this approach for other programming languages, I found your isbn3 NPM library. Your benchmark script was not able to measure that tiny amount of time correctly, so I put a 1000x loop around the parsing like this:

for (let i=0; i<1000; i++) {
  const data = isbns.map(isbn => parse(isbn))
}
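For anyone who wants to reproduce this kind of measurement outside the repository's benchmark script, here is a minimal sketch of a repeated-loop timing harness. The `parse` function below is a hypothetical stand-in (it only strips hyphens and checks the length), and `timeParse` is a name made up for this example; swap in one of the real imports to measure an actual parser.

```javascript
// Hypothetical stand-in for the parser under test; replace it with a real
// import to measure an actual library.
function parse(isbn) {
  // Only strips hyphens and checks the length, so the sketch runs anywhere.
  const digits = isbn.replace(/-/g, '')
  return digits.length === 13 || digits.length === 10 ? { source: isbn, digits } : null
}

// Runs `fn` over every input `rounds` times and reports the mean time per call.
function timeParse(fn, inputs, rounds) {
  let sink = 0 // count successful parses so the results are actually used
  const start = process.hrtime.bigint()
  for (let i = 0; i < rounds; i++) {
    for (const input of inputs) {
      if (fn(input) !== null) sink++
    }
  }
  const elapsedNs = Number(process.hrtime.bigint() - start)
  const ops = rounds * inputs.length
  return { ops, sink, elapsedNs, nsPerOp: elapsedNs / ops }
}

// The sample ISBN is the one printed by the benchmark output in this issue.
const sample = ['9780004437996', '978-0-00-443799-6']
const result = timeParse(parse, sample, 1000)
console.log(`${result.ops} ops, ${result.nsPerOp.toFixed(1)} ns/op`)
```

Repeating the loop spreads timer overhead over many calls, which is the same reason for the 1000x wrapper: a single pass is too short to time reliably.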

My question is: Are you interested in rewriting your parsing engine? Otherwise I would try to create a new npm package with that approach.

I compared your latest version of isbn3 with the old npm lib isbn and with my temporary prototype, switching between these imports one at a time:

const { parse } = require('..') // your latest version from Github
const { ISBN: { parse } } = require('isbn') // using npm i --no-save isbn (7 years old!!)
const { parse } = require('isbn-validation-js') // using npm i --no-save git://github.com/creativecouple/isbn-validation-java#tmp-javascript-version

This is the result on my machine with these three approaches:

ISBN3

$ npm run benchmark

load module: 5.498ms
parsed 1000x 5640 non-hyphenated ISBNs in: 1:06.333 (m:ss.mmm)
ISBN 978-0-00-443799-6 Group Name English language

ISBN (7 years old!!)

$ npm run benchmark

load module: 1.033ms
parsed 1000x 5640 non-hyphenated ISBNs in: 10.214s
ISBN 978-0-00-443799-6 Group Name English speaking area

my prototype from https://github.com/creativecouple/isbn-validation-java/tree/tmp-javascript-version

$ npm run benchmark

load module: 1.425ms
parsed 1000x 5640 non-hyphenated ISBNs in: 975.009ms
ISBN 978-0-00-443799-6 Group Name English language

So the old isbn package is still faster than your current version, but as you can see, it is possible to go sub-second for 5,640,000 parse operations.
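To make the comparison easier to read, the wall-clock times reported above can be converted into time per parse and speedup relative to the current isbn3. This is only arithmetic on the numbers from my machine; the helper name `summarize` is made up for this sketch.

```javascript
// Wall-clock times copied from the benchmark output above, in milliseconds.
const OPS = 1000 * 5640 // 1000 rounds over 5640 ISBNs
const timingsMs = {
  isbn3: 66333,       // 1:06.333 (m:ss.mmm)
  isbn: 10214,        // 10.214s
  prototype: 975.009, // 975.009ms
}

// Derives nanoseconds per operation and the speedup against a baseline entry.
function summarize(timings, ops, baseline) {
  const rows = {}
  for (const [name, ms] of Object.entries(timings)) {
    rows[name] = {
      nsPerOp: (ms * 1e6) / ops,
      speedupVsBaseline: timings[baseline] / ms,
    }
  }
  return rows
}

const summary = summarize(timingsMs, OPS, 'isbn3')
console.table(summary)
```

With these inputs the prototype comes out at roughly 68x the speed of the current isbn3, and the old isbn package at roughly 6.5x, which is where the "70x" in the title comes from.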

rkeytacked, Sep 08 '23 08:09