node-geoip-native icon indicating copy to clipboard operation
node-geoip-native copied to clipboard

More speed and less memory usage via code generation

Open nitram509 opened this issue 13 years ago • 10 comments
trafficstars

Hi,

I've made some more progressive enhancements within my fork of 'node-geoip-native'. The basic ideas:

  • generating JS code from CSV file in 'build phase' or manual
  • generated sources can be used via normal require statement
    • thus theres no warmup needed anymore
    • thus the memory usage drops heavily
  • split the IPs and Countries in two separate arrays,
    • thus the memory usage drops heavily
  • using more compact version of binary search
  • fixed all tests with lookup errors (see unit tests)

In general, I've measured ca. 10% performance increase during lookup and a drop from ca. 50MB to ca. 30MB heap usage.

Please, check out this branch 'code_generation' and tell me what you think.

Tanks in advance Martin

nitram509 avatar Nov 04 '12 23:11 nitram509

Hello Martin! I have just realized, that geoip return incorrect results sometimes, so I will try your fork instead. Thank you for sharing! Also, latest ip database from MaxMind has different size: http://dev.maxmind.com/geoip/geolite

pozirk avatar Apr 18 '13 15:04 pozirk

Looks like your version returns incorrect results as well. For example, check against ip: 63.155.159.123, should be US, but return Japan.

pozirk avatar Apr 18 '13 15:04 pozirk

Jup, it seems there are still wrong lookups. Thanks for pointing this out. I've added this 63.155.159.123 to the unit tests (lookuoTest.js) and try to fix it.

nitram509 avatar Apr 18 '13 20:04 nitram509

Looks like it gives wrong results all the time. :)

Some examples: "63.150.141.128","63.156.193.255","1066831232","1067237887","US","United States" "63.156.194.0","63.156.195.255","1067237888","1067238399","JP","Japan" 63.155.159.123 returns Japan.

"93.32.0.0","93.71.255.255","1562378240","1564999679","IT","Italy" "93.72.0.0","93.79.255.255","1564999680","1565523967","UA","Ukraine" 93.79.113.99 returns Italy.

"138.187.0.0","138.191.255.255","2327511040","2327838719","CH","Switzerland" "138.192.0.0","138.193.255.255","2327838720","2327969791","US","United States" 138.188.103.143 return US.

"213.87.0.0","213.88.127.255","3579248640","3579346943","RU","Russian Federation" "213.88.128.0","213.88.184.255","3579346944","3579361535","SE","Sweden" 213.87.133.200 returns Sweden.

pozirk avatar Apr 18 '13 20:04 pozirk

Please, verify my last commit. I figured out, what was wrong. The binary search algorithm, I've implemented before assumed, that the target_ip exists within all records. But thats simply not the case. Thus I've switched to use another binary search approach ('Deferred detection of equality'). This way, the midPoint can be exactly the index, we're search for OR it can be index-1 OR it can be index+1. So I've added this two additional checks. Maybe they can be simplified ;-) The good news: :100: % tests are green

nitram509 avatar Apr 18 '13 21:04 nitram509

I'm curious, what do you think about this approach "using code generation" in general?

nitram509 avatar Apr 18 '13 21:04 nitram509

I didn't dig too much into the code, but your variant is working 50% faster for me. Generation of arrays is not a problem and should be done only once. Just need to check the "ip - country" accuracy, but seems fine for now. Thank you!

pozirk avatar Apr 20 '13 15:04 pozirk

Uncovered two more bugs: The last and second last IP-range are not correctly detected. I'm working on a fix.

nitram509 avatar Apr 27 '13 16:04 nitram509

huh, I hope not too many of my players live in these IP-ranges. :)

pozirk avatar Apr 27 '13 17:04 pozirk

Done. I've fixed the two bugs. In general, the new lookupTests contains 302 unique tests, which give me a better feeling now :-)

I would be glad, if you consider to accept this pull request.

nitram509 avatar Apr 27 '13 19:04 nitram509