electrum-server

database building process

Open ser opened this issue 10 years ago • 7 comments

@ecdsa - the database building process is, in my opinion, now a pretty weak point of the electrum ecosystem. most servers share the same build, and the process is getting very long, which could be pretty risky if we ever have to rebuild the database quickly for any reason.

do you see any way to split the process among cores or independent servers, and join the results later? if yes, i volunteer to write a script using one of the major cloud services.

ser avatar Mar 14 '14 04:03 ser

it is not possible to parallelize this process, if that's what you mean. I agree it would be useful to have builds from independent sources.

ecdsa avatar Mar 14 '14 06:03 ecdsa

As we discussed on IRC, the conclusions are:

  1. it is not possible to parallelise the process
  2. the current code could be improved by changing the tree
  3. things could be improved by rewriting that part of the code in C, and Mark Friedenbach is doing that

the problem should definitely be investigated

ser avatar Mar 14 '14 07:03 ser

@ecdsa do you think it would be technically possible, and worthwhile, to store electrum data here?

https://developers.google.com/bigquery/pricing

ser avatar Mar 28 '14 02:03 ser

The process is very CPU-bound in my case. A low-end CPU can barely keep up with building one block every 10 minutes. It would take many weeks to catch up on as little as 1000 blocks.
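A rough way to see why catching up takes so long: while the indexer works through a backlog, new blocks keep arriving every 10 minutes on average, so the backlog only shrinks by the difference between the two rates. The sketch below works that out. The 540 seconds per block is a made-up figure for illustration, not a measurement from this thread, and `catch_up_seconds` is a hypothetical helper.

```python
BLOCK_INTERVAL_S = 600.0  # average Bitcoin block interval (10 minutes)


def catch_up_seconds(backlog_blocks, seconds_per_block, interval_s=BLOCK_INTERVAL_S):
    """Time needed to clear a backlog while new blocks keep arriving.

    The backlog shrinks at (1/seconds_per_block - 1/interval_s) blocks per
    second, so the time is backlog / that rate, rearranged to avoid
    subtracting two small floats.
    """
    if seconds_per_block >= interval_s:
        raise ValueError("indexer is not faster than the chain; it never catches up")
    return backlog_blocks * seconds_per_block * interval_s / (interval_s - seconds_per_block)


# Assumed example: 1000 blocks behind, 540 s of work per block.
weeks = catch_up_seconds(1000, 540.0) / (7 * 24 * 3600)
print("%.1f weeks" % weeks)  # prints "8.9 weeks"
```

The closer the per-block time gets to 600 seconds, the faster this number blows up, which matches the "barely keeps up" experience.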

Any news on the C implementation?

I haven't looked much into the code but I'm curious to know why it can't be parallelised if that's still fresh in your mind? Thanks.

infertux avatar Nov 20 '14 08:11 infertux

it does not look cpu-bound here but my server gets very high io-load every time there is a new block... which is the bigger problem imho :(

iSOcH avatar Mar 16 '15 08:03 iSOcH

Hello,

My low-end PC (dual-core, 4 GB RAM) has been building the Electrum database (BTC) for two days so far, and about 25% is done. My HDD I/O and CPU seem to be the problem. I noticed that there is an alternative Python implementation called PyPy (http://pypy.org/), and it seems to be a little faster than regular Python. I tried it, but I ended up with this compilation error when installing the plyvel module (all other modules installed OK): https://github.com/wbolster/plyvel/issues/38

This might be one solution to the speed problem, if plyvel can be made to compile.

santzi avatar Jul 15 '15 07:07 santzi

I just completed a quick benchmark to compare the performance of PyPy vs. CPython 2.

I was unable to compile plyvel with the PyPy headers so instead I used an alternative implementation of a leveldb-client called py-leveldb and wrote a wrapper for that that mimics plyvel (good enough for testing I guess).

So I used the $40/mo digitalocean server to test both implementations. There was nothing running on the server to interfere with the building process (well except bitcoind). I ran both interpreters consecutively and used timeout to kill them after exactly two hours.
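For anyone who wants to repeat this from a script instead of the `timeout` command, a fixed time limit can be sketched in Python 3 as below. This is not what I ran; `run_with_time_limit` is a hypothetical helper, and the command in the usage line is only a placeholder for whatever starts the server.

```python
import subprocess
import sys


def run_with_time_limit(cmd, limit_seconds):
    """Run cmd and kill it once limit_seconds have passed.

    Returns True if the command finished on its own, False if it was killed.
    """
    try:
        # subprocess.run kills the child and raises TimeoutExpired on timeout.
        subprocess.run(cmd, timeout=limit_seconds)
        return True
    except subprocess.TimeoutExpired:
        return False


# Placeholder command: sleeps longer than the limit, so it gets killed.
finished = run_with_time_limit([sys.executable, "-c", "import time; time.sleep(30)"], 1)
print("finished on its own:", finished)
```

For a two-hour run the limit would be `2 * 3600`.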

To eliminate a possible performance gain or loss caused by the use of py-leveldb, both PyPy and CPython used my wrapper of py-leveldb in this benchmark.
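To give an idea of what such a wrapper involves, here is a minimal sketch, not the code used in the benchmark. It assumes the backend exposes `Get`/`Put`/`Delete` the way py-leveldb's `LevelDB` object does (with `Get` raising `KeyError` for a missing key) and maps them to plyvel-style `get`/`put`/`delete`. `PlyvelLikeDB` and `FakeLevelDB` are hypothetical names; the fake backend is only there so the sketch runs without leveldb installed.

```python
class PlyvelLikeDB(object):
    """plyvel-style get/put/delete on top of a py-leveldb-style handle."""

    def __init__(self, backend):
        # Assumption: backend offers Get/Put/Delete like py-leveldb's LevelDB.
        self._db = backend

    def get(self, key, default=None):
        # plyvel returns a default for missing keys instead of raising.
        try:
            return self._db.Get(key)
        except KeyError:
            return default

    def put(self, key, value):
        self._db.Put(key, value)

    def delete(self, key):
        self._db.Delete(key)


class FakeLevelDB(object):
    """In-memory stand-in for a real LevelDB handle (illustration only)."""

    def __init__(self):
        self._data = {}

    def Get(self, key):
        return self._data[key]  # raises KeyError when the key is missing

    def Put(self, key, value):
        self._data[key] = value

    def Delete(self, key):
        self._data.pop(key, None)


db = PlyvelLikeDB(FakeLevelDB())
db.put(b"height", b"111042")
print(db.get(b"height"))
print(db.get(b"missing", b"n/a"))
```

A real wrapper would also need to cover write batches and iteration, which are left out here.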

This is the number of blocks that got imported after 2 hours:

  * PyPy (2.5.0): 122941
  * CPython (2.7.9): 111042

So it seems like PyPy was able to import a little over 10% more blocks in the same time. I didn't run extensive tests but it seems like electrum-server worked with both interpreters.
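For reference, the comparison is just the ratio of the two block counts above; the helper name is made up for this sketch.

```python
BLOCKS_PYPY = 122941      # PyPy 2.5.0, blocks imported in 2 hours
BLOCKS_CPYTHON = 111042   # CPython 2.7.9, blocks imported in 2 hours
HOURS = 2.0


def relative_gain_percent(new, baseline):
    """How much larger new is than baseline, in percent."""
    return (new - baseline) * 100.0 / baseline


gain = relative_gain_percent(BLOCKS_PYPY, BLOCKS_CPYTHON)
print("PyPy:    %.1f blocks/hour" % (BLOCKS_PYPY / HOURS))
print("CPython: %.1f blocks/hour" % (BLOCKS_CPYTHON / HOURS))
print("PyPy imported %.1f%% more blocks" % gain)
```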

That said, I guess it's a good idea to use PyPy if possible but it wouldn't help much with the actual issue here.

I'll ask upstream at wbolster/plyvel#38 whether they can add PyPy support.

bauerj avatar Aug 03 '15 18:08 bauerj