unicorn-binance-local-depth-cache
Slow data retrieval
Is your feature request related to a problem? Please describe. Depth data retrieval is 20-50 ms late compared to the Binance API ThreadedDepthCacheManager (from sammchardy). This is far from negligible for a 100 ms update rate. Tested with Binance spot 'BTCUSDT' using get_bids and get_asks.
Describe the solution you'd like Be on par with other available solutions.
Describe alternatives you've considered I have considered the library https://github.com/sammchardy/python-binance
How did you measure that?
It is possible that this cache is slower.
I noticed race conditions when I created it and added a threading.Lock, which I don't think python-binance uses. Also, we don't use a callback function but a pipe.
I just saw that the time.sleep() call should actually be in an else block, so that it only pauses briefly when the stream_buffer is empty: https://github.com/LUCIT-Systems-and-Development/unicorn-binance-local-depth-cache/blob/a7fd5004e1a02e09fd2ee0e87c589fbf4d13f13a/unicorn_binance_local_depth_cache/manager.py#L409
We can also test the whole thing without time.sleep().
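The intended fix can be sketched like this, using a plain deque as a stand-in for the real stream_buffer (illustrative only, not the actual manager.py code): the sleep moves into the empty-buffer branch, so consecutive items are processed back-to-back with no delay between them.

```python
import time
from collections import deque

# Stand-in for the real stream_buffer; in production the websocket thread fills it.
stream_buffer = deque([b'update-1', b'update-2', b'update-3'])
processed = []

for _ in range(5):  # bounded loop for the sketch; the real worker loops forever
    try:
        data = stream_buffer.popleft()
    except IndexError:
        time.sleep(0.01)  # pause only while the buffer is empty
    else:
        processed.append(data)  # buffer not empty: process immediately, no sleep

print(processed)
```

With the sleep in the else-less original, every processed item paid the 10 ms penalty; here only idle iterations do.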
you wanna test again with the latest commit?
I run the same process in parallel with the two different libraries and record the timestamp as soon as a change is detected in the order book. This library detects the change consistently 20 to 50 ms after the python-binance library. I am running time-critical orders and this is a test I need to pass before I can migrate. Looking into it, I noticed the 10 ms time.sleep(), which can explain part of the slowness. I can test a new version as soon as you deploy it (preferably with a new release, as it's faster for me to update, but I can test a new commit as well).
It is much better after removing the sleep(0.01): we are now 5-20 ms late vs 20-50 ms before. Thanks!
If the gap goes below 10 ms this becomes acceptable; at the moment we are still dropping some quotes that happen within that window, which is not very nice for a real-time process.
Do you see any further optimization?
I have done other tests comparing the same depth5@100ms requests with your other library, unicorn-binance-websocket-api. There I get +/- 10 ms vs python-binance, which is absolutely fine. The latency therefore comes from this library.
Let me know if you think you can improve the latency in this module otherwise I can use unicorn-binance-websocket-api.
What would be the downside of using the latter? I am mindful of having a smooth reconnection or re-sync of the websocket when something goes wrong; is that better managed with this library than with unicorn-binance-websocket-api?
thanks for testing and all the feedback!
i am still experimenting with this code here...
Did you test ubwa with stream_buffer or callback function?
i think the threading lock can be a reason too. the more get_asks() and get_bids() you do, the more you block writing to the depth_cache. this is important, but it slows things down a bit. python-binance is not doing that.
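The contention described here can be pictured with a minimal sketch (a generic threading.Lock pattern, not the actual depth_cache internals): readers and the writer share one lock, so frequent get_asks()/get_bids() calls delay the update thread.

```python
import threading

lock = threading.Lock()
order_book = {"asks": [], "bids": []}

def get_asks():
    # Readers take the lock, briefly blocking the update thread.
    with lock:
        return list(order_book["asks"])

def apply_depth_update(update):
    # The writer must wait for any in-flight readers before applying an update.
    with lock:
        order_book["asks"] = update["asks"]

apply_depth_update({"asks": [["30000.00", "1.5"]]})
print(get_asks())
```

The more often readers hold the lock, the longer fresh updates queue up behind them, which shows up as exactly this kind of read-side latency.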
I think the callback is faster in general because it processes in realtime instead of putting something on a stack.
I will think about it and try some ways how to perform better.
I use binance_websocket_api_manager.pop_stream_data_from_stream_buffer() with ubwa. Is there a faster way with a callback?
Should I wait for ubldc to get faster, or do I get the same functionality and auto-reconnect with ubwa whenever there is an issue?
I don't mind having a few more lines of code if eventually it is 20 ms faster.
For the callback mode I've tried the following, but data is received only once; there is no update every 100ms. I must be doing something wrong.
```python
import time
from unicorn_binance_websocket_api import BinanceWebSocketApiManager

def print_stream_signals(signal_type=False, stream_id=False, data_record=False):
    print(f"callback: {signal_type} - {stream_id} - {data_record}")

ubwa = BinanceWebSocketApiManager(enable_stream_signal_buffer=True,
                                  process_stream_signals=print_stream_signals)
aggtrade_stream_id = ubwa.create_stream(["depth5@100ms"], ['btcusdt'])
time.sleep(20)
ubwa.stop_stream(aggtrade_stream_id)
ubwa.stop_manager_with_all_streams()
print("finished!")
```
even with stream_buffer it is equal to python-binance? that's fine :)
there are two callback functions atm. process_stream_signals is for signals: https://github.com/LUCIT-Systems-and-Development/unicorn-binance-websocket-api/wiki/%60stream_signal_buffer%60
what you want is this: https://unicorn-binance-websocket-api.docs.lucit.tech/unicorn_binance_websocket_api.html?highlight=process_stream_data#unicorn_binance_websocket_api.manager.BinanceWebSocketApiManager
is the below working in callback mode then?
Note stream_buffer_name returns callback: False
```python
import time
from unicorn_binance_websocket_api import BinanceWebSocketApiManager

def print_stream_data(received_data, exchange="binance.com", stream_buffer_name="False"):
    print(f"{received_data}")

ubwa = BinanceWebSocketApiManager(process_stream_data=print_stream_data)
aggtrade_stream_id = ubwa.create_stream(["depth5@100ms"], ['btcusdt'])
time.sleep(20)
ubwa.stop_stream(aggtrade_stream_id)
ubwa.stop_manager_with_all_streams()
```
The above is 4-5 ms faster than python-binance most of the time, while the stream_buffer mode was varying and could be 5-10 ms slower. That's nice! So what is the downside of using the callback mode? Will it stay in sync and reconnect if needed? Below 10 ms it becomes less critical and I am fine losing a few ms for more reliability. What is not usable for me is >10 ms latency.
a callback is faster, because it directly triggers further processing.
when i started writing ubwa (unicorn-binance-websocket-api) i started with the callback implementation, but at peak trading times on binance (5x the websocket volume) many streams crashed in a loop, and the reason was this:
ubwa is written as an asynchronous websocket client. this means receiving and processing is not a loop; on each receive, something similar to a thread starts and executes the code that processes the received data. if you receive new data before the last one is processed, it gets executed immediately, not when the last one is done (like an old-school loop would).
that means if you receive 100 trades each second, you get something like 100 threads. if trading volume increases, it's possible to receive 500 trades each second or more. at these peak times too many parallel processings were started, the cpu got overloaded and streams restarted in a loop.
with the stream_buffer i developed a way to decouple receiving data from processing it. if you receive more data than your system is able to process, the stream_buffer simply stores the data till you are able to pick it up. if not every second counts, this is a good way to handle the problem.
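The decoupling can be illustrated with a plain queue.Queue as a stand-in for the stream_buffer: the receive path only enqueues and returns, so a burst can never spawn a pile of parallel processings, and the consumer drains at its own pace.

```python
import queue
import threading

buf = queue.Queue()  # stand-in for the stream_buffer

def receiver():
    # In ubwa this would be the websocket receive path: enqueue and return.
    for i in range(100):
        buf.put(f"trade-{i}")

t = threading.Thread(target=receiver)
t.start()
t.join()

# A single consumer picks items up at its own speed; data is delayed, never lost.
drained = []
while not buf.empty():
    drained.append(buf.get())

print(len(drained))
```

Under load, the only cost of a slow consumer is buffer growth, not crashing streams.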
out of this a second advantage was created. if you are not able to save the data to a database (just an example) because the database is down, you can catch the database exception and use add_stream_data_to_stream_buffer() to put the data back into the stream_buffer. that way you can store gigabytes of data (cache it) till you move it forward again to your database.
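The put-it-back pattern might look like this sketch (save_to_db and DatabaseDown are hypothetical names; a deque emulates the stream_buffer, and appendleft() plays the role of add_stream_data_to_stream_buffer()):

```python
from collections import deque

stream_buffer = deque(["record-1", "record-2"])  # stand-in for the ubwa stream_buffer

class DatabaseDown(Exception):
    """Hypothetical error raised while the database is unreachable."""

def save_to_db(record):
    # Hypothetical sink that is currently down.
    raise DatabaseDown("db unavailable")

record = stream_buffer.popleft()
try:
    save_to_db(record)
except DatabaseDown:
    # Put the record back so nothing is lost while the database is down.
    stream_buffer.appendleft(record)

print(list(stream_buffer))
```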
also, debugging within the callback is not good, because other try-except blocks are catching errors from inside it. maybe it's worth improving that...
in this special case (DepthCache) we are not collecting trades. depth update rates are dynamic, but they will not exceed the configured updates-per-interval rate, and we need realtime...
for an update and switch to callback i need an update in ubwa to accept callbacks in create_streams().
If I do

```python
ubwa = BinanceWebSocketApiManager(process_stream_data=print_stream_data)
aggtrade_stream_id = ubwa.create_stream(["depth5@100ms"], ['btcusdt'])
```

isn't that a callback in create_stream()? (sorry, I may be confused, but I thought you directed me towards this for callbacks)
Thanks for these explanations. Can you also clarify how you manage the auto-reconnect? This is an important aspect for the user, to make sure it reconnects if needed (this is an issue with the legacy python-binance). Is the auto-reconnect managed the same way in callback / stream_buffer / local DepthCache mode?
Assuming ubwa in callback mode (using process_stream_data, if I understood well) handles reconnects properly, this is very fast. Because I am only monitoring the depth5@100ms of a couple of pairs and need real time, this may be the best solution for me. I hope it is not going to crash during busy periods. Would you recommend I wait for some implementation in ubldc, or can I go ahead with this approach without expecting too many issues? I will migrate tomorrow to your lib; the legacy python-binance is disconnecting randomly for me and the support is not active. Thanks for actively developing this alternative.
now you can set one global callback and each stream is using the same one...
i want to implement a parameter to provide an individual callback function for each stream.
the lib catches all errors and handles them :)
you can activate throwing an exception if a stream is not repairable. this means there is a wrong api key/secret; i think all other cases are caught.
if it's repairable, then a request for a new start of the stream is initiated and it gets started again. UBWA contains a parameter restart_timeout=6 .... if within 6 seconds a new stream isn't active, the restart gets initiated again.
the websocket sends a ping every 20 seconds, and if there is no answer within 20 seconds the connection gets closed. the closing handshake has a timeout of 10 seconds. these are the websockets defaults and i want to lower them a bit in UBWA.
Here in UBLDC i set them to ping_interval=1, ping_timeout=5 and close_timeout=0.1. That way the stream restarts very fast and sends the stream_signal "DISCONNECT". As soon as we receive this signal in UBLDC, we set the depth_cache to out of sync and start a new init, which waits till a new connection is established and the first diff depth update is received.
While it's out of sync, an exception gets thrown if you try to access depth_cache data.
just try it: start the ./dev_test.py file of this repo, unplug the internet on the pc and see what's happening.
stream_buffer/callback is not related to reconnect handling.
i think i implement this in the next couple of days.
just try it, it should be working! the behaviour depends on the type of channels (depth should be fine, it stays at a max of 10 receives per second, so it's not unboundedly affected by peaks), and it's very important what you do with the data after receiving it within the callback function. the faster it's finished, the better.
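One generic way (not ubwa-specific) to keep the callback cheap is to hand each record to a worker queue and return immediately; the heavy work then runs off the receive path:

```python
import queue
import threading

work = queue.Queue()
results = []

def on_stream_data(received_data):
    # Callback body stays minimal: enqueue and return.
    work.put(received_data)

def worker():
    while True:
        item = work.get()
        if item is None:  # sentinel: shut down
            break
        results.append(item)  # heavy processing would happen here

t = threading.Thread(target=worker)
t.start()
for i in range(10):
    on_stream_data({"seq": i})
work.put(None)
t.join()
print(len(results))
```

This keeps the receive side constant-time regardless of how expensive the downstream processing is.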
nice, welcome to the unicorns :) we have a telegram group... maybe you want to join!
getting feedback from the community is helping me. your suggestions are very helpful!
Thanks for these explanations; I will read them twice to make sure I don't miss anything. What is it that you're going to implement in the coming days?
I implemented the callback parameter for create_stream: https://unicorn-binance-websocket-api.docs.lucit.tech/CHANGELOG.html#id1
now i switch the depth_cache from stream_buffer to callback.
That's great, I have migrated my tool to BinanceWebSocketApiManager using process_stream_data (which, as I understand, is the callback mode). I am now waiting for the next crash of the legacy python-binance to switch the new tool into production!
In test mode everything seems to work fine.
Once you switch this lib to callback mode, is there an advantage to switching to it rather than using create_stream directly? What will I gain / lose with local-depth-cache?
yes, that's the callback for received records :)
this lib is still the same, just some milliseconds faster. the benefit of this lib is described here: https://github.com/LUCIT-Systems-and-Development/unicorn-binance-local-depth-cache#why-a-local-depth_cache
Ok I will do some speed comparisons as soon as it is ready! Thanks
the callback version is still not ready, but i made a few optimizations to ubwa and the stream_buffer version.
can you test 0.7.0 please?
I have tested the new 0.7.0 and again experience 20-50 ms latency vs python-binance. I am surprised because a previous version (the one without the time.sleep(0.01)) had only 5-20 ms latency, but that's what I get, unless I've mixed some versions. The new out-of-sync messages are very neat.
that's weird. but i fixed a wrong implementation of the asyncio loop in ubwa which caused instability. i expected a speed-up.
would you share with me your test script?
you can control the out-of-sync timeouts: play with the closing timeout (0.1), ping interval (1) and ping timeout (3).
My script is basic and my test is manual: I run the same thing in parallel with python-binance and compare the timestamps visually.
```python
import time
from unicorn_binance_local_depth_cache import BinanceLocalDepthCacheManager, DepthCacheOutOfSync

market = 'BTCUSDT'
a_prev = {}
ubldc = BinanceLocalDepthCacheManager(exchange="binance.com")
ubldc.create_depth_cache(markets=market, update_interval=100)
while True:
    if not ubldc.is_depth_cache_synchronized(market):
        print(f"is_synchronized: {ubldc.is_depth_cache_synchronized(market)}")
    try:
        a = {'bids': ubldc.get_bids(market=market)[:1],
             'asks': ubldc.get_asks(market=market)[:1]}
        if a != a_prev:
            print(a, time.time())
            a_prev = a
    except DepthCacheOutOfSync as error_msg:
        print(f"ERROR: {error_msg}")
```
Hello Oliver,
I am also experiencing problems using python-binance as well as Futures-for-Binance, so I'd like to give your library a try. Is there a reason your BinanceLocalDepthCacheManager does not support margin trading? And by the way, was the latency problem versus python-binance fixed? Thanks in advance and congratulations on a very nice library!
Kind regards, Andre
Thanks for the message that we are now equally fast! Maybe we are even faster now :)
Our very latest release, which will be published in the next few days, runs completely as a compiled C extension and uses this new UBWA interface, which is fast, robust and guarantees an ordered sequence of data in an asynchronous context (the best mix of speed and reliability in terms of data order and stability): https://github.com/LUCIT-Systems-and-Development/unicorn-binance-websocket-api?tab=readme-ov-file#or-await-the-stream-data-in-an-asyncio-coroutine
The advantage of UNICORN Binance Local Depth Cache arises above all when used in larger environments under high load.
It would be nice to get a new comparison test with python-binance and the binance-connector after the release.