
Optimize getting Price data

Open scientes opened this issue 3 years ago • 54 comments

Closes #14

This tries to fetch the price data in batches from a supported exchange (must be supported by ccxt). If this fails for certain data points, it does nothing and thus lets the current implementation try to fetch the price data. The current state works, but is not very well documented and may leave room for improvement in certain areas.

This was only tested using my personal dataset of transactions with Binance. Fetching batches from multiple exchanges is not supported yet.

This also produces a lot of log output because ccxt logs at logging.DEBUG. This should probably be fixed.
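The batch-with-fallback idea can be sketched as follows. This is a minimal illustration, not the PR's actual code: the helper names are invented, and the real implementation works against ccxt's fetch_ohlcv, whose rows have the shape [timestamp, open, high, low, close, volume].

```python
# Minimal sketch of the batched approach: one OHLCV batch covers many
# transactions, and each 1m candle is reduced to a single price.
# Helper names are invented; the PR's actual code differs.

def price_from_candle(candle):
    """Average of open and close of an OHLCV row
    [timestamp, open, high, low, close, volume]."""
    _ts, open_, _high, _low, close, _volume = candle
    return (open_ + close) / 2

def prices_for_timestamps(candles, timestamps):
    """Map each requested timestamp (ms) to the 1m candle covering it.
    Timestamps without a matching candle are left out, so the caller can
    fall back to the old per-transaction price lookup."""
    by_start = {c[0]: c for c in candles}
    prices = {}
    for ts in timestamps:
        minute_start = ts - ts % 60_000  # start of the covering 1m candle
        if minute_start in by_start:
            prices[ts] = price_from_candle(by_start[minute_start])
    return prices
```

Leaving unmatched timestamps out of the result (instead of raising) is what lets the existing implementation act as the fallback.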

Suggestions welcome.

scientes avatar Mar 05 '21 17:03 scientes

I sadly do not have the time to work on this further this month, but I'll try to finish it at the beginning of next month. If anybody wants to make changes that move this PR forward, go ahead.

scientes avatar Mar 10 '21 17:03 scientes

@provinzio I just pushed a PoC for a graph-based approach. It is currently standalone and not yet integrated into the actual program. It also does not look pretty and could have been coded better in some places. But I wanted to share my current progress so you could have a first look at it and find anything I missed.

Things I know that need improvement:

  • there is not much configuration possible yet
  • if an exchange does not support 1-week candles, the active timeframe of a pair cannot be determined and the pair is ignored; a fallback to 1d candles is needed here. Also, if the timeframe spans more than 1000 candles (19 years in 1w candles, 2.7 years in 1d candles), errors will be thrown
  • I wanted to sort the paths by volume, but that is not possible without requesting the data for all paths, which takes a long time: 1-2 s per path, and there are about 5-100 paths depending on the maximum path length and the number of pairs available
  • the pathfinding algorithm is a little wasteful because it calculates all possible paths, and it can be slow with many pairs (with 2800 pairs across 7 exchanges, it took about 3-4 s to find all paths)

But on the upside:

This should make the price calculation multi-exchange ready and, when implemented properly, should provide ample fallback paths so that the price calculation always works, even with every fiat currency.
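The graph idea above can be illustrated with a small self-contained sketch. The currency and exchange names are invented, and the PoC's actual code differs; this only shows the structure: currencies as nodes, traded pairs as edges tagged with their exchange.

```python
# Toy version of the graph-based approach: currencies are nodes, traded
# pairs are edges tagged with their exchange, and a price path is any
# cycle-free walk between two currencies. The pair data is invented.
from collections import defaultdict

def build_graph(pairs):
    graph = defaultdict(list)
    for base, quote, exchange in pairs:
        # a pair can be used in both directions (price vs. inverse price)
        graph[base].append((quote, exchange))
        graph[quote].append((base, exchange))
    return graph

def find_paths(graph, start, target, max_len=3):
    """Enumerate all cycle-free paths up to max_len hops (this is the
    'calculate all possible paths' behavior noted as wasteful above)."""
    paths = []
    stack = [(start, [])]
    while stack:
        node, path = stack.pop()
        if node == target:
            paths.append(path)
            continue
        if len(path) >= max_len:
            continue
        for neighbor, exchange in graph[node]:
            if neighbor != start and all(neighbor != n for n, _ in path):
                stack.append((neighbor, path + [(neighbor, exchange)]))
    return paths
```

Enumerating everything makes ample fallback paths available, at the cost of the runtime mentioned above; sorting or pruning the result is a separate concern.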

scientes avatar Apr 04 '21 20:04 scientes

I like the general idea and would love to see the end result. I wonder why ccxt does not support anything like this. Looks like a basic feature to me.

I wanted to sort the paths by volume, but that is not possible without requesting the data for all paths, which takes a long time: 1-2 s per path, and there are about 5-100 paths depending on the maximum path length and the number of pairs available

Are we able to get all pairs with one request or something similar? The algorithm might profit from fetching a lot of data and preprocessing it.

provinzio avatar Apr 05 '21 16:04 provinzio

I like the general idea and would love to see the end result. I wonder why ccxt does not support anything like this. Looks like a basic feature to me.

I wanted to sort the paths by volume, but that is not possible without requesting the data for all paths, which takes a long time: 1-2 s per path, and there are about 5-100 paths depending on the maximum path length and the number of pairs available

Are we able to get all pairs with one request or something similar? The algorithm might profit from fetching a lot of data and preprocessing it.

Well, yes, in a way: I can fetch per-exchange information about which pairs it offers, but I have not found a way to determine per pair from when to when it was active. Because of that, I currently request the weekly candles per pair to determine the timeframe in which the pair was traded; as a side bonus, I get the volume information per week.
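The weekly-candle trick described above boils down to something like the following sketch (the helper name is invented; rows follow ccxt's OHLCV shape [timestamp, open, high, low, close, volume]):

```python
# Sketch of deriving a pair's active trading period (and, as a side bonus,
# its total volume) from weekly OHLCV rows. This mirrors the idea described
# above, not the exact implementation.
WEEK_MS = 7 * 24 * 3600 * 1000

def active_range_and_volume(weekly_candles):
    """Return (first_ts, last_ts, total_volume) in ms, or None if the
    exchange returned no candles for the pair."""
    if not weekly_candles:
        return None
    first = weekly_candles[0][0]
    last = weekly_candles[-1][0] + WEEK_MS  # candle start plus one week
    volume = sum(row[5] for row in weekly_candles)
    return first, last, volume
```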

scientes avatar Apr 06 '21 07:04 scientes

I wonder why ccxt does not support anything like this. Looks like a basic feature to me.

Well, ccxt is supposed to be a library which standardizes the APIs of the exchanges for the end user, nothing more, nothing less, and I suppose that most exchanges do not provide data describing from when to when a pair was active.

I could, however, sort the pairs/paths by current volume, but that does not really help. Also, the volume feature is not really needed; it is a small side bonus to the actual purpose, which is to find a price for every trade, and I think my current implementation can achieve that.

scientes avatar Apr 06 '21 07:04 scientes

The current implementation is tested with my data and works for me. The initial pair takes some time to find the right path when we have old data (2017), but due to caching, it should be pretty fast after the initial pair.

scientes avatar Apr 08 '21 08:04 scientes

@scientes could you rebase your branch before I review?

provinzio avatar Apr 08 '21 08:04 provinzio

I hope I didn't mess anything up. I do not rebase that often ^^

scientes avatar Apr 08 '21 09:04 scientes

After the rebase I get this error:

  File "/home/bastian/Documents/git/CoinTaxman/src/main.py", line 20, in <module>
    from book import Book
  File "/home/bastian/Documents/git/CoinTaxman/src/book.py", line 25, in <module>
    import misc
  File "/home/bastian/Documents/git/CoinTaxman/src/misc.py", line 40, in <module>
    L = TypeVar("L", bound=list[Any])
TypeError: 'type' object is not subscriptable

scientes avatar Apr 08 '21 09:04 scientes

after rebase i get this error:

  File "/home/bastian/Documents/git/CoinTaxman/src/main.py", line 20, in <module>
    from book import Book
  File "/home/bastian/Documents/git/CoinTaxman/src/book.py", line 25, in <module>
    import misc
  File "/home/bastian/Documents/git/CoinTaxman/src/misc.py", line 40, in <module>
    L = TypeVar("L", bound=list[Any])
TypeError: 'type' object is not subscriptable

What is your Python version? This shouldn't be a problem with 3.9.
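For context: subscripting the builtin list in an expression evaluated at runtime (PEP 585) only works from Python 3.9 onward, which is exactly the TypeError in the traceback above. On 3.8 the typing module's spelling still works:

```python
# The traceback above comes from evaluating list[Any] at runtime on 3.8.
# PEP 585 made builtin generics subscriptable only from Python 3.9 on.
from typing import Any, List, TypeVar

# 3.9+ only:
#   L = TypeVar("L", bound=list[Any])
# Equivalent spelling that works on 3.8 and 3.9 alike:
L = TypeVar("L", bound=List[Any])
```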

provinzio avatar Apr 08 '21 09:04 provinzio

Well yeah, VS Code switched to 3.8.5 without me noticing.

scientes avatar Apr 08 '21 09:04 scientes

I am surprised that I cannot see the workflow checks directly in this PR... whatever :D Could you fix the checks? You can have a look at the Makefile to see how to run the formatter/linter.

You should run at least black and isort.

It would be awesome if you could fix the mypy errors, which help incredibly in finding bugs related to wrong types.

provinzio avatar Apr 08 '21 09:04 provinzio

Well, I get emails that the pipelines are running and failing (flake8 and mypy).

scientes avatar Apr 08 '21 10:04 scientes

i hope i didn't mess anything up. i do not rebase that often^^

I don't really get how you combined the branches. It looks like a rebase with an additional merge. The commit history tree looks really fucked up :D You might want to look into it to avoid unnoticed merging errors. It could be that your push after the rebase wasn't a git push --force but a merge onto the server... but for some weird reason the merging commit shows differences, which shouldn't be possible?

Rebasing should work with git rebase origin/main and git push --force to overwrite the server commit history.

provinzio avatar Apr 08 '21 10:04 provinzio

Well, I just used the rebase function in VS Code. And after that it showed 11 pulls from origin, dunno why.

scientes avatar Apr 08 '21 11:04 scientes

Ah, I missed the force, and that's why the merge :smile:

scientes avatar Apr 08 '21 11:04 scientes

The differences I do not understand.

scientes avatar Apr 08 '21 11:04 scientes

Well, I just used the rebase function in VS Code. And after that it showed 11 pulls from origin, dunno why.

I'd really like to keep a clean commit history. But I guess it already started when you merged the newest changes into your main instead of rebasing (a40e2ef7f8e0af5fc97b4394458882044cff59db).

But I don't want you to have a lot of work because of my stupid ideals. While fixing it, I even found an unwanted merging error. I fixed the history in my branch @scientes. It has the same data as 5278f0dbda6ef0c4cd3973802ab84aeb9fe457fa, which is the current head of your branch ohlcv-batch.

You could do (from your branch ohlcv-batch) git reset --hard origin/@scientes and git push --force to overwrite your branch with mine.

If you have more data right now, there are different ways we could bring them together. For example: commit all your changes to ohlcv-batch. Create a temporary branch to save this point: git checkout -b tmp. Do the reset. Cherry-pick the additional commits from your tmp branch onto the new ohlcv-batch.

Please excuse my stubbornness. I hope I don't scare you off, but we might profit from a clean commit history later on.


Edit: Btw, you can compare the branches with the excellent VS Code extension Git Tree Compare to see for yourself.

provinzio avatar Apr 08 '21 11:04 provinzio

I think git doesn't like the branch name @scientes. eToro and main work with reset, but @scientes fails with

git reset --hard  origin/@scientes
fatal: ambiguous argument 'origin/@scientes': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'

Edit: I forgot to fetch from all remotes (not only my fork).

scientes avatar Apr 08 '21 11:04 scientes

Now it's clean, thx.

scientes avatar Apr 08 '21 12:04 scientes

Now I'd like to know why GitHub Actions do not show on my pushes.

scientes avatar Apr 08 '21 12:04 scientes

I found something useful on the internet about it. The workflow was only triggered on pushes (and not on pull requests). This should fix it: 7bbe237c98f2de39bd9c4630b31116f081e4c1dc.

If you want, you can try to rebase onto the newest commit :p

provinzio avatar Apr 08 '21 12:04 provinzio

Btw, avoid # type: ignore for now. It's not bad if the check fails (for now); it's there to show you the errors :) In most cases it's possible to handle the error accordingly instead of just ignoring it. We should try to avoid ignoring wherever possible.

For example, in price_data:459 you say that timestamps is a list, but in 464 you define it as a generator, which is something completely different. Both are iterable, but you cannot iterate over a generator twice, nor can you use for example len(...).

Just remove timestamps: list = [] and discard the # type: ignore.


Edit: Btw, you don't have to write something like operations_grouped: dict = {}; mypy infers the dict by itself. But you might want to describe the dict in detail with dict[str, str] if necessary.

This shouldn't be necessary to fix mypy errors, but describing it with dict[str, str] can help to find some deeper hidden type errors and makes it easier to understand which values are supposed to be stored in the dict.


Edit 2: If you really think that a # type: ignore is mandatory, you should add a comment explaining why.
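The list-vs-generator point above can be shown in a few lines (the helper name is invented, not from the PR):

```python
# A generator is iterable, but it is exhausted after one pass and has no
# len() -- which is why annotating it as a list misleads both mypy and
# readers. The helper name is invented for illustration.

def to_milliseconds(seconds):
    return (int(s * 1000) for s in seconds)  # a generator, not a list

timestamps = to_milliseconds([1.0, 2.5])
assert list(timestamps) == [1000, 2500]
assert list(timestamps) == []  # a second pass yields nothing
# len(timestamps) would raise TypeError: object of type 'generator' has no len()
```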

provinzio avatar Apr 08 '21 12:04 provinzio

Looks like I don't need to rebase.

scientes avatar Apr 08 '21 15:04 scientes

Maybe we should think about either whitelisting E501 or, better, raising the maximum line length to something like 100.

I left two conversations open where I'm not sure if they count as resolved.

scientes avatar Apr 08 '21 15:04 scientes

Maybe we should think about either whitelisting E501 or, better, raising the maximum line length to something like 100.

Why do you think that? Long lines are cumbersome to read in a split editor or on a small laptop, and it's possible to keep your lines short without bigger problems. (example: https://github.com/provinzio/CoinTaxman/blob/main/src/book.py#L106)


Edit: I'd encourage you to use black as the default formatter (have a look in the settings) and run it on save (VS Code wiki). It keeps your code clean without you lifting a finger. It'll fix most of the flake8 errors, too.

provinzio avatar Apr 08 '21 15:04 provinzio

If you want, you can add me as a contributor to your fork. I could fix some mypy errors/linting and other small stuff while reviewing. I guess this will reduce the work for both of us.

I hope to get to it on the weekend.

provinzio avatar Apr 08 '21 15:04 provinzio

If you want, you can add me as a contributor to your fork. I could fix some mypy errors/linting and other small stuff while reviewing. I guess this will reduce the work for both of us.

I hope to get to it on the weekend.

You should be now.

scientes avatar Apr 08 '21 16:04 scientes

Maybe we should think about either whitelisting E501 or, better, raising the maximum line length to something like 100.

Why do you think that? Long lines are cumbersome to read in a split editor or on a small laptop, and it's possible to keep your lines short without bigger problems. (example: https://github.com/provinzio/CoinTaxman/blob/main/src/book.py#L106)

Edit: I'd encourage you to use black as the default formatter (have a look in the settings) and run it on save (VS Code wiki). It keeps your code clean without you lifting a finger. It'll fix most of the flake8 errors, too.

I currently had autopep8 as the formatter; changed it now.

scientes avatar Apr 08 '21 16:04 scientes

I had to follow this stackoverflow answer to install pycares which is required by ccxt.

We might want to say something about this in the README. I am not sure if this is mandatory or just a random problem of mine.

provinzio avatar Apr 08 '21 17:04 provinzio

I had to follow this stackoverflow answer to install pycares which is required by ccxt.

We might want to say something about this in the README. I am not sure if this is mandatory or just a random problem of mine.

It's not in the ccxt install instructions, but pycares is installed on my PC; I'm not sure if I installed it manually or not.

scientes avatar Apr 08 '21 17:04 scientes

We should also rename the variables; using a, b, c, ... is bad practice and just makes the code more difficult to understand/fix.

provinzio avatar Apr 10 '21 18:04 provinzio

We should also rename the variables; using a, b, c, ... is bad practice and just makes the code more difficult to understand/fix.

Yes, sorry for that ^^

scientes avatar Apr 10 '21 19:04 scientes

What do you think of using networkx for the shortest path calculation? We could initialize the graph with all pairs and calculate the shortest path for all required transactions with very little code.

provinzio avatar Apr 11 '21 14:04 provinzio


What do you think of using networkx for the shortest path calculation? We could initialize the graph with all pairs and calculate the shortest path for all required transactions with very little code.

Yes, that may be an improvement: using weighted paths (we can define the weight via a function, if I've seen that correctly) and something like Dijkstra.
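To make the trade-off concrete, here is a stdlib sketch of what nx.shortest_path(graph, source, target, weight="weight") would compute over the pair graph, written with heapq so it runs without networkx. The edge weights are invented placeholders; as discussed, they could later be derived from volume.

```python
# Dijkstra over {node: [(neighbor, weight), ...]} -- the same result
# networkx's weighted shortest_path would give on this graph.
import heapq

def cheapest_path(edges, source, target):
    """Return the cheapest path from source to target as a list of nodes,
    or None if target is unreachable."""
    heap = [(0.0, source, [source])]
    visited = set()
    while heap:
        cost, node, path = heapq.heappop(heap)
        if node == target:
            return path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, weight in edges.get(node, []):
            if neighbor not in visited:
                heapq.heappush(heap, (cost + weight, neighbor, path + [neighbor]))
    return None
```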

scientes avatar Apr 11 '21 14:04 scientes

I'm not sure how to implement the config for exchanges. Currently it is hardcoded in the config.py file, but ideally we should read the CSVs in book and add them dynamically to the exchanges used for the path calculation. How to do that neatly is currently above my head.

scientes avatar Apr 27 '21 20:04 scientes

Also, should I implement networkx, or should we go on with the custom graph?

scientes avatar May 02 '21 07:05 scientes

Also, should I implement networkx, or should we go on with the custom graph?

I'd prefer an additional class with the graph and shortest path algorithm. But why bother when it's already there?

A custom graph might be useful if we can tweak the performance noticeably; otherwise, sticking with networkx might save us some time.

provinzio avatar May 02 '21 08:05 provinzio

I've tested a few times with my own statements and fixed some bugs. But sadly I'm out of time for the next few months and wouldn't like this to go stale. The current state works fine but needs to be tested for edge cases (I get a few warnings where it wants to insert the same price twice for a timestamp), and I'd suggest merging this and creating issues for the still-open suggestions.

On the networkx idea: currently most of the time is spent on the rate limits. I even changed them so that the time we spend computing counts toward the rate limit wait duration, to speed things up a little. So as long as the graph operations are faster than the rate limit wait time, a performance improvement in the pathfinding does not affect overall performance. With the graphs, a large chunk of the graph compute time also depends on the rate limits of the exchanges, so the biggest improvement would be to reduce the number of API calls made, which is already done but could be better in some places.
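The rate-limit tweak described above (counting compute time toward the wait duration) can be sketched like this; it is an illustration of the idea, not the PR's exact code:

```python
# Instead of sleeping the full interval after every API call, subtract the
# time already spent on computation, so we only sleep when we are actually
# too fast for the exchange's rate limit.
import time

class RateLimiter:
    def __init__(self, min_interval):
        self.min_interval = min_interval  # seconds between API calls
        self.last_call = float("-inf")    # first call never waits

    def wait(self):
        elapsed = time.monotonic() - self.last_call
        if elapsed < self.min_interval:
            # sleep only for the part of the interval not already spent
            time.sleep(self.min_interval - elapsed)
        self.last_call = time.monotonic()
```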

scientes avatar Jun 15 '21 09:06 scientes

@provinzio what is still missing for a merge? I haven't looked into this project for a while and wanted to get this merge finished.

scientes avatar Nov 27 '21 10:11 scientes

Thanks for implementing ccxt. I think this is also really helpful for solving the other open issues. I've noticed one unintended behavior with the current state of this pull request: the price data is currently not drawn by default from the exchange that was used for trading.

Consider the following example Kraken CSV:

"txid","refid","time","type","subtype","aclass","asset","amount","fee","balance"
"","","2021-12-01 00:00:00","trade","","currency","ZEUR",-10.0,0.0,0.0
"","","2021-12-01 00:00:00","trade","","currency","ZUSD",10.0,0.0,0.0
"","","2021-12-01 00:00:01","trade","","currency","ZUSD",-10.0,0.0,0.0
"","","2021-12-01 00:00:01","trade","","currency","ADA",5.0,0.0,0.0

This results in the following output:

taxman       DEBUG    Starting evaluation...
price_data   INFO     getting data from 01-Dec-2021 (00:00) to 01-Dec-2021 (00:00) for USD
price_data   DEBUG    found path over ADA/USD (coinbasepro) -> ADA/EUR (binance)
price_data   INFO     getting data from 01-Dec-2021 (00:00) to 01-Dec-2021 (00:00) for ADA
price_data   DEBUG    found path over ADA/EUR (binance)

Therefore, instead of directly accessing ADA/USD from Kraken, price_data fetches the prices from Coinbase Pro and Binance instead. This could lead to different results, as prices may vary between exchanges.

Edit: This can be resolved by adding "kraken" to EXCHANGES in config.py: EXCHANGES = ["kraken", "binance", "coinbasepro"]. I suggest automatically defining the exchanges based on which CSV exports are present.
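The suggestion could look roughly like the following. Everything here is hypothetical: the file-naming convention, the detect_exchanges helper, and the hardcoded set of known exchanges are assumptions, not existing CoinTaxman code.

```python
# Hypothetical sketch: derive the EXCHANGES list from which CSV exports are
# present in the account statements folder instead of hardcoding it.
from pathlib import Path

KNOWN_EXCHANGES = {"binance", "coinbasepro", "kraken"}  # assumed set

def detect_exchanges(export_dir):
    """Return the sorted exchanges whose CSV exports exist in export_dir,
    assuming file names start with the exchange name."""
    found = set()
    for csv_file in Path(export_dir).glob("*.csv"):
        name = csv_file.stem.lower()
        for exchange in KNOWN_EXCHANGES:
            if name.startswith(exchange):
                found.add(exchange)
    return sorted(found)
```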

Griffsano avatar Dec 12 '21 13:12 Griffsano

I didn't implement the automatic detection because price_data is currently initialized before the book module, which means that I build the graph before I know which exchanges the CSVs are from.

scientes avatar Dec 13 '21 12:12 scientes

Ah, OK. What about a warning, then, when the CSV exports are read in and the respective exchange is not listed in config.py? Just to make the user aware that they could use additional APIs to get more accurate price data.

Edit: Resolved by https://github.com/scientes/CoinTaxman/pull/2

Griffsano avatar Dec 13 '21 12:12 Griffsano

closes #90

Griffsano avatar Dec 13 '21 13:12 Griffsano

Hey @scientes,

for the ohlcv-batch branch, I'm getting inaccurate price data for Kraken exports (see table below). The fetched prices are constant over multiple timestamps. I'm working on commit bbd8bb81b9c2219d0c4c96e43d32ef3fc2312772, and added "kraken" to the EXCHANGES list in config.py.

Console output (for each entry):

2021-12-28 17:58:15,444 price_data   INFO     getting data from 01-Nov-2021 (12:00) to 01-Nov-2021 (12:00) for BTC
2021-12-28 17:58:17,073 price_data   DEBUG    found path over BTC/EUR (kraken)

However, if you replace BTC with e.g. XXBT (the "dirty" Kraken coin name), the "old" API call is made and the prices look more realistic (they vary for each timestamp):

2021-12-28 17:54:11,570 price_data   INFO     getting data from 01-Nov-2021 (12:00) to 01-Nov-2021 (12:00) for XBT
...
2021-12-28 17:54:11,598 price_data   DEBUG    Querying trades for XXBTZEUR at 2021-11-02 12:00:00+00:00 (offset=10m): Calling https://api.kraken.com/0/public/Trades?pair=XXBTZEUR&since=1635853800000000000

Have you experienced something similar? Could it be that the resolution of the OHLCV is too low (e.g. only weekly data)?

Inaccurate price data in kraken.db, table BTC/EUR:

utc_time price
2021-11-01 12:00:00+00:00 43383.85
2021-11-02 12:00:00+00:00 43383.85
2021-11-03 12:00:00+00:00 43383.85
2021-11-04 12:00:00+00:00 43383.85
2021-11-05 12:00:00+00:00 43383.85
2021-11-06 12:00:00+00:00 43383.85
2021-11-07 12:00:00+00:00 43383.85
2021-11-08 12:00:00+00:00 43356.85
2021-11-09 12:00:00+00:00 43356.85
2021-11-10 12:00:00+00:00 43356.85
2021-11-11 12:00:00+00:00 43356.85
2021-11-12 12:00:00+00:00 43356.85
2021-11-13 12:00:00+00:00 43356.85
2021-11-14 12:00:00+00:00 43356.85
2021-11-15 12:00:00+00:00 43356.85
2021-11-16 12:00:00+00:00 43356.85
2021-11-17 12:00:00+00:00 43356.85
2021-11-18 12:00:00+00:00 43356.85
2021-11-19 12:00:00+00:00 43397.3
2021-11-20 12:00:00+00:00 43397.3
2021-11-21 12:00:00+00:00 43397.3
2021-11-22 12:00:00+00:00 43397.3
2021-11-23 12:00:00+00:00 43397.3
2021-11-24 12:00:00+00:00 43397.3
2021-11-25 12:00:00+00:00 43397.3
2021-11-26 12:00:00+00:00 43397.3
2021-11-27 12:00:00+00:00 43397.3
2021-11-28 12:00:00+00:00 43397.3
2021-11-29 12:00:00+00:00 43397.3
2021-11-30 12:00:00+00:00 43394.15

Dummy CSV export file:

"txid","refid","time","type","subtype","aclass","asset","amount","fee","balance"
"","","2021-11-01 12:00:00","receive","","currency","BTC",1.0,0.0,0.0
"","","2021-11-01 12:00:00","spend","","currency","ZEUR",-1.0,0.0,0.0
"","","2021-11-02 12:00:00","spend","","currency","BTC",-1.0,0.0,0.0
"","","2021-11-02 12:00:00","receive","","currency","ZEUR",1.0,0.0,0.0
"","","2021-11-03 12:00:00","receive","","currency","BTC",1.0,0.0,0.0
"","","2021-11-03 12:00:00","spend","","currency","ZEUR",-1.0,0.0,0.0
"","","2021-11-04 12:00:00","spend","","currency","BTC",-1.0,0.0,0.0
"","","2021-11-04 12:00:00","receive","","currency","ZEUR",1.0,0.0,0.0
"","","2021-11-05 12:00:00","receive","","currency","BTC",1.0,0.0,0.0
"","","2021-11-05 12:00:00","spend","","currency","ZEUR",-1.0,0.0,0.0
"","","2021-11-06 12:00:00","spend","","currency","BTC",-1.0,0.0,0.0
"","","2021-11-06 12:00:00","receive","","currency","ZEUR",1.0,0.0,0.0
"","","2021-11-07 12:00:00","receive","","currency","BTC",1.0,0.0,0.0
"","","2021-11-07 12:00:00","spend","","currency","ZEUR",-1.0,0.0,0.0
"","","2021-11-08 12:00:00","spend","","currency","BTC",-1.0,0.0,0.0
"","","2021-11-08 12:00:00","receive","","currency","ZEUR",1.0,0.0,0.0
"","","2021-11-09 12:00:00","receive","","currency","BTC",1.0,0.0,0.0
"","","2021-11-09 12:00:00","spend","","currency","ZEUR",-1.0,0.0,0.0
"","","2021-11-10 12:00:00","spend","","currency","BTC",-1.0,0.0,0.0
"","","2021-11-10 12:00:00","receive","","currency","ZEUR",1.0,0.0,0.0
"","","2021-11-11 12:00:00","receive","","currency","BTC",1.0,0.0,0.0
"","","2021-11-11 12:00:00","spend","","currency","ZEUR",-1.0,0.0,0.0
"","","2021-11-12 12:00:00","spend","","currency","BTC",-1.0,0.0,0.0
"","","2021-11-12 12:00:00","receive","","currency","ZEUR",1.0,0.0,0.0
"","","2021-11-13 12:00:00","receive","","currency","BTC",1.0,0.0,0.0
"","","2021-11-13 12:00:00","spend","","currency","ZEUR",-1.0,0.0,0.0
"","","2021-11-14 12:00:00","spend","","currency","BTC",-1.0,0.0,0.0
"","","2021-11-14 12:00:00","receive","","currency","ZEUR",1.0,0.0,0.0
"","","2021-11-15 12:00:00","receive","","currency","BTC",1.0,0.0,0.0
"","","2021-11-15 12:00:00","spend","","currency","ZEUR",-1.0,0.0,0.0
"","","2021-11-16 12:00:00","spend","","currency","BTC",-1.0,0.0,0.0
"","","2021-11-16 12:00:00","receive","","currency","ZEUR",1.0,0.0,0.0
"","","2021-11-17 12:00:00","receive","","currency","BTC",1.0,0.0,0.0
"","","2021-11-17 12:00:00","spend","","currency","ZEUR",-1.0,0.0,0.0
"","","2021-11-18 12:00:00","spend","","currency","BTC",-1.0,0.0,0.0
"","","2021-11-18 12:00:00","receive","","currency","ZEUR",1.0,0.0,0.0
"","","2021-11-19 12:00:00","receive","","currency","BTC",1.0,0.0,0.0
"","","2021-11-19 12:00:00","spend","","currency","ZEUR",-1.0,0.0,0.0
"","","2021-11-20 12:00:00","spend","","currency","BTC",-1.0,0.0,0.0
"","","2021-11-20 12:00:00","receive","","currency","ZEUR",1.0,0.0,0.0
"","","2021-11-21 12:00:00","receive","","currency","BTC",1.0,0.0,0.0
"","","2021-11-21 12:00:00","spend","","currency","ZEUR",-1.0,0.0,0.0
"","","2021-11-22 12:00:00","spend","","currency","BTC",-1.0,0.0,0.0
"","","2021-11-22 12:00:00","receive","","currency","ZEUR",1.0,0.0,0.0
"","","2021-11-23 12:00:00","receive","","currency","BTC",1.0,0.0,0.0
"","","2021-11-23 12:00:00","spend","","currency","ZEUR",-1.0,0.0,0.0
"","","2021-11-24 12:00:00","spend","","currency","BTC",-1.0,0.0,0.0
"","","2021-11-24 12:00:00","receive","","currency","ZEUR",1.0,0.0,0.0
"","","2021-11-25 12:00:00","receive","","currency","BTC",1.0,0.0,0.0
"","","2021-11-25 12:00:00","spend","","currency","ZEUR",-1.0,0.0,0.0
"","","2021-11-26 12:00:00","spend","","currency","BTC",-1.0,0.0,0.0
"","","2021-11-26 12:00:00","receive","","currency","ZEUR",1.0,0.0,0.0
"","","2021-11-27 12:00:00","receive","","currency","BTC",1.0,0.0,0.0
"","","2021-11-27 12:00:00","spend","","currency","ZEUR",-1.0,0.0,0.0
"","","2021-11-28 12:00:00","spend","","currency","BTC",-1.0,0.0,0.0
"","","2021-11-28 12:00:00","receive","","currency","ZEUR",1.0,0.0,0.0
"","","2021-11-29 12:00:00","receive","","currency","BTC",1.0,0.0,0.0
"","","2021-11-29 12:00:00","spend","","currency","ZEUR",-1.0,0.0,0.0
"","","2021-11-30 12:00:00","spend","","currency","BTC",-1.0,0.0,0.0
"","","2021-11-30 12:00:00","receive","","currency","ZEUR",1.0,0.0,0.0

Griffsano avatar Dec 28 '21 17:12 Griffsano

Could it be that the resolution of the OHLCV is too low (e.g. only weekly data)?

Shouldn't be. The weekly calls are only used to determine the time period in which the pair was traded on the platform. For the actual data, 1m candles are used.

However, if you replace BTC by e.g. XXBT (the "dirty" Kraken coin name), the "old" API call is made and the prices look more realistic (they vary for each timestamp):

Weird; normally ccxt should convert the symbols automatically to the exchange variant. But maybe there's some bug or wrong implementation on our side.

I'll look into it tomorrow.

scientes avatar Dec 28 '21 18:12 scientes

Hey @scientes,

for the ohlcv-batch branch, I'm getting inaccurate price data for Kraken exports (see table below). The fetched prices are constant over multiple timestamps. [...]
This error is extremely weird and Kraken-specific, I think: the price data reported by Kraken through ccxt is off.

start,stop,candle_ts,symbol,open,high,low,close,avg
1635768000000,1635768000000,1640923200000,BTC/EUR,41741.1,41741.1,41741.0,41741.0,41741.05
1635768000000,1635768000000,1640923260000,BTC/EUR,41742.1,41742.1,41741.0,41741.1,41741.6
1635854400000,1635854400000,1640923200000,BTC/EUR,41741.1,41741.1,41741.0,41741.0,41741.05
1635854400000,1635854400000,1640923260000,BTC/EUR,41742.1,41742.1,41741.0,41741.1,41741.6
1635940800000,1635940800000,1640923200000,BTC/EUR,41741.1,41741.1,41741.0,41741.0,41741.05
1635940800000,1635940800000,1640923260000,BTC/EUR,41742.1,41742.1,41741.0,41741.1,41741.6
1636027200000,1636027200000,1640923200000,BTC/EUR,41741.1,41741.1,41741.0,41741.0,41741.05
1636027200000,1636027200000,1640923260000,BTC/EUR,41742.1,41742.1,41741.0,41741.1,41741.6
1636113600000,1636113600000,1640923200000,BTC/EUR,41741.1,41741.1,41741.0,41741.0,41741.05
1636113600000,1636113600000,1640923260000,BTC/EUR,41742.1,41742.1,41741.0,41741.1,41741.6
1636200000000,1636200000000,1640923200000,BTC/EUR,41741.1,41741.1,41741.0,41741.0,41741.05
1636200000000,1636200000000,1640923260000,BTC/EUR,41742.1,41742.1,41741.0,41741.1,41741.6
1636286400000,1636286400000,1640923200000,BTC/EUR,41741.1,41741.1,41741.0,41741.0,41741.05
1636286400000,1636286400000,1640923260000,BTC/EUR,41742.1,41742.1,41741.0,41741.1,41741.6
1636372800000,1636372800000,1640923260000,BTC/EUR,41742.1,41742.1,41741.0,41741.1,41741.6
1636372800000,1636372800000,1640923320000,BTC/EUR,41741.0,41741.0,41741.0,41741.0,41741.0
1636459200000,1636459200000,1640923260000,BTC/EUR,41742.1,41742.1,41741.0,41741.1,41741.6
1636459200000,1636459200000,1640923320000,BTC/EUR,41741.0,41741.0,41741.0,41741.0,41741.0
1636545600000,1636545600000,1640923260000,BTC/EUR,41742.1,41742.1,41741.0,41741.1,41741.6
1636545600000,1636545600000,1640923320000,BTC/EUR,41741.0,41741.0,41741.0,41741.0,41741.0
1636632000000,1636632000000,1640923260000,BTC/EUR,41742.1,41742.1,41741.0,41741.1,41741.6
1636632000000,1636632000000,1640923320000,BTC/EUR,41741.0,41741.0,41741.0,41741.0,41741.0
1636718400000,1636718400000,1640923260000,BTC/EUR,41742.1,41742.1,41741.0,41741.1,41741.6
1636718400000,1636718400000,1640923320000,BTC/EUR,41741.0,41741.0,41741.0,41741.0,41741.0
1636804800000,1636804800000,1640923260000,BTC/EUR,41742.1,41742.1,41741.0,41741.1,41741.6
1636804800000,1636804800000,1640923320000,BTC/EUR,41741.0,41741.0,41741.0,41741.0,41741.0
1636891200000,1636891200000,1640923260000,BTC/EUR,41742.1,41742.1,41741.0,41741.1,41741.6
1636891200000,1636891200000,1640923320000,BTC/EUR,41741.0,41741.0,41741.0,41741.0,41741.0
1636977600000,1636977600000,1640923260000,BTC/EUR,41742.1,41742.1,41741.0,41741.1,41741.6
1636977600000,1636977600000,1640923320000,BTC/EUR,41741.0,41741.0,41741.0,41741.0,41741.0
1637064000000,1637064000000,1640923260000,BTC/EUR,41742.1,41742.1,41741.0,41741.1,41741.6
1637064000000,1637064000000,1640923320000,BTC/EUR,41741.0,41741.0,41741.0,41741.0,41741.0
1637150400000,1637150400000,1640923260000,BTC/EUR,41742.1,41742.1,41741.0,41741.1,41741.6
1637150400000,1637150400000,1640923320000,BTC/EUR,41741.0,41741.0,41741.0,41741.0,41741.0
1637236800000,1637236800000,1640923320000,BTC/EUR,41741.0,41741.0,41741.0,41741.0,41741.0
1637236800000,1637236800000,1640923380000,BTC/EUR,41741.1,41741.1,41741.0,41741.0,41741.05
1637323200000,1637323200000,1640923320000,BTC/EUR,41741.0,41741.0,41741.0,41741.0,41741.0
1637323200000,1637323200000,1640923380000,BTC/EUR,41741.1,41741.1,41741.0,41741.0,41741.05
1637409600000,1637409600000,1640923320000,BTC/EUR,41741.0,41741.0,41741.0,41741.0,41741.0
1637409600000,1637409600000,1640923380000,BTC/EUR,41741.1,41741.1,41741.0,41741.0,41741.05
1637496000000,1637496000000,1640923320000,BTC/EUR,41741.0,41741.0,41741.0,41741.0,41741.0
1637496000000,1637496000000,1640923380000,BTC/EUR,41741.1,41741.1,41741.0,41741.0,41741.05
1637582400000,1637582400000,1640923320000,BTC/EUR,41741.0,41741.0,41741.0,41741.0,41741.0
1637582400000,1637582400000,1640923380000,BTC/EUR,41741.1,41741.1,41741.0,41741.0,41741.05
1637668800000,1637668800000,1640923320000,BTC/EUR,41741.0,41741.0,41741.0,41741.0,41741.0
1637668800000,1637668800000,1640923380000,BTC/EUR,41741.1,41741.1,41741.0,41741.0,41741.05
1637755200000,1637755200000,1640923320000,BTC/EUR,41741.0,41741.0,41741.0,41741.0,41741.0
1637755200000,1637755200000,1640923380000,BTC/EUR,41741.1,41741.1,41741.0,41741.0,41741.05
1637841600000,1637841600000,1640923320000,BTC/EUR,41741.0,41741.0,41741.0,41741.0,41741.0
1637841600000,1637841600000,1640923380000,BTC/EUR,41741.1,41741.1,41741.0,41741.0,41741.05
1637928000000,1637928000000,1640923320000,BTC/EUR,41741.0,41741.0,41741.0,41741.0,41741.0
1637928000000,1637928000000,1640923380000,BTC/EUR,41741.1,41741.1,41741.0,41741.0,41741.05
1638014400000,1638014400000,1640923320000,BTC/EUR,41741.0,41741.0,41741.0,41741.0,41741.0
1638014400000,1638014400000,1640923380000,BTC/EUR,41741.1,41741.1,41741.0,41741.0,41741.05
1638100800000,1638100800000,1640923320000,BTC/EUR,41741.0,41741.0,41741.0,41741.0,41741.0
1638100800000,1638100800000,1640923380000,BTC/EUR,41741.1,41741.1,41741.0,41741.0,41741.05
1638187200000,1638187200000,1640923380000,BTC/EUR,41741.1,41741.1,41741.0,41741.0,41741.05
1638187200000,1638187200000,1640923440000,BTC/EUR,41741.1,41741.1,41692.4,41720.3,41730.7
1638273600000,1638273600000,1640923380000,BTC/EUR,41741.1,41741.1,41741.0,41741.0,41741.05
1638273600000,1638273600000,1640923440000,BTC/EUR,41741.1,41741.1,41692.4,41720.3,41730.7

As you can see, the requested time frame and the timestamp of the candle reported back differ significantly. More interestingly, the reported timestamps are never much older than about 12 hours before execution time. Until I figure out how to get older data from Kraken, the best solution is probably to not use Kraken for price data at all for now.
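To make this kind of mismatch easy to spot, a sanity check could be run over every ccxt OHLCV response before caching it. The helper below is a hypothetical sketch, not part of this PR; the timestamps in the example are taken from the first row of the table above:

```python
def candle_in_window(candle_ts_ms: int, since_ms: int,
                     timeframe_ms: int, limit: int) -> bool:
    """Return True if a candle timestamp lies inside the requested window.

    ccxt's fetch_ohlcv(symbol, timeframe, since, limit) should only
    return candles stamped between `since` and `since + timeframe * limit`.
    """
    return since_ms <= candle_ts_ms < since_ms + timeframe_ms * limit


# We asked for 1m candles starting at 1635768000000 (2021-11-01), but
# Kraken returned a candle stamped 1640923200000 (2021-12-31) -- far
# outside any sane window, so the check flags it.
since = 1635768000000
returned = 1640923200000
print(candle_in_window(returned, since, 60_000, 720))  # -> False
```

A check like this would let the fetcher fall back to another exchange instead of silently caching prices from the wrong date.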

scientes avatar Dec 31 '21 16:12 scientes

https://www.freqtrade.io/en/stable/exchanges/#historic-kraken-data

The Kraken API does only provide 720 historic candles, which is sufficient for Freqtrade dry-run and live trade modes, but is a problem for backtesting. To download data for the Kraken exchange, using --dl-trades is mandatory, otherwise the bot will download the same 720 candles over and over, and you'll not have enough backtest data.

Using trades instead of candles is another approach I wanted to implement as an alternative in the future anyway, but I'd rather not do that in this PR; it has taken too long already.
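As a rough sketch of what such a trade-based fallback could look like: fetch raw trades via ccxt's `fetch_trades()` and aggregate them into a volume-weighted average price. The `vwap` helper below is hypothetical and untested against real Kraken data; the trade dicts mirror the `price`/`amount` keys of ccxt's unified trade structure:

```python
def vwap(trades: list[dict]) -> float:
    """Volume-weighted average price over a list of ccxt-style trades.

    Each trade dict is expected to carry 'price' and 'amount' keys,
    as returned by ccxt's fetch_trades().
    """
    total_cost = sum(t["price"] * t["amount"] for t in trades)
    total_amount = sum(t["amount"] for t in trades)
    if total_amount == 0:
        raise ValueError("no volume in the given trades")
    return total_cost / total_amount


# Synthetic example (not real Kraken data):
trades = [
    {"price": 41741.0, "amount": 0.5},
    {"price": 41743.0, "amount": 1.5},
]
print(vwap(trades))  # -> 41742.5
```

Since Kraken's trade history reaches much further back than its 720-candle OHLCV limit, aggregating trades around the transaction timestamp would sidestep the limitation entirely.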

scientes avatar Dec 31 '21 16:12 scientes

@provinzio I think the current state is merge-ready. What do you think?

scientes avatar Dec 31 '21 16:12 scientes

Would be great to have this available in a merged fashion together with the latest fix for Kraken (#100) on main. Currently, I also get stuck on my analysis of Binance trades (a recursion problem and some buy/sell issue).

oldgitdaddy avatar Jan 03 '22 10:01 oldgitdaddy

Would be great to have this available in a merged fashion together with the latest fix for Kraken (#100) on main. Currently, I also get stuck on my analysis of Binance trades (a recursion problem and some buy/sell issue).

I'll try to work on it as soon as possible; this is definitely an awesome PR. I appreciate everybody who helps review the code. Feel free to give your feedback as well.

provinzio avatar Jan 03 '22 17:01 provinzio

Just noticed that ccxt has a very fast release cycle, with almost hourly package updates. I'd suggest unpinning the ccxt version in requirements.txt so that the newest version of ccxt is always installed when CoinTaxman is installed.
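For illustration, a relaxed pin in requirements.txt could look like this (the lower bound shown is only an example, not a tested minimum):

```
# requirements.txt -- allow newer ccxt releases instead of pinning one exact version
ccxt>=1.60.0
```

A lower bound keeps known-incompatible old releases out while still picking up ccxt's frequent updates on each install.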

https://github.com/ccxt/ccxt/tags

scientes avatar Jan 21 '22 21:01 scientes

Just to let you know, I've tested this PR with my Binance/Coinbase logs and it worked flawlessly :) Awesome graph-theory-based path-finding approach, thanks for implementing!

Griffsano avatar Jan 26 '22 20:01 Griffsano

  • [ ] Check points from #111

provinzio avatar Jun 04 '22 10:06 provinzio