crypto_history not working

Open augur2 opened this issue 3 years ago • 16 comments

The function crypto_history is not working anymore

I think the problem is, that coinmarketcap changed their website and now the scraper function inside the crypto_history function reads the wrong table on the website:

library(crypto)
crypto_history(c("ETH"), start_date = 20201005)

Error in names(x) <- value : 'names' attribute [11] must be the same length as the vector [6]
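For reference, R raises this error when more column names are assigned than the table has columns, which fits the wrong-table theory. Below is a minimal sketch, assuming the scraper picked up a 6-column table and tried to apply 11 names. The data frame and the names are made up for illustration and are not taken from the package.

```r
# Hypothetical stand-in for the table the scraper picked up: 6 columns.
scraped <- as.data.frame(matrix(0, nrow = 2, ncol = 6))

# Hypothetical set of 11 column names that the code expects.
expected_names <- paste0("col", 1:11)

# Assigning 11 names to 6 columns fails; capture the message instead of stopping.
msg <- tryCatch(
  {
    colnames(scraped) <- expected_names
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)

# Checking the lengths before renaming makes the mismatch explicit.
names_match <- length(expected_names) == ncol(scraped)
print(names_match)
```

A length check like `names_match` before renaming would let a scraper fail with a clearer message when the site layout changes.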

augur2 avatar Nov 04 '20 18:11 augur2

wrote my own function:

scrape_coinmarketcap.txt

augur2 avatar Nov 05 '20 14:11 augur2

Same issue, with a slightly different function call:

library("crypto")
df_history <- crypto_history(limit=20)

Error in names(x) <- value : 'names' attribute [11] must be the same length as the vector [6]

jas1 avatar Nov 11 '20 02:11 jas1

Yep, came here for the same issue. Augur2, great function! Thank you! How were you able to diagnose that it was reading the wrong table on coinmarketcap?

Can anyone suggest another package with similar functionality?

KyleBenzle avatar Nov 11 '20 04:11 KyleBenzle

Thanks KyleBenzle!

I went to 'Inspect element' in my browser and searched for some numbers that I saw in the table...
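The same check can be done from R by listing every table on the page and looking at its shape. This is a minimal sketch, assuming the rvest package is installed. It uses a made-up inline HTML page with dummy values instead of the real site, so the table contents and column names here are illustrative only.

```r
library(rvest)

# Hypothetical page with two tables, standing in for the real site.
page <- read_html(paste0(
  "<html><body>",
  "<table><tr><th>Rank</th><th>Name</th></tr>",
  "<tr><td>1</td><td>Bitcoin</td></tr></table>",
  "<table><tr><th>Date</th><th>Open</th><th>High</th><th>Low</th><th>Close</th></tr>",
  "<tr><td>2020-11-01</td><td>1</td><td>2</td><td>3</td><td>4</td></tr></table>",
  "</body></html>"
))

# Parse every <table> on the page and count the columns in each.
tables <- html_table(html_nodes(page, "table"))
col_counts <- vapply(tables, ncol, integer(1))
print(col_counts)

# Select the table by its header rather than by position on the page.
has_close <- vapply(tables, function(t) "Close" %in% names(t), logical(1))
history <- tables[[which(has_close)[1]]]
print(names(history))
```

Selecting by header instead of by position means an extra table added to the page is less likely to break the scraper.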

Maybe this is an alternative / more elegant approach:

https://stackoverflow.com/questions/64761914/scraping-historical-data-from-coinmarketcap

augur2 avatar Nov 12 '20 14:11 augur2

Thanks augur2! Noting for others that the input for currency is now the symbol, not the actual name (e.g. "Bitcoin" is now "BTC" when using this function).

isaaczhao23 avatar Nov 15 '20 16:11 isaaczhao23

updated the function:

scrape_coinmarketcap.txt

augur2 avatar Nov 18 '20 10:11 augur2

See also here for a similar solution ... and much more 👍

https://github.com/deanfantazzini/bitcoinFinance

deanfantazzini avatar Nov 24 '20 12:11 deanfantazzini

Sorry guys, I've been sidelined with work priorities and haven't really given this package the time I should have.

This is pretty much the godsend for working out the classes you need to extract https://selectorgadget.com/...

I'll try to have a look at what's happening.

JesseVent avatar Nov 24 '20 12:11 JesseVent

I just pushed a change that pretty much just aligns it to what you were doing @augur2, so thanks for that. Please test it out by installing the latest version: devtools::install_github("jessevent/crypto")

Thanks

JesseVent avatar Nov 28 '20 10:11 JesseVent

Not working again...

Getting the following error:

histeth <- crypto_history(c("ETH"), start_date = 20201129)

♥ If this helps you become rich please consider donating

ERC-20: 0x375923Bf82F0b728d23A5704261a6e16341fd860
XRP: rK59semLsuJZEWftxBFhWuNE6uhznjz2bK

Scraping historical crypto data

Error in table$props$initialState$cryptocurrency$ohlcvHistorical[[1]] : subscript out of bounds

Seems like CMC changed their website again.
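A "subscript out of bounds" error like this appears when `[[1]]` is applied to an empty list. Below is a minimal sketch of a more defensive lookup, assuming the parsed page state still has the same nesting but the history list comes back empty. The `table` object and the `get_nested` helper are hypothetical and not part of the crypto package.

```r
# Hypothetical parsed page state where the expected list is present but empty.
table <- list(props = list(initialState = list(cryptocurrency = list(
  ohlcvHistorical = list()
))))

# Direct indexing reproduces the failure; capture the message instead of stopping.
msg <- tryCatch(
  table$props$initialState$cryptocurrency$ohlcvHistorical[[1]],
  error = function(e) conditionMessage(e)
)
print(msg)

# Hypothetical helper: walk the path and return NULL if any step is missing.
get_nested <- function(x, path) {
  for (key in path) {
    if (!is.list(x) || is.null(x[[key]])) return(NULL)
    x <- x[[key]]
  }
  x
}

history <- get_nested(
  table,
  c("props", "initialState", "cryptocurrency", "ohlcvHistorical")
)

# Report a readable status instead of failing deep inside the scraper.
if (is.null(history) || length(history) == 0) {
  status <- "no historical data found; page layout may have changed"
} else {
  status <- "ok"
}
print(status)
```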

augur2 avatar Dec 29 '20 15:12 augur2

As I wrote on my Github page for the R package bitcoinFinance, unfortunately, it is clear that coinmarketcap wants to monetize their historical data and they are pushing people to subscribe to their commercial API. This is the third (!!!) change of their website for historical data in less than 1 year. Currently, I have no time to deal with it. If someone can find a solution, please post it here, or on the Github page of my package bitcoinFinance. Thanks!

deanfantazzini avatar Dec 29 '20 17:12 deanfantazzini

Hey all,

I was able to come up with a solution to the discussed problem. Long story short: I used coingecko.com instead of coinmarketcap.com. I'll try to open a PR to see if @JesseVent will incorporate my changes. In the meantime, feel free to use the code chunk below. You'll need the following packages to make it work: tidyverse, rvest, janitor, and timetk. I'm working on making it scalable to more than a single coin and I'll update my code when I get it working.

library(tidyverse)
library(rvest)
# timetk provides the %-time% operator used in the default start_date below
suppressPackageStartupMessages(library(timetk))

currency_list <- c("bitcoin")

# Build the CoinGecko historical-data URL for one coin and date range
create_url <- function(currency, start_date = Sys.Date() %-time% "1 year", end_date = Sys.Date()) {
  
  page <- str_c(
    "https://www.coingecko.com/en/coins/",
    currency,
    "/historical_data/usd?",
    "end_date=",
    end_date,
    "&start_date=",
    start_date
  )
  
  return(page)
  
}

# Read the first table on the page and keep the date and closing price
scrape_data <- function(page, currency) {
  
  data <- page %>% 
    read_html() %>% 
    html_node(xpath = '//table') %>% 
    html_table() %>% 
    as_tibble() %>% 
    janitor::clean_names() %>% 
    mutate(
      currency = currency,
      # Convert formatted strings such as "$1,234.56" to numbers
      across(market_cap:close, parse_number)
    ) %>% 
    select(currency, date, close) %>% 
    drop_na()
  
  return(data)
  
}

# Works for a single coin only: the currency label is hard-coded here
currency_list %>% 
  map_chr(create_url) %>% 
  map_df(scrape_data, currency = "bitcoin")
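One possible way to extend this to several coins is to build each URL and label each result from the same coin identifier, rather than hard-coding the label. This is only a sketch under stated assumptions: `build_url`, `scrape_many`, and `fake_fetch` are hypothetical names, the stub makes no network request, and "ethereum" is assumed to be a valid CoinGecko coin identifier.

```r
# Base-R version of the URL builder, so this sketch runs without extra packages.
build_url <- function(currency, start_date, end_date) {
  paste0(
    "https://www.coingecko.com/en/coins/", currency,
    "/historical_data/usd?end_date=", format(end_date),
    "&start_date=", format(start_date)
  )
}

# Hypothetical driver: fetch_fun is whatever function scrapes a single page.
scrape_many <- function(coins, start_date, end_date, fetch_fun) {
  pieces <- lapply(coins, function(coin) {
    # The same identifier builds the URL and labels the rows.
    fetch_fun(build_url(coin, start_date, end_date), currency = coin)
  })
  do.call(rbind, pieces)
}

# Stub scraper for illustration only; it makes no network request.
fake_fetch <- function(page, currency) {
  data.frame(currency = currency, url = page, stringsAsFactors = FALSE)
}

result <- scrape_many(
  c("bitcoin", "ethereum"),
  start_date = as.Date("2020-01-01"),
  end_date = as.Date("2020-12-31"),
  fetch_fun = fake_fetch
)
print(result$currency)
```

With the functions from the comment above, passing `scrape_data` as `fetch_fun` should work in principle, since it takes the same `page` and `currency` arguments.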

realauggieheschmeyer avatar Dec 31 '20 22:12 realauggieheschmeyer

Thanks realauggieheschmeyer.

To make your solution work, you'll also need the packages "timetk" and "janitor".

I don't know if Coingecko is the right source, because they only have "open" and "close" but not "high" and "low"...

augur2 avatar Jan 01 '21 13:01 augur2

Found this: https://stackoverflow.com/questions/65514076/getting-no-data-when-scraping-a-table

You can scrape the CMC data for free without registration via their API.

Maybe this can be implemented in the crypto_history function...

augur2 avatar Jan 06 '21 21:01 augur2

I corrected the "download_coinmarketcap_daily" function in my package bitcoinFinance, and I added another function named "bitcoincharts_download_large", which is specifically suited for large files from bitcoincharts: https://github.com/deanfantazzini/bitcoinFinance

deanfantazzini avatar Apr 05 '21 21:04 deanfantazzini