Cannot Open URL

Open georgeaj opened this issue 5 years ago • 25 comments

No finreportr functions work when year = 2019. I have tested multiple companies and multiple years; the problem is not company-specific and only occurs when year = 2019.

GetBalanceSheet('GOOG', 2019)

Error in fileFromCache(file) : Error in download.file(file, cached.file, quiet = !verbose) : cannot open URL 'https://www.sec.gov/Archives/edgar/data/1652044/000165204419000004/https://xbrl.sec.gov/dei/2018/dei-2018-01-31.xsd'

Session Info:

R version 3.6.0 (2019-04-26)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 17763)

Matrix products: default

Random number generation:
 RNG:     Mersenne-Twister 
 Normal:  Inversion 
 Sample:  Rounding 
 
locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                           LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] forcats_0.4.0    stringr_1.4.0    dplyr_0.8.1      purrr_0.3.2      readr_1.3.1      tidyr_0.8.3      tibble_2.1.3     ggplot2_3.1.1   
 [9] tidyverse_1.2.1  finreportr_1.0.1 lubridate_1.7.4  rvest_0.3.4      xml2_1.2.0       edgar_2.0.1     

loaded via a namespace (and not attached):
 [1] tidyselect_0.2.5  xfun_0.7          slam_0.1-45       NLP_0.2-0         haven_2.1.0       lattice_0.20-38   colorspace_1.4-1 
 [8] generics_0.0.2    yaml_2.2.0        XML_3.98-1.20     rlang_0.3.4       R.oo_1.22.0       pillar_1.4.1      withr_2.1.2      
[15] glue_1.3.1        R.utils_2.8.0     selectr_0.4-1     readxl_1.3.1      modelr_0.1.4      plyr_1.8.4        cellranger_1.1.0 
[22] munsell_0.5.0     gtable_0.3.0      R.methodsS3_1.7.1 XBRL_0.99.18      qdapRegex_0.7.2   knitr_1.23        tm_0.7-6         
[29] parallel_3.6.0    curl_3.3          broom_0.5.2       Rcpp_1.0.1        backports_1.1.4   scales_1.0.0      jsonlite_1.6     
[36] hms_0.4.2         stringi_1.4.3     grid_3.6.0        cli_1.1.0         tools_3.6.0       magrittr_1.5      lazyeval_0.2.2   
[43] crayon_1.3.4      pkgconfig_2.0.2   assertthat_0.2.1  httr_1.4.0        rstudioapi_0.10   R6_2.4.0          nlme_3.1-139     
[50] compiler_3.6.0   

georgeaj avatar Jun 11 '19 17:06 georgeaj

Hi author and georgeaj, is this issue resolved? I am having the same issue and trying to figure out why. This is a great package and really helpful to pull annual data. Thanks.

dchen728 avatar Jun 26 '19 21:06 dchen728

I haven’t tried it in a few days. I was able to get one company’s 2019 data one time. I suspect it could possibly be that the SEC database can’t handle the amount of requests it gets for current data every day and so it returns nothing. If this is the case then there may not be a solution. After having this problem I wrote my own function to pull the data from the SEC’s excel files that are posted with every filing. I may make it into a package if I get the rest of the kinks out. What method does finreportr use to get the data?

georgeaj avatar Jun 26 '19 23:06 georgeaj

My understanding is that finreportr pulls the data in XML format from the SEC, then parses it and converts it into a data frame in R. It would be great if you could turn your function into a package, since there are currently very few ways to pull SEC data into R.

dchen728 avatar Jun 27 '19 00:06 dchen728

Hello @georgeaj @dchen728 ,

Thank you very much for reporting this issue. My apologies for the late reply; I have been very busy lately.

I will take a look this week, and will be in touch if I need help testing patches for this bug.

sewardlee337 avatar Jun 27 '19 00:06 sewardlee337

A brief update:

From what I'm seeing, the underlying issue appears to be due to something about the way the XBRL package interfaces with EDGAR. When finreportr pulls and parses XBRL-format data from the U.S. Securities and Exchange Commission, it calls the XBRL package function xbrlDoAll().

For example, if you try to run:

## ORCL's 2019 financials
url <- "https://www.sec.gov/Archives/edgar/data/1341439/000156459019023119/orcl-20190531.xml"

## Call xbrlDoAll(), in verbose mode
XBRL::xbrlDoAll(url, cache.dir='XBRLcache',prefix.out="out",verbose=TRUE)

The printout you receive is:

Downloading to cache dir...trying URL 'https://www.sec.gov/Archives/edgar/data/1341439/000156459019023119/orcl-20190531.xml'
downloaded 6.2 MB

Schema:  https://www.sec.gov/Archives/edgar/data/1341439/000156459019023119/orcl-20190531.xsd 
Level: 1 ==> https://www.sec.gov/Archives/edgar/data/1341439/000156459019023119/orcl-20190531.xsd 
Downloading to cache dir...trying URL 'https://www.sec.gov/Archives/edgar/data/1341439/000156459019023119/orcl-20190531.xsd'
downloaded 98 KB

Roles
Elements
XBRLcache/orcl-20190531.xsd  ==> Linkbase:  https://www.sec.gov/Archives/edgar/data/1341439/000156459019023119/orcl-20190531_cal.xml 
Linkbase:  https://www.sec.gov/Archives/edgar/data/1341439/000156459019023119/orcl-20190531_cal.xml 
Level: 2 ==> https://www.sec.gov/Archives/edgar/data/1341439/000156459019023119/orcl-20190531_cal.xml 
Downloading to cache dir...trying URL 'https://www.sec.gov/Archives/edgar/data/1341439/000156459019023119/orcl-20190531_cal.xml'
downloaded 119 KB

Calculations.
XBRLcache/orcl-20190531.xsd  ==> Linkbase:  https://www.sec.gov/Archives/edgar/data/1341439/000156459019023119/orcl-20190531_def.xml 
Linkbase:  https://www.sec.gov/Archives/edgar/data/1341439/000156459019023119/orcl-20190531_def.xml 
Level: 2 ==> https://www.sec.gov/Archives/edgar/data/1341439/000156459019023119/orcl-20190531_def.xml 
Downloading to cache dir...trying URL 'https://www.sec.gov/Archives/edgar/data/1341439/000156459019023119/orcl-20190531_def.xml'
downloaded 356 KB

Definitions.
XBRLcache/orcl-20190531.xsd  ==> Linkbase:  https://www.sec.gov/Archives/edgar/data/1341439/000156459019023119/orcl-20190531_lab.xml 
Linkbase:  https://www.sec.gov/Archives/edgar/data/1341439/000156459019023119/orcl-20190531_lab.xml 
Level: 2 ==> https://www.sec.gov/Archives/edgar/data/1341439/000156459019023119/orcl-20190531_lab.xml 
Downloading to cache dir...trying URL 'https://www.sec.gov/Archives/edgar/data/1341439/000156459019023119/orcl-20190531_lab.xml'
downloaded 879 KB

Labels.
XBRLcache/orcl-20190531.xsd  ==> Linkbase:  https://www.sec.gov/Archives/edgar/data/1341439/000156459019023119/orcl-20190531_pre.xml 
Linkbase:  https://www.sec.gov/Archives/edgar/data/1341439/000156459019023119/orcl-20190531_pre.xml 
Level: 2 ==> https://www.sec.gov/Archives/edgar/data/1341439/000156459019023119/orcl-20190531_pre.xml 
Downloading to cache dir...trying URL 'https://www.sec.gov/Archives/edgar/data/1341439/000156459019023119/orcl-20190531_pre.xml'
downloaded 643 KB

Presentations.
XBRLcache/orcl-20190531.xsd  ==> Schema:  http://www.xbrl.org/2003/xbrl-instance-2003-12-31.xsd 
Schema:  http://www.xbrl.org/2003/xbrl-instance-2003-12-31.xsd 
Level: 2 ==> http://www.xbrl.org/2003/xbrl-instance-2003-12-31.xsd 
Using file from cache dir...
Elements
XBRLcache/xbrl-instance-2003-12-31.xsd  ==> Schema:  http://www.xbrl.org/2003/xbrl-linkbase-2003-12-31.xsd 
Schema:  http://www.xbrl.org/2003/xbrl-linkbase-2003-12-31.xsd 
Level: 3 ==> http://www.xbrl.org/2003/xbrl-linkbase-2003-12-31.xsd 
Using file from cache dir...
Elements
XBRLcache/xbrl-linkbase-2003-12-31.xsd  ==> Schema:  http://www.xbrl.org/2003/xl-2003-12-31.xsd 
Schema:  http://www.xbrl.org/2003/xl-2003-12-31.xsd 
Level: 4 ==> http://www.xbrl.org/2003/xl-2003-12-31.xsd 
Using file from cache dir...
Elements
XBRLcache/xl-2003-12-31.xsd  ==> Schema:  http://www.xbrl.org/2003/xlink-2003-12-31.xsd 
Schema:  http://www.xbrl.org/2003/xlink-2003-12-31.xsd 
Level: 5 ==> http://www.xbrl.org/2003/xlink-2003-12-31.xsd 
Using file from cache dir...
Elements
XBRLcache/xbrl-linkbase-2003-12-31.xsd  ==> Schema:  http://www.xbrl.org/2003/xlink-2003-12-31.xsd 
Schema:  http://www.xbrl.org/2003/xlink-2003-12-31.xsd 
Already discovered. Skipping
XBRLcache/orcl-20190531.xsd  ==> Schema:  http://www.xbrl.org/2003/xbrl-linkbase-2003-12-31.xsd 
Schema:  http://www.xbrl.org/2003/xbrl-linkbase-2003-12-31.xsd 
Already discovered. Skipping
XBRLcache/orcl-20190531.xsd  ==> Schema:  http://www.xbrl.org/2005/xbrldt-2005.xsd 
Schema:  http://www.xbrl.org/2005/xbrldt-2005.xsd 
Level: 2 ==> http://www.xbrl.org/2005/xbrldt-2005.xsd 
Using file from cache dir...
Elements
XBRLcache/xbrldt-2005.xsd  ==> Schema:  http://www.xbrl.org/2003/xbrl-instance-2003-12-31.xsd 
Schema:  http://www.xbrl.org/2003/xbrl-instance-2003-12-31.xsd 
Already discovered. Skipping
XBRLcache/orcl-20190531.xsd  ==> Schema:  https://www.sec.gov/Archives/edgar/data/1341439/000156459019023119/https://xbrl.sec.gov/dei/2018/dei-2018-01-31.xsd 
Schema:  https://www.sec.gov/Archives/edgar/data/1341439/000156459019023119/https://xbrl.sec.gov/dei/2018/dei-2018-01-31.xsd 
Level: 2 ==> https://www.sec.gov/Archives/edgar/data/1341439/000156459019023119/https://xbrl.sec.gov/dei/2018/dei-2018-01-31.xsd 
Downloading to cache dir...trying URL 'https://www.sec.gov/Archives/edgar/data/1341439/000156459019023119/https://xbrl.sec.gov/dei/2018/dei-2018-01-31.xsd'
Error in fileFromCache(file) : 
  Error in download.file(file, cached.file, quiet = !verbose) : 
  cannot open URL 'https://www.sec.gov/Archives/edgar/data/1341439/000156459019023119/https://xbrl.sec.gov/dei/2018/dei-2018-01-31.xsd'

In addition: Warning message:
In download.file(file, cached.file, quiet = !verbose) :
  cannot open URL 'https://www.sec.gov/Archives/edgar/data/1341439/000156459019023119/https://xbrl.sec.gov/dei/2018/dei-2018-01-31.xsd': HTTP status was '403 Forbidden'

This issue appears to affect all packages and applications that use this function in the XBRL package. For example: https://github.com/bergant/finstr/issues/12

I will update when I find out more. Thank you so much for your patience.

sewardlee337 avatar Jun 27 '19 01:06 sewardlee337

Hey Seward, thanks for the update. Look forward to hearing more updates. I will keep an eye on the XBRL package as well. Thanks again.

dchen728 avatar Jun 27 '19 02:06 dchen728

I have written to the author of the XBRL package to see if he can offer some guidance.

sewardlee337 avatar Jun 30 '19 20:06 sewardlee337

Thanks for the update.

dchen728 avatar Jul 01 '19 18:07 dchen728

I am having issues with a company that was delisted and subsequently changed its symbol (EPE to EPEG).
GetIncome('EPEG', 2019) does not work because the generated XML file name ends in epeg, whereas the filing was made under the old symbol, epe: 'https://www.sec.gov/Archives/edgar/data/1584952/000158495219000003/epeg-20181231.xml'

enFinExplorer avatar Sep 13 '19 13:09 enFinExplorer

I have written to the author of the XBRL package to see if he can offer some guidance.

Any luck with this?

shyams80 avatar Sep 19 '19 13:09 shyams80

I have written to the author of the XBRL package to see if he can offer some guidance.

Any luck with this?

Please see this SO question; it might help you hack together a solution if this is urgent for you.

selgamal avatar Sep 19 '19 17:09 selgamal

I was able to fix the XBRL package using the SO question above, but I ran into another problem: GetFinancial for a 2019 report year returned the following error: Error: Result must have length 1011, not 0

After doing some digging it appears that the descriptions of cash flow statements, balance sheets, and income have changed from previous years. The 2019 report I was looking at (symbol "SM") has the following:

CONSOLIDATED BALANCE SHEETS (in thousands, except share data)
CONSOLIDATED STATEMENTS OF CASH FLOWS (in thousands)
CONSOLIDATED STATEMENTS OF OPERATIONS (in thousands, except per share data)

For example, GetIncome only looks for these column headers:

income.descriptions <- c("CONSOLIDATED STATEMENTS OF INCOME",
                         "CONSOLIDATED STATEMENT OF INCOME",
                         "CONSOLIDATED STATEMENTS OF OPERATIONS",
                         "CONSOLIDATED STATEMENT OF OPERATIONS",
                         "CONSOLIDATED STATEMENT OF EARNINGS",
                         "CONSOLIDATED STATEMENTS OF EARNINGS",
                         "INCOME STATEMENTS",
                         "CONSOLIDATED RESULTS OF OPERATIONS")

I made the correction to the descriptions and was able to download the data.

Just thought I would pass along.
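One way the mismatch described above could be tolerated is to normalise the statement title before comparing it. The following is only a sketch: it assumes the lookup is an exact comparison against the description strings, and normalise_title is a hypothetical helper, not a finreportr function. The list is shortened for illustration.

```r
# Hypothetical helper (not part of finreportr): upper-case a statement title
# and drop a trailing parenthetical such as "(in thousands)".
normalise_title <- function(x) {
  x <- toupper(x)
  x <- sub("\\s*\\(.*\\)\\s*$", "", x)
  trimws(x)
}

# Shortened version of the description list quoted above.
income.descriptions <- c("CONSOLIDATED STATEMENTS OF INCOME",
                         "CONSOLIDATED STATEMENTS OF OPERATIONS")

# Titles as they appear in the 2019 "SM" report mentioned above.
titles <- c("CONSOLIDATED BALANCE SHEETS (in thousands, except share data)",
            "CONSOLIDATED STATEMENTS OF CASH FLOWS (in thousands)",
            "CONSOLIDATED STATEMENTS OF OPERATIONS (in thousands, except per share data)")

# TRUE only for the title whose normalised form is a known income description.
is_income <- normalise_title(titles) %in% income.descriptions
titles[is_income]
```

With this normalisation only the statement of operations is selected, without adding every parenthetical variant to the list by hand.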

mfarr76 avatar Dec 07 '19 12:12 mfarr76

I am also working with this package and observed the same behavior. There are however two distinct problems arising at the same time.

  1. XBRL is not generating correct URLs
  2. The naming of xml files on the sec website has changed recently.

XBRL: The bug in the XBRL package should be fixed as indicated above (I found the same solution independently). When working on Windows, it is best to rebuild the XBRL package from source with the changed file. The libxml2 dependency is not available to the compiler by default and is best added as described here: https://stackoverflow.com/questions/39568937/how-to-create-cran-ready-r-package-that-has-external-dependency-libxml2

SEC NAMES: when running (with the fixed XBRL package) the following code:

Income <- finreportr::GetIncome("SBUX", 2019)

results in the following error

Error in fileFromCache(file.inst) : 
  Error in download.file(file, cached.file, quiet = !verbose) : 
  cannot open URL 'https://www.sec.gov/Archives/edgar/data/829224/000082922419000051/sbux-20190929.xml'

If I then check whether this file is present on EDGAR: https://www.sec.gov/Archives/edgar/data/829224/000082922419000051/

I notice that the file sbux-20190929.xml is not there. I also checked other companies, such as Google and Boeing, and they all show the same behavior.

When I then look up the correct name using the edgarWebR package (https://github.com/mwaldstein/edgarWebR):

FilingsonEdgar <- edgarWebR::company_filings(x = "SBUX", type = "10-K")
DocumentsonEdgar <- edgarWebR::filing_documents(x = FilingsonEdgar$href[1])
link <- DocumentsonEdgar[DocumentsonEdgar[5] == 'XML', 4]

I get the following URL:

https://www.sec.gov/Archives/edgar/data/12927/000001292719000077/a201909sep3010-q_htm.xml

When passing this URL to the revised XBRL package

xbrl.vars <- XBRL::xbrlDoAll(link, verbose=TRUE)

it downloads the data correctly.

CONCLUSION: The finreportr::GetIncome function generates the wrong URL to pass to XBRL. Stepping through finreportr::GetIncome with the RStudio debugger leads to finreportr::GetFinancial, which contains an in-function helper called GetURL that generates a static URL. I would propose replacing GetURL with a slight adaptation of the approach above, retrieving the right URL with the edgarWebR package.
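The lookup step could be wrapped in a small helper that selects the instance document by name rather than by column position. This is a sketch under assumptions: the column names type and href are inferred from the positional indices 5 and 4 used above, pick_instance_url is a hypothetical name, and the table below is a hand-built stand-in with made-up rows, not real EDGAR output.

```r
# Hypothetical helper: return the href of the first document of type "XML"
# from a filing's document table. Column names `type` and `href` are assumed.
pick_instance_url <- function(docs) {
  hits <- docs$href[!is.na(docs$type) & docs$type == "XML"]
  if (length(hits) == 0) {
    stop("no document of type 'XML' found in this filing")
  }
  hits[[1]]
}

# Hand-built stand-in for one filing's document table (made-up rows).
docs <- data.frame(
  type = c("10-K", "EX-101.SCH", "XML"),
  href = c("https://example.com/filing/main.htm",
           "https://example.com/filing/schema.xsd",
           "https://example.com/filing/main_htm.xml"),
  stringsAsFactors = FALSE
)

link <- pick_instance_url(docs)
```

Failing loudly when no XML row exists makes it easier to tell a missing instance document apart from a download error later on.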

GreenGrassBlueOcean avatar Jan 14 '20 14:01 GreenGrassBlueOcean

Hello, I am trying to load the J.P. Morgan income statement, but I get the following error. Could you help me find a solution? Thanks in advance.

> GetIncome("JPM", 2019)

Error in fileFromCache(file.inst) : Error in download.file(file, cached.file, quiet = !verbose) : cannot open URL 'https://www.sec.gov/Archives/edgar/data/19617/000001961719000054/jpm-20181231.xml'

Handiel avatar Apr 18 '20 23:04 Handiel

In the GetFinancial function there is the GetURL helper, which I believe is the issue. The inst.url string is built so that it ends with the report.period. Apparently EDGAR now adds suffixes to the XML file names (examples include cal, def, lab, pre) that would need to be added to the inst.url string. I don't know how many suffixes exist. It looks like @GreenGrassBlueOcean's example has a different ending on the XML file name.

 ##   Function to acquire Instance Document URL
 GetURL <- function(symbol, year) {
      
      lower.symbol <- tolower(symbol)
      
      accession.no.raw <- GetAccessionNo(symbol, year, foreign = FALSE)
      accession.no <- gsub("-", "" , accession.no.raw)
      
      CIK <- CompanyInfo(symbol)
      CIK <- as.numeric(CIK$CIK)
      
      report.period <- ReportPeriod(symbol, CIK, accession.no, accession.no.raw)
      report.period <- gsub("-", "" , report.period)
      
      inst.url <- paste0("https://www.sec.gov/Archives/edgar/data/", CIK, "/", 
                         accession.no, "/", lower.symbol, "-", report.period, ".xml")
      return(inst.url)
 }
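To see what GetURL produces without calling EDGAR, the string construction can be isolated. This is a sketch, not finreportr code: build_instance_url is a hypothetical name, and the CIK, accession number and report period are read off the SBUX URL quoted earlier in this thread.

```r
# Hypothetical stand-alone version of the string construction in GetURL.
# The inputs are passed in directly instead of being looked up on EDGAR.
build_instance_url <- function(cik, accession.no.raw, symbol, report.period) {
  # sprintf("%d") drops leading zeros and avoids scientific notation
  cik.part <- sprintf("%d", as.integer(cik))
  accession.no <- gsub("-", "", accession.no.raw)
  period <- gsub("-", "", report.period)
  paste0("https://www.sec.gov/Archives/edgar/data/", cik.part, "/",
         accession.no, "/", tolower(symbol), "-", period, ".xml")
}

# Values taken from the SBUX example earlier in the thread.
url <- build_instance_url("0000829224", "0000829224-19-000051",
                          "SBUX", "2019-09-29")
url
```

The file name is derived only from the ticker and the report period, so any filing whose instance document is named differently cannot be reached this way.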

IEORTools avatar Jul 26 '20 00:07 IEORTools

FilingsonEdgar <- edgarWebR::company_filings(x = "SBUX", type = "10-K")
DocumentsonEdgar <- edgarWebR::filing_documents(x = test$href[1])
link <- DocumentsonEdgar[DocumentsonEdgar[5] == 'XML', 4]

Returned: Error in edgarWebR::filing_documents(x = test$href[1]) : object 'test' not found

But I knew where the file was so wrote the URL manually (and checked it many times).

All good until I got to the xbrl.vars <- XBRL::xbrlDoAll(link, verbose=TRUE)

Whereupon, same as mentioned before:

..trying URL 'https://www.sec.gov/Archives/edgar/data/1800/000110465920023904/https://xbrl.sec.gov/dei/2019/dei-2019-01-31.xsd'
Error in fileFromCache(file) :
  Error in download.file(file, cached.file, quiet = !verbose) :
  cannot open URL 'https://www.sec.gov/Archives/edgar/data/1800/000110465920023904/https://xbrl.sec.gov/dei/2019/dei-2019-01-31.xsd'

In addition: Warning message:
In download.file(file, cached.file, quiet = !verbose) :
  cannot open URL 'https://www.sec.gov/Archives/edgar/data/1800/000110465920023904/https://xbrl.sec.gov/dei/2019/dei-2019-01-31.xsd': HTTP status was '404 Not Found'

Going to carry on trying to find some answers but any ideas welcome.

L-plate-coder avatar Sep 15 '20 04:09 L-plate-coder

The cache doesn’t work like it used to. Here’s an example I put together working through my troubles.

https://medium.com/@fauxRight/exploring-xbrl-in-r-part-1-c5fbdba3054b

enFinExplorer avatar Sep 15 '20 04:09 enFinExplorer

This is an issue with the XBRL library when the Schema URL is HTTPS. Specifically this part of the XBRL/R/XBRL.R file in the library (sourced from https://cran.r-project.org/web/packages/XBRL/index.html):

  fixFileName <- function(dname, file.name) {
    if (substr(file.name, 1, 5) != "http:") {
      if (substr(file.name, 1, 5) == "../..") { ## A better solution is preferred, but it works for now
        file.name <- paste0(dirname(dirname(dname)), "/",  substr(file.name, 7, nchar(file.name)))
      } else if (substr(file.name, 1, 2) == "..") {
        file.name <- paste0(dirname(dname), "/", substr(file.name, 4, nchar(file.name)))
      } else {
        file.name <- paste0(dname,"/", file.name)
      }
    }
    file.name
  }

The function leaves a file name untouched only if it starts with "http:". Anything else, including an "https:" URL, is treated as a relative path and prefixed with the directory of the original request:

dname = 'https://www.sec.gov/Archives/edgar/data/1341439/000156459019023119'
file.name = 'https://xbrl.sec.gov/dei/2018/dei-2018-01-31.xsd'
fixFileName returns 'https://www.sec.gov/Archives/edgar/data/1341439/000156459019023119/https://xbrl.sec.gov/dei/2018/dei-2018-01-31.xsd'

I'm new to R, but looking into how to recompile the library and force it to use the fixed version of the file.
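A minimal sketch of the change, assuming the only problem is the "http:" prefix test: treat both http:// and https:// names as absolute and leave them alone. This stand-alone copy is for illustration only; defining it at the top level does not patch the function inside the installed XBRL package.

```r
# Sketch of a corrected fixFileName: absolute http(s) URLs pass through,
# relative names are resolved against the directory of the original request.
fixFileName <- function(dname, file.name) {
  if (!grepl("^https?://", file.name)) {
    if (substr(file.name, 1, 5) == "../..") {
      file.name <- paste0(dirname(dirname(dname)), "/",
                          substr(file.name, 7, nchar(file.name)))
    } else if (substr(file.name, 1, 2) == "..") {
      file.name <- paste0(dirname(dname), "/",
                          substr(file.name, 4, nchar(file.name)))
    } else {
      file.name <- paste0(dname, "/", file.name)
    }
  }
  file.name
}

dname <- "https://www.sec.gov/Archives/edgar/data/1341439/000156459019023119"

# Absolute https schema reference: returned unchanged.
abs.url <- fixFileName(dname, "https://xbrl.sec.gov/dei/2018/dei-2018-01-31.xsd")

# Relative linkbase reference: still prefixed with the request directory.
rel.url <- fixFileName(dname, "orcl-20190531_cal.xml")
```

The relative branches are copied unchanged from the snippet above, so only the handling of https:// names differs.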

jwozny avatar Feb 08 '21 04:02 jwozny

Is it possible this is the same issue? It looks like I'm getting a "doubled" URL, and the HTTP status was '404 Not Found'.

Thanks for your time!

> GetIncome('GOOG', 2019)
Error in fileFromCache(file) : 
  Error in download.file(file, cached.file, quiet = !verbose) : 
  cannot open URL 'https://www.sec.gov/Archives/edgar/data/1652044/000165204419000004/https://xbrl.sec.gov/dei/2018/dei-2018-01-31.xsd'

In addition: Warning message:
In download.file(file, cached.file, quiet = !verbose) :
  cannot open URL 'https://www.sec.gov/Archives/edgar/data/1652044/000165204419000004/https://xbrl.sec.gov/dei/2018/dei-2018-01-31.xsd': HTTP status was '404 Not Found'

Versioning:

# R --version
R version 4.0.4 (2021-02-15) -- "Lost Library Book"
Copyright (C) 2021 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under the terms of the
GNU General Public License versions 2 or 3.
For more information about these matters see
https://www.gnu.org/licenses/.

# uname -a
Linux t470p 5.4.0-65-generic #73~18.04.1-Ubuntu SMP Tue Jan 19 09:02:24 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

mmbostwick avatar Feb 23 '21 02:02 mmbostwick

Seems like the package doesn't work anymore. Most of the tutorial here doesn't work at all: https://rpubs.com/rwalkerWU/Usingfinreportr

For example:

JPM.IS <- GetIncome("JPM", 2015)
JPM.BS <- GetBalanceSheet("JPM", 2015)
JPM.SCF <- GetCashFlow("JPM", 2015)
AnnualReports("JPM") #sometimes works

Trying any number of commands gives

Error in open.connection(x, "rb") : HTTP error 403.

or

Error in fileFromCache(file.inst) :
  Error in download.file(file, cached.file, quiet = !verbose) :
  cannot open URL 'https://www.sec.gov/Archives/edgar/data/1326801/000132680116000043/fb-20151231.xml'

In addition: Warning messages:
1: closing unused connection 4 (http://www.sec.gov/cgi-bin/browse-edgar?action=getcompany&CIK=JPM&type=10-k&dateb=&owner=exclude&count=100)
2: closing unused connection 3 (http://www.sec.gov/cgi-bin/browse-edgar?action=getcompany&CIK=JPM&type=10-k&dateb=&owner=exclude&count=100)
3: In download.file(file, cached.file, quiet = !verbose) :
  cannot open URL 'https://www.sec.gov/Archives/edgar/data/1326801/000132680116000043/fb-20151231.xml': HTTP status was '403 Forbidden'

Most of the '403 Forbidden' "cannot open URL" errors occur because SEC EDGAR requires requests to declare a user agent. I was able to fix this for the XBRL package with the following; insert your own name and email in the string.

options(HTTPUserAgent = "yourname [email protected]")
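A slightly fuller sketch of the same idea keeps the previous setting so it can be restored afterwards. The name and address are placeholders, not values from this thread. The setting affects code that downloads through base R's download.file() and url(); it would not be expected to change the doubled-URL failure discussed earlier, which comes from how the schema URL is built.

```r
# Placeholder identity; replace with your own name and email address.
ua <- "Jane Doe jane.doe@example.com"

# options() returns the previous values, so they can be restored later.
old <- options(HTTPUserAgent = ua)

# Confirm what base R will now send as the User-Agent header.
current <- getOption("HTTPUserAgent")
current

# ... run the finreportr / XBRL calls here ...

# Restore the previous setting when done (left commented out on purpose):
# options(old)
```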

IEORTools avatar Jan 29 '22 22:01 IEORTools

This worked for me! Thanks @IEORTools

coleburdette avatar May 11 '22 20:05 coleburdette

Here is a possible solution by editing the XBRL source code to fix the URL issue

https://stackoverflow.com/questions/53651481/schema-file-does-not-exist-in-xbrl-parse-file

IEORTools avatar Nov 13 '22 00:11 IEORTools

Thanks, I’ll check it out.

A

L-plate-coder avatar Nov 13 '22 05:11 L-plate-coder