finreportr
No filings available for given year
My run of the example README.md code for GetIncome, GetBalanceSheet, and GetCashFlow does not produce the expected output. For example, running GetIncome("GOOG", 2015) produces:
Error in GetAccessionNo(symbol, year, foreign = FALSE) :
no filings available for given year
https://github.com/openjournals/joss-reviews/issues/119
Thank you for the catch. The example in the README is outdated because of GOOG's change in corporate structure in 2015 -- the SEC website no longer hosts GOOG's 2015 annual report.
I have changed the examples in README.md to 2016 accordingly (e.g. GetIncome("GOOG", 2016)).
GetIncome("GOOG", 2016) appears to download files to an XBRLcache folder but returns the following error:
Error in fileFromCache(file) :
  Error in download.file(file, cached.file, method = "auto", quiet = !verbose) :
  cannot download all files
In addition: Warning message:
In download.file(file, cached.file, method = "auto", quiet = !verbose) :
  URL 'https://www.sec.gov/Archives/edgar/data/1652044/000165204416000012/goog-20151231_pre.xml': status was '404 Not Found'
sessionInfo()
R version 3.3.2 (2016-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.1 LTS
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] finreportr_1.0.1 devtools_1.12.0.9000
loaded via a namespace (and not attached):
[1] Rcpp_0.12.7 XML_3.98-1.5 digest_0.6.10 dplyr_0.5.0.9000
[5] withr_1.0.2 assertthat_0.1 XBRL_0.99.17 R6_2.2.0
[9] DBI_0.5-1 magrittr_1.5 httr_1.2.1 stringi_1.1.2
[13] lazyeval_0.2.0 curl_2.2 xml2_1.0.0.9001 tools_3.3.2
[17] stringr_1.1.0 selectr_0.3-0 pkgload_0.0.0.9000 rvest_0.3.2
[21] memoise_1.0.0 tibble_1.2
Thanks! I will look into it and try to reproduce this error to figure out what is going on.
I've tried it on Windows and Ubuntu, and it seems to work on my end.
sessionInfo()
R version 3.2.3 (2015-12-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.1 LTS
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] finreportr_1.0.1 devtools_1.12.0
loaded via a namespace (and not attached):
[1] magrittr_1.5 R6_2.2.0 assertthat_0.1 DBI_0.5-1 tools_3.2.3 withr_1.0.2 dplyr_0.5.0 tibble_1.2
[9] curl_2.2 Rcpp_0.12.7 memoise_1.0.0 digest_0.6.10
@jsta - Could you try deleting the XBRLcache folder and running GetIncome("GOOG", 2016) again? If that still does not work, could you send me a list of the files in the folder, so that I can see what's going on?
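For anyone following along, the request above (list the cache contents, then delete the folder) could be done from within R roughly as follows. This is only a sketch: `reset_cache` is a hypothetical helper name, not part of finreportr or XBRL, and it assumes the cache lives in a folder named XBRLcache under the working directory.

```r
# Hypothetical helper (not part of finreportr or XBRL):
# return the files in a cache folder, then delete the folder.
reset_cache <- function(cache_dir = "XBRLcache") {
  if (!dir.exists(cache_dir)) {
    return(character(0))
  }
  contents <- list.files(cache_dir, recursive = TRUE)
  unlink(cache_dir, recursive = TRUE)
  contents
}

# Demonstration on a throwaway directory rather than a real cache
demo_dir <- file.path(tempdir(), "XBRLcache-demo")
dir.create(demo_dir, showWarnings = FALSE)
invisible(file.create(file.path(demo_dir, "example.xml")))
removed <- reset_cache(demo_dir)
print(removed)
```

Calling `reset_cache()` with no arguments would target the XBRLcache folder in the current working directory.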
Still getting an error. I tracked down the source of the error to the line in the GetInstFile function that calls XBRL::xbrlDoAll. I cannot even run the examples from the man page for XBRL::xbrlDoAll; they fail with the same error.
xbrlDoAll calls a function in the XBRL package.
Can you try running XBRL::xbrlDoAll('https://www.sec.gov/Archives/edgar/data/1288776/000165204416000012/goog-20151231.xml')? This will help me determine whether the problem is in the XBRL package.
(This function from the XBRL package will download a lot of files, which you may want to delete afterwards.)
That's what I was saying in my earlier comment. I am having trouble with the XBRL package itself. Your example gives me the same error message as I originally reported. Too bad XBRL doesn't have an official GitHub page to report issues.
Why is XBRL not in your sessionInfo() under "loaded via a namespace"?
After I run GetIncome(), XBRL appears under "loaded via a namespace":
R version 3.2.3 (2015-12-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.1 LTS
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] finreportr_1.0.1
loaded via a namespace (and not attached):
[1] Rcpp_0.12.7 XML_3.98-1.5 dplyr_0.5.0 assertthat_0.1 XBRL_0.99.17 R6_2.2.0 DBI_0.5-1 magrittr_1.5
[9] httr_1.2.1 stringi_1.1.2 curl_2.2 lazyeval_0.2.0 xml2_1.0.0 tools_3.2.3 stringr_1.1.0 selectr_0.3-0
[17] rvest_0.3.2 tibble_1.2
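A possible explanation for the difference between the two sessionInfo() listings is that R loads a namespace lazily the first time it is accessed with `pkg::fun`. The sketch below illustrates this with the base package splines as a stand-in, since running it with XBRL would require that package to be installed; `is_loaded` is a hypothetical helper name.

```r
# Hypothetical helper: is a package namespace currently loaded?
is_loaded <- function(pkg) pkg %in% loadedNamespaces()

before <- is_loaded("splines")   # depends on the session
bs_fun <- splines::bs            # first use of `::` loads the namespace
after  <- is_loaded("splines")   # loaded from this point on

print(c(before = before, after = after))
```

Under that reading, XBRL would only show up under "loaded via a namespace" once a function that calls into it, such as GetIncome(), has been run.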
I am corresponding with the author of the XBRL package directly via email to get some insight into why GetIncome() doesn't run properly in your R session...
The author of XBRL also runs Ubuntu 16.04. His only suggestion:
Maybe running an: update.packages() may help.
If it works on some computers and R sessions, but not others, I suspect a settings or configuration problem. Will continue to investigate.
@jsta - I am still trying to diagnose why the XBRL package does not work on your computer/R session.
Can you type the following into your terminal, and show me the output?
curl -I -v https://www.sec.gov/Archives/edgar/data/1652044/000165204416000012/goog-20151231_pre.xml
I am hoping this will give me more information to diagnose the underlying issue.
I think I solved the downloading issue using the fix here. Unfortunately, now I'm getting an error with fixFileName and I'm not familiar enough with Rcpp to debug what's going on in xbrlGetSchemaName.cpp.
Using options(error = recover):
Error in if (substr(file.name, 1, 5) != "http:") { : argument is of length zero
Enter a frame number, or 0 to exit
1: xbrlDoAll(inst, cache.dir = "XBRLcache", prefix.out = "out", verbose = TRUE)
2: xbrlDoAll.R#30: xbrl$processSchema(xbrl$getSchemaName())
3: XBRL.R#115: cat("Schema: ", file, "\n")
4: xbrl$getSchemaName()
5: XBRL.R#110: fixFileName(dname.inst, .Call("xbrlGetSchemaName", doc.inst, PACK
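The message "argument is of length zero" is what R reports when the condition of an `if` has length zero. A minimal sketch of how the quoted line could fail, under the assumption that `file.name` arrived empty (the variable name and the condition come from the error above; the empty value is a guess, not something confirmed in this thread):

```r
file.name <- character(0)            # assumption: the schema lookup returned nothing

prefix <- substr(file.name, 1, 5)    # character(0)
cond   <- prefix != "http:"          # logical(0), neither TRUE nor FALSE

result <- tryCatch(
  {
    if (cond) "not an http: path" else "http: path"
  },
  error = function(e) conditionMessage(e)
)
print(result)
```

If that assumption holds, the underlying question is why the schema name came back empty, which would point back at the earlier download failure rather than at fixFileName itself.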
OK, it turns out that the download.file issue described here was the problem. Installing XBRL from my fork with devtools::install_github("jsta/XBLR") fixed the problem completely!
head(GetIncome("GOOG", 2016))
Metric Units Amount startDate endDate
1 Revenues usd 55519000000 2013-01-01 2013-12-31
2 Revenues usd 66001000000 2014-01-01 2014-12-31
3 Revenues usd 74989000000 2015-01-01 2015-12-31
4 Cost of Revenue usd 21993000000 2013-01-01 2013-12-31
5 Cost of Revenue usd 25691000000 2014-01-01 2014-12-31
6 Cost of Revenue usd 28164000000 2015-01-01 2015-12-31
@jsta - I've done some research on the Stack Overflow solution. It seems like the issue has to do with the method that download.file() uses to download the XBRL files (see official help page). method = "curl" seems to be what works for you.
I've never had a problem with the default setting method = "auto" after testing the XBRL package on three computers (Windows and Ubuntu), so I suspected an issue with the global options in your R session. I've tried toggling through different download.file() methods in my options settings, but still couldn't reproduce your bug. For example:
> ### method = "wget"
> getOption("download.file.method")
[1] "wget"
>
> head(GetIncome("GOOG", 2016))
Metric Units Amount startDate endDate
1 Revenues usd 55519000000 2013-01-01 2013-12-31
2 Revenues usd 66001000000 2014-01-01 2014-12-31
3 Revenues usd 74989000000 2015-01-01 2015-12-31
4 Cost of Revenue usd 21993000000 2013-01-01 2013-12-31
5 Cost of Revenue usd 25691000000 2014-01-01 2014-12-31
6 Cost of Revenue usd 28164000000 2015-01-01 2015-12-31
>
> ### method = "internal"
> options(download.file.method = "internal")
> head(GetIncome("GOOG", 2016))
Metric Units Amount startDate endDate
1 Revenues usd 55519000000 2013-01-01 2013-12-31
2 Revenues usd 66001000000 2014-01-01 2014-12-31
3 Revenues usd 74989000000 2015-01-01 2015-12-31
4 Cost of Revenue usd 21993000000 2013-01-01 2013-12-31
5 Cost of Revenue usd 25691000000 2014-01-01 2014-12-31
6 Cost of Revenue usd 28164000000 2015-01-01 2015-12-31
>
> ### method = "auto"
> options(download.file.method = "auto")
> head(GetIncome("GOOG", 2016))
Metric Units Amount startDate endDate
1 Revenues usd 55519000000 2013-01-01 2013-12-31
2 Revenues usd 66001000000 2014-01-01 2014-12-31
3 Revenues usd 74989000000 2015-01-01 2015-12-31
4 Cost of Revenue usd 21993000000 2013-01-01 2013-12-31
5 Cost of Revenue usd 25691000000 2014-01-01 2014-12-31
6 Cost of Revenue usd 28164000000 2015-01-01 2015-12-31
Therefore, I currently suspect that the problem on your end is caused by a network restriction specific to your environment.
If this is the case, how should we proceed?
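For anyone repeating the experiment above, the option can also be switched temporarily and restored afterwards. This is a sketch only; `with_download_method` is a hypothetical helper, not part of any package mentioned here. One caveat: download.file() consults the `download.file.method` option only when its `method` argument is not supplied, and the call quoted in the original error message passes method = "auto" explicitly, so changing the option may have no effect on that particular call.

```r
# Hypothetical helper: evaluate an expression with a temporary
# download.file.method option, restoring the previous value on exit.
with_download_method <- function(method, expr) {
  old <- options(download.file.method = method)
  on.exit(options(old), add = TRUE)
  force(expr)  # expr is evaluated lazily, i.e. after the option is set
}

before <- getOption("download.file.method")
inside <- with_download_method("curl", getOption("download.file.method"))
after  <- getOption("download.file.method")

print(inside)
```

In a real session the second argument would be the call under test, for example head(GetIncome("GOOG", 2016)).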
Hmm, without my XBRL fork I am getting the same error in a Docker instance of this image. I am dialed into a remote server on a separate network.
R version 3.3.2 (2016-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Debian GNU/Linux stretch/sid
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] finreportr_1.0.1 devtools_1.12.0
loaded via a namespace (and not attached):
[1] Rcpp_0.12.8 XML_3.98-1.5 digest_0.6.10 dplyr_0.5.0 withr_1.0.2
[6] assertthat_0.1 XBRL_0.99.17 R6_2.2.0 DBI_0.5-1 git2r_0.16.0
[11] magrittr_1.5 httr_1.2.1 stringi_1.1.2 lazyeval_0.2.0 curl_2.3
[16] xml2_1.0.0 tools_3.3.2 stringr_1.1.0 selectr_0.3-0 rvest_0.3.2
[21] memoise_1.0.0 knitr_1.15.1 tibble_1.2
In my opinion, the ideal would be to add more tests and see whether Travis shows the same error.
As for the JOSS review, I understand that there is not much you can do about an error in a dependency. Now that I have it working with my XBRL fork, I will proceed with the review and I suppose get the opinion of the JOSS editor afterwards...
I had the same issue on OS X and Debian; "fixing" XBRL's download method from 'auto' to 'curl' solved it.
Hello,
Assume I have first failed with
xbrl.vars <- xbrlDoAll(inst, verbose = FALSE),
have set options(error = recover), and have used the R debugger to identify the erroneous file "https://www.sec.gov/Archives/edgar/data/21344/000002134413000050/ko-20130927.xsd".
Then I get errors for:
download.file( "https://www.sec.gov/Archives/edgar/data/21344/000002134413000050/ko-20130927.xsd", "apu.tmp" )
download.file( "https://www.sec.gov/Archives/edgar/data/21344/000002134413000050/ko-20130927.xsd", "apu.tmp", method="auto" )
download.file( "https://www.sec.gov/Archives/edgar/data/21344/000002134413000050/ko-20130927.xsd", "apu.tmp", method="libcurl" )
BUT success for
download.file( "https://www.sec.gov/Archives/edgar/data/21344/000002134413000050/ko-20130927.xsd", "apu.tmp", method="curl" )
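The manual comparison above (same URL, different `method` values) could be automated along these lines. This is a sketch, not something XBRL or finreportr provides: `try_download_methods` is a hypothetical name, the list of methods is an assumption, and the `downloader` argument exists only so that the logic can be exercised without network access.

```r
# Hypothetical helper: try several download methods in turn and
# return the name of the first one that succeeds.
try_download_methods <- function(url, destfile,
                                 methods = c("libcurl", "curl", "wget"),
                                 downloader = utils::download.file) {
  for (m in methods) {
    status <- tryCatch(
      suppressWarnings(downloader(url, destfile, method = m, quiet = TRUE)),
      error = function(e) NA_integer_
    )
    # download.file() returns 0 on success and non-zero on failure
    if (identical(as.integer(status), 0L)) {
      return(m)
    }
  }
  stop("no download method succeeded for ", url)
}

# Offline demonstration with a stand-in downloader that only accepts "curl"
fake_downloader <- function(url, destfile, method, quiet) {
  if (method == "curl") 0L else stop("cannot open URL")
}
chosen <- try_download_methods("https://example.org/file.xsd", "apu.tmp",
                               downloader = fake_downloader)
print(chosen)
```

With the default `downloader`, the same call would attempt real downloads, so the result then depends on the R version, the tools installed, and the server.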
After some Googling, I found the issue discussed in http://r.789695.n4.nabble.com/dowload-file-method-quot-libcurl-quot-and-GET-vs-HEAD-requests-td4722037.html
In R 3.2.4, if you ran download.file(method="libcurl"), it issues a HTTP GET request for the file. However, in R 3.3.0, it issues a HTTP HEAD request first, and then a GET request. This can result in problems when the web server gives an error for a HEAD request, even if the file is available with a GET request.
with NO CAN DO resolution :(
No I don't think there is a way to avoid the HEAD request.
Suggestion: the XBRL package should not hard-code method='auto'. download.file already defaults to
getOption("download.file.method", default = "auto")
so omitting the argument would allow the user to override the method at will.
BR, Jukka
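The suggestion above might look roughly like this in package code. It is a sketch under stated assumptions: `fetch_file` is a hypothetical wrapper, not XBRL's actual function, and the `downloader` argument is there only so that the example runs without network access.

```r
# Hypothetical wrapper: take the method from the user's option
# instead of hard-coding "auto".
fetch_file <- function(url, destfile,
                       method = getOption("download.file.method",
                                          default = "auto"),
                       downloader = utils::download.file) {
  downloader(url, destfile, method = method, quiet = TRUE)
}

# Offline demonstration: a stand-in downloader that reports the method it was given
report_method <- function(url, destfile, method, quiet) method

old <- options(download.file.method = "curl")
used_with_option <- fetch_file("https://example.org/file.xsd", "apu.tmp",
                               downloader = report_method)
options(old)  # restore the previous setting

used_explicit <- fetch_file("https://example.org/file.xsd", "apu.tmp",
                            method = "wget", downloader = report_method)
print(c(used_with_option, used_explicit))
```

With this shape, a user affected by the HEAD request behaviour could set options(download.file.method = "curl") once per session and would not need a patched fork.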