No filings available for given year

Open jsta opened this issue 7 years ago • 18 comments

My run of the example README.md code for GetIncome, GetBalanceSheet, and GetCashFlow does not produce the expected output. For example, running GetIncome("GOOG", 2015) produces:

Error in GetAccessionNo(symbol, year, foreign = FALSE) :
no filings available for given year

https://github.com/openjournals/joss-reviews/issues/119

jsta avatar Nov 14 '16 23:11 jsta

Thank you for the catch. The example in the README is outdated because of GOOG's change in corporate structure in 2015 -- the SEC website no longer hosts GOOG's 2015 annual report.

I have changed examples in README.md to 2016 accordingly (e.g. GetIncome("GOOG", 2016)).

sewardlee337 avatar Nov 14 '16 23:11 sewardlee337

GetIncome("GOOG", 2016) appears to download files to an XBRLcache folder but returns the following error:

Error in fileFromCache(file) : Error in download.file(file, cached.file, method = "auto", quiet = !verbose) : cannot download all files

In addition: Warning message: In download.file(file, cached.file, method = "auto", quiet = !verbose) : URL 'https://www.sec.gov/Archives/edgar/data/1652044/000165204416000012/goog-20151231_pre.xml': status was '404 Not Found'

jsta avatar Nov 15 '16 00:11 jsta

sessionInfo()

R version 3.3.2 (2016-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.1 LTS

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] finreportr_1.0.1     devtools_1.12.0.9000

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.7        XML_3.98-1.5       digest_0.6.10      dplyr_0.5.0.9000  
 [5] withr_1.0.2        assertthat_0.1     XBRL_0.99.17       R6_2.2.0          
 [9] DBI_0.5-1          magrittr_1.5       httr_1.2.1         stringi_1.1.2     
[13] lazyeval_0.2.0     curl_2.2           xml2_1.0.0.9001    tools_3.3.2       
[17] stringr_1.1.0      selectr_0.3-0      pkgload_0.0.0.9000 rvest_0.3.2       
[21] memoise_1.0.0      tibble_1.2

jsta avatar Nov 15 '16 00:11 jsta

Thanks! I will look into it and try to reproduce this error to figure out what is going on.

sewardlee337 avatar Nov 15 '16 00:11 sewardlee337

I've tried it on Windows and Ubuntu, and it seems to work on my end.

sessionInfo()

R version 3.2.3 (2015-12-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.1 LTS

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] finreportr_1.0.1 devtools_1.12.0 

loaded via a namespace (and not attached):
 [1] magrittr_1.5   R6_2.2.0       assertthat_0.1 DBI_0.5-1      tools_3.2.3    withr_1.0.2    dplyr_0.5.0    tibble_1.2    
 [9] curl_2.2       Rcpp_0.12.7    memoise_1.0.0  digest_0.6.10 

@jsta - Could you try deleting the XBRLcache folder and try running GetIncome("GOOG", 2016) again? If that still does not work, could you send me a list of file contents in the folder, so that I may see what's going on?
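
A minimal sketch of that cache reset, assuming the cache lives in a folder named XBRLcache under the current working directory, as reported earlier in this thread. reset_xbrl_cache is a hypothetical helper name, not part of finreportr or XBRL; it records the folder contents before deleting the folder, so the listing can still be shared afterwards.

```r
# Hypothetical helper: list what is in the cache directory, then delete it.
# "XBRLcache" is the folder name reported in this thread (an assumption
# about where the cache lives relative to the working directory).
reset_xbrl_cache <- function(cache_dir = "XBRLcache") {
  if (!dir.exists(cache_dir)) {
    return(character(0))  # nothing cached, nothing to delete
  }
  contents <- list.files(cache_dir, recursive = TRUE)
  unlink(cache_dir, recursive = TRUE)  # remove the folder and everything in it
  contents
}

# Usage: keep the listing so it can be pasted into the issue.
# cached <- reset_xbrl_cache()
# print(cached)
```

Returning the listing rather than printing it keeps the helper usable from scripts as well as from an interactive session.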

sewardlee337 avatar Nov 16 '16 20:11 sewardlee337

Still getting an error. I tracked down the source of the error to the line in the GetInstFile function calling the XBRL::xbrlDoAll function. I cannot even run the examples from the man page for XBRL::xbrlDoAll. It fails with the same error.

jsta avatar Nov 17 '16 01:11 jsta

xbrlDoAll is a function in the XBRL package.

Can you try running XBRL::xbrlDoAll('https://www.sec.gov/Archives/edgar/data/1288776/000165204416000012/goog-20151231.xml')? This will help me determine if the problem is in the XBRL package.

(This function from the XBRL package will download a lot of files, which you may want to delete afterwards.)

sewardlee337 avatar Nov 17 '16 01:11 sewardlee337

That's what I was saying in my earlier comment. I am having trouble with the XBRL package itself. Your example gives me the same error message as I originally reported. Too bad XBRL doesn't have an official GitHub page to report issues.

jsta avatar Nov 17 '16 02:11 jsta

Why is XBRL not in your sessionInfo() under loaded via a namespace?

jsta avatar Nov 17 '16 15:11 jsta

After I run GetIncome(), XBRL appears under loaded via a namespace:

R version 3.2.3 (2015-12-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.1 LTS

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] finreportr_1.0.1

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.7    XML_3.98-1.5   dplyr_0.5.0    assertthat_0.1 XBRL_0.99.17   R6_2.2.0       DBI_0.5-1      magrittr_1.5  
 [9] httr_1.2.1     stringi_1.1.2  curl_2.2       lazyeval_0.2.0 xml2_1.0.0     tools_3.2.3    stringr_1.1.0  selectr_0.3-0 
[17] rvest_0.3.2    tibble_1.2 

I am corresponding with the author of the XBRL package directly via email to get some insight into why GetIncome() doesn't run properly in your R session...

sewardlee337 avatar Nov 17 '16 19:11 sewardlee337

The author of XBRL also runs Ubuntu 16.04. His only suggestion:

Maybe running an: update.packages() may help.

If it works on some computers and R sessions, but not others, I suspect a settings or configuration problem. Will continue to investigate.

sewardlee337 avatar Nov 20 '16 02:11 sewardlee337

@jsta - I am still trying to diagnose why the XBRL package does not work on your computer/R session.

Can you type the following into your terminal, and show me the output?

curl -I -v https://www.sec.gov/Archives/edgar/data/1652044/000165204416000012/goog-20151231_pre.xml

I am hoping this will give me more information to diagnose the underlying issue.
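
The terminal command above sends a HEAD request (curl -I). A rough sketch of the same probe from inside R, assuming an R build with libcurl support so that base R's curlGetHeaders() is available. head_status is a hypothetical helper name; its fetch_headers argument exists only so the function can be exercised without network access.

```r
# Hypothetical helper: return the HTTP status code of a HEAD request.
# fetch_headers defaults to base::curlGetHeaders, which returns the response
# headers with an integer "status" attribute; pass a stand-in to test offline.
head_status <- function(url, fetch_headers = curlGetHeaders) {
  tryCatch(
    attr(fetch_headers(url), "status"),
    error = function(e) NA_integer_  # connection failures are reported as NA
  )
}

# Usage (needs network access):
# head_status("https://www.sec.gov/Archives/edgar/data/1652044/000165204416000012/goog-20151231_pre.xml")
```

Comparing this status with the result of an actual download of the same URL shows whether the server treats HEAD and GET differently.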

sewardlee337 avatar Nov 26 '16 22:11 sewardlee337

I think I solved the downloading issue using the fix here. Unfortunately, now I'm getting an error with fixFileName, and I'm not familiar enough with Rcpp to debug what's going on in xbrlGetSchemaName.cpp.

Using options(error = recover):

Error in if (substr(file.name, 1, 5) != "http:") { : argument is of length zero

Enter a frame number, or 0 to exit

1: xbrlDoAll(inst, cache.dir = "XBRLcache", prefix.out = "out", verbose = TRUE)
2: xbrlDoAll.R#30: xbrl$processSchema(xbrl$getSchemaName())
3: XBRL.R#115: cat("Schema: ", file, "\n")
4: xbrl$getSchemaName()
5: XBRL.R#110: fixFileName(dname.inst, .Call("xbrlGetSchemaName", doc.inst, PACK

jsta avatar Nov 27 '16 00:11 jsta

OK, it turns out that the download.file issue described here was the problem. Installing XBRL from my fork with devtools::install_github("jsta/XBLR") fixed the problem completely!

head(GetIncome("GOOG", 2016))

           Metric Units      Amount  startDate    endDate
1        Revenues   usd 55519000000 2013-01-01 2013-12-31
2        Revenues   usd 66001000000 2014-01-01 2014-12-31
3        Revenues   usd 74989000000 2015-01-01 2015-12-31
4 Cost of Revenue   usd 21993000000 2013-01-01 2013-12-31
5 Cost of Revenue   usd 25691000000 2014-01-01 2014-12-31
6 Cost of Revenue   usd 28164000000 2015-01-01 2015-12-31

jsta avatar Nov 27 '16 02:11 jsta

@jsta - I've done some research on the Stack Overflow solution. It seems like the issue has to do with the method that download.file() uses to download the XBRL files (see official help page). method = "curl" seems to be what works for you.

I've never had a problem with the default setting method = "auto" after testing the XBRL package on three computers (Windows and Ubuntu), so I suspected that it's an issue with global options settings in your R session. I've tried toggling through different download.file() methods in my options settings, but still couldn't reproduce your bug. For example:

> ### method = "wget"
> getOption("download.file.method")
[1] "wget"
> 
> head(GetIncome("GOOG", 2016))
           Metric Units      Amount  startDate    endDate
1        Revenues   usd 55519000000 2013-01-01 2013-12-31
2        Revenues   usd 66001000000 2014-01-01 2014-12-31
3        Revenues   usd 74989000000 2015-01-01 2015-12-31
4 Cost of Revenue   usd 21993000000 2013-01-01 2013-12-31
5 Cost of Revenue   usd 25691000000 2014-01-01 2014-12-31
6 Cost of Revenue   usd 28164000000 2015-01-01 2015-12-31
>
> ### method = "internal"
> options(download.file.method = "internal")
> head(GetIncome("GOOG", 2016))
           Metric Units      Amount  startDate    endDate
1        Revenues   usd 55519000000 2013-01-01 2013-12-31
2        Revenues   usd 66001000000 2014-01-01 2014-12-31
3        Revenues   usd 74989000000 2015-01-01 2015-12-31
4 Cost of Revenue   usd 21993000000 2013-01-01 2013-12-31
5 Cost of Revenue   usd 25691000000 2014-01-01 2014-12-31
6 Cost of Revenue   usd 28164000000 2015-01-01 2015-12-31
>
> ### method = "auto"
> options(download.file.method = "auto")
> head(GetIncome("GOOG", 2016))
           Metric Units      Amount  startDate    endDate
1        Revenues   usd 55519000000 2013-01-01 2013-12-31
2        Revenues   usd 66001000000 2014-01-01 2014-12-31
3        Revenues   usd 74989000000 2015-01-01 2015-12-31
4 Cost of Revenue   usd 21993000000 2013-01-01 2013-12-31
5 Cost of Revenue   usd 25691000000 2014-01-01 2014-12-31
6 Cost of Revenue   usd 28164000000 2015-01-01 2015-12-31

Therefore, I currently suspect that the problem on your end is caused by a network restriction specific to your environment.

If this is the case, how should we proceed?

sewardlee337 avatar Nov 28 '16 06:11 sewardlee337

Hmm, without my XBRL fork I am getting the same error in a Docker instance of this image. I am dialed into a remote server on a separate network.

R version 3.3.2 (2016-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Debian GNU/Linux stretch/sid

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] finreportr_1.0.1 devtools_1.12.0

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.8    XML_3.98-1.5   digest_0.6.10  dplyr_0.5.0    withr_1.0.2
 [6] assertthat_0.1 XBRL_0.99.17   R6_2.2.0       DBI_0.5-1      git2r_0.16.0
[11] magrittr_1.5   httr_1.2.1     stringi_1.1.2  lazyeval_0.2.0 curl_2.3
[16] xml2_1.0.0     tools_3.3.2    stringr_1.1.0  selectr_0.3-0  rvest_0.3.2
[21] memoise_1.0.0  knitr_1.15.1   tibble_1.2

In my opinion, the ideal would be to add more tests and see whether Travis hits the same error.

As far as the JOSS review, I understand that there is not much you can do about an error in a dependency. Now that I have it working with my XBRL fork, I will proceed with the review and I suppose get the opinion of the JOSS editor afterwards...

jsta avatar Nov 30 '16 01:11 jsta

I had the same issue on osx and debian, "fixing" XBRL's method from 'auto' to 'curl' solved this.

jibanes avatar Dec 29 '16 18:12 jibanes

Hello,

Suppose the following call has failed:

xbrl.vars <- xbrlDoAll(inst, verbose = FALSE)

With options(error = recover) set, I used the R debugger to identify the offending file: "https://www.sec.gov/Archives/edgar/data/21344/000002134413000050/ko-20130927.xsd".

I then get errors for each of:

download.file( "https://www.sec.gov/Archives/edgar/data/21344/000002134413000050/ko-20130927.xsd", "apu.tmp" )
download.file( "https://www.sec.gov/Archives/edgar/data/21344/000002134413000050/ko-20130927.xsd", "apu.tmp", method="auto" )
download.file( "https://www.sec.gov/Archives/edgar/data/21344/000002134413000050/ko-20130927.xsd", "apu.tmp", method="libcurl" )

BUT success for

download.file( "https://www.sec.gov/Archives/edgar/data/21344/000002134413000050/ko-20130927.xsd", "apu.tmp", method="curl" )
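
The manual comparison above can be sketched as a small probe that tries each download method on one URL and reports which ones succeed. This is an illustration under stated assumptions: probe_download_methods is a hypothetical helper name, the method list is only a guess at what is worth trying, and the "curl" and "wget" methods need the corresponding executables on the system. The downloader argument exists only so the probe can be exercised without network access.

```r
# Hypothetical helper: try several download.file() methods on one URL and
# return a named logical vector saying which of them succeeded.
probe_download_methods <- function(url,
                                   methods = c("auto", "libcurl", "curl", "wget"),
                                   downloader = utils::download.file) {
  vapply(methods, function(m) {
    dest <- tempfile()
    on.exit(unlink(dest), add = TRUE)  # clean up the temporary file
    tryCatch({
      status <- downloader(url, dest, method = m, quiet = TRUE)
      identical(as.integer(status), 0L)  # download.file() returns 0 on success
    },
    error = function(e) FALSE,
    warning = function(w) FALSE)  # e.g. a "404 Not Found" warning counts as failure
  }, logical(1))
}

# Usage (needs network access):
# probe_download_methods(
#   "https://www.sec.gov/Archives/edgar/data/21344/000002134413000050/ko-20130927.xsd"
# )
```

The result is named by method, so it can be pasted into a bug report alongside sessionInfo().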

After some Googling, I found the issue discussed at http://r.789695.n4.nabble.com/dowload-file-method-quot-libcurl-quot-and-GET-vs-HEAD-requests-td4722037.html

In R 3.2.4, if you ran download.file(method="libcurl"), it issues an HTTP GET request for the file. However, in R 3.3.0, it issues an HTTP HEAD request first, and then a GET request. This can result in problems when the web server gives an error for a HEAD request, even if the file is available with a GET request.

with NO CAN DO resolution :(

No I don't think there is a way to avoid the HEAD request.

Suggestion: the XBRL package should not hard-code method = "auto", because download.file() already falls back to

getOption("download.file.method", default = "auto")

which would allow the user to override the method at will.
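
A minimal sketch of that suggestion, not the XBRL package's actual code: resolve_download_method and fetch_to_cache are hypothetical names. The idea is that an explicit argument takes priority, then the user's download.file.method option, and only then "auto".

```r
# Hypothetical helper: pick the download method, preferring an explicit
# argument, then the user's global option, then "auto".
resolve_download_method <- function(method = NULL) {
  if (!is.null(method)) {
    return(method)
  }
  getOption("download.file.method", default = "auto")
}

# Hypothetical wrapper showing where the resolved method would be used.
fetch_to_cache <- function(url, dest, method = NULL, quiet = TRUE) {
  utils::download.file(url, dest,
                       method = resolve_download_method(method),
                       quiet = quiet)
}

# Usage: a user affected by the HEAD-request problem could then opt in to curl
# for the whole session.
# options(download.file.method = "curl")
# fetch_to_cache(url, "apu.tmp")
```

With this shape, users on affected systems can switch methods without patching or forking the package.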

BR, Jukka

jarjuk avatar Mar 22 '17 12:03 jarjuk