Fix stan_csv_reader and test
The stan_csv_reader is broken and has been for some time. The internal state is extremely fragile and right now it incorrectly parses the input data file. Any fix would require some significant hacking -- we really should fix this as a consequence of the refactor in #405 and #488, after which we'll have an actual spec for the reader to match.
Unit tests never revealed this to be an issue because the associated test, stan_csv_reader_test, uses static output files for comparison that were deprecated. We now have the makefile-fu to automatically generate these files and avoid this issue in the future.
Michael --- could you start a branch and add a test that fails? Or just include a file that fails here?
I'm pushing this off to 2.4.0+ for now, since it doesn't seem to be critical. If it is, can someone push it back down to 2.4.0 and commit to fixing it?
Michael --- if you want this fixed, could you be more specific about what's going wrong?
I'm pushing this issue off indefinitely pending some indication of what's broken.
The problem is that the csv_reader is hardcoded to an older set of configuration options. It needs to automatically sync up with the current config (say, by actually using the config code), but that requires a decision on how the config should be used in the API.
On Sep 22, 2014, at 11:29 PM, Bob Carpenter [email protected] wrote:
Michael --- if you want this fixed, could you be more specific about what's going wrong?
I'm pushing this issue off indefinitely pending some indication of what's broken.
— Reply to this email directly or view it on GitHub.
I'm bumping this issue forward. This code is messy and really should just be rewritten.
@bob-carpenter, do you have any guiding principles for writing a parser for the csvs? This is also causing stan-dev/cmdstan#510.
My only strong recommendation is to write a generic CSV parser, then plug it into Stan. Don't mix in all sorts of stuff about reading from comments or particular orders of things in the Stan files.
That's reasonable, but it also means that this would be hard to resolve without changing out CmdStan output. Maybe we just punt on elegance and have CmdStan start outputting multiple files to isolate a clean CSV for samples?
On Wed, Nov 30, 2016 at 12:13 PM, Bob Carpenter [email protected] wrote:
My only strong recommendation is to write a generic CSV parser, then plug it into Stan. Don't mix in all sorts of stuff about reading from comments or particular orders of things in the Stan files.
— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/stan-dev/stan/issues/507#issuecomment-263982114, or mute the thread https://github.com/notifications/unsubscribe-auth/ABdNlvKWynzxvnGPcc4kugW90dak4fzOks5rDdjqgaJpZM4BbKEP .
+1 to the multiple files approach
On Wed, Nov 30, 2016, 4:05 PM Michael Betancourt [email protected] wrote:
That's reasonable, but it also means that this would be hard to resolve without changing out CmdStan output. Maybe we just punt on elegance and have CmdStan start outputting multiple files to isolate a clean CSV for samples?
On Wed, Nov 30, 2016 at 12:13 PM, Bob Carpenter [email protected] wrote:
My only strong recommendation is to write a generic CSV parser, then plug it into Stan. Don't mix in all sorts of stuff about reading from comments or particular orders of things in the Stan files.
— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/stan-dev/stan/issues/507#issuecomment-263982114, or mute the thread < https://github.com/notifications/unsubscribe-auth/ABdNlvKWynzxvnGPcc4kugW90dak4fzOks5rDdjqgaJpZM4BbKEP
.
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/stan-dev/stan/issues/507#issuecomment-263994946, or mute the thread https://github.com/notifications/unsubscribe-auth/AAfA6Xm7v8l5qD-6Y37uuL6i6halwe60ks5rDeUFgaJpZM4BbKEP .
Or, split the processing into two steps. One that just pulls the CSV header and draws following ordinary CSV interpretation rules (not exactly a spec, but a strong convention). And then handle other things in a separate pass.
+1. I think we can implement it that way for now (2 passes).
On Wed, Nov 30, 2016 at 6:27 PM, Bob Carpenter [email protected] wrote:
Or, split the processing into two steps. One that just pulls the CSV header and draws following ordinary CSV interpretation rules (not exactly a spec, but a strong convention). And then handle other things in a separate pass.
— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/stan-dev/stan/issues/507#issuecomment-264029811, or mute the thread https://github.com/notifications/unsubscribe-auth/AAZ_FxSIV8qzqBktWp9ODh4uaAZ7LodYks5rDgZUgaJpZM4BbKEP .