
improve time to first csv load

Open rafaqz opened this issue 3 years ago • 8 comments

This PR aims to at least partially solve #974

This first commit demonstrates that simply breaking up large, type-unstable functions can greatly improve compilation time. Splitting File here into threaded and single-threaded contexts means only the path we are actually using gets compiled. For me this shaves the time from #974 from 16 seconds to 12 seconds, without changing anything else.

There is probably a lot more potential for this; the functions in file.jl all seem to be too long. The general pattern is sketched below.
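To illustrate (this is a hedged sketch with made-up names, not CSV.jl's actual internals): a tiny dispatcher plus two separate functions means only the branch that is actually taken has to be compiled.

```julia
# Hypothetical sketch of the threaded/single-threaded split (names are
# illustrative, not CSV.jl's real code). The dispatcher is tiny; each branch
# lives in its own function, so calling the single-threaded path never forces
# compilation of the threaded machinery, and vice versa.
function parsefile(source; ntasks::Int = Threads.nthreads())
    return ntasks > 1 ? parsefile_threaded(source, ntasks) : parsefile_single(source)
end

# Single-threaded path: one simple parsing loop.
@noinline function parsefile_single(source)
    return [split(line, ',') for line in eachline(source)]
end

# Threaded path: chunk the input and parse each chunk on its own task.
@noinline function parsefile_threaded(source, ntasks)
    lines = readlines(source)
    chunksize = max(1, cld(length(lines), ntasks))
    tasks = [Threads.@spawn [split(l, ',') for l in chunk]
             for chunk in Iterators.partition(lines, chunksize)]
    return reduce(vcat, fetch.(tasks); init = Vector{SubString{String}}[])
end
```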

rafaqz avatar Jan 31 '22 11:01 rafaqz

bravo!

bkamins avatar Jan 31 '22 12:01 bkamins

Fixing type stability and adding precompilation runs brings the CSV.File load time down to 7 seconds, and using CSV to 1 second.

There is another second available by removing the @refargs macro so that Context precompiles, but I'm not sure what the purpose of this macro is?
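For reference, a "precompilation run" in this sense is just declaring, or actually executing, a small representative workload at package definition time so the work it triggers can be cached. A minimal, generic sketch (hypothetical module and function names, not the code in this PR):

```julia
# Generic sketch of a precompile workload (hypothetical names, not this PR's
# actual code). It only pays off once the code is type stable enough for the
# cached inference/compilation results to actually be reused.
module TinyCSV

parsefile(source) = [split(line, ',') for line in eachline(source)]

# Declare a concrete signature we expect users to hit...
precompile(parsefile, (String,))

# ...and/or run a tiny workload while the package is being precompiled.
let io = IOBuffer("a,b\n1,2\n")
    parsefile(io)
end

end # module
```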

rafaqz avatar Feb 01 '22 08:02 rafaqz

Sorry for not being able to review earlier; I think this is exciting stuff! One comment I'll make is that a lot of the current code organization was done in an effort to avoid recompilation costs as much as possible. It's become clear to me now that doing that has brought on a pretty severe cost for TTFR, but I'd also like to keep that in mind as we make efforts to reduce TTFR. I.e. some of the changes here mean that if you slightly change a keyword argument, we have to pay significant compilation costs again, which hopefully we can avoid. In CSV.jl early days, we were at the opposite end of the spectrum where every single call to CSV.File almost guaranteed recompiling everything; but I think I agree that we've probably swung back too far the other way by compiling too much when you initially call CSV.File and we can allow more incremental compilation as various features are actually used/needed while parsing.

As a reviewer comment, it would probably be nice to split this up into separate PRs? As-is, it's very hard to review since there's so much moving/changing around.

quinnj avatar Feb 01 '22 14:02 quinnj

I figured something like that might have been the case. If we can get precompilation working nearly everywhere and functions are broken up enough that they are decoupled, we should hopefully only see minor recompilation for small changes. It's really the file.jl internals that take a long time, and it seems they are now precompiling for all common types. Context takes under a second the first time, so subsequent changes shouldn't be too bad either.

(Notice I've added type annotations to a lot of functions... that's just to remind me which functions will and won't need to recompile from changes further out, and to push more of them towards not recompiling.)

I know this PR is currently pretty messy and buggy... it's pretty much a blind search for type stability with Cthulhu.jl, not really taking continuity or clarity into account.

Unfortunately it's hard to split up the PR while working on it, because precompilation only works past a certain threshold of stability and reduced function size (which, to be honest, I don't totally understand), and below that most things have no noticeable effect. Once I can lock down everything required to make this work, I can go back and split it all up into separate changes.

To summarise the mess of code changes, the main kinds of changes are:

  • smaller more specific functions
  • splitting unrelated functionality
  • splitting unstable sections from stable sections of code to isolate things
  • lots of @noinline to force functions to precompile separately (it seems to help with that anyway)
  • more concrete field types. pool being a Float64 or a Tuple causes a lot of instability, so two fields with a default of typemax(Int) for poollimit (which is already the default for limit) means it's always type stable (I guess it could also be one field that's always a tuple; see the sketch after this list)
  • typed Context arguments, removing the @refargs macro
  • removing the __init__ method and using precompilation instead, once it starts working

Some of these things are probably incantations that do nothing; it's hard to tell exactly what makes precompilation work, and how much type instability and how many large functions are tolerable.
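As an example of the "more concrete field types" point above (field names are illustrative; this is not CSV.jl's actual struct layout):

```julia
# A field that may hold either a Float64 or a Tuple forces a Union field
# type, so every piece of code reading it has to handle both cases:
struct LooseOptions
    pool::Union{Float64,Tuple{Float64,Int}}
end

# Splitting it into two concretely typed fields, with typemax(Int) standing
# in for "no limit" (the same convention `limit` already uses), keeps every
# field access concretely typed:
struct TightOptions
    pool::Float64
    poollimit::Int
end
TightOptions(pool::Real) = TightOptions(Float64(pool), typemax(Int))
```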

rafaqz avatar Feb 01 '22 14:02 rafaqz

Precompile in Parsers.jl helps a lot here: https://github.com/JuliaData/Parsers.jl/pull/108

TTFX with both of these changes is now about 5 seconds (but there are quite a few bugs in this PR, and it's a mess).

Surprisingly, once type stability was good enough, precompiling things here stopped improving anything, and actually made TTFX worse!

rafaqz avatar Feb 03 '22 23:02 rafaqz

Now that the Parsers PR is merged, what is the status of this?

oscardssmith avatar Feb 14 '22 19:02 oscardssmith

There is another 30-40% TTFX improvement available in some of this code, but some of the changes are unnecessary now; this branch had larger gains before the Parsers.jl fix.

The changes here that still make sense will be more controversial relative to the more modest improvements they now give, so they won't be a high priority for me.

rafaqz avatar Feb 14 '22 19:02 rafaqz

Restructuring Context and the preloading of it would probably be the most obvious next step.

rafaqz avatar Feb 14 '22 19:02 rafaqz

I suppose this is outdated and can be closed.

ViralBShah avatar Feb 28 '24 00:02 ViralBShah

Agreed, but a lot of these changes seem good (at least from a usability perspective)

oscardssmith avatar Feb 28 '24 00:02 oscardssmith

I didn't mention it here, but after the later fix to Parsers.jl (removing just 4 @inlines 😂) this PR stopped doing much:

https://github.com/JuliaData/Parsers.jl/pull/1140

All the @noinlines and function barriers were only effective because of the constants that Parsers.jl was inlining (roughly illustrated below). There are probably a few type stability things here worth keeping, but maybe starting a new PR and benchmarking without the Parsers.jl problems would make more sense.
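A rough illustration of that interaction (made-up names; this shows general Julia inlining behaviour, not Parsers.jl's actual code): when a callee is force-inlined, its body, including any large constants it references, gets compiled into every caller, so each caller becomes expensive to compile and barriers that split callers up pay off. Once the @inline is dropped, callers stay small and the barriers no longer buy much.

```julia
# Stand-in for a large constant that a hot parsing helper references.
const BIG_TABLE = Tuple(rand(UInt8, 256))

# Force-inlined: the body (and its work on BIG_TABLE) is compiled into every
# caller, inflating each caller's compile time.
@inline lookup_inlined(b::UInt8) = BIG_TABLE[Int(b) + 1]

# Outlined: compiled once, cheap to call from many places.
@noinline lookup_outlined(b::UInt8) = BIG_TABLE[Int(b) + 1]

# Two otherwise identical callers; with the inlined helper, every such caller
# pays the inlining cost at compile time, which is what the function barriers
# in this PR were working around.
parse_with_inlined(bytes::Vector{UInt8}) = sum(lookup_inlined, bytes)
parse_with_outlined(bytes::Vector{UInt8}) = sum(lookup_outlined, bytes)
```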

rafaqz avatar Feb 28 '24 08:02 rafaqz