AlgebraOfGraphics.jl

handling missing values

Open mkborregaard opened this issue 4 years ago • 2 comments

AlgebraOfGraphics doesn't seem to handle missing values the same way Makie does. Consider:

[Screenshot 2021-10-15 at 11:03:44]

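Using dropmissing on the full DataFrame is not viable, as the missings are likely to be in columns you don't use, so rows that are complete for the plotted columns would be dropped too. A workaround is to restrict dropmissing to the plotted columns: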

data(dropmissing(allison, [:BodyWt, :Gestation])) *
    mapping(:BodyWt, :Gestation) |>
    draw

but I believe users would expect AoG to handle missing values the way Makie does. That would also be more user-friendly; the current behaviour seems like a bug.

mkborregaard avatar Oct 15 '21 12:10 mkborregaard

The discrepancy between what I would expect (and what Makie does) and what AoG does is more pronounced for line graphs: I want missing data points to break the line, i.e. to leave a gap. By default AoG instead maps those values to max+1, as seen above. Even the proposed workaround doesn't fix the issue: dropmissing doesn't break the line but connects the surrounding points, so a potentially misleading line passes through the "missing" data points.
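One possible way to get a gap, sketched under assumptions rather than as confirmed AoG behaviour: replace missing with NaN before plotting, since Makie's lines breaks at NaN values. The data below is hypothetical, and whether AoG passes NaN through to Makie unchanged is an assumption, so the plotting calls are left commented out.

```julia
# Hypothetical data: y has a missing value in the middle of the series.
x = 1:5
y = [1.0, 2.0, missing, 4.0, 5.0]

# Replace each missing with NaN; broadcasting narrows the result
# to a plain Vector{Float64}.
y_nan = coalesce.(y, NaN)

# Plain Makie draws a gap at the NaN:
#   using CairoMakie
#   lines(x, y_nan)
#
# Assumption (not verified here): AoG forwards the NaN to Makie as-is.
#   using AlgebraOfGraphics
#   data((; x, y = y_nan)) * mapping(:x, :y) * visual(Lines) |> draw
```

Unlike dropmissing, this keeps the row in place, so the points on either side of the gap are not joined.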

grahamas avatar Mar 02 '22 18:03 grahamas

I would rather get an error and pass skipmissing=true where needed than get incomplete and misleading results by default. Maybe there could even be a functional "preprocessing" API, similar to the Analysis API, that would allow transforming the data prior to plotting:

data(d) * preprocess(dropmissing) * ...

jariji avatar Aug 27 '22 19:08 jariji