show_plot() bars not aligned w/ values below.

Open sfd99 opened this issue 3 years ago • 6 comments

Hi Alastair,

The inspectdf package is really useful!

But I've noticed a show_plot() quirk...

try:

unique(mtcars$carb) 
[1] 4 1 2 3 6 8
 inspect_num(mtcars) %>% show_plot()

See? The vertical bars in the carb plot are not aligned with the unique-value markers below them on the x-axis.

The bars are all slightly displaced to the right, rather than sitting on top of the unique carb values (4 1 2 3 6 8). Even resizing the RStudio Plots panel doesn't help.

The same problem affects the am, gear, vs and cyl columns, among others.

Hope you can help. Thanks Alastair!

sfd99 San Francisco latest Rstudio/R/Ubuntu Linux inspectdf 0.0.11

image

sfd99 avatar Apr 02 '21 14:04 sfd99

Hey @sfd99, thanks for raising the issue, and sorry for the long delay in responding. Although it looks a bit strange once visualised, I think this is the expected behaviour: the histogram binning is carried out by base R's hist() function, and those bins won't necessarily be centred over the values when there are only a small number of unique values. For example, hist(mtcars$carb, breaks = 20) and hist(mtcars$cyl, breaks = 20) both generate images that are consistent with inspectdf.
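To see the binning outside inspectdf, here is a minimal base R sketch that calls hist() with plot = FALSE and compares the bar centres with the observed carb values. It only illustrates the hist() behaviour described above and is not inspectdf's internal code; the exact breaks are chosen by R, since breaks = 20 is only a suggestion.

```r
# Compute the histogram bins for carb without drawing anything.
h <- hist(mtcars$carb, breaks = 20, plot = FALSE)

# Bin edges chosen by hist(); `breaks = 20` is only a suggestion.
print(h$breaks)

# Bin midpoints, i.e. where each bar is centred when drawn.
print(h$mids)

# The observed values of carb.
carb_values <- sort(unique(mtcars$carb))
print(carb_values)

# TRUE for each observed value that coincides with a bar centre.
print(carb_values %in% h$mids)
```

Any value for which the last line prints FALSE sits away from the centre of its bar, which is the kind of offset visible in the plot.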

However, I do see the problem, and I think there should be a feature that makes it obvious when the histogram doesn't make much sense. I'll have a think and get back if there is an update. Thanks again :)

alastairrushworth avatar Dec 10 '21 17:12 alastairrushworth

Hi Alastair,

Thanks for the response.

Yes, please do let us know if you find a solution to this odd hist() quirk. Looking forward to it!

Great package...

Best, sfd99 San Francisco latest Rstudio/R/Ubuntu Linux inspectdf 0.0.11

sfd99 avatar Dec 11 '21 00:12 sfd99

I found a starting point for this histogram misalignment problem:

https://stackoverflow.com/questions/41486027/how-to-align-the-bars-of-a-histogram-with-the-x-axis

Take a look!

I found it with this Google query: how to align histogram bars with the corresponding values in R

One of the solutions quoted there applies when the x-values are integers. It centers each histogram bar directly on top of its x-axis value:

data <- data.frame(number = c(5, 10, 11, 12, 12, 12, 13, 15, 15))
ggplot(data, aes(x = number)) + geom_histogram(binwidth = 0.5)
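As a hedged variation on the quoted solution, the sketch below applies the same idea to mtcars$carb and sets the bin alignment explicitly through the binwidth and center arguments of geom_histogram() instead of relying on defaults. The choice of binwidth = 1 and center = 0 is an assumption that suits integer-valued columns; layer_data() is used only to read back the bar centres that ggplot2 computed.

```r
library(ggplot2)

# One bin per integer: width 1, with bins centred on whole numbers.
p <- ggplot(mtcars, aes(x = carb)) +
  geom_histogram(binwidth = 1, center = 0)

# Read back the computed bins: `x` is the bar centre, `count` its height.
bins <- layer_data(p)
print(bins[, c("x", "count")])

# Draw the plot.
print(p)
```

Values with no observations (5 and 7 here) still get an empty bin, so the x-axis keeps its numeric spacing.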

But there must be better ways to do this in R... SFd99

sfd99 avatar Dec 11 '21 01:12 sfd99

And finally, one last possible solution that looks good:

ggplot(mtcars, aes(x = factor(cyl))) + geom_bar()

image

Via: https://www.guru99.com/r-bar-chart-histogram.html

sfd99 avatar Dec 11 '21 02:12 sfd99

Thanks! I think a general solution to this is more complex than it sounds. It's not uncommon to have int columns with very many unique values, where it wouldn't make sense to treat them as categories. Perhaps there could be a simple rule, e.g. fewer than 10 unique values and int --> count frequencies rather than using buckets. I'm not sure.
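A rough base R sketch of that rule, purely as an illustration: summarise_numeric() and its max_unique argument are hypothetical names, not part of inspectdf. Because mtcars stores its columns as doubles, the sketch checks for whole-number values rather than integer storage type, which is an assumption about how "int" might be interpreted.

```r
# Hypothetical helper, not part of inspectdf: choose between per-value counts
# and hist() bins for a numeric column.
summarise_numeric <- function(x, max_unique = 10) {
  x <- x[!is.na(x)]
  # Treat the column as integer-like when every value is a whole number.
  is_whole <- all(x == round(x))
  if (is_whole && length(unique(x)) < max_unique) {
    # Few distinct whole numbers: count each value directly.
    counts <- table(x)
    list(type = "counts",
         value = as.numeric(names(counts)),
         n = as.integer(counts))
  } else {
    # Otherwise fall back to ordinary histogram bins.
    h <- hist(x, plot = FALSE)
    list(type = "bins", value = h$mids, n = h$counts)
  }
}

str(summarise_numeric(mtcars$carb))  # few whole-number values: counted
str(summarise_numeric(mtcars$mpg))   # continuous values: binned
```

The threshold of 10 comes from the suggestion above; whether it generalises well is an open question.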

alastairrushworth avatar Dec 12 '21 12:12 alastairrushworth

Hi Alastair,

Yes, I agree.

The rule in your comment above could be a possible solution to the hist alignment quirk...

If it works in a generic way, inspectdf's show_plot() would be the first to solve it. :-)

thanks! SFd99

sfd99 avatar Dec 12 '21 22:12 sfd99