How to draw geom_band or geom_text on a bar graph with discrete bars
Happy New Year,
An easy one for the start of the year.
I'm trying to draw bands (geom_band) and text across discrete bars.
In the graph below I want to draw three vertical bands of different colours that highlight the bins belonging to the same company, and also label each band: Subaru: ['Outback', 'Impreza', 'BRZ'], Volkswagen: ['Jetta', 'Passat'] and AMC: ['Matador', 'Rambler', 'Pacer'].
I'm not sure how to obtain the position and bin width of the discrete bins so that I can use them as numerical values for geom_band(xmin, xmax). I have the same issue with geom_text.
`
import numpy as np
import pandas as pd
from lets_plot import *

LetsPlot.setup_html()
np.random.seed(69)

cars = pd.DataFrame({
    'Models': ['Outback', 'Impreza', 'BRZ', 'Jetta', 'Passat', 'Matador', 'Rambler', 'Pacer'],
    'Val': np.random.uniform(0, 100, size=8)
})

p3 = ggplot(data=cars, mapping=aes(x='Models', weight='Val')) + geom_bar()
`
Thanks!
Hi, thanks, happy New Year to you too! The equivalent continuous positions would be 0 for the Outback tick-mark through 7 for the Pacer tick-mark.
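As a minimal sketch of that mapping, in plain Python and without any lets-plot calls: assuming the discrete axis lays the categories out in order of first appearance in the data (which is what the 0 to 7 positions above imply), each tick-mark sits at the integer index of its category. The name `tick_pos` is only illustrative.

```python
models = ['Outback', 'Impreza', 'BRZ', 'Jetta', 'Passat', 'Matador', 'Rambler', 'Pacer']

# Assumption: categories are placed in order of first appearance,
# so the tick-mark of the i-th distinct category sits at x = i.
# dict.fromkeys drops duplicates while keeping that order.
tick_pos = {name: i for i, name in enumerate(dict.fromkeys(models))}

print(tick_pos['Outback'], tick_pos['Pacer'])
```

A band that should cover a whole bin then starts half a unit before the first tick-mark of the group and ends half a unit after the last one.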
Hi Alshan,
I'm not sure what you mean. Are you suggesting that for the car bins I use categorical values in the x-axis and for the bands I provide them numerical x positions and the plot will internally order the bands and bars along the x-axis?
I tried that with the following code
`
import numpy as np
import pandas as pd
from lets_plot import *

LetsPlot.setup_html()
np.random.seed(69)

# data for bins
cars = pd.DataFrame({
    'Models_str': ['Outback', 'Impreza', 'BRZ', 'Jetta', 'Passat', 'Matador', 'Rambler', 'Pacer'],
    'Models': [0, 1, 2, 3, 4, 5, 6, 7],
    'Val': np.random.uniform(0, 100, size=8),
})

# data for bands and geom_text
cars_band = pd.DataFrame({
    'Brand': ['Subaru', 'Volkswagen', 'AMC'],
    'pos_minx': [-0.5, 2.5, 4.5],
    'pos_maxx': [2.5, 4.5, 7.5],
    'pos_maxy': 3 * [100],
    'M': ['#41DC8E', '#E0FFFF', '#90D5FF']
})

p3 = (ggplot(data=cars, mapping=aes(x='Models_str', weight='Val'))
      + geom_band(data=cars_band,
                  mapping=aes(xmin='pos_minx', xmax='pos_maxx', fill='Brand', color='Brand'))
      + geom_bar()
      + scale_fill_manual(values=cars_band.M)
      + scale_color_manual(values=cars_band.M))
p3.show()
`
I get the following plot
To give you an idea of the graph that I would like to draw, I converted all the centres into numerical positions and replaced the labels of the continuous x-axis.
The plotting code, using the same data as above, is below.
`
# dictionary for x-axis labels
d1 = {0: 'Outback', 1: 'Impreza', 2: 'BRZ', 3: 'Jetta', 4: 'Passat', 5: 'Matador', 6: 'Rambler', 7: 'Pacer'}

p3a = (ggplot(data=cars, mapping=aes(x='Models', weight='Val'))
       + geom_band(data=cars_band, mapping=aes(xmin='pos_minx', xmax='pos_maxx',
                                               fill='Brand', color='Brand'), alpha=0.5)
       + geom_text(data=cars_band, mapping=aes(x='pos_minx', y='pos_maxy', label='Brand'),
                   size=8, fontface='bold', hjust='left')
       + geom_bar()
       + scale_fill_manual(values=cars_band.M)
       + scale_color_manual(values=cars_band.M)
       + scale_x_continuous(labels=d1)
       + theme(legend_position='none'))
p3a.show()
`
Is there an easier or more direct way to do this?
I'm just a bit worried that I may be missing an obvious conversion that generates coordinates for categorical data. Perhaps there is an easier way to convert them using a lets-plot function or element, or perhaps I'm not preparing the data correctly for use with scale_x_discrete. This is the element I use in other plots with geom_bar for categorical data.
For example, what would I do if I use the dodge position for the geom_bar?
Another related question: is all the drawing code written in Kotlin? When I have a bit of time, I'm tempted to explore how everything is calculated to see if I can fix something. Is there developer documentation describing how the library works?
Thanks!
Is there an easier or more direct way to do this?
In my opinion, an easier way would be to keep the x-axis discrete and add all annotating bands and texts with their attributes set via parameters (as opposed to using data mapping via aes()).
Here's how it can work:
p4 = ggplot(data=cars, mapping=aes(x='Models_str', weight='Val'))
# Add annotating geoms
brand = ['Subaru', 'Volkswagen', 'AMC']
pos_minx = [-0.5, 2.5, 4.5]
pos_maxx = [2.5, 4.5, 7.5]
pos_maxy = 3 * [100]
colors = ['#41DC8E', '#E0FFFF', '#90D5FF']
for i in range(3):
    p4 += (geom_band(xmin=pos_minx[i], xmax=pos_maxx[i], fill=colors[i], color=colors[i], alpha=0.5) +
           geom_text(label=brand[i], x=pos_minx[i], y=pos_maxy[i], size=8, fontface='bold', hjust='left'))
# Add bars on top
p4 + geom_bar() + ggsize(700, 400)
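The hard-coded pos_minx and pos_maxx lists could also be derived from the data instead of typed by hand. The following is only a sketch in plain Python: the `brand_of` mapping and the `half_width` value are assumptions made for illustration (tick-marks one unit apart, so a bin spans 0.5 on either side of its tick-mark), not something lets-plot provides.

```python
models = ['Outback', 'Impreza', 'BRZ', 'Jetta', 'Passat', 'Matador', 'Rambler', 'Pacer']

# Hypothetical lookup from model to brand; in practice this could be a DataFrame column.
brand_of = {
    'Outback': 'Subaru', 'Impreza': 'Subaru', 'BRZ': 'Subaru',
    'Jetta': 'Volkswagen', 'Passat': 'Volkswagen',
    'Matador': 'AMC', 'Rambler': 'AMC', 'Pacer': 'AMC',
}

# Assumption: tick-marks sit at 0, 1, 2, ... so each bin spans +/- 0.5 around its tick-mark.
half_width = 0.5

bands = {}  # brand -> (xmin, xmax)
for pos, model in enumerate(models):
    brand = brand_of[model]
    lo, hi = pos - half_width, pos + half_width
    xmin, xmax = bands.get(brand, (lo, hi))
    # Widen the band so it covers every bin of this brand seen so far.
    bands[brand] = (min(xmin, lo), max(xmax, hi))

print(bands)
```

The resulting pairs reproduce the values used in the loop, and each pair can be passed as xmin and xmax to geom_band in the same way.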
For example, what would I do if I use the dodge position for the geom_bar?
The numeric coordinates of the tick-marks will be the same, and you can use decimal numbers to position anything between or beyond the tick-marks. So, basically, there will be no difference.
is all the drawing code written in Kotlin?
Right, it is all written in Kotlin and then compiled to JS.
Hi @syd-doyen, could you try v4.7.0?
Now continuous data is handled by discrete scales as you would expect. Take a look: https://nbviewer.org/github/JetBrains/lets-plot/blob/master/docs/f-25b/numeric_data_on_discrete_scale.ipynb