data_science_in_julia_for_hackers
data_science_in_julia_for_hackers copied to clipboard
Chapter 3 Julian code feedback
Went through chapter 3 and changed some of the code to make it more "Julian". Most of the changes I made are in the bandits part, but I also made a small change to the code that reads the rainfall data:
begin
#Read the CSV file and transform it into a DataFrame
rain_data = CSV.read("data/historico_precipitaciones.csv", DataFrame)
#Rename the columns
colnames = ["Year", "Month", "mm", "Days"]
#Symbol is the type of object used to represent the labels of a dataset
rename!(rain_data, Symbol.(colnames))
#We use a dictionary to translate de Month names
translate = Dict(
"Enero" => "January",
"Febrero" => "February",
"Marzo" => "March",
"Abril" => "April",
"Mayo" => "May",
"Junio" => "June",
"Julio" => "July",
"Agosto" => "August",
"Septiembre" => "September",
"Octubre" => "October",
"Noviembre" => "November",
"Diciembre" => "December"
)
replace!(rain_data.Month, translate...)
end
The main change here is replacing the for
loop that was making the replacement in the Month
column.
I changed the definition of the beta_bandit
struct to BetaBandit
as it's the Julian way to use UpperCamelCase to name structs (see style guide in the docs.
Additionally I used Base.@kwdef
to have default values for the BetaBandit
struct. This goes in line with the following statement in the Constructors section of the Julia Docs
It is good practice to provide as few inner constructor methods as possible: only those taking all arguments explicitly and enforcing essential error checking and transformation. Additional convenience constructor methods, supplying default values or auxiliary transformations, should be provided as outer constructors that call the inner constructors to do the heavy lifting. This separation is typically quite natural.
# init with params a=1, b=1 (uniform distribution)
Base.@kwdef mutable struct BetaBandit
# the real success probability of the bandit
p::Float64
# a parameter from the beta distribution (successes)
a::Int64 = 1
# b parameter from the beta distribution (fails)
b::Int64 = 1
# total number of trials of bandit
N::Int64 = 0
end
Another way to do it would be to define an outside constructor for the struct, as such:
mutable struct BetaBandit
# the real success probability of the bandit
p::Float64
# a parameter from the beta distribution (successes)
a::Int64
# b parameter from the beta distribution (fails)
b::Int64
# total number of trials of bandit
N::Int64
end
BetaBandit(p) = BetaBandit(p, 1, 1, 0)
I modified the update_bandit
function to update_bandit!
. The Julia style guide suggests using !
at the end of method names that mutate their arguments. I also used the ternary operator to make the code more concise and Julian:
function update_bandit!(bandit::BetaBandit, outcome::Bool)
outcome ? bandit.a += 1 : bandit.b += 1
bandit.N += 1
end
For the rest of the bandit code I changed the following:
- Refactored
find_best_bandit
into a standalone function (this makes the Thompson sampling algorithm more clear IMO) -
N_TRIALS
andBANDIT PROBABILITIES
intoconst
since they won't change. - Refactored
plot_bandits
into a standalone function. Again, I think this makes the Thompson algorithm easier to see. - Used short circuit evaluation to update the
reward
variable in a more Julian way.
Here's the rest of the code:
const N_TRIALS = 100
const BANDIT_PROBABILITIES = [0.3, 0.5, 0.75]
function find_best_bandit(bandits)
_, mxidx = findmax([sample_bandit(bandit) for bandit in bandits])
bandits[mxidx]
end
function beta_bandit_experiment(band_probs, trials)
bandits = [BetaBandit(;p) for p in band_probs]
reward = 0
for i in 1:trials
best_bandit = find_best_bandit(bandits)
exp = pull_arm(best_bandit)
exp && (reward += 1)
update_bandit!(best_bandit, exp)
end
bandits
end
function plot_bandits(bandits)
plot()
for i in 1:length(bandits)
display(
plot!(
Beta(bandits[i].a, bandits[i].b),
xlim=(0, 1),
xlabel="Success probability of bandit",
ylabel="Probability density",
lw=2
)
)
end
current()
end
Hope you find this helpful!
@Francososa THANK you for taking the time to read the book and the feedback. We are in the middle of a big refactor. With @marianLambda @HermanObst and @pefontana we will discuss in the following weeks this. Thanks again!