Probabilistic-Programming-and-Bayesian-Methods-for-Hackers
Chapter 3: the cluster prediction issue.
Should the print "Probability of belonging to cluster 1:" be "Probability of belonging to cluster 0:"? Or am I missing the meaning?
I think this is the case as well. Earlier in the chapter he states:
A priori, we do not know what the probability of assignment to cluster 1 is, so we form a uniform variable on (0, 1). We call this p1, so the probability of belonging to cluster 2 is therefore p2 = 1 − p1.
and defines the variables as:
with pm.Model() as model:
    p1 = pm.Uniform('p', 0, 1)
    p2 = 1 - p1
    p = T.stack([p1, p2])
    assignment = pm.Categorical("assignment", p,
                                shape=data.shape[0],
                                testval=np.random.randint(0, 2, data.shape[0]))
where p1 is the probability of belonging to the lower-mean (~120) cluster and p2 is the probability of belonging to the higher-mean (~190) cluster.
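To make the indexing explicit, here is a minimal sketch of how the order of the stacked probabilities maps onto the integer labels. It uses plain NumPy sampling instead of pm.Categorical, and the value of p1 is a made-up number for illustration, not a result from the chapter:

```python
import numpy as np

# Hypothetical value for p1; in the chapter it is a Uniform(0, 1) variable.
p1 = 0.7
p2 = 1 - p1

# Same ordering as T.stack([p1, p2]): p1 sits at index 0, p2 at index 1.
p = np.array([p1, p2])

# Draw categorical labels; label i is drawn with probability p[i].
rng = np.random.default_rng(0)
assignment = rng.choice(len(p), size=10_000, p=p)

frac_label_0 = (assignment == 0).mean()
frac_label_1 = (assignment == 1).mean()

print("Fraction with label 0 (prose 'cluster 1', probability p1):", frac_label_0)
print("Fraction with label 1 (prose 'cluster 2', probability p2):", frac_label_1)
```

So what the prose calls "cluster 1" carries the label 0 in the code, and "cluster 2" carries the label 1.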
But later on he refers to the clusters as cluster 0 and cluster 1:
we are interested in asking "Is the probability that x is in cluster 1 greater than the probability it is in cluster 0?", where the probability is dependent on the chosen parameters.
Here cluster 0 is the lower-mean cluster and cluster 1 is the higher-mean cluster, but the code then uses p_trace (p1), which is the probability of belonging to cluster 0, on the left-hand side of the comparison instead of p2 (1 - p1):
v = p_trace * norm_pdf(x, loc=center_trace[:, 0], scale=std_trace[:, 0]) > \
(1 - p_trace) * norm_pdf(x, loc=center_trace[:, 1], scale=std_trace[:, 1])
print("Probability of belonging to cluster 1:", v.mean())
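I think this typo stems from the switch in naming from "cluster 1 and cluster 2" to "cluster 0 and cluster 1". As written, v is true when cluster 0 is the more probable one, so v.mean() is the probability of belonging to cluster 0.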
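A rough sketch of what the relabelled check could look like. The traces below are made-up stand-ins (random draws near plausible values), not output from the chapter's sampler, and norm_pdf is written out by hand so the snippet only needs NumPy:

```python
import numpy as np

def norm_pdf(x, loc, scale):
    """Normal density, written out so the sketch needs only NumPy."""
    z = (x - loc) / scale
    return np.exp(-0.5 * z ** 2) / (scale * np.sqrt(2.0 * np.pi))

# Hypothetical stand-ins for the posterior traces (not real sampler output).
rng = np.random.default_rng(0)
n = 5000
p_trace = rng.uniform(0.55, 0.65, size=n)  # p1, i.e. P(cluster 0)
center_trace = np.column_stack([rng.normal(120, 1, n), rng.normal(190, 1, n)])
std_trace = np.column_stack([rng.uniform(28, 32, n), rng.uniform(20, 24, n)])

x = 175

# Same comparison as in the chapter: True when cluster 0 is the more likely one.
v = p_trace * norm_pdf(x, loc=center_trace[:, 0], scale=std_trace[:, 0]) > \
    (1 - p_trace) * norm_pdf(x, loc=center_trace[:, 1], scale=std_trace[:, 1])

prob_cluster_0 = v.mean()
prob_cluster_1 = 1 - prob_cluster_0  # same as (~v).mean(), ignoring exact ties

print("Probability of belonging to cluster 0:", prob_cluster_0)
print("Probability of belonging to cluster 1:", prob_cluster_1)
```

Either fix works: keep the comparison and change the label to "cluster 0", or keep the label and report 1 - v.mean().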