
Chapter 9 notebook minor issues

Open lebaste77 opened this issue 5 years ago • 3 comments

Hi,

There seem to be some issues or differences when I run the chapter 9 notebook.

My configuration :

import sys
import sklearn
import numpy as np
import matplotlib as mpl

print(sys.version_info)
print(sklearn.__version__)
print(np.__version__)
print(mpl.__version__)

sys.version_info(major=3, minor=8, micro=5, releaselevel='final', serial=0)
0.23.2
1.19.2
3.3.2

9.1 : Iris classification and GaussianMixture

It seems that the cluster ids predicted by the GaussianMixture class come out in a different order in the sklearn version I use, so the hard-coded mapping in the notebook no longer matches. Here is the code that works in my case:

y_pred = GaussianMixture(n_components=3, random_state=42).fit(X).predict(X)
mapping = np.array([1, 2, 0])
y_pred = np.array([mapping[cluster_id] for cluster_id in y_pred])
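Since the cluster ids are arbitrary, the mapping could also be derived from the labels instead of being hard-coded. The following is only a sketch of that idea, not the notebook's code: it assumes the Iris data is loaded with `load_iris(return_X_y=True)`, and the names `y_pred_mapped` and `accuracy` are introduced here for illustration. For each cluster it picks the most frequent true class among the instances assigned to it.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.mixture import GaussianMixture

# Assumption: X and y are the Iris features and labels.
X, y = load_iris(return_X_y=True)

y_pred = GaussianMixture(n_components=3, random_state=42).fit(X).predict(X)

# For each cluster id, take the most common true class among its members.
# minlength=3 keeps the count array well-defined even for an empty cluster.
mapping = np.array([
    np.bincount(y[y_pred == cluster_id], minlength=3).argmax()
    for cluster_id in range(3)
])

# Hypothetical names: remapped predictions and their agreement with y.
y_pred_mapped = mapping[y_pred]
accuracy = (y_pred_mapped == y).mean()
print(mapping, accuracy)
```

This should not depend on the order in which a given sklearn version numbers the clusters, although two clusters could in principle map to the same class if the clustering is poor.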

9.1.1 : plot_centroids function

The size of the cross is not correct in the notebook. It produces this (I used a different random_state for the input data): image instead of this: image. The second image was generated with the following code:

def plot_centroids(centroids, weights=None, circle_color='w', cross_color='k'):
    if weights is not None:
        centroids = centroids[weights > weights.max() / 10]
    plt.scatter(centroids[:, 0], centroids[:, 1],
                marker='o', s=50, linewidths=8,
                color=circle_color, zorder=10, alpha=0.9)
    plt.scatter(centroids[:, 0], centroids[:, 1],
                marker='x', s=50, linewidths=2,
                color=cross_color, zorder=11, alpha=1)
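As a side note on the `weights` argument: the filter in `plot_centroids` keeps only the centroids whose weight exceeds one tenth of the largest weight. A small illustration of that line on its own, with made-up centroids and weights (the values and the name `kept` are hypothetical):

```python
import numpy as np

# Hypothetical centroids and weights, for illustration only.
centroids = np.array([[0.0, 0.0],
                      [1.0, 2.0],
                      [3.0, 1.0]])
weights = np.array([0.50, 0.46, 0.04])

# Same filter as in plot_centroids: the threshold is weights.max() / 10,
# i.e. 0.05 here, so the third centroid (weight 0.04) is dropped.
kept = centroids[weights > weights.max() / 10]
print(kept)
```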

lebaste77 avatar Nov 07 '20 18:11 lebaste77

Thanks for your feedback. Unfortunately, when Scikit-Learn changes, the algorithms might vary slightly (and/or the random number generator), and it's impossible to get the exact same results that I obtained when creating the notebooks. I'll see if I can make the "mapping = ..." line more robust.

Similarly, when upgrading Matplotlib, some little things may change. I just ran the colab notebook for this chapter, and the crosses looked fine, but it uses Matplotlib 3.2 by default. After upgrading to Matplotlib 3.3, the crosses look huge, like in your screenshot. It's tricky to keep the notebooks up to date, and even trickier to make them output the same result for various versions of libraries.

Hope this helps.

ageron avatar Nov 20 '20 23:11 ageron

Hi @ageron, can you please assign this issue to me? I will work on it and get back to you with a solution in a couple of days.

ravitejasssihl avatar Oct 19 '22 14:10 ravitejasssihl

Hi @ravitejasssihl , That's kind of you, thanks! Sure, if you send a PR, I'll be happy to check it and merge it. Please ensure that the PR only includes the relevant diff. Side note: the 3rd edition of this book is now available (see https://homl.info/amazon3), and I updated all the libraries and everything, so perhaps it's best to just point people at the new notebook for chapter 9 at https://github.com/ageron/handson-ml3/blob/main/09_unsupervised_learning.ipynb ?

ageron avatar Oct 20 '22 07:10 ageron