
ENH: Test Execution for Google Colab

Open kp992 opened this issue 1 year ago • 21 comments

This PR adds Google Colab compatibility testing for the intro lecture series.

Related to https://github.com/QuantEcon/meta/issues/139.

kp992 avatar May 08 '24 09:05 kp992

Deploy Preview for taupe-gaufre-c4e660 ready!

Name Link
Latest commit 0a5716baebf945864bc7a57e0192ac6015cc23af
Latest deploy log https://app.netlify.com/sites/taupe-gaufre-c4e660/deploys/664ab13889fd5c0008787cf7
Deploy Preview https://deploy-preview-441--taupe-gaufre-c4e660.netlify.app

netlify[bot] avatar May 08 '24 09:05 netlify[bot]

🚀 Deployed on https://664ab333977ec6417c152035--taupe-gaufre-c4e660.netlify.app

github-actions[bot] avatar May 08 '24 09:05 github-actions[bot]

@mmcky It seems that a larger machine has not been assigned to run the job.

kp992 avatar May 09 '24 02:05 kp992

thanks @kp992 I have enabled ubuntu-latest-m as a larger runner for this repo. It will cost some money (as they aren't free) but it will be interesting to see how much this ends up being.

mmcky avatar May 09 '24 04:05 mmcky

  • [x] the inflation_history lecture fails on this test without https://github.com/QuantEcon/lecture-python-intro/pull/439

Thanks @kp992 I have reconfigured QuantEcon github and we will trial this repo with the larger runner to see how expensive it will be.

mmcky avatar May 09 '24 04:05 mmcky

This error is currently being reported

pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
google-colab 1.0.0 requires pandas==1.5.3, but you have pandas 2.1.4 which is incompatible.

However, this error doesn't show up if I test directly in Google Colab by adding the same code to the Colab version of the lecture: https://colab.research.google.com/github/QuantEcon/lecture-python-intro.notebooks/blob/master/inflation_history.ipynb

@kp992 any ideas?

mmcky avatar May 09 '24 05:05 mmcky

What if we install the latest version of pandas globally instead of installing it in a specific lecture?

kp992 avatar May 09 '24 10:05 kp992

@mmcky Woaah!! It's green.

kp992 avatar May 09 '24 13:05 kp992

What if we install the latest version of pandas globally instead of installing it in a specific lecture?

thanks @kp992 that won't work as we need it to be installed in Google Colab by the notebook for it to work there.

mmcky avatar May 10 '24 01:05 mmcky

@kp992 I think the key issue is that the Google Colab environment in their Docker container is much older than the one behind their current live interface. Pandas is 1.5.3 in the container and 2.0.3 online, so pip install isn't working in the container due to too many version mismatches.

mmcky avatar May 10 '24 02:05 mmcky

@kp992 I can't find anything else that will help with this re: the Colab Docker container.

Perhaps we need to build our own container with a pip freeze of the Google Colab environment? The downside is that it is something we will have to maintain. Do you know how frequently they update the Google Colab environment?

mmcky avatar May 13 '24 05:05 mmcky

@kp992 I just followed these instructions to connect Colab to a locally running instance and I am getting pandas==2.0.3, so maybe our action isn't working properly inside the container?

https://research.google.com/colaboratory/local-runtimes.html

Would you mind downloading this Docker container to see whether you get pandas==1.5.3 or pandas==2.0.3 inside it?

Update: Actually my docker container is not running properly as I am using arm64 and not x86

mmcky avatar May 13 '24 06:05 mmcky

Sure, I will check on this by tomorrow.

kp992 avatar May 13 '24 08:05 kp992

@mmcky Actually, I am also facing some issues installing Docker on my Mac. Should we try it on the Linux VM that we have? I didn't want to play with it because John might be using it and something could break.

Do you know how frequently they update the Google Colab environment?

There are frequent updates because of changing dependencies among the libraries.

kp992 avatar May 14 '24 15:05 kp992

Hi @mmcky, I was able to pull the Docker image and, following the steps mentioned there, tested locally. I get pandas 1.5.3, so our job is doing the right thing. It may just be that the public image has not been updated to match the private image they are using online.

kp992 avatar May 14 '24 16:05 kp992

We can do one thing. Let's just do manual testing for a list of failing lectures (currently just one, lectures/inflation_history.md) and skip them in the automated Colab testing job. Once the image is updated we can remove the lectures from this list. If you want, I can write a Python script to do this.
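A minimal sketch of what such a script could look like, assuming the test job receives a list of lecture paths. The names `SKIP_LECTURES` and `lectures_to_test`, and the path `lectures/intro.md` in the usage note, are hypothetical and not taken from the repository.

```python
from pathlib import Path

# Hypothetical skip list: lectures known to fail in the Colab container.
SKIP_LECTURES = {"lectures/inflation_history.md"}


def lectures_to_test(lecture_paths, skip=SKIP_LECTURES):
    """Return the lecture paths that should go through automated Colab testing."""
    # Normalise separators so "lectures\\x.md" and "lectures/x.md" compare equal
    # when the script runs on a platform that uses backslashes.
    skip_normalised = {Path(p).as_posix() for p in skip}
    return [p for p in lecture_paths if Path(p).as_posix() not in skip_normalised]
```

Calling `lectures_to_test(["lectures/intro.md", "lectures/inflation_history.md"])` would keep only the first path, and removing an entry from the skip set re-enables that lecture.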

kp992 avatar May 15 '24 12:05 kp992

@mmcky How does this look https://github.com/QuantEcon/lecture-python-intro/pull/441/commits/63d8503779b1ead181f76588ee88ab894943d71f? I checked the logs and it was failing during reload, not during pip installation. So the workaround is to find the pandas version without actually importing it, so we don't need reload at all.

kp992 avatar May 15 '24 13:05 kp992

@mmcky How does this look 63d8503? I checked the logs and it was failing during reload, not during pip installation. So the workaround is to find the pandas version without actually importing it, so we don't need reload at all.

@kp992 I don't fully understand why reload is working on Google Colab and not in this action. From my inspection of the logs, the main issue is that pandas==1.5.3 is pinned by the google-colab package.

mmcky avatar May 17 '24 01:05 mmcky

From the logs:

Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.10/dist-packages (from python-dateutil>=2.8.2->pandas==2.1.4) (1.16.0)
----- stdout -----
Installing collected packages: tzdata, pandas
  Attempting uninstall: pandas
    Found existing installation: pandas 1.5.3
----- stdout -----
    Uninstalling pandas-1.5.3:
      Successfully uninstalled pandas-1.5.3
----- stdout -----
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
google-colab 1.0.0 requires pandas==1.5.3, but you have pandas 2.1.4 which is incompatible.
Successfully installed pandas-2.1.4 tzdata-2024.1
------------------

This shows that although there was a dependency conflict, pip was still able to install pandas-2.1.4 successfully. If we then imported google-colab, we could expect it to fail. Since we don't use it in our code, we can ignore the conflict.
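To tell this kind of non-fatal conflict apart from a real install failure when scanning job logs, something like the following could help. It is only a sketch: the regular expression is based on the single conflict line in the log above, and `find_conflicts` is a hypothetical helper, not part of pip.

```python
import re

# Pattern modelled on the one conflict line seen in the log; pip may word
# other conflicts differently, so treat this as an assumption.
CONFLICT_RE = re.compile(
    r"^(?P<dependent>\S+) (?P<dependent_version>\S+) requires (?P<requirement>\S+), "
    r"but you have (?P<installed>\S+) (?P<installed_version>\S+) which is incompatible\.",
    re.MULTILINE,
)


def find_conflicts(pip_output):
    """Return one dict per dependency-conflict line found in pip's output."""
    return [match.groupdict() for match in CONFLICT_RE.finditer(pip_output)]
```

Applied to the log above (with terminal colour codes stripped), this would report that google-colab requires `pandas==1.5.3` while pandas 2.1.4 is installed.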

The actual error comes from the reload call on the line marked below:

ImportError                               Traceback (most recent call last)
<ipython-input-2-e9f1b8a0cfbd> in <cell line: 4>()
      4 if Version(pandas.__version__) < Version('2.1.4'):
      5   get_ipython().system('pip install pandas==2.1.4')
----> 6   reload(pandas)

I checked some other repos that are facing a similar issue where reload fails. So to avoid reloading and importing again, I bypassed it by checking the pandas version without importing pandas.
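One way to do that check, sketched with the standard library's `importlib.metadata`. This is an illustration rather than the exact code in the commit; `installed_version`, `version_tuple` and `needs_upgrade` are hypothetical helper names.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package` without importing it."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def version_tuple(text):
    """Turn '2.1.4' into (2, 1, 4). Simplification: assumes numeric release parts only."""
    return tuple(int(part) for part in text.split(".")[:3])


def needs_upgrade(package, minimum):
    """True if `package` is missing or older than `minimum`."""
    current = installed_version(package)
    return current is None or version_tuple(current) < version_tuple(minimum)
```

In the notebook, `needs_upgrade("pandas", "2.1.4")` could guard the `pip install` step, so pandas is first imported only after the install and no reload is needed. For pre-release or otherwise non-numeric version strings, `packaging.version.Version` (already used in the original cell) is the more robust comparison.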

kp992 avatar May 17 '24 03:05 kp992

thanks @kp992 what I don't understand is why I can do it in Google Colab though.

mmcky avatar May 17 '24 03:05 mmcky

@kp992 given the discrepancies in the Docker image, and the inconsistencies between what you can do in the live environment and in this Docker container, I am not confident this is a good approach. We may instead want to use pip freeze on the live Colab environment and make our own Ubuntu-based Docker container.

If there is a way to interact with colab to get pip freeze results programmatically that would be amazing as then we can do automatic docker updates.
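As a rough sketch of how two `pip freeze` listings could be compared once they have been captured (capturing them from live Colab is the open question here and is not covered), assuming plain `name==version` lines. `parse_freeze` and `diff_freeze` are hypothetical helper names.

```python
def parse_freeze(text):
    """Map package name -> pinned version from `pip freeze` style text."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and entries that are not simple pins
        # (for example editable installs or direct URL references).
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, _, pinned = line.partition("==")
        pins[name.strip().lower()] = pinned.strip()
    return pins


def diff_freeze(live_text, container_text):
    """Return {package: (live_version, container_version)} where the two differ."""
    live, container = parse_freeze(live_text), parse_freeze(container_text)
    return {
        name: (live.get(name), container.get(name))
        for name in sorted(set(live) | set(container))
        if live.get(name) != container.get(name)
    }
```

With `pandas==2.0.3` in the live listing and `pandas==1.5.3` in the container listing, the result would contain `"pandas": ("2.0.3", "1.5.3")`. A package present on only one side shows up with `None` for the other.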

mmcky avatar May 17 '24 04:05 mmcky

Hmm, strange thing: when I pulled the Colab Docker image today and ran pip list on our Linux server, I got pandas 2.0.3. When I did the same thing 3-4 days ago, I got 1.5.3.

kp992 avatar May 19 '24 12:05 kp992

Triggering the job again to see if it might have been updated recently.

kp992 avatar May 19 '24 12:05 kp992

Hey @mmcky, I was right. Previously it was pandas 1.5.3, but in the latest run we are getting 2.0.3. So the Docker image has been updated in the last 2-3 days to the latest image that we see on the Colab website.

kp992 avatar May 19 '24 12:05 kp992

good news - thanks for letting me know @kp992 so it looks like the docker public runtime releases lag the online version a bit but I think we can accept that. 👍

mmcky avatar May 20 '24 00:05 mmcky

Yeah, looks good to me now. thanks @mmcky

kp992 avatar May 20 '24 12:05 kp992

thanks @kp992.

@jstac this is probably the best we can do for Google Colab testing. I am going to run this here for a couple of weeks to see how costly it is in terms of compute and then roll it out more widely.

mmcky avatar May 21 '24 00:05 mmcky