Added output cells to these notebooks
Addressing issue #1258
@touma-I This PR is only about the output cells of the pdfprocessing in the examples folder.
@sujee As you can see from the comments by Maroun, he is asking for more than just adding the output cells that I have done.
@swith005 There are two issues here:
- If you click on the "Google Colab" icon in the notebook, it opens the version of the notebook that is on the dev branch, not the version in this PR. To test a notebook that is still in a PR on Google Colab, you have to start Colab in the browser and then manually upload the notebook from the PR.
- Even if you do this, there is no guarantee that the Ray version runs successfully on Colab, but judging from the error you posted above, I think the problem is the first issue.
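A possible shortcut for the manual upload, sketched under assumptions: Colab can open a notebook straight from a public GitHub branch through a URL of the form `https://colab.research.google.com/github/<owner>/<repo>/blob/<branch>/<path>`. The helper below only builds that URL. The function name and the owner, repository, branch, and path in the usage line are placeholders, not the real values for this PR, and the caveat about the Ray version still applies.

```python
from urllib.parse import quote

COLAB_GITHUB_PREFIX = "https://colab.research.google.com/github"


def colab_url(owner, repo, branch, notebook_path):
    """Build a Colab link for a notebook on a specific GitHub branch.

    For a PR opened from a fork, `owner` is the fork's owner and
    `branch` is the PR's source branch (assumption: the fork is public).
    """
    # Percent-encode unusual characters but keep "/" as a path separator.
    path = quote(notebook_path.lstrip("/"))
    return f"{COLAB_GITHUB_PREFIX}/{owner}/{repo}/blob/{quote(branch)}/{path}"


# Placeholder values for illustration only.
print(colab_url("some-user", "some-repo", "my-pr-branch", "examples/notebook.ipynb"))
```

Opening the printed link loads the notebook from that branch instead of the default one that the badge in the notebook points to.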
@swith005 These two notebooks install the DPK modules by pip-installing requirements.txt into the local environment, and if you look at the requirements.txt in the PR, you will see that it uses 1.1.1 and not 1.1.1.dev0. When running on Google Colab, please refer to my note above for instructions on opening a notebook that is still in a PR. I see that the Python version uses 1.1.1 while the Ray version uses 1.1.1.dev1, which should be changed to 1.1.1.
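To catch this kind of leftover pin before review, a small check along these lines could help. It is a minimal sketch, not part of the PR: the function name and the package names in the sample are made up for illustration, and it only looks at `==` pins that carry a PEP 440 `.devN` suffix.

```python
import re

# Matches an exact pin whose version ends in a PEP 440 dev segment,
# e.g. "==1.1.1.dev1". A plain release such as "==1.1.1" does not match.
DEV_PIN = re.compile(r"==\s*[0-9][0-9A-Za-z.!+]*\.dev\d+")


def find_dev_pins(requirements_text):
    """Return the requirement lines that pin a development release."""
    flagged = []
    for raw in requirements_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if line and DEV_PIN.search(line):
            flagged.append(line)
    return flagged


# Hypothetical requirements content; the package names are placeholders.
sample = """
example-package-python==1.1.1
example-package-ray==1.1.1.dev1  # should be 1.1.1
"""
print(find_dev_pins(sample))
```

Running it on the contents of each notebook's requirements.txt lists the lines that still need to be moved to the release version.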
@sujee @shahrokhDaijavad there is a lot of good stuff in this notebook, and it would be nice if we could streamline things so others can use it as a template for their work. Right now, I am worried that only a few of us understand how it works, and we should make it easier for others to consume. Here are a few things we discussed before that we may want to act on, assuming this is the last iteration we do on this notebook:
- Let's get rid of the wget utils.py and either add it to the notebook itself or submit a PR to the data-processing-lib util library if we feel those services are widely needed in other notebooks.
- There are a lot of special things going on for Colab vs. non-Colab. It would be nice if we could streamline this. From my experience, if we build it for Colab, then we can run it as-is in any environment. If this is not a correct assumption, please let me know where the Colab extensions break the notebook when running in the environment.
- Get rid of requirements.txt and add the dependencies to the notebook, regardless of Colab or no Colab.
- Can you help me understand why we are doing condo