Add docker container(s) to help run examples

Open skrawcz opened this issue 3 years ago • 6 comments

Is your feature request related to a problem? Please describe.

The main friction in getting the examples up and running is installing the dependencies. A Docker container with them already installed would make it easier for people to get started with Hamilton.

Describe the solution you'd like

  1. A Docker container with separate Python virtual environments that contain the dependencies needed to run the examples.
  2. The container has the hamilton repository checked out -- so it has the examples folder.
  3. Then using it would be:
  • docker pull image
  • docker run image
  • activate python virtual environment
  • run example
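
As a rough sketch of how the steps above could look from the user's side, the snippet below only assembles the shell commands and prints them; it does not call Docker. The image name, the paths, and the virtual environment name are placeholders (assumptions, not a published image), and `docker run` is used for the second step because that is the command that creates and starts a container from an image.

```python
import shlex
from typing import List

# Hypothetical values for illustration; no such image is confirmed to exist.
IMAGE = "example/hamilton-examples:latest"
EXAMPLE_DIR = "/hamilton/examples/dask/hello_world"
VENV_DIR = "/hamilton/examples/dask/hamilton-env"


def example_commands(image: str, example_dir: str, venv_dir: str, script: str) -> List[str]:
    """Return shell commands for: pull image, start container, activate venv, run example."""
    # Activating the environment and running the example happen inside the
    # container, so they are chained into a single bash command line.
    inner = " && ".join(
        [
            f"source {shlex.quote(venv_dir)}/bin/activate",
            f"cd {shlex.quote(example_dir)}",
            f"python {shlex.quote(script)}",
        ]
    )
    return [
        shlex.join(["docker", "pull", image]),
        shlex.join(["docker", "run", "--rm", "-it", image, "bash", "-c", inner]),
    ]


commands = example_commands(IMAGE, EXAMPLE_DIR, VENV_DIR, "run.py")
for command in commands:
    print(command)
```

Chaining the in-container steps into one `bash -c` call is just one option; starting an interactive shell and typing the steps by hand matches the list above equally well.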

Describe alternatives you've considered

Not doing this.

Additional context

This was a request from a Hamilton talk.

skrawcz avatar May 11 '22 21:05 skrawcz

Hi @skrawcz, I am interested in working on this issue.

bovem avatar May 12 '22 06:05 bovem

Hi @bovem. That's great. Do you have an idea of what to do? Or do you need some more guidance and specifications?

skrawcz avatar May 13 '22 05:05 skrawcz

Thanks @skrawcz. I will create a PR and ask questions along the way, as they come up.

bovem avatar May 13 '22 06:05 bovem

Hey @skrawcz . I do have some queries

  1. Why is it required to have different virtual environments? Could I just create one consolidated requirements.txt instead?
  2. Are there any known dependency conflicts?
  3. Does it have to be any specific base container image?
  4. Should the environment be for python2 or python3?

bovem avatar May 13 '22 15:05 bovem

  1. Why is it required to have different virtual environments. Can I just create a consolidated requirements.txt and that should do the work?

Yes, in theory. But it's not guaranteed to always be true. I would prefer separate ones, since that is also closer to how people would use Hamilton; they wouldn't have all of the spark, ray, and dask dependencies installed if they're not using them.

  2. Are there any known dependency conflicts?

Not that I am aware of.

  3. Does it have to be any specific base container image?

Python 3. I think it's fine to target 3.8 or 3.9. Note that for spark, the container will also need Java.

  4. The environment should be for python2 or python3?

Python 3, specifically 3.8+.

Thanks!

skrawcz avatar May 13 '22 17:05 skrawcz

We should bump this up in priority -- since people without a python environment can't easily get started -- and docker might be a simpler solution for them to try Hamilton.

skrawcz avatar Sep 07 '22 07:09 skrawcz

Hi @skrawcz, I was able to create separate virtual environments for the examples, but I ran into issues while running the following examples. I also tried installing Hamilton with pip install sf-hamilton inside the virtual environments, but that didn't resolve the issue.

  • async
root@4182a9717aaf:/hamilton/examples/async# source hamilton/bin/activate
(hamilton) root@4182a9717aaf:/hamilton/examples/async# uvicorn fastapi_example:app
Traceback (most recent call last):
  File "/hamilton/examples/async/hamilton/bin/uvicorn", line 8, in <module>
    sys.exit(main())
  File "/hamilton/examples/async/hamilton/lib/python3.10/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/hamilton/examples/async/hamilton/lib/python3.10/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/hamilton/examples/async/hamilton/lib/python3.10/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/hamilton/examples/async/hamilton/lib/python3.10/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/hamilton/examples/async/hamilton/lib/python3.10/site-packages/uvicorn/main.py", line 408, in main
    run(
  File "/hamilton/examples/async/hamilton/lib/python3.10/site-packages/uvicorn/main.py", line 576, in run
    server.run()
  File "/hamilton/examples/async/hamilton/lib/python3.10/site-packages/uvicorn/server.py", line 60, in run
    return asyncio.run(self.serve(sockets=sockets))
  File "/usr/local/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/usr/local/lib/python3.10/asyncio/base_events.py", line 646, in run_until_complete
    return future.result()
  File "/hamilton/examples/async/hamilton/lib/python3.10/site-packages/uvicorn/server.py", line 67, in serve
    config.load()
  File "/hamilton/examples/async/hamilton/lib/python3.10/site-packages/uvicorn/config.py", line 479, in load
    self.loaded_app = import_from_string(self.app)
  File "/hamilton/examples/async/hamilton/lib/python3.10/site-packages/uvicorn/importer.py", line 24, in import_from_string
    raise exc from None
  File "/hamilton/examples/async/hamilton/lib/python3.10/site-packages/uvicorn/importer.py", line 21, in import_from_string
    module = importlib.import_module(module_str)
  File "/usr/local/lib/python3.10/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 883, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "/hamilton/examples/async/./fastapi_example.py", line 3, in <module>
    from hamilton.experimental import h_async
ModuleNotFoundError: No module named 'hamilton.experimental'
  • dask
root@4182a9717aaf:/hamilton/examples/dask# source hamilton/bin/activate
(hamilton) root@4182a9717aaf:/hamilton/examples/dask# cd hello_world/
(hamilton) root@4182a9717aaf:/hamilton/examples/dask/hello_world# python3 run.py 
[INFO] 2022-10-06 04:42:43,407 __main__(24): LocalCluster(2fe1e048, 'tcp://127.0.0.1:46211', workers=4, threads=16, memory=14.97 GiB)
[INFO] 2022-10-06 04:42:44,077 __main__(50):    spend  signups  avg_3wk_spend  spend_per_signup  spend_zero_mean_unit_variance
0     10        1            NaN            10.000                      -1.064405
1     10       10            NaN             1.000                      -1.064405
2     20       50      13.333333             0.400                      -0.483821
3     40      100      23.333333             0.400                       0.677349
4     40      200      33.333333             0.200                       0.677349
5     50      400      43.333333             0.125                       1.257934
2022-10-06 04:42:44,391 - distributed.client - ERROR - 
ConnectionRefusedError: [Errno 111] Connection refused

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/hamilton/examples/dask/hamilton/lib/python3.10/site-packages/distributed/comm/core.py", line 291, in connect
    comm = await asyncio.wait_for(
  File "/usr/local/lib/python3.10/asyncio/tasks.py", line 445, in wait_for
    return fut.result()
  File "/hamilton/examples/dask/hamilton/lib/python3.10/site-packages/distributed/comm/tcp.py", line 503, in connect
    convert_stream_closed_error(self, e)
  File "/hamilton/examples/dask/hamilton/lib/python3.10/site-packages/distributed/comm/tcp.py", line 142, in convert_stream_closed_error
    raise CommClosedError(f"in {obj}: {exc.__class__.__name__}: {exc}") from exc
distributed.comm.core.CommClosedError: in <distributed.comm.tcp.TCPConnector object at 0x7ff29e4195d0>: ConnectionRefusedError: [Errno 111] Connection refused

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/hamilton/examples/dask/hamilton/lib/python3.10/site-packages/distributed/utils.py", line 742, in wrapper
    return await func(*args, **kwargs)
  File "/hamilton/examples/dask/hamilton/lib/python3.10/site-packages/distributed/client.py", line 1246, in _reconnect
    await self._ensure_connected(timeout=timeout)
  File "/hamilton/examples/dask/hamilton/lib/python3.10/site-packages/distributed/client.py", line 1276, in _ensure_connected
    comm = await connect(
  File "/hamilton/examples/dask/hamilton/lib/python3.10/site-packages/distributed/comm/core.py", line 315, in connect
    await asyncio.sleep(backoff)
  File "/usr/local/lib/python3.10/asyncio/tasks.py", line 605, in sleep
    return await future
asyncio.exceptions.CancelledError
[ERROR] 2022-10-06 04:42:44,395 asyncio.events(768): 
Traceback (most recent call last):
  File "/hamilton/examples/dask/hamilton/lib/python3.10/site-packages/distributed/comm/tcp.py", line 225, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
tornado.iostream.StreamClosedError: Stream is closed

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/hamilton/examples/dask/hamilton/lib/python3.10/site-packages/distributed/client.py", line 1443, in _handle_report
    msgs = await self.scheduler_comm.comm.read()
  File "/hamilton/examples/dask/hamilton/lib/python3.10/site-packages/distributed/comm/tcp.py", line 241, in read
    convert_stream_closed_error(self, e)
  File "/hamilton/examples/dask/hamilton/lib/python3.10/site-packages/distributed/comm/tcp.py", line 144, in convert_stream_closed_error
    raise CommClosedError(f"in {obj}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) Client->Scheduler local=tcp://127.0.0.1:38494 remote=tcp://127.0.0.1:46211>: Stream is closed

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/hamilton/examples/dask/hamilton/lib/python3.10/site-packages/distributed/utils.py", line 742, in wrapper
    return await func(*args, **kwargs)
  File "/hamilton/examples/dask/hamilton/lib/python3.10/site-packages/distributed/client.py", line 1451, in _handle_report
    await self._reconnect()
  File "/hamilton/examples/dask/hamilton/lib/python3.10/site-packages/distributed/utils.py", line 742, in wrapper
    return await func(*args, **kwargs)
  File "/hamilton/examples/dask/hamilton/lib/python3.10/site-packages/distributed/client.py", line 1246, in _reconnect
    await self._ensure_connected(timeout=timeout)
  File "/hamilton/examples/dask/hamilton/lib/python3.10/site-packages/distributed/client.py", line 1276, in _ensure_connected
    comm = await connect(
  File "/hamilton/examples/dask/hamilton/lib/python3.10/site-packages/distributed/comm/core.py", line 315, in connect
    await asyncio.sleep(backoff)
  File "/usr/local/lib/python3.10/asyncio/tasks.py", line 605, in sleep
    return await future
asyncio.exceptions.CancelledError

I have also opened this PR: https://github.com/stitchfix/hamilton/pull/203 for the changes I've made.

bovem avatar Oct 06 '22 05:10 bovem

OK, so I'm pretty sure I managed to debug the first one at least. There's a directory called hamilton inside the example directory (the virtual environment is named hamilton). This confuses Python, which thinks it's a module, so it's not finding experimental.

For the second, the pipeline runs successfully, but the script errors out anyway. This is not docker-image-specific, and occurs on the main branch as well :/ I think it's a failure while closing, but I need to dig in further. It shouldn't be a blocker for you though.

Hope this helps!
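
The shadowing suspected for the async example can be reproduced in isolation. The sketch below is a stand-in under stated assumptions, not the actual container setup: it creates a bare directory named demo_shadowed_pkg (a made-up name playing the role of the hamilton virtual environment directory), puts its parent on sys.path, and shows that Python imports the directory as a namespace package and then fails on the submodule. It only covers the case where no regular package of that name is found elsewhere on the path.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def probe_shadowing(package_name: str, submodule: str) -> dict:
    """Import a bare directory as a package and try to load a submodule from it."""
    with tempfile.TemporaryDirectory() as tmp:
        # Stand-in for a virtual environment directory that shares the package's name.
        (Path(tmp) / package_name / "bin").mkdir(parents=True)
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            pkg = importlib.import_module(package_name)
            info = {
                # A namespace package has no __init__.py, so __file__ is None.
                "is_namespace": getattr(pkg, "__file__", None) is None,
                "error": None,
            }
            try:
                importlib.import_module(f"{package_name}.{submodule}")
            except ModuleNotFoundError as exc:
                info["error"] = str(exc)
            return info
        finally:
            # Undo the changes to the import state made for this demonstration.
            sys.path.remove(tmp)
            sys.modules.pop(package_name, None)


result = probe_shadowing("demo_shadowed_pkg", "experimental")
print(result)
```

In a real environment, printing hamilton.__file__ and hamilton.__path__ is a quick way to check which location the name actually resolved to.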

elijahbenizzy avatar Oct 07 '22 20:10 elijahbenizzy

Thanks @elijahbenizzy. I renamed the Python virtual environment to hamilton-env and added sf-hamilton to requirements.txt, and now I get this error with the async example:

(hamilton-env) root@99745e7d8c97:/hamilton/examples/async# uvicorn fastapi_example:app
Traceback (most recent call last):
  File "/hamilton/examples/async/hamilton-env/bin/uvicorn", line 8, in <module>
    sys.exit(main())
  File "/hamilton/examples/async/hamilton-env/lib/python3.9/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/hamilton/examples/async/hamilton-env/lib/python3.9/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/hamilton/examples/async/hamilton-env/lib/python3.9/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/hamilton/examples/async/hamilton-env/lib/python3.9/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/hamilton/examples/async/hamilton-env/lib/python3.9/site-packages/uvicorn/main.py", line 408, in main
    run(
  File "/hamilton/examples/async/hamilton-env/lib/python3.9/site-packages/uvicorn/main.py", line 576, in run
    server.run()
  File "/hamilton/examples/async/hamilton-env/lib/python3.9/site-packages/uvicorn/server.py", line 60, in run
    return asyncio.run(self.serve(sockets=sockets))
  File "/usr/local/lib/python3.9/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/usr/local/lib/python3.9/asyncio/base_events.py", line 647, in run_until_complete
    return future.result()
  File "/hamilton/examples/async/hamilton-env/lib/python3.9/site-packages/uvicorn/server.py", line 67, in serve
    config.load()
  File "/hamilton/examples/async/hamilton-env/lib/python3.9/site-packages/uvicorn/config.py", line 479, in load
    self.loaded_app = import_from_string(self.app)
  File "/hamilton/examples/async/hamilton-env/lib/python3.9/site-packages/uvicorn/importer.py", line 24, in import_from_string
    raise exc from None
  File "/hamilton/examples/async/hamilton-env/lib/python3.9/site-packages/uvicorn/importer.py", line 21, in import_from_string
    module = importlib.import_module(module_str)
  File "/usr/local/lib/python3.9/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
  File "<frozen importlib._bootstrap>", line 986, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 680, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 850, in exec_module
  File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
  File "/hamilton/examples/async/./fastapi_example.py", line 5, in <module>
    from . import async_module
ImportError: attempted relative import with no known parent package

bovem avatar Oct 09 '22 07:10 bovem

OK, so it works one level up:

[hamilton] bovem/examples (adding-dockerfile): uvicorn async.fastapi_example:app
INFO:     Started server process [18090]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit)

This is probably due to the relative import -- sloppy on my part. Fixed in this PR: https://github.com/stitchfix/hamilton/pull/204. Mind rebasing?
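
The difference between the two uvicorn invocations comes down to whether the module is imported as part of a package. As a minimal sketch (using the made-up names demo_app_pkg, app_module, and helper rather than the real example files), the snippet below imports the same file both ways: as a top-level module, where the relative import fails, and through its parent directory, where it resolves.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def import_both_ways() -> dict:
    """Import one file as a top-level module and as a package member; record each outcome."""
    outcome = {}
    with tempfile.TemporaryDirectory() as tmp:
        pkg_dir = Path(tmp) / "demo_app_pkg"
        pkg_dir.mkdir()
        (pkg_dir / "helper.py").write_text("VALUE = 42\n")
        (pkg_dir / "app_module.py").write_text("from . import helper\nVALUE = helper.VALUE\n")

        # Case 1: the directory itself is on the path, so app_module has no parent package.
        sys.path.insert(0, str(pkg_dir))
        importlib.invalidate_caches()
        try:
            importlib.import_module("app_module")
            outcome["top_level"] = "ok"
        except ImportError as exc:
            outcome["top_level"] = str(exc)
        finally:
            sys.path.remove(str(pkg_dir))
            sys.modules.pop("app_module", None)

        # Case 2: the parent directory is on the path, so the import goes through the package.
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            module = importlib.import_module("demo_app_pkg.app_module")
            outcome["as_package"] = module.VALUE
        finally:
            sys.path.remove(tmp)
            for name in [n for n in sys.modules if n.split(".")[0] == "demo_app_pkg"]:
                sys.modules.pop(name, None)
    return outcome


outcome = import_both_ways()
print(outcome)
```

Switching the example to an absolute import is one way to make the module importable from inside its own directory; whether the linked PR does exactly that is not stated here.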

elijahbenizzy avatar Oct 09 '22 19:10 elijahbenizzy

Thanks, I rebased onto your branch and was able to run the async example, but with this command: uvicorn fastapi_example:app

Logs:

(hamilton-env) root@7736d3906b0b:/hamilton/examples/async# uvicorn fastapi_example:app
INFO:     Started server process [1366]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit)
^CINFO:     Shutting down
INFO:     Waiting for application shutdown.
INFO:     Application shutdown complete.
INFO:     Finished server process [1366]

Now, all the examples are running except dask. I think this PR: https://github.com/stitchfix/hamilton/pull/203 is ready for review. Should I change its state from draft?

bovem avatar Oct 10 '22 04:10 bovem

Thanks @bovem. I believe the dask one is related to how we shut dask down. The dataframe is printed and it is correct, so I don't think there's an error per se. We should note this in the example and create an issue to track it. Otherwise, I will take a look at your PR this week.

skrawcz avatar Oct 10 '22 05:10 skrawcz

This was added in #209. Closing this issue. Thanks @bovem!

skrawcz avatar Oct 14 '22 20:10 skrawcz