LLM-Engineers-Handbook

Running book feedback & errata Q&A dialogue with authors ...

Open nmvega opened this issue 11 months ago • 9 comments

Hello:

Sadly, because of this pyenv install issue, I'm unable to install Python v3.11.x or any of the nearby versions indicated. In a nutshell, the install needs root access to various directories under /usr/local/, which I don't want to grant, since everything should be wholly contained beneath ~/.pyenv/versions.

Can you recommend a modified setup procedure without use of pyenv and using, say, python -m venv <name> instead?

Thank you!

nmvega avatar Jan 08 '25 18:01 nmvega

Because the above is a show-stopper (progress cannot be made), I:

  1. Disabled pyenv.
  2. Created & Activated a Python virtual environment via traditional means: python -m venv llm-handbook-pyvenv.d

But now I'm running into poetry issues, namely this (basically poetry dependency resolution never completing).

# Manually create & activate virtual environment.
nmvega@ollama$ cd /path/to/llm-engineering-handbook/git-clone/
nmvega@ollama$ python -m venv ./llm-handbook-pyvenv.d
nmvega@ollama$ source ./llm-handbook-pyvenv.d/bin/activate

# Attempt #1:
(llm-handbook-pyvenv.d) nmvega@ollama$ poetry install --without aws
Updating dependencies
Resolving dependencies (2435.82)
[ ... Runs forever ... ] <Ctrl+C>

# Try installing Python packages manually.
(llm-handbook-pyvenv.d) nmvega@ollama$ poetry export > ./requirements.txt
(llm-handbook-pyvenv.d) nmvega@ollama$ pip install -r ./requirements.txt
(llm-handbook-pyvenv.d) nmvega@ollama$ pip list # confirm.

# Attempt #2:
(llm-handbook-pyvenv.d) nmvega@ollama$ poetry install --without aws
Installing dependencies from lock file
Package operations: 8 installs, 0 updates, 0 removals
  - Installing distlib (0.3.8): Pending...
  - Installing platformdirs (4.2.2): Pending...
[ ... Runs forever ... ] <Ctrl+C>

# Attempt #3
(llm-handbook-pyvenv.d) nmvega@ollama$ mv poetry.lock poetry.lock.FCS # Rename lock file.
(llm-handbook-pyvenv.d) nmvega@ollama$ poetry install --without aws
Updating dependencies
Resolving dependencies (3428.72)
[ ... Runs forever ... ] <Ctrl+C>

Back to square one.

As you can see, I've tried numerous alternatives unsuccessfully. I'm stuck and cannot proceed with the book.

Can you create a modified, simpler setup procedure without use of pyenv and poetry?

nmvega avatar Jan 08 '25 20:01 nmvega

WORKAROUND

If you should run into this issue (and many will), here's a workaround:

Disable "pyenv(1)" Bash integration, then manually Create & Activate a Python virtual environment:

nmvega@ollama$ poetry config virtualenvs.create false
nmvega@ollama$ /usr/bin/python3.11 -m venv ./llm-handbook-pyvenv.d
nmvega@ollama$ source ./llm-handbook-pyvenv.d/bin/activate
(llm-handbook-pyvenv.d) nmvega@ollama$ pip install --quiet -U --no-cache pip
(llm-handbook-pyvenv.d) nmvega@ollama$

Prepare two Python "requirements" files. The "requirements_2.txt" file lists the dependency packages and versions that poetry(1) tried to install but left pending forever. These are the correct package names and versions (not randomly selected):

(llm-handbook-pyvenv.d) nmvega@ollama$ poetry export --without aws > ./requirements_1.txt
(llm-handbook-pyvenv.d) nmvega@ollama$ vi ./requirements_2.txt # Add these package entries:
distlib==0.3.8
platformdirs==4.2.2
cfgv==3.4.0
identify==2.6.0
nodeenv==1.9.1
virtualenv==20.26.3
pre-commit==3.8.0
ruff==0.4.10
(llm-handbook-pyvenv.d) nmvega@ollama$
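
For anyone who prefers not to edit the file by hand, the vi step above can also be done non-interactively. This is only a sketch: it writes the same eight pins listed above into ./requirements_2.txt in the current directory and assumes nothing beyond a POSIX shell.

```shell
# Write the eight pinned packages from the list above into requirements_2.txt.
# The quoted 'EOF' delimiter keeps the shell from expanding anything in the body.
cat > ./requirements_2.txt <<'EOF'
distlib==0.3.8
platformdirs==4.2.2
cfgv==3.4.0
identify==2.6.0
nodeenv==1.9.1
virtualenv==20.26.3
pre-commit==3.8.0
ruff==0.4.10
EOF
```

The pins are specific to the lock file at the time of this thread, so they may need updating if the repository's poetry.lock changes.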

Install the Python packages:

(llm-handbook-pyvenv.d) nmvega@ollama$ pip install --no-cache -r ./requirements_1.txt
(llm-handbook-pyvenv.d) nmvega@ollama$ pip install --no-cache -r ./requirements_2.txt

Finally:

(llm-handbook-pyvenv.d) nmvega@ollama$ poetry install --without aws
Installing dependencies from lock file
No dependencies to install or update
Installing the current project: llm-engineering (0.1.0)

(llm-handbook-pyvenv.d) nmvega@ollama$ echo $?
0

Done!

nmvega avatar Jan 08 '25 21:01 nmvega

Also, the command user$ poetry self add poethepoet[poetry_plugin] hung indefinitely, so I killed it.

Instead, I followed the "poe the poet" installation docs and ran pip install poethepoet, which worked. The poe plugin now lets me run poetry poe [...] commands.

Alternatively, if necessary, you can directly run the underlying bash commands without poetry, as described in the README.md file.

The book is excellent, detailed and rational. Work through the bugs and adjustments for your system, and it’s worth it. For example, I’ll still need to swap docker(1) with podman-compose(1).

nmvega avatar Jan 08 '25 22:01 nmvega

Hello @nmvega

Thanks for providing this alternative setup for other people encountering the same issue.

I will leave the issue open, but we will not add this to the main repository: the standard approach in industry-level repositories is poetry (optionally with pyenv for Python version management) or uv.

We want to keep that as the recommended approach because it's more robust: failing verbosely at the beginning is better than finding errors due to version mismatches when running the code.
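
For readers following the manual venv route, a rough sketch of the same fail-fast idea is shown below. It assumes the project targets Python 3.11.x, as mentioned earlier in this thread; the function names is_supported_python and check_python are hypothetical and not part of the repository.

```shell
# Hypothetical helper: succeed only if the version string is 3.11 or 3.11.x.
# The 3.11 requirement is an assumption taken from this thread.
is_supported_python() {
  case "$1" in
    3.11|3.11.*) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical helper: ask an interpreter for its version and fail loudly
# before any packages are installed if it is not a supported one.
check_python() {
  interpreter="${1:-python3}"
  version="$("$interpreter" -c 'import platform; print(platform.python_version())')" || return 2
  if is_supported_python "$version"; then
    echo "OK: Python $version"
  else
    echo "ERROR: expected Python 3.11.x, found $version" >&2
    return 1
  fi
}
```

Running something like check_python /usr/bin/python3.11 before creating the virtual environment would surface a version mismatch at setup time instead of when the code runs.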

But your tutorial is super valuable if other people encounter the same issue. Thanks for doing this.

iusztinpaul avatar Jan 10 '25 07:01 iusztinpaul

I get it. Thank you for the robust reply. I'm really enjoying the book and going through it meticulously. It's currently 4am here in New York City, and I'm refactoring my AI R&D server to set everything up. Thank you for such a comprehensive book.

nmvega avatar Jan 10 '25 09:01 nmvega

@nmvega Excited that you enjoy it.

Happy learning ✌️

iusztinpaul avatar Jan 14 '25 13:01 iusztinpaul

Hello again:

I noticed that the implementation of the zenml-server is via its Python library rather than, for example, a container image like other services. Assuming zenml client and zenml server versions are kept compatible, would there be deployment implications with the book's code if a container were used for the zenml server?

I created this:

version: '3.9'

# =====================================================================
name: agentic_gen_ai  # Resulting "podman pod" name: pod_agentic_gen_ai
# =====================================================================

# =====================================================================
# This network must already exist via: nmvega$ docker network create 'MLops_network'
# =====================================================================
networks:
  MLops:
    name: MLops_network
    external: true
# =====================================================================

# =====================================================================
# These volumes must already exist via: nmvega$ docker volume create volName
# =====================================================================
volumes:
  agentic_gen_ai_zenml_local_stores:
    external: true
# =====================================================================

services:
  zenml-server:
    image: docker.io/zenmldocker/zenml-server:latest 
    container_name: "zenml-server"
    hostname: "zenml-server"
    profiles: ["gpu-nvidia", "cpu"] # Remove if you don't use Docker profiles.
    ports:
      - 0.0.0.0:8237:8080  # To match Port that the book expects, use: "0.0.0.0:8237:8080"
    volumes:
      - agentic_gen_ai_zenml_local_stores:/zenml/.zenconfig/local_stores/
    networks: ['MLops']
    deploy:
      resources:
        limits:
          cpus: '24'
          memory: 128g  # 192GB RAM total available.
      restart_policy:
        condition: on-failure
        delay: 10s
        max_attempts: 3
        window: 120s

(llm-handbook-pyvenv.d) nmvega@ollama$ zenml login http://192.168.0.12:8237
(llm-handbook-pyvenv.d) nmvega@ollama$ zenml server list
┏━━━━━━━━┯━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━┯━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━┓
┃ ACTIVE │ TYPE   │ ID                     │ NAME                   │ VERSION │ STATUS    │ DASHBOARD URL          │ API URL               │ AUTH STATUS   ┃
┠────────┼────────┼────────────────────────┼────────────────────────┼─────────┼───────────┼────────────────────────┼───────────────────────┼───────────────┨
┃   👉   │ REMOTE │ 67d4a0b8-3861-4ed4-acd │ llm_engineering_handbo │ 0.72.0  │ available │ http://192.168.0.12:82 │ http://192.168.0.12:8 │ never expires ┃
┃        │        │ 1-a0bb4e73a560         │ ok                     │         │           │ 37                     │ 237                   │               ┃
┗━━━━━━━━┷━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━┷━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━┛
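
Because the compose file above marks the network and the volume as external, both must exist before the stack is started. A minimal sketch of creating them only when they are missing follows; the helper names ensure_network and ensure_volume are hypothetical, and the sketch assumes the docker CLI is on the PATH.

```shell
# Hypothetical helper: create the named network only if "inspect" cannot find it.
ensure_network() {
  docker network inspect "$1" >/dev/null 2>&1 || docker network create "$1"
}

# Hypothetical helper: create the named volume only if "inspect" cannot find it.
ensure_volume() {
  docker volume inspect "$1" >/dev/null 2>&1 || docker volume create "$1"
}
```

With the names used in the compose file, this would be called as ensure_network MLops_network and ensure_volume agentic_gen_ai_zenml_local_stores. Re-running it is harmless, since an existing network or volume is left untouched.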

nmvega avatar Jan 14 '25 19:01 nmvega

I don't think so, but using the Python server is far easier than using the Docker version.

Personally, I think the Docker version is worth considering for:

  1. Portability
  2. On-prem deployments

But I wanted to keep it simple.

@nmvega

iusztinpaul avatar Jan 17 '25 11:01 iusztinpaul

On page 58, at the top,

Output: A list of raw documents stored in the NoSQL data warehouse

This should say database or data lake or document store. A data warehouse is for structured, tabular data.

cstaulbee avatar Feb 18 '25 15:02 cstaulbee