Good LLM workflows
In class today, we discussed some good LLM workflows. This github issue will serve as a repository of useful LLM workflows for our pagila problem set. To encourage you to experiment with these workflows, I will offer 1-2 points of extra credit for anyone who suggests a particularly useful workflow.
Recall that my bashrc contains the following lines for setting up the claude and groq commands:
function llm_blue() {
    printf "\033[94m"   # switch the terminal text color to blue
    command llm "$@"    # run the real llm command, passing through all arguments
    printf "\033[0m"    # reset the terminal text color
}
alias groq='llm_blue -s "keep your response short, between 5-20 lines" -m groq/llama-3.3-70b-specdec'
alias claude='llm_blue -s "keep your response short, between 5-20 lines" -m claude-3-5-sonnet-latest'
Notice in particular that I have updated my groq command to use the "speculative decoding" model, which is 10x faster than the already fast groq models, and that I also have an alias for claude. (Using claude will require an API key from anthropic and IIRC a $5 minimum payment.)
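For example, either alias can be given a question directly as an argument (the question below is just an illustration, not part of the problem set):
$ groq 'in one sentence, what does the LATERAL keyword do in postgres?'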
Two good example workflows that we discussed in class are shown below.
To attempt to one-shot solve a problem:
$ claude <<EOF
Given the following schema:
$(cat pagila/pagila-schema.sql)
Answer the following SQL question:
$(cat sql/01.sql)
EOF
If the prompt above fails due to formatting issues, you can provide feedback about how to fix the formatting with a prompt like:
$ claude <<EOF
Given the following schema:
$(cat pagila/pagila-schema.sql)
Answer the following SQL question:
$(cat sql/18.sql)
The expected output should be in the format of:
$(head expected/18.out)
EOF
To claim your extra credit, simply reply to this post with a prompt structure and a short explanation of a suggested use case.
I am attaching the prompt I have been using to help me debug, as well as a portion of a sample output.
GPT is an alias for llm -m o1-mini, and I am using -c to continue the previous conversation, which lets me take advantage of the prompts suggested by Izbicki that already include the schema.
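A minimal sketch of that alias definition, assuming it lives in your bashrc next to the others, would be:
alias GPT='llm -m o1-mini'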
$ GPT -c <<EOF
Below is the actual output produced by running the SQL query from sql/01.sql:
$(docker compose exec -T pg psql -U postgres -f sql/01.sql)
Below is the expected output as provided in expected/01.out:
$(cat expected/01.out)
Please compare the actual output with the expected output. Identify and list all differences between them and suggest possible reasons for these differences.
EOF
### **1. Number of Rows**
- **Actual Output:** **521 rows**
- **Expected Output:** **659 rows**
**Difference:** The actual output contains **138 fewer rows** than the expected output.
---
### **2. Columns**
- **Both Outputs:**
- `film_id`
- `title`
**Difference:** **No difference** in the number of columns; both outputs include the same two columns.
---
### **3. Specific Data Discrepancies**
#### **a. Missing Films in Actual Output**
Several `film_id` and `title` pairs present in the expected output are **absent** in the actual output. Here are some notable examples:
| film_id | Expected Title | Actual Title |
|---------|---------------------------|-----------------------|
| 6 | AGENT TRUMAN | **Missing** |
| 9 | ALABAMA DEVIL | **Missing** |
| 33 | APOLLO TEEN | **Missing** |
| 35 | ARACHNOPHOBIA ROLLERCOASTER | **Missing** |
| 37 | ARIZONA BANG | **Missing** |
| ... | ... | ... |
| 999 | ZOOLANDER FICTION | **Present** |
| 1000 | ZORRO ARK | **Present** |
*Note: The table above includes only a subset of the missing films for brevity.*
#### **b. Additional Films in Actual Output**
Upon inspection, the **actual output does not contain any films** that are **not present** in the expected output. All films in the actual output appear to be either matching or subsets of the expected films, with no extras observed.
---
### **4. Potential Reasons for Differences**
The discrepancies between the actual and expected outputs can be attributed to several factors:
...
And if you are extra lazy, you can store the prompt in a run_debug.sh file and call it with ./run_debug.sh N, where N is the number of the exercise you care about.
#!/bin/bash
# Usage: ./run_debug.sh <number>
# Example: ./run_debug.sh 10
if [ -z "$1" ]; then
echo "Usage: $0 <number>"
exit 1
fi
num="$1"
llm -m o1-mini -c <<EOF
Below is the actual output produced by running the SQL query from sql/${num}.sql:
$(docker compose exec -T pg psql -U postgres -f sql/${num}.sql)
Below is the expected output as provided in expected/${num}.out:
$(cat expected/${num}.out)
Please compare the actual output with the expected output. Identify and list all differences between them and suggest possible reasons for these differences.
EOF
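For example, assuming the script is saved as run_debug.sh in the root of the repo (next to the sql/ and expected/ directories):
$ chmod +x run_debug.sh
$ ./run_debug.sh 18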
@mvalsania I like the debugging use-case and the script to automate it, so I've added +2 ec in sakai :)
I'll also point out that the diff command provides similar output in a machine-readable format. It is worth learning to read the diff output directly, and I suspect adding the diff output to the LLM prompt could improve the LLM's ability to give advice in weird edge cases.
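For example, a sketch using bash process substitution (assuming the same docker compose service name and file layout as the prompts above) would be:
$ diff <(docker compose exec -T pg psql -U postgres -f sql/01.sql) expected/01.out
The same diff command can be wrapped in $( ... ) inside the heredoc prompt so that the LLM sees the diff instead of (or alongside) the two raw outputs.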