Describe use and architecture of llm_transform.py flow

Open gpsaggese opened this issue 7 months ago • 4 comments

We want to document llm_transform.py

I started the doc in the branch https://github.com/causify-ai/helpers/pull/703

[ ] Read the documentation on how we document (diataxis, etc.):
  • ./docs/documentation_meta/all.diataxis.explanation.md
  • ./docs/documentation_meta/all.writing_docs.how_to_guide.md
  • ./docs/documentation_meta/all.architecture_diagrams.explanation.md

[ ] Write a reference doc about how llm_transform.py works (use ./docs/tools/all.llm_transform.reference.md as a basis)

  • ./dev_scripts_helpers/llms/dockerized_llm_apply_cfile.py
  • ./dev_scripts_helpers/llms/llm_apply_cfile.py
  • ./dev_scripts_helpers/llms/dockerized_llm_transform.py
  • ./dev_scripts_helpers/llms/llm_transform.py

[ ] Write a how-to doc (use ./docs/tools/all.llm_transform.how_to_guide.md)

E.g.,

> llm_transform.py -i research_amp/causal_kg/scrape_fred_metadata.py -o research_amp/causal_kg/scrape_fred_metadata.py.new -p code_fix_from_imports

gpsaggese avatar May 10 '25 01:05 gpsaggese

Plan of Action

Framework for reference.md

Synopsis

  • Write a brief description of llm_transform.py
  • For each function in llm_transform.py:
    • Technical description of the function
    • Parameters description
    • Return description

Basic Usage

> llm_transform.py -i input.txt -o output.txt -p uppercase

A short description of how to run the script with only the essential command-line arguments (input, output, and prompt tag), without any advanced options or configuration.
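To make the shape of the basic invocation concrete, here is a minimal sketch of a parser that accepts the same three flags. It is a hypothetical stand-in, not the actual argument handling in llm_transform.py; the `dest` names and help strings are assumptions made for illustration.

```python
import argparse


def parse_args(argv):
    # Hypothetical parser: mirrors only the -i / -o / -p flags shown above.
    # The real script may define more options and different semantics.
    parser = argparse.ArgumentParser(description="Toy CLI with the same flags")
    parser.add_argument("-i", dest="input", required=True, help="input file")
    parser.add_argument("-o", dest="output", required=True, help="output file")
    parser.add_argument("-p", dest="prompt", required=True, help="prompt tag")
    return parser.parse_args(argv)


# Same arguments as the basic-usage command above.
args = parse_args(["-i", "input.txt", "-o", "output.txt", "-p", "uppercase"])
print(args.input, args.output, args.prompt)
```

Passing the argument list explicitly keeps the sketch easy to exercise without a shell.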

List of Transforms

> llm_transform.py -i input.txt -o output.txt -p list

This lists all the available prompt tags.

We will group the transforms into three categories: Code Fixes, Code Review & Refactoring, and Markdown Processing.
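One way to draft this grouping is by tag prefix. The sketch below uses only tags that appear in this issue; the prefix-to-category mapping is an assumption, and the `md_` prefix in particular is a guess, since no markdown tag is named here. The authoritative tag list is whatever `-p list` prints.

```python
from collections import defaultdict

# Tags copied from commands in this issue (not the full list).
TAGS = [
    "code_fix_from_imports",
    "code_review_correctness",
    "code_review_refactoring",
]

# Assumed mapping from tag prefix to category; "md_" is hypothetical.
PREFIXES = {
    "code_fix_": "Code Fixes",
    "code_review_": "Code Review & Refactoring",
    "md_": "Markdown Processing",
}


def categorize(tags):
    """Group tags by the first matching prefix."""
    groups = defaultdict(list)
    for tag in tags:
        category = next(
            (name for prefix, name in PREFIXES.items() if tag.startswith(prefix)),
            "Uncategorized",
        )
        groups[category].append(tag)
    return dict(groups)


for category, tags in categorize(TAGS).items():
    print(category, tags)
```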

Code Fixes

These transformations change existing code, for example by improving docstrings, adding logging statements, or fixing string formatting.

Write a technical description for each code-fix transformation.

For example,

> llm_transform.py -i input.txt -o output.txt -p code_fix_by_using_f_strings
  • Rewrites the code to use f-strings, such as f"Hello, {name}", instead of conventional string formatting, such as "Hello, %s."
  • Performs the post-transform operation 'remove_code_delimiters', which removes delimiters like """ from the code.
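As an illustration of the kind of change this transform targets, the snippet below shows a %-style format string next to its f-string equivalent. It is a hand-written before/after pair, not output produced by llm_transform.py, and the variable names are made up.

```python
name = "Ada"

# Before: conventional %-style formatting.
old_style = "Hello, %s." % name

# After: the equivalent f-string.
new_style = f"Hello, {name}."

# Both forms produce the same string, so the rewrite preserves behavior.
assert old_style == new_style
print(new_style)
```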

Code Review & Refactoring

> llm_transform.py -i dev_scripts_helpers/documentation/render_images.py -o cfile -p code_review_correctness

Write a short technical description of code_review_correctness, followed by information on the input file and output file.

> llm_transform.py -i dev_scripts_helpers/documentation/render_images.py -o cfile -p code_review_refactoring

Write a short technical description of code_review_refactoring, followed by information on the input file and output file.

Markdown Processing

Write a technical description for each markdown-processing transformation, as done for the code fixes.

<----------------EOF------------------->

Framework for how_to_doc.md

Please note: since all.llm_transform.how_to_guide.md already covers some content (e.g., the architecture), I will append the step-by-step execution of CLI commands, such as:

> llm_transform.py -i research_amp/causal_kg/scrape_fred_metadata.py -o research_amp/causal_kg/scrape_fred_metadata.py.new -p code_fix_from_imports

Write step-by-step instructions for each transformation listed in reference.md.

Step-by-Step Instructions

Step 1 – Understand the command format, including -i, -o, and -p.

Step 2 – Run the script with a specific prompt, ensuring you understand the purpose of the prompt.

Step 3 – Verify the output:
  • Use the compare prompt tag.
  • Check that the transformation has been applied as expected.

Step 4 – Troubleshoot in case of issues.
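For the manual check in Step 3, a small diff helper can show exactly which lines a transform changed. The sketch below uses Python's standard difflib; it is an independent illustration, not the compare prompt tag, and the function name `changed_lines` and the sample strings are made up.

```python
import difflib


def changed_lines(before: str, after: str) -> list[str]:
    """Return only the removed ("- ") and added ("+ ") lines of a diff."""
    diff = difflib.ndiff(before.splitlines(), after.splitlines())
    # ndiff also emits unchanged ("  ") and hint ("? ") lines; drop those.
    return [line for line in diff if line.startswith(("- ", "+ "))]


# Stand-ins for the contents of the -i file and the -o file.
before = 'import os\nprint("Hello, %s." % name)\n'
after = 'import os\nprint(f"Hello, {name}.")\n'

for line in changed_lines(before, after):
    print(line)
```

An empty result means the output is identical to the input, which usually signals that the transform was not applied.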

<----------------EOF------------------->

sameer617 avatar May 15 '25 15:05 sameer617

@sameer617 can you pick up my branch above and modify it, then do a PR?

We use the GitHub collaboration approach. Also, make sure to go through the onboarding docs very carefully to understand how we do things. We follow a very precise process.

gpsaggese avatar May 15 '25 15:05 gpsaggese

@gpsaggese Hi, I need some more time to open a PR. I've modified the documents in the above branch (the changes are local and not pushed yet), but I'm encountering an issue while running the linter. I'll proceed with the PR as soon as that's resolved.

sameer617 avatar May 17 '25 13:05 sameer617

  • ETA expired, sorry. @sameer617 we will give you another task. In the meantime, passing this to @indrayudd and @aver81

  • Next steps

    • [ ] Read the Issue and understand where we are
    • [ ] Move the doc about llm_transform from doc_toolchain into the branch / new files
    • [ ] Explain all the pieces of llm_transform using the style of https://github.com/causify-ai/helpers/pull/764

gpsaggese avatar May 30 '25 17:05 gpsaggese