Optionally strip PROPERTIES when sending requests from Org mode buffers?

Open weavermarquez opened this issue 1 year ago • 1 comments

As the dry runs below show, the system message is included twice when querying from a gptel chat buffer that has been saved as an Org file.

Expected behaviour:

The gptel :PROPERTIES: values, especially GPTEL_SYSTEM, should not be included in the user prompt.

Replication Steps:

  1. Ensure gptel--system-message is set.
  2. Create a temporary chat buffer, e.g. *Claude*.
  3. Save it as an Org file; this will create a PROPERTIES drawer.
  4. Run gptel-menu interactively and choose Dry Run > Inspect query (Lisp) [1] or (JSON) [2].

Examples

[1] Lisp

(:model "claude-3-5-sonnet-20240620" :system "     Preface your responses with a relevant hashtag at the beginning of each response.

     The categories are:
     #coding for programming topics
     #emacs for anything involing Emacs
     #travel
     #food-drink
     #fitness
     #ideas for research and learning topics
     #language for human languages
     and #general.
" :stream :json-false :max_tokens 4096 :messages
[(:role "user" :content ":PROPERTIES:
:GPTEL_MODEL: claude-3-5-sonnet-20240620
:GPTEL_BACKEND: Claude
:GPTEL_SYSTEM: Preface your responses with a relevant hashtag at the beginning of each response.\\n\\n     The categories are:\\n     #coding for programming topics\\n     #emacs for anything involing Emacs\\n     #travel\\n     #food-drink\\n     #fitness\\n     #ideas for research and learning topics\\n     #language for human languages\\n     and #general.\\n
:GPTEL_MAX_TOKENS: 4096
:GPTEL_BOUNDS: nil
:END:

*** Testing")]
:temperature 1.0)

[2] JSON

{
  "model": "claude-3-5-sonnet-20240620",
  "system": "     Preface your responses with a relevant hashtag at the beginning of each response.\n\n     The categories are:\n     #coding for programming topics\n     #emacs for anything involing Emacs\n     #travel\n     #food-drink\n     #fitness\n     #ideas for research and learning topics\n     #language for human languages\n     and #general.\n",
  "stream": false,
  "max_tokens": 4096,
  "messages": [
    {
      "role": "user",
      "content": ":PROPERTIES:\n:GPTEL_MODEL: claude-3-5-sonnet-20240620\n:GPTEL_BACKEND: Claude\n:GPTEL_SYSTEM: Preface your responses with a relevant hashtag at the beginning of each response.\\n\\n     The categories are:\\n     #coding for programming topics\\n     #emacs for anything involing Emacs\\n     #travel\\n     #food-drink\\n     #fitness\\n     #ideas for research and learning topics\\n     #language for human languages\\n     and #general.\\n\n:GPTEL_MAX_TOKENS: 4096\n:GPTEL_BOUNDS: nil\n:END:\n\n*** Testing"
    }
  ],
  "temperature": 1.0
}

weavermarquez, Oct 04 '24 01:10

The system message is not being sent twice, as you can see from the :system parameter of the request. gptel sends the buffer contents as-is, which includes the system message since it's been written to the buffer.

So the question is whether all :PROPERTIES: blocks should be stripped when constructing the user prompt. See prior discussion in #141 (specifically this comment and my response) and one possible workaround based on #325.

karthink, Oct 04 '24 20:10

It would be nice to be able to (optionally) strip comments from Org (#) and MD (HTML comments) files as well, or more generally from any mode that defines comment-start:

https://github.com/karthink/gptel/blame/b2e54046fef11566a087587f37d7ea5b194c2074/gptel.el#L351

What do you think @karthink? I might attempt to implement this if you're OK with the idea.

pabl0, Feb 19 '25 21:02

> It would be nice to be able to (optionally) strip comments from Org (#) and MD (HTML comments) files as well, or more generally from any mode that defines comment-start:

This stream of suggestions and requests is going to keep growing. Next, users will want Org keywords (like #+foo: and #+attr_latex:) to be ignored. See #325, for example.

The logical end point of this growing list is to essentially run org-export on the chat buffer and send the result. You can already do that today, and I don't want to re-implement a worse version of org-export.

(org-export is also a pretty heavy operation, as it conses a full Org parse tree in memory, two duplicate buffers and runs a fair bit of lisp.)

> What do you think @karthink? I might attempt to implement this if you're OK with the idea.

There are incoming changes in #626 that will make your attempt obsolete, so I suggest holding off for now. Some changes in that PR are required for many things, including this issue, #325 and #328. My takeaway from these requests is that gptel can provide a general recipe instead of bespoke solutions for each request:

After #626 I need to add some infrastructure for "prompt filters", a hook that runs on prompt text before it is parsed. Functions added to this hook can modify the prompt text as appropriate. Stripping syntax types like properties and comments can then be carried out in this hook.

A hook function to strip properties can be added at this stage. That's the only one I plan to include by default.
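As a rough illustration of what such a property-stripping function might do, here is a minimal sketch. The name my/strip-property-drawers is hypothetical, the function works on a plain string rather than on gptel's prompt buffer, and it is deliberately not attached to any hook because the hook's name is not given in this thread. It is not gptel's actual implementation.

```elisp
(defun my/strip-property-drawers (text)
  "Return TEXT with every :PROPERTIES: ... :END: block removed.
Hypothetical helper for illustration only; not part of gptel."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (let ((case-fold-search t))
      ;; Find each line that opens a property drawer.
      (while (re-search-forward "^[ \t]*:PROPERTIES:[ \t]*$" nil t)
        (let ((start (line-beginning-position)))
          (if (re-search-forward "^[ \t]*:END:[ \t]*$" nil t)
              ;; Delete the drawer, including the newline after :END:.
              (delete-region start
                             (min (point-max) (1+ (line-end-position))))
            ;; Unterminated drawer: leave the text alone and stop.
            (goto-char (point-max))))))
    (buffer-string)))
```

Applied to the user prompt from the dry run above, this would leave only the blank line and the "*** Testing" heading.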

Users can write filters to strip anything else they don't want, or transform the input text in some way. For Org buffers specifically, if you want to strip all syntax you can just org-export to ascii and send that instead.
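The org-export route mentioned above can be sketched as follows. The helper name my/org-buffer-as-plain-text is hypothetical; org-export-as and the ascii backend are part of Org itself. This is only one way to do it, under the assumption that the default export settings (which omit property drawers) are acceptable.

```elisp
(require 'ox-ascii)  ; ASCII export backend, shipped with Org

(defun my/org-buffer-as-plain-text ()
  "Return the current Org buffer exported as plain ASCII text.
Hypothetical helper for illustration only; not part of gptel."
  ;; Arguments after the backend are SUBTREEP, VISIBLE-ONLY and BODY-ONLY.
  ;; BODY-ONLY set to t omits the title and author header.
  (org-export-as 'ascii nil nil t))
```

The returned string could then be used as the prompt, for example by passing it to gptel-request (assumed here to accept a prompt string as its first argument). As noted earlier in the thread, the export is comparatively heavy, so this is better suited to occasional use than to every gptel-send.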

karthink, Feb 19 '25 22:02

:PROPERTIES: (property drawers) are now stripped from the prompt before sending. You can verify this with a dry-run.

This is currently enabled by default, but I might make it optional soon.

You can additionally strip any other Org elements you don't want (like comments, say) by customizing gptel-org-ignore-elements. Note that adding org-elements to this list can slow down gptel-send and cause Emacs to hitch.
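A configuration along these lines might look like the snippet below. It assumes that gptel-org-ignore-elements holds a list of org-element type symbols and that property-drawer is the entry responsible for stripping :PROPERTIES: blocks; check the variable's docstring (C-h v) before relying on either assumption.

```elisp
;; Assumption: the variable is a list of org-element type symbols.
;; `property-drawer' keeps the :PROPERTIES: stripping described above;
;; `comment' additionally drops Org comment lines ("# ...").
(setq gptel-org-ignore-elements '(property-drawer comment))
```

Keeping the list short is in line with the warning above about gptel-send slowing down as more element types are added.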

karthink, Mar 11 '25 06:03

Closing as completed. Please reopen if this does not work as expected. To inspect what's being sent, evaluate (setq gptel-expert-commands t), then use the dry-run options from gptel's transient menu.

karthink, Mar 11 '25 06:03