Large-Language-Model-Notebooks-Course icon indicating copy to clipboard operation
Large-Language-Model-Notebooks-Course copied to clipboard

Update code in `nl2sql.ipynb` notebook - Changed the approach to explain the SQL tables to the mode

Open fmquaglia opened this issue 1 year ago • 4 comments

It is using a domain-specific language already understood by the model. This results in a more compact system prompt that allows for a more nuanced data model description. This should result in fewer tokens consumed by the system prompts and better maintainability, allowing for a richer data definition.

This might not be ideal for an educational exercise because it adds to the cognitive load (Wait... what? What is this DBML thing?), but I figured I would show it to you, anyway.

From

 # ...

context.append( {'role':'system', 'content':"""
first table:
{
  "tableName": "employees",
  "fields": [
    {
      "nombre": "ID_usr",
      "tipo": "int"
    },
    {
      "nombre": "name",
      "tipo": "string"
    }
  ]
}
"""
})

context.append( {'role':'system', 'content':"""
second table:
{
  "tableName": "salary",
  "fields": [
    {
      "nombre": "ID_usr",
      "type": "int"
    },
    {
      "name": "year",
      "type": "date"
    },
    {
      "name": "salary",
      "type": "float"
    }
  ]
}
"""
})

context.append( {'role':'system', 'content':"""
third table:
{
  "tablename": "studies",
  "fields": [
    {
      "name": "ID",
      "type": "int"
    },
    {
      "name": "ID_usr",
      "type": "int"
    },
    {
      "name": "educational level",
      "type": "int"
    },
    {
      "name": "Institution",
      "type": "string"
    },
    {
      "name": "Years",
      "type": "date"
    }
    {
      "name": "Speciality",
      "type": "string"
    }
  ]
}
"""
})

# ...

To

 # ...

context.append( {'role':'system', 'content':"""
This is the definition of your database tables:

```dbml
Table employees {
  ID_usr int [pk]
  name string
}

Table salary {
  ID_usr int [ref: > employees.ID_usr]
  year date
  salary float
}

Table studies {
  ID int [pk]
  ID_usr int [ref: > employees.ID_usr]
  educational_level int
  Institution string
  Years date
  Speciality string
}

""" })

...

fmquaglia avatar Feb 07 '24 09:02 fmquaglia

Notice that GitHub gets confused by the markdown in the system prompt, and it does not render the code properly in the PR description.

fmquaglia avatar Feb 07 '24 09:02 fmquaglia

I rebased on top of the latest additions to main.

fmquaglia avatar Feb 08 '24 07:02 fmquaglia

Hi @fmquaglia,

As you know, this notebook is part of a course, and it’s just a basic, initial approach to creating an NL2SQL solution. In a more advanced lesson, we have this notebook: 6_1_nl2sql_prompt_OpenAI.ipynb, which uses a very similar structure in the prompt to what you are presenting here, based on this paper: https://arxiv.org/abs/2305.11853.

However, I prefer to keep this more basic and somewhat incorrect approach here, and introduce the more accurate method later when the students have a stronger foundation of knowledge.

peremartra avatar Aug 13 '24 09:08 peremartra

Just figured that we can upload the notebook with a different name, and with some explanations in the header, explaining that this is a better solution.

rename the notebook to: 1_2b-Easy_NL2SQL.ipynb.

and please, add a by line in the header with your name and a link to your github profile, or linked profile, the one you prefer.

peremartra avatar Aug 13 '24 09:08 peremartra

@peremartra I totally missed this message. I'll try to have it done as you suggest by the end of this week, sir. Thank you!

fmquaglia avatar Sep 16 '24 17:09 fmquaglia

@peremartra I totally missed this message. I'll try to have it done as you suggest by the end of this week, sir. Thank you!

@peremartra I totally missed this message. I'll try to have it done as you suggest by the end of this week, sir. Thank you!

@peremartra I totally missed this message. I'll try to have it done as you suggest by the end of this week, sir. Thank you!

My Fault @fmquaglia! Thanks to you, waiting for you modifications :-)

peremartra avatar Sep 17 '24 11:09 peremartra

@peremartra Sorry, my wife called me to have dinner, and I completely forgot to send you a comment letting you know that what you requested has been done. Man, thanks for letting me contribute to this project. It makes me happy. Thank you!

fmquaglia avatar Oct 07 '24 10:10 fmquaglia

[image: image.png] Thanks to you Fabricio!

On Mon, Oct 7, 2024 at 12:31 PM Fabricio Quagliariello < @.***> wrote:

@peremartra https://github.com/peremartra Sorry, my wife called me to have dinner, and I completely forgot to send you a comment letting you know that what you requested has been done. Man, thanks for letting me contribute to this project. It makes me happy. Thank you!

— Reply to this email directly, view it on GitHub https://github.com/peremartra/Large-Language-Model-Notebooks-Course/pull/10#issuecomment-2396540776, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABX24ZSAME62QEX67227IZ3Z2JPI5AVCNFSM6AAAAABC5PYZV2VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDGOJWGU2DANZXGY . You are receiving this because you were mentioned.Message ID: <peremartra/Large-Language-Model-Notebooks-Course/pull/10/c2396540776@ github.com>

peremartra avatar Oct 07 '24 20:10 peremartra

Hi Fabricio, I added a link to the notebook in the readme.md of the course, and i'll promote the notebook on likedin and twitter in the next days, Thanks for the contribution!

On Mon, Oct 7, 2024 at 10:34 PM Pere Martra @.***> wrote:

[image: image.png] Thanks to you Fabricio!

On Mon, Oct 7, 2024 at 12:31 PM Fabricio Quagliariello < @.***> wrote:

@peremartra https://github.com/peremartra Sorry, my wife called me to have dinner, and I completely forgot to send you a comment letting you know that what you requested has been done. Man, thanks for letting me contribute to this project. It makes me happy. Thank you!

— Reply to this email directly, view it on GitHub https://github.com/peremartra/Large-Language-Model-Notebooks-Course/pull/10#issuecomment-2396540776, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABX24ZSAME62QEX67227IZ3Z2JPI5AVCNFSM6AAAAABC5PYZV2VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDGOJWGU2DANZXGY . You are receiving this because you were mentioned.Message ID: <peremartra/Large-Language-Model-Notebooks-Course/pull/10/c2396540776@ github.com>

peremartra avatar Oct 11 '24 20:10 peremartra