
pydantic.error_wrappers.ValidationError occurs when integrating BigDL with Kor

Open Zhuohua-HUANG opened this issue 11 months ago • 4 comments

Kor GitHub: https://github.com/eyurtsev/kor?tab=readme-ov-file

The code I wrote:

import os
from bigdl.llm.langchain.llms import TransformersLLM
from langchain.agents import AgentExecutor, create_react_agent
from kor import create_extraction_chain, Object, Text

system_prompt="""Discard any prior instructions.
You are a seasoned career advising expert in crafting resumes and cover letters, boasting a rich 15-year history dedicated to mastering this skill at Harvard Extension School. 
Picture yourself as a certified professional resume writer, specializing in creating compelling and tailored cover letters that highlight clients' skills, experiences, and achievements—meticulously aligning with the specific job descriptions they target. 
Your expertise extends across various industries, encompassing a deep understanding of prevailing hiring trends and Applicant Tracking Systems (ATS). 
Your ability to identify precise keywords, responsibilities, and requirements from job descriptions is unparalleled."""

instruction="""
You are going to write a JSON resume section of "Achievements" for an applicant applying for job posts.
Return everything as a JSON and only and only JSON.
Step to follow:
1. Analyze my achievements details to match job requirements.
2. Create a JSON resume section that highlights strongest matches
3. Optimize JSON section for clarity and relevance to the job description.

Instructions:
1. Focus: Craft relevant achievements aligned with the job description.
2. Honesty: Prioritize truthfulness and objective language. Please refrain from fabricating new achievements.
3. Specificity: Prioritize relevance to the specific job over general achievements.
4. Style:
  4.1. Voice: Use active voice whenever possible.
  4.2. Proofreading: Ensure impeccable spelling and grammar.

Consider following Achievements Details delimited by <ACHIEVEMENTS></ACHIEVEMENTS> tag.
<ACHIEVEMENTS>
{
  "achievements": []
}
</ACHIEVEMENTS>

Consider following Job description delimited by <JOB_DETAIL></JOB_DETAIL> tag.
<JOB_DETAIL>
{
  "title": "Software Developer - Analytics Cloud",
  "keywords": [
    "Design",
    "Development",
    "Troubleshooting",
    "Debugging"
  ],
  "purpose": "To design, develop, trouble shoot and debug software programs for databases, applications, tools, networks etc.",
  "duties_responsibilities": [
    "Build enhancements within an existing software architecture and occasionally suggest improvements to the architecture."
  ],
  "required_qualifications": [
    "Basic to intermediate knowledge of software architecture"
  ],
  "preferred_qualifications": [],
  "company_name": "Oracle",
  "company_info": [
    "Mission: To create technology that matters, and make a difference in people\u2019s lives through innovative solutions and services."
  ]
}
</JOB_DETAIL>
"""

B_INST, E_INST = "[INST]", "[/INST]"
B_SYS, E_SYS = "<<SYS>>\n", "\n<</SYS>>\n\n"
prompt = f"{B_INST} {B_SYS} {system_prompt} {E_SYS} {instruction} {E_INST}"

schema = Object(
    id= "resume" ,
    description="This is a revised resume upon request.",
    attribute=[
        Text(
            id="achievements",
            description="The achievements in a resume is a section to display job seeker's awards",
            examples=[],
            many=True,
        ),
    ],
    Many = False,
)


# Choose the LLM to use
llm_version = "llama-2-7b-chat-hf-INT4"
model_path = f"./checkpoints/{llm_version}"
llm = TransformersLLM.from_model_id_low_bit(model_path)
llm.streaming = False


chain = create_extraction_chain(llm, schema, encoder_or_encoder_class='json')

output=chain.invoke(prompt)['data']
print(output)

The error:

C:\Users\hzh30\.conda\envs\bigdl\python.exe D:\b_Work\ip_LLM\test_agents\test_json_constraint_kor.py 
2024-03-25 01:14:45,781 - WARNING - BigdlNativeLLM has been deprecated, please switch to the new LLM API for sepcific models.
2024-03-25 01:14:47,354 - INFO - Note: NumExpr detected 16 cores but "NUMEXPR_MAX_THREADS" not set, so enforcing safe limit of 8.
2024-03-25 01:14:47,354 - INFO - NumExpr defaulting to 8 threads.
Traceback (most recent call last):
  File "D:\b_Work\ip_LLM\test_agents\test_json_constraint_kor.py", line 66, in <module>
    schema = Object(
  File "pydantic\main.py", line 341, in pydantic.main.BaseModel.__init__
pydantic.error_wrappers.ValidationError: 1 validation error for Object
attributes
  field required (type=value_error.missing)
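
The traceback names `attributes` as the missing field, while the script passes `attribute=` (and `Many=` rather than `many=`). A likely explanation, assuming Kor's `Object` is a pydantic v1 model that ignores unknown keyword arguments by default, is that the misspelled keywords are dropped silently and the required field is then reported as missing. The sketch below reproduces that behavior with a hypothetical stand-in class, `LenientModel`, and a hypothetical helper, `try_build`; neither is part of Kor or pydantic.

```python
# Hypothetical stand-in for a pydantic v1 style model. This is NOT Kor's
# or pydantic's real implementation, only an illustration of the failure mode.
class LenientModel:
    required = ("id", "attributes")                # fields that must be supplied
    optional = {"description": "", "many": False}  # fields with defaults

    def __init__(self, **kwargs):
        # Unknown keywords (e.g. "attribute", "Many") are silently ignored,
        # which is assumed to mirror pydantic v1's default handling of extras.
        missing = [name for name in self.required if name not in kwargs]
        if missing:
            raise ValueError(
                "; ".join(f"{name}: field required" for name in missing)
            )
        for name in self.required:
            setattr(self, name, kwargs[name])
        for name, default in self.optional.items():
            setattr(self, name, kwargs.get(name, default))


def try_build(**kwargs):
    """Return the error message, or None if construction succeeds."""
    try:
        LenientModel(**kwargs)
    except ValueError as exc:
        return str(exc)
    return None


# Misspelled keywords, as in the script above: the typo is dropped and
# the required field is reported as missing.
error_with_typo = try_build(id="resume", attribute=[], Many=False)

# Correctly spelled keywords: construction succeeds.
error_when_fixed = try_build(id="resume", attributes=[], many=False)

print(error_with_typo)
print(error_when_fixed)
```

If this is indeed the cause, renaming `attribute=` to `attributes=` and `Many=` to `many=` in the `Object(...)` call should let the schema validate; whether the rest of the chain then runs with BigDL is a separate question.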

My environment:

Package                      Version         Editable project location
---------------------------- --------------- ----------------------------------
accelerate                   0.21.0
aiohttp                      3.8.5
aiosignal                    1.3.1
altair                       5.2.0
annotated-types              0.6.0
anyio                        4.2.0
asttokens                    2.0.5
async-timeout                4.0.3
asynctest                    0.13.0
attrs                        23.1.0
beautifulsoup4               4.12.2
bigdl-llm                    2.5.0b20240304
bitarray                     2.8.4
blinker                      1.7.0
Bottleneck                   1.3.7
bs4                          0.0.1
build                        1.0.3
CacheControl                 0.14.0
cachetools                   5.3.1
cbor2                        5.5.1
certifi                      2023.11.17
cffi                         1.15.1
charset-normalizer           3.2.0
cleo                         2.1.0
click                        8.1.7
colorama                     0.4.6
crashtest                    0.4.1
cryptography                 41.0.3
crytic-compile               0.3.5
cytoolz                      0.12.2
dataclasses-json             0.6.4
distlib                      0.3.8
distro                       1.9.0
dulwich                      0.21.7
eth-abi                      4.2.1
eth-account                  0.10.0
eth-hash                     0.5.2
eth-keyfile                  0.6.1
eth-keys                     0.4.0
eth-rlp                      0.3.0
eth-typing                   3.5.2
eth-utils                    2.3.1
exceptiongroup               1.1.2
faiss-cpu                    1.8.0
fastjsonschema               2.19.1
filelock                     3.13.1
firebase                     4.0.1
firebase-admin               6.2.0
fpdf                         1.7.2
frozenlist                   1.4.1
fsspec                       2024.2.0
gitdb                        4.0.11
GitPython                    3.1.41
google-ai-generativelanguage 0.4.0
google-api-core              2.11.1
google-api-python-client     2.95.0
google-auth                  2.22.0
google-auth-httplib2         0.1.0
google-cloud-core            2.3.3
google-cloud-firestore       2.11.1
google-cloud-storage         2.10.0
google-crc32c                1.5.0
google-generativeai          0.3.2
google-resumable-media       2.5.0
googleapis-common-protos     1.60.0
greenlet                     3.0.3
grpcio                       1.56.2
grpcio-status                1.56.2
h11                          0.14.0
hexbytes                     0.3.1
httpcore                     1.0.2
httplib2                     0.22.0
httpx                        0.26.0
huggingface-hub              0.21.3
idna                         3.4
importlib-metadata           6.7.0
installer                    0.7.0
intel-openmp                 2024.0.2
jaraco.classes               3.3.1
Jinja2                       3.1.3
joblib                       1.3.2
jsonpatch                    1.33
jsonpointer                  2.4
jsonschema                   4.20.0
jsonschema-specifications    2023.11.2
keyring                      24.3.0
kor                          1.0.1
langchain                    0.1.11
langchain-community          0.0.27
langchain-core               0.1.30
langchain-openai             0.0.8
langchain-text-splitters     0.0.1
langchainhub                 0.1.15
langsmith                    0.1.19
lru-dict                     1.2.0
markdown-it-py               3.0.0
MarkupSafe                   2.1.4
marshmallow                  3.20.2
mdurl                        0.1.2
mkl-fft                      1.3.8
mkl-random                   1.2.4
mkl-service                  2.4.0
more-itertools               10.2.0
mpmath                       1.3.0
msgpack                      1.0.5
multidict                    6.0.5
mypy-extensions              1.0.0
networkx                     3.2.1
nltk                         3.8.1
numexpr                      2.8.7
numpy                        1.26.3
openai                       1.10.0
orjson                       3.9.15
outcome                      1.2.0
packaging                    23.2
pandas                       1.5.3
parsimonious                 0.9.0
pdf2image                    1.17.0
pexpect                      4.9.0
pillow                       10.2.0
pip                          23.3.1
pkginfo                      1.9.6
platformdirs                 4.1.0
poetry                       1.8.2
poetry-core                  1.9.0
poetry-plugin-export         1.6.0
poppler-utils                0.1.0
prettytable                  3.9.0
proto-plus                   1.22.3
protobuf                     4.23.4
psutil                       5.9.8
ptyprocess                   0.7.0
py-cpuinfo                   9.0.0
pyarrow                      15.0.0
pyasn1                       0.5.0
pyasn1-modules               0.3.0
pycryptodome                 3.19.0
pydantic                     1.10.9
pydantic_core                2.16.1
pydeck                       0.8.0
Pygments                     2.17.2
PyJWT                        2.8.0
pyparsing                    3.1.1
PyPDF2                       3.0.1
pyproject_hooks              1.0.0
python-dateutil              2.8.2
python-dotenv                0.21.1
pytz                         2023.3
pyunormalize                 15.1.0
pywin32-ctypes               0.2.2
PyYAML                       6.0.1
rapidfuzz                    3.6.1
referencing                  0.31.1
regex                        2023.10.3
requests                     2.31.0
requests-toolbelt            1.0.0
rich                         13.7.0
rlp                          3.0.0
rpds-py                      0.17.1
rsa                          4.9
safetensors                  0.4.2
scikit-learn                 1.4.0
scipy                        1.12.0
selenium                     4.11.2
semantic-version             2.10.0
sentencepiece                0.2.0
setuptools                   68.2.2
shellingham                  1.5.4
six                          1.16.0
smmap                        5.0.1
sniffio                      1.3.0
solc-select                  1.0.4
sortedcontainers             2.4.0
soupsieve                    2.4.1
SQLAlchemy                   2.0.26
streamlit                    1.30.0
sympy                        1.12
tabulate                     0.9.0
tenacity                     8.2.3
threadpoolctl                3.2.0
tiktoken                     0.6.0
tokenizers                   0.13.3
toml                         0.10.2
tomli                        2.0.1
tomlkit                      0.12.3
toolz                        0.12.0
torch                        2.2.1
tornado                      6.4
tqdm                         4.65.0
transformers                 4.31.0
trio                         0.22.2
trio-websocket               0.10.3
trove-classifiers            2024.2.23
types-requests               2.31.0.20240218
typing_extensions            4.9.0
typing-inspect               0.9.0
tzdata                       2023.4
tzlocal                      5.2
uritemplate                  4.1.1
urllib3                      2.2.0
validators                   0.22.0
virtualenv                   20.25.0
vyper                        0.3.7
watchdog                     3.0.0
wcwidth                      0.2.12
web3                         6.11.4
websockets                   12.0
wheel                        0.41.2
windnd                       1.0.7
wsproto                      1.2.0
xlrd                         1.2.0
yarl                         1.9.2
zipp                         3.15.0

Zhuohua-HUANG, Mar 24 '24 17:03