ipex-llm
pydantic.error_wrappers.ValidationError when integrating BigDL with Kor
Kor GitHub: https://github.com/eyurtsev/kor?tab=readme-ov-file

The code I wrote:
import os
from bigdl.llm.langchain.llms import TransformersLLM
from langchain.agents import AgentExecutor, create_react_agent
from kor import create_extraction_chain, Object, Text
system_prompt="""Discard any prior instructions.
You are a seasoned career advising expert in crafting resumes and cover letters, boasting a rich 15-year history dedicated to mastering this skill at Harvard Extension School.
Picture yourself as a certified professional resume writer, specializing in creating compelling and tailored cover letters that highlight clients' skills, experiences, and achievements—meticulously aligning with the specific job descriptions they target.
Your expertise extends across various industries, encompassing a deep understanding of prevailing hiring trends and Applicant Tracking Systems (ATS).
Your ability to identify precise keywords, responsibilities, and requirements from job descriptions is unparalleled."""
instruction="""
You are going to write a JSON resume section of "Achievements" for an applicant applying for job posts.
Return everything as JSON and only JSON.
Steps to follow:
1. Analyze my achievement details to match the job requirements.
2. Create a JSON resume section that highlights the strongest matches.
3. Optimize the JSON section for clarity and relevance to the job description.
Instructions:
1. Focus: Craft relevant achievements aligned with the job description.
2. Honesty: Prioritize truthfulness and objective language. Please refrain from fabricating new achievements.
3. Specificity: Prioritize relevance to the specific job over general achievements.
4. Style:
4.1. Voice: Use active voice whenever possible.
4.2. Proofreading: Ensure impeccable spelling and grammar.
Consider the following achievement details, delimited by <ACHIEVEMENTS></ACHIEVEMENTS> tags.
<ACHIEVEMENTS>
{
    "achievements": []
}
</ACHIEVEMENTS>
Consider the following job description, delimited by <JOB_DETAIL></JOB_DETAIL> tags.
<JOB_DETAIL>
{
    "title": "Software Developer - Analytics Cloud",
    "keywords": [
        "Design",
        "Development",
        "Troubleshooting",
        "Debugging"
    ],
    "purpose": "To design, develop, trouble shoot and debug software programs for databases, applications, tools, networks etc.",
    "duties_responsibilities": [
        "Build enhancements within an existing software architecture and occasionally suggest improvements to the architecture."
    ],
    "required_qualifications": [
        "Basic to intermediate knowledge of software architecture"
    ],
    "preferred_qualifications": [],
    "company_name": "Oracle",
    "company_info": [
        "Mission: To create technology that matters, and make a difference in people\u2019s lives through innovative solutions and services."
    ]
}
</JOB_DETAIL>
"""
B_INST, E_INST = "[INST]", "[/INST]"
B_SYS, E_SYS = "<<SYS>>\n", "\n<</SYS>>\n\n"
prompt = f"{B_INST} {B_SYS} {system_prompt} {E_SYS} {instruction} {E_INST}"
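# Kor extraction schema describing the JSON structure expected back from the model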
schema = Object(
    id="resume",
    description="This is a revised resume upon request.",
    attribute=[
        Text(
            id="achievements",
            description="The achievements section of a resume displays the job seeker's awards",
            examples=[],
            many=True,
        ),
    ],
    Many=False,
)
# Choose the LLM to use
llm_version = "llama-2-7b-chat-hf-INT4"
model_path = f"./checkpoints/{llm_version}"
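# Load a model previously saved in BigDL low-bit (INT4) format from the local checkpoint directory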
llm = TransformersLLM.from_model_id_low_bit(model_path)
llm.streaming = False
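# Build the Kor extraction chain, which prompts the LLM with the schema and parses its JSON output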
chain = create_extraction_chain(llm, schema, encoder_or_encoder_class='json')
output=chain.invoke(prompt)['data']
print(output)
The error:
C:\Users\hzh30\.conda\envs\bigdl\python.exe D:\b_Work\ip_LLM\test_agents\test_json_constraint_kor.py
2024-03-25 01:14:45,781 - WARNING - BigdlNativeLLM has been deprecated, please switch to the new LLM API for sepcific models.
2024-03-25 01:14:47,354 - INFO - Note: NumExpr detected 16 cores but "NUMEXPR_MAX_THREADS" not set, so enforcing safe limit of 8.
2024-03-25 01:14:47,354 - INFO - NumExpr defaulting to 8 threads.
Traceback (most recent call last):
File "D:\b_Work\ip_LLM\test_agents\test_json_constraint_kor.py", line 66, in <module>
schema = Object(
File "pydantic\main.py", line 341, in pydantic.main.BaseModel.__init__
pydantic.error_wrappers.ValidationError: 1 validation error for Object
attributes
field required (type=value_error.missing)
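
For reference, the validation error points at a missing "attributes" field on Object, so I suspect the keyword arguments in my schema are to blame. A minimal sketch of the same schema built with attributes= and a lowercase many= (my assumption based on the error message and the Kor README, not a verified fix):

schema = Object(
    id="resume",
    description="This is a revised resume upon request.",
    attributes=[  # "attributes", not "attribute"
        Text(
            id="achievements",
            description="The achievements section of a resume displays the job seeker's awards",
            examples=[],
            many=True,
        ),
    ],
    many=False,  # lowercase "many"
)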
My environment (pip list):
Package Version Editable project location
---------------------------- --------------- ----------------------------------
accelerate 0.21.0
aiohttp 3.8.5
aiosignal 1.3.1
altair 5.2.0
annotated-types 0.6.0
anyio 4.2.0
asttokens 2.0.5
async-timeout 4.0.3
asynctest 0.13.0
attrs 23.1.0
beautifulsoup4 4.12.2
bigdl-llm 2.5.0b20240304
bitarray 2.8.4
blinker 1.7.0
Bottleneck 1.3.7
bs4 0.0.1
build 1.0.3
CacheControl 0.14.0
cachetools 5.3.1
cbor2 5.5.1
certifi 2023.11.17
cffi 1.15.1
charset-normalizer 3.2.0
cleo 2.1.0
click 8.1.7
colorama 0.4.6
crashtest 0.4.1
cryptography 41.0.3
crytic-compile 0.3.5
cytoolz 0.12.2
dataclasses-json 0.6.4
distlib 0.3.8
distro 1.9.0
dulwich 0.21.7
eth-abi 4.2.1
eth-account 0.10.0
eth-hash 0.5.2
eth-keyfile 0.6.1
eth-keys 0.4.0
eth-rlp 0.3.0
eth-typing 3.5.2
eth-utils 2.3.1
exceptiongroup 1.1.2
faiss-cpu 1.8.0
fastjsonschema 2.19.1
filelock 3.13.1
firebase 4.0.1
firebase-admin 6.2.0
fpdf 1.7.2
frozenlist 1.4.1
fsspec 2024.2.0
gitdb 4.0.11
GitPython 3.1.41
google-ai-generativelanguage 0.4.0
google-api-core 2.11.1
google-api-python-client 2.95.0
google-auth 2.22.0
google-auth-httplib2 0.1.0
google-cloud-core 2.3.3
google-cloud-firestore 2.11.1
google-cloud-storage 2.10.0
google-crc32c 1.5.0
google-generativeai 0.3.2
google-resumable-media 2.5.0
googleapis-common-protos 1.60.0
greenlet 3.0.3
grpcio 1.56.2
grpcio-status 1.56.2
h11 0.14.0
hexbytes 0.3.1
httpcore 1.0.2
httplib2 0.22.0
httpx 0.26.0
huggingface-hub 0.21.3
idna 3.4
importlib-metadata 6.7.0
installer 0.7.0
intel-openmp 2024.0.2
jaraco.classes 3.3.1
Jinja2 3.1.3
joblib 1.3.2
jsonpatch 1.33
jsonpointer 2.4
jsonschema 4.20.0
jsonschema-specifications 2023.11.2
keyring 24.3.0
kor 1.0.1
langchain 0.1.11
langchain-community 0.0.27
langchain-core 0.1.30
langchain-openai 0.0.8
langchain-text-splitters 0.0.1
langchainhub 0.1.15
langsmith 0.1.19
lru-dict 1.2.0
markdown-it-py 3.0.0
MarkupSafe 2.1.4
marshmallow 3.20.2
mdurl 0.1.2
mkl-fft 1.3.8
mkl-random 1.2.4
mkl-service 2.4.0
more-itertools 10.2.0
mpmath 1.3.0
msgpack 1.0.5
multidict 6.0.5
mypy-extensions 1.0.0
networkx 3.2.1
nltk 3.8.1
numexpr 2.8.7
numpy 1.26.3
openai 1.10.0
orjson 3.9.15
outcome 1.2.0
packaging 23.2
pandas 1.5.3
parsimonious 0.9.0
pdf2image 1.17.0
pexpect 4.9.0
pillow 10.2.0
pip 23.3.1
pkginfo 1.9.6
platformdirs 4.1.0
poetry 1.8.2
poetry-core 1.9.0
poetry-plugin-export 1.6.0
poppler-utils 0.1.0
prettytable 3.9.0
proto-plus 1.22.3
protobuf 4.23.4
psutil 5.9.8
ptyprocess 0.7.0
py-cpuinfo 9.0.0
pyarrow 15.0.0
pyasn1 0.5.0
pyasn1-modules 0.3.0
pycryptodome 3.19.0
pydantic 1.10.9
pydantic_core 2.16.1
pydeck 0.8.0
Pygments 2.17.2
PyJWT 2.8.0
pyparsing 3.1.1
PyPDF2 3.0.1
pyproject_hooks 1.0.0
python-dateutil 2.8.2
python-dotenv 0.21.1
pytz 2023.3
pyunormalize 15.1.0
pywin32-ctypes 0.2.2
PyYAML 6.0.1
rapidfuzz 3.6.1
referencing 0.31.1
regex 2023.10.3
requests 2.31.0
requests-toolbelt 1.0.0
rich 13.7.0
rlp 3.0.0
rpds-py 0.17.1
rsa 4.9
safetensors 0.4.2
scikit-learn 1.4.0
scipy 1.12.0
selenium 4.11.2
semantic-version 2.10.0
sentencepiece 0.2.0
setuptools 68.2.2
shellingham 1.5.4
six 1.16.0
smmap 5.0.1
sniffio 1.3.0
solc-select 1.0.4
sortedcontainers 2.4.0
soupsieve 2.4.1
SQLAlchemy 2.0.26
streamlit 1.30.0
sympy 1.12
tabulate 0.9.0
tenacity 8.2.3
threadpoolctl 3.2.0
tiktoken 0.6.0
tokenizers 0.13.3
toml 0.10.2
tomli 2.0.1
tomlkit 0.12.3
toolz 0.12.0
torch 2.2.1
tornado 6.4
tqdm 4.65.0
transformers 4.31.0
trio 0.22.2
trio-websocket 0.10.3
trove-classifiers 2024.2.23
types-requests 2.31.0.20240218
typing_extensions 4.9.0
typing-inspect 0.9.0
tzdata 2023.4
tzlocal 5.2
uritemplate 4.1.1
urllib3 2.2.0
validators 0.22.0
virtualenv 20.25.0
vyper 0.3.7
watchdog 3.0.0
wcwidth 0.2.12
web3 6.11.4
websockets 12.0
wheel 0.41.2
windnd 1.0.7
wsproto 1.2.0
xlrd 1.2.0
yarl 1.9.2
zipp 3.15.0