ImportError when packaging a standalone application with PyInstaller

Open Mazzesy opened this issue 10 months ago • 6 comments

When packaging my application with PyInstaller, I run into an error related to the "lark" library. It is raised when the SelfQueryRetriever from langchain is instantiated:

Traceback (most recent call last):
  File "test.py", line 39, in <module>
  File "langchain\retrievers\self_query\base.py", line 144, in from_llm
  File "langchain\chains\query_constructor\base.py", line 154, in load_query_constructor_chain
  File "langchain\chains\query_constructor\base.py", line 115, in _get_prompt
  File "langchain\chains\query_constructor\base.py", line 72, in from_components
  File "langchain\chains\query_constructor\parser.py", line 150, in get_parser
ImportError: Cannot import lark, please install it with 'pip install lark'.

I have already ensured that the "lark" library is installed using the appropriate command: pip install lark.

I have also tried adding a hook-lark.py file to PyInstaller, as suggested in #548.
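Since the message above does not show the underlying failure, it may help to check from inside the frozen executable whether the module actually imports and whether its distribution metadata is visible. The following is only a minimal diagnostic sketch using the standard library. `probe_package` is a hypothetical helper name, not part of langchain, lark, or PyInstaller, and it is an assumption that one of these two checks corresponds to whatever fails in the packaged application.

```python
import importlib
import importlib.metadata
import traceback


def probe_package(name):
    """Report whether `name` can be imported and whether its metadata is found.

    Hypothetical helper for debugging a frozen build; it does not fix anything.
    """
    report = {
        "import_ok": False,
        "import_error": None,
        "version": None,
        "metadata_error": None,
    }
    try:
        importlib.import_module(name)
        report["import_ok"] = True
    except Exception:
        # Keep the full traceback so the real cause is not hidden.
        report["import_error"] = traceback.format_exc()
    try:
        # Looks up the installed distribution's metadata, not the module itself.
        report["version"] = importlib.metadata.version(name)
    except importlib.metadata.PackageNotFoundError as exc:
        report["metadata_error"] = repr(exc)
    return report


if __name__ == "__main__":
    for key, value in probe_package("lark").items():
        print(f"{key}: {value}")
```

Placing a call like this at the top of test.py and running both the plain script and the packaged executable would show whether the two environments differ, and in which of the two checks.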

With the following code the problem can be reproduced:

from langchain.embeddings import OpenAIEmbeddings
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import Chroma
from langchain.retrievers import SelfQueryRetriever
from langchain.llms import OpenAI
from langchain.chains.query_constructor.base import AttributeInfo

embeddings = OpenAIEmbeddings()

persist_directory = "data"
text = ["test"]

chunk_size = 1000
chunk_overlap = 10
# Raw string for the regex separator so that "\." is not treated as a string escape.
r_splitter = RecursiveCharacterTextSplitter(chunk_size=chunk_size, chunk_overlap=chunk_overlap,
                                            separators=["\n\n", r"(?<=\. )", "\n"])
docs = r_splitter.create_documents(text)

for doc in docs:
    doc.metadata = {"document": "test"}

db = Chroma.from_documents(documents=docs, embedding=embeddings, persist_directory=persist_directory)

db.persist()

metadata_field_info = [
    AttributeInfo(
        name="document",
        description="The name of the document the chunk is from.",
        type="string",
    ),
]

document_content_description = "Test document"

llm = OpenAI(temperature=0)
retriever = SelfQueryRetriever.from_llm(
    llm,
    db,
    document_content_description,
    metadata_field_info,
    verbose=True
)

The spec-file to create the standalone application looks like this:

# -*- mode: python ; coding: utf-8 -*-


block_cipher = None


a = Analysis(
    ['test.py'],
    pathex=[],
    binaries=[],
    datas=[],
    hiddenimports=['tiktoken_ext', 'tiktoken_ext.openai_public', 'onnxruntime', 'chromadb', 'chromadb.telemetry.posthog', 'chromadb.api.local', 'chromadb.db.duckdb'],
    hookspath=['.'],
    hooksconfig={},
    runtime_hooks=[],
    excludes=[],
    win_no_prefer_redirects=False,
    win_private_assemblies=False,
    cipher=block_cipher,
    noarchive=False,
)
pyz = PYZ(a.pure, a.zipped_data, cipher=block_cipher)

# Raw string so that "\t" in the placeholder path is not read as a tab character.
a.datas += Tree(r'path\to\langchain', prefix='langchain')

exe = EXE(
    pyz,
    a.scripts,
    a.binaries,
    a.zipfiles,
    a.datas,
    [],
    name='test',
    debug=False,
    bootloader_ignore_signals=False,
    strip=False,
    upx=True,
    upx_exclude=[],
    runtime_tmpdir=None,
    console=True,
    disable_windowed_traceback=False,
    argv_emulation=False,
    target_arch=None,
    codesign_identity=None,
    entitlements_file=None,
)

Can you help? Thanks in advance!

Mazzesy, Aug 14 '23 15:08