CodexGraph - build completes but nothing is added to Neo4j
Initial Checks
- [X] I have searched GitHub for a duplicate issue and I'm sure this is something new
- [X] I have read and followed the docs & demos and still think this is a bug
- [X] I am confident that the issue is with modelscope-agent (not my code, or another library in the ecosystem)
What happened + What you expected to happen
I am trying to index a repository with CodexGraph. Everything appears to work: I tested the connection and it is fine, and the logs show no errors. However, after I run "build", no records appear in Neo4j.
Part of the logs (note that I added the progress bars myself to check whether extraction was failing; apparently it was not):
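One way to confirm that the database really is empty after a build is to count the nodes directly rather than rely on the browser view. The following is a minimal sketch, assuming py2neo (it is pinned in the requirements below) and its `Graph(url, auth=..., name=...)` constructor and `Graph.evaluate()` method. The helper names `count_nodes` and `connect` and the `FakeGraph` stand-in are made up for illustration and are not part of CodexGraph.

```python
def count_nodes(graph):
    """Return the total number of nodes visible through `graph`.

    `graph` is any object with an `evaluate(cypher)` method,
    for example a py2neo.Graph instance.
    """
    return graph.evaluate("MATCH (n) RETURN count(n)")


def connect(url, user, password, db_name):
    """Open a py2neo connection (hypothetical helper, not CodexGraph code)."""
    # Imported lazily so count_nodes() can be tried without py2neo installed.
    from py2neo import Graph
    return Graph(url, auth=(user, password), name=db_name)


class FakeGraph:
    """Stand-in for py2neo.Graph so the helper can be tried without a database."""

    def __init__(self, node_count):
        self.node_count = node_count

    def evaluate(self, cypher):
        # A real Graph would send `cypher` to Neo4j; this just returns a fixed value.
        return self.node_count


if __name__ == "__main__":
    # Dry run against the stand-in; swap in connect(...) for a real check.
    print(count_nodes(FakeGraph(0)))
```

Against a real instance this would be called as `count_nodes(connect("<DATABASE_URL>", "<USERNAME>", "<PASSWORD>", "<DATABASE_NAME>"))`. A result of 0 right after a build would confirm that nothing was written to the database being queried. It is also worth checking that `<DATABASE_NAME>` matches the database the build was pointed at.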
Successfully processed /home/kuba/Projects/github_search/org/llama_prompting.py
Successfully processed /home/kuba/Projects/github_search/org/target_vocab.py
Successfully processed /home/kuba/Projects/github_search/org/json_utils.py
Successfully processed /home/kuba/Projects/github_search/org/tmp/f1.py
Successfully processed /home/kuba/Projects/github_search/org/.ipynb_checkpoints/promptify_runner-checkpoint.py
Successfully processed /home/kuba/Projects/github_search/org/pd_processing.py
Successfully processed /home/kuba/Projects/github_search/org/tmp/nbow_tasks.py
Successfully processed /home/kuba/Projects/github_search/org/.ipynb_checkpoints/promptify_utils-checkpoint.py
Building modules and classes: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 306/306 [00:07<00:00, 40.53it/s]
Building classes and methods: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 306/306 [00:00<00:00, 447.86it/s]
Building inherited methods: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 110/110 [00:00<00:00, 834.14it/s]
✍️ Shallow indexing (18 s)
Versions / Dependencies
Python 3.10.6 on Ubuntu
requirements.txt:
addict==2.4.0
aiohappyeyeballs==2.3.5
aiohttp==3.10.3
aiosignal==1.3.1
aliyun-python-sdk-core==2.15.1
aliyun-python-sdk-kms==2.16.3
altair==5.4.0
annotated-types==0.7.0
anyio==4.4.0
argon2-cffi==23.1.0
argon2-cffi-bindings==21.2.0
arrow==1.3.0
asttokens==2.4.1
async-lru==2.0.4
async-timeout==4.0.3
attrs==24.2.0
babel==2.16.0
backoff==2.2.1
beautifulsoup4==4.12.3
bleach==6.1.0
blinker==1.8.2
cachetools==5.4.0
certifi==2024.7.4
cffi==1.17.0
chardet==5.2.0
charset-normalizer==3.3.2
click==8.1.7
comm==0.2.2
contourpy==1.2.1
crcmod==1.7
cryptography==43.0.0
cycler==0.12.1
dashscope==1.20.3
dataclasses-json==0.6.7
datasets==2.20.0
debugpy==1.8.5
decorator==5.1.1
deepdiff==7.0.1
defusedxml==0.7.1
Deprecated==1.2.14
dill==0.3.8
dirtyjson==1.0.8
distro==1.9.0
einops==0.8.0
emoji==2.12.1
et-xmlfile==1.1.0
exceptiongroup==1.2.2
executing==2.0.1
faiss-cpu==1.8.0.post1
fasteners==0.19
fastjsonschema==2.20.0
filelock==3.15.4
filetype==1.2.0
fonttools==4.53.1
fqdn==1.5.1
frozenlist==1.4.1
fsspec==2024.6.1
gitdb==4.0.11
GitPython==3.1.43
greenlet==3.0.3
grpcio==1.65.4
h11==0.14.0
httpcore==1.0.5
httpx==0.27.0
huggingface-hub==0.24.5
idna==3.7
iniconfig==2.0.0
interchange==2021.0.4
ipykernel==6.29.5
ipython==8.18.1
ipywidgets==8.1.3
isoduration==20.11.0
jedi==0.17.2
jieba==0.42.1
Jinja2==3.1.4
jiter==0.5.0
jmespath==0.10.0
joblib==1.4.2
json5==0.9.25
jsonpatch==1.33
jsonpath-python==1.0.6
jsonpointer==3.0.0
jsonschema==4.23.0
jsonschema-specifications==2023.12.1
jupyter==1.0.0
jupyter-console==6.6.3
jupyter-events==0.10.0
jupyter-lsp==2.2.5
jupyter_client==8.6.2
jupyter_core==5.7.2
jupyter_server==2.14.2
jupyter_server_terminals==0.5.3
jupyterlab==4.2.4
jupyterlab_pygments==0.3.0
jupyterlab_server==2.27.3
jupyterlab_widgets==3.0.11
kiwisolver==1.4.5
langchain==0.2.12
langchain-community==0.2.11
langchain-core==0.2.29
langchain-experimental==0.0.64
langchain-text-splitters==0.2.2
langdetect==1.0.9
langsmith==0.1.98
llama-cloud==0.0.13
llama-index==0.10.64
llama-index-agent-openai==0.2.9
llama-index-cli==0.1.13
llama-index-core==0.10.64
llama-index-embeddings-openai==0.1.11
llama-index-indices-managed-llama-cloud==0.2.7
llama-index-legacy==0.9.48
llama-index-llms-openai==0.1.29
llama-index-multi-modal-llms-openai==0.1.9
llama-index-program-openai==0.1.7
llama-index-question-gen-openai==0.1.3
llama-index-readers-file==0.1.33
llama-index-readers-json==0.1.5
llama-index-readers-llama-parse==0.1.6
llama-index-retrievers-bm25==0.1.5
llama-parse==0.4.9
lxml==5.3.0
markdown-it-py==3.0.0
MarkupSafe==2.1.5
marshmallow==3.21.3
matplotlib==3.9.1.post1
matplotlib-inline==0.1.7
mdurl==0.1.2
mistune==3.0.2
modelscope==1.17.1
-e git+https://github.com/modelscope/modelscope-agent/@b0143952ef5dbfd1c191e898c40899d416fcbb61#egg=modelscope_agent
monotonic==1.6
multidict==6.0.5
multiprocess==0.70.16
mypy-extensions==1.0.0
narwhals==1.3.0
nbclient==0.10.0
nbconvert==7.16.4
nbformat==5.10.4
nest-asyncio==1.6.0
networkx==3.2.1
nltk==3.8.2
notebook==7.2.1
notebook_shim==0.2.4
numpy==1.26.4
openai==1.40.3
opencv-python==4.10.0.84
openpyxl==3.1.5
ordered-set==4.1.0
orjson==3.10.7
oss2==2.18.6
overrides==7.7.0
packaging==24.1
pandas==2.2.2
pandocfilters==1.5.1
pansi==2020.7.3
parso==0.7.0
pdfminer.six==20240706
pexpect==4.9.0
pillow==10.4.0
platformdirs==4.2.2
pluggy==1.5.0
prometheus_client==0.20.0
prompt_toolkit==3.0.47
protobuf==5.27.3
psutil==6.0.0
ptyprocess==0.7.0
pure_eval==0.2.3
py2neo==2021.2.4
pyarrow==17.0.0
pyarrow-hotfix==0.6
pycparser==2.22
pycryptodome==3.20.0
pydantic==2.8.2
pydantic_core==2.20.1
pydeck==0.9.1
Pygments==2.18.0
pyparsing==3.1.2
pypdf==4.3.1
pytest==8.3.2
pytest-mock==3.14.0
python-dateutil==2.9.0.post0
python-dotenv==1.0.1
python-iso639==2024.4.27
python-json-logger==2.0.7
python-magic==0.4.27
pytz==2024.1
PyYAML==6.0.2
pyzmq==26.1.0
qtconsole==5.5.2
QtPy==2.4.1
rank-bm25==0.2.2
rapidfuzz==3.9.6
referencing==0.35.1
regex==2024.7.24
requests==2.32.3
requests-toolbelt==1.0.0
rfc3339-validator==0.1.4
rfc3986-validator==0.1.1
rich==13.7.1
rpds-py==0.20.0
safetensors==0.4.4
scipy==1.13.1
seaborn==0.13.2
Send2Trash==1.8.3
sentencepiece==0.2.0
simplejson==3.19.2
six==1.16.0
smmap==5.0.1
sniffio==1.3.1
sortedcontainers==2.4.0
soupsieve==2.5
SQLAlchemy==2.0.32
stack-data==0.6.3
streamlit==1.37.1
striprtf==0.0.26
tabulate==0.9.0
tenacity==8.5.0
terminado==0.18.1
tiktoken==0.7.0
tinycss2==1.3.0
tokenizers==0.19.1
toml==0.10.2
tomli==2.0.1
tornado==6.4.1
tqdm==4.66.5
traitlets==5.14.3
transformers==4.44.0
types-python-dateutil==2.9.0.20240316
typing-inspect==0.9.0
typing_extensions==4.12.2
tzdata==2024.1
unstructured==0.15.1
unstructured-client==0.25.4
uri-template==1.3.0
urllib3==2.2.2
watchdog==4.0.2
wcwidth==0.2.13
webcolors==24.8.0
webencodings==0.5.1
websocket-client==1.8.0
widgetsnbextension==4.0.11
wrapt==1.16.0
xxhash==3.4.1
yarl==1.9.4
Reproduction script
That would be pretty hard, since everything happens in the CodexGraph Streamlit app.
Issue Severity
High: It blocks me from completing my task.
The log output provided doesn't appear to show any issues. To investigate further, try building a single file with the following command:
<PYTHON_ENV_PATH> modelscope_agent\environment\graph_database\indexer\run_index_single.py --file_path <FILE_PATH> --root_path <ROOT_PATH> --task_id <TASK_ID> --url <DATABASE_URL> --user <USERNAME> --password <PASSWORD> --db_name <DATABASE_NAME> --env <PYTHON_ENV_PATH> --shallow
Parameter explanations:
<PYTHON_ENV_PATH>: The path to the Python environment (Python <= 3.9) where the required dependencies are installed.
<FILE_PATH>: The path to the file that needs to be processed.
<ROOT_PATH>: The root directory of the project, used to determine relative paths.
<TASK_ID>: A unique ID to identify the task.
<DATABASE_URL>: The URL of the Neo4j database, usually in the format bolt://<HOST>:<PORT>.
<USERNAME>: The username to connect to the Neo4j database.
<PASSWORD>: The password to connect to the Neo4j database.
<DATABASE_NAME>: The name of the Neo4j database to use.
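The command above can also be assembled and run from Python, which makes it easier to capture stderr that the Streamlit UI might not surface. This is only a sketch: the flags are copied from the command above, while the function names `build_index_command` and `run_index_single` are hypothetical, and the script path is passed in rather than assumed.

```python
import subprocess


def build_index_command(python_env_path, script_path, file_path, root_path,
                        task_id, url, user, password, db_name, shallow=True):
    """Build the argv list for run_index_single.py from the flags listed above."""
    cmd = [
        python_env_path, script_path,
        "--file_path", file_path,
        "--root_path", root_path,
        "--task_id", task_id,
        "--url", url,
        "--user", user,
        "--password", password,
        "--db_name", db_name,
        "--env", python_env_path,
    ]
    if shallow:
        cmd.append("--shallow")
    return cmd


def run_index_single(cmd):
    """Run the indexer and return the CompletedProcess with captured output."""
    # check=False so a non-zero exit code can be inspected instead of raising.
    return subprocess.run(cmd, capture_output=True, text=True, check=False)
```

After `result = run_index_single(cmd)`, printing `result.returncode` and `result.stderr` should show whether the single-file build fails silently.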
Can CodexGraph support locally deployed models?
Python 3.10 is the issue. I was facing the same issue with Python 3.11. I switched to Python 3.9 and it worked.
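To rule the interpreter version in or out, the version of the Python behind `<PYTHON_ENV_PATH>` can be checked before building. A rough sketch follows. The `(3, 9)` limit is taken from the "Python <= 3.9" note earlier in this thread, and the function names are made up for illustration.

```python
import subprocess

# Upper bound taken from the "Python <= 3.9" note in this thread (an assumption,
# not a value read from CodexGraph itself).
MAX_SUPPORTED = (3, 9)


def interpreter_version(python_env_path):
    """Ask the given interpreter for its (major, minor) version."""
    out = subprocess.run(
        [python_env_path, "-c",
         "import sys; print(sys.version_info[0], sys.version_info[1])"],
        capture_output=True, text=True, check=True,
    ).stdout
    major, minor = out.split()
    return int(major), int(minor)


def is_supported(version):
    """Return True if (major, minor) is at or below MAX_SUPPORTED."""
    return tuple(version[:2]) <= MAX_SUPPORTED
```

For example, `is_supported(interpreter_version("<PYTHON_ENV_PATH>"))` returning False would point at the indexing environment. A True result only rules out this one cause.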
Same issue even with Python 3.9.