graphrag
[Issue]: ValueError when specifying custom timestamp_column in CSV to YAML conversion
Describe the issue
When attempting to convert CSV data into a YAML format, specifying a custom column for the timestamp results in a ValueError. The exception is raised within the pandas library, specifically at the following location:
.pyenv/versions/graphrag/lib/python3.10/site-packages/pandas/core/reshape/concat.py, on line 507, with the error message “No objects to concatenate”.
This issue occurs during the data input process where the CSV data is expected to be formatted according to the settings in a YAML file.
Steps to reproduce
- Set settings.yaml
  input:
    type: file # or blob
    file_type: csv # or csv
    base_dir: "input"
    file_encoding: utf-8
    file_pattern: ".*\.csv"
    timestamp_column: "event_time"
- Run
  python -m graphrag.index --root ./myFolder
- Exception raised
GraphRAG Config Used
input:
  type: file # or blob
  file_type: csv # or csv
  base_dir: "input"
  file_encoding: utf-8
  file_pattern: ".*\.csv"
  text_column: "description"
  timestamp_column: "event_time"
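One way to rule out a column-name mismatch is to compare each CSV header with the columns named in the config before running the indexer. The sketch below is only a diagnostic aid and is not part of GraphRAG: `missing_columns` is a hypothetical helper, it uses the standard library only, and it assumes comma-delimited files with a header row whose names match `file_pattern`.

```python
import csv
import re
from pathlib import Path


def missing_columns(base_dir, file_pattern, required, encoding="utf-8"):
    """Map each matching CSV file name to the required columns its header lacks.

    Hypothetical diagnostic helper for illustration; not a GraphRAG API.
    """
    pattern = re.compile(file_pattern)
    report = {}
    for path in sorted(Path(base_dir).iterdir()):
        # Only consider regular files whose name matches the configured pattern.
        if not path.is_file() or not pattern.match(path.name):
            continue
        with path.open(newline="", encoding=encoding) as handle:
            # The first row is assumed to be the header; an empty file yields [].
            header = next(csv.reader(handle), [])
        missing = [column for column in required if column not in header]
        if missing:
            report[path.name] = missing
    return report
```

With the config above, a call such as `missing_columns("input", r".*\.csv", ["description", "event_time"])` returns an empty dict when every matching file has both columns, and otherwise lists the missing names per file.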
Logs and screenshots
python -m graphrag.index --root ./ragtest_event_csv
🚀 Reading settings from ragtest_event_csv/settings.yaml
Traceback (most recent call last):
  File "/Users/brian_liang/.pyenv/versions/3.10.4/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/Users/brian_liang/.pyenv/versions/3.10.4/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/Users/brian_liang/graphrag/graphrag/index/__main__.py", line 76, in <module>
    index_cli(
  File "/Users/brian_liang/graphrag/graphrag/index/cli.py", line 161, in index_cli
    _run_workflow_async()
  File "/Users/brian_liang/graphrag/graphrag/index/cli.py", line 159, in _run_workflow_async
    asyncio.run(execute())
  File "/Users/brian_liang/.pyenv/versions/3.10.4/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "uvloop/loop.pyx", line 1517, in uvloop.loop.Loop.run_until_complete
  File "/Users/brian_liang/graphrag/graphrag/index/cli.py", line 123, in execute
    async for output in run_pipeline_with_config(
  File "/Users/brian_liang/graphrag/graphrag/index/run.py", line 144, in run_pipeline_with_config
    dataset = dataset if dataset is not None else await _create_input(config.input)
  File "/Users/brian_liang/graphrag/graphrag/index/run.py", line 133, in _create_input
    return await load_input(config, progress_reporter, root_dir)
  File "/Users/brian_liang/graphrag/graphrag/index/input/load_input.py", line 81, in load_input
    results = await loader(config, progress, storage)
  File "/Users/brian_liang/graphrag/graphrag/index/input/csv.py", line 135, in load
    result = pd.concat(files_loaded)
  File "/Users/brian_liang/.pyenv/versions/graphrag/lib/python3.10/site-packages/pandas/core/reshape/concat.py", line 382, in concat
    op = _Concatenator(
  File "/Users/brian_liang/.pyenv/versions/graphrag/lib/python3.10/site-packages/pandas/core/reshape/concat.py", line 445, in __init__
    objs, keys = self._clean_keys_and_objs(objs, keys)
  File "/Users/brian_liang/.pyenv/versions/graphrag/lib/python3.10/site-packages/pandas/core/reshape/concat.py", line 507, in _clean_keys_and_objs
    raise ValueError("No objects to concatenate")
ValueError: No objects to concatenate
⠋ GraphRAG Indexer
└──Loading Input (csv) - 1 files loaded (0 filtered) ━━━━ 100% 0:0… 0:0…
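pandas raises this ValueError whenever `pd.concat` receives an empty sequence, so the traceback suggests that `files_loaded` was empty when `result = pd.concat(files_loaded)` ran. The log does not show why the list was empty. As a minimal sketch of how the failure could be reported more clearly, the hypothetical wrapper below (`concat_loaded` is not GraphRAG code, and its error text is an assumption) checks for the empty case first.

```python
import pandas as pd


def concat_loaded(files_loaded):
    """Concatenate loaded frames, with a descriptive error when none were loaded.

    Hypothetical wrapper for illustration; not part of GraphRAG.
    """
    if not files_loaded:
        # Without this guard, pd.concat([]) raises "No objects to concatenate".
        raise ValueError(
            "No CSV files were loaded; check file_pattern and the configured "
            "column names (text_column, timestamp_column)."
        )
    return pd.concat(files_loaded)
```

A guard like this does not fix the underlying cause; it only points the reader at the input settings instead of at pandas internals.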
Additional Information
- GraphRAG Version: 0.1.1
- Operating System: MACOS 14.5
- Python Version: 3.10.4
- Related Issues: