[Question]: Parsing method 'Table' failed to parse xlsx
Describe your problem
The error is as follows:
Traceback (most recent call last):
File "/ragflow/rag/svr/task_executor.py", line 146, in build
cks = chunker.chunk(row["name"], binary=binary, from_page=row["from_page"],
File "/ragflow/rag/app/table.py", line 141, in chunk
dfs = excel_parser(
File "/ragflow/rag/app/table.py", line 66, in __call__
res.append(pd.DataFrame(np.array(data), columns=headers))
File "/usr/local/lib/python3.10/dist-packages/pandas/core/frame.py", line 816, in __init__
mgr = ndarray_to_mgr(
File "/usr/local/lib/python3.10/dist-packages/pandas/core/internals/construction.py", line 336, in ndarray_to_mgr
_check_values_indices_shape_match(values, index, columns)
File "/usr/local/lib/python3.10/dist-packages/pandas/core/internals/construction.py", line 420, in _check_values_indices_shape_match
raise ValueError(f"Shape of passed values is {passed}, indices imply {implied}")
ValueError: Shape of passed values is (0, 1), indices imply (0, 7)
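A note on what the traceback suggests: the shapes `(0, 1)` versus `(0, 7)` are consistent with `np.array(data)` being built from an empty list of rows while `headers` holds seven column names, for instance a sheet that has a header row but no data rows. That reading is an inference from the error message, not something confirmed in this thread. The sketch below reproduces the same kind of error with plain pandas and shows one possible guard; `rows_to_frame` and the column names are hypothetical and are not part of RAGFlow.

```python
import numpy as np
import pandas as pd


def rows_to_frame(data, headers):
    """Build a DataFrame from parsed rows (hypothetical helper, not RAGFlow code)."""
    if len(data) == 0:
        # np.array([]) is 1-D with shape (0,); pandas treats it as (0, 1),
        # which cannot match more than one header. Return an empty frame instead.
        return pd.DataFrame(columns=headers)
    return pd.DataFrame(np.array(data), columns=headers)


headers = [f"col{i}" for i in range(7)]  # seven made-up column names

# Same construction pattern as table.py line 66, fed with no rows.
try:
    pd.DataFrame(np.array([]), columns=headers)
    raised = False
except ValueError as exc:
    raised = True
    print(exc)

empty = rows_to_frame([], headers)                      # no data rows
full = rows_to_frame([[1, 2, 3, 4, 5, 6, 7]], headers)  # one data row
print(empty.shape, full.shape)
```

If the empty-sheet assumption holds, skipping or guarding such sheets before building the DataFrame would avoid the exception; whether that matches the intended behavior of the 'Table' method is for the maintainers to decide.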
Could you attach the file?
I can't share the files because they are company property. The Excel file contains a large number of sheets, and the tables have some merged cells.
Could you mock up a sample that fails?
Hello, how do you use the parse module? Is it through a command-line interface (CLI) or a deployment?
You can configure the Chunk method in the portal.
Thank you, but I just want to utilize the parse module.
Maybe you can look at the source code of Ragflow.
[Translated from Chinese] Excuse me, has this problem been solved? I've run into it as well.
@lvyoudashuju We intend to create an international community, so we encourage using English for communication.
@KevinHuSh Any updates on this issue?
@KevinHuSh This is a sample file.
This file can be parsed smoothly on the demo site.
Could you elaborate on your issue? Attach the sample file if that's okay with you.
To reproduce: create a new knowledge base, select the Table parsing method, and import an Excel file (1.xlsx, for example).
I tried it, and it seems that when there are more than two sheets in the Excel file, this issue occurs.
Even with only two sheets, only the first sheet is parsed, and the subsequent sheets are not.
The 'Table' method is intended for data dumped from a database, so you'd be better off parsing this file with 'General'.
It's not exactly as you described. I uploaded an Excel file containing 2 sheets and it was parsed successfully, but when I uploaded another Excel file containing 4 sheets, the parsing failed.
As @KevinHuSh said, the 'Table' method is intended for data dumped from a database, so you'd be better off parsing it with 'General'.
