
[Bug]:

Open Unaiideko opened this issue 9 months ago • 0 comments

Self Checks

  • [x] I have searched for existing issues, including closed ones.
  • [x] I confirm that I am using English to submit this report (Language Policy).
  • [x] Please do not modify this template :) and fill in all the required fields.

RAGFlow workspace code commit ID

....

RAGFlow image version

v0.17.0

Other environment information

I am using a Docker container environment deployed on an AWS EC2 virtual machine.

Actual behavior

Document parsing feature 'Resume' not available when creating a KB via API.

Chunk methods documented for the parser when creating a new KB via the API:

parser_config
The parser configuration of the dataset. A ParserConfig object's attributes vary based on the selected chunk_method:

chunk_method="naive":
{"chunk_token_num":128,"delimiter":"\\n!?;。;!?","html4excel":False,"layout_recognize":True,"raptor":{"user_raptor":False}}.
chunk_method="qa":
{"raptor": {"user_raptor": False}}
chunk_method="manual":
{"raptor": {"user_raptor": False}}
chunk_method="table":
None
chunk_method="paper":
{"raptor": {"user_raptor": False}}
chunk_method="book":
{"raptor": {"user_raptor": False}}
chunk_method="laws":
{"raptor": {"user_raptor": False}}
chunk_method="picture":
None
chunk_method="presentation":
{"raptor": {"user_raptor": False}}
chunk_method="one":
None
chunk_method="knowledge-graph":
{"chunk_token_num":128,"delimiter":"\\n!?;。;!?","entity_types":["organization","person","location","event","time"]}
chunk_method="email":
None
Returns
Success: A dataset object.
Failure: Exception
Examples

Possibilities for the parser configuration when creating a KB via the user interface:

Chunk method: Resume (and all the ones above)

So this exception is raised when trying to create a KB via the API with 'resume' as the chunk method: Exception: 'resume' is not in ['naive', 'manual', 'qa', 'table', 'paper', 'book', 'laws', 'presentation', 'picture', 'one', 'knowledge_graph', 'email', 'tag']
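For anyone hitting the same error, here is a minimal client-side sketch of the check. It assumes the API simply tests membership in the list shown in the exception message. The names `ALLOWED_CHUNK_METHODS` and `is_api_creatable` are hypothetical and not part of the RAGFlow SDK.

```python
# Allowed values copied verbatim from the exception message in this report.
# Assumption: the server rejects any chunk_method that is not in this list.
ALLOWED_CHUNK_METHODS = [
    "naive", "manual", "qa", "table", "paper", "book", "laws",
    "presentation", "picture", "one", "knowledge_graph", "email", "tag",
]


def is_api_creatable(chunk_method: str) -> bool:
    """Hypothetical helper: True if the value appears in the allowed list."""
    return chunk_method in ALLOWED_CHUNK_METHODS


if __name__ == "__main__":
    for method in ("naive", "resume", "knowledge-graph"):
        status = "accepted" if is_api_creatable(method) else "rejected"
        print(f"{method}: {status}")
```

Under this assumption 'resume' is rejected even though the UI offers it, and the hyphenated spelling 'knowledge-graph' from the quoted documentation would be rejected as well, since the list only contains 'knowledge_graph'.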

Expected behavior

The 'resume' chunk method should be supported by the API, so that a KB can be created with the resume parser configuration just as in the user interface.

Steps to reproduce

# rag_object is an already initialized RAGFlow SDK client.
dataset_config = {
    "avatar": "",
    "description": "",
    "embedding_model": "",
    "language": "English",
    "permission": "me",
    "chunk_method": "resume",  # the value rejected by the API
    "parser_config": DataSet.ParserConfig(
        rag=True,
        res_dict={
            "chunk_token_num": 128,
            "delimiter": "\\n!?;。;!?",
            "html4excel": False,
            "layout_recognize": True,
            "raptor": {"user_raptor": False},
        },
    ),
}

dataset = rag_object.create_dataset(dataset_config)

Additional information

No response

Unaiideko, Mar 11 '25 08:03