ROADMAP 2024

Open writinwaters opened this issue 1 year ago • 12 comments

v0.9.0

  • [ ] Support KG extraction, mind map generation.
  • [x] #1739
  • [x] Support parsing audio files. #1514
  • [x] Support XInference rerank model #1455
  • [x] Support Gemini/Groq/NVIDIA/OpenRouter/LM Studio #1036 #1432 #1602

v0.8.0

  • [x] Support RAG graph orchestration and workflow #918
  • [x] Support Self RAG #1069
  • [x] Support several operators for initial agent/workflow

v0.7.0

  • [x] Implements RAPTOR for better chunking. #882
  • [x] Supports ARM platform. #842
  • [x] Supports HTML file.
  • [x] Integrates reranker.

v0.6.0

  • [x] Print the version or commit ID when RAGFlow starts, or show this information in the UI. #643
  • [x] Chunks retrieval APIs #821
  • [x] Files in knowledge base should also be found in file manager. #800
  • [x] System components monitoring. #848
  • [x] Supports simple document layout to speed up file parsing. #799
  • [x] Streaming conversation output. #709
  • [x] The default language follows the browser setting and can also be configured. #801

Long-term plan

  • [ ] RAGFlow documents #720
  • [ ] APIs #1102

writinwaters avatar Mar 28 '24 06:03 writinwaters

INFO:werkzeug:WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
 * Running on all addresses (0.0.0.0)
 * Running on http://127.0.0.1:9380
 * Running on http://172.18.0.6:9380
INFO:werkzeug:Press CTRL+C to quit
tsr.onnx: 100%|██████████| 12.2M/12.2M [00:01<00:00, 11.3MB/s]
layout.manual.onnx: 100%|██████████| 12.2M/12.2M [00:01<00:00, 8.83MB/s]
layout.paper.onnx: 100%|██████████| 12.2M/12.2M [00:01<00:00, 11.6MB/s]
layout.onnx: 100%|██████████| 12.2M/12.2M [00:01<00:00, 7.65MB/s]
Fetching 9 files: 100%|██████████| 9/9 [00:06<00:00, 1.42it/s]
Fetching 9 files: 100%|██████████| 9/9 [00:06<00:00, 1.39it/s]
INFO:werkzeug:172.18.0.1 - - [07/Apr/2024 11:35:22] "GET / HTTP/1.1" 200 -
INFO:werkzeug:172.18.0.1 - - [07/Apr/2024 11:35:22] "GET /favicon.ico HTTP/1.1" 200 -
WARNING:root:Realtime synonym is disabled, since no redis connection.

ptdaxiake avatar Apr 07 '24 03:04 ptdaxiake

v0.1.0

  • [x] URL support: Capable of web crawling and the corresponding content extraction. @KevinHuSh

@KevinHuSh Hi, I wonder what the current status is? Maybe I can collaborate on it. My tech stack is Python backend, and I think web crawling is essential for software development workflows.

Umpire2018 avatar Apr 10 '24 02:04 Umpire2018

In my understanding, crawling web pages is a big undertaking, and we do not have a clear picture of it yet. Here are some important points to note:

  • Crawling task dispatching
  • How to execute JS on the page
  • Page classification, which is related to the structure of the data we store
  • Extraction of the main parts of the page

If you have any good solutions to these points, please let me know.
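To make the last point (extracting the main parts of a page) more concrete, here is a minimal sketch using only Python's standard library. It is not RAGFlow code: the class `BlockCollector`, the function `extract_main_text`, the tag lists, and the `min_words` threshold are all assumptions chosen for illustration. It also does not execute JS, so it only applies to HTML that has already been rendered.

```python
from html.parser import HTMLParser

# Assumption: text inside these tags is rarely part of the main content.
SKIP_TAGS = {"script", "style", "nav", "header", "footer", "aside"}
# Assumption: these tags delimit blocks of readable text.
BLOCK_TAGS = {"p", "div", "article", "section", "li", "h1", "h2", "h3"}


class BlockCollector(HTMLParser):
    """Collects text blocks, ignoring anything nested inside SKIP_TAGS."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0   # > 0 while inside a skipped tag
        self.blocks = []      # finished text blocks
        self.current = []     # pieces of the block being built

    def flush(self):
        # Normalize whitespace and store the block if it is non-empty.
        text = " ".join("".join(self.current).split())
        if text:
            self.blocks.append(text)
        self.current = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1
        elif tag in BLOCK_TAGS:
            self.flush()

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1
        elif tag in BLOCK_TAGS:
            self.flush()

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.current.append(data)


def extract_main_text(html, min_words=5):
    """Return text blocks with at least min_words words (a crude heuristic)."""
    parser = BlockCollector()
    parser.feed(html)
    parser.close()
    parser.flush()
    return [b for b in parser.blocks if len(b.split()) >= min_words]
```

A word-count threshold is a deliberately crude stand-in for real boilerplate detection; a production crawler would likely need density or layout features, plus a headless browser for JS-heavy pages.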

KevinHuSh avatar Apr 11 '24 05:04 KevinHuSh

@KevinHuSh Please refer to #315

Maybe I can start contributing to the project with AWS Bedrock model support, then add XInference as a model provider. Feel free to contact me here or on WeChat.

Umpire2018 avatar Apr 11 '24 06:04 Umpire2018

@writinwaters @KevinHuSh Hi, could you take a look at the issue I created? https://github.com/infiniflow/ragflow/issues/345

Fixing these issues might help us adopt RAGFlow more easily.

tvvignesh avatar Apr 13 '24 12:04 tvvignesh

Not supporting streaming really affects the user experience. I hope it can be supported soon, as the implementation is not complicated.

dashi6174 avatar May 10 '24 02:05 dashi6174

v0.6.0

Can it provide users with answers that are both accurate and quick? One depends on accuracy, the other on response speed.

Miki-lin avatar May 10 '24 12:05 Miki-lin

Please file a new issue so we can discuss it there.

JinHai-CN avatar May 10 '24 13:05 JinHai-CN

Are there any plans to use large language models for knowledge graphs?

xs818818 avatar May 11 '24 03:05 xs818818

We've been thinking about this for a while, but haven't figured out how to implement it in RAGFlow. If you have any good ideas, feel free to create a new issue and we'll discuss it!

JinHai-CN avatar May 11 '24 04:05 JinHai-CN

Can it automatically continue when it says 'Due to length...'? The current handling of length issues feels very rudimentary.
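A rough sketch of what automatic continuation could look like, assuming the model call reports why it stopped. Nothing here is RAGFlow's actual API: `generate_with_continuation`, the `generate` callable, and the `"length"` finish reason are hypothetical names used only to illustrate the idea.

```python
def generate_with_continuation(generate, prompt, max_rounds=5):
    """Call the model repeatedly while its output is cut off by length.

    `generate` is a hypothetical callable: it takes the text so far and
    returns (new_text, finish_reason), where finish_reason == "length"
    is assumed to mean the output was truncated.
    """
    parts = []
    context = prompt
    for _ in range(max_rounds):  # cap the rounds to avoid looping forever
        text, finish_reason = generate(context)
        parts.append(text)
        if finish_reason != "length":
            break
        # Feed everything produced so far back in so the model can resume.
        context = prompt + "".join(parts)
    return "".join(parts)
```

The `max_rounds` cap is a design choice to bound cost; whether resuming from the concatenated text yields a clean continuation depends on the model and is not guaranteed.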

AAlexDing avatar May 11 '24 11:05 AAlexDing

Postponing the reranker configuration feature request to v0.7.0.

JinHai-CN avatar May 20 '24 11:05 JinHai-CN

An existing service already does this job pretty well. It could be worth looking into for crawling website content: https://www.firecrawl.dev/

wukimidaire avatar Aug 12 '24 10:08 wukimidaire

We will look into this website.

JinHai-CN avatar Aug 12 '24 11:08 JinHai-CN