Data Ingestion doesn't work

Open Tejaswgupta opened this issue 2 years ago • 20 comments

Duplicates

  • [X] I have searched the existing issues

Steps to reproduce 🕹

Running autogpt/data_ingestion.py fails on stable, but I got it to work by borrowing code from an open PR, which adds the following to the file:

import sys
from pathlib import Path

sys.path.append(str(Path(__file__).resolve().parent.parent))

Running:

python autogpt/data_ingestion.py --dir auto_gpt_workspace

Returns:

Using memory of type: LocalCache
Directory 'auto_gpt_workspace' ingested successfully.

I also tried inserting the contents of an older auto-gpt.json into the blank version, but it still got overwritten.

Current behavior 😯

log-ingestion.txt also remains empty after this, and the script starts from the very beginning, that is, by running a Google search.

Expected behavior 🤔

It should load the previous state from memory and continue from there?

Your prompt 📝

No response

Tejaswgupta avatar Apr 15 '23 18:04 Tejaswgupta

new PR here, you can also just move the script back to the root directory, without changing the code. https://github.com/Significant-Gravitas/Auto-GPT/pull/1679

Running: python autogpt/data_ingestion.py --dir auto_gpt_workspace

edit: ahh, sorry, I get what you are saying now. auto_gpt_workspace is the root folder for data_ingestion.py commands.

So to ingest files inside of auto_gpt_workspace it should be:

python data_ingestion.py --dir .

to search inside auto_gpt_workspace/NewFolder it would be --dir NewFolder
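
A minimal sketch of the path handling described here, assuming (as this comment suggests) that --dir and --file values are interpreted relative to auto_gpt_workspace. The resolve_in_workspace helper is hypothetical and is not part of data_ingestion.py:

```python
from pathlib import Path

# Workspace folder named in this thread; relative arguments are assumed
# to be resolved against it.
WORKSPACE = Path("auto_gpt_workspace")


def resolve_in_workspace(relative: str, workspace: Path = WORKSPACE) -> Path:
    """Hypothetical helper: join a --dir/--file value onto the workspace root."""
    return workspace / relative


print(resolve_in_workspace("."))          # the workspace folder itself
print(resolve_in_workspace("NewFolder"))  # a subfolder of the workspace
```

Under that assumption, passing --dir auto_gpt_workspace would point at a nested auto_gpt_workspace/auto_gpt_workspace folder, which may not exist.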

Let me know how that goes.

Slowly-Grokking avatar Apr 15 '23 19:04 Slowly-Grokking

Once ingested, how do you actually have AutoGPT search its memory banks or "recall" information from memory?

omikolaj avatar Apr 16 '23 01:04 omikolaj

Once ingested, how do you actually have AutoGPT search its memory banks or "recall" information from memory?

Been working on an efficient way. So far everything has failed.

Lucyan11 avatar Apr 16 '23 03:04 Lucyan11

Would love to see this work. I've also tried all the suggestions to no avail.

boktoday avatar Apr 16 '23 07:04 boktoday

Would love to see this work. I've also tried all the suggestions to no avail.

what error are you getting?

Slowly-Grokking avatar Apr 16 '23 07:04 Slowly-Grokking

I'm new to this.

  File "/usr/local/lib/python3.10/argparse.py", line 1859, in parse_known_args
    namespace, args = self._parse_known_args(args, namespace)
  File "/usr/local/lib/python3.10/argparse.py", line 2112, in _parse_known_args
    self.error(msg % ' '.join(names))
  File "/usr/local/lib/python3.10/argparse.py", line 2583, in error
    self.exit(2, _('%(prog)s: error: %(message)s\n') % args)
  File "/usr/local/lib/python3.10/argparse.py", line 2570, in exit
    _sys.exit(status)
SystemExit: 2

data_ingestion.py -h --file Home Microgreens Quick Guide.pdf --dir /my_files --init --overlap 200 --max_length 4000
  File "", line 1
    data_ingestion.py -h --file Home Microgreens Quick Guide.pdf --dir /my_files --init --overlap 200 --max_length 4000

boktoday avatar Apr 16 '23 07:04 boktoday

I was able to ingest one file using the PR #1679 (move data_ingestion.py from autogpt/ to the root dir).

The file I ingested is located at:

auto_gpt_workspace/ingredients/Almonds.txt

And the command executed:

python data_ingestion.py --file ingredients/Almonds.txt

I'm using redis as backend, and I can see the data correctly embedded and ingested.

adrianlzt avatar Apr 16 '23 09:04 adrianlzt

The ingestion isn't the problem. It is successfully ingested. The issue is instructing AutoGPT to use it.

omikolaj avatar Apr 16 '23 11:04 omikolaj

Has anyone got this to work with local memory? I'm still having issues.

ErrorModeCo avatar Apr 16 '23 15:04 ErrorModeCo

I can’t get it to use the information ingested using local memory. Has anyone got this working with a different memory store?

blazickjp avatar Apr 16 '23 15:04 blazickjp

The local memory is consulted here: https://github.com/Significant-Gravitas/Auto-GPT/blob/a91ef5695403066d5a9435ba0cee0f6186836c10/autogpt/chat.py#L82

            relevant_memory = (
                ""
                if len(full_message_history) == 0
                else permanent_memory.get_relevant(str(full_message_history[-9:]), 10)
            )

At the beginning, since there is no message history, the memory is not used.

Maybe an option is to try to access the local memory using embeddings from the goals.
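
A rough sketch of that idea, not the project's code: when the history is empty, the goals are used as the memory query instead of skipping memory altogether. The recall function and the KeywordMemory class are hypothetical; only the get_relevant(text, k) call shape is taken from the snippet quoted here, and a real backend would rank by embeddings rather than by shared words.

```python
from typing import List, Sequence


class KeywordMemory:
    """Toy stand-in for a memory backend; real backends rank by embeddings."""

    def __init__(self, texts: Sequence[str]) -> None:
        self.texts = list(texts)

    def get_relevant(self, query: str, k: int) -> List[str]:
        # Return stored texts that share at least one word with the query.
        words = set(query.lower().split())
        hits = [t for t in self.texts if words & set(t.lower().split())]
        return hits[:k]


def recall(memory, full_message_history: Sequence[dict],
           goals: Sequence[str], k: int = 10) -> List[str]:
    """Query memory from recent history, or from the goals if history is empty."""
    if full_message_history:
        query = str(full_message_history[-9:])
    else:
        query = " ".join(goals)  # assumption: goals are plain strings
    return memory.get_relevant(query, k) if query else []


memory = KeywordMemory(["almonds are rich in vitamin E", "redis stores the index"])
print(recall(memory, [], ["research almonds nutrition"]))
```

Whether goal text makes a good query for every use case is exactly the open question raised in this comment.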

Which use cases do you have? Maybe getting local memory from the goals does not work for all cases.

adrianlzt avatar Apr 16 '23 18:04 adrianlzt

data_ingestion.py -h --file Home Microgreens Quick Guide.pdf --dir /my_files --init --overlap 200 --max_length 4000

It doesn't like spaces in the name. Try putting it in quotes "Home Microgreens Quick Guide.pdf", renaming to not have spaces, or giving it a full path name with escape characters before the spaces.
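
To illustrate why the quotes matter, here is a small sketch that splits a command line the way a POSIX shell would (via shlex) and feeds the tokens to argparse. The parser below is a simplified, hypothetical stand-in that models only --file and --dir; it is not the actual parser in data_ingestion.py.

```python
import argparse
import shlex


def build_parser() -> argparse.ArgumentParser:
    # Simplified stand-in: only the two path options are modeled.
    parser = argparse.ArgumentParser(prog="data_ingestion.py")
    parser.add_argument("--file", type=str)
    parser.add_argument("--dir", type=str)
    return parser


def parse(command_line: str):
    """Split like a POSIX shell, then return (namespace, leftover tokens)."""
    tokens = shlex.split(command_line)
    return build_parser().parse_known_args(tokens)


unquoted_args, unquoted_extra = parse("--file Home Microgreens Quick Guide.pdf")
quoted_args, quoted_extra = parse('--file "Home Microgreens Quick Guide.pdf"')

print(unquoted_args.file, unquoted_extra)  # only the first word reaches --file
print(quoted_args.file, quoted_extra)      # the whole name reaches --file
```

With a stricter parse_args call, the leftover tokens from the unquoted form would be rejected as unrecognized arguments.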

I (well, hopefully someone else ;p) will look into making data_ingestion.py more flexible with paths, as this will certainly be a commonly faced issue.

Slowly-Grokking avatar Apr 16 '23 19:04 Slowly-Grokking

Has anyone got this to work with local memory? I'm still having issues.

From my understanding, auto-gpt.json gets wiped when autogpt starts. I believe that to have a chance at using ingested data, you must first start up autogpt and run data_ingestion.py in a separate terminal before submitting 'Y' for autogpt to continue.

I believe that in order to make best use of this, we will have to make a command that checks the relevant memory, as merely asking autogpt to check its memory via human feedback or goals most often has it checking read_file: "memory"

Slowly-Grokking avatar Apr 16 '23 19:04 Slowly-Grokking

Doesn't seem to work for me on local memory. Whenever I use --debug it never pulls anything in even though I can see AutoGpt.json has the embeddings in it. When I use redis it works fine

Xenoamor avatar Apr 16 '23 22:04 Xenoamor

I’ll try redis and see if that helps

blazickjp avatar Apr 16 '23 23:04 blazickjp

I'm trying Redis. I noticed the index AutoGpt is created in Redis while .env specifies MEMORY_INDEX=auto-gpt. Configuring .env to use either index and prompting for references to values in files preloaded in Redis doesn't work for me. I can execute manual Redis search commands such as FT.SEARCH AutoGpt "FRANKLIN" LIMIT 0 10, and that yields results, but autogpt is ignorant of them even when I suggest it reference its own local memory.

SetheenDev avatar Apr 16 '23 23:04 SetheenDev

Oh, the .env.template has duplicate key names for MEMORY_INDEX by default. Commenting out other memory providers.
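
A quick way to spot this kind of problem is to scan the file for keys that are assigned more than once on uncommented lines. The sketch below is a hypothetical helper, and the sample contents are invented for illustration; they are not copied from the real .env.template.

```python
from collections import Counter
from typing import List


def duplicate_env_keys(env_text: str) -> List[str]:
    """Return keys assigned more than once on active (uncommented) lines."""
    counts: Counter = Counter()
    for raw_line in env_text.splitlines():
        line = raw_line.strip()
        # Skip blanks, comments, and lines that are not KEY=VALUE pairs.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key = line.split("=", 1)[0].strip()
        counts[key] += 1
    return [key for key, n in counts.items() if n > 1]


# Invented sample: two active MEMORY_INDEX lines and one commented out.
sample = """\
EXAMPLE_KEY=1
MEMORY_INDEX=auto-gpt
# MEMORY_INDEX=commented-out
MEMORY_INDEX=another-index
"""
print(duplicate_env_keys(sample))
```

To check a real file, its text can be passed in with something like Path(".env").read_text().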

SetheenDev avatar Apr 16 '23 23:04 SetheenDev

@Tejaswgupta is it working for you now or can we close this issue?

Qoyyuum avatar Apr 17 '23 12:04 Qoyyuum

Can anyone give quick guidance on verifying the index in redis? I’ve never used redis before.

blazickjp avatar Apr 17 '23 13:04 blazickjp

  • Connect to redis by running redis-cli
  • Query for indexes by sending FT._LIST
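
The same check can be sketched from Python. The list_indexes helper below is hypothetical and simply sends the FT._LIST command through a client's execute_command method (the generic command call offered by redis-py clients). The FakeClient exists only so the snippet runs without a Redis server.

```python
from typing import Any, List


def list_indexes(client: Any) -> List[str]:
    """Send FT._LIST and return the index names as text."""
    reply = client.execute_command("FT._LIST")
    # Names may come back as bytes, depending on client settings.
    return [name.decode() if isinstance(name, bytes) else str(name) for name in reply]


class FakeClient:
    """Stand-in used only to show the call shape without a Redis server."""

    def execute_command(self, *args: str) -> list:
        assert args == ("FT._LIST",)
        return [b"AutoGpt"]  # index name mentioned earlier in this thread


print(list_indexes(FakeClient()))
```

With redis-py installed and a server that has the search module loaded, a real client such as redis.Redis(host="localhost", port=6379) could be passed in place of FakeClient; the host and port here are assumptions.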

SetheenDev avatar Apr 17 '23 17:04 SetheenDev

Oh, the .env.template has duplicate key names for MEMORY_INDEX by default. Commenting out other memory providers.

@SetheenDev https://github.com/Significant-Gravitas/Auto-GPT/issues/2191

qualeo avatar Apr 18 '23 18:04 qualeo

Any successful data ingestions with local_json (since other memory backends aren't currently supported)? I fixed a few errors myself (missing workspace_path in the config object, invalid number of arguments when calling ingest_file) and still no luck. I'm getting:

AutoGPT-Ingestion ERROR Error while ingesting directory 'data': Could not get Agent from decorated command's args

no luck with specific files either. what gives? any idea?

drora avatar Jul 10 '23 13:07 drora

This issue has automatically been marked as stale because it has not had any activity in the last 50 days. You can unstale it by commenting or removing the label. Otherwise, this issue will be closed in 10 days.

github-actions[bot] avatar Sep 06 '23 21:09 github-actions[bot]

This issue was closed automatically because it has been stale for 10 days with no activity.

github-actions[bot] avatar Sep 18 '23 01:09 github-actions[bot]