Postgres support added to Autogen Studio

Open m-carter1 opened this issue 1 year ago • 8 comments

Why are these changes needed?

Currently, Autogen Studio stores its data in a sqlite database file on the local filesystem. This has several limitations; for example, when deploying to a container, the database file may not be persisted.

This PR lets users connect to a Postgres database, configured via environment variables. If no Postgres configuration is set, Autogen Studio falls back to the default sqlite database.

I changed the DBManager to be an interface so more database types can be added in the future.
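As a rough illustration of the interface idea (not the code in this PR), here is a minimal sketch. The environment variable names `AGS_DB_TYPE` and `AGS_SQLITE_PATH`, the `SqliteDBManager` class, the `get_db_manager` factory, and the default file name are all hypothetical, and the Postgres branch is left as a stub.

```python
import os
import sqlite3
from abc import ABC, abstractmethod


class DBManager(ABC):
    """Interface that every database backend implements."""

    @abstractmethod
    def query(self, sql: str, args: tuple = ()) -> list:
        """Run a statement and return any resulting rows as dictionaries."""


class SqliteDBManager(DBManager):
    """Default backend: a sqlite database stored at `path`."""

    def __init__(self, path: str = "database.sqlite"):
        self.conn = sqlite3.connect(path)
        self.conn.row_factory = sqlite3.Row  # allows dict(row) below

    def query(self, sql: str, args: tuple = ()) -> list:
        with self.conn:  # commits on success, rolls back on error
            cursor = self.conn.execute(sql, args)
            return [dict(row) for row in cursor.fetchall()]


def get_db_manager(env=None) -> DBManager:
    """Pick a backend from environment variables (names are assumptions)."""
    env = os.environ if env is None else env
    db_type = env.get("AGS_DB_TYPE", "sqlite").lower()
    if db_type == "sqlite":
        return SqliteDBManager(env.get("AGS_SQLITE_PATH", "database.sqlite"))
    if db_type == "postgres":
        # A Postgres-backed DBManager would be constructed here.
        raise NotImplementedError("Postgres backend omitted from this sketch")
    raise ValueError(f"Unsupported database type: {db_type!r}")
```

Callers depend only on `DBManager`, so adding another backend means adding one subclass and one branch in the factory.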

Checks

  • [x] I've included any doc changes needed for https://microsoft.github.io/autogen/. See https://microsoft.github.io/autogen/docs/Contribute#documentation to build and test documentation locally.
  • [ ] I've added tests (if relevant) corresponding to the changes introduced in this PR.
  • [ ] I've made sure all auto checks have passed.

m-carter1 avatar Jan 27 '24 15:01 m-carter1

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Comparison is base (1ab2354) 32.48% compared to head (66bdf91) 32.48%.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1429   +/-   ##
=======================================
  Coverage   32.48%   32.48%           
=======================================
  Files          41       41           
  Lines        4907     4907           
  Branches     1120     1120           
=======================================
  Hits         1594     1594           
  Misses       3187     3187           
  Partials      126      126           
Flag Coverage Δ
unittests 32.44% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.


codecov-commenter avatar Jan 27 '24 16:01 codecov-commenter

Hi @m-carter1 ,

Thanks so much for this! It paves the way to much better db support in AutoGen Studio. One early comment: I am wondering whether we should merge this into the autogenstudio branch (the dev/feature branch for autogenstudio, which is ahead of main), per the ags contribution guide.

Do you want to take a pass at this? Otherwise, I can try to do it too. Let me know.

@victordibia I've had a go at merging: PR 1446

m-carter1 avatar Jan 29 '24 11:01 m-carter1

@m-carter1 ,

Thanks. However, that merge seems to have a few issues:

  • It seems it is not based on the autogenstudio branch (one way to do this would be to clone the autogenstudio branch and then add/test your changes on top)
  • It seems to be touching many files outside of the samples/autogenstudio folder. In general, we want zero touch points outside of this folder for autogenstudio related PRs.

Given the above, I will review and move this towards merging in main and then handle the integration with the autogenstudio branch :). For future PRs, we can start with autogenstudio!

I'll finish my review and update you shortly!

victordibia avatar Jan 29 '24 23:01 victordibia

@victordibia ah ok, my bad. Yes, I just merged this branch (based on main) with the autogenstudio branch.

I will use autogenstudio branch for any future changes :)

m-carter1 avatar Jan 30 '24 08:01 m-carter1

@m-carter1 can you also add connection pooling to make it more robust? Also, there is a bug in the timestamp conversion in the get_gallery method; please check the highlighted code below.

    for row in result:
        if isinstance(row.get('timestamp'), datetime):
            row['timestamp'] = row['timestamp'].isoformat()
        gallery_item = Gallery(
            id=row["id"],
            session=Session(**json.loads(row["session"])),
            messages=[Message(**message) for message in json.loads(row["messages"])],
            tags=json.loads(row["tags"]),
            timestamp=row["timestamp"],
        )
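
For the connection pooling part, a minimal sketch of the idea using only the standard library is below. The `ConnectionPool` class is hypothetical and sqlite3 stands in for the database driver so the example runs anywhere; with Postgres, one would more likely rely on the driver's own pool (for example `psycopg2.pool.SimpleConnectionPool`) than write one by hand.

```python
import queue
import sqlite3
from contextlib import contextmanager


class ConnectionPool:
    """Fixed-size pool: connections are opened once and reused."""

    def __init__(self, connect, size=4):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(connect())

    @contextmanager
    def connection(self, timeout=5.0):
        # Blocks until a connection is free; raises queue.Empty on timeout.
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            self._idle.put(conn)  # always hand the connection back


# sqlite3 is only a stand-in for the real driver in this sketch.
pool = ConnectionPool(
    lambda: sqlite3.connect(":memory:", check_same_thread=False), size=2
)
with pool.connection() as conn:
    value = conn.execute("SELECT 1").fetchone()[0]
```

Each request borrows a connection for the duration of the `with` block instead of opening a new one per query.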

ashish31negi avatar Feb 11 '24 15:02 ashish31negi

Hi @m-carter1 , all,

Just to revisit this PR. First of all, thanks for contributing this PR @m-carter1; it is a step towards improving the AGS backend API. The ideas here are related to a few other issues that we are all discussing, several of them consolidated in #1694.

  • [ ] Need to link entities in the db for better UX and to enforce protections, etc.
  • [ ] Integrate an ORM like SQLAlchemy / SQLModel for broader db backend support
  • [ ] Better serialization of data
  • [ ] Improved API specs

I'll update this as progress begins.

victordibia avatar Mar 14 '24 19:03 victordibia

Hi @m-carter1, I modified AutogenStudio's code myself to use PostgreSQL instead of sqlite. One thing to note is that PostgreSQL is stricter with JSON structures than SQLite. For instance, in the workflows table, the sender and receiver fields may fail during retrieval if the JSON deviates even slightly from valid syntax as it grows more complex (single vs. double quotes, newlines, symbols that are not properly escaped, etc.). This works fine with sqlite but not with PostgreSQL, at least in my case. To fix it, you might have to touch the upsert_ functions in dbutils to make sure that what you insert is 100% clean, well-formed JSON. Hope it helps.
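
One way to do that cleaning step, sketched with the standard library only: round-trip every value through `json` before it reaches the database, so malformed input fails early in Python rather than at retrieval time. The helper name `to_clean_json` is hypothetical and is not part of dbutils.

```python
import json


def to_clean_json(value):
    """Return strict JSON text for `value`, or raise ValueError.

    Strings and bytes are treated as JSON text and parsed first, so input
    such as "{'a': 1}" (single quotes) is rejected instead of being stored.
    """
    if isinstance(value, (str, bytes)):
        value = json.loads(value)  # json.JSONDecodeError is a ValueError
    # allow_nan=False rejects NaN/Infinity, which are not valid JSON.
    return json.dumps(value, ensure_ascii=False, allow_nan=False)


# Quotes and newlines inside values are escaped by json.dumps.
sender = {"type": "userproxy", "config": {"name": "user's\nproxy"}}
clean = to_clean_json(sender)
```

The resulting string can then be passed as a bound query parameter, which avoids building the JSON into the SQL text by hand.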

ShaneYuTH avatar Mar 15 '24 20:03 ShaneYuTH

⚠️ GitGuardian has uncovered 96 secrets following the scan of your pull request.

Please consider investigating the findings and remediating the incidents. Failure to do so may lead to compromising the associated services or software components.

Since your pull request originates from a forked repository, GitGuardian is not able to associate the secrets uncovered with secret incidents on your GitGuardian dashboard. Skipping this check run and merging your pull request will create secret incidents on your GitGuardian dashboard.

🔎 Detected hardcoded secrets in your pull request
GitGuardian id GitGuardian status Secret Commit Filename
12853598 Triggered Generic High Entropy Secret 79dbb7bc2561713bc11225849e408dc74db1228f test/oai/test_utils.py View secret
10404693 Triggered Generic High Entropy Secret e43a86c78f3f947b6e142b3aaf36e7a9852f7078 test/oai/test_utils.py View secret
10404693 Triggered Generic High Entropy Secret bdb40d77d7be9ea42e5fa28f6e851edf549ef0af test/oai/test_utils.py View secret
10404693 Triggered Generic High Entropy Secret 954ca451f949a4924578bb7e3e93c97ad4ba1dd5 test/oai/test_utils.py View secret
10404693 Triggered Generic High Entropy Secret 79dbb7bc2561713bc11225849e408dc74db1228f test/oai/test_utils.py View secret
10404662 Triggered Generic CLI Secret eff19acf1365e34fe17d9ac0939666f32b3ceda5 .github/workflows/dotnet-release.yml View secret
10404662 Triggered Generic CLI Secret 06a0a5ddb39fb3000e40cb4872741dead9fcb7e0 .github/workflows/dotnet-release.yml View secret
10404662 Triggered Generic CLI Secret 0524c774dda696855e9a8f87fe0b9ed96cce13f6 .github/workflows/dotnet-release.yml View secret
10404662 Triggered Generic CLI Secret d7ea410501cb96bc97a203a0a95431541515f9cc .github/workflows/dotnet-release.yml View secret
10404662 Triggered Generic CLI Secret e43a86c78f3f947b6e142b3aaf36e7a9852f7078 .github/workflows/dotnet-build.yml View secret
10404662 Triggered Generic CLI Secret 841ed315e2f19d79a6b86ed587eb6e0fc4a0c0da .github/workflows/dotnet-release.yml View secret
10404662 Triggered Generic CLI Secret 802f099588bedf1d022b2bba5fb534635df8e6f1 .github/workflows/dotnet-release.yml View secret
10404662 Triggered Generic CLI Secret 9a484d8589761d58940e3c0b79215204ce6a23c1 .github/workflows/dotnet-build.yml View secret
10404662 Triggered Generic CLI Secret e973ac38ea4f7b36687ea03aa44f770d7e2ddcac .github/workflows/dotnet-release.yml View secret
10404662 Triggered Generic CLI Secret 89650e74f572bbc60e9c24e04b4c601616e439c7 .github/workflows/dotnet-release.yml View secret
10404662 Triggered Generic CLI Secret e07b06bc939353dee2afc6dd52c159d6cb3428d5 .github/workflows/dotnet-release.yml View secret
10404662 Triggered Generic CLI Secret abe4c419c45a244e2399508f9ffef606ce6a4685 .github/workflows/dotnet-build.yml View secret
10404662 Triggered Generic CLI Secret 7362fb9cf4f0a6b3e7ae4e97229afad655d5d87b .github/workflows/dotnet-release.yml View secret
12853599 Triggered Generic High Entropy Secret 79dbb7bc2561713bc11225849e408dc74db1228f test/oai/test_utils.py View secret
10404694 Triggered Generic High Entropy Secret e43a86c78f3f947b6e142b3aaf36e7a9852f7078 test/oai/test_utils.py View secret
10404694 Triggered Generic High Entropy Secret 954ca451f949a4924578bb7e3e93c97ad4ba1dd5 test/oai/test_utils.py View secret
10404694 Triggered Generic High Entropy Secret bdb40d77d7be9ea42e5fa28f6e851edf549ef0af test/oai/test_utils.py View secret
10404695 Triggered Generic High Entropy Secret abad9ff4487444324d1916a6b94c8049b4cab9e7 test/oai/test_utils.py View secret
10404695 Triggered Generic High Entropy Secret 954ca451f949a4924578bb7e3e93c97ad4ba1dd5 test/oai/test_utils.py View secret
10404695 Triggered Generic High Entropy Secret c7bb588a684a038dbee7f9dc9afd6d44ce35ac3a test/oai/test_utils.py View secret
10404695 Triggered Generic High Entropy Secret b97b99d4b2cfe735a7aa46258508d4a7cda9cad5 test/oai/test_utils.py View secret
10404695 Triggered Generic High Entropy Secret e43a86c78f3f947b6e142b3aaf36e7a9852f7078 test/oai/test_utils.py View secret
12853600 Triggered Generic High Entropy Secret 79dbb7bc2561713bc11225849e408dc74db1228f test/oai/test_utils.py View secret
12853601 Triggered Generic High Entropy Secret 79dbb7bc2561713bc11225849e408dc74db1228f test/oai/test_utils.py View secret
10493810 Triggered Generic Password 49e8053dd1e5456d3758b4a85f5721e9c9b12e16 notebook/agentchat_pgvector_RetrieveChat.ipynb View secret
10493810 Triggered Generic Password 501610b4fc0c649fa2b1cbf0fd72fa6f14f026d0 notebook/agentchat_pgvector_RetrieveChat.ipynb View secret
10493810 Triggered Generic Password 49e8053dd1e5456d3758b4a85f5721e9c9b12e16 notebook/agentchat_pgvector_RetrieveChat.ipynb View secret
10493810 Triggered Generic Password 501610b4fc0c649fa2b1cbf0fd72fa6f14f026d0 notebook/agentchat_pgvector_RetrieveChat.ipynb View secret
10493810 Triggered Generic Password d422c63596fce7b84e0ef8a7bd6518d5f4336eaf notebook/agentchat_pgvector_RetrieveChat.ipynb View secret
10493810 Triggered Generic Password 97fa339ac749b3c7d51f1b0fae156d41c02b214e notebook/agentchat_pgvector_RetrieveChat.ipynb View secret
10493810 Triggered Generic Password 49e8053dd1e5456d3758b4a85f5721e9c9b12e16 notebook/agentchat_pgvector_RetrieveChat.ipynb View secret
10493810 Triggered Generic Password d422c63596fce7b84e0ef8a7bd6518d5f4336eaf notebook/agentchat_pgvector_RetrieveChat.ipynb View secret
10493810 Triggered Generic Password 97fa339ac749b3c7d51f1b0fae156d41c02b214e notebook/agentchat_pgvector_RetrieveChat.ipynb View secret
10493810 Triggered Generic Password d422c63596fce7b84e0ef8a7bd6518d5f4336eaf notebook/agentchat_pgvector_RetrieveChat.ipynb View secret
10493810 Triggered Generic Password 97fa339ac749b3c7d51f1b0fae156d41c02b214e notebook/agentchat_pgvector_RetrieveChat.ipynb View secret
10493810 Triggered Generic Password 501610b4fc0c649fa2b1cbf0fd72fa6f14f026d0 notebook/agentchat_pgvector_RetrieveChat.ipynb View secret
10404696 Triggered Generic High Entropy Secret 954ca451f949a4924578bb7e3e93c97ad4ba1dd5 test/oai/test_utils.py View secret
10404696 Triggered Generic High Entropy Secret bdb40d77d7be9ea42e5fa28f6e851edf549ef0af test/oai/test_utils.py View secret
10404696 Triggered Generic High Entropy Secret 79dbb7bc2561713bc11225849e408dc74db1228f test/oai/test_utils.py View secret
10404696 Triggered Generic High Entropy Secret e43a86c78f3f947b6e142b3aaf36e7a9852f7078 test/oai/test_utils.py View secret
10422482 Triggered Generic High Entropy Secret 79dbb7bc2561713bc11225849e408dc74db1228f test/oai/test_utils.py View secret
10422482 Triggered Generic High Entropy Secret bdb40d77d7be9ea42e5fa28f6e851edf549ef0af test/oai/test_utils.py View secret
12853602 Triggered Generic High Entropy Secret 79dbb7bc2561713bc11225849e408dc74db1228f test/oai/test_utils.py View secret
11616921 Triggered Generic High Entropy Secret a86d0fde2e667f9177eea55da17312b770c9d76b notebook/agentchat_agentops.ipynb View secret
11616921 Triggered Generic High Entropy Secret 394561b4629222c19d8bb3bc58222fc8813a5833 notebook/agentchat_agentops.ipynb View secret
11616921 Triggered Generic High Entropy Secret 3eac646b8974e1d1be3fde557b055563f56f2f5f notebook/agentchat_agentops.ipynb View secret
11616921 Triggered Generic High Entropy Secret f45b55337aa7f9d6151f21f2c55ba5dc83f95b79 notebook/agentchat_agentops.ipynb View secret
11616921 Triggered Generic High Entropy Secret 65632487224aae2c7faca67de40fd2bda3ad3905 notebook/agentchat_agentops.ipynb View secret
12853598 Triggered Generic High Entropy Secret 2b3a9ae05569d38cfc89d9daa4b1612b8406d178 test/oai/test_utils.py View secret
12853598 Triggered Generic High Entropy Secret c03558f5b2708c31df36666486f54b2714435162 test/oai/test_utils.py View secret
10404693 Triggered Generic High Entropy Secret c03558f5b2708c31df36666486f54b2714435162 test/oai/test_utils.py View secret
10404693 Triggered Generic High Entropy Secret 2b3a9ae05569d38cfc89d9daa4b1612b8406d178 test/oai/test_utils.py View secret
10404693 Triggered Generic High Entropy Secret 0a3c6c49834b4f68c688c1d2e76a5ebb8f7d91e2 test/oai/test_utils.py View secret
10404693 Triggered Generic High Entropy Secret 76f5f5a66532a3b95d03fe0ae8a56a59e43012e1 test/oai/test_utils.py View secret
10404662 Triggered Generic CLI Secret 954ca451f949a4924578bb7e3e93c97ad4ba1dd5 .github/workflows/dotnet-build.yml View secret
12853599 Triggered Generic High Entropy Secret 2b3a9ae05569d38cfc89d9daa4b1612b8406d178 test/oai/test_utils.py View secret
12853599 Triggered Generic High Entropy Secret c03558f5b2708c31df36666486f54b2714435162 test/oai/test_utils.py View secret
10404694 Triggered Generic High Entropy Secret 76f5f5a66532a3b95d03fe0ae8a56a59e43012e1 test/oai/test_utils.py View secret
10404694 Triggered Generic High Entropy Secret 0a3c6c49834b4f68c688c1d2e76a5ebb8f7d91e2 test/oai/test_utils.py View secret
10404695 Triggered Generic High Entropy Secret 3b79cc6fdfa485026262ec44aa5f1f0238bf7c3d test/oai/test_utils.py View secret
10404695 Triggered Generic High Entropy Secret 11baa52155c3846299d39ed0b5a814b33b0eb671 test/oai/test_utils.py View secret
12853600 Triggered Generic High Entropy Secret c03558f5b2708c31df36666486f54b2714435162 test/oai/test_utils.py View secret
12853600 Triggered Generic High Entropy Secret 2b3a9ae05569d38cfc89d9daa4b1612b8406d178 test/oai/test_utils.py View secret
12853601 Triggered Generic High Entropy Secret c03558f5b2708c31df36666486f54b2714435162 test/oai/test_utils.py View secret
12853601 Triggered Generic High Entropy Secret 2b3a9ae05569d38cfc89d9daa4b1612b8406d178 test/oai/test_utils.py View secret
10493810 Triggered Generic Password 3b79cc6fdfa485026262ec44aa5f1f0238bf7c3d notebook/agentchat_pgvector_RetrieveChat.ipynb View secret
10493810 Triggered Generic Password 11baa52155c3846299d39ed0b5a814b33b0eb671 notebook/agentchat_pgvector_RetrieveChat.ipynb View secret
10493810 Triggered Generic Password 11baa52155c3846299d39ed0b5a814b33b0eb671 notebook/agentchat_pgvector_RetrieveChat.ipynb View secret
10493810 Triggered Generic Password 3b79cc6fdfa485026262ec44aa5f1f0238bf7c3d notebook/agentchat_pgvector_RetrieveChat.ipynb View secret
10404696 Triggered Generic High Entropy Secret 0a3c6c49834b4f68c688c1d2e76a5ebb8f7d91e2 test/oai/test_utils.py View secret
10404696 Triggered Generic High Entropy Secret 76f5f5a66532a3b95d03fe0ae8a56a59e43012e1 test/oai/test_utils.py View secret
10404696 Triggered Generic High Entropy Secret c03558f5b2708c31df36666486f54b2714435162 test/oai/test_utils.py View secret
10404696 Triggered Generic High Entropy Secret 2b3a9ae05569d38cfc89d9daa4b1612b8406d178 test/oai/test_utils.py View secret
10422482 Triggered Generic High Entropy Secret 2b3a9ae05569d38cfc89d9daa4b1612b8406d178 test/oai/test_utils.py View secret
10422482 Triggered Generic High Entropy Secret c03558f5b2708c31df36666486f54b2714435162 test/oai/test_utils.py View secret

and 16 others.

🛠 Guidelines to remediate hardcoded secrets
  1. Understand the implications of revoking this secret by investigating where it is used in your code.
  2. Replace and store your secrets safely. Learn here the best practices.
  3. Revoke and rotate these secrets.
  4. If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.

To avoid such incidents in the future consider


🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.

gitguardian[bot] avatar Jul 20 '24 21:07 gitguardian[bot]