Archon: Should my crawled_pages and code_examples tables be empty?

I've just added some knowledge to my Archon instance by crawling the docs sites of Convex and Next.js. Everything seemed to process OK (pages, code examples, etc.), and both crawls completed without issue.

However, in my Supabase project both the crawled_pages and code_examples tables are empty. The sources table shows entries for Convex and Next.js, and I see things in Settings and so on.

Is this expected?

Aug 20 '25 16:08 streeyt

This is not expected. Have you refreshed the Supabase page?

Also can you check if you have any errors in the Archon-Server logs in Docker?

Aug 20 '25 16:08 Wirasm

I have the same issue. Some people said it is about not using an embedding model like gemini-embedding-001, but I have no idea how to set that up. Which provider did you use for scraping?

Aug 20 '25 17:08 mazemaster9

@streeyt

Are you using the database instance from supabase.com or a local instance?

@Wirasm I think this is related to #302.

The 'code examples' (from my testing) only work if you have an LLM API key defined on the 'Settings' page and you are using a supabase.com instance. If you are using a local Supabase instance, storing the knowledge and code examples does not work, as described in #302.

Aug 20 '25 17:08 Dandaman42

@Dandaman42

Yes, I refreshed the Supabase page numerous times.

I'm using the hosted (Supabase.com) Supabase, not local.

I have OpenAI AND Gemini API keys configured in Archon. RAG is using gemini-2.5-flash with the gemini-embedding-001 embedding model. tbh I didn't have those set up for the first two crawls I did, and the issue with the empty tables was already apparent. I now have the RAG settings configured, and the tables are still empty after subsequent crawls.

The Archon UI shows lots of crawled pages and there were plenty of code examples extracted, especially from Next.js, but still nothing in what seem to be the relevant tables @ Supabase.

Aug 20 '25 17:08 streeyt

Try changing your embedding dimension in the .env to 3072, as that's the standard for gemini-embedding-001.

go to .env
find EMBEDDING_DIMENSIONS=1536
change the value from 1536 to 3072
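
As a rough illustration of the steps above, this Python sketch parses `.env`-style text and checks whether `EMBEDDING_DIMENSIONS` matches the output size of the chosen embedding model. The helper names and the small lookup table are hypothetical and not part of Archon. The 3072 value is the one quoted above for gemini-embedding-001, and 1536 is the size OpenAI documents for text-embedding-3-small.

```python
# Hypothetical helper, not part of Archon: compare EMBEDDING_DIMENSIONS
# in .env-style text against the embedding model's output size.

# Illustrative lookup table (assumption: default output sizes per model).
KNOWN_DIMENSIONS = {
    "gemini-embedding-001": 3072,
    "text-embedding-3-small": 1536,
}


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def check_dimensions(env_text: str, model: str) -> str:
    """Return 'ok' or a short description of what looks wrong."""
    env = parse_env(env_text)
    raw = env.get("EMBEDDING_DIMENSIONS")
    if raw is None:
        return "EMBEDDING_DIMENSIONS is not set"
    expected = KNOWN_DIMENSIONS.get(model)
    if expected is None:
        return f"unknown model {model!r}; cannot verify"
    if int(raw) != expected:
        return f"mismatch: .env has {raw}, {model} produces {expected}"
    return "ok"


if __name__ == "__main__":
    print(check_dimensions("EMBEDDING_DIMENSIONS=1536\n", "gemini-embedding-001"))
```

A check like this only confirms that the two settings agree with each other; it says nothing about what size the database columns were created with.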

Let me know if that helps.

Aug 20 '25 17:08 Wirasm

Thanks but sadly that didn't help. I made the edit, restarted the Docker images and re-crawled a couple of my Knowledge entries. Everything appears to be working ok in the UI, but I never get anything written to the relevant db tables in Supabase.

Aug 20 '25 21:08 streeyt

It seems most likely that there is an issue with the dimensions. I will look into it.

Aug 21 '25 08:08 Wirasm

Probably not relevant, but I meant to mention, this is happening on two separate Archon installs for me, one on my desktop Mac and one on my MacBook. Same config and db

Aug 21 '25 08:08 streeyt

I don't know if this issue is the same as mine; however, working from a subfolder under Archon helped relieve all my RAG issues. I have Windows 11 and run Claude Code on Windows (not WSL).

Aug 21 '25 11:08 day-trading-oracle

Had the same issue. It turns out I chose the wrong embedding model for Gemini (I forgot to change it from the default to gemini-embedding-001).

Aug 21 '25 18:08 rennyS

That's strange. I'm using the same Gemini embedding model as you, and I get nothing written to the DB tables.

Aug 21 '25 18:08 streeyt

Aug 21 '25 19:08 rennyS

How can you even change models? I copy the API key from AI Studio and paste it in the UI. That one API key is for all models. How and where do you choose models? There is only a static text entry area in the UI, and I guess it's only for naming. There is no drop-down menu to choose a model.

Aug 21 '25 19:08 mazemaster9

You have to manually enter the model name.

Aug 21 '25 19:08 rennyS

I think I have the same settings that you have. The only difference I can see is that I checked "Use contextual embeddings" - could that be related?

Aug 21 '25 19:08 streeyt

I am having a similar problem. Just from a basic user experience perspective, not having this as a drop-down with known compatible configurations was a bit of an issue on initial setup.

Aug 21 '25 20:08 Aston77

Yes, that's it! After reinstalling Archon and changing the model to gemini-embedding-001, it's working for me now.

Aug 21 '25 21:08 mazemaster9

Do you have "Use contextual embeddings" checked?

Aug 21 '25 21:08 streeyt

I had 'text-embedding-001' instead of 'gemini-embedding-001'

This worked
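
Since the model name is typed by hand, a small check can catch a mistyped name before a crawl. Below is a minimal Python sketch, assuming a hand-maintained list of known embedding model names; the list and the function name are illustrative and not part of Archon.

```python
import difflib

# Illustrative list only (assumption): extend it with the models you actually use.
KNOWN_EMBEDDING_MODELS = [
    "gemini-embedding-001",
    "text-embedding-3-small",
    "text-embedding-3-large",
]


def validate_model_name(name: str) -> tuple[bool, list[str]]:
    """Return (is_known, suggestions) for a manually typed model name."""
    cleaned = name.strip()
    if cleaned in KNOWN_EMBEDDING_MODELS:
        return True, []
    # Offer the closest known names so a typo is easy to spot.
    suggestions = difflib.get_close_matches(
        cleaned, KNOWN_EMBEDDING_MODELS, n=3, cutoff=0.5
    )
    return False, suggestions


if __name__ == "__main__":
    known, hints = validate_model_name("text-embedding-001")
    print(known, hints)
```

Here 'text-embedding-001' is reported as unknown because it is not in the list, along with whichever listed names look most similar.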

Aug 21 '25 21:08 Aston77

I am facing the same issue

Aug 22 '25 15:08 Crofter777

What I have noticed is that it will silently fail if you exceed the Gemini free tier. It crawls the webpage, but it won't notify you if the embeddings aren't being created.

In some ways, it would actually be better as two separate processes:

1. Crawl, and allow for review of what was crawled at a greater depth, so that you have some insight into what depth level is needed for your purposes.
2. Then allow embeddings to be created at that depth.

As it currently stands, there really isn't a great deal of visibility into what you are embedding, outside of reviewing the table entries which itself isn't a user-friendly experience.
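
To make the suggestion above concrete, here is a minimal Python sketch of an embedding step that runs separately from crawling and reports every failure instead of hiding it. All names (`EmbeddingReport`, `embed_chunks`, `embed_fn`, `store_fn`) are hypothetical; this is not how Archon is implemented, only one way such visibility could look.

```python
from dataclasses import dataclass, field
from typing import Callable, Sequence


@dataclass
class EmbeddingReport:
    """Summary of one embedding pass over already-crawled chunks."""
    stored: int = 0
    failed: int = 0
    errors: list[str] = field(default_factory=list)


def embed_chunks(
    chunks: Sequence[str],
    embed_fn: Callable[[str], list[float]],
    store_fn: Callable[[str, list[float]], None],
) -> EmbeddingReport:
    """Embed and store each chunk, recording failures rather than dropping them."""
    report = EmbeddingReport()
    for chunk in chunks:
        try:
            vector = embed_fn(chunk)
            store_fn(chunk, vector)
            report.stored += 1
        except Exception as exc:  # e.g. quota exceeded or a dimension mismatch
            report.failed += 1
            report.errors.append(f"{type(exc).__name__}: {exc}")
    return report
```

A caller could show `report.failed` and the first few `report.errors` in the UI, so that a crawl whose embeddings were never written does not look like a success.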

Aug 23 '25 18:08 Aston77

Good point here, @Aston77.

Aug 27 '25 18:08 Wirasm

I don't want to duplicate, so I'm dropping a link. There are also the migrations/DB tables that expect a specific embedding size.

EDIT: quoting link

The .env has the environment variable EMBEDDING_DIMENSIONS, but migration/complete_setup.sql creates tables expecting the 1536 size. Running migration/RESET_DB.sql and manually editing the references to 1536 in migration/complete_setup.sql to the same value I set in EMBEDDING_DIMENSIONS has shown success on knowledge crawl (tested with and without "Use Contextual Embeddings", no further testing).
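
The manual edit described above could be scripted along these lines. This is only a sketch: it assumes the migration declares its embedding columns with pgvector's `vector(1536)` syntax, which the thread does not show, and the function name is hypothetical.

```python
import re


def retarget_vector_dimensions(sql: str, old: int, new: int) -> str:
    """Replace vector(<old>) with vector(<new>), matching case-insensitively.

    Only the number inside vector(...) is changed, so an unrelated 1536
    elsewhere in the file is left alone.
    """
    pattern = re.compile(rf"(vector\s*\(\s*){old}(\s*\))", re.IGNORECASE)
    return pattern.sub(rf"\g<1>{new}\g<2>", sql)


if __name__ == "__main__":
    sample = "embedding VECTOR(1536),\nquery_embedding vector(1536)"
    print(retarget_vector_dimensions(sample, 1536, 3072))
```

The rewritten SQL would still need to be applied to a reset database, as described above, because tables that already exist keep their original column size.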

Sep 04 '25 15:09 theProf

Should my crawled_pages and code_examples tables be empty?