Unbounded file size ballooning, memory leaks, and crashes caused by langflow.db
Related separate issue: https://github.com/langflow-ai/langflow/issues/3970
Bug Description
Whenever I try to embed my 70MB text file, it starts eating up all my RAM over time. Then even booting Langflow begins eating up RAM, I assume because of that 70MB file.
Initially, I was attempting to perform RAG and noticed my system lagging heavily; I then narrowed it down to the file component.
Reproduction
Before installation, I created a Python venv using:
python3 -m venv langflow
source langflow/bin/activate
Then I installed Langflow while in that environment:
pip install langflow -U
Steps to reproduce
Please see both of these links https://github.com/langflow-ai/langflow/issues/4008#issuecomment-2399700910 https://github.com/langflow-ai/langflow/issues/4008#issuecomment-2407550796
Who can help?
No response
Operating System
OS: Ubuntu 24.04.1 LTS x86_64
Langflow Version
Langflow v1.0.18
Python Version
3.12
Screenshot
Flow File
No response
Hey, @EDLLT! I'm here to help you with bugs, questions, and becoming a contributor. Let's tackle this issue together while we wait for a human maintainer.
The issue you're experiencing with memory consumption when repeatedly building the "file" component in Langflow could be related to the caching mechanism. Langflow uses various caching services, such as AsyncInMemoryCache, ThreadingInMemoryCache, and RedisCache, which might not be properly releasing memory when handling large files like your 70MB text file. This can lead to increased memory usage each time the component is rebuilt [1][2].
To mitigate this issue, you might want to check the configuration of the caching services to ensure they are set up to efficiently manage memory. Additionally, consider clearing the cache manually or adjusting the cache settings to prevent excessive memory consumption. If the problem persists, it might be necessary to look into the specific implementation of the FileComponent to identify any potential memory leaks or inefficiencies in memory management.
To continue talking to Dosu, mention @dosu.
Through further experimentation, I discovered that even uploading causes a memory leak. If you simply upload the same file multiple times, you'll notice the memory usage increasing.
I've tested running this multiple times while the other parts are commented out:
data = parse_text_file_to_data(resolved_path, silent_errors)
It wasn't causing any memory leaks (after loading, memory returned to exactly the same level).
However, this was problematic:
self.status = data if data else "No data"
It seems that this line is causing a memory leak: every time I rebuild the component with that line in place, memory doesn't get released but rather increases on every build.
This also seems problematic, as it increases memory usage on rebuild:
return data or Data()
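To picture one plausible way this could happen, here is a minimal, self-contained Python sketch (not Langflow code): a large payload that stays referenced from a long-lived object, the way `self.status` keeps the parsed file data, is never freed after the build finishes, and if each rebuild keeps its own copy alive somewhere, memory grows with every build.

```python
# Hypothetical illustration, not Langflow code: a large payload that stays
# referenced from a long-lived object (like a component's `status`) is never
# freed, while the same payload without a surviving reference is reclaimed.
import tracemalloc


class FakeComponent:
    def __init__(self):
        self.status = None


def build(component, keep_status):
    data = "x" * (70 * 1024 * 1024)  # stand-in for a parsed 70MB text file
    if keep_status:
        component.status = data  # reference outlives the build call
    return len(data)


tracemalloc.start()
component = FakeComponent()

build(component, keep_status=False)
current, _ = tracemalloc.get_traced_memory()
print(f"without status ref: {current / 1024 / 1024:.1f} MiB still held")

build(component, keep_status=True)
current, _ = tracemalloc.get_traced_memory()
print(f"with status ref:    {current / 1024 / 1024:.1f} MiB still held")
```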
Figured out something very problematic. The langflow.db file is storing every component's output, it seems.
It looks like this file plays a huge role:
langflow_source-code/src/backend/base/langflow/services/database/models/vertex_builds/crud.py
It commits the component's output results to the db, I assume for caching upon rebuilds.
The problem is that it doesn't really cache properly: it stores the same file's content over and over again and commits it to the db, making it balloon.
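For anyone who wants to confirm this on their own install, here is a small sketch that inspects langflow.db directly. The table and column names (`vertex_build`, `data`, `transaction`) come from the debug output later in this thread; the database path is an assumption, so point it at your own file.

```python
# Rough sketch for checking how much of langflow.db is taken up by cached builds.
# DB_PATH is an assumption -- adjust it to wherever your langflow.db lives.
import sqlite3

DB_PATH = "langflow.db"

with sqlite3.connect(DB_PATH) as conn:
    rows, data_bytes = conn.execute(
        "SELECT COUNT(*), COALESCE(SUM(LENGTH(data)), 0) FROM vertex_build"
    ).fetchone()
    tx_rows = conn.execute('SELECT COUNT(*) FROM "transaction"').fetchone()[0]
    print(f"vertex_build: {rows} rows, ~{data_bytes / 1024 / 1024:.1f} MiB of stored outputs")
    print(f"transaction:  {tx_rows} rows")
```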
Found more. What's the purpose of logging every vertex build? It seems to accumulate and take up a tremendous amount of storage space in the db file, causing Langflow to crash.
langflow_source-code/src/backend/base/langflow/graph/utils.py
```python
def log_vertex_build(
    flow_id: str,
    vertex_id: str,
    valid: bool,
    params: Any,
    data: ResultDataResponse,
    artifacts: dict | None = None,
):
    try:
        if not get_settings_service().settings.vertex_builds_storage_enabled:
            return
        vertex_build = VertexBuildBase(
            flow_id=flow_id,
            id=vertex_id,
            valid=valid,
            params=str(params) if params else None,
            # ugly hack to get the model dump with weird datatypes
            data=json.loads(data.model_dump_json()),
            # ugly hack to get the model dump with weird datatypes
            artifacts=json.loads(json.dumps(artifacts, default=str)),
        )
        with session_getter(get_db_service()) as session:
            inserted = crud_log_vertex_build(session, vertex_build)
            logger.debug(f"Logged vertex build: {inserted.build_id}")
    except Exception as e:
        logger.exception(f"Error logging vertex build: {e}")
```
edit: Okay, it seems like its purpose is to cache results. The problem with it is that it doesn't clear up the previous vertex builds once the component has been rebuilt. There also needs to be a size limit deciding when to cache and when not to, as Langflow crashes with large ones.
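(The `vertex_builds_storage_enabled` check at the top of that function also suggests this logging can be disabled entirely through Langflow's settings.) As for the size limit, here is one way such a guard could look; this is only a sketch of the idea, not Langflow's actual fix, and `MAX_VERTEX_BUILD_BYTES` is a hypothetical setting.

```python
# Sketch only, not Langflow's actual fix: skip persisting vertex builds whose
# serialized output exceeds a threshold, so a 70MB file's contents never get
# committed to langflow.db. MAX_VERTEX_BUILD_BYTES is a hypothetical setting.
import json

MAX_VERTEX_BUILD_BYTES = 1 * 1024 * 1024  # 1 MiB, arbitrary example value


def should_persist_vertex_build(data: dict, artifacts: dict | None) -> bool:
    serialized = len(json.dumps(data, default=str)) + len(
        json.dumps(artifacts or {}, default=str)
    )
    return serialized <= MAX_VERTEX_BUILD_BYTES
```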
> Found more. What's the purpose of logging every vertex build? It seems to accumulate and take up a tremendous amount of storage space in the db file, causing Langflow to crash.
@nicoloboschi
Hey @EDLLT, thanks for providing this step-by-step to reproduce the issue!
I observed the same issue when processing huge volumes of data. It looks like both the vertex_build table and the transaction table log the input and output of each component.
So if you are logging 70MB worth of text in one column, consuming that table becomes impossible. If you then delete those tables, everything resumes working, but you lose the metadata stored in the Langflow DB.
Hey @Cristhianzl can you please help with this issue?
hey @codenprogressive, @EDLLT how are you?
We have a fix for this error coming soon in the next release. The front end can now handle this amount of data without breaking, as it did for you, @EDLLT.
The point is that we can't truncate the data to save it in the database because it would break other features like the freeze/freeze path. We need to save the runs in the vertex_build and transaction tables for these features to work properly.
So my advice is: when you're working with large files or large amounts of data, please try to use these features as well. This way, the file or large data won't be processed twice. :)
I'm closing this issue because the error has been fixed on the MAIN branch and will be included in the next Langflow release!
Thank you!!
@Cristhianzl Hey! Thanks for the prompt fix.
I have tested it out and, unfortunately, the issues that I mentioned in previous comments still occur, and new issues arise on the frontend.
In these I talk about the file component, but I think the same issue occurs with every component that deals with data.
Issues that still occur
- Deleting the entire flow clears up the relevant parts of the vertex_build cache; however, deleting the file component does not delete its relevant data from the vertex_build cache. This causes `langflow.db` to pile up space from non-existing components within the flow over time.
- When rebuilding the file component explicitly (it doesn't matter whether it's frozen or not), the previously cached data does not get removed from `vertex_build` in `langflow.db`, and the data gets duplicated within `vertex_build`.
- Uploading the same file multiple times using Langflow's file component increases memory usage. In my case, uploading `random_ascii_70MB.txt` 10 times increased memory usage by ~1GB.
- Upon reloading the page, Langflow spikes in memory and then crashes the Langflow frontend.

Fixed issues
- When processing a large amount of data using the file component:
  - After it has built successfully, we are able to press Data within Langflow to view the contents of our file.
  - Then, when I try to refresh the page and view the data within my already built file component, it crashes my tab (tested on both Brave and Chrome).
@EDLLT hi,
Are you using the freeze/freeze path feature to prevent reprocessing files that have already been processed? Note that if you don't use these features, the database will be increased each time you run the flow.
Are you on the main branch locally? Please note that we haven't released the fix for this yet.
> Are you using the freeze/freeze path feature to prevent reprocessing files that have already been processed? Note that if you don't use these features, the database will be increased each time you run the flow.
Yes, I have tried using the freeze feature which still resulted in the same problems
> Are you on the main branch locally? Please note that we haven't released the fix for this yet.
Yes, I am building langflow from the source code's main branch, commit bffb0f129bc61bacc57ec2591d3e6525e3088b93
I'll try to get to the bottom of this. Meanwhile, could this issue be reopened?
Here's a useful patch I've written for debugging purposes. It helps show how many vertex builds there are, when they get returned, and when they get committed to the database.
Here's a patch to crud.py to make it more verbose for debugging.
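Roughly, the instrumentation boils down to printing the size of each vertex build before it is committed and the number of rows returned when a flow is loaded. The sketch below is illustrative only (not the actual patch); the `.id`/`.data` attributes and the printed function names follow the debug output further down and should be treated as assumptions about crud.py's internals.

```python
# Illustrative sketch of the debug instrumentation, not the actual patch.
# `vertex_build` is assumed to expose `.id` and `.data` as seen in the
# debug output below.
import json


def debug_vertex_build_size(vertex_build) -> int:
    """Print and return the serialized size of a vertex build before commit."""
    size = len(json.dumps(vertex_build.data, default=str))
    print(f"log_vertex_build called with vertex_build id: {vertex_build.id}")
    print(f"Vertex build data size: {size} bytes")
    return size


def debug_vertex_builds_returned(flow_id: str, builds: list) -> None:
    """Print how many cached builds are sent back when a flow page is loaded."""
    print(f"get_vertex_builds_by_flow_id called with flow_id: {flow_id}")
    print(f"Returning {len(builds)} vertex builds")
```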
Example of what it outputs when building components / refreshing the page:
When building a component, the data gets committed into the db:
```
/usr/lib/python3.12/asyncio/base_events.py:726: ResourceWarning: unclosed event loop <_UnixSelectorEventLoop running=False closed=False debug=False>
  _warn(f"unclosed event loop {self!r}", ResourceWarning, source=self)
ResourceWarning: Enable tracemalloc to get the object allocation traceback
log_vertex_build called with vertex_build id: File-osnlJ
Vertex build data size: 1816 bytes
Created VertexBuildTable with id: File-osnlJ
Table contents:
  timestamp: 2024-10-08T15:05:12.996697+00:00
  id: File-osnlJ
  data: {"results": {}, "outputs": {"data": {"message": {"file_path": "/home/edllt/.cache/langflow/b2e77365-53e4-4f56-80a9-7d51a553913d/2024-10-08_18-01-39_random_text.txt"...
  artifacts: {"data": {"rep...
  params: None
  valid: True
  flow_id: b2e77365-53e4-4f56-80a9-7d51a553913d
  build_id: dc9903b9-30d9-4fb5-97fa-b681287b097f
Successfully committed VertexBuildTable with id: File-osnlJ
log_vertex_build finished
```
Upon refreshing, it seems like all vertex builds are being returned:
```
get_vertex_builds_by_flow_id called with flow_id: b2e77365-53e4-4f56-80a9-7d51a553913d, limit: 1000
Returning 3 vertex builds
```
Using this patch, as well as SQL commands to inspect the db, I figured out that the page crash is probably caused by the huge amount of data being returned on refresh. That, combined with the fact that previously built components' cached outputs in the db never get cleared up, makes it take up a lot of storage and RAM.
@EDLLT
We are verifying whether there is any issue with the freeze/freeze path feature, and we are going to fix it before the next release. I will also look into the frontend crashing.
I'll reopen the issue. Thanks again!
hi @codenprogressive @EDLLT
We have confirmed that the freeze feature is functioning as expected. I will implement a fix to optimize data storage in the database, which will reduce memory usage and prevent potential frontend crashes.
Thanks for your feedback and patience. Feel free to contact us anytime :)
#4078
Issues that still occur
- Deleting the entire flow clears up the relevant parts of the vertex_build cache; however, deleting the file component does not delete its relevant data from the vertex_build cache. This causes `langflow.db` to pile up space from non-existing components within the flow over time.
- When rebuilding the file component explicitly (it doesn't matter whether it's frozen or not), the previously cached data does not get removed from `vertex_build` in `langflow.db`, and the data gets duplicated within `vertex_build`.
- Uploading the same file multiple times using Langflow's file component increases memory usage. In my case, uploading `random_ascii_70MB.txt` 10 times increased memory usage by ~1GB.

New issues
- When processing a large amount of data using the file component:
  - After it has built successfully, we are able to press Data within Langflow to view the contents of our file.
  - Then, when I try to refresh the page and view the data within my already built file component, it crashes my tab (tested on both Brave and Chrome).
- Taking data from the file component and then using Split Text processes successfully, but:
  - Upon reloading the page (on the flow, not the main menu), Langflow starts taking up a significant amount of RAM (reaching up to ~15GB) before it settles down, and afterwards the browser's memory starts spiking to 2~3GB before crashing the tab (tested on both Brave and Chrome).
@Cristhianzl I haven't tested the fix yet, but looking at the PR, it seems it only addresses the frontend crashing issue by truncating long strings.
Other issues, like data duplication in the database from rebuilds, dead data in the database belonging to deleted components, memory growth on every re-upload of a file, etc. (I have written up all the issues in the previous comment), don't seem to have been addressed.
@EDLLT, hi
We are concerned about this and are close to the point where we will no longer use the vertex_build table. I hope that within the next few weeks, we can disable this table and eliminate data duplication.
Truncating the data stored in the database (PR #4078) will significantly reduce memory usage, prevent frontend crashes, and lower storage requirements.
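For readers following along, the truncation approach could look something like the sketch below. This is not the code from PR #4078, just an illustration of the idea; `MAX_TEXT_LENGTH` is a hypothetical constant.

```python
# Illustration of the truncation idea, not the code from PR #4078: cap how much
# of each stored string is kept so a 70MB output can never be written into
# langflow.db in full. MAX_TEXT_LENGTH is a hypothetical example value.
MAX_TEXT_LENGTH = 10_000


def truncate_value(value, max_length=MAX_TEXT_LENGTH):
    """Recursively truncate long strings inside a payload before persisting it."""
    if isinstance(value, str) and len(value) > max_length:
        return value[:max_length] + "... [truncated]"
    if isinstance(value, dict):
        return {key: truncate_value(item, max_length) for key, item in value.items()}
    if isinstance(value, list):
        return [truncate_value(item, max_length) for item in value]
    return value
```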
@Cristhianzl Hey! Apologies for continuing to bother you about this.
I have tested the main branch with the latest commit, which includes your recently merged PR. It no longer crashes upon refreshing after the component's first build; however, the frontend still ends up spiking in memory and ultimately crashing when refreshing the page after building the component more than once, even if the components were frozen.
(This occurs because of the previously mentioned data duplication issue: vertex_build doesn't take freezing into account and doesn't clear up previously built components' data, as mentioned earlier. Manually deleting all entries from the vertex_build table stops the crashing.)
Also, may I request that this issue stay open until all the database issues, including the ones I mentioned earlier, are fixed? They all seem to play a role in this.
edit:
How to reproduce the crash
1. For this example, we can generate a 70MB file containing random characters with newlines (in my case I tried using my own real data, but this file should reproduce the same issue):
   ```
   tr -cd '[:print:]\n' < /dev/urandom | head -c 70000000 > random_ascii_with_newlines.txt
   ```
2. Upload it to the file component.
3. Add the Split Text component and split the data using the default values (Chunk Size 1000, Chunk Overlap 200).
4. Build the Split Text component, then refresh the page: no crash.
5. Build the Split Text component again (it doesn't matter whether you freeze the components or not).
6. Refresh the page.
7. Langflow crashes.
I've edited this comment to include steps to reproduce the crash https://github.com/langflow-ai/langflow/issues/4008#issuecomment-2407550796
Hey @EDLLT, you are correct.
I found the problem. Somehow, after the first run, the params column was being stored with a huge heap of data, causing the cascading error on the frontend.
There’s no problem with the freeze feature. It operates using a cache table in memory, not the vertex_build table.
We are still working on the improvement to remove this vertex_build table. It could take a while, but within the next few weeks we will no longer have this table.
After lots of runs of this flow (with a 50MB CSV file uploaded), the memory heap growth is not happening anymore :) This field was not being truncated like the others.
My suggestion now is to clean up the vertex_build and transaction tables. Once you do that, the data will be truncated and displayed accordingly on the frontend, which should prevent any memory leaks or frontend crashes.
I would like to thank you for your patience and help! Really appreciated!
PR: #4118
I will let this issue open until you confirm everything is working for you.
@Cristhianzl Unfortunately, Langflow still crashes on the latest commit d0fdc568902990142762a4ca2de3c50ca6976c28. (Btw, the first thing I did was clear out my transaction and vertex_build tables, create a new flow, and make sure that I was on that latest commit.)
sqlite> delete from "transaction";
sqlite> delete from vertex_build;
The steps to reproduce it remain the same https://github.com/langflow-ai/langflow/issues/4008#issuecomment-2407550796
Also, here are the random ASCII file and the flow (hopefully they help with reproducibility):
The flow: Langflow Crasher.json
The 70MB ASCII file: https://drive.google.com/file/d/1WBp0LoZiPqCBc4IhdabtV___7ZqPaV0a/view?usp=sharing
@EDLLT how are you running Langflow?
> @EDLLT how are you running Langflow?

From Langflow's GitHub source code on the main branch:
- `source .venv/bin/activate` in both terminal tabs
- `make backend` in one terminal tab, `make frontend` in the other
hi @EDLLT,
You are absolutely correct about the error you're reporting. I'm following the steps you provided to reproduce it.
I have a solution that would temporarily fix this error. However, the real solution will be to remove the vertex_build table. So, we are going to wait until we reach the point where we can remove this table to fully resolve the problem.
For now, I ask for your patience. Thank you!
Seems like I have a similar issue; my RAM usage keeps growing, as does the SQLite file, which is now 7GB... My Langflow version is 1.0.18 and it is run with Python 3.10...
and when memory ran out:
```
--- End of logging error ---
ERROR 2024-11-13 18:09:20 - ERROR - utils utils.py:159 - Error logging transaction: (sqlite3.OperationalError) disk I/O error
(Background on this error at: https://sqlalche.me/e/20/e3q8)
--- Logging error in Loguru Handler #4 ---
Record was: {'elapsed': datetime.timedelta(seconds=23076, microseconds=901699), 'exception': None, 'extra': {}, 'file': (name='utils.py', path='/home/langflow/.local/lib/python3.10/site-packages/langflow/graph/utils.py'), 'function': 'log_transaction', 'level': (name='ERROR', no=40, icon='❌'), 'line': 159, 'message': 'Error logging transaction: (sqlite3.OperationalError) disk I/O error\n(Background on this error at: https://sqlalche.me/e/20/e3q8)', 'module': 'utils', 'name': 'langflow.graph.utils', 'process': (id=3284, name='MainProcess'), 'thread': (id=139889603801088, name='MainThread'), 'time': datetime(2024, 11, 13, 18, 9, 20, 204206, tzinfo=datetime.timezone(datetime.timedelta(seconds=3600), 'CET'))}
Traceback (most recent call last):
  File "/home/langflow/.local/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1144, in _commit_impl
    self.engine.dialect.do_commit(self.connection)
  File "/home/langflow/.local/lib/python3.10/site-packages/sqlalchemy/engine/default.py", line 702, in do_commit
    dbapi_connection.commit()
sqlite3.OperationalError: disk I/O error
```
I am using the API to connect to Langflow. The flow is pretty large, yet things like that should not really occur.
I am also using 8 workers.
No file uploads, just simple text inputs; not large, but a lot of API queries.
> I have a solution that would temporarily fix this error. However, the real solution will be to remove the vertex_build table. So, we are going to wait until we reach the point where we can remove this table to fully resolve the problem.
If removing vertex_build is not possible or would take significant effort, another solution I could think of would be to add checks that prevent duplicates of the same data, as well as remove old cached values when adding a new one (see the sketch below). I think this solution is simpler; however, my understanding of vertex_build is not complete yet. So, did I misunderstand, or would this approach work? Also, if I submit a PR doing this, will it be merged?
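To make the proposal concrete, here is a sketch of the "replace instead of accumulate" idea, written against raw sqlite3 so it doesn't depend on Langflow internals; the column names follow the debug output earlier in the thread and are assumptions, not the project's actual schema or fix.

```python
# Sketch of the proposed dedup/cleanup, not Langflow code: drop older cached
# builds of the same vertex before inserting the new one, so rows never
# accumulate. Column names are assumptions based on the debug output above.
import sqlite3
import uuid


def replace_vertex_build(conn: sqlite3.Connection, flow_id: str, vertex_id: str,
                         data_json: str, artifacts_json: str) -> None:
    # Remove any previously cached builds of this vertex in this flow.
    conn.execute(
        "DELETE FROM vertex_build WHERE flow_id = ? AND id = ?",
        (flow_id, vertex_id),
    )
    # Insert the fresh build as the single cached row for this vertex.
    conn.execute(
        "INSERT INTO vertex_build (build_id, flow_id, id, data, artifacts, valid, timestamp) "
        "VALUES (?, ?, ?, ?, ?, 1, datetime('now'))",
        (str(uuid.uuid4()), flow_id, vertex_id, data_json, artifacts_json),
    )
    conn.commit()
```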
@EDLLT hi!
If it's a simple task, feel free to take it on—any help is greatly appreciated! We’re excited to collaborate with other engineers here to find the best possible solution.
Please make sure to tag this issue in the PR so we can provide context and keep everyone informed about the work we’re doing.
Thanks!
@Cristhianzl I've noticed that this issue's been closed. Has it been fixed? Which PR fixes the issue if so?
I'll keep it open. It was just closed due to the lack of new messages.
Thanks!