Unexpectedly Large Database Disk Usage (24GB) vs Small Backup Size (19MB) - Space Mostly in /.dolt/temptf
Database Name: insight_metabase
Version: 1.75.0
After the application ran for about 10 days, the database file size was approximately 24G. After executing the dolt gc command, the database file size was reduced to about 14G.
#dolt version
dolt version 1.75.0
cd /db/dolt/insight_metabase
/db/dolt/insight_metabase#du -h -d 1
24G ./.dolt
24G .
After executing the dolt gc command:
/db/dolt/insight_metabase#dolt gc
/db/dolt/insight_metabase#du -h -d 1
14G ./.dolt
14G .
However, our actual data volume is not that large. When I used dolt backup, the backup file was very small, only about 19MB.
/db/dolt/insight_metabase#mkdir /db/dolt_bak/insight_metabase
/db/dolt/insight_metabase#dolt backup add local-backup file:///db/dolt_bak/insight_metabase
/db/dolt/insight_metabase#dolt backup sync local-backup
/db/dolt/insight_metabase#cd /db/dolt_bak/insight_metabase
/db/dolt_bak/insight_metabase# du -h -d 1
4.0K ./oldgen
19M .
Our database has only one branch and one commit.
/db/dolt/insight_metabase# dolt branch
* main
/db/dolt/insight_metabase# dolt log
commit g9efkgekkdj74773dclc3t6m35c3m4p1 (HEAD -> main)
Author: Dolt System Account <[email protected]>
Date: Wed Oct 22 02:42:03 +0000 2025
Initialize data repository
After investigation, we found that the folder /db/dolt/insight_metabase/.dolt/temptf occupies almost most of the space.
Our questions are:
- Why is the disk space usage so large?
- How can we release the occupied disk space?
- What can we do to reduce the rate of disk space usage?
The configuration file config.yaml we are using is as follows (the remote_url_template in remote_url_template is partially omitted):
data_dir: /db/dolt
cfg_dir: /etc/dolt/config
log_level: info
log_format: text
behavior:
read_only: false
autocommit: true
disable_client_multi_statements: false
dolt_transaction_commit: false
event_scheduler: "ON"
auto_gc_behavior:
enable: true
archive_level: 1
listener:
host: 0.0.0.0
port: 3306
max_connections: 1000
back_log: 50
max_connections_timeout_millis: 60000
read_timeout_millis: 28800000
write_timeout_millis: 28800000
tls_key: null
tls_cert: null
require_secure_transport: null
allow_cleartext_passwords: null
max_logged_query_len: 0
# Advanced Configuration
metrics:
labels: {}
host: null
port: -1
system_variables: {}
user_session_vars: []
jwks: []
cluster:
standby_remotes:
- name: standby
remote_url_template: http://......:50051/{database}
bootstrap_role: primary
bootstrap_epoch: 1
remotesapi:
port: 50051
This is a weird one. I'm not sure why temptf would get so large. This is an @reltuk question.
@lihh1992 Thank you for the great bug report. I can confirm that I can reproduce and that this is not behaving as expected. The temptf files are created as part of cluster replication and should be cleaned up after the replication action is completed.
I can confirm that I can reproduce this behavior locally. It seems to be an issue that was introduced in 1.75.0.
I am investigating currently and will hopefully have a root cause and PR out for a fix soon. In the meantime, you can clean up the temptf files by stopping and starting the server. If you run the server with the environment variable DOLT_ARCHIVE_PULL_STREAMER=0 set, Dolt will not leave these files behind after a successful push.
I will keep you posted on my investigation and the progress on the root cause fix.
The following PR resolves the issue. I am working on getting it through CI and included in the next Dolt release:
https://github.com/dolthub/dolt/pull/9987
The root cause fix was just released in https://github.com/dolthub/dolt/releases/tag/v1.76.2.
Thank you again for the bug report. Closing for now, but feel free to reopen if you need anything further here.
@reltuk After the issue occurred, I rebuilt the database using backup data. The reconstructed database size was less than 100 MB, but now the disk usage of the database file has exceeded 5 GB.
Around 10:00 this morning, I upgraded Dolt to version 1.76.2.
/db/dolt/insight_metabase/.dolt/temptf# dolt version
dolt version 1.76.2
/db/dolt/insight_metabase# du -h -d 1
5.0G ./.dolt
5.0G .
However, the size of the replica database is only 27 MB:
/db/dolt/insight_metabase# du -h -d 1
27M ./.dolt
27M .
Most of the space is still occupied by the temptf directory:
/db/dolt/insight_metabase/.dolt/temptf# find . -type f | wc -l
435211
After running dolt gc, there was no significant reduction in disk usage.
My questions are:
- Why did
dolt gcnot clean up the data intemptf? - How can I free up the occupied disk space?
- Will the disk usage continue to grow rapidly?
@lihh1992
Thank you for the update. I cannot reproduce the lingering temptf files when running with cluster replication on 1.76.2. Can you verify that the version of the running server is 1.76.2?
dolt sql -q 'select dolt_version()'
from within the dolt database directory, while the server is running. (The directory we want is the one that has the .dolt/sql-server.info file in it.)
To clean up the files, you can stop the running sql-server process and then just delete them.
$ # After stopping the running sql-server process
$ rm .dolt/temptf/*
Then you can start the sql-server process again.
You can perform a full GC by running call dolt_gc('--full'). This may take longer than a normal GC, but it will collect data which was previously reachable from a branch head but is no longer reachable at all. By default, all data reachable from a branch head at the time of a garbage collection is moved into the old gen and is not revisited in the future by non-full garbage collections.
I will leave this issue re-opened while we continue to investigate whether temptf files are still being incorrectly left behind by 1.76.2.
The following is the command execution process and its results:
root@dolt-0:/var/lib/dolt# cd /db/dolt
root@dolt-0:/db/dolt# du -h -d 1
1.1M ./insight_system
372M ./insight_jobadmin
5.1G ./insight_metabase
15G ./.dolt
14G ./.dolt_dropped_databases
34G .
root@dolt-0:/db/dolt# cd .dolt
root@dolt-0:/db/dolt/.dolt# ls
sql-server.info tmp
root@dolt-0:/db/dolt/.dolt# dolt sql -q 'select dolt_version()'
Warning: Unknown global config option 'sqlserver.cluster.dolt_cluster_role'. Use `dolt config --global --unset sqlserver.cluster.dolt_cluster_role` to remove.
Warning: Unknown global config option 'sqlserver.cluster.dolt_cluster_role_epoch'. Use `dolt config --global --unset sqlserver.cluster.dolt_cluster_role_epoch` to remove.
+----------------+
| dolt_version() |
+----------------+
| 1.76.2 |
+----------------+
root@dolt-0:/db/dolt# cd ../insight_metabase/
root@dolt-0:/db/dolt/insight_metabase# dolt sql -q 'select dolt_version()'
Warning: Unknown global config option 'sqlserver.cluster.dolt_cluster_role'. Use `dolt config --global --unset sqlserver.cluster.dolt_cluster_role` to remove.
Warning: Unknown global config option 'sqlserver.cluster.dolt_cluster_role_epoch'. Use `dolt config --global --unset sqlserver.cluster.dolt_cluster_role_epoch` to remove.
+----------------+
| dolt_version() |
+----------------+
| 1.76.2 |
+----------------+
root@dolt-0:/db/dolt/insight_metabase# dolt sql
Warning: Unknown global config option 'sqlserver.cluster.dolt_cluster_role'. Use `dolt config --global --unset sqlserver.cluster.dolt_cluster_role` to remove.
Warning: Unknown global config option 'sqlserver.cluster.dolt_cluster_role_epoch'. Use `dolt config --global --unset sqlserver.cluster.dolt_cluster_role_epoch` to remove.
# Welcome to the DoltSQL shell.
# Statements must be terminated with ';'.
# "exit" or "quit" (or Ctrl-D) to exit. "\help" for help.
insight_metabase/main*> call dolt_gc('--full')
-> ;
+--------+
| status |
+--------+
| 0 |
+--------+
1 row in set (0.88 sec)
insight_metabase/main*> exit
Bye
root@dolt-0:/db/dolt/insight_metabase# du -h -d 1
5.0G ./.dolt
5.0G .
root@dolt-0:/db/dolt/insight_metabase# cat /etc/dolt/config/config.yaml
data_dir: /db/dolt
cfg_dir: /etc/dolt/config
log_level: info
log_format: text
behavior:
read_only: false
autocommit: true
disable_client_multi_statements: false
dolt_transaction_commit: false
event_scheduler: "ON"
auto_gc_behavior:
enable: true
archive_level: 1
listener:
host: 0.0.0.0
port: 6033
max_connections: 1000
back_log: 50
max_connections_timeout_millis: 60000
read_timeout_millis: 28800000
write_timeout_millis: 28800000
tls_key: null
tls_cert: null
require_secure_transport: null
allow_cleartext_passwords: null
max_logged_query_len: 0
# Advanced Configuration
metrics:
labels: {}
host: null
port: -1
system_variables: {}
user_session_vars: []
jwks: []
cluster:
standby_remotes:
- name: standby
remote_url_template: http://dolt-1.dolt-headless.gitee.svc.cluster.local:50051/{database}
bootstrap_role: primary
bootstrap_epoch: 1
remotesapi:
port: 50051
Additionally, I have another question. In the config.yaml file, I have configured the data_dir as /db/dolt. Under the /db/dolt path, the size of the ./.dolt folder is 15GB. Is this normal? I only have the few databases shown in the commands.
@lihh1992 Thank you for your detailed response.
For the unexpectedly large /db/dolt/insight_metabase/.dolt directory, can you please confirm if the size is largely still the result of temptf files please? Is is possible to remove those files manually while the server is not running and to confirm that they do not come back?
The the presence of a large /db/dolt/.dolt directory, can you confirm whether the running the server thinks there is a database name dolt? Can you provide a listing of the files within that directory to see what they are? Running show databases would list all the reigstered databases with the server.
dolt sql-server has support for running from a top-level container directory, where all the databases are in subdirectories, like so, when running from /db/dolt:
/db/dolt/database_one/.dolt
/db/dolt/database_two/.dolt
/db/dolt/database_three/.dolt
but it also has support for exposing the current directory as a database, as if you run dolt sql-server from /db/dolt/database_one and it exposes the database as database_one with its storage files in /db/dolt/database_one/.dolt.
Depending on how the initial databases are created and how the server is launched, sometimes these two modes are even active at the same time. Which is why show databases and the listing of the files will be helpful.
Thank you for your continued help and thoughtful responses, -- Aaron