Number of files explodes, compaction does not work

Open beviah opened this issue 1 year ago • 6 comments

I tried numerous settings, but something is not working right.

There are thousands of tiny log and SST files that are not getting merged.
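
One way to quantify this symptom is to count the files in the database directory by extension. The following is a minimal sketch using only the Python standard library; the function name `count_db_files` and the directory path are hypothetical and not part of rocksdict.

```python
from collections import Counter
from pathlib import Path


def count_db_files(db_path):
    """Count regular files in a database directory, grouped by suffix."""
    return Counter(p.suffix for p in Path(db_path).iterdir() if p.is_file())


# Hypothetical usage; "./example_db" stands in for the real database path:
#   counts = count_db_files("./example_db")
#   print(counts[".sst"], counts[".log"])
```

Running this before and after a batch of writes shows whether the `.sst` and `.log` counts keep growing.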

beviah avatar Dec 29 '24 12:12 beviah

Which version and platform are you using?

Congyuwang avatar Dec 30 '24 23:12 Congyuwang

rocksdict 0.3.24, Python 3.12.3, Ubuntu 24.04.1 LTS

beviah avatar Jan 01 '25 17:01 beviah

That's kind of strange. Are you maybe using too many column families? Do you have a minimal code example that can reproduce it?

Congyuwang avatar Jan 05 '25 10:01 Congyuwang

I managed to get manual compaction working

from rocksdict import DBCompactionStyle, Options, SliceTransform

def speedb_options():
    opt = Options()
    opt.create_if_missing(True)
    opt.create_missing_column_families(True)
    opt.set_max_open_files(-1)  # -1: no limit, opened files stay open
    opt.set_max_background_jobs(4)
    opt.set_max_compaction_bytes(512 * 1024 * 1024)  # 512 MiB
    opt.set_max_subcompactions(4)
    opt.set_compaction_style(DBCompactionStyle.universal())
    opt.increase_parallelism(4)
    opt.set_use_direct_io_for_flush_and_compaction(True)
    opt.set_use_direct_reads(True)
    opt.set_writable_file_max_buffer_size(1024 * 1024)  # 1 MiB
    opt.set_write_buffer_size(64 * 1024 * 1024)  # 64 MiB per memtable
    opt.set_min_write_buffer_number(2)
    opt.set_max_write_buffer_number(6)
    opt.set_min_write_buffer_number_to_merge(2)
    opt.set_target_file_size_base(64 * 1024 * 1024)  # 64 MiB
    opt.set_prefix_extractor(SliceTransform.create_max_len_prefix(8))
    opt.set_atomic_flush(True)
    return opt

I have 4 column families with the above options; none of them use the defaults.

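from rocksdict import AccessType, Rdict, WriteBatch

# shard_path, column_families, x and vector_contents come from the surrounding application
db = Rdict(
    shard_path,
    speedb_options(),
    column_families=column_families,
    access_type=AccessType.read_write()
)

# Write all entries for one column family in a single batch
wb = WriteBatch()
wb.set_default_column_family(db.get_column_family_handle(x.cf))
for vid, content in vector_contents.items():
    wb[vid] = content
db.write(wb)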

The contents are just small JSON documents or lists of integers, depending on the column family; the vids are integers.

I will try to reproduce this with a separate minimal example.

beviah avatar Jan 05 '25 15:01 beviah

I was facing this as well, but it was my fault: I was opening and closing the database every time I wanted to set a new key-value pair. Now I open it once, do all my operations, and then close it, and I no longer have thousands of SST files.
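
The open-once pattern described here can be sketched roughly as follows. `write_all` and `open_db` are hypothetical names, not part of rocksdict; the sketch assumes only that the store supports item assignment and has a `close()` method, as `Rdict` does.

```python
from contextlib import closing


def write_all(open_db, items):
    """Open the store once, write every (key, value) pair, then close it once."""
    with closing(open_db()) as db:
        for key, value in items:
            db[key] = value


# Hypothetical usage with rocksdict (assumes it is installed):
#   from rocksdict import Rdict
#   write_all(lambda: Rdict("./example_db"), [(1, "a"), (2, "b")])
```

The anti-pattern is calling the open and close steps inside the loop, once per key.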

raulcarlomagno avatar Feb 20 '25 12:02 raulcarlomagno

Thanks for replying. @beviah, see if that’s the problem. If so, this can be closed.

Congyuwang avatar Feb 20 '25 12:02 Congyuwang

Closing due to inactivity.

Congyuwang avatar Nov 11 '25 03:11 Congyuwang