qdrant-client quantization_config and optimizers

I'm experiencing an issue with QdrantClient: the quantization_config and optimizers_config settings do not seem to be applied when creating a collection. I'm following the instructions given here: https://qdrant.tech/documentation/guides/optimize/ Below is the code I'm using:

  from qdrant_client import QdrantClient, models
  
  client = QdrantClient(":memory:")
  
  client.create_collection(
      collection_name="test-collection",
      vectors_config=models.VectorParams(size=768, distance=models.Distance.COSINE, on_disk=True),
      quantization_config=models.ScalarQuantization(
          scalar=models.ScalarQuantizationConfig(
              type=models.ScalarType.INT8,
              always_ram=True,
          ),
      ),
      optimizers_config=models.OptimizersConfigDiff(default_segment_number=2, max_optimization_threads=4)
  )
  
  test_collection = client.get_collection(collection_name="test-collection")
  print(test_collection.model_dump())

And here is the result:

{'status': <CollectionStatus.GREEN: 'green'>,
 'optimizer_status': <OptimizersStatusOneOf.OK: 'ok'>,
 'vectors_count': None,
 'indexed_vectors_count': 0,
 'points_count': 0,
 'segments_count': 1,
 'config': {'params': {'vectors': {'size': 768,
    'distance': <Distance.COSINE: 'Cosine'>,
    'hnsw_config': None,
    'quantization_config': None,
    'on_disk': True,
    'datatype': None,
    'multivector_config': None},
   'shard_number': None,
   'sharding_method': None,
   'replication_factor': None,
   'write_consistency_factor': None,
   'read_fan_out_factor': None,
   'on_disk_payload': None,
   'sparse_vectors': None},
  'hnsw_config': {'m': 16,
   'ef_construct': 100,
   'full_scan_threshold': 10000,
   'max_indexing_threads': 0,
   'on_disk': None,
   'payload_m': None},
  'optimizer_config': {'deleted_threshold': 0.2,
   'vacuum_min_vector_number': 1000,
   'default_segment_number': 0,
   'max_segment_size': None,
   'memmap_threshold': None,
   'indexing_threshold': 20000,
   'flush_interval_sec': 5,
   'max_optimization_threads': 1},
  'wal_config': {'wal_capacity_mb': 32, 'wal_segments_ahead': 0},
  'quantization_config': None},
 'payload_schema': {}}

As shown, the quantization_config and optimizers_config settings are not reflected in the collection configuration. I got the same result when using the same config on an on-disk collection. Could you please advise if this is a limitation of the in-memory and on-disk modes or if there is a workaround?

Thank you!

Feb 28 '25 00:02 Hosseinberg

client = QdrantClient(":memory:")

You are running in "local mode". It doesn't have any optimizations and is only suitable for experiments and CI. If you want to scale, you need to run Qdrant either in Docker or in the cloud.
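
To make the difference concrete, here is a minimal sketch that creates the same collection against a Qdrant server instead of local mode and lists any settings the server does not report back. The URL `http://localhost:6333` and the helper names `unapplied_settings` and `check_collection` are assumptions for illustration, not part of the thread or of qdrant-client. It also assumes a server is already running (for example from the Docker image) and that "test-collection" does not exist there yet.

```python
from typing import Any


def unapplied_settings(requested: dict[str, Any], reported: dict[str, Any]) -> list[str]:
    """Return the keys whose requested value differs from what the collection reports."""
    return [key for key, value in requested.items() if reported.get(key) != value]


def check_collection(url: str = "http://localhost:6333") -> list[str]:
    """Create the collection from the question on a server and list unapplied settings.

    The default URL is an assumption; point it at wherever your server listens.
    """
    # Imported lazily so the pure helper above works without qdrant-client installed.
    from qdrant_client import QdrantClient, models

    client = QdrantClient(url=url)  # server mode, not ":memory:" or a local path
    requested = {"default_segment_number": 2, "max_optimization_threads": 4}

    # Same configuration as in the question; raises if the collection already exists.
    client.create_collection(
        collection_name="test-collection",
        vectors_config=models.VectorParams(
            size=768, distance=models.Distance.COSINE, on_disk=True
        ),
        quantization_config=models.ScalarQuantization(
            scalar=models.ScalarQuantizationConfig(
                type=models.ScalarType.INT8,
                always_ram=True,
            ),
        ),
        optimizers_config=models.OptimizersConfigDiff(**requested),
    )

    info = client.get_collection(collection_name="test-collection")
    reported = info.config.optimizer_config.model_dump()

    missing = unapplied_settings(requested, reported)
    if info.config.quantization_config is None:
        missing.append("quantization_config")
    return missing
```

Calling `print(check_collection())` against a running server shows which of the requested settings, if any, are missing from the reported configuration. Applied to the local-mode output in the question, `unapplied_settings` flags both optimizer values, since the collection reports 0 and 1 where 2 and 4 were requested.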

Feb 28 '25 00:02 generall

Thanks for your quick response. Is this documented somewhere? Just curious to know if there are any other limitations.

Feb 28 '25 00:02 Hosseinberg
