CDC-FM (Carré du Champ Flow Matching) Implementation
Add support for CDC-FM, a geometry-aware noise generation method that improves diffusion model training by adapting noise to the local geometry of the latent space. CDC-FM replaces standard Gaussian noise with geometry-informed noise that better preserves the structure of the data manifold.
Note: Only implemented for Flux network training so far. It can be expanded to other flow matching models such as SD3 and Lumina Image 2.
Deep generative models often face a fundamental tradeoff: high sample quality can come at the cost of memorisation, where the model reproduces training data rather than generalising across the underlying data geometry. We introduce Carré du champ flow matching (CDC-FM), a generalisation of flow matching (FM), that improves the quality-generalisation tradeoff by regularising the probability path with a geometry-aware noise. Our method replaces the homogeneous, isotropic noise in FM with a spatially varying, anisotropic Gaussian noise whose covariance captures the local geometry of the latent data manifold. We prove that this geometric noise can be optimally estimated from the data and is scalable to large data. Further, we provide an extensive experimental evaluation on diverse datasets (synthetic manifolds, point clouds, single-cell genomics, animal motion capture, and images) as well as various neural network architectures (MLPs, CNNs, and transformers). We demonstrate that CDC-FM consistently offers a better quality-generalisation tradeoff. We observe significant improvements over standard FM in data-scarce regimes and in highly non-uniformly sampled datasets, which are often encountered in AI for science applications. Our work provides a mathematical framework for studying the interplay between data geometry, generalisation and memorisation in generative models, as well as a robust and scalable algorithm that can be readily integrated into existing flow matching pipelines.
https://arxiv.org/abs/2510.05930
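For intuition, here is a minimal sketch of the idea the abstract describes: anisotropic noise drawn from a local neighbourhood covariance. The function name and the brute-force k-NN search are illustrative assumptions, not this PR's implementation (which works on cached latents and uses the paper's estimator):

```python
# Hypothetical sketch of the core CDC-FM idea: replace isotropic Gaussian
# noise with anisotropic noise whose covariance follows the local geometry
# of the latents. Not the PR's actual code.
import numpy as np

def geometry_aware_noise(latents, k_neighbors=8, rng=None):
    """Estimate a local covariance from each latent's k nearest
    neighbours and draw Gaussian noise from it."""
    rng = np.random.default_rng() if rng is None else rng
    n, d = latents.shape
    noise = np.empty_like(latents)
    for i in range(n):
        # Brute-force k nearest neighbours by Euclidean distance.
        dists = np.linalg.norm(latents - latents[i], axis=1)
        idx = np.argsort(dists)[1:k_neighbors + 1]  # skip the point itself
        centered = latents[idx] - latents[idx].mean(axis=0)
        cov = centered.T @ centered / max(len(idx) - 1, 1)
        cov += 1e-6 * np.eye(d)  # keep the covariance positive definite
        noise[i] = rng.multivariate_normal(np.zeros(d), cov)
    return noise
```

The real implementation precomputes and caches the neighbour structure rather than recomputing it per step.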
Note: Written with AI but I guided how it was implemented.
Recommended Configurations:
Single Resolution (e.g., all 512×512):
--use_cdc_fm \
--cdc_k_neighbors 256 \
--cdc_k_bandwidth 8 \
--cdc_d_cdc 8 \
--cdc_gamma 1.0
Multi-Resolution with Bucketing (FLUX/SDXL):
--use_cdc_fm \
--cdc_k_neighbors 256 \
--cdc_adaptive_k \
--cdc_min_bucket_size 16 \
--cdc_k_bandwidth 8 \
--cdc_d_cdc 8 \
--cdc_gamma 0.5
Small Dataset (<1000 images):
--use_cdc_fm \
--cdc_k_neighbors 128 \
--cdc_adaptive_k \
--cdc_min_bucket_size 8 \
--cdc_k_bandwidth 8 \
--cdc_d_cdc 8 \
--cdc_gamma 1.5
Parameter Guide:
--cdc_k_neighbors
- Recommended: 256 (based on paper's CIFAR-10 experiments)
- Small datasets (<1000): 128
- Medium datasets (1000-10k): 256
- Large datasets (>10k): 256-512
- Rule: k = min(256, dataset_size / 4)
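The rule of thumb above can be written as a tiny helper (the function name is hypothetical; it just restates the rule):

```python
def pick_k_neighbors(dataset_size, cap=256):
    # Rule of thumb from above: k = min(256, dataset_size / 4),
    # floored at 1 so tiny datasets still get a valid k.
    return min(cap, max(1, dataset_size // 4))

pick_k_neighbors(500)     # -> 125
pick_k_neighbors(50_000)  # -> 256
```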
--cdc_adaptive_k
- Recommended: Enable for multi-resolution/bucketed training
- Without flag (default): Strict paper methodology - skips buckets with < k_neighbors samples
- With flag: Pragmatic approach - uses `k = min(k_neighbors, bucket_size - 1)` for buckets ≥ min_bucket_size
- When to use:
- Multi-resolution training (FLUX with various aspect ratios)
- Training with bucketing enabled
- Datasets where resolution distribution varies widely
- When not to use:
- Single resolution datasets (all images same size)
- When you want strict adherence to paper's methodology
- Academic/research settings requiring exact paper reproduction
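The two modes can be summarised in one helper (the function name is illustrative; the logic follows the flag semantics described above):

```python
def effective_k(bucket_size, k_neighbors=256, adaptive_k=False, min_bucket_size=16):
    """Return the k used for a bucket, or None when the bucket falls
    back to plain Gaussian noise (no CDC)."""
    if adaptive_k:
        if bucket_size < min_bucket_size:
            return None  # Gaussian fallback for tiny buckets
        return min(k_neighbors, bucket_size - 1)
    # Strict paper methodology: skip buckets with fewer than k_neighbors samples.
    return k_neighbors if bucket_size >= k_neighbors else None

effective_k(40, adaptive_k=True)   # -> 39
effective_k(40, adaptive_k=False)  # -> None (bucket skipped)
```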
--cdc_min_bucket_size
- Recommended: 16 (default)
- Only relevant when `--cdc_adaptive_k` is enabled
- Buckets below this threshold use Gaussian fallback (no CDC)
- Range: 8-32 depending on dataset
- Lower values (8-12): More buckets get CDC, but less stable for very small buckets
- Higher values (24-32): More conservative, only well-populated buckets get CDC
--cdc_k_bandwidth
- Recommended: 8 (paper uses this consistently)
- Don't change unless you have specific reasons
- This determines variable-bandwidth Gaussian kernels
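A common variable-bandwidth construction sets each point's kernel bandwidth to its distance to the k_bandwidth-th nearest neighbour. The sketch below assumes that construction; the PR's exact formula may differ in detail:

```python
import numpy as np

def variable_bandwidth_weights(x, neighbors, k_bandwidth=8):
    """Gaussian kernel weights with a per-point bandwidth equal to the
    distance to the k_bandwidth-th nearest neighbour (illustrative)."""
    dists = np.linalg.norm(neighbors - x, axis=1)
    h = np.sort(dists)[min(k_bandwidth, len(dists)) - 1]  # local bandwidth
    w = np.exp(-(dists / h) ** 2)
    return w / w.sum()  # normalised weights
```

Because the bandwidth adapts to local sampling density, dense regions get narrow kernels and sparse regions get wide ones, which is why the paper can keep k_bandwidth fixed at 8 across datasets.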
--cdc_gamma
- Small datasets (<1000): 1.0-2.0 (stronger regularization)
- Medium datasets (1000-5000): 0.8-1.0
- Large datasets (>5000): 0.5-0.8
- Paper showed γ=2.0 optimal for 250 samples, γ=0.5-1.0 for 2000-5000 samples
--cdc_d_cdc
- Recommended: 8-16 for high-dimensional image data
- Paper tested 2, 4, 8, 16 - found trade-off between quality and generalization
- Higher values capture more geometric structure but may include noise
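One way to read d_cdc and gamma together: keep only the top d_cdc eigen-directions of the local covariance, and scale that geometric component by gamma before combining it with the base noise. This is an illustrative reading under stated assumptions, not the exact CDC-FM formula:

```python
import numpy as np

def low_rank_geometric_noise(cov, d_cdc=8, gamma=1.0, rng=None):
    """Illustrative only: isotropic noise plus a gamma-scaled component
    restricted to the top d_cdc eigen-directions of a local covariance."""
    rng = np.random.default_rng() if rng is None else rng
    vals, vecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    top = slice(-min(d_cdc, cov.shape[0]), None)
    z = rng.normal(size=vecs[:, top].shape[1])
    # Geometric noise in the leading eigen-subspace.
    geo = vecs[:, top] @ (np.sqrt(np.clip(vals[top], 0.0, None)) * z)
    iso = rng.normal(size=cov.shape[0])
    return iso + gamma * geo
```

This makes the trade-off visible: larger d_cdc keeps more eigen-directions (more structure, but the small eigenvalues are noisier estimates), and gamma controls how strongly the geometric component regularises the path.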
Thank you for this! It seems to be effective when the dataset is limited, so it looks very good.
I plan to merge the sd3 branch into main soon, so I'd like to merge this (and a few other PRs) before then.
The issue right now is that we cache the neighbors into a file but save it into the output_dir, which means each run creates a new file. We could:
- Only keep the cache in memory and not write it to a file.
- Allow users to set the cache file location.
I'd usually set it with the dataset, but if multiple subsets are configured there isn't one single place for it.
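The second option could look like this sketch (the `cache_file` argument and the filename are hypothetical, not existing flags):

```python
from pathlib import Path

def resolve_cdc_cache_path(output_dir, cache_file=None):
    """Use an explicit, user-set cache location when given, so repeated
    runs can reuse the same neighbor cache; otherwise fall back to
    output_dir (the current behaviour that creates a new file per run)."""
    if cache_file is not None:
        return Path(cache_file)
    return Path(output_dir) / "cdc_neighbors_cache.npz"  # hypothetical name
```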
@rockerBOO i plan to test this
this is only for flux lora?
Yes only Flux LoRA for the moment