CDC-FM (Carré du Champ Flow Matching) Implementation
Add support for CDC-FM, a geometry-aware noise generation method that improves diffusion model training by adapting noise to the local geometry of the latent space. CDC-FM replaces standard Gaussian noise with geometry-informed noise that better preserves the structure of the data manifold.
Note: Only implemented for Flux network training so far. It can be expanded to other flow matching models such as SD3 and Lumina Image 2.
Deep generative models often face a fundamental tradeoff: high sample quality can come at the cost of memorisation, where the model reproduces training data rather than generalising across the underlying data geometry. We introduce Carré du champ flow matching (CDC-FM), a generalisation of flow matching (FM), that improves the quality-generalisation tradeoff by regularising the probability path with a geometry-aware noise. Our method replaces the homogeneous, isotropic noise in FM with a spatially varying, anisotropic Gaussian noise whose covariance captures the local geometry of the latent data manifold. We prove that this geometric noise can be optimally estimated from the data and is scalable to large data. Further, we provide an extensive experimental evaluation on diverse datasets (synthetic manifolds, point clouds, single-cell genomics, animal motion capture, and images) as well as various neural network architectures (MLPs, CNNs, and transformers). We demonstrate that CDC-FM consistently offers a better quality-generalisation tradeoff. We observe significant improvements over standard FM in data-scarce regimes and in highly non-uniformly sampled datasets, which are often encountered in AI for science applications. Our work provides a mathematical framework for studying the interplay between data geometry, generalisation and memorisation in generative models, as well as a robust and scalable algorithm that can be readily integrated into existing flow matching pipelines.
https://arxiv.org/abs/2510.05930
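For intuition, here is a minimal sketch of the idea the abstract describes: anisotropic noise drawn from a local neighbourhood covariance. The function name and the brute-force k-NN search are illustrative assumptions, not this PR's implementation (which works on cached latents and uses the paper's estimator):

```python
# Hypothetical sketch of the core CDC-FM idea: replace isotropic Gaussian
# noise with anisotropic noise whose covariance follows the local geometry
# of the latents. Not the PR's actual code.
import numpy as np

def geometry_aware_noise(latents, k_neighbors=8, rng=None):
    """Estimate a local covariance from each latent's k nearest
    neighbours and draw Gaussian noise from it."""
    rng = np.random.default_rng() if rng is None else rng
    n, d = latents.shape
    noise = np.empty_like(latents)
    for i in range(n):
        # Brute-force k nearest neighbours by Euclidean distance.
        dists = np.linalg.norm(latents - latents[i], axis=1)
        idx = np.argsort(dists)[1:k_neighbors + 1]  # skip the point itself
        centered = latents[idx] - latents[idx].mean(axis=0)
        cov = centered.T @ centered / max(len(idx) - 1, 1)
        cov += 1e-6 * np.eye(d)  # keep the covariance positive definite
        noise[i] = rng.multivariate_normal(np.zeros(d), cov)
    return noise
```

The real implementation precomputes and caches the neighbour structure rather than recomputing it per step.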
Note: Written with AI but I guided how it was implemented.
Recommended Configurations:
Single Resolution (e.g., all 512×512):
--use_cdc_fm \
--cdc_k_neighbors 256 \
--cdc_k_bandwidth 8 \
--cdc_d_cdc 8 \
--cdc_gamma 1.0
Multi-Resolution with Bucketing (FLUX/SDXL):
--use_cdc_fm \
--cdc_k_neighbors 256 \
--cdc_adaptive_k \
--cdc_min_bucket_size 16 \
--cdc_k_bandwidth 8 \
--cdc_d_cdc 8 \
--cdc_gamma 0.5
Small Dataset (<1000 images):
--use_cdc_fm \
--cdc_k_neighbors 128 \
--cdc_adaptive_k \
--cdc_min_bucket_size 8 \
--cdc_k_bandwidth 8 \
--cdc_d_cdc 8 \
--cdc_gamma 1.5
Parameter Guide:
--cdc_k_neighbors
- Recommended: 256 (based on paper's CIFAR-10 experiments)
- Small datasets (<1000): 128
- Medium datasets (1000-10k): 256
- Large datasets (>10k): 256-512
- Rule: k = min(256, dataset_size / 4)
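The rule of thumb above can be written as a tiny helper (the function name is hypothetical; it just restates the rule):

```python
def pick_k_neighbors(dataset_size, cap=256):
    # Rule of thumb from above: k = min(256, dataset_size / 4),
    # floored at 1 so tiny datasets still get a valid k.
    return min(cap, max(1, dataset_size // 4))

pick_k_neighbors(500)     # -> 125
pick_k_neighbors(50_000)  # -> 256
```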
--cdc_adaptive_k
- Recommended: Enable for multi-resolution/bucketed training
- Without flag (default): Strict paper methodology - skips buckets with < k_neighbors samples
- With flag: Pragmatic approach - uses `k = min(k_neighbors, bucket_size - 1)` for buckets ≥ min_bucket_size
- When to use:
- Multi-resolution training (FLUX with various aspect ratios)
- Training with bucketing enabled
- Datasets where resolution distribution varies widely
- When not to use:
- Single resolution datasets (all images same size)
- When you want strict adherence to paper's methodology
- Academic/research settings requiring exact paper reproduction
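The two modes can be summarised in one helper (the function name is illustrative; the logic follows the flag semantics described above):

```python
def effective_k(bucket_size, k_neighbors=256, adaptive_k=False, min_bucket_size=16):
    """Return the k used for a bucket, or None when the bucket falls
    back to plain Gaussian noise (no CDC)."""
    if adaptive_k:
        if bucket_size < min_bucket_size:
            return None  # Gaussian fallback for tiny buckets
        return min(k_neighbors, bucket_size - 1)
    # Strict paper methodology: skip buckets with fewer than k_neighbors samples.
    return k_neighbors if bucket_size >= k_neighbors else None

effective_k(40, adaptive_k=True)   # -> 39
effective_k(40, adaptive_k=False)  # -> None (bucket skipped)
```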
--cdc_min_bucket_size
- Recommended: 16 (default)
- Only relevant when `--cdc_adaptive_k` is enabled
- Buckets below this threshold use Gaussian fallback (no CDC)
- Range: 8-32 depending on dataset
- Lower values (8-12): More buckets get CDC, but less stable for very small buckets
- Higher values (24-32): More conservative, only well-populated buckets get CDC
--cdc_k_bandwidth
- Recommended: 8 (paper uses this consistently)
- Don't change unless you have specific reasons
- This determines variable-bandwidth Gaussian kernels
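A common variable-bandwidth construction sets each point's kernel bandwidth to its distance to the k_bandwidth-th nearest neighbour. The sketch below assumes that construction; the PR's exact formula may differ in detail:

```python
import numpy as np

def variable_bandwidth_weights(x, neighbors, k_bandwidth=8):
    """Gaussian kernel weights with a per-point bandwidth equal to the
    distance to the k_bandwidth-th nearest neighbour (illustrative)."""
    dists = np.linalg.norm(neighbors - x, axis=1)
    h = np.sort(dists)[min(k_bandwidth, len(dists)) - 1]  # local bandwidth
    w = np.exp(-(dists / h) ** 2)
    return w / w.sum()  # normalised weights
```

Because the bandwidth adapts to local sampling density, dense regions get narrow kernels and sparse regions get wide ones, which is why the paper can keep k_bandwidth fixed at 8 across datasets.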
--cdc_gamma
- Small datasets (<1000): 1.0-2.0 (stronger regularization)
- Medium datasets (1000-5000): 0.8-1.0
- Large datasets (>5000): 0.5-0.8
- Paper showed γ=2.0 optimal for 250 samples, γ=0.5-1.0 for 2000-5000 samples
--cdc_d_cdc
- Recommended: 8-16 for high-dimensional image data
- Paper tested 2, 4, 8, 16 - found trade-off between quality and generalization
- Higher values capture more geometric structure but may include noise
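One way to read d_cdc and gamma together: keep only the top d_cdc eigen-directions of the local covariance, and scale that geometric component by gamma before combining it with the base noise. This is an illustrative reading under stated assumptions, not the exact CDC-FM formula:

```python
import numpy as np

def low_rank_geometric_noise(cov, d_cdc=8, gamma=1.0, rng=None):
    """Illustrative only: isotropic noise plus a gamma-scaled component
    restricted to the top d_cdc eigen-directions of a local covariance."""
    rng = np.random.default_rng() if rng is None else rng
    vals, vecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    top = slice(-min(d_cdc, cov.shape[0]), None)
    z = rng.normal(size=vecs[:, top].shape[1])
    # Geometric noise in the leading eigen-subspace.
    geo = vecs[:, top] @ (np.sqrt(np.clip(vals[top], 0.0, None)) * z)
    iso = rng.normal(size=cov.shape[0])
    return iso + gamma * geo
```

This makes the trade-off visible: larger d_cdc keeps more eigen-directions (more structure, but the small eigenvalues are noisier estimates), and gamma controls how strongly the geometric component regularises the path.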
Thank you for this! It seems to be effective when the dataset is limited, so it looks very good.
I plan to merge the sd3 branch into main soon, so I'd like to merge this (and a few other PRs) before then.
The issue right now is that we cache the neighbors into a file but save it into the output_dir, which means each run creates a new file. We could:
- Only keep the cache in memory and not write it to a file.
- Allow users to set the cache file location.
I'd usually set it with the dataset, but if multiple subsets are configured there isn't one single place for it.
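The second option could look like this sketch (the `cache_file` argument and the filename are hypothetical, not existing flags):

```python
from pathlib import Path

def resolve_cdc_cache_path(output_dir, cache_file=None):
    """Use an explicit, user-set cache location when given, so repeated
    runs can reuse the same neighbor cache; otherwise fall back to
    output_dir (the current behaviour that creates a new file per run)."""
    if cache_file is not None:
        return Path(cache_file)
    return Path(output_dir) / "cdc_neighbors_cache.npz"  # hypothetical name
```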
@rockerBOO i plan to test this
this is only for flux lora?
Yes only Flux LoRA for the moment