CTGAN
Avoid generating the conditional column
Environment details
- CTGAN version: 0.7.1 (latest)
- Python version: 3.10.11
- Operating System: Mac/Unix
Problem description
I want to generate data conditionally, but I don't want to include the conditioned column in the output of the generator.
What I already tried
Currently, I just trim this column from the output. Intuitively, this is wasteful: the network is larger (and therefore slower) than it needs to be, and the saved model is larger too.
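For reference, the trimming workaround can be sketched as a small wrapper. This assumes `model` is an already fitted `ctgan.CTGAN` instance, that it was fitted on a pandas DataFrame (so `sample` returns a DataFrame), and that `sample` accepts `condition_column` and `condition_value`. The helper name `sample_without_condition` is hypothetical.

```python
def sample_without_condition(model, n, column, value):
    """Sample n rows conditioned on `column == value`, then drop that column.

    Assumes `model.sample` returns a pandas DataFrame.
    """
    rows = model.sample(n, condition_column=column, condition_value=value)
    # The generator still produced the one-hot block for `column`;
    # dropping it here only hides it from the caller.
    return rows.drop(columns=[column])
```

For example, `sample_without_condition(model, 1000, "hospital", "Hospital A")` would return only the remaining columns. Nothing is saved inside the network, which is exactly the waste described above.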
Example:
Consider data with two columns: hospital name and patient's age. Let's assume that there are 100 different hospitals, and my sole use of the generative model is to generate new rows for a given hospital. Currently, the model will create 101 input features: 100 one-hot features (for hospital names) and one continuous feature (for age).
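The arithmetic in this example can be written out as a rough sketch. It follows the simplified counting used above (one feature per continuous column); as far as I understand, CTGAN's actual encoding of continuous columns uses mode-specific normalization and adds extra features per column, so the real numbers would be somewhat larger. The function name `encoded_width` is hypothetical.

```python
def encoded_width(n_categories, n_continuous, emit_condition):
    """Width of one encoded row under the simplified counting in the example."""
    # One-hot block for the conditioned column, present only if it is generated.
    one_hot = n_categories if emit_condition else 0
    return one_hot + n_continuous

current = encoded_width(100, 1, emit_condition=True)     # hospital + age
requested = encoded_width(100, 1, emit_condition=False)  # age only
print(current, requested)  # prints "101 1"
```

Under this counting, skipping the conditioned column would shrink the encoded row from 101 features to 1. The 100-wide condition vector would still be needed as an input to tell the generator which hospital to sample for.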