Chi Wang

343 comments by Chi Wang

Where is this code snippet from? Could you provide a complete example of how to reproduce the issue?

Could you share the .csv file? A few lines are enough as long as this error can be reproduced. Also, could you let me know the flaml version?

> df = pd.read_csv('higgs.csv')
> X, y = df.drop('class', axis=1), df['class']
> X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=123)
> automl_model = AutoML()
> automl_model.fit(X_train, y_train, task='classification',
> time_budget=300,...

> Thanks for your reply, @sonichi . This worked for the higgs dataset, but I can imagine it might appear again for other datasets. Any plans to fix this in...

> I have seen multiple OpenML datasets where the NaN values are stored as "?". I understand that a blind replacement of "?" with NaN might not be desired in...
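For readers hitting the same problem, one way to handle "?" placeholders before the data reaches AutoML is to declare them as missing values at load time. The sketch below is only an illustration: the inline CSV and its column names are made up, and it relies on the `na_values` parameter of `pandas.read_csv` rather than on anything in FLAML itself.

```python
import io

import pandas as pd

# Hypothetical data standing in for an OpenML-style CSV that uses "?" for missing values.
raw = io.StringIO("a,b,class\n1.0,?,0\n?,2.5,1\n3.0,4.0,0\n")

# Treat "?" as NaN while parsing, so the numeric columns are read as floats
# instead of falling back to the object (string) dtype.
df = pd.read_csv(raw, na_values="?")

n_missing = int(df.isna().sum().sum())
print(df.dtypes)
print("missing cells:", n_missing)
```

Doing the replacement at load time keeps the decision explicit and per-dataset, which avoids a blind replacement of every "?" inside the library.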

The solution would be:

1. https://github.com/microsoft/FLAML/blob/21fa6c10ec6f963edbda99e9020ad585ae60a1a5/flaml/automl.py#L1122 -> `if issparse(X_train_all) or self._skip_transform:`
2. add a keyword argument in both `AutoML.__init__()` and `AutoML.fit()` in automl.py and set `self._skip_transform` to it before https://github.com/microsoft/FLAML/blob/21fa6c10ec6f963edbda99e9020ad585ae60a1a5/flaml/automl.py#L2429...
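The shape of that change can be sketched with a toy class. This is not FLAML code: `ToyAutoML`, `_toy_transform`, and the keyword name `skip_transform` are hypothetical stand-ins, and only the condition `issparse(X_train_all) or self._skip_transform` is taken from the comment above.

```python
from scipy.sparse import csr_matrix, issparse


class ToyAutoML:
    """Hypothetical stand-in illustrating the proposed flag; not the FLAML class."""

    def __init__(self, skip_transform=False):
        # Assumed keyword name; the comment does not say what it should be called.
        self._skip_transform = skip_transform

    def fit(self, X_train_all, skip_transform=None):
        # A value passed to fit() overrides the one given to __init__().
        if skip_transform is not None:
            self._skip_transform = skip_transform
        # Condition quoted from the proposed change: sparse input or an
        # explicit opt-out bypasses the preprocessing step.
        if issparse(X_train_all) or self._skip_transform:
            return X_train_all
        return self._toy_transform(X_train_all)

    @staticmethod
    def _toy_transform(X):
        # Placeholder preprocessing: cast every cell to float.
        return [[float(v) for v in row] for row in X]


rows = [["1", "2"], ["3", "4"]]
print(ToyAutoML().fit(rows))                       # transformed copy
print(ToyAutoML(skip_transform=True).fit(rows))    # returned untouched
print(issparse(ToyAutoML().fit(csr_matrix([[1, 0], [0, 1]]))))  # sparse input is passed through
```

Accepting the flag in both the constructor and `fit()` mirrors the two places named in the comment, with the `fit()` value taking precedence when both are given.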

> Hi @sonichi
>
> Should I create a PR for this change? Happy to give a shot tomorrow.

Yes please. :)

Thanks for the question. As long as the second PR doesn't depend on the first PR, you don't need to wait for the first PR to be merged. You can...

Thanks for the suggestion. Does the "way `Ray-Tune` have it" refer to [this](https://docs.ray.io/en/latest/tune/examples/tune-pytorch-lightning.html)?

> @sonichi yes, as far as I remember, all that is necessary is the lightning callback to report training metrics for pruning.

Good. Since flaml.tune can use ray as the...