Jiting Xu
Implement `to_pytorch()` for modeling in PyTorch
background: > instead of a full SQL translation of the one-hot encoding algorithm, he was envisioning more of a backend-registered function, which will probably be more performant. > That...
Preprocess data with machine learning models before feeding it into model training. Examples:
- model-based imputing
  - [KNN imputer](https://scikit-learn.org/stable/modules/generated/sklearn.impute.KNNImputer.html)
- model-based outlier detection
  - [tidy.outliers](https://github.com/brunocarlin/tidy.outliers)
- model based...
Transform processed data back to the original feature space. Note that some transformations are not invertible, so the original values cannot always be recovered. Reference:
- [minMax inverse_transform](https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.MinMaxScaler.html#sklearn.preprocessing.MinMaxScaler.inverse_transform)
- [tidymodels - step_inverse()](https://recipes.tidymodels.org/reference/step_inverse.html)
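As an illustration of what an inverse transform involves, here is a minimal pure-Python sketch of min-max scaling and its inverse. The names `MinMaxParams`, `fit_min_max`, `transform`, and `inverse_transform` are hypothetical and are not part of ibisml; the point is only that the inverse needs the statistics learned at fit time.

```python
from dataclasses import dataclass


@dataclass
class MinMaxParams:
    """Statistics learned at fit time (hypothetical container)."""
    lo: float
    hi: float


def fit_min_max(values):
    """Record the minimum and maximum of the training values."""
    return MinMaxParams(lo=min(values), hi=max(values))


def transform(values, params):
    """Scale values into [0, 1] using the fitted min and max."""
    span = params.hi - params.lo
    if span == 0:
        # Constant column: map everything to 0.0 to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - params.lo) / span for v in values]


def inverse_transform(scaled, params):
    """Map scaled values back to the original feature space."""
    span = params.hi - params.lo
    return [s * span + params.lo for s in scaled]
```

Scaling is invertible because it is a one-to-one mapping. A lossy step such as discretizing values into bins has no exact inverse, which is the case the note above refers to.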
The current `transform_table` throws `AttributeError` when the step is not fitted.
```python
import ibis
import ibisml as ml

t = ibis.memtable({"a": [1, 2, 3], "b": [2, 3, 4], "c": [3, 4, 5]})
step = ml.CountEncode(ml.string())
# Calling transform_table before fitting raises AttributeError
step.transform_table(t)
```
...
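One possible way to surface a clearer message is an explicit fitted check before transforming. The sketch below uses a toy stand-in class, not ibisml's `CountEncode`; `NotFittedError`, `CountEncodeSketch`, and `is_fitted()` are hypothetical names chosen for illustration.

```python
class NotFittedError(AttributeError):
    """Hypothetical error type.

    Subclassing AttributeError means existing `except AttributeError`
    handlers would keep working.
    """


class CountEncodeSketch:
    """Toy stand-in for a count-encoding step (not the ibisml class)."""

    def __init__(self, column):
        self.column = column
        self.counts_ = None  # populated by fit()

    def is_fitted(self):
        return self.counts_ is not None

    def fit(self, rows):
        # Count how often each category appears in the training rows.
        counts = {}
        for row in rows:
            key = row[self.column]
            counts[key] = counts.get(key, 0) + 1
        self.counts_ = counts
        return self

    def transform(self, rows):
        # Fail early with an actionable message instead of a bare AttributeError.
        if not self.is_fitted():
            raise NotFittedError(
                f"{type(self).__name__} is not fitted; call fit() before transform()"
            )
        # Unseen categories are encoded as 0.
        return [
            {**row, self.column: self.counts_.get(row[self.column], 0)}
            for row in rows
        ]
```

With this guard, calling `transform()` on an unfitted step tells the user what to do next rather than pointing at a missing attribute.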
At present, we haven't quite cracked the code on efficiently running ML models on a SQL backend. One approach to transitioning the preprocessing pipeline involves saving it in a readable...
The current `to_numpy()` in ibisml goes through a pandas DataFrame: it converts the Ibis table to a pandas DataFrame, then to NumPy. This is inefficient. Some backends, like DuckDB, could directly[...
## Definition
Impute or cap/floor the outliers of numeric features by percentile or a user-defined threshold.

Examples: Apply caps and floors to each column; if a value is greater than...
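A minimal sketch of percentile-based capping and flooring, in pure Python. The function names, the linear-interpolation percentile, and the default 5th/95th percentile bounds are assumptions made for illustration, not a proposed ibisml API.

```python
def percentile(values, q):
    """Percentile for q in [0, 100], using linear interpolation."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * frac


def cap_floor(values, lower_q=5, upper_q=95):
    """Clamp each value to the [lower_q, upper_q] percentile range."""
    floor = percentile(values, lower_q)
    cap = percentile(values, upper_q)
    # Values below the floor are raised to it; values above the cap are lowered to it.
    return [min(max(v, floor), cap) for v in values]
```

For a user-defined threshold, the same clamp applies with `floor` and `cap` passed in directly instead of being computed from the data.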
Building upon the deliverables outlined in [issue #19](https://github.com/ibis-project/ibisml/issues/19), the objective is to enhance the coverage of ibisml machine learning preprocessing transformations, prioritizing key areas for improvement. **Please share your favorite...
Use an LM to write a human-readable response based on the question and the returned data.

Resolves: https://github.com/ibis-project/ibis-birdbrain/issues/55

```
bot = Bot(con=con, lm_response=True)
bot("what is max upb")
```