[Test] V0.7.0
The following diagrams are used for markdown.
Regression - XGBoost - Manual Hyperparameter Selection - Sany Test
Test Result: Successful
Process Step:
Template -> [Section Name: Option]
- Built-in Training Data Option: 1
- Output Data Identifier Column Selection: 1
- World Map Projection: 1
- Distribution in World Map: 3
- Continue World Map: 2
- Data Selection: [2, 5]
- Missing Values Process: 1
- Strategy for Missing Values: 2
- Imputation Method Option: 3
- Feature Engineering: 1
- Feature Engineering: ABC
- Feature Engineering: b * c - d
- Continue Feature Engineering: 2
- Mode Selection: 1
- Data Segmentation - X Set and Y Set: [2, 4]
- Data Segmentation - X Set and Y Set: 5
- Feature Scaling on X Set: 1
- Feature Scaling on X Set: 3
- Feature Selection on X set: 1
- Feature Selection on X set: 1
- Feature Selection on X set: 2
- Data Split - Train Set and Test Set: 0.3
- Model Selection: 9
- Automated Machine Learning: 2
- XGBoost - Hyper-parameters Specification
- N Estimators: 100
- Learning Rate: 0.5
- Max Depth: 4
- Subsample: 0.4
- Colsample Bytree: 1
- Alpha: 0.8
- Lambda: 0.5
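The prompt values above could be collected into keyword arguments along the following lines. This is a minimal sketch, not Geochemistrypi's own code: the mapping from prompt labels to keyword names (in particular Alpha to `reg_alpha` and Lambda to `reg_lambda`, as in the XGBoost scikit-learn wrapper) and the range checks in `to_kwargs` are assumptions made for illustration.

```python
# Hypothetical mapping from the prompt labels to XGBoost-style keyword names.
PROMPT_TO_KWARG = {
    "N Estimators": "n_estimators",
    "Learning Rate": "learning_rate",
    "Max Depth": "max_depth",
    "Subsample": "subsample",
    "Colsample Bytree": "colsample_bytree",
    "Alpha": "reg_alpha",    # assumption: L1 regularization term
    "Lambda": "reg_lambda",  # assumption: L2 regularization term
}

# Values entered in this test run.
ENTERED = {
    "N Estimators": 100,
    "Learning Rate": 0.5,
    "Max Depth": 4,
    "Subsample": 0.4,
    "Colsample Bytree": 1,
    "Alpha": 0.8,
    "Lambda": 0.5,
}


def to_kwargs(entered):
    """Translate prompt labels to keyword names and run conservative sanity checks."""
    kwargs = {PROMPT_TO_KWARG[label]: value for label, value in entered.items()}
    # Row and column sampling fractions must lie in (0, 1].
    for name in ("subsample", "colsample_bytree"):
        if not 0 < kwargs[name] <= 1:
            raise ValueError(f"{name} must be in (0, 1], got {kwargs[name]}")
    # Conservative check: tree count and depth are positive integers here.
    for name in ("n_estimators", "max_depth"):
        if not (isinstance(kwargs[name], int) and kwargs[name] > 0):
            raise ValueError(f"{name} must be a positive integer, got {kwargs[name]}")
    # Regularization strengths must not be negative.
    for name in ("reg_alpha", "reg_lambda"):
        if kwargs[name] < 0:
            raise ValueError(f"{name} must be >= 0, got {kwargs[name]}")
    return kwargs


if __name__ == "__main__":
    print(to_kwargs(ENTERED))
```

The resulting dictionary could then be unpacked into a regressor constructor; that call is left out so the sketch stays free of third-party dependencies.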
Verification Step:
- Check if the position of ID matches the correct row in Geochemistrypi+ID
- Data under the path .../artifacts/data
- Application Data Feature-Engineering Selected.xlsx ✅
- Application Data Feature-Engineering.xlsx ❌
- Application Data Original.xlsx ✅
- Data Original.xlsx ✅
- Data Selected Dropped-Imputed Feature-Engineering.xlsx ✅
- Data Selected Dropped-Imputed.xlsx ✅
- Data Selected.xlsx ✅
- X Test.xlsx ✅
- X Train.xlsx ✅
- X Without Scaling.xlsx ✅
- Y.xlsx ✅
- Y Train.xlsx ✅
- Y Test.xlsx ✅
- Data under the path .../artifacts/image/map
- Map Projection - Al2O3.xlsx ✅
- Data under the path .../artifacts/image/model_output
- Permutation Importance - Y Test.xlsx ✅
- Predicted vs. Actual Diagram - XGBoost.xlsx ✅
- Residuals Diagram - XGBoost.xlsx ✅
- Data under the path .../artifacts/image/statistic
- Correlation Plot.xlsx ✅
- Distribution Histogram.xlsx ✅
- Check if the rows in Geochemistrypi+ID match the corresponding rows in Geochemistrypi
- Data under the path .../artifacts/data
- Application Data Feature-Engineering Selected.xlsx ✅
- Application Data Feature-Engineering.xlsx ✅
- Application Data Original.xlsx ✅
- Application Data Predicted.xlsx ✅
- Data Original.xlsx ✅
- Data Selected Dropped-Imputed Feature-Engineering.xlsx ✅
- Data Selected Dropped-Imputed.xlsx ✅
- Data Selected.xlsx ✅
- X After Feature Selection.xlsx ✅
- X Test.xlsx ✅
- Y Train.xlsx ✅
- X With Scaling.xlsx ✅
- X Without Scaling.xlsx ✅
- Y Test Predict.xlsx ✅
- Y Test.xlsx ✅
- Y Train Predict.xlsx ✅
- Y Train.xlsx ✅
- Y.xlsx ✅
- Data under the path .../artifacts/image/map
- Map Projection - Al2O3.xlsx ✅
- Data under the path .../artifacts/image/model_output
- Feature Importance - XGBoost.xlsx ✅
- Permutation Importance - X Test.xlsx ✅
- Permutation Importance - Y Test.xlsx ✅
- Predicted vs. Actual Diagram - XGBoost.xlsx ❌
- Residuals Diagram - XGBoost.xlsx ❌
- Data under the path .../artifacts/image/statistic
- Correlation Plot.xlsx ✅
- Distribution Histogram After Log.xlsx ✅
- Distribution Histogram.xlsx ✅
- Probability Plot.xlsx ❌
- Data under the path .../artifacts
- Transform Pipeline Configuration.txt ✅
- Data under the path .../metrics
- Cross Validation - XGBoost.txt ✅
- Model Score - XGBoost.txt ✅
- Data under the path .../parameters
- Hyper Parameters - XGBoost.txt ✅
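The two verification steps above (ID position, then row equality between Geochemistrypi+ID and Geochemistrypi outputs) can be sketched as follows. This is an illustrative helper, not part of the project: it assumes each spreadsheet has already been loaded into a list of row dictionaries (for example with a spreadsheet reader of your choice), and the identifier column name `"ID"` and the sample values are hypothetical.

```python
import math


def _same(a, b, rel_tol=1e-9):
    """Compare two cell values, tolerating tiny floating-point differences."""
    if isinstance(a, (int, float)) and isinstance(b, (int, float)):
        return math.isclose(a, b, rel_tol=rel_tol)
    return a == b


def find_id_misplacements(id_rows, expected_ids, id_column="ID"):
    """Return row indices whose identifier differs from the expected one."""
    if len(id_rows) != len(expected_ids):
        raise ValueError("row count differs from the number of expected IDs")
    return [i for i, (row, expected) in enumerate(zip(id_rows, expected_ids))
            if row.get(id_column) != expected]


def find_row_mismatches(plain_rows, id_rows, id_column="ID"):
    """Return row indices that differ once the identifier column is ignored."""
    if len(plain_rows) != len(id_rows):
        raise ValueError("the two outputs have different row counts")
    mismatches = []
    for index, (plain, with_id) in enumerate(zip(plain_rows, id_rows)):
        stripped = {k: v for k, v in with_id.items() if k != id_column}
        same_keys = stripped.keys() == plain.keys()
        if not same_keys or not all(_same(plain[k], stripped[k]) for k in plain):
            mismatches.append(index)
    return mismatches


if __name__ == "__main__":
    # Toy rows with made-up values, only to show the call shape.
    plain = [{"Al2O3": 15.2, "SiO2": 50.1}, {"Al2O3": 14.8, "SiO2": 52.3}]
    with_id = [{"ID": "s1", **plain[0]}, {"ID": "s2", **plain[1]}]
    print(find_id_misplacements(with_id, ["s1", "s2"]))
    print(find_row_mismatches(plain, with_id))
```

An empty list from either function corresponds to a ✅ entry for that file; any returned index points at a row worth inspecting by hand.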
Regression - Random Forest - Automatic Hyperparameter Tuning - Jianming Test
Test Result: Successful
Process Step:
Template -> [Section Name: Option]
- Built-in Training Data Option: 1
- Output Data Identifier Column Selection: 1
- World Map Projection: 2
- Distribution in World Map: 3
- Continue World Map: 2
- Data Selection: [2, 5]
- Missing Values Process: 1
- Strategy for Missing Values: 2
- Imputation Method Option: 3
- Feature Engineering: 1
- Feature Engineering: ABC
- Feature Engineering: b * c - d
- Continue Feature Engineering: 2
- Mode Selection: 1
- Data Segmentation - X Set and Y Set: [2, 4]
- Data Segmentation - X Set and Y Set: 5
- Feature Scaling on X Set: 1
- Feature Scaling on X Set: 3
- Feature Selection on X set: 1
- Feature Selection on X set: 1
- Feature Selection on X set: 2
- Data Split - Train Set and Test Set: 0.3
- Model Selection: 6
- Automated Machine Learning: 1
- Random Forest - Automatic Hyperparameter Tuning
Verification Step:
- Check if the position of ID matches the correct row in Geochemistrypi+ID
- Data under the path .../artifacts/data
- Application Data Feature-Engineering Selected.xlsx ✅
- Application Data Feature-Engineering.xlsx ✅
- Application Data Original.xlsx ✅
- Application Data Predicted.xlsx ❌
- Data Original.xlsx ✅
- Data Selected Dropped-Imputed Feature-Engineering.xlsx ✅
- Data Selected Dropped-Imputed.xlsx ✅
- Data Selected.xlsx ✅
- X after feature selection.xlsx ✅
- X Test.xlsx ✅
- X Train.xlsx ✅
- X With Scaling.xlsx ✅
- X Without Scaling.xlsx ✅
- Y.xlsx ✅
- Y Train.xlsx ✅
- Y Test.xlsx ✅
- Data under the path .../artifacts/image/map
- Map Projection - Al2O3.xlsx ✅
- Data under the path .../artifacts/image/model_output
- Permutation Importance - Y Test.xlsx ✅
- Predicted vs. Actual Diagram - XGBoost.xlsx ✅
- Residuals Diagram - XGBoost.xlsx ✅
- Data under the path .../artifacts/image/statistic
- Correlation Plot.xlsx ✅
- Distribution Histogram.xlsx ✅
- Check if the rows in Geochemistrypi+ID match the corresponding rows in Geochemistrypi
- Data under the path .../artifacts/data
- Application Data Feature-Engineering Selected.xlsx ✅
- Application Data Feature-Engineering.xlsx ✅
- Application Data Original.xlsx ✅
- Application Data Predicted.xlsx ✅
- Data Original.xlsx ✅
- Data Selected Dropped-Imputed Feature-Engineering.xlsx ✅
- Data Selected Dropped-Imputed.xlsx ✅
- Data Selected.xlsx ✅
- X After Feature Selection.xlsx ✅
- X Test.xlsx ✅
- Y Train.xlsx ✅
- X With Scaling.xlsx ✅
- X Without Scaling.xlsx ✅
- Y Test Predict.xlsx ✅
- Y Test.xlsx ✅
- Y Train Predict.xlsx ✅
- Y Train.xlsx ✅
- Y.xlsx ✅
- Data under the path .../artifacts/image/map
- Map Projection - Al2O3.xlsx ✅
- Data under the path .../artifacts/image/model_output
- Feature Importance - XGBoost.xlsx ✅
- Permutation Importance - X Test.xlsx ✅
- Permutation Importance - Y Test.xlsx ✅
- Predicted vs. Actual Diagram - Random Forest.xlsx ❌
- Residuals Diagram - Random Forest.xlsx ❌
- Data under the path .../artifacts/image/statistic
- Correlation Plot.xlsx ✅
- Distribution Histogram After Log.xlsx ✅
- Distribution Histogram.xlsx ✅
- Probability Plot.xlsx ✅
- Data under the path .../artifacts
- Transform Pipeline Configuration.txt ✅
- Data under the path .../metrics
- Cross Validation - Random Forest.txt ❌
- Model Score - Random Forest.txt ❌
- Data under the path .../parameters
- Hyper Parameters - Random Forest.txt ❌
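Before comparing file contents, it can help to confirm that both runs produced the same set of files under each output path. The following is a minimal sketch using only the standard library; the two root directories passed to `compare_runs` are placeholders for wherever the Geochemistrypi and Geochemistrypi+ID outputs were written, and the function names are hypothetical.

```python
from pathlib import Path


def list_outputs(root):
    """Collect every file below root as a POSIX-style relative path."""
    root = Path(root)
    return {p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()}


def compare_runs(plain_root, id_root):
    """Report files that exist in only one of the two output trees."""
    plain = list_outputs(plain_root)
    with_id = list_outputs(id_root)
    return {
        "only_in_plain": sorted(plain - with_id),
        "only_in_id": sorted(with_id - plain),
    }


if __name__ == "__main__":
    import sys

    # Usage: python compare_runs.py <plain_output_dir> <id_output_dir>
    if len(sys.argv) == 3:
        report = compare_runs(sys.argv[1], sys.argv[2])
        for side, names in report.items():
            print(side, names)
```

Files listed under either key are candidates for a ❌ entry, since they have no counterpart to compare against.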
Clustering - DBSCAN - Manual Hyperparameter Selection - PanyanWeng Test
Test Result: Successful
Process Step:
Template -> [Section Name: Option]
- Built-in Training Data Option: 3
- Output Data Identifier Column Selection: 1
- World Map Projection: 2
- Data Selection: [2, 7]
- Missing Values Process: 1
- Strategy for Missing Values: 2
- Imputation Method Option: 1
- Feature Engineering: 2
- Mode Selection: 3
- Feature Scaling on X Set: 1
- Feature Scaling on X Set: 1
- Model Selection: 2
- DBSCAN - Hyper-parameters Specification
- Eps: 0.5
- Min Samples: 5
- Algorithm: 1
- Metric: 1
- Leaf Size: 30
- 2 Dimensions Data Selection: 1
- 2 Dimensions Data Selection: 2
- 3 Dimensions Data Selection: 3
- 3 Dimensions Data Selection: 4
- 3 Dimensions Data Selection: 5
Verification Step:
Data under the path .../artifacts/data
- Cluster Labels - DBSCAN.xlsx: ✅
- Application Data Feature-Engineering.xlsx: ✅
- Application Data Original.xlsx: ✅
- Data Original.xlsx: ✅
- Data Selected Dropped-Imputed Feature-Engineering.xlsx: ✅
- Data Selected Dropped-Imputed.xlsx: ✅
- Data Selected.xlsx: ✅
- X With Scaling.xlsx: ✅
Data under the path .../artifacts/image/model_output
- Cluster Three-Dimensional Diagram - DBSCAN.xlsx: ✅
- Cluster Two-Dimensional Diagram - DBSCAN.xlsx: ✅
- Silhouette Diagram - Cluster Centers.xlsx: ❌
- Silhouette Diagram - Data With Labels.xlsx: ✅
- Silhouette value Diagram - Data With Labels.xlsx: ✅
Data under the path .../artifacts/image/statistic
- Correlation Plot.xlsx: ✅
- Distribution Histogram After Log Transformation.xlsx: ✅
- Distribution Histogram.xlsx: ✅
- Probability Plot.xlsx: ❌
Check if rows in Geochemistrypi+ID match the corresponding rows in Geochemistrypi
Data under the path .../artifacts/data
- Cluster Labels - DBSCAN.xlsx: ✅
- Application Data Feature-Engineering.xlsx: ✅
- Application Data Original.xlsx: ✅
- Data Original.xlsx: ✅
- Data Selected Dropped-Imputed Feature-Engineering.xlsx: ✅
- Data Selected Dropped-Imputed.xlsx: ✅
- Data Selected.xlsx: ✅
- X With Scaling.xlsx: ✅
Data under the path .../artifacts/image/model_output
- Cluster Three-Dimensional Diagram - DBSCAN.xlsx: ✅
- Cluster Two-Dimensional Diagram - DBSCAN.xlsx: ✅
- Silhouette Diagram - Cluster Centers.xlsx: ❌
- Silhouette Diagram - Data With Labels.xlsx: ✅
- Silhouette value Diagram - Data With Labels.xlsx: ✅
Data under the path .../artifacts/image/statistic
- Correlation Plot.xlsx: ✅
- Distribution Histogram After Log Transformation.xlsx: ✅
- Distribution Histogram.xlsx: ✅
- Probability Plot.xlsx: ❌
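Because a run can be marked Successful while individual entries still carry ❌, it may be useful to tally the marks. The snippet below is a small sketch for doing so; `parse_entry` and `failed_entries` are hypothetical helpers, and the sample lines are copied from the checklist above.

```python
def parse_entry(line):
    """Split a checklist line into (name, passed); return None for other lines."""
    text = line.strip()
    if text.startswith("- "):
        text = text[2:]
    for mark, passed in (("✅", True), ("❌", False)):
        if text.endswith(mark):
            # Drop the mark, then any trailing colon used as a separator.
            name = text[: -len(mark)].rstrip().rstrip(":").rstrip()
            return name, passed
    return None


def failed_entries(lines):
    """Return the names of all entries marked with ❌."""
    parsed = (parse_entry(line) for line in lines)
    return [entry[0] for entry in parsed if entry is not None and not entry[1]]


if __name__ == "__main__":
    sample = [
        "Data under the path .../artifacts/image/model_output",
        "- Cluster Two-Dimensional Diagram - DBSCAN.xlsx: ✅",
        "- Silhouette Diagram - Cluster Centers.xlsx: ❌",
        "- Probability Plot.xlsx: ❌",
    ]
    print(failed_entries(sample))
```

Path header lines carry no mark, so they are skipped rather than counted as failures.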
Anomaly Detection - Isolation Forest - Manual Hyperparameter Selection - Haibin Lai Test
Test Result: Successful
Process Step:
Template -> [Section Name: Option]
- Built-in Training Data Option: 5
- Output Data Identifier Column Selection: 1
- World Map Projection: 1
- Distribution in World Map: 4
- Continue World Map: 2
- Data Selection: [2, 5]
- Missing Values Process: 1
- Strategy for Missing Values: 2
- Imputation Method Option: 2
- Feature Engineering: 1
- Feature Engineering: ABD
- Feature Engineering: a + b * d
- Continue Feature Engineering: 2
- Mode Selection: 5
- Feature Selection on X set: 1
- Feature Selection strategy on X set: 2
- Model Selection: 1
- Isolation Forest - Hyper-parameters Specification
- N Estimators: 100
- contamination of the data set: 0.3
- Max Features: 3
- Bootstrap: 1
- Max Samples: 64
- 2 Dimensions Data Selection: 2
- 2 Dimensions Data Selection: 3
- 3 Dimensions Data Selection: 1
- 3 Dimensions Data Selection: 3
- 3 Dimensions Data Selection: 4
Verification Step:
- Check if the position of ID matches the correct row in Geochemistrypi+ID
- Data under the path .../artifacts/data
- Data Original.xlsx ✅
- Data Selected.xlsx ✅
- Data Selected Dropped-Imputed.xlsx ✅
- Data Selected Dropped-Imputed Feature-Engineering.xlsx ✅
- X Abnormal.xlsx ✅ (⚠️ For GeochemistryPI, it's X Anomaly.xlsx with label is_anomaly; for GeochemistryPI + ID, it's X Abnormal.xlsx with label is_abnormal)
- X Abnormal Detection.xlsx ✅ (⚠️ For GeochemistryPI, it's X Anomaly Detection.xlsx with label is_anomaly; for GeochemistryPI + ID, it's X Abnormal Detection.xlsx with label is_abnormal)
- X Normal.xlsx ✅
- X With Scaling.xlsx ✅
- Data under the path .../artifacts/image/map
- Map Projection - SiO2.xlsx ✅
- Data under the path .../artifacts/image/model_output
- Anomaly Detection Density Estimation - Isolation Forest.xlsx ✅
- Anomaly Detection Three-Dimensional Diagram - Isolation Forest.xlsx ✅
- Anomaly Detection Two-Dimensional Diagram - Isolation Forest.xlsx ✅
- Data under the path .../artifacts/image/statistic
- Correlation Plot.xlsx ✅
- Distribution Histogram After Log Transformation.xlsx ✅
- Distribution Histogram.xlsx ✅
- Probability Plot.xlsx ✅
- Data under the path .../artifacts
- Transform Pipeline Configuration.txt ✅
- Data under the path .../metrics No Output
- Data under the path .../parameters
- Hyper Parameters - Isolation Forest.txt ✅
- Check if the rows in Geochemistrypi+ID match the corresponding rows in Geochemistrypi
- Data under the path .../artifacts/data
- Data Original.xlsx ✅
- Data Selected.xlsx ✅
- Data Selected Dropped-Imputed.xlsx ✅
- Data Selected Dropped-Imputed Feature-Engineering.xlsx ✅
- X Abnormal.xlsx ✅ (⚠️ For GeochemistryPI, it's X Anomaly.xlsx with label is_anomaly; for GeochemistryPI + ID, it's X Abnormal.xlsx with label is_abnormal)
- X Abnormal Detection.xlsx ✅ (⚠️ For GeochemistryPI, it's X Anomaly Detection.xlsx with label is_anomaly; for GeochemistryPI + ID, it's X Abnormal Detection.xlsx with label is_abnormal)
- X Normal.xlsx ✅
- X With Scaling.xlsx ✅
- Data under the path .../artifacts/image/map
- Map Projection - SiO2.xlsx ✅
- Data under the path .../artifacts/image/model_output
- Anomaly Detection Density Estimation - Isolation Forest.xlsx ✅
- Anomaly Detection Three-Dimensional Diagram - Isolation Forest.xlsx ✅
- Anomaly Detection Two-Dimensional Diagram - Isolation Forest.xlsx ✅
- Data under the path .../artifacts/image/statistic
- Correlation Plot.xlsx ✅
- Distribution Histogram After Log Transformation.xlsx ✅
- Distribution Histogram.xlsx ✅
- Probability Plot.xlsx ✅
- Data under the path .../artifacts
- Transform Pipeline Configuration.txt ✅
- Data under the path .../metrics No Output
- Data under the path .../parameters
- Hyper Parameters - Isolation Forest.txt ✅
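The ⚠️ notes in this test record that the two versions name the anomaly outputs differently, so a direct comparison needs the names aligned first. A minimal sketch of that alignment is shown below; the file and label names come from the notes above, while the helper names and the row-dictionary representation are assumptions for illustration.

```python
# Naming used by GeochemistryPI -> naming used by GeochemistryPI + ID,
# as recorded in the warning notes of this checklist.
FILE_RENAMES = {
    "X Anomaly.xlsx": "X Abnormal.xlsx",
    "X Anomaly Detection.xlsx": "X Abnormal Detection.xlsx",
}
LABEL_RENAMES = {"is_anomaly": "is_abnormal"}


def normalize_filename(name):
    """Map a GeochemistryPI file name to its GeochemistryPI + ID counterpart."""
    return FILE_RENAMES.get(name, name)


def normalize_row(row):
    """Rename label columns in one row so both versions share the same keys."""
    return {LABEL_RENAMES.get(key, key): value for key, value in row.items()}


if __name__ == "__main__":
    print(normalize_filename("X Anomaly.xlsx"))
    print(normalize_filename("X Normal.xlsx"))
    # Toy row with made-up values, only to show the renaming.
    print(normalize_row({"SiO2": 50.1, "is_anomaly": 1}))
```

Names without an entry in either mapping pass through unchanged, so the same helpers can be applied to every file in the listing.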
Anomaly Detection - Isolation Forest and Local Outlier Factor - Manual Hyperparameter Selection - Haibin Lai Test
Test Result: Successful
Process Step:
Template -> [Section Name: Option]
- Built-in Training Data Option: 5
- Output Data Identifier Column Selection: 1
- World Map Projection: 1
- Distribution in World Map: 4
- Continue World Map: 2
- Data Selection: [2, 5]
- Missing Values Process: 1
- Strategy for Missing Values: 2
- Imputation Method Option: 1
- Feature Engineering: 2
- Mode Selection: 5
- Feature Selection on X set: 1
- Feature Selection strategy on X set: 1
- Model Selection: 3
- Isolation Forest - Hyper-parameters Specification
- N Estimators: 100
- contamination of the data set: 0.3
- Max Features: 3
- Bootstrap: 1
- Max Samples: 64
- 2 Dimensions Data Selection: 2
- 2 Dimensions Data Selection: 3
- 3 Dimensions Data Selection: 1
- 3 Dimensions Data Selection: 3
- 3 Dimensions Data Selection: 4
- Local Outlier Factor - Hyper-parameters Specification
- N Neighbors: 20
- Leaf Size: 30
- P: 2
- Contamination: 0.3
- N Jobs: 1
- 2 Dimensions Data Selection: 2
- 2 Dimensions Data Selection: 3
- 3 Dimensions Data Selection: 1
- 3 Dimensions Data Selection: 3
- 3 Dimensions Data Selection: 4
Verification Step:
- Check if the position of ID matches the correct row in Geochemistrypi+ID
- Data under the path .../artifacts/data
- Data Original.xlsx ✅
- Data Selected.xlsx ✅
- Data Selected Dropped-Imputed.xlsx ✅
- Data Selected Dropped-Imputed Feature-Engineering.xlsx ✅
- X With Scaling.xlsx ✅
- Data under the path .../artifacts/image/map
- Map Projection - SiO2.xlsx ✅
- Data under the path .../artifacts/image/model_output
- No Output file
- Data under the path .../artifacts/image/statistic
- Correlation Plot.xlsx ✅
- Distribution Histogram After Log Transformation.xlsx ✅
- Distribution Histogram.xlsx ✅
- Probability Plot.xlsx ✅
- Data under the path .../artifacts
- Transform Pipeline Configuration.txt ✅
- Data under the path .../metrics
- No Output file ✅
- Data under the path .../parameters
- No Output file ✅
- Data under the path .../summary
- No Output file ✅
- Data under the path .../Isolation Forest/artifacts/data
- X Abnormal.xlsx ✅
- X Abnormal Detection.xlsx ✅
- X Normal.xlsx ✅
- Data under the path .../Isolation Forest/artifacts/
- Transform Pipeline Configuration.txt ✅
- Data under the path .../Isolation Forest/parameters/
- Hyper Parameters - Isolation Forest.txt ✅
- Data under the path .../Isolation Forest/artifacts/image/map
- No Output file ✅
- Data under the path .../Isolation Forest/artifacts/image/static
- No Output file ✅
- Data under the path .../Isolation Forest/artifacts/image/model_output
- Anomaly Detection Density Estimation - Isolation Forest.xlsx ✅
- Anomaly Detection Three-Dimensional Diagram - Isolation Forest.xlsx ✅
- Anomaly Detection Two-Dimensional Diagram - Isolation Forest.xlsx ✅
- Data under the path .../Local Outlier Factor/metrics
- No Output file ✅
- Data under the path .../Local Outlier Factor/parameters
- No Output file ✅
- Data under the path .../Local Outlier Factor/summary
- No Output file ✅
- Data under the path .../Local Outlier Factor/artifacts/data
- X Abnormal.xlsx ✅
- X Abnormal Detection.xlsx ✅
- X Normal.xlsx ✅
- Data under the path .../Local Outlier Factor/artifacts/image/model_output
- Anomaly Detection Density Estimation - Local Outlier Factor.xlsx ✅
- Anomaly Detection Three-Dimensional Diagram - Local Outlier Factor.xlsx ✅
- Anomaly Detection Two-Dimensional Diagram - Local Outlier Factor.xlsx ✅
- Lof Score Diagram - Local Outlier Factor ✅
- Data under the path .../Local Outlier Factor/artifacts/image
- Transform Pipeline Configuration.txt ✅
- Check if the rows in Geochemistrypi+ID match the corresponding rows in Geochemistrypi
- Data under the path .../artifacts/data
- Data Original.xlsx ✅
- Data Selected.xlsx ✅
- Data Selected Dropped-Imputed.xlsx ✅
- Data Selected Dropped-Imputed Feature-Engineering.xlsx ✅
- X With Scaling.xlsx ✅
- Data under the path .../artifacts/image/map
- Map Projection - SiO2.xlsx ✅
- Data under the path .../artifacts/image/model_output
- No Output file
- Data under the path .../artifacts/image/statistic
- Correlation Plot.xlsx ✅
- Distribution Histogram After Log Transformation.xlsx ✅
- Distribution Histogram.xlsx ✅
- Probability Plot.xlsx ✅
- Data under the path .../artifacts
- Transform Pipeline Configuration.txt ✅
- Data under the path .../metrics
- No Output file ✅
- Data under the path .../parameters
- No Output file ✅
- Data under the path .../summary
- No Output file ✅
- Data under the path .../Isolation Forest/artifacts/data
- X Abnormal.xlsx ✅
- X Abnormal Detection.xlsx ✅
- X Normal.xlsx ✅
- Data under the path .../Isolation Forest/artifacts/
- Transform Pipeline Configuration.txt ✅
- Data under the path .../Isolation Forest/parameters/
- Hyper Parameters - Isolation Forest.txt ✅
- Data under the path .../Isolation Forest/artifacts/image/map
- No Output file ✅
- Data under the path .../Isolation Forest/artifacts/image/static
- No Output file ✅
- Data under the path .../Isolation Forest/artifacts/image/model_output
- Anomaly Detection Density Estimation - Isolation Forest.xlsx ✅
- Anomaly Detection Three-Dimensional Diagram - Isolation Forest.xlsx ✅
- Anomaly Detection Two-Dimensional Diagram - Isolation Forest.xlsx ✅
- Data under the path .../Local Outlier Factor/metrics
- No Output file ✅
- Data under the path .../Local Outlier Factor/parameters
- No Output file ✅
- Data under the path .../Local Outlier Factor/summary
- No Output file ✅
- Data under the path .../Local Outlier Factor/artifacts/data
- X Abnormal.xlsx ✅
- X Abnormal Detection.xlsx ✅
- X Normal.xlsx ✅
- Data under the path .../Local Outlier Factor/artifacts/image/model_output
- Anomaly Detection Density Estimation - Local Outlier Factor.xlsx ✅
- Anomaly Detection Three-Dimensional Diagram - Local Outlier Factor.xlsx ✅
- Anomaly Detection Two-Dimensional Diagram - Local Outlier Factor.xlsx ✅
- Lof Score Diagram - Local Outlier Factor ❌
- Data under the path .../Local Outlier Factor/artifacts/image
- Transform Pipeline Configuration.txt ✅