[Test] V0.7.0

Open SanyHe opened this issue 1 year ago • 5 comments

SanyHe avatar Nov 05 '24 17:11 SanyHe

The following diagrams are used for Markdown.

image image

SanyHe avatar Nov 05 '24 17:11 SanyHe

Regression - XGBoost - Manual Hyperparameter Selection - Sany Test

Test Result: Successful

Process Step:

Template -> [Section Name: Option]

  • Built-in Training Data Option: 1
  • Output Data Identifier Column Selection: 1
  • World Map Projection: 1
  • Distribution in World Map: 3
  • Continue World Map: 2
  • Data Selection: [2, 5]
  • Missing Values Process: 1
  • Strategy for Missing Values: 2
  • Imputation Method Option: 3
  • Feature Engineering: 1
  • Feature Engineering: ABC
  • Feature Engineering: b * c - d
  • Continue Feature Engineering: 2
  • Mode Selection: 1
  • Data Segmentation - X Set and Y Set: [2, 4]
  • Data Segmentation - X Set and Y Set: 5
  • Feature Scaling on X Set: 1
  • Feature Scaling on X Set: 3
  • Feature Selection on X set: 1
  • Feature Selection on X set: 1
  • Feature Selection on X set: 2
  • Data Split - Train Set and Test Set: 0.3
  • Model Selection: 9
  • Automated Machine Learning: 2
  • XGBoost - Hyper-parameters Specification
    • N Estimators: 100
    • Learning Rate: 0.5
    • Max Depth: 4
    • Subsample: 0.4
    • Colsample Bytree: 1
    • Alpha: 0.8
    • Lambda: 0.5
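As a rough illustration of what the manual specification above amounts to, the sketch below maps the prompt labels to keyword names of `xgboost.XGBRegressor`. The mapping is an assumption made for this example (in particular, that Alpha and Lambda correspond to `reg_alpha` and `reg_lambda`); it is not taken from the Geochemistrypi source.

```python
# Hypothetical mapping from the prompt labels in this log to keyword
# names accepted by xgboost.XGBRegressor (an assumption, not project code).
LABEL_TO_KWARG = {
    "N Estimators": "n_estimators",
    "Learning Rate": "learning_rate",
    "Max Depth": "max_depth",
    "Subsample": "subsample",
    "Colsample Bytree": "colsample_bytree",
    "Alpha": "reg_alpha",    # L1 regularization term
    "Lambda": "reg_lambda",  # L2 regularization term
}

# Values exactly as entered in the test run above.
spec = {
    "N Estimators": 100,
    "Learning Rate": 0.5,
    "Max Depth": 4,
    "Subsample": 0.4,
    "Colsample Bytree": 1,
    "Alpha": 0.8,
    "Lambda": 0.5,
}


def to_kwargs(spec):
    """Translate prompt labels into keyword arguments."""
    return {LABEL_TO_KWARG[label]: value for label, value in spec.items()}


xgb_kwargs = to_kwargs(spec)
# With xgboost installed, the equivalent estimator could be built as:
#   model = xgboost.XGBRegressor(**xgb_kwargs)
print(xgb_kwargs)
```

Keeping the values in a plain dictionary makes it easy to diff the parameters of one test run against `Hyper Parameters - XGBoost.txt`.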

Verification Step:

  1. Check if the position of ID matches the correct row in Geochemistrypi+ID
  • Data under the path .../artifacts/data
    • Application Data Feature-Engineering Selected.xlsx ✅
    • Application Data Feature-Engineering.xlsx ❌
    • Application Data Original.xlsx ✅
    • Data Original.xlsx ✅
    • Data Selected Dropped-Imputed Feature-Engineering.xlsx ✅
    • Data Selected Dropped-Imputed.xlsx ✅
    • Data Selected.xlsx ✅
    • X Test.xlsx ✅
    • X Train.xlsx ✅
    • X Without Scaling.xlsx ✅
    • Y.xlsx ✅
    • Y Train.xlsx ✅
    • Y Test.xlsx ✅
  • Data under the path .../artifacts/image/map
    • Map Projection - Al2O3.xlsx ✅
  • Data under the path .../artifacts/image/model_output
    • Permutation Importance - Y Test.xlsx ✅
    • Predicted vs. Actual Diagram - XGBoost.xlsx ✅
    • Residuals Diagram - XGBoost.xlsx ✅
  • Data under the path .../artifacts/image/statistic
    • Correlation Plot.xlsx ✅
    • Distribution Histogram.xlsx ✅
  2. Check if the rows in Geochemistrypi+ID match the corresponding rows in Geochemistrypi
  • Data under the path .../artifacts/data
    • Application Data Feature-Engineering Selected.xlsx ✅
    • Application Data Feature-Engineering.xlsx ✅
    • Application Data Original.xlsx ✅
    • Application Data Predicted.xlsx ✅
    • Data Original.xlsx ✅
    • Data Selected Dropped-Imputed Feature-Engineering.xlsx ✅
    • Data Selected Dropped-Imputed.xlsx ✅
    • Data Selected.xlsx ✅
    • X After Feature Selection.xlsx ✅
    • X Test.xlsx ✅
    • Y Train.xlsx ✅
    • X With Scaling.xlsx ✅
    • X Without Scaling.xlsx ✅
    • Y Test Predict.xlsx ✅
    • Y Test.xlsx ✅
    • Y Train Predict.xlsx ✅
    • Y Train.xlsx ✅
    • Y.xlsx ✅
  • Data under the path .../artifacts/image/map
    • Map Projection - Al2O3.xlsx ✅
  • Data under the path .../artifacts/image/model_output
    • Feature Importance - XGBoost.xlsx ✅
    • Permutation Importance - X Test.xlsx ✅
    • Permutation Importance - Y Test.xlsx ✅
    • Predicted vs. Actual Diagram - XGBoost.xlsx ❌
    • Residuals Diagram - XGBoost.xlsx ❌
  • Data under the path .../artifacts/image/statistic
    • Correlation Plot.xlsx ✅
    • Distribution Histogram After Log.xlsx ✅
    • Distribution Histogram.xlsx ✅
    • Probability Plot.xlsx ❌
  • Data under the path .../artifacts
    • Transform Pipeline Configuration.txt ✅
  • Data under the path .../metrics
    • Cross Validation - XGBoost.txt ✅
    • Model Score - XGBoost.txt ✅
  • Data under the path .../parameters
    • Hyper Parameters - XGBoost.txt ✅
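The two checks above were done by hand. A minimal sketch of one way to automate them is shown below, assuming each sheet has already been loaded into a list of rows (header first) and that the identifier column is called `ID`. Both the column name and the helper names are hypothetical.

```python
def split_id(table, id_column="ID"):
    """Return (ids, table_without_id) for a table given as header + rows."""
    header, *rows = table
    k = header.index(id_column)
    ids = [row[k] for row in rows]
    stripped = [header[:k] + header[k + 1:]]
    stripped += [row[:k] + row[k + 1:] for row in rows]
    return ids, stripped


def check_outputs(with_id, without_id, expected_ids, id_column="ID"):
    """Run both checks on one pair of output files."""
    ids, stripped = split_id(with_id, id_column)
    return {
        # Check 1: every ID sits on the row it is expected to label.
        "id_position_ok": ids == expected_ids,
        # Check 2: apart from the ID column, the two outputs are identical.
        "rows_match": stripped == without_id,
    }


# Toy data standing in for one .xlsx file from each run.
with_id = [["ID", "SiO2", "Al2O3"], ["s1", 50.1, 14.2], ["s2", 48.7, 15.0]]
without_id = [["SiO2", "Al2O3"], [50.1, 14.2], [48.7, 15.0]]

result = check_outputs(with_id, without_id, expected_ids=["s1", "s2"])
print(result)
```

The comparison here is exact; for floating-point columns written by two separate runs, a tolerance-based comparison may be more appropriate.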

SanyHe avatar Nov 06 '24 05:11 SanyHe

Regression - Random Forest - Automatic Hyperparameter Tuning - Jianming Test

Test Result: Successful

Process Step:

Template -> [Section Name: Option]

  • Built-in Training Data Option: 1
  • Output Data Identifier Column Selection: 1
  • World Map Projection: 2
  • Distribution in World Map: 3
  • Continue World Map: 2
  • Data Selection: [2, 5]
  • Missing Values Process: 1
  • Strategy for Missing Values: 2
  • Imputation Method Option: 3
  • Feature Engineering: 1
  • Feature Engineering: ABC
  • Feature Engineering: b * c - d
  • Continue Feature Engineering: 2
  • Mode Selection: 1
  • Data Segmentation - X Set and Y Set: [2, 4]
  • Data Segmentation - X Set and Y Set: 5
  • Feature Scaling on X Set: 1
  • Feature Scaling on X Set: 3
  • Feature Selection on X set: 1
  • Feature Selection on X set: 1
  • Feature Selection on X set: 2
  • Data Split - Train Set and Test Set: 0.3
  • Model Selection: 6
  • Automated Machine Learning: 1
  • Random Forest - Automatic Hyperparameter Tuning
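The feature engineering step above names a new feature `ABC` and defines it as `b * c - d`. A minimal sketch of how such a formula could be evaluated row by row follows, assuming the letters refer to the selected columns by position (a is the first column, b the second, and so on). That mapping and the helper name are assumptions for illustration only.

```python
import ast
import operator
import string

# Only plain arithmetic is allowed in the formula.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def evaluate_formula(formula, row):
    """Evaluate a formula such as 'b * c - d' against one row of values."""
    # Assumed convention: a -> row[0], b -> row[1], c -> row[2], ...
    names = dict(zip(string.ascii_lowercase, row))

    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.Name) and node.id in names:
            return names[node.id]
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        raise ValueError(f"unsupported element in formula: {ast.dump(node)}")

    return walk(ast.parse(formula, mode="eval"))


rows = [[1.0, 2.0, 3.0, 4.0], [0.5, 1.5, 2.0, 1.0]]
abc = [evaluate_formula("b * c - d", row) for row in rows]
print(abc)
```

Walking the syntax tree instead of calling `eval` keeps arbitrary code out of the formula prompt.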

Verification Step:

  1. Check if the position of ID matches the correct row in Geochemistrypi+ID
  • Data under the path .../artifacts/data
    • Application Data Feature-Engineering Selected.xlsx ✅
    • Application Data Feature-Engineering.xlsx ✅
    • Application Data Original.xlsx ✅
    • Application Data Predicted.xlsx ❌
    • Data Original.xlsx ✅
    • Data Selected Dropped-Imputed Feature-Engineering.xlsx ✅
    • Data Selected Dropped-Imputed.xlsx ✅
    • Data Selected.xlsx ✅
    • X after feature selection.xlsx ✅
    • X Test.xlsx ✅
    • X Train.xlsx ✅
    • X With Scaling.xlsx ✅
    • X Without Scaling.xlsx ✅
    • Y.xlsx ✅
    • Y Train.xlsx ✅
    • Y Test.xlsx ✅
  • Data under the path .../artifacts/image/map
    • Map Projection - Al2O3.xlsx ✅
  • Data under the path .../artifacts/image/model_output
    • Permutation Importance - Y Test.xlsx ✅
    • Predicted vs. Actual Diagram - XGBoost.xlsx ✅
    • Residuals Diagram - XGBoost.xlsx ✅
  • Data under the path .../artifacts/image/statistic
    • Correlation Plot.xlsx ✅
    • Distribution Histogram.xlsx ✅
  2. Check if the rows in Geochemistrypi+ID match the corresponding rows in Geochemistrypi
  • Data under the path .../artifacts/data
    • Application Data Feature-Engineering Selected.xlsx ✅
    • Application Data Feature-Engineering.xlsx ✅
    • Application Data Original.xlsx ✅
    • Application Data Predicted.xlsx ✅
    • Data Original.xlsx ✅
    • Data Selected Dropped-Imputed Feature-Engineering.xlsx ✅
    • Data Selected Dropped-Imputed.xlsx ✅
    • Data Selected.xlsx ✅
    • X After Feature Selection.xlsx ✅
    • X Test.xlsx ✅
    • Y Train.xlsx ✅
    • X With Scaling.xlsx ✅
    • X Without Scaling.xlsx ✅
    • Y Test Predict.xlsx ✅
    • Y Test.xlsx ✅
    • Y Train Predict.xlsx ✅
    • Y Train.xlsx ✅
    • Y.xlsx ✅
  • Data under the path .../artifacts/image/map
    • Map Projection - Al2O3.xlsx ✅
  • Data under the path .../artifacts/image/model_output
    • Feature Importance - XGBoost.xlsx ✅
    • Permutation Importance - X Test.xlsx ✅
    • Permutation Importance - Y Test.xlsx ✅
    • Predicted vs. Actual Diagram - Random Forest.xlsx ❌
    • Residuals Diagram - Random Forest.xlsx ❌
  • Data under the path .../artifacts/image/statistic
    • Correlation Plot.xlsx ✅
    • Distribution Histogram After Log.xlsx ✅
    • Distribution Histogram.xlsx ✅
    • Probability Plot.xlsx ✅
  • Data under the path .../artifacts
    • Transform Pipeline Configuration.txt ✅
  • Data under the path .../metrics
    • Cross Validation - Random Forest.txt ❌
    • Model Score - Random Forest.txt ❌
  • Data under the path .../parameters
    • Hyper Parameters - Random Forest.txt ❌
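Since each checklist entry ends in ✅ or ❌, the failing files can be pulled out of a report mechanically. The sketch below is one way to do that, assuming the report is available as plain text lines in the format used in this thread; the function name is hypothetical.

```python
def summarize(lines):
    """Split checklist lines into (passed, failed) file names."""
    passed, failed = [], []
    for line in lines:
        # Drop indentation and the leading bullet, if any.
        text = line.strip().lstrip("•").strip()
        if text.endswith("✅"):
            passed.append(text[:-1].strip())
        elif text.endswith("❌"):
            failed.append(text[:-1].strip())
        # Lines without a mark (headings, paths) are ignored.
    return passed, failed


report = [
    "  • Data under the path .../metrics",
    "    • Cross Validation - Random Forest.txt ❌",
    "    • Model Score - Random Forest.txt ❌",
    "  • Data under the path .../artifacts",
    "    • Transform Pipeline Configuration.txt ✅",
]

passed, failed = summarize(report)
print(f"{len(passed)} passed, {len(failed)} failed")
for name in failed:
    print("FAILED:", name)
```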

PotatoXi avatar Nov 19 '24 10:11 PotatoXi

Clustering - DBSCAN - Manual Hyperparameter Selection - PanyanWeng Test

Test Result: Successful

Process Step:

Template -> [Section Name: Option]

  • Built-in Training Data Option: 3
  • Output Data Identifier Column Selection: 1
  • World Map Projection: 2
  • Data Selection: [2, 7]
  • Missing Values Process: 1
  • Strategy for Missing Values: 2
  • Imputation Method Option: 1
  • Feature Engineering: 2
  • Mode Selection: 3
  • Feature Scaling on X Set: 1
  • Feature Scaling on X Set: 1
  • Model Selection: 2
  • DBSCAN - Hyper-parameters Specification
    • Eps: 0.5
    • Min Samples: 5
    • Algorithm: 1
    • Metric: 1
    • Leaf Size: 30
  • 2 Dimensions Data Selection: 1
  • 2 Dimensions Data Selection: 2
  • 3 Dimensions Data Selection: 3
  • 3 Dimensions Data Selection: 4
  • 3 Dimensions Data Selection: 5
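To make the roles of Eps and Min Samples in the specification above concrete, here is a small pure-Python DBSCAN written for this note. It is an illustrative sketch only, not the implementation Geochemistrypi uses, and it ignores the Algorithm, Metric and Leaf Size options (Euclidean distance and a brute-force neighbour search are assumed).

```python
import math


def dbscan(points, eps=0.5, min_samples=5):
    """Label each point with a cluster index, or -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)

    def neighbors(i):
        # A point counts as its own neighbour.
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_samples:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            more = neighbors(j)
            if len(more) >= min_samples:
                queue.extend(more)  # j is a core point, keep expanding
    return labels


# Five tightly packed points and one distant point.
points = [(0, 0), (0.1, 0), (0, 0.1), (0.1, 0.1), (0.05, 0.05), (5, 5)]
labels = dbscan(points, eps=0.5, min_samples=5)
print(labels)
```

With Eps 0.5 and Min Samples 5, the five close points form one cluster and the distant point is labelled as noise.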

Verification Step:

Data under the path .../artifacts/data

  • Cluster Labels - DBSCAN.xlsx:
  • Application Data Feature-Engineering.xlsx:
  • Application Data Original.xlsx:
  • Data Original.xlsx:
  • Data Selected Dropped-Imputed Feature-Engineering.xlsx:
  • Data Selected Dropped-Imputed.xlsx:
  • Data Selected.xlsx:
  • X With Scaling.xlsx:

Data under the path .../artifacts/image/model_output

  • Cluster Three-Dimensional Diagram - DBSCAN.xlsx:
  • Cluster Two-Dimensional Diagram - DBSCAN.xlsx:
  • Silhouette Diagram - Cluster Centers.xlsx:
  • Silhouette Diagram - Data With Labels.xlsx:
  • Silhouette value Diagram - Data With Labels.xlsx:

Data under the path .../artifacts/image/statistic

  • Correlation Plot.xlsx:
  • Distribution Histogram After Log Transformation.xlsx:
  • Distribution Histogram.xlsx:
  • Probability Plot.xlsx:

Check if rows in Geochemistrypi+ID match the corresponding rows in Geochemistrypi

Data under the path .../artifacts/data

  • Cluster Labels - DBSCAN.xlsx:
  • Application Data Feature-Engineering.xlsx:
  • Application Data Original.xlsx:
  • Data Original.xlsx:
  • Data Selected Dropped-Imputed Feature-Engineering.xlsx:
  • Data Selected Dropped-Imputed.xlsx:
  • Data Selected.xlsx:
  • X With Scaling.xlsx:

Data under the path .../artifacts/image/model_output

  • Cluster Three-Dimensional Diagram - DBSCAN.xlsx:
  • Cluster Two-Dimensional Diagram - DBSCAN.xlsx:
  • Silhouette Diagram - Cluster Centers.xlsx:
  • Silhouette Diagram - Data With Labels.xlsx:
  • Silhouette value Diagram - Data With Labels.xlsx:

Data under the path .../artifacts/image/statistic

  • Correlation Plot.xlsx:
  • Distribution Histogram After Log Transformation.xlsx:
  • Distribution Histogram.xlsx:
  • Probability Plot.xlsx:

DZzuk1ll avatar Dec 17 '24 11:12 DZzuk1ll

Anomaly Detection - Isolation Forest - Manual Hyperparameter Selection - Haibin Lai Test

Test Result: Successful

Process Step:

Template -> [Section Name: Option]

  • Built-in Training Data Option: 5
  • Output Data Identifier Column Selection: 1
  • World Map Projection: 1
  • Distribution in World Map: 4
  • Continue World Map: 2
  • Data Selection: [2, 5]
  • Missing Values Process: 1
  • Strategy for Missing Values: 2
  • Imputation Method Option: 2
  • Feature Engineering: 1
  • Feature Engineering: ABD
  • Feature Engineering: a + b * d
  • Continue Feature Engineering: 2
  • Mode Selection: 5
  • Feature Selection on X set: 1
  • Feature Selection strategy on X set: 2
  • Model Selection: 1
  • Isolation Forest - Hyper-parameters Specification
    • N Estimators: 100
    • contamination of the data set: 0.3
    • Max Features: 3
    • Bootstrap: 1
    • Max Samples: 64
  • 2 Dimensions Data Selection: 2
  • 2 Dimensions Data Selection: 3
  • 3 Dimensions Data Selection: 1
  • 3 Dimensions Data Selection: 3
  • 3 Dimensions Data Selection: 4

Verification Step:

  1. Check if the position of ID matches the correct row in Geochemistrypi+ID
  • Data under the path .../artifacts/data
    • Data Original.xlsx ✅
    • Data Selected.xlsx ✅
    • Data Selected Dropped-Imputed.xlsx ✅
    • Data Selected Dropped-Imputed Feature-Engineering.xlsx ✅
    • X Abnormal.xlsx ✅ (⚠️ In Geochemistrypi the file is X Anomaly.xlsx with label is_anomaly; in Geochemistrypi+ID it is X Abnormal.xlsx with label is_abnormal)
    • X Abnormal Detection.xlsx ✅ (⚠️ In Geochemistrypi the file is X Anomaly Detection.xlsx with label is_anomaly; in Geochemistrypi+ID it is X Abnormal Detection.xlsx with label is_abnormal)
    • X Normal.xlsx ✅
    • X With Scaling.xlsx ✅
  • Data under the path .../artifacts/image/map
    • Map Projection - SiO2.xlsx ✅
  • Data under the path .../artifacts/image/model_output
    • Anomaly Detection Density Estimation - Isolation Forest.xlsx ✅
    • Anomaly Detection Three-Dimensional Diagram - Isolation Forest.xlsx ✅
    • Anomaly Detection Two-Dimensional Diagram - Isolation Forest..xlsx ✅
  • Data under the path .../artifacts/image/statistic
    • Correlation Plot.xlsx ✅
    • Distribution Histogram After Log Transformation.xlsx ✅
    • Distribution Histogram.xlsx ✅
    • Probability Plot.xlsx ✅
  • Data under the path .../artifacts
    • Transform Pipeline Configuration.txt ✅
  • Data under the path .../metrics: No Output
  • Data under the path .../parameters
    • Hyper Parameters - Isolation Forest.txt ✅
  2. Check if the rows in Geochemistrypi+ID match the corresponding rows in Geochemistrypi
  • Data under the path .../artifacts/data
    • Data Original.xlsx ✅
    • Data Selected.xlsx ✅
    • Data Selected Dropped-Imputed.xlsx ✅
    • Data Selected Dropped-Imputed Feature-Engineering.xlsx ✅
    • X Abnormal.xlsx ✅ (⚠️ In Geochemistrypi the file is X Anomaly.xlsx with label is_anomaly; in Geochemistrypi+ID it is X Abnormal.xlsx with label is_abnormal)
    • X Abnormal Detection.xlsx ✅ (⚠️ In Geochemistrypi the file is X Anomaly Detection.xlsx with label is_anomaly; in Geochemistrypi+ID it is X Abnormal Detection.xlsx with label is_abnormal)
    • X Normal.xlsx ✅
    • X With Scaling.xlsx ✅
  • Data under the path .../artifacts/image/map
    • Map Projection - SiO2.xlsx ✅
  • Data under the path .../artifacts/image/model_output
    • Anomaly Detection Density Estimation - Isolation Forest.xlsx ✅
    • Anomaly Detection Three-Dimensional Diagram - Isolation Forest.xlsx ✅
    • Anomaly Detection Two-Dimensional Diagram - Isolation Forest..xlsx ✅
  • Data under the path .../artifacts/image/statistic
    • Correlation Plot.xlsx ✅
    • Distribution Histogram After Log Transformation.xlsx ✅
    • Distribution Histogram.xlsx ✅
    • Probability Plot.xlsx ✅
  • Data under the path .../artifacts
    • Transform Pipeline Configuration.txt ✅
  • Data under the path .../metrics: No Output
  • Data under the path .../parameters
    • Hyper Parameters - Isolation Forest.txt ✅

Anomaly Detection - Isolation Forest and Local Outlier Factor - Manual Hyperparameter Selection - Haibin Lai Test

Test Result: Successful

Process Step:

Template -> [Section Name: Option]

  • Built-in Training Data Option: 5
  • Output Data Identifier Column Selection: 1
  • World Map Projection: 1
  • Distribution in World Map: 4
  • Continue World Map: 2
  • Data Selection: [2, 5]
  • Missing Values Process: 1
  • Strategy for Missing Values: 2
  • Imputation Method Option: 1
  • Feature Engineering: 2
  • Mode Selection: 5
  • Feature Selection on X set: 1
  • Feature Selection strategy on X set: 1
  • Model Selection: 3
  • Isolation Forest - Hyper-parameters Specification
    • N Estimators: 100
    • contamination of the data set: 0.3
    • Max Features: 3
    • Bootstrap: 1
    • Max Samples: 64
  • 2 Dimensions Data Selection: 2
  • 2 Dimensions Data Selection: 3
  • 3 Dimensions Data Selection: 1
  • 3 Dimensions Data Selection: 3
  • 3 Dimensions Data Selection: 4
  • Local Outlier Factor - Hyper-parameters Specification
    • N Neighbors: 20
    • Leaf Size: 30
    • P: 2
    • Contamination: 0.3
    • N Jobs: 1
  • 2 Dimensions Data Selection: 2
  • 2 Dimensions Data Selection: 3
  • 3 Dimensions Data Selection: 1
  • 3 Dimensions Data Selection: 3
  • 3 Dimensions Data Selection: 4

Verification Step:

  1. Check if the position of ID matches the correct row in Geochemistrypi+ID
  • Data under the path .../artifacts/data
    • Data Original.xlsx ✅
    • Data Selected.xlsx ✅
    • Data Selected Dropped-Imputed.xlsx ✅
    • Data Selected Dropped-Imputed Feature-Engineering.xlsx ✅
    • X With Scaling.xlsx ✅
  • Data under the path .../artifacts/image/map
    • Map Projection - SiO2.xlsx ✅
  • Data under the path .../artifacts/image/model_output
    • No Output file
  • Data under the path .../artifacts/image/statistic
    • Correlation Plot.xlsx ✅
    • Distribution Histogram After Log Transformation.xlsx ✅
    • Distribution Histogram.xlsx ✅
    • Probability Plot.xlsx ✅
  • Data under the path .../artifacts
    • Transform Pipeline Configuration.txt ✅
  • Data under the path .../metrics
    • No Output file ✅
  • Data under the path .../parameters
    • No Output file ✅
  • Data under the path .../summary
    • No Output file ✅
  • Data under the path .../Isolation Forest/artifacts/data
    • X Abnormal.xlsx ✅
    • X Abnormal Detection.xlsx ✅
    • X Normal.xlsx ✅
  • Data under the path .../Isolation Forest/artifacts/
    • Transform Pipeline Configuration.txt ✅
  • Data under the path .../Isolation Forest/parameters/
    • Hyper Parameters - Isolation Forest.txt ✅
  • Data under the path .../Isolation Forest/artifacts/image/map
    • No Output file ✅
  • Data under the path .../Isolation Forest/artifacts/image/statistic
    • No Output file ✅
  • Data under the path .../Isolation Forest/artifacts/image/model_output
    • Anomaly Detection Density Estimation - Isolation Forest.xlsx ✅
    • Anomaly Detection Three-Dimensional Diagram - Isolation Forest.xlsx ✅
    • Anomaly Detection Two-Dimensional Diagram - Isolation Forest.xlsx ✅
  • Data under the path .../Local Outlier Factor/metrics
    • No Output file ✅
  • Data under the path .../Local Outlier Factor/parameters
    • No Output file ✅
  • Data under the path .../Local Outlier Factor/summary
    • No Output file ✅
  • Data under the path .../Local Outlier Factor/artifacts/data
    • X Abnormal.xlsx ✅
    • X Abnormal Detection.xlsx ✅
    • X Normal.xlsx ✅
  • Data under the path .../Local Outlier Factor/artifacts/image/model_output
    • Anomaly Detection Density Estimation - Local Outlier Factor.xlsx ✅
    • Anomaly Detection Three-Dimensional Diagram - Local Outlier Factor.xlsx ✅
    • Anomaly Detection Two-Dimensional Diagram - Local Outlier Factor..xlsx ✅
    • Lof Score Diagram - Local Outlier Factor ✅
  • Data under the path .../Local Outlier Factor/artifacts/image
    • Transform Pipeline Configuration.txt ✅
  2. Check if the rows in Geochemistrypi+ID match the corresponding rows in Geochemistrypi
  • Data under the path .../artifacts/data
    • Data Original.xlsx ✅
    • Data Selected.xlsx ✅
    • Data Selected Dropped-Imputed.xlsx ✅
    • Data Selected Dropped-Imputed Feature-Engineering.xlsx ✅
    • X With Scaling.xlsx ✅
  • Data under the path .../artifacts/image/map
    • Map Projection - SiO2.xlsx ✅
  • Data under the path .../artifacts/image/model_output
    • No Output file
  • Data under the path .../artifacts/image/statistic
    • Correlation Plot.xlsx ✅
    • Distribution Histogram After Log Transformation.xlsx ✅
    • Distribution Histogram.xlsx ✅
    • Probability Plot.xlsx ✅
  • Data under the path .../artifacts
    • Transform Pipeline Configuration.txt ✅
  • Data under the path .../metrics
    • No Output file ✅
  • Data under the path .../parameters
    • No Output file ✅
  • Data under the path .../summary
    • No Output file ✅
  • Data under the path .../Isolation Forest/artifacts/data
    • X Abnormal.xlsx ✅
    • X Abnormal Detection.xlsx ✅
    • X Normal.xlsx ✅
  • Data under the path .../Isolation Forest/artifacts/
    • Transform Pipeline Configuration.txt ✅
  • Data under the path .../Isolation Forest/parameters/
    • Hyper Parameters - Isolation Forest.txt ✅
  • Data under the path .../Isolation Forest/artifacts/image/map
    • No Output file ✅
  • Data under the path .../Isolation Forest/artifacts/image/statistic
    • No Output file ✅
  • Data under the path .../Isolation Forest/artifacts/image/model_output
    • Anomaly Detection Density Estimation - Isolation Forest.xlsx ✅
    • Anomaly Detection Three-Dimensional Diagram - Isolation Forest.xlsx ✅
    • Anomaly Detection Two-Dimensional Diagram - Isolation Forest.xlsx ✅
  • Data under the path .../Local Outlier Factor/metrics
    • No Output file ✅
  • Data under the path .../Local Outlier Factor/parameters
    • No Output file ✅
  • Data under the path .../Local Outlier Factor/summary
    • No Output file ✅
  • Data under the path .../Local Outlier Factor/artifacts/data
    • X Abnormal.xlsx ✅
    • X Abnormal Detection.xlsx ✅
    • X Normal.xlsx ✅
  • Data under the path .../Local Outlier Factor/artifacts/image/model_output
    • Anomaly Detection Density Estimation - Local Outlier Factor.xlsx ✅
    • Anomaly Detection Three-Dimensional Diagram - Local Outlier Factor.xlsx ✅
    • Anomaly Detection Two-Dimensional Diagram - Local Outlier Factor..xlsx ✅
    • Lof Score Diagram - Local Outlier Factor ❌
  • Data under the path .../Local Outlier Factor/artifacts/image
    • Transform Pipeline Configuration.txt ✅

HaibinLai avatar Dec 19 '24 13:12 HaibinLai