GML
Auto Data Science - Python Library.
Creators
Muhammad Ahmed, Naman Tuli
Contributors
Mehran Kamal, Rafey Iqbal Rahman
Tired of doing Data Science manually? GML is here for you!
GML is an automatic data science library in Python, built on top of multiple Python packages. Installation instructions come first, followed by the complete list of features.
Installation:
pip install GML
https://pypi.org/project/GML
If you run into any PyTorch-related issue during installation, refer to the following solution: https://github.com/Muhammad4hmed/GML/issues/6#issuecomment-735912557
Features:
Auto Feature Engineering
from GML import FeatureEngineering
# Set feateng_steps=0 for feature selection without feature creation
fe = FeatureEngineering(Data, 'target', fill_missing_data=True, encode_data=True,
                        normalize=True, remove_outliers=True,
                        new_features=True, feateng_steps=2)
X_new, y, test = fe.get_new_data()
Click Here for complete DEMO
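To give a rough idea of what flags such as fill_missing_data, encode_data and normalize usually refer to, here is a minimal standard-library sketch. It is not GML's implementation: the strategies chosen here (median fill, one-hot encoding, min-max scaling) and the helper names fill_missing, one_hot_encode and min_max_normalize are assumptions made purely for illustration.

```python
from statistics import median


def fill_missing(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def min_max_normalize(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def one_hot_encode(labels):
    """Map each label to a 0/1 indicator vector over the sorted categories."""
    categories = sorted(set(labels))
    return [[1 if label == c else 0 for c in categories] for label in labels]


# Hypothetical toy columns, used only to exercise the helpers.
ages = [20, None, 40, 30]
filled = fill_missing(ages)
scaled = min_max_normalize(filled)
encoded = one_hot_encode(["red", "blue", "red"])
```

In practice these steps are normally applied column by column to a table, with statistics computed on the training data only and then reused for the test data.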
Auto EDA (Powered by Sweetviz)
from GML import sweetviz
result1 = sweetviz.compare([train,'train'],[test,'test'],'target')
result2 = sweetviz.analyze([train,'train'])
result1.show_html()
result2.show_html()
Click Here for complete DEMO
Auto Machine Learning
from sklearn.metrics import accuracy_score  # metric passed to GMLClassifier below
from GML import AutoML
gml_ml = AutoML()
gml_ml.GMLClassifier(X, y, metric = accuracy_score, folds = 10)
Click Here for complete DEMO
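The folds argument suggests k-fold cross-validation, where the data is split into k parts and each part is used once for validation. The sketch below shows that idea in plain Python with a majority-class baseline standing in for a real model. It is an illustration under that assumption, not a description of how GMLClassifier works internally; k_fold_indices, accuracy and cross_val_majority_baseline are hypothetical helper names.

```python
from collections import Counter


def k_fold_indices(n_samples, folds):
    """Yield (train_idx, val_idx) index lists for k-fold cross-validation."""
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most 1.
    base, extra = divmod(n_samples, folds)
    start = 0
    for fold in range(folds):
        size = base + (1 if fold < extra else 0)
        val_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, val_idx
        start += size


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cross_val_majority_baseline(y, folds):
    """Mean validation accuracy of a predict-the-majority-class baseline."""
    scores = []
    for train_idx, val_idx in k_fold_indices(len(y), folds):
        majority = Counter(y[i] for i in train_idx).most_common(1)[0][0]
        predictions = [majority] * len(val_idx)
        scores.append(accuracy([y[i] for i in val_idx], predictions))
    return sum(scores) / len(scores)


# Hypothetical labels, used only to exercise the helpers.
labels = [1, 1, 1, 0, 1, 1, 0, 1, 1, 1]
splits = list(k_fold_indices(len(labels), folds=5))
mean_accuracy = cross_val_majority_baseline(labels, folds=5)
```

A baseline score like this is a useful reference point: a trained model is only interesting if its cross-validated metric beats it.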
Auto Text Cleaning
from GML import AutoNLP
nlp = AutoNLP()
cleanX = X.apply(lambda x: nlp.clean(x))
Click Here for complete DEMO
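For readers wondering what a text-cleaning step typically involves, here is a small sketch using only Python's re module. The exact rules applied by nlp.clean are not documented here, so the steps below (lowercasing, removing URLs and HTML tags, dropping punctuation) and the function name clean_text are assumptions for illustration only.

```python
import re


def clean_text(text):
    """Lowercase, strip URLs and HTML tags, drop punctuation, collapse spaces."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)   # remove URLs
    text = re.sub(r"<[^>]+>", " ", text)        # remove HTML tags
    text = re.sub(r"[^a-z0-9\s]", " ", text)    # keep letters, digits, whitespace
    return re.sub(r"\s+", " ", text).strip()    # collapse runs of whitespace


# Hypothetical input string, used only to exercise the helper.
sample = "Check <b>THIS</b> out: https://example.com !!!"
cleaned = clean_text(sample)
```

A function like this can be applied to a column of strings in the same way as the snippet above, for example with X.apply(clean_text).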
Auto Text Classification using transformers
from GML import AutoNLP
nlp = AutoNLP()
nlp.set_params(cleanX, tokenizer_name='roberta-large-mnli', BATCH_SIZE=4,
               model_name='roberta-large-mnli', MAX_LEN=200)
# tokenizedX is not defined in this snippet; see the complete demo for how it is produced
model = nlp.train_model(tokenizedX, y)
Click Here for complete DEMO
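The MAX_LEN and BATCH_SIZE parameters follow common transformer conventions: sequences are usually padded or truncated to a fixed length, then grouped into batches. The sketch below assumes that reading of the two parameters and illustrates it in plain Python. The helper names pad_or_truncate and make_batches, and the padding id of 0, are hypothetical and not part of GML.

```python
def pad_or_truncate(token_ids, max_len, pad_id=0):
    """Force a token-id sequence to exactly max_len entries."""
    clipped = token_ids[:max_len]
    return clipped + [pad_id] * (max_len - len(clipped))


def make_batches(sequences, batch_size):
    """Group sequences into consecutive batches of at most batch_size."""
    return [sequences[i:i + batch_size]
            for i in range(0, len(sequences), batch_size)]


# Hypothetical token ids, used only to exercise the helpers.
sequences = [[5, 6, 7], [8], [9, 10, 11, 12, 13], [14, 15], [16]]
padded = [pad_or_truncate(seq, max_len=4) for seq in sequences]
batches = make_batches(padded, batch_size=2)
```

Larger values of either parameter increase memory use per training step, which is one reason a small batch size is often paired with a large model.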
Auto Image Classification with Augmentation
from GML import Auto_Image_Processing
gml_image_processing = Auto_Image_Processing()
# `models` is not defined in this snippet; see the complete demo for how the model list is built
model = gml_image_processing.imgClassificationcsv(
    img_path='./covid_image_data/train',
    train_path='./covid_image_data/Training_set_covid.csv',
    model_list=models,
    tfms=True, advance_augmentation=True,
    epochs=1)
Click Here for complete DEMO
Text Augmentation using transformers: GPT-2
from GML import AutoNLP
nlp = AutoNLP()
nlp.augmentation_train('./data.csv')
nlp.set_params(X['Text'])
new_Text = nlp.augmentation_generate(y = y, SENTENCES = 100)
Click Here for complete DEMO
More features, including support for other data types such as audio, will be added in the future.
Feel free to give suggestions, report bugs and contribute.