Pauline Banye


> ### Model Title
> HIA (TDC dataset)
>
> ### Publication
> Hello @pauline-banye!
>
> As part of your Outreachy contribution, we have assigned you the dataset...

Hi @GemmaTuron, my apologies for the delayed response. I have been under the weather for a couple of days but I'm getting much better. My work on this has been...

The next step involved analysing the data. I observed that the HIA dataset from TDC comprises a total of 578 molecules, 500 of which are active and 78 inactive. -...
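The class balance quoted above can be checked with a few lines of Python. This is only a minimal sketch: it assumes the labels are available as a list of 0/1 values (1 = active), the `labels` list is synthetic and built purely to match the counts in the comment, and `class_balance` is a hypothetical helper, not part of TDC.

```python
from collections import Counter


def class_balance(labels):
    """Return (counts, active_fraction) for a list of binary labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    # Guard against an empty list so the function never divides by zero.
    active_fraction = counts[1] / total if total else 0.0
    return counts, active_fraction


# Synthetic labels matching the counts reported above: 500 active, 78 inactive.
labels = [1] * 500 + [0] * 78
counts, active_fraction = class_balance(labels)
print(counts[1], counts[0], round(active_fraction, 3))  # 500 78 0.865
```

With roughly 87% of the molecules active, a classifier that always predicts "active" would already reach about 87% accuracy, which is one reason a threshold-independent metric such as AUROC is useful for this dataset.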

I also created graphical representations of active and inactive molecules using the RDKit Python package. RDKit is a Python library that allows us to handle chemical structures...

The next step involved training the model. The data comprises the Drug ID, the Drug (SMILES string) and Y, which indicates the active or inactive state. I prepared a list of...

Once the model was trained, I evaluated its performance by predicting on the validation and test sets. The list of SMILES strings and the result are passed...

1. AUROC value: The Area Under the Receiver Operating Characteristic curve (AUROC) measures the ability of a classifier to distinguish between classes. The higher the AUROC value, the better the performance of the...

2. ROC Curve: A receiver operating characteristic curve, or ROC curve, is a graphical plot that illustrates the performance of a binary classification model. The ROC curve is plotted with...

3. Contingency Table: Contingency tables are used to record the number of molecules assigned to different classes after a test has been performed. It displays a visual representation of the...
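To make the three metrics listed above concrete, here is a small pure-Python sketch. It is not the code used in this contribution (the comments do not say which library computed the metrics); in practice scikit-learn's `roc_auc_score`, `roc_curve` and `confusion_matrix` would be the usual choice. The function names, the toy labels and scores, and the 0.5 threshold are all assumptions made for illustration.

```python
def confusion_counts(y_true, y_score, threshold=0.5):
    """Count (TP, FP, TN, FN) when scores >= threshold are predicted active."""
    tp = fp = tn = fn = 0
    for label, score in zip(y_true, y_score):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def roc_points(y_true, y_score):
    """Return (FPR, TPR) pairs, one per distinct score threshold, after (0, 0).

    Assumes both classes are present, otherwise a rate would divide by zero.
    """
    positives = sum(y_true)
    negatives = len(y_true) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(y_score), reverse=True):
        tp, fp, _, _ = confusion_counts(y_true, y_score, threshold)
        points.append((fp / negatives, tp / positives))
    return points


def auroc(y_true, y_score):
    """Area under the ROC points, computed with the trapezoidal rule."""
    points = roc_points(y_true, y_score)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Toy data, made up for illustration: 1 = active, 0 = inactive.
y_true = [1, 1, 0, 1, 0, 1]
y_score = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]

print(confusion_counts(y_true, y_score))  # (3, 1, 1, 1)
print(roc_points(y_true, y_score))
print(auroc(y_true, y_score))  # 0.625
```

On this toy data the AUROC of 0.625 means that the model ranks an active molecule above an inactive one in 5 of the 8 possible active/inactive pairs. The contingency table at a threshold of 0.5 has 3 true positives, 1 false positive, 1 true negative and 1 false negative.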

> Hi @pauline-banye!
>
> Very good job, please also add your comments on model performance before we close this contribution!

Thanks so much @GemmaTuron 😊! I'm currently working...