UFC-Predictions
A web app to predict UFC fights
Usage

- Go to https://ufc-predictions.rajeevwarrier.com/
- Select the weight class of the bout
- Select the number of 5-minute rounds the fight is scheduled for
- Select whether the fight is a title fight or not
- Select the fighter names
- Click predict
Details
- Scraped event and fight stats from 1993 to the present date using Beautiful Soup.
- Cleaned, preprocessed and feature-engineered the data so that each row is a historical representation of both fighters and their individual fights/fight stats.
- Dataset uploaded and now available on Kaggle at: https://www.kaggle.com/rajeevw/ufcdata
- Oversampled minority class, created and tested predictive models using RandomForestClassifier and XGBoostClassifier
- Created a web app using dash and deployed it with docker on heroku.
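The oversampling step mentioned above can be illustrated with a minimal sketch. The README does not say which oversampling technique was used, so plain random oversampling (duplicating minority rows) is an assumption here, as are the function name `random_oversample` and the toy data. The assumption that Blue wins are the minority class follows from the note that the underdog is usually in the blue corner.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of the smaller classes until every
    class has as many rows as the largest class.

    Hypothetical helper for illustration; not taken from the project code.
    """
    rng = random.Random(seed)

    # Group the rows by their class label.
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)

    target = max(len(group) for group in by_class.values())

    out_rows, out_labels = [], []
    for label, group in by_class.items():
        # Draw (with replacement) as many extra rows as this class is missing.
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for row in group + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Toy data (made up): label 1 = Red wins, label 0 = Blue wins (assumed minority).
X = [[0.1], [0.4], [0.5], [0.7], [0.8], [0.9]]
y = [0, 0, 1, 1, 1, 1]

X_bal, y_bal = random_oversample(X, y)
print(Counter(y_bal))
```

Oversampling is normally applied to the training split only, so that duplicated rows do not end up in the validation data.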
Results
- Accuracy (valid): 0.7218
- AUC Score (valid): 0.7763

- 0 corresponds to Blue: Fighter in the blue corner
- 1 corresponds to Red: Fighter in the red corner
- Generally the underdog is in the blue corner and the favourite is in the red corner.
- The model therefore (understandably) has a hard time figuring out when the underdog wins. The sport is very volatile, and anything from an injury or psychological loss/trauma to pure luck can determine the winner.
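For readers unfamiliar with the two metrics reported above, the following sketch shows how accuracy and AUC can be computed with the 0 = Blue, 1 = Red encoding. The labels and probabilities are made-up toy values, not the project's validation data, and the helper names `accuracy` and `roc_auc` are hypothetical. The 0.5 decision threshold is also an assumption.

```python
def accuracy(y_true, y_pred):
    """Fraction of fights whose winner was predicted correctly."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def roc_auc(y_true, scores):
    """AUC as the probability that a randomly chosen Red win (label 1)
    receives a higher score than a randomly chosen Blue win (label 0).
    Tied scores count as half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: true winners and the predicted probability that Red wins.
y_true = [1, 1, 1, 0, 0, 1]
p_red = [0.9, 0.8, 0.4, 0.6, 0.3, 0.7]

# Assumed decision rule: predict Red when the probability is at least 0.5.
y_pred = [1 if p >= 0.5 else 0 for p in p_red]

print("accuracy:", accuracy(y_true, y_pred))
print("AUC:", roc_auc(y_true, p_red))
```

Accuracy depends on the chosen threshold, while AUC only depends on how the fights are ranked by predicted probability, which is why the two numbers can differ noticeably.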
Details about the data
Context
This is a list of every UFC fight in the history of the organisation. Every row contains information about both fighters, fight details and the winner. The data was scraped from the ufcstats website, which came into the picture after fightmetric ceased to exist. I saw that there was a lot of information on the website about every fight and every event, and there was no existing way of capturing all of it. I used beautifulsoup to scrape the data and pandas to process it. It was a long and arduous process, so please forgive any mistakes. I have provided the raw files in case anybody wants to process them differently. This is my first time creating a dataset; any suggestions and corrections are welcome!
How to use from Scratch?
- From the root, i.e. UFC-Predictions, simply run: python -m src.create_ufc_data
(Note: This will scrape everything from the beginning if you haven't used this before. Otherwise the command will update the existing data files. It will then preprocess the raw scraped files to create usable data files.)
Content
Each row is a compilation of both fighters' stats. Fighters are represented by 'red' and 'blue' (for the red and blue corners). For instance, the red fighter has the compiled average stats of all of their fights except the current one. The stats include the damage done by the red fighter to the opponent and the damage done by the opponent to the fighter (represented by 'opp' in the columns) across all the fights this particular red fighter has had, except this one, as it has not occurred yet (in the data). The same information exists for the blue fighter. The target variable is 'Winner', which is the only column that tells you what happened.
Column definitions:
- R_ and B_ prefixes signify red and blue corner fighter stats respectively
- _opp_ containing columns are the average of damage done by the opponent on the fighter
- KD is the number of knockdowns
- SIG_STR is the no. of significant strikes 'landed of attempted'
- SIG_STR_pct is the significant strikes percentage
- TOTAL_STR is total strikes 'landed of attempted'
- TD is the no. of takedowns
- TD_pct is the takedown percentage
- SUB_ATT is the no. of submission attempts
- PASS is the no. of times the guard was passed?
- REV is the number of reversals
- HEAD is the no. of significant strikes to the head 'landed of attempted'
- BODY is the no. of significant strikes to the body 'landed of attempted'
- CLINCH is the no. of significant strikes in the clinch 'landed of attempted'
- GROUND is the no. of significant strikes on the ground 'landed of attempted'
- win_by is the method of win
- last_round is the last round of the fight (e.g. if it was a KO in the 1st, then this will be 1)
- last_round_time is when the fight ended in the last round
- Format is the format of the fight (3 rounds, 5 rounds etc.)
- Referee is the name of the referee
- date is the date of the fight
- location is the location in which the event took place
- Fight_type is which weight class and whether it's a title bout or not
- Winner is the winner of the fight
- Stance is the stance of the fighter (orthodox, southpaw, etc.)
- Height_cms is the height in centimetres
- Reach_cms is the reach of the fighter (arm span) in centimetres
- Weight_lbs is the weight of the fighter in pounds (lbs)
- age is the age of the fighter
- title_bout is a Boolean value of whether it is a title fight or not
- weight_class is which weight class the fight is in (Bantamweight, Heavyweight, Women's Flyweight, etc.)
- no_of_rounds is the number of rounds the fight was scheduled for
- current_lose_streak is the count of current consecutive losses of the fighter
- current_win_streak is the count of current consecutive wins of the fighter
- draw is the number of draws in the fighter's UFC career
- wins is the number of wins in the fighter's UFC career
- losses is the number of losses in the fighter's UFC career
- total_rounds_fought is the average of total rounds fought by the fighter
- total_time_fought(seconds) is the count of total time spent fighting in seconds
- total_title_bouts is the total number of title bouts taken part in by the fighter
- win_by_Decision_Majority is the number of wins by majority judges' decision in the fighter's UFC career
- win_by_Decision_Split is the number of wins by split judges' decision in the fighter's UFC career
- win_by_Decision_Unanimous is the number of wins by unanimous judges' decision in the fighter's UFC career
- win_by_KO/TKO is the number of wins by knockout in the fighter's UFC career
- win_by_Submission is the number of wins by submission in the fighter's UFC career
- win_by_TKO_Doctor_Stoppage is the number of wins by doctor stoppage in the fighter's UFC career
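The idea that each row holds a fighter's averages over all fights except the current one can be sketched as follows. This is only an illustration of the principle, not the project's preprocessing code: the function `add_prior_averages`, the input columns `R_fighter` and `B_fighter`, the output column names such as `R_avg_SIG_STR`, and the toy fights are all assumptions.

```python
from collections import defaultdict


def add_prior_averages(fights, stat="SIG_STR"):
    """For each fight (processed in date order), attach each fighter's
    average of `stat` over all of their EARLIER fights.
    A fighter with no earlier fights gets None."""
    history = defaultdict(list)  # fighter name -> stat values from past fights
    rows = []
    for fight in sorted(fights, key=lambda f: f["date"]):
        row = {"date": fight["date"], "Winner": fight["Winner"]}
        for corner in ("R", "B"):
            past = history[fight[f"{corner}_fighter"]]
            row[f"{corner}_avg_{stat}"] = sum(past) / len(past) if past else None
        rows.append(row)
        # Record this fight's stats only after building its row, so the
        # current fight never contributes to its own features.
        for corner in ("R", "B"):
            history[fight[f"{corner}_fighter"]].append(fight[f"{corner}_{stat}"])
    return rows


# Made-up fights with hypothetical fighter names.
fights = [
    {"date": "2020-01-01", "R_fighter": "Alpha", "B_fighter": "Bravo",
     "R_SIG_STR": 40, "B_SIG_STR": 20, "Winner": "Red"},
    {"date": "2020-06-01", "R_fighter": "Alpha", "B_fighter": "Charlie",
     "R_SIG_STR": 60, "B_SIG_STR": 30, "Winner": "Blue"},
    {"date": "2021-01-01", "R_fighter": "Charlie", "B_fighter": "Alpha",
     "R_SIG_STR": 10, "B_SIG_STR": 50, "Winner": "Red"},
]

rows = add_prior_averages(fights)
for row in rows:
    print(row)
```

Updating the history only after the row is built is what keeps the outcome of the current fight out of its own features, matching the description that the current fight "has not occurred yet" in the data.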
Acknowledgements
- Inspiration: https://github.com/Hitkul/UFC_Fight_Prediction Provided ideas on how to store per fight data. Unfortunately, the entire UFC website and fightmetric website changed so couldn't reuse any of the code.
- Print Progress Bar: https://gist.github.com/aubricus/f91fb55dc6ba5557fbab06119420dd6a To display progress of how much download is complete in the terminal
- Web app: https://github.com/jasonchanhku/ Ideas on how to use dash and google search api to show fighter images