autoemploi-fsm
A script that automates the tedious process of finding where you take your exams at FSM, plus a basic web app
Auto Emploi FSM
A small personal project that automates the search for your class during exam week (if you study at FSM). You can use it at https://fsm.jbara.xyz
How it works
This code automates logging into the FSM Extranet space with your credentials (using OCR on its weak captcha) and retrieves the PDF documents that list the time and location of the latest day's exams. It then applies OCR to every page of the documents and automates the search for the class you are looking for.
sequenceDiagram
participant Cronjob
participant Extrafetcher
participant Extranet
participant Refiner
participant OCR
participant Database
participant ImagesFolder
participant Webapp
participant User
loop for every 30 mins
Cronjob->>+Extrafetcher: Execute every 30 mins
Extrafetcher->>+Extranet: Scrape PDF files
Extranet->>-Extrafetcher: PDF files
loop for each PDF file
Extrafetcher->>+Refiner: PDF file
loop for each Page
Refiner->>+OCR: Feed image to OCR
OCR->>+Database: Save OCR result
Refiner->>+ImagesFolder: Save cropped image
end
end
end
User->>+Webapp: User input their name
Webapp->>+Database: Find image associated to user
Webapp->>+ImagesFolder: Get the image
Webapp->>+User: Show Salle
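The lookup step at the end of the diagram (find the images associated with a user) can be sketched as follows. This is only an illustration: it assumes the OCR results are available as a mapping from image filename to recognized text, which may not match how the project's database is actually laid out, and the names `normalize`, `find_images` and `sample` are hypothetical.

```python
import unicodedata


def normalize(text: str) -> str:
    """Lowercase, strip accents, and collapse whitespace so OCR noise matters less."""
    decomposed = unicodedata.normalize("NFKD", text)
    without_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(without_accents.lower().split())


def find_images(ocr_results: dict[str, str], name: str) -> list[str]:
    """Return the image filenames whose OCR text contains the given name."""
    needle = normalize(name)
    if not needle:
        return []
    return sorted(
        filename
        for filename, text in ocr_results.items()
        if needle in normalize(text)
    )


# Hypothetical data standing in for the real database contents.
sample = {
    "161223-l3-9h.png": "Salle 12\nBEN SALAH Amine\nTRABELSI Sarra",
    "151223-l3-15h30.png": "Salle 3\nBen  Salah Amine",
    "151223-l2-9h.png": "Salle 7\nGHARBI Youssef",
}
print(find_images(sample, "ben salah amine"))
```

Normalizing both sides before comparing is a design choice made here because OCR output tends to vary in case, accents and spacing; the real implementation may match names differently.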
Installation
If you wish to use this locally or host it yourself, clone the repo and install the pip requirements with the following command. Note: using a Python virtual environment is advised.
pip install -r requirements.txt
Make sure both Mozilla Firefox and Geckodriver are installed on your system and accessible via PATH
as they're both needed by selenium.
Make sure to install tesseract as it is needed by pytesseract.
Note: If you'd like to help port this to use chromium, pull requests are welcome!
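Before the first run, it can help to confirm that the prerequisites listed above are reachable via PATH. The snippet below is a minimal sketch, not part of the project; the executable names `firefox`, `geckodriver` and `tesseract` are assumptions and may differ on your system (for example, some distributions ship Firefox under another name).

```python
import shutil
from typing import Callable, Iterable, Optional

# Assumed executable names; adjust them if your distribution uses different ones.
REQUIRED_TOOLS = ("firefox", "geckodriver", "tesseract")


def missing_tools(
    tools: Iterable[str] = REQUIRED_TOOLS,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> list[str]:
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing from PATH: " + ", ".join(missing))
    else:
        print("All required tools were found.")
```

The `which` parameter exists only so the lookup can be swapped out in tests; in normal use the default `shutil.which` is enough.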
Usage
To fetch the latest PDF documents and process them, run the command below. Note that it will ask for your login credentials the first time you use it; they will then be stored in credentials.json.
python autodevoir-cli fetch <username> <password>
After that, feel free to run the basic web app and test it out:
gunicorn webapp:app
To update (say, when a new PDF document is released upstream), just re-run python autodevoir-cli fetch
Although it is not recommended yet, autodevoir-cli is designed to be run as a cron job that frequently checks for updates upstream.
NOTE: This only works on UNIX and UNIX-like operating systems. If you would like to try it on Windows, use WSL or feel free to edit the code (don't forget to open a PR).
Extra: Although it was made for testing purposes, it can be useful to get the related filenames from the CLI (say, for an automation script or a custom app you're working on). Run python autodevoir-cli find to use the CLI instead of the web app.
API
An API is available in case you want to use this in your own projects. Just make a POST request to https://jbara.xyz/api with the following content:
{
"ident": "your name"
}
You will get a response containing an array of image URLs that match your search, as follows:
{
"result": [
"https://fsm.jbara.xyz/images/161223-l3-9h.png",
"https://fsm.jbara.xyz/images/151223-l3-15h30.png"
]
}
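A request like the one described above can be sent from Python using only the standard library. This is a minimal sketch under a few assumptions: that the server accepts a JSON body with a `Content-Type: application/json` header, and that the response always has the shape shown above. The function names `build_request`, `parse_response` and `find_exam_images` are hypothetical.

```python
import json
import urllib.request

API_URL = "https://jbara.xyz/api"  # endpoint as given in this README


def build_request(ident: str, url: str = API_URL) -> urllib.request.Request:
    """Build a POST request whose JSON body carries the name to look up."""
    body = json.dumps({"ident": ident}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        # Assumption: the server expects a JSON content type.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_response(raw: bytes) -> list[str]:
    """Extract the list of image URLs from a response body."""
    payload = json.loads(raw.decode("utf-8"))
    return list(payload.get("result", []))


def find_exam_images(ident: str, timeout: float = 10.0) -> list[str]:
    """Send the request and return the image URLs (performs a network call)."""
    with urllib.request.urlopen(build_request(ident), timeout=timeout) as response:
        return parse_response(response.read())
```

Calling `find_exam_images("your name")` performs the actual request; building and parsing are kept in separate functions so they can be checked without network access.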