COVID-QA
Real-Time data scraping for countries
Hi,
I have been working on a chatbot for the Croatian language. Here is a little help with real-time scraping.
import requests
from bs4 import BeautifulSoup
import pandas as pd

url = "https://www.worldometers.info/coronavirus/"
headers = {'Accept': 'text/html'}
response = requests.get(url, headers=headers)
#print(response)
content = response.content
soup = BeautifulSoup(content, "lxml")

# Collect the cell text of every row in the countries table.
# A plain list is used instead of np.array because the rows have different
# lengths (header rows contain no <td> cells), which recent NumPy rejects.
rows = soup.find(id="main_table_countries_today").find_all("tr")
elements = [[y.text for y in x.find_all("td")] for x in rows]
# Keep only the rows that have the 9 expected columns.
elements = [x for x in elements if len(x) == 9]

wordmeters = pd.DataFrame(elements)
wordmeters.columns = ["Country,Other", "Total Cases", "New Cases", "Total Deaths", "New Deaths", "Total Recovered", "Active Cases", "Serious, Critical", "Tot Cases/1M pop"]
wordmeters
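Every cell scraped this way is a string, so the counts probably need converting before they can be compared or summed. A minimal sketch, assuming the cells look like "1,234" or "+56" and are blank when a value is missing; `to_number` and the sample rows are hypothetical and not part of the snippet above:

```python
def to_number(cell):
    """Convert a scraped cell such as '+1,234' to a float; unparseable cells become NaN."""
    cleaned = str(cell).strip().replace(",", "").replace("+", "")
    try:
        return float(cleaned)
    except ValueError:
        # Blank cells or placeholders such as 'N/A' end up here.
        return float("nan")


# Hypothetical sample rows standing in for the scraped table.
sample_rows = [
    {"Country,Other": " Croatia ", "Total Cases": "1,222", "New Cases": "+40"},
    {"Country,Other": "Italy", "Total Cases": "119,827", "New Cases": ""},
]

cleaned_rows = [
    {
        "Country,Other": row["Country,Other"].strip(),
        "Total Cases": to_number(row["Total Cases"]),
        "New Cases": to_number(row["New Cases"]),
    }
    for row in sample_rows
]
```

On the DataFrame itself, the same helper could be applied per column, for example `wordmeters["Total Cases"].map(to_number)`.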
Sorry for the late reply. Integrating this data for questions like "How many cases are in X?" is actually on our roadmap, but it would require quite a lot of implementation work:
- First, we need to identify whether a question is asking for this type of structured information.
- Then we need to determine which statistic is being asked for: new cases, total cases, total deaths, etc.
- Finally, we need to match the country name in your DataFrame with the country that was asked for.
- Optionally, handle spelling mistakes in either the country name or the statistic that was asked for.
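The steps above could be prototyped with nothing but the standard library. This is only a rough sketch under stated assumptions: the keyword table, the function names `detect_metric` and `match_country`, and the 0.8 similarity cutoff are all hypothetical choices, and simple keyword matching stands in for whatever question classifier the project would actually use.

```python
import difflib
import re

# Hypothetical keyword table mapping question wording to the DataFrame columns.
# More specific entries come first so "new deaths" is not caught by "deaths".
METRIC_KEYWORDS = {
    "New Deaths": ["new deaths", "died today"],
    "Total Deaths": ["deaths", "died"],
    "New Cases": ["new cases", "new infections"],
    "Total Recovered": ["recovered", "recoveries"],
    "Active Cases": ["active"],
    "Total Cases": ["cases", "infected", "infections"],
}


def detect_metric(question):
    """Return the column being asked about, or None if no statistic is mentioned."""
    text = question.lower()
    for column, keywords in METRIC_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return column
    return None


def match_country(question, countries, cutoff=0.8):
    """Fuzzy-match word groups of the question against the known country names."""
    lookup = {name.lower(): name for name in countries}
    tokens = re.findall(r"[a-z]+", question.lower())
    # Try three-word groups first so multi-word names such as "South Korea" can match.
    for size in (3, 2, 1):
        for start in range(len(tokens) - size + 1):
            candidate = " ".join(tokens[start:start + size])
            hits = difflib.get_close_matches(candidate, list(lookup), n=1, cutoff=cutoff)
            if hits:
                return lookup[hits[0]]
    return None


countries = ["Croatia", "Italy", "South Korea"]
metric = detect_metric("How many deaths in Croata?")
country = match_country("How many deaths in Croata?", countries)
```

With both values in hand, the answer would be a single lookup in the scraped table, along the lines of `wordmeters.loc[wordmeters["Country,Other"] == country, metric]`. The fuzzy cutoff trades missed typos against false matches, so it would need tuning on real questions.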
It would help to be able to query an API with this info. Do you have any updates on your integration, or would you like to implement the proposed steps in this repository?