`to_pandas()` in Excel resource returning an empty dataframe
Overview
I'm trying to use `to_pandas()` to create a pandas DataFrame from a resource that is an Excel file.
from frictionless import Package
package = Package("datapackage.yaml")
mega = package.get_resource("mega-sena")
df = mega.to_pandas()
print(df)
Unfortunately, when the script runs, it prints an empty DataFrame (the column names are detected, but there are no rows):
Empty DataFrame
Columns: [Concurso, Data do Sorteio, Bola1, Bola2, Bola3, Bola4, Bola5, Bola6, Ganhadores 6 acertos, Cidade / UF, Rateio 6 acertos, Ganhadores 5 acertos, Rateio 5 acertos, Ganhadores 4 acertos, Rateio 4 acertos, Acumulado 6 acertos, Arrecadação Total, Estimativa prêmio, Acumulado Sorteio Especial Mega da Virada, Observação]
Index: []
A sample that reproduces the error can be found in this reprex commit[^1].
Creating a DataFrame from a .csv file (with the same content as the Excel file) works:
# Create csv file with excel content
from frictionless import Package
import pandas as pd
df = pd.read_excel("download/Mega-Sena.xlsx", index_col=0)
df.to_csv("data/Mega-Sena.csv")
# Create DataFrame from csv file
from frictionless import Package
package = Package("datapackage_csv.yaml")
mega = package.get_resource("mega-sena")
df = mega.to_pandas()
print(df)
The result:
Concurso Data do Sorteio ... Acumulado Sorteio Especial Mega da Virada Observação
0 1 11/03/1996 ... R$0,00 None
1 2 18/03/1996 ... R$0,00 None
2 3 25/03/1996 ... R$0,00 None
3 4 01/04/1996 ... R$0,00 None
4 5 08/04/1996 ... R$0,00 None
... ... ... ... ... ...
2611 2612 19/07/2023 ... R$61.356.654,15 None
2612 2613 22/07/2023 ... R$62.837.684,98 None
2613 2614 25/07/2023 ... R$64.067.866,59 None
2614 2615 27/07/2023 ... R$64.873.493,13 None
2615 2616 29/07/2023 ... R$65.936.608,29 None
A sample of the workaround can be found in this reprex commit.
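Until the Excel path is sorted out, the workaround above could also be wrapped in a small fallback, so the script keeps using the data package when it works and only reads the file directly when the result is empty. This is a minimal sketch, not a fix. `first_non_empty` and `load_mega_sena` are hypothetical helper names and not part of frictionless. Reading the spreadsheet without `index_col` (so that `Concurso` stays a regular column) is my assumption. The file paths are the ones used in the snippets above.

```python
from typing import Any, Callable, Sequence


def first_non_empty(loaders: Sequence[Callable[[], Any]]) -> Any:
    """Call each loader in order; return the first result whose `.empty` is False.

    If every loader yields an empty frame, the last result is returned so the
    caller can still inspect its columns. Returns None if `loaders` is empty.
    """
    result = None
    for load in loaders:
        result = load()
        if not result.empty:
            return result
    return result


def load_mega_sena() -> Any:
    """Hypothetical wrapper: try the data package first, then plain pandas."""
    # Imports are local so that first_non_empty itself has no third-party
    # dependencies; pandas needs an Excel engine (e.g. openpyxl) for .xlsx.
    import pandas as pd
    from frictionless import Package

    def from_package() -> Any:
        # Same calls as in the report above (currently yields an empty frame).
        return Package("datapackage.yaml").get_resource("mega-sena").to_pandas()

    def from_excel() -> Any:
        # Assumption: read the downloaded file directly, keeping all columns.
        return pd.read_excel("download/Mega-Sena.xlsx")

    return first_non_empty([from_package, from_excel])
```

Calling `df = load_mega_sena()` in place of `mega.to_pandas()` would then return the directly read frame only when the package resource comes back empty. Note that the dtypes of the two paths may differ, since the fallback bypasses the schema declared in `datapackage.yaml`.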
[^1]: To run it, install the packages listed in the `requirements.txt` file and run `python scripts/pandas_excel.py`.