How to read an existing dataframe

Open code4indo opened this issue 1 year ago • 7 comments

🐛 Describe the bug

The generated code expects a CSV file such as file_name.csv, as shown in the example below, instead of using the DataFrame I passed in. How can I make it read the existing DataFrame? I'm running in a Jupyter notebook.

Running PandasAI with openai LLM...

Code generated:

import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv('nama_file.csv')
suspicious_df = df[df['Is Laundering'] == 1]
for currency in suspicious_df['Receiving Currency'].unique():
    currency_df = suspicious_df[suspicious_df['Receiving Currency'] == currency
        ]
    plt.scatter(currency_df['Amount Received'], currency_df['Amount Paid'],
        label=currency)
plt.xlabel('Amount Received')
plt.ylabel('Amount Paid')
plt.legend()
plt.show()

code4indo avatar May 09 '23 09:05 code4indo

@code4indo thanks for reporting. Would you mind sharing the prompt you used?

gventuri avatar May 09 '23 09:05 gventuri

pandas_ai.run(df, prompt="Create a scatterplot showing the relationship between the amount of money received and the amount of money paid in transactions suspected of money laundering, based on the currency used")

code4indo avatar May 09 '23 09:05 code4indo

For testing purposes I am using this dataset: https://www.kaggle.com/datasets/ealtman2019/ibm-transactions-for-anti-money-laundering-aml

code4indo avatar May 09 '23 09:05 code4indo

@code4indo Could you share the dataset as well, if it's public, or a mock version?

yzaparto avatar May 09 '23 09:05 yzaparto

You can download the CSV data from the link above.

code4indo avatar May 09 '23 09:05 code4indo

@code4indo @gventuri The code generated by the LLM is wrong. It should not contain the line df = pd.read_csv('nama_file.csv'). I have created a PR to try to alleviate this problem by providing a better prompt that links the question with the df already supplied.

yzaparto avatar May 09 '23 10:05 yzaparto
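
To illustrate the comment above, here is a minimal sketch of what the generated code could look like if it worked on the DataFrame that was passed in rather than reading a CSV. The function name plot_suspicious and the mock rows are hypothetical and only for illustration; the column names are taken from the generated code in the report. This is not the code PandasAI produces after the fix.

```python
import matplotlib.pyplot as plt
import pandas as pd


def plot_suspicious(df: pd.DataFrame) -> plt.Axes:
    """Scatter Amount Received vs. Amount Paid for flagged rows, one series per currency."""
    # Work on the DataFrame that was passed in; no pd.read_csv call is needed.
    suspicious = df[df["Is Laundering"] == 1]
    _, ax = plt.subplots()
    for currency, group in suspicious.groupby("Receiving Currency"):
        ax.scatter(group["Amount Received"], group["Amount Paid"], label=currency)
    ax.set_xlabel("Amount Received")
    ax.set_ylabel("Amount Paid")
    ax.legend()
    return ax


# Hypothetical mock rows standing in for the Kaggle dataset (assumption).
df = pd.DataFrame(
    {
        "Is Laundering": [1, 0, 1, 1],
        "Receiving Currency": ["US Dollar", "Euro", "Euro", "US Dollar"],
        "Amount Received": [100.0, 250.0, 80.0, 40.0],
        "Amount Paid": [100.0, 250.0, 75.0, 40.0],
    }
)
ax = plot_suspicious(df)
```

In a notebook the figure should render inline; in a plain script, calling plt.show() afterwards displays it.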

Hello, I use PyCharm to run the code. It runs successfully, but it doesn't show the result. What should I do? Thank you.

hunaone avatar May 13 '23 12:05 hunaone

Closing, as the hallucination that overrides the df has now been fixed.

gventuri avatar May 27 '23 09:05 gventuri