
UnicodeEncodeError: 'charmap' codec can't encode characters in position 128-129: character maps to <undefined>

Open KevinZhang19870314 opened this issue 5 months ago • 0 comments

System Info

pandasai: 1.5.17
OS: Windows 11
Python version: 3.10

🐛 Describe the bug

A logging error is raised when the CSV file contains the Chinese characters "厘米":

name,age,height
Kevin,18,“170厘米”
Ada,22,172cm
John,25,180cm
Nana,12,165cm
Jody,23,180cm
Sam,88,165cm
Joe,22,170cm
Average,30,172cm

Python code:

from dotenv import load_dotenv
from pandasai import Agent
from pandasai.llm import OpenAI

load_dotenv()

df = "data/csv_analysis.csv"
llm = OpenAI(model="gpt-3.5-turbo")
agent = Agent([df], config={"llm": llm, "enable_cache": False})

response = agent.chat("Print out the number of rows in the table.")

print(response)

Log:

D:\\venv\Scripts\python.exe D:\\chore\pandasai\test_csv.py 
--- Logging error ---
Traceback (most recent call last):
  File "C:\Users\kevin.zhang\AppData\Local\Programs\Python\Python310\lib\logging\__init__.py", line 1103, in emit
    stream.write(msg + self.terminator)
  File "C:\Users\kevin.zhang\AppData\Local\Programs\Python\Python310\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode characters in position 128-129: character maps to <undefined>
Call stack:
  File "D:\\chore\pandasai\test_csv.py", line 12, in <module>
    response = agent.chat("Print out the number of rows in the table.")
  File "D:\\venv\lib\site-packages\pandasai\agent\__init__.py", line 87, in chat
    return self._lake.chat(query, output_type=output_type)
  File "D:\\venv\lib\site-packages\pandasai\smart_datalake\__init__.py", line 337, in chat
    result = GenerateSmartDatalakePipeline(pipeline_context, self.logger).run()
  File "D:\\venv\lib\site-packages\pandasai\pipelines\smart_datalake_chat\generate_smart_datalake_pipeline.py", line 50, in run
    return self._pipeline.run()
  File "D:\\venv\lib\site-packages\pandasai\pipelines\pipeline.py", line 87, in run
    data = logic.execute(
  File "D:\\venv\lib\site-packages\pandasai\pipelines\smart_datalake_chat\prompt_generation.py", line 68, in execute
    return pipeline_context.query_exec_tracker.execute_func(
  File "D:\\venv\lib\site-packages\pandasai\helpers\query_exec_tracker.py", line 128, in execute_func
    result = function(*args, **kwargs)
  File "D:\\venv\lib\site-packages\pandasai\smart_datalake\__init__.py", line 305, in _get_prompt
    self.logger.log(f"Using prompt: {prompt}")
  File "D:\\venv\lib\site-packages\pandasai\helpers\logger.py", line 75, in log
    self._logger.info(message)
Message: 'Using prompt: <dataframe>\ndfs[0]:8x3\nname,age,height\r\nJohn,25,180cm\r\nAda,22,172cm\r\nKevin,18,“170厘米”\r\n</dataframe>\n\n\n\n\nUpdate this initial code:\n```python\n# TODO: import the required dependencies\nimport pandas as pd\n\n# Write code here\n\n# Declare result var: type (possible values "string", "number", "dataframe", "plot"). Examples: { "type": "string", "value": f"The highest salary is {highest_salary}." } or { "type": "number", "value": 125 } or { "type": "dataframe", "value": pd.DataFrame({...}) } or { "type": "plot", "value": "temp_chart.png" }\n```\n\nQ: Print out the number of rows in the table.\nVariable `dfs: list[pd.DataFrame]` is already declared.\n\nAt the end, declare "result" variable as a dictionary of type and value.\n\n\n\nGenerate python code and return full updated code:'
Arguments: ()
8

Process finished with exit code 0
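
Note that the traceback comes from the logging handler, not from the query itself: the script still prints 8 and exits with code 0. The handler's stream uses cp1252, which has no mapping for "厘" or "米" (the curly quotes around the value are in cp1252, which fits the two-character range 128-129 in the error). Below is a minimal sketch of a possible workaround, not a confirmed fix from pandasai. It assumes the pandasai logger writes to `sys.stdout` or `sys.stderr`, and the helper names `can_encode` and `force_utf8_console` are made up for illustration.

```python
import sys

# One data row from the CSV above, including the characters that fail.
SAMPLE = "Kevin,18,“170厘米”"


def can_encode(text: str, encoding: str) -> bool:
    """Return True if every character in `text` is representable in `encoding`."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


def force_utf8_console() -> None:
    """Switch stdout/stderr to UTF-8 in place (hypothetical helper).

    TextIOWrapper.reconfigure() exists since Python 3.7. Streams that lack it
    (for example ones replaced by an IDE or test runner) are left untouched.
    """
    for stream in (sys.stdout, sys.stderr):
        reconfigure = getattr(stream, "reconfigure", None)
        if reconfigure is not None:
            # errors="replace" is an extra safety net; it is a no-op for UTF-8 output.
            reconfigure(encoding="utf-8", errors="replace")


if __name__ == "__main__":
    print("cp1252 can encode sample:", can_encode(SAMPLE, "cp1252"))
    print("utf-8 can encode sample:", can_encode(SAMPLE, "utf-8"))
    force_utf8_console()  # assumption: call this before creating the Agent
    print(SAMPLE)
```

If this assumption about the log stream holds, calling `force_utf8_console()` at the top of test_csv.py, before `Agent(...)` is created, should let the "Using prompt: ..." message be written without the encode error. Setting the `PYTHONIOENCODING=utf-8` environment variable before launching Python is another option that avoids changing the script.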

KevinZhang19870314, Feb 04 '24 07:02