
decimal logical type: TypeError: can only concatenate str (not "int") to str

Open balaby25 opened this issue 1 year ago • 1 comments

I am trying to convert an existing CSV file to Avro using pandavro.

I am not able to resolve the error below:

    File "fastavro/_logical_writers.pyx", line 130, in fastavro._logical_writers.prepare_bytes_decimal
    File "fastavro/_logical_writers.pyx", line 143, in fastavro._logical_writers.prepare_bytes_decimal
    TypeError: can only concatenate str (not "int") to str

I checked my CSV, my avsc, and my pandavro code multiple times, but I am not able to find the problem. I am not savvy enough to call it a bug. Can anyone provide me with some pointers?

Data in the CSV: 999.879

The column p_cost in the avsc:

    { "name": "p_cost", "type": {"name": "decimalEntry", "type": "bytes", "logicalType": "decimal", "precision": 15, "scale": 3} },

The lines of code:

    # Imports were not shown in the post; these are the assumed ones.
    from decimal import Decimal, getcontext

    import pandas as pd
    import pandavro as pdx
    from fastavro.schema import load_schema  # assumption: load_schema comes from fastavro

    def convert_to_decimal(val):
        """
        Convert the string number value to a Decimal
        - Must set precision and scale beforehand
        """
        return Decimal(val)

    schema_promotion = load_schema("promotion.avsc")
    df_promotion = pd.read_csv(
        '/scratch/tpcds_1/promotion/promotion.dat',
        delimiter='|',
        header=None,
        usecols=[0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18],
        names=['p_promo_sk',....,'p_cost',...,'p_discount_active'],
        dtype={'p_cost': 'str'})

    getcontext().prec = 15  # set precision of all future decimals
    type(df_promotion['p_cost'])

    df_promotion['p_cost'] = df_promotion['p_cost'].apply(convert_to_decimal)
    pdx.to_avro('test_promotion.avro', df_promotion, schema=schema_promotion )

This throws the error below:

    Traceback (most recent call last):
      File "perfectlyrandom.py", line 313, in <module>
        promotion()
      File "perfectlyrandom.py", line 262, in promotion
        pdx.to_avro('test_promotion.avro', df_promotion, schema=schema_promotion )
      File "/home/opc/.local/lib/python3.8/site-packages/pandavro/__init__.py", line 322, in to_avro
        fastavro.writer(f, schema=schema,
      File "fastavro/_write.pyx", line 727, in fastavro._write.writer
      File "fastavro/_write.pyx", line 680, in fastavro._write.Writer.write
      File "fastavro/_write.pyx", line 432, in fastavro._write.write_data
      File "fastavro/_write.pyx", line 422, in fastavro._write.write_data
      File "fastavro/_write.pyx", line 366, in fastavro._write.write_record
      File "fastavro/_write.pyx", line 387, in fastavro._write.write_data
      File "fastavro/_logical_writers.pyx", line 130, in fastavro._logical_writers.prepare_bytes_decimal
      File "fastavro/_logical_writers.pyx", line 143, in fastavro._logical_writers.prepare_bytes_decimal
    TypeError: can only concatenate str (not "int") to str
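One possible explanation, not confirmed anywhere in this thread: the message looks like what Python produces when a string is added to an integer, and a Decimal that holds NaN reports its exponent as the string 'n' rather than an int. If some p_cost cells are empty, pandas would read them as float NaN even with dtype 'str', and Decimal(val) would turn them into Decimal('NaN'). The sketch below uses only the standard library; has_numeric_exponent is a hypothetical helper, and the idea that fastavro adds the schema scale to the exponent is an assumption.

```python
from decimal import Decimal


def has_numeric_exponent(value: Decimal) -> bool:
    """Hypothetical helper: True when the exponent is an int (a finite number)."""
    return isinstance(value.as_tuple().exponent, int)


finite = Decimal("999.879")
missing = Decimal(float("nan"))  # what Decimal(val) yields for a float NaN cell

print(has_numeric_exponent(finite))   # True
print(has_numeric_exponent(missing))  # False

# Assumption: the writer computes something like exponent + scale.
# With a string exponent and scale 3, that addition fails like this:
try:
    missing.as_tuple().exponent + 3
    message = None
except TypeError as exc:
    message = str(exc)

print(message)
```

If this is the cause, checking df_promotion['p_cost'].isna().sum() before the apply call should report a non-zero count.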

If the full schema definition and the pandas DataFrame definition are needed, I will provide them.

pip list:

    avro-python3 1.10.2
    fastavro 1.5.1
    numpy 1.23.3
    pandas 1.5.0
    pandavro 1.7.1
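For narrowing the problem down, a stricter variant of convert_to_decimal could fail early, with the offending value in the message, instead of failing inside the writer. This is only a sketch: to_scaled_decimal is a hypothetical name, the scale of 3 is taken from the p_cost schema above, and quantize rounds values that carry more than three fractional digits, which may or may not be acceptable for this data.

```python
from decimal import Decimal, InvalidOperation

SCALE = 3  # from the p_cost schema: "scale": 3


def to_scaled_decimal(val, scale=SCALE):
    """Hypothetical converter: reject missing values, fix the number of decimals."""
    try:
        number = Decimal(str(val))
    except InvalidOperation:
        raise ValueError(f"not a number: {val!r}") from None
    if not number.is_finite():
        # Covers NaN (for example an empty CSV cell) and infinities.
        raise ValueError(f"missing or non-finite value: {val!r}")
    # Pad or round to exactly `scale` fractional digits.
    return number.quantize(Decimal(1).scaleb(-scale))


print(to_scaled_decimal("999.879"))
print(to_scaled_decimal("5"))
```

Applied with df_promotion['p_cost'].apply(to_scaled_decimal), the first bad row would raise a ValueError that names the value.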

balaby25, Nov 14 '22 08:11