
Exception: Assert_failure ("src/base/misc/owl_dataframe.ml", 753, 2).

Open theodore-weld opened this issue 6 years ago • 4 comments

I was checking out the dataframe examples and rewrote the first function slightly to handle some price data from Binance (11 columns; all floats). When I ran that function on a file of ETC/BTC price data with 1.2 million lines, it returned:

Exception: Assert_failure ("src/base/misc/owl_dataframe.ml", 753, 2).

I also ran the function on a different (much smaller) file of price data, and it worked just fine.

The function:

let csv_parser path = 
    let fname = path ^ "ETC-BTC.csv" in
    let types = [|"f";"f";"f";"f";"f";"f";"f";"f";"f";"f";"f"|] in
    let df = Dataframe.of_csv ~sep:',' ~types fname in
    Owl_pretty.pp_dataframe Format.std_formatter df

Here's a sample of the price data:

open_time,open,high,low,close,volume,close_time,quote_asset_volume,number_of_trades,taker_buy_base_asset_volume,taker_buy_quote_asset_volume,ignore
1507800720000,0.00223,0.00223,0.00223,0.00223,10.0,1507800779999,0.0223,1,10.0,0.0223,900.48

What should I be doing differently in order to use this with larger files?

theodore-weld, Feb 20 '20 04:02

We should add a task to remove all asserts from the code. My guess is that there is a row with a different length. Can you check if I am wrong?
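One way to check for a row with a different length is to count the fields on every line and list the lines that deviate. The following is a minimal sketch that uses only the OCaml standard library; `field_count` and `mismatched_lines` are hypothetical helper names, not part of Owl, and the naive split on the separator assumes that no field contains a quoted comma.

```ocaml
(* Hypothetical helpers, not part of Owl. *)

(* Number of [sep]-separated fields on one line (naive split, no quoting). *)
let field_count ?(sep = ',') line =
  List.length (String.split_on_char sep line)

(* Read [ic] to the end and return (line_number, field_count) for every line
   whose field count differs from [expected]. Line numbers are 1-based and
   include the header line. *)
let mismatched_lines ?(sep = ',') ~expected ic =
  let rec loop lineno acc =
    match input_line ic with
    | line ->
      let n = field_count ~sep line in
      loop (lineno + 1) (if n = expected then acc else (lineno, n) :: acc)
    | exception End_of_file -> List.rev acc
  in
  loop 1 []
```

For the file in the report, this could be called as `mismatched_lines ~expected:11 (open_in fname)`, where 11 is the length of the `types` array. Note that the sample header quoted in the report contains 12 comma-separated names.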

mseri, Feb 21 '20 12:02

Consider using https://github.com/Chris00/ocaml-csv

code-ghalib, Nov 15 '20 22:11

Looks like the assertion fails exactly when the csv has >100 lines.

endorphin, Jul 15 '21 04:07

This issue is quite old and the code may have changed since then. However, 'assertion when CSV > 100 lines' is very similar to the bug I fixed here: https://github.com/owlbarn/owl/pull/639. In that case the assertion was used for control flow, but it was printed to the console and ignored (the code that guesses the CSV separator and types wants to stop iterating once it has read more than 100 lines of a file). Replacing the assert with a proper exception makes the assertion failure go away.
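To illustrate the idea, here is a minimal sketch of stopping an iteration early with a dedicated exception instead of an `assert`. It is not the code from the PR: `Stop_scan`, `take_lines`, and the `iter_lines` callback are hypothetical names chosen for this example.

```ocaml
(* Hypothetical names; this is not the code from the PR. *)

(* Raised only to leave the iteration once enough lines have been seen. *)
exception Stop_scan

(* Collect at most [limit] lines from [iter_lines], a function that applies
   its argument to each line in turn. The exception is caught right here, so
   callers never see it. *)
let take_lines ?(limit = 100) iter_lines =
  let acc = ref [] and n = ref 0 in
  (try
     iter_lines (fun line ->
         if !n >= limit then raise Stop_scan;
         acc := line :: !acc;
         incr n)
   with Stop_scan -> ());
  List.rev !acc
```

A dedicated exception states the intent directly and can be matched precisely, whereas an `Assert_failure` carries only a source location and usually signals a programming error.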

Could you check whether my PR fixes your problem?

edwintorok, May 23 '23 20:05