Use duckdb loading throughout pyprophet
With export-parquet, pyprophet gained an additional dependency on duckdb, which allows for fast SQL queries, especially those involving many joins.
This PR rolls out duckdb SQL queries across pyprophet for more efficient data loading.
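As a rough illustration of the idea (not the code in this PR), a join-heavy load from an .osw file could go through duckdb's SQLite extension as sketched below. The function name `load_scored_features`, the `max_qvalue` parameter, and the table and column names (PRECURSOR, FEATURE, SCORE_MS2 and their fields) are assumptions made for the example; running it needs duckdb and pandas installed, plus the duckdb sqlite extension.

```python
from pathlib import Path

# Hypothetical query: table and column names are assumed from the usual
# OpenSWATH .osw layout and may not match every file.
SCORED_FEATURES_SQL = """
SELECT
    p.ID     AS precursor_id,
    p.CHARGE AS charge,
    f.ID     AS feature_id,
    f.EXP_RT AS exp_rt,
    s.QVALUE AS q_value
FROM osw.PRECURSOR AS p
JOIN osw.FEATURE   AS f ON f.PRECURSOR_ID = p.ID
JOIN osw.SCORE_MS2 AS s ON s.FEATURE_ID = f.ID
WHERE s.QVALUE <= ?
"""


def load_scored_features(osw_path, max_qvalue=0.01):
    """Load joined precursor/feature/score rows from an .osw (SQLite) file."""
    # Imported lazily so that duckdb stays an optional dependency.
    import duckdb

    con = duckdb.connect()  # in-memory database used only as a query engine
    try:
        # The sqlite extension lets duckdb read SQLite files directly.
        con.execute("INSTALL sqlite")
        con.execute("LOAD sqlite")
        # ATTACH takes a string literal, so escape single quotes in the path.
        escaped = str(Path(osw_path)).replace("'", "''")
        con.execute(f"ATTACH '{escaped}' AS osw (TYPE sqlite)")
        # The joins run inside duckdb; .df() returns a pandas DataFrame.
        return con.execute(SCORED_FEATURES_SQL, [max_qvalue]).df()
    finally:
        con.close()
```

Calling `load_scored_features("run.osw")` (with a placeholder file name) would return one row per scored feature under these assumptions.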
Examples
Conducted on a Dell XPS running Ubuntu.
Export Command
time pyprophet export --in=39041_Hela_500ng_15SPD_DIA_Py3_1_S2-A7_1_4502.osw
Old timings: real 0m56.284s user 0m35.997s sys 0m15.130s
New timings: real 0m12.832s user 0m40.578s sys 0m8.378s
Score Command
- Only 1 iteration, so most of the time shown is spent loading the data
time pyprophet score --in=39041_Hela_500ng_15SPD_DIA_Py3_1_S2-A7_1_4502.osw --ss_num_iter=1
Old timings: real 0m59.466s user 1m30.275s sys 0m11.004s
New timings: real 0m30.482s user 1m21.186s sys 0m9.460s
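To make the comparison easier to read, the small script below turns the `time` output above into wall-clock speedups and a CPU-time-per-wall-second ratio. The helper names (`to_seconds`, `summarize`) are made up for this sketch; the numbers are copied from the timings in this description.

```python
import re


def to_seconds(stamp: str) -> float:
    """Convert a `time` field such as '1m30.275s' into seconds."""
    match = re.fullmatch(r"(\d+)m([\d.]+)s", stamp)
    if match is None:
        raise ValueError(f"unrecognized time stamp: {stamp!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


# (real, user, sys) for each command, copied from the timings above.
TIMINGS = {
    "export": {
        "old": ("0m56.284s", "0m35.997s", "0m15.130s"),
        "new": ("0m12.832s", "0m40.578s", "0m8.378s"),
    },
    "score": {
        "old": ("0m59.466s", "1m30.275s", "0m11.004s"),
        "new": ("0m30.482s", "1m21.186s", "0m9.460s"),
    },
}


def summarize(old, new):
    """Return the wall-clock speedup and CPU seconds used per wall second."""
    old_real, old_user, old_sys = map(to_seconds, old)
    new_real, new_user, new_sys = map(to_seconds, new)
    return {
        "speedup": old_real / new_real,
        "old_cpu_per_wall": (old_user + old_sys) / old_real,
        "new_cpu_per_wall": (new_user + new_sys) / new_real,
    }


for command, runs in TIMINGS.items():
    stats = summarize(runs["old"], runs["new"])
    print(
        f"{command}: {stats['speedup']:.2f}x faster wall-clock, "
        f"CPU/wall {stats['old_cpu_per_wall']:.2f} -> {stats['new_cpu_per_wall']:.2f}"
    )
```

This works out to roughly a 4.4x wall-clock speedup for export and about 1.95x for score. Total CPU time (user + sys) changes much less than wall-clock time, and the CPU-per-wall ratio rises from about 0.9 to 3.8 for export and from about 1.7 to 3.0 for score, which would be consistent with the new loading path spreading work across several cores.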
I am not sure why the tests are not being run.
OK, I think the tests are passing; they are just not appearing in this PR for some reason.
Closing this, as the recent PR #142 covers it.