Extend compare methods with pandas support
This is a proposal, and I can implement it myself.
We currently have some basic comparison keywords. We could add more keywords and use an optional pandas dependency to compare the data and display the differences. It could be a great help for QAs, as it usually takes a long time to find the differences in typical query output. To avoid complexity, the compare keywords could focus on checking the query results instead of running the queries themselves.
Example keyword:
${clients}    Query    SELECT * FROM ${table} WHERE client_id = ${client_id}
# names are temporary, I'm bad at naming
Query Result Should Be Equal To    ${clients}    ${path_to_csv_file}
Query Result Should Be Equal To    ${clients}    ${pandas_dataframe}
Query Result Should Be Equal To    ${clients}    ${list_of_dicts}
# types could be interchangeable thanks to RF converters - ${clients} could also be a list of dicts / pandas df / path to file
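To illustrate the "interchangeable types" idea, here is a rough sketch of how the inputs could be normalised to a DataFrame before comparing. The helper name `to_dataframe` is hypothetical and not part of DatabaseLibrary or Robot Framework; in a real library this logic would probably live in a custom argument converter.

```python
import os

import pandas as pd


def to_dataframe(value):
    """Hypothetical helper: turn a supported input into a pandas DataFrame."""
    if isinstance(value, pd.DataFrame):
        # Already a DataFrame, nothing to convert.
        return value
    if isinstance(value, (str, os.PathLike)):
        # Assumption: a string or path-like value points to a CSV file.
        return pd.read_csv(value)
    # Assumption: anything else is an iterable of rows, e.g. a list of dicts.
    return pd.DataFrame(list(value))
```

With something like this, the keyword itself would only ever deal with two DataFrames, whatever the caller passed in.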
A difference would raise an error and display a difference table as HTML. Quick POC using the following two CSV files:
file1.csv
client_id,age,city
1,15,Munich
2,84,Washington
3,55,Gdynia
4,43,Paris
file2.csv
client_id,age,city
1,15,Munich
2,84,Washington
3,54,Gdynia
4,43,Paris
robot:
Compare CSV    file1.csv    file2.csv
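For reference, a minimal sketch of what such a keyword could do internally, assuming pandas 1.1 or newer (for `DataFrame.compare`). The function names and the `key` parameter are made up for illustration, and the two files are inlined so the snippet runs on its own. It relies on Robot Framework rendering failure messages that start with `*HTML*` as HTML in the log.

```python
import io

import pandas as pd

FILE1 = """client_id,age,city
1,15,Munich
2,84,Washington
3,55,Gdynia
4,43,Paris
"""

FILE2 = """client_id,age,city
1,15,Munich
2,84,Washington
3,54,Gdynia
4,43,Paris
"""


def csv_diff(left, right, key="client_id"):
    """Return only the differing cells of two CSV sources, indexed by `key`."""
    expected = pd.read_csv(left).set_index(key)
    actual = pd.read_csv(right).set_index(key)
    # DataFrame.compare needs identical row and column labels on both sides;
    # it raises ValueError otherwise, so a real keyword would check that first.
    return expected.compare(actual)


def compare_csv(left, right, key="client_id"):
    """Hypothetical keyword body: fail with an HTML table of the differences."""
    diff = csv_diff(left, right, key)
    if not diff.empty:
        raise AssertionError("*HTML* " + diff.to_html())


if __name__ == "__main__":
    # Prints the single differing row (client_id 3, column age).
    print(csv_diff(io.StringIO(FILE1), io.StringIO(FILE2)))
```

In the resulting table, the `self` column holds the value from the first file and `other` the value from the second.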
Robot log:
@amochin This idea is only loosely related to DatabaseLibrary: it could be a great help for people using the library, but it doesn't necessarily require DatabaseLibrary. We could either implement it internally, or I could create an external package (like "CompareLibrary" or whatever). I'm leaning toward the second option, and then advertising in this library's documentation that this library focuses on querying/using databases and that the actual comparison can be done with this or that library.
@bhirsz I think it's a really great feature! I'd go along with the proposal of putting it in an extra package and extending the documentation. Let's keep DatabaseLibrary focused on the SQL requests - it's already complicated enough because of the different databases and their Python implementations...