haystack
haystack copied to clipboard
feat: Save the output of pipe.eval() as a preformatted Excel
Related Issues
- fixes #2431
Proposed Changes:
There are two new functions in schema.py
, save_excel()
and load_excel()
, which work similarly to save()
and load()
. The main difference is that they write and read a file in excel format.
-
save_excel()
saves the evaluation result in Excel format. The result of each node is saved in a separate sheet of the out_file file. The output file is generated according to the eval_template.xlsx template file. For each node, if its corresponding sheet appears in the template and contains a header line, the node evaluation results are appended respecting the specified columns. If there is no preexisting sheet or no header line, all metrics are written. Formatting options from the template still appear in the generated file. multilabel_id column of created file is saved as string to prevent automatic rounding of large numbers (numbers that have greater than 15 digits) by excel. -
load_excel()
loads the evaluation result saved bysave_excel()
How did you test it?
Updated test_extractive_qa_eval and test_extractive_qa_eval_multiple_queries test functions.
Notes for the reviewer
This is still a draft; I prefer to make an early PR to get feedback as suggested in the contribution guidelines.
The save_excel()
function creates only one file with several sheets, instead of several files like it's done in save()
. It seemed to be the "logical" way to do with excel documents.
As I work with Linux, I created the template file and ran the tests with LibreOffice. I believe it would be safer to create the template file and run some more tests with official Microsoft Excel.
I also fixed what I believe to be a small issue with updated test functions: the code saved and reloaded eval results, but then the same tests were run again on the same unchanged variable; so I updated this variable with the loaded data (cf. commit 04afb133bb3178213ec9098355310800cc69141a). I probably should have create separate issue and PR for this, and I can still do it if you prefer.
TODO:
- recreate template file with Microsoft Office
- add template sheet for each node type (only
Retriever
andReader
have been created) - delete sheets coming from template which were not used for a given evaluation result instance (there was no corresponding evaluated node)
- move template file somewhere else -> where ?
- multiple lines are repeated in updated test functions -> factorize them?
Checklist
- [X] I have read the contributors guidelines and the code of conduct
- [ ] I have updated the related issue with new insights and changes
- [X] I added tests that demonstrate the correct behavior of the change
- [X] I've used the conventional commit convention for my PR title
- [X] I documented my code
- [X] I ran pre-commit hooks and fixed any issue
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.
Hi @agnieszka-m! Thank you for your review :) I committed all your suggestions.
Thanks for the feedback @tstadel ! ok, I will try formatting the excel in code instead of using a template.
@tstadel I added some basic formatting using openpyxl.
pylint still produces an error because I am "instantiating" an abstract class (whose __new__()
function actually returns an instance of one of its subclasses). This seems to be [the recommended way](https://pandas.pydata.org/docs/reference/api/pandas.ExcelWriter.html).
I could remove the error by directly instantiating the concrete subclass, replacing
with pd.ExcelWriter(out_file, engine="openpyxl") as writer:
with
with pd.io.excel._openpyxl.OpenpyxlWriter(out_file) as writer:
but it doesn't feel right to me. Any opinion on this?
@tstadel I added some basic formatting using openpyxl.
pylint still produces an error because I am "instantiating" an abstract class (whose
__new__()
function actually returns an instance of one of its subclasses). This seems to be [the recommended way](https://pandas.pydata.org/docs/reference/api/pandas.ExcelWriter.html).I could remove the error by directly instantiating the concrete subclass, replacing
with pd.ExcelWriter(out_file, engine="openpyxl") as writer:
with
with pd.io.excel._openpyxl.OpenpyxlWriter(out_file) as writer:
but it doesn't feel right to me. Any opinion on this?
There's an issue in pylint for this: https://github.com/PyCQA/pylint/issues/3060 So, let's suppress it like it's suggested here.
I think everything is ok now. Apologies for the multiple runs of the CI actions.
@tstadel thanks again for the review.
load_excel
doesn't work for me if I usesave_excel
andload_excel
in Tutorial 5 instead of the ordinarysave
andload
functions. Let's ensure this works
→ load_excel()
indeed lacked robustness; I fixed it for this case.
- When saving to excel the resulting xlsx file still has a
index
column. I'm not sure whether this is a pandas bug, but let's investigate that.
→ It seems that it is not related to save_excel()
but to the dataframes returned by pipeline.eval()
which contain a column explicitly named index
(these dataframes are created here).
The same phenomenon can be observed in files created by save()
(for instance when running Tutorial 5).
hey @camillepradel, thanks a lot for your work, this is awesome! :) Do you think it would also be possible to make the design more generic? Design is a quite subjective thing I think and it would be nice if we could get people to define their own style, e.g. font, header format, row colors, freeze header. You used before a template I saw, was this for the whole structure (column labels etc.), or just the design? If it just uses the design, I think it would be a nice alternative to give the user more customization options if a template was provided
Hi @ju-gu. The template was used for the design but it depended on the data to be saved as it was required to define a sheet with a header line for each node eval result to save (this in order to specify columns to write and their respective formatting).
We could make a compromise by adding an optional parameter template_file
to save_excel()
.
- If the parameter is provided, we use the template file the same way it was done in the early implementation
The output file is generated according to the template file. For each node, if a sheet with the same name appears in the template and contains a header line, the node evaluation results are appended respecting the specified columns and the formatting options from the template are kept. If there is no preexisting sheet or no header line, all metrics are written and no formatting is applied.
- Otherwise, we apply basic formatting as it is done in current implementation.
Would you like me to implement it that way, or were you thinking about something else?
that is already pretty cool. But as @tstadel pointed out it would mean a template for each node, which need to be kept up to date. I think it would be better if we could just use the design of a template. Then it would also work if the headers are not the same. Do you have an idea how to do this? Maybe by just defining a header and then overwriting it?
hey, @camillepradel so I checked again with @ZanSara and it would be also fine if you would just add the previous template option as optional, as you previously described :). I actually will use this a lot, as I am doing a lot of analyzation. So thanks again, for your contribution! looking very much forward to see this :)
Hi @ju-gu ! I restored the template option. There is now only one template sheet, as you suggested. It is a bit less customizable, but much more simple!
thanks! That sounds very good :). I did yet not get to test it and won't for this week, so maybe @bogdankostic and @bglearning can have a look?
Hi @camillepradel, thanks for your PR! I will have a look at this in the following days. In the meantime, could you please delete the setup.cfg
file from your branch and add the dependency to the pyproject.toml
file? We recently restructured our package to comply with PEP 621.
Closing for lack of activity