knitr icon indicating copy to clipboard operation
knitr copied to clipboard

Add 'rows affected' output when using sql engine

Open edalfon opened this issue 3 years ago • 3 comments

Adds a default output for sql chunks where is_sql_update_query evaluates to TRUE, as proposed in #2050.

It does it simply by:

  • Commit 1: propagating DBI::DBI::dbExecute's result to data, so that users can set output.var and get the number of affected rows assigned to a variable.
  • Commit 2: adding a default output, indicating the number of rows affected.

I included a line in NEWS.md describing this. However, I did not include a test. I was not familiar with knitr's testing procedures, but I learned from this recent PR (#1987) that there is one major test for knitr's sql engine in another repository (https://raw.githubusercontent.com/yihui/knitr-examples/master/115-engine-sql.Rmd). This already covers the changes in this PR. I used it to test the changes and everything works. So I didn't really think it was necessary to include additional tests.

A final note: as discussed in knitr's issue #1637, I think a similar change would be needed on RStudio's side, to make it also work when running chunks interactively (RStudio's Run current chunk, Run all, etc.). I haven't looked into that, though, and I have no idea who to poke for this, but I'd be happy to take a look.

Huge thanks for knitr and hope this helps a tiny bit.

edalfon avatar Sep 25 '21 22:09 edalfon

CLA assistant check
All committers have signed the CLA.

CLAassistant avatar Sep 25 '21 22:09 CLAassistant

Many thanks for reviewing this. I agree with pretty much everything you said. In particular, that saving data for the result of dbExecute() is key. And although I share your concerns on printing, perhaps I just feel less strongly about them.

And the reason I feel less strongly about those concerns is that I think the most common use cases for this output are:

  • to get quick feedback while interactively trying out queries
  • to leave documented the result of data processing on the DB side

But I did not think such an output would be mainly used in -let's say- a public facing document (e.g. final report or the like). In that case, as you said, the users can set output.var and format and use it as they wish.

But for the two use cases above, I thought a quick default without configuring anything else would be useful (the user simply keep using a {sql} chunk and there you go). Therefore, I put a quick-and-dirty approach in that second commit.

I totally share your concerns. It's not cool forcing results = 'asis', hard-coding output in English and a particular number format. Ideally, the user should be able to customize all that. But I was not sure it was worth doing it. After all, if the users do want a particular format for that info, they can do it via output.var. So, I guess one should weigh the benefits of doing that vs. the costs in terms of code complexity, maintainability, and so on. Perhaps I perceived that the costs outweigh the benefits. But you are for sure in a better position to make such assessment.

edalfon avatar Sep 27 '21 14:09 edalfon

a note on this

Also, regarding the textual output, I am thinking about your comment

it seems SQLite does not actually return the number of rows affected (apparently always returns 0). Yet, this is pretty standard in most RDBMS and it is also the default in DBI::dbExecute, so I guess this is just a quirk specific to SQLite.

If we are not sure that all RDBMS outputs a number of rows, I guess it would be puzzling to have written in your report

My bad here!!! SQLite do also return the number of rows affected. I wrote that after a couple of failed attempts, but I cannot really reproduce them now (perhaps I just looked at the results of DROP TABLE, which are in fact usually 0).

Having said that, DBI::dbExecute can indeed return puzzling results. For example, when you combine several queries, I've seen weird numbers returned by DBI::dbExecute with a Postgres backend (PostgreSQL do let you send multiple statements at once, but duckdb, for example, throws an error. SQLite, I think, only processes the first statement and ignores the rest). But my take on this is that's DBI's deal and usage, and knitr cannot and should not solve that.

edalfon avatar Sep 27 '21 14:09 edalfon