False positives with `Rf_getAttrib()`
It seems that recently (sometime in the past two days), rchk started incorrectly flagging calls to Rf_getAttrib(). Specifically, if given that SEXP x is protected, when calling y = Rf_getAttrib(x, sym), y is protected too. E.g., this pattern:
extern SEXP sym_foo; // e.g., Rf_install("foo");
func_(SEXP x) {
SEXP attr_foo = Rf_getAttrib(x, sym_foo);
// rchk thinks attr_foo needs to be PROTECT()ed
SEXP attr_foo_foo = Rf_getAttrib(attr_foo, sym_foo);
// rchk complains that
// - allocating function Rf_getAttrib(?,S:foo) may destroy its unprotected argument
// if `attr_foo_foo` is passed to an R API function,
// rchk complains that
// - calling allocating function <func> with a fresh pointer (attr_foo_foo)
}
It seems that with this pattern, rchk now warns, and this is due to some upstream change, as there have been no new commits to rchk itself (this repo) recently.
For example, the [UP] warnings about prop_ and prop_set_ are brand new:
https://github.com/RConsortium/S7/actions/runs/11613632489/job/32339898283
Note just two days ago, these warnings were not present:
https://github.com/RConsortium/S7/actions/runs/11582187471/job/32244624698
Locally I could not reproduce them until I had run docker pull kalibera/rchk:latest.
Another false positive occurs with Rf_findVarInFrame
// `x` is protected
SEXP env = Rf_getAttrib(x, sym); // rchk thinks `env` needs protection
SEXP val = Rf_findVarInFrame(env, sym); // rchk thinks `val` needs protection
I don't think these are false positives. Are you 100% sure that these were not flagged before? A call to getAttrib() may return a fresh object that needs protection. So can findVarInFrame().
I just updated the opening comment with links to rchk CI runs from today and two days ago, showing these are brand new warnings.
A call to getAttrib() may return a fresh object that needs protection.
how?
So can findVarInFrame().
how?
(apologies for the direct question; I've recently had to become thoroughly acquainted with both those code paths in base R, and I'm fairly sure that, while they may allocate, they will not return an object that needs to be protected if the values supplied to those functions are protected. But, if I'm misunderstanding, I would very much appreciate being corrected).
First of all for a package author, this doesn't matter, a package author should follow Writing R Extensions and assume defensively that a call to any function returns a fresh pointer.
Rchk tries to be smarter than that and detect when actually functions return a fresh pointer, to some reasonable extent. So, in other words, a defensive programmer should protect in these two cases, either way. R implementation may change and may start allocating where it wasn't before.
Now, getAttrib() does allocate and return a fresh object in the current R implementation when converting from a pairlist, e.g. when converting names. So, simply, it may return a fresh object that needs protection. However, in some cases it won't. Rchk has some context-sensitive analysis and should be able to see that, with some symbols, the return value is not fresh. It may be that some of this context-sensitivity got broken (e.g. with new output from LLVM, or new R, in some specific case). A common problem I am seeing often is that rchk claims that "names" attribute needs protection (which it would, but only for some types of x) - again, rchk has some context sensitivity even for that, but it cannot handle all cases and it may be that some of that got broken and is now less sensitive than when I originally implemented it. Base R itself sometimes omits protection of values returned from getAttrib() when it is safe to do so with the current base R implementation - but there it is permissible, because we control the implementation.
Re findVarInFrame(), for instance with a user database or an active binding one may get a fresh object. This is something rchk can never be expected to see, one needs to defensively assume it is fresh, and this is also done in base R.
If you are sure that some of this worked differently before, it would be good to know for sure change of which component caused the change. If it was a specific commit to R or rchk, then I can have a look. If R and rchk (and LLVM) was the same, I can't help.
This is what I see locally, perhaps it's helpful
$ docker images kalibera/rchk
REPOSITORY TAG IMAGE ID CREATED SIZE
kalibera/rchk latest d6771960c26c 2 hours ago 1.61GB
kalibera/rchk <none> 5dc49c593bae 18 months ago 1.61GB
tomasz@mule:~
$ docker image inspect d6771960c26c 5dc49c593bae
[
{
"Id": "sha256:d6771960c26ca1c3374cb45bc55427bc4d7d6970b58e8699b47bb8bbbc2a7cae",
"RepoTags": [
"kalibera/rchk:latest"
],
"RepoDigests": [
"kalibera/rchk@sha256:5195084c92b495b46bc87731da7915049ec35362f58a1669debe337812969f83"
],
"Parent": "",
"Comment": "",
"Created": "2024-10-31T13:51:13.195959758Z",
"DockerVersion": "24.0.7",
"Author": "",
"Config": {
"Hostname": "",
"Domainname": "",
"User": "",
"AttachStdin": false,
"AttachStdout": false,
"AttachStderr": false,
"Tty": false,
"OpenStdin": false,
"StdinOnce": false,
"Env": [
"PATH=/rchk/scripts:/rchk/build/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
"TZ=Europe/Prague",
"WLLVM=/usr/local/bin",
"LLVM=/usr",
"RCHK=/rchk",
"CPICFLAGS=-fPIC",
"CFLAGS=-Wall -g -O0 -fPIC",
"CXXFLAGS=-Wall -g -O0 -fPIC -I/usr/include/libcxxabi",
"CC=wllvm",
"CXX=wllvm++ -stdlib=libc++",
"LLVM_COMPILER=clang",
"LD=clang++ -stdlib=libc++",
"R_LIBS=/rchk/packages/libs",
"R_LIBSONLY=/rchk/packages/libsonly",
"PKG_BUILD_DIR=/rchk/packages/build"
],
"Cmd": null,
"Image": "sha256:0d12ac82452cc315dedf71eb0ce6b9330d4ed179f53382e8648ef25a3bd19905",
"Volumes": null,
"WorkingDir": "/rchk",
"Entrypoint": [
"/bin/bash",
"/container.sh"
],
"OnBuild": null,
"Labels": {
"maintainer": "[email protected]",
"org.opencontainers.image.ref.name": "ubuntu",
"org.opencontainers.image.version": "22.04"
}
},
"Architecture": "amd64",
"Os": "linux",
"Size": 1610710354,
"GraphDriver": {
"Data": {
"LowerDir": "/var/lib/docker/overlay2/a13d4a7152254ce1e9d91b4efb98ae800275c900f44b036d0fd88d25d1d530aa/diff:/var/lib/docker/overlay2/08a0350cef88460dfa1beb07897ce97824593c733117f76791d25c2b63c7b2af/diff:/var/lib/docker/overlay2/512ccaa224c585f8683b2a42ae38b37557fe80c063e6ee44ea61d26602f8961b/diff:/var/lib/docker/overlay2/120caf9c7b5c4f7d70de4b2bf397d4b09016b407ca56c8544fe713873f17f6a8/diff:/var/lib/docker/overlay2/16e86f9f1751930d9d06f2e48d402cea9b53f34cb1b82baec1d1eeefd9db0ba9/diff:/var/lib/docker/overlay2/843ac406bdabc7eaf8afa8eb06eb4cb5f597cf9a311bbda0125700642e70a644/diff:/var/lib/docker/overlay2/1b7a919da11367b18702ca530f40c6879c1e956d04f6c2f6338f4d787852b963/diff",
"MergedDir": "/var/lib/docker/overlay2/e5a497d2615e83a2e0cb21bd9acd06d33b0c7388483085e37a8992288875dd4e/merged",
"UpperDir": "/var/lib/docker/overlay2/e5a497d2615e83a2e0cb21bd9acd06d33b0c7388483085e37a8992288875dd4e/diff",
"WorkDir": "/var/lib/docker/overlay2/e5a497d2615e83a2e0cb21bd9acd06d33b0c7388483085e37a8992288875dd4e/work"
},
"Name": "overlay2"
},
"RootFS": {
"Type": "layers",
"Layers": [
"sha256:0b9c994b0484c0bc61f9de7c28a58745a504704254c5e8ed12349ebee3393a66",
"sha256:1974502a7ca99e05c7ac62fc899d057376672f472b845ed84be99b038a6aab67",
"sha256:f193d7c82b1105c0def093dd08e673f267fb8d729b6482b52f9eaf82309dbfdf",
"sha256:c0e41c3b78b0052a5e9123c22576e624d70adaff00ba87b500dd738e776a09c4",
"sha256:31fd170f6e21eba8f0b57dc106ddc3f3f47cb8464bf03b5bd9d1166274a7da00",
"sha256:5cacf4ef100d6d51859a41c9025b5916db98ddc01178670d1c855e80bb74345f",
"sha256:ab135d76a551e1329b432899218036a37f902cb707ab955979b29cc2d235f195",
"sha256:f5cb6e6651751664fcfa1947f2f1616ebf6f8f8ba849955031f267cfbecefba6"
]
},
"Metadata": {
"LastTagTime": "0001-01-01T00:00:00Z"
}
},
{
"Id": "sha256:5dc49c593bae1fb80b78cad4143186d4d6d5a5e81f70413e8afa2d7c8d2e3e6c",
"RepoTags": [],
"RepoDigests": [
"kalibera/rchk@sha256:69c11432309619a9e678ce1462e7021239020fc5d0578eab44574d386b3791aa"
],
"Parent": "",
"Comment": "",
"Created": "2023-04-13T13:40:43.493661229Z",
"DockerVersion": "20.10.21",
"Author": "",
"Config": {
"Hostname": "",
"Domainname": "",
"User": "",
"AttachStdin": false,
"AttachStdout": false,
"AttachStderr": false,
"Tty": false,
"OpenStdin": false,
"StdinOnce": false,
"Env": [
"PATH=/rchk/scripts:/rchk/build/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
"TZ=Europe/Prague",
"WLLVM=/usr/local/bin",
"LLVM=/usr",
"RCHK=/rchk",
"CPICFLAGS=-fPIC",
"CFLAGS=-Wall -g -O0 -fPIC",
"CXXFLAGS=-Wall -g -O0 -fPIC -I/usr/include/libcxxabi",
"CC=wllvm",
"CXX=wllvm++ -stdlib=libc++",
"LLVM_COMPILER=clang",
"LD=clang++ -stdlib=libc++",
"R_LIBS=/rchk/packages/libs",
"R_LIBSONLY=/rchk/packages/libsonly",
"PKG_BUILD_DIR=/rchk/packages/build"
],
"Cmd": null,
"Image": "sha256:7a1fd8650a383bf37bb22940be430e59d05a93ccf0e8da9298c8385ec5187152",
"Volumes": null,
"WorkingDir": "/rchk",
"Entrypoint": [
"/bin/bash",
"/container.sh"
],
"OnBuild": null,
"Labels": {
"maintainer": "[email protected]"
}
},
"Architecture": "amd64",
"Os": "linux",
"Size": 1607693939,
"GraphDriver": {
"Data": {
"LowerDir": "/var/lib/docker/overlay2/72abad608e7e597a31807328d89411e99ca70e077bdfddcf06f145b45c617f9b/diff:/var/lib/docker/overlay2/fede2d334bf4fb3e0a7cc98ff67df0de2d4ed5d62021a520aa085c703732bf06/diff:/var/lib/docker/overlay2/3d2fa37e7c729f2550e8e59acc2ab578fb1f4e4c16b40c7bfe09f4103df4e78c/diff:/var/lib/docker/overlay2/b3a8a91d5e983a0bd8c004607aa748bc395fc0729944927be1ad445d26a14224/diff:/var/lib/docker/overlay2/68359e70b619a0c109c0936bbfeddde0739f61f80bc857875758c307f6779492/diff:/var/lib/docker/overlay2/a1afff1f0d3f5026809ea2fb8e4e8310b8980122aef6584300a5c57567090269/diff:/var/lib/docker/overlay2/1410cd26fc4f32343e1defbfb55e22d75022b1b5f3e0f91d8c869b4ba283d17f/diff",
"MergedDir": "/var/lib/docker/overlay2/66c292f4178345bffed6c5f6cf9ac1fdbd7507edc83847850f9a7aecc034bf91/merged",
"UpperDir": "/var/lib/docker/overlay2/66c292f4178345bffed6c5f6cf9ac1fdbd7507edc83847850f9a7aecc034bf91/diff",
"WorkDir": "/var/lib/docker/overlay2/66c292f4178345bffed6c5f6cf9ac1fdbd7507edc83847850f9a7aecc034bf91/work"
},
"Name": "overlay2"
},
"RootFS": {
"Type": "layers",
"Layers": [
"sha256:17f623af01e277c5ffe6779af8164907de02d9af7a0e161662fc735dd64f117b",
"sha256:2c34a3ad12095077cee5f38f0725a71c371cd1f0e8bbba4ac7c8ac50e16a10d7",
"sha256:bc06ef4fa32e699990e9288479fe8f7b7e77de16acfb92e0a463fab497aff333",
"sha256:fcf1f2f2126f79c6fdbff82a06c713187f88ff08c357c398d38ec4a3def0aecd",
"sha256:c3b4285a54d83aa766d1916579fa7a0c6df2c46378029bdae3ef462c00b84d7f",
"sha256:8ed21cc1c62b47b413964320b8b5daa38a28023a568585af6ea049b8c911130b",
"sha256:217d04b8e1101cf7366de1459821a2e4ce408e7fc759e35fdf342df4bfc9f90a",
"sha256:da14a4000b396dcf8c68435b23cbcca79dbcb675f43201a198e07527461a5923"
]
},
"Metadata": {
"LastTagTime": "0001-01-01T00:00:00Z"
}
}
]
tomasz@mule:~
$ docker image history kalibera/rchk:latest
IMAGE CREATED CREATED BY SIZE COMMENT
d6771960c26c 2 hours ago /bin/sh -c #(nop) ENTRYPOINT ["/bin/bash" "… 0B
<missing> 2 hours ago /bin/sh -c #(nop) ADD file:0f72386d5185d8925… 3.35kB
<missing> 2 hours ago /bin/sh -c #(nop) WORKDIR /rchk 0B
<missing> 2 hours ago /bin/sh -c #(nop) ENV PATH=/rchk/scripts:/r… 0B
<missing> 2 hours ago |1 DEBIAN_FRONTEND=noninteractive /bin/sh -c… 4.69MB
<missing> 2 hours ago /bin/sh -c #(nop) ENV R_LIBSONLY=/rchk/pack… 0B
<missing> 2 hours ago |1 DEBIAN_FRONTEND=noninteractive /bin/sh -c… 106MB
<missing> 2 hours ago /bin/sh -c #(nop) ENV CPICFLAGS=-fPIC CFLAG… 0B
<missing> 2 hours ago |1 DEBIAN_FRONTEND=noninteractive /bin/sh -c… 26.6MB
<missing> 2 hours ago /bin/sh -c #(nop) ENV WLLVM=/usr/local/bin … 0B
<missing> 2 hours ago |1 DEBIAN_FRONTEND=noninteractive /bin/sh -c… 13.1MB
<missing> 2 hours ago |1 DEBIAN_FRONTEND=noninteractive /bin/sh -c… 1.38GB
<missing> 2 hours ago /bin/sh -c #(nop) ENV TZ=Europe/Prague 0B
<missing> 2 hours ago |1 DEBIAN_FRONTEND=noninteractive /bin/sh -c… 4.98MB
<missing> 2 hours ago /bin/sh -c #(nop) ARG DEBIAN_FRONTEND=nonin… 0B
<missing> 2 hours ago /bin/sh -c #(nop) LABEL maintainer=tomas.ka… 0B
<missing> 5 months ago /bin/sh -c #(nop) CMD ["/bin/bash"] 0B
<missing> 5 months ago /bin/sh -c #(nop) ADD file:89847d76d242dea90… 77.9MB
<missing> 5 months ago /bin/sh -c #(nop) LABEL org.opencontainers.… 0B
<missing> 5 months ago /bin/sh -c #(nop) LABEL org.opencontainers.… 0B
<missing> 5 months ago /bin/sh -c #(nop) ARG LAUNCHPAD_BUILD_ARCH 0B
<missing> 5 months ago /bin/sh -c #(nop) ARG RELEASE 0B
A pattern like this
extern SEXP sym_foo; // e.g., Rf_install("foo");
func_(SEXP x) {
SEXP attr_foo = Rf_getAttrib(x, sym_foo);
has the additional difficulty that sym_foo can be anything (unless the tool was trying to track the content of global variables, but that is not realistic). Also, the tool cannot know the type of x when it analyzes this function. So, flagging attr_foo is by design and a good thing.
EDIT: however, when sym_foo is assigned only once, and assigned a value coming from call to Rf_install using a constant string literal, then rchk understands this pattern (it does whole program analysis) and takes into account the symbol name - and it is also the case of how this pattern is used in S7. There was a regression in rchk which caused temporarily this context-sensitivity being lost (more below).
Thanks for the examples showing cases where the return values from getAttrib() and findVarInFrame() might need protection.
For S7, we know these edge cases aren't possible: getAttrib() will never receive a pairlist, and the argument to findVarInFrame() is always a plain environment (one that we create and manage). Additionally, the main reason for caching sym_foo is to avoid repeated install() calls on hot code paths. For example, with property access in S7's @ operator, we can avoid any allocations on the most common path. Similarly for method dispatch, we can dispatch with very few PROTECT calls.
It would be counterproductive to add PROTECT calls to these heavily-exercised code paths solely to satisfy rchk.
Ok, so it seems to me there is no related issue in rchk.
Let me remind again to anyone who may come across this, rchk is by design and in this case more permissive that what Writing R Extensions requires (see 5.9.1 Handling the effects of garbage collection), so choosing to ignore these reports goes against that document.
Looking at your log, the new docker image uses a slightly newer version of rchk (changes there shouldn't influence these reports), but a much newer version of R than the older docker image. I've checked the new rchk with R-4.4.2, but not with R-devel - it may be there is some problem causing loss of context-sensitivity in the getAttrib() call. I will have a closer look.
It turns out this was a regression. The call to Rf_install(?,S:properties) in rchk terminology, so when the symbol name is "properties" and is given via a symbol variable, possibly a global where such symbol is cached in a straightforward way, does not allocate nor return a fresh object and rchk has been designed to get this from the code. It just temporarily didn't work due to a regression from last week. This depends on the symbol name, "properties" such as in S7 is fine (and "foo" as in the example would be as well), but some other attribute names are not. As I wrote before, defensive programs shouldn't rely on this, but rchk was designed to be less strict (and more tight to the present implementation of R at hand). Base R itself uses code that depends on that getting some attributes doesn't allocate and some attributes are not fresh objects, but again, this could in principle change without notice in base R. This rchk regression has been fixed now, so this should be no longer flagged in S7, and probably you would get the same reports as before the regression.
The remaining rchk reports I see from the current CRAN version of S7 seem to be related to variable lookup. As discussed here already, variables gotten from environments may be fresh (e.g. with user databases or active bindings) in principle, and again, this could in principle change and allocation could happen even in other cases. Even base R itself always protects in such cases and I would definitely do that in any code. Once rchk check service is back for CRAN, S7 will get flagged because of these issues on CRAN.
Thank you for following up! I reran with the latest rchk and now there are no issues reported 🎉. https://github.com/RConsortium/S7/actions/runs/11687328551/job/32545173853