gdal icon indicating copy to clipboard operation
gdal copied to clipboard

[WIP] GDAL_GCP C++'ification: appropriate for 3.9 ?

Open rouault opened this issue 1 year ago • 3 comments

Up to now, GDAL_GCP was only a C-only structure. The typical/recommended pattern to create GCPs from C is:

GDAL_GCP* gcps = (GDAL_GCP*)CPLMalloc( count * sizeof(GDAL_GCP) );
GDALInitGCPs( count, gcps );  // zero-initialize  pixel,line,x,y,z double members of each gcps[], and affects a CPLStrdup("") string to pszId and pszInfo members of each gcps[]
CPLFree(gcps[0].pszId); // free the initial empty string
gcps[0].pszId = CPLStrdup("foo"); // for example
GDALDeinitGCPs( count, gcps ); // free memory of gcps[i].pszId and gcps[i].pszInfo
CPLFree(gcps); // free the arra itself

This is "fine" for a C user, but quite tedious in a C++ context. Big users of creating GCPs are GDAL drivers themselves.

So this pull request makes GDAL_GCP is a C++ struct, when gdal.h included from C++ code, with a default allocator, a destructor and move constructor / assignment operators.

This mostly works except a few details:

  • If a GDAL_GCP is allocated as a C++ object ( ie something like GDAL_GCP gcp, or GDAL_GCP gcps[5] or with std::vector<GDAL_GCP>, or anything that is not CPLMalloc()/VSIMalloc()), and then passed to GDALInitGCPs(), a memory leak would occur because pszId will be initialized twice. Here the fix consists in the GDAL_GCP constructor to assign it to the return of a VSIGetStaticEmptyString() function, which returns a special static instance of an empty string, that VSIFree() explicitly recognizes and does not free. Hence no leak occurs.

  • For a GDAL_GCP created as a C++ object, one could run into issues with double-free, when explicitly calling GDALDeinitGCPs(), and then when the GDAL_GCP destructor automatically runs. This was easily fixed by making GDALDeinitGCPs() to nullify pszId and pszInfo members after CPLFree()'ing them.

All above are just implementation details. More annoyingly, here are the few backward incompatibilities I found with a few instances of C++ code:

  • In gdalpamdataset.cpp, we kind of abused the recommended workflow mentionned at the beginning, by assigning a static empty string to pszId and pszInfo on a GDAL_GCP[] object, and not use GDALInitGCPs() / GDALDeinitGCPs(). The issue now is that when ~GDAL_GCP() is called it will try to CPLFree() that. No other solution here than removing that initialization, that is no longer needed

  • In GDALLoadTabFile() and GDALLoadOziMapFile(), a GDAL_GCP[] array was allocated, and then memcpy()'d onto a dynamically array allocated with CPLMalloc( count * sizeof(GDAL_GCP) ) returned to the user. This code has to be changed to remove the memcpy() part and do cpl_malloc_gcps[i] = std::move(asGCPs[i])

  • The Sentinel2 driver used memset(gcps, 0, sizeof(gcps)) to initialize a GDAL_GCP[] array. Some compilers complained about memset()'ing a non plain-old-object instance. The fix is just to remove that now useless memset().

Except those details, all other drivers compile and pass their regression tests, which makes me believe, that while being backward incompatible, this is still reasonable. The compiler should complain in most problematic situations where the code has to be changed (this was for example for the memcpy() in GDALLoadTabile/GDALLoadOziMapFile). The only undetected change at compile time was the "clever trick" of gdalpamdataset.cpp which was only detected at runtime. Hopefully such a pattern should be exceptional. In the worse case, users can define the new GDAL_SUPPRESS_GCP_CPLUSPLUS macro before including gdal.h, which will make their code see the POD C structure.

And to re-iterate, this has no impact on C-only users

The first commit introduces the minimum changes to get things working, and the second one modernizes the HFA driver to use a std::vector<GDAL_GCP> to show the benefits of the proposed change (less magic numbers, automatic construction/destruction) . More drivers could be modernized in a similar way

rouault avatar Feb 05 '24 00:02 rouault

This <const_cast>(special empty string) dance doesn't look healthy...

I assume VSIFree is used much more often than GDAL_GCP. Thus the extra test is quite expensive. Would it help to make the special empty string a non-static private member of GDAL_GCP instead of a global static? A little bit more memory for very few cases instead of a little bit more runtime for many cases.

Could pszId and pszInfo be changed to const char*? This would fit well to the signatures of SetId() and SetInfo().

dg0yt avatar Feb 05 '24 06:02 dg0yt

Thus the extra test is quite expensive

I haven't measured the perf impact, but the only significant impact I would anticipate would only be in loops where you would do tons of malloc/free patterns, and that's obviously something that should be avoid be avoided upfront. I'd be very surprised that this shows up in realistic usage scenarios.

Would it help to make the special empty string a non-static private member of GDAL_GCP instead of a global static?

This special static empty string that VSIFree() doesn't free is necessary if we want backward compatibility for scenarios like: GDAL_GCP x; GDALInitGCPs(1, &x);. Without the special empty string trick, the GDAL_GCP constructor would have to dynamically allocate one empty string, and GDALInitGCPs() would override it with another one, without freeing (since GDALInitGCPs() is aimed at being used on uninitialized data in its nominal usage), thus causing a memory leak. We could fix all GDAL code to avoid doing GDALInitGCPs() on GDAL_GCP objects C++ constructed, but I've no idea of the impact on outside code.

Could pszId and pszInfo be changed to const char*?

Definitely not for the C version of the structure. And for the C++ one, that would break all code that currently sets them directly (also hard to quantify how frequent this is outside of the GDAL codebase)

rouault avatar Feb 05 '24 14:02 rouault

I see. I still feel a little bit worried by the mix of C, C++, and const_cast.

dg0yt avatar Feb 06 '24 06:02 dg0yt

Closing, as superseded per https://github.com/OSGeo/gdal/pull/9285

rouault avatar Feb 22 '24 21:02 rouault