Cryptic error messages with too long filenames on Windows
I've run into strange error messages with TileDB (2.22) on Windows, which I eventually traced to overly long filenames.
Let's consider the following test.py (example given with tiledb-py for ease, but the original issue comes from the use of the TileDB C++ API from the GDAL TileDB driver):
import tiledb
import numpy as np
import shutil
dim = tiledb.Dim(name="dim", domain=(1, 4), tile=2, dtype=np.int32)
dom = tiledb.Domain(dim)
attr = tiledb.Attr(name="attr", dtype=np.int32)
schema = tiledb.ArraySchema(domain=dom, sparse=False, attrs=[attr])
filename = "z:/" + ('X' * 190)
try:
    tiledb.Array.create(filename, schema)
finally:
    shutil.rmtree(filename)
When run, it throws:
Traceback (most recent call last):
  File "C:\dev\gdal space\build_conda\test.py", line 11, in <module>
    tiledb.Array.create(filename, schema)
  File "tiledb\libtiledb.pyx", line 976, in tiledb.libtiledb.Array.create
  File "tiledb\libtiledb.pyx", line 342, in tiledb.libtiledb._raise_ctx_err
  File "tiledb\libtiledb.pyx", line 327, in tiledb.libtiledb._raise_tiledb_error
tiledb.cc.TileDBError: [TileDB::IO] Error: Cannot write to file 'C:\dev\gdal space\build_conda'; File opening error CreateFile GetLastError 5 (0x00000005): Access denied
Note that it tries to write to the current directory, which has nothing to do with the array I'm creating ("z:/XXXXXXXXXXXXXXX....XXXX")
I wasn't creating overly long filenames just to be difficult. The original issue actually came from using pytest's tmp_path fixture in the GDAL regression test suite, with a test like the following (you can't run it, since it requires a development version of the GDAL TileDB driver; it is only here to show the context in which the issue was triggered):
def test_tiledb_write_overviews(tmp_path, use_group):
    dsname = str(tmp_path / "test_tiledb_write_overviews.tiledb")
    src_ds = gdal.Open("data/rgbsmall.tif")
    ds = gdal.GetDriverByName("TileDB").CreateCopy(dsname, src_ds)
    ds.BuildOverviews("NEAR", [2])
Behind the scenes this creates a "test_tiledb_write_overviews_1" TileDB array as a subdirectory of str(tmp_path / "test_tiledb_write_overviews.tiledb" + ".ovr"). And it turns out that tmp_path generates quite long temporary directory names (here tmp_path evaluates to C:\Users\evenr\AppData\Local\Temp\pytest-of-evenr\pytest-95\test_tiledb_write_overviews_Fa0\)
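To get a rough sense of how much of the classic Windows path budget that layout already consumes, here is a small sketch. It assumes the limiting factor is the 260-character MAX_PATH limit (which counts the terminating NUL), and it only measures the overview array directory itself, not whatever files TileDB then creates inside it. It uses ntpath so that it behaves the same on any platform.

```python
import ntpath

# Classic Win32 MAX_PATH; the terminating NUL is included in the count,
# so at most 259 characters are usable.
WINDOWS_MAX_PATH = 260

# Value of tmp_path quoted above, without the trailing backslash.
tmp_path = r"C:\Users\evenr\AppData\Local\Temp\pytest-of-evenr\pytest-95\test_tiledb_write_overviews_Fa0"

# Dataset name used by the test, and the overview array created next to it.
dsname = ntpath.join(tmp_path, "test_tiledb_write_overviews.tiledb")
overview_array = ntpath.join(dsname + ".ovr", "test_tiledb_write_overviews_1")

used = len(overview_array)
remaining = (WINDOWS_MAX_PATH - 1) - used
print(f"{used} characters used, {remaining} left for files inside the array")
```

Whatever is left has to accommodate every directory and file name that TileDB generates below the array root.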
The error message I got here was slightly different and scarier, as it looked like an attempt at deleting my current working directory:
E RuntimeError: TileDB: TileDBRasterBand::IRasterIO() failed: TileDB internal: [OrderedWriter::dowork] ([TileDB::IO] Error: Failed to delete file 'C:\dev\gdal space\autotest\gdrivers' DeleteFile GetLastError 5 (0x00000005): Access denied
E )
This exception is thrown when running tiledb::Query::submit()
It would be good if TileDB could anticipate, at array creation, that some of its internal filenames are going to exceed the Windows filename length limit, and throw an explicit exception when that happens (or use \\?\ prefixing and an appropriate Windows API so that long filenames work).
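The kind of up-front check I have in mind could look like the sketch below. This is not TileDB code: check_array_path, PathTooLongError and the ASSUMED_INTERNAL_RESERVE value are all hypothetical names and numbers chosen for illustration, and the real reserve would have to be derived from the longest internal filename TileDB can actually generate.

```python
import ntpath

# Classic Win32 MAX_PATH; the terminating NUL is included in the count.
WINDOWS_MAX_PATH = 260

# Hypothetical allowance for the names created inside the array directory.
# This is an assumption for illustration, not a documented TileDB figure.
ASSUMED_INTERNAL_RESERVE = 100


class PathTooLongError(ValueError):
    """Raised when an array path leaves too little room for internal files."""


def check_array_path(path, reserve=ASSUMED_INTERNAL_RESERVE):
    """Return the normalized path, or raise if it is too long to be safe."""
    # Extended-length paths are not subject to MAX_PATH; pass them through.
    if path.startswith("\\\\?\\"):
        return path
    normalized = ntpath.normpath(path)
    budget = (WINDOWS_MAX_PATH - 1) - reserve
    if len(normalized) > budget:
        raise PathTooLongError(
            f"array path is {len(normalized)} characters long, but only "
            f"{budget} are available once {reserve} characters are reserved "
            f"for internal filenames: {normalized!r}"
        )
    return normalized
```

With these assumed numbers, the "z:/" + ('X' * 190) path from the reproducer would be rejected at creation time with a message that names the real problem, instead of a misleading "Access denied" on an unrelated directory.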
@teo-tsirpanis Can you take a look? I know you've looked into long filenames before.
@rouault can you try changing the path to filename = "z:\\\\" + ('X' * 190)? I suspect that our URI handling logic thought the filename was a relative path, and something went wrong in the meantime.
A prerequisite to enable support for long paths is to unify the path handling logic and remove the Windows-specific code, using something like https://uriparser.github.io/ for all platforms. It's not very hard to do but there will inevitably be behavior breaking changes in edge cases (see https://github.com/TileDB-Inc/TileDB/pull/4921#discussion_r1586290587 for a recently noticed difference between platforms).
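To illustrate why consistent path handling matters here: Windows does not normalize paths that carry the \\?\ prefix, so forward slashes, "." and ".." segments, and relative paths all have to be resolved by the caller before the prefix is added. A minimal sketch of that conversion follows, written in Python for readability; to_extended_length is a hypothetical helper, not an existing TileDB or tiledb-py function.

```python
import ntpath

EXTENDED_PREFIX = "\\\\?\\"          # \\?\ for drive-letter paths
UNC_EXTENDED_PREFIX = "\\\\?\\UNC\\"  # \\?\UNC\ for \\server\share paths


def to_extended_length(path):
    """Convert an absolute Windows path to its extended-length form."""
    if path.startswith(EXTENDED_PREFIX):
        return path  # already in extended-length form

    # Windows skips normalization for \\?\ paths, so do it here:
    # forward slashes become backslashes, "." and ".." are collapsed.
    normalized = ntpath.normpath(path)

    # Require a drive (or UNC share) followed by a rooted path;
    # relative and drive-relative paths cannot be prefixed.
    drive, rest = ntpath.splitdrive(normalized)
    if not drive or not rest.startswith("\\"):
        raise ValueError(f"not an absolute Windows path: {path!r}")

    if normalized.startswith("\\\\"):
        # \\server\share\dir -> \\?\UNC\server\share\dir
        return UNC_EXTENDED_PREFIX + normalized[2:]
    return EXTENDED_PREFIX + normalized
```

For the reproducer, to_extended_length("z:/" + ('X' * 190)) yields a \\?\z:\XXX... path. Whether a path like "z:/..." is even recognized as absolute is exactly the kind of decision that needs to be made in one place.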
can you try changing the path to
filename = "z:\\\\" + ('X' * 190)?
I've tried that (and also with \\), and I get the same error message.
@rouault We are going to figure out/implement the right fix ASAP. Thanks for reporting the issue!