TileDB-Py
Support for openstack s3?
I was wondering whether the TileDB S3 backend should also work with OpenStack Swift's S3-compatible API. I had it working for a while, but it seems to have stopped working for me and I can't figure out whether it's due to a change in TileDB or something else.
Hi @mangecoeur.
We have not made any changes to the S3 backend recently, so this is unlikely to be due to any changes in TileDB. Can you give us some more info about the issue you're seeing, such as any error messages, and when you were last able to get it to work?
Thanks.
Adding a note that we do test against the MinIO S3 implementation regularly, and we have tested against other S3-compatible endpoints in the past (or are aware of active users). But please let us know what you are seeing and we'll take a look.
So the error I'm getting is:
```
Failed to read S3 object s3://tessa/clusters_distances/distances.tiledb/__meta/__1660148217721_1660148217721_e642f1cfaaad472bb074a15d6c1ef99a.vac
Exception:
Error message:
```
The exception and error message are blank. If I use VFS I can correctly list the files. The dataset was created locally and then uploaded to S3; I don't know if that might be the cause. The dataset works fine when accessed locally.
Deleting the .vac file manually from the __meta folder seems to have fixed S3 reading, although I'm now getting intermittent segfaults, but seemingly only with tiledb installed via pip on macOS, while the conda-forge version is fine.
Thanks for the update! Just for my clarification, does this sequence of events seem correct? At first you were able to use OpenStack S3 for TileDB, where at one point you wrote metadata to the array. Then you consolidated your array metadata with `"sm.consolidation.mode": "array_meta"` -- I'm assuming this because you have `*.vac` files in `__meta`. Was the array metadata vacuumed afterwards? After consolidation or vacuuming, you started getting the `Failed to read S3 object` error?
In regards to the intermittent segfaults, can you provide a reproducible example?
Yes, that was the sequence. I ran both consolidation and vacuum, but after vacuuming the .vac files still existed. Deleting them manually seemed to fix S3.
I'm still getting segfaults, but I can't figure out what conditions are causing them. I literally just ran the same test script 5 times and the 3rd attempt worked; all the others segfaulted. This seems to be specific to the PyPI version of tiledb; on the same machine using a conda environment I don't get this problem.
The segfault happens in code that extracts multiple square slices from a 2D sparse distance matrix:

```python
with tiledb.SparseArray(self._distances_array_file, 'r', ctx=self.tiledb_ctx) as distances:
    for s in slices:
        data = distances[s, s]
```
The segfault message:

```
sysmalloc: Assertion `(old_top == initial_top (av) && old_size == 0) || ((unsigned long) (old_size) >= MINSIZE && prev_inuse (old_top) && ((unsigned long) old_end & (pagesize - 1)) == 0)'
```
Hi @mangecoeur, it looks like that error message comes from deep inside libc in some malloc-related routine.
- can you confirm the same code succeeds against an identical array stored locally?
- are you able to share an array which reproduces this issue?
Thanks
So I found that I have the same issue with a local array, and it seems to relate to the PyPI version of tiledb. Running the exact same setup, I get a segfault when using a python venv (created using hatch) but it ~~works fine when using the version installed with conda~~ actually on a new conda env I have the problem again.
It might be something to do with how dependencies are resolved by pip within hatch that is causing some incompatible combination of packages to be installed.
I can't make the array I'm working on public, and it's a bit big to share (about 14 GB), but if you want I could dump some of the array metadata if you let me know what you need.
Did some more exploring; managed to get this error message from gdb:

```
0x00007ffef57b3a5e in std::_Function_handler<tiledb::common::Status (unsigned long, unsigned long), tiledb::sm::parallel_for<tiledb::sm::SparseGlobalOrderReader<unsigned char>::compute_hilbert_values(std::vector<tiledb::sm::ResultTile*, std::allocator<tiledb::sm::ResultTile*> >&)::{lambda(unsigned long)#1}>(tiledb::common::ThreadPool*, unsigned long, unsigned long, tiledb::sm::SparseGlobalOrderReader<unsigned char>::compute_hilbert_values(std::vector<tiledb::sm::ResultTile*, std::allocator<tiledb::sm::ResultTile*> >&)::{lambda(unsigned long)#1} const&)::{lambda(unsigned long, unsigned long)#1}>::_M_invoke(std::_Any_data const&, unsigned long&&, std::_Any_data const) ()
   from ~/.conda/envs/tessa-1/lib/python3.10/site-packages/tiledb/../../../libtiledb.so.2.11
```
Happy to take a further look if there's any way to reproduce. We've fixed a number of issues in this code path, please let us know if this is not resolved in newer TileDB versions.