pyodbc Retrieve a query as a NumPy structured array

In this PR, we add a .fetchdictarray() method to the pyodbc.Cursor object. This makes numpy an optional build and runtime dependency. The extension src/npcontainer.cpp is compiled only when numpy is available at build time. In that case, WITH_NUMPY is also defined so that src/cursor.cpp can add the method and src/pyodbcmodule.cpp can initialize numpy on import.
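Because the method is only compiled in when numpy is present at build time, calling code may want to check for it before relying on it. The following is a minimal sketch of such a check, not part of the PR; the helper name `has_fetchdictarray` and the two stand-in classes are hypothetical and exist only so the example runs without a pyodbc build.

```python
def has_fetchdictarray(cursor_type):
    """Return True if the given cursor type exposes a callable fetchdictarray.

    Hypothetical helper: it assumes a build without numpy simply lacks the
    attribute, which is what the description above implies.
    """
    return callable(getattr(cursor_type, "fetchdictarray", None))


# Stand-ins for a pyodbc.Cursor built with and without numpy support.
class CursorWithNumpy:
    def fetchdictarray(self, size=-1, return_nulls=False, null_suffix="_isnull"):
        raise NotImplementedError("stand-in only")


class CursorWithoutNumpy:
    pass


with_numpy = has_fetchdictarray(CursorWithNumpy)
without_numpy = has_fetchdictarray(CursorWithoutNumpy)
```

With a real build, the same check would presumably be `has_fetchdictarray(pyodbc.Cursor)`, with a fallback to `fetchmany()` when it returns False.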

Here is the docstring of the .fetchdictarray() method:

fetchdictarray(size=-1, return_nulls=False, null_suffix='_isnull')
                               --> a dictionary of column arrays.

Fetch as many rows as specified by size into a dictionary of NumPy
ndarrays (dictarray). The dictionary will contain a key for each column,
with its value being a NumPy ndarray holding its value for the fetched
rows. Optionally, extra columns will be added to signal nulls on
nullable columns.

Parameters
----------
size : int, optional
    The number of rows to fetch. Use -1 (the default) to fetch all
    remaining rows.
return_nulls : boolean, optional
    If True, information about null values will be included by adding
    a boolean array whose key is the column name concatenated with
    null_suffix.
null_suffix : string, optional
    A string used as a suffix when building the key for null values.
    Only used if return_nulls is True.

Returns
-------
out: dict
    A dictionary mapping column names to an ndarray holding its values
    for the fetched rows. The dictionary will use the column name as
    key for the ndarray containing values associated to that column.
    Optionally, null information for nullable columns will be provided
    by adding boolean arrays keyed by the nullable column's name
    concatenated with null_suffix.

Remarks
-------
Similar to fetchmany(size), but returning a dictionary of NumPy ndarrays
for the results instead of a Python list of tuples of objects, reducing
memory footprint as well as improving performance.
fetchdictarray is overall more efficient than fetchsarray.
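To make the returned layout concrete, here is a pure-Python stand-in that builds the same kind of dictionary from a list of row tuples. It is a sketch of the shape described in the docstring, not the C++ implementation from this PR. The function name `rows_to_dictarray` and the `fill` parameter are hypothetical, and the sketch adds a null column for every column because it has no nullability metadata, whereas the docstring says only nullable columns get one. What value the real method stores in a data array for a NULL is not stated in the docstring, so `fill` is purely an assumption.

```python
try:
    import numpy as np  # optional here, as in the PR
except ImportError:
    np = None


def rows_to_dictarray(names, rows, return_nulls=False,
                      null_suffix="_isnull", fill=0):
    """Build a {column name: column values} dict from row tuples.

    Hypothetical stand-in for the layout fetchdictarray describes.
    """
    out = {}
    for i, name in enumerate(names):
        values = [row[i] for row in rows]
        isnull = [v is None for v in values]
        # Assumption: NULLs are replaced by `fill` in the data column.
        filled = [fill if v is None else v for v in values]
        out[name] = np.array(filled) if np is not None else filled
        if return_nulls:
            key = name + null_suffix
            out[key] = np.array(isnull, dtype=bool) if np is not None else isnull
    return out


names = ["id", "score"]
rows = [(1, 0.5), (2, None), (3, 1.5)]
result = rows_to_dictarray(names, rows, return_nulls=True)
```

The keys of `result` are `id`, `id_isnull`, `score`, and `score_isnull`, which mirrors the naming rule given for `return_nulls` and `null_suffix`.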

Note: The code is based on https://github.com/ContinuumIO/TextAdapter (which was released in 2017 by Anaconda, Inc. under the BSD license). The original authors of the numpy container are Francesc Alted and Oscar Villellas.

Jan 27 '23 19:01 ilanschnell

pinging the PR to see if there is anything I can help with to merge this feature

Apr 05 '23 20:04 ndmlny-qs

@mkleehammer this has been rebased against the py3 branch

Aug 25 '23 17:08 ndmlny-qs

@mkleehammer you will need to close out this PR so I can make a different one against the py3 branch. I'll resolve any tests that fail in a new PR against that branch.

Aug 25 '23 17:08 ndmlny-qs

@mkleehammer this PR can be closed in favor of #1270 where I am still working on updating tests

Sep 15 '23 16:09 ndmlny-qs

@ilanschnell You can close this PR, see #1270 for a comment about why this can be closed.

@mkleehammer if Ilan does not close out this PR, you can close it out as you clean out stale PRs

Jun 05 '24 14:06 ndmlny-qs
