cuspatial
cuspatial copied to clipboard
[DOC] Documented method and code location mismatches
During #523 , I noticed many methods in cuSpatial are documented in one place, but the actual definition is in another. For example:
Location of incorrect documentation
polyline_bounding_boxes is under "spatial indexing" category.
But in code the methods is in gis.py: https://github.com/rapidsai/cuspatial/blob/ba0aed1e75ac37b7ba9723c68234438887f49101/python/cuspatial/cuspatial/core/gis.py#L308
directed_hausdorff_distance is under trajectories category, but was exposed in gis.py in code:
https://github.com/rapidsai/cuspatial/blob/ba0aed1e75ac37b7ba9723c68234438887f49101/python/cuspatial/cuspatial/core/gis.py#L29
Describe the problems or issues found in the documentation
A blind fix to the issue is that the methods should be in the file named after the same category in documentation. However a priori to the question is that we know the categories actually makes sense. Currently we have spatial indexing, gis and trajectories to sort computing related APIs, internals to document storage foramts, geopandas computatbility and IO to document interops.
The computing related APIs are slightly confusing since gis and trajectories are two slightly overlapped concepts. In the bigger picture the category of computing methods should relate to the problem we try to solve with cuSpatial as a package. I'd like to get some opinions from the community.
Besides the above, geoArrow package currently is placed within cuspatial.geometry module which is eccentric, in two ways. First the name implies geometric operations, but a data structure was defined instead. Second, a well defined data type system is the center piece to a library. Though geoarrow is largely a work in progress definition and cuSpatial will remain it's functional interface for foreseeable future, we should experiment using geoArrow/geodataframe to organize our input, instead of using legacy SoA array format. This is an early thought and I haven't done any verification work yet. The biggest obstacle I can imagine is that the legacy APIs may define the buffer layout differently to geoArrow (MultiLineStrings being an example). Standardizing these inputs may require significant amount of rework of existing APIs.
This issue has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be labeled inactive-90d if there is no activity in the next 60 days.
I think we should restructure the library packages as the requirements for cuSpatial are better defined. If you think that it is confusing enough to reduce further adoption, combining trajectories and gis in the simplest way forward.
Besides the above, geoArrow package currently is placed within
cuspatial.geometrymodule which is eccentric, in two ways. First the name implies geometric operations, but a data structure was defined instead. Second, a well defined data type system is the center piece to a library. Though geoarrow is largely a work in progress definition and cuSpatial will remain it's functional interface for foreseeable future, we should experiment using geoArrow/geodataframe to organize our input, instead of using legacy SoA array format. This is an early thought and I haven't done any verification work yet. The biggest obstacle I can imagine is that the legacy APIs may define the buffer layout differently to geoArrow (MultiLineStrings being an example). Standardizing these inputs may require significant amount of rework of existing APIs.
I agree completely that future input should be organized in the geoArrow format, and it should be pretty straight forward.
This issue has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be labeled inactive-90d if there is no activity in the next 60 days.
@isVoid I think you have made progress on this issue with recent PRs. Could you perhaps add a checklist to track what remains?
This issue has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be labeled inactive-90d if there is no activity in the next 60 days.
This is fixed thanks to recent refactors.