superset icon indicating copy to clipboard operation
superset copied to clipboard

feat(api): Fetch datasets with a given advanced data type

Open reesercollins opened this issue 2 years ago • 7 comments

SUMMARY

For future drill-to actions, it would be useful to have an api to fetch all datasets with a given advanced data type.

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

TESTING INSTRUCTIONS

Set an advanced type on a dataset column. Calling the advanced_data_type/datasets api with that advanced type should return the dataset id and column id.

ADDITIONAL INFORMATION

  • [ ] Has associated issue:
  • [x] Required feature flags: ENABLE_ADVANCED_DATA_TYPES: True
  • [ ] Changes UI
  • [ ] Includes DB Migration (follow approval process in SIP-59)
    • [ ] Migration is atomic, supports rollback & is backwards-compatible
    • [ ] Confirm DB migration upgrade and downgrade tested
    • [ ] Runtime estimates and downtime expectations provided
  • [x] Introduces new feature or API
  • [ ] Removes existing feature or API

reesercollins avatar Jul 20 '22 15:07 reesercollins

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Comparison is base (4e33235) 67.03% compared to head (5483fd6) 65.53%. Report is 2128 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master   #20793      +/-   ##
==========================================
- Coverage   67.03%   65.53%   -1.50%     
==========================================
  Files        1813     1803      -10     
  Lines       69437    69024     -413     
  Branches     7450     7319     -131     
==========================================
- Hits        46545    45236    -1309     
- Misses      20972    21901     +929     
+ Partials     1920     1887      -33     
Flag Coverage Δ
hive 52.60% <73.33%> (-0.28%) :arrow_down:
mysql 78.14% <100.00%> (-0.24%) :arrow_down:
postgres 78.21% <100.00%> (-0.24%) :arrow_down:
presto 52.50% <73.33%> (-0.28%) :arrow_down:
python 78.69% <100.00%> (-2.86%) :arrow_down:
sqlite 76.65% <73.33%> (-0.25%) :arrow_down:
unit ?

Flags with carried forward coverage won't be shown. Click here to find out more.

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

codecov[bot] avatar Jul 20 '22 16:07 codecov[bot]

@reesercollins I wonder if you would be able to get what you needed from the /datasets/related/columns rest api by adding the advanced data type filter. In theory, that would save you from having to create a new api, and would keep it in the new rest v1 api. Then you can also use the RelatedFieldFilter for permissions.

eschutho avatar Aug 15 '22 21:08 eschutho

@reesercollins I wonder if you would be able to get what you needed from the /datasets/related/columns rest api by adding the advanced data type filter. In theory, that would save you from having to create a new api, and would keep it in the new rest v1 api. Then you can also use the RelatedFieldFilter for permissions.

Hey @eschutho quick question about that. The data we need back is the columns that have the in question advanced data type as well as the dataset Id that they belong to. As far as I was aware from the related endpoint doesn't return the data in this way, but I could be completely wrong. would I be able to retrieve that data in a form mentioned above from /datasets/related/columns? Because it would be nice not to add a new endpoint.

cccs-RyanS avatar Aug 16 '22 13:08 cccs-RyanS

Hey @eschutho quick question about that. The data we need back is the columns that have the in question advanced data type as well as the dataset Id that they belong to. As far as I was aware from the related endpoint doesn't return the data in this way, but I could be completely wrong. would I be able to retrieve that data in a form mentioned above from /datasets/related/columns? Because it would be nice not to add a new endpoint.

Would it work for you for the endpoint to return the entire column data? If so, you can add a filter to the query for the advanced data type and then you'll have the dataset id on the column data. Does that work?

eschutho avatar Aug 16 '22 19:08 eschutho

Would it work for you for the endpoint to return the entire column data? If so, you can add a filter to the query for the advanced data type and then you'll have the dataset id on the column data. Does that work?

The issue is the /dataset/related/... endpoint gives no control over what data is returned from it. It is hard-coded to return the primary key of the data model (In this case, the TableColumn) and the name. That means the endpoint is returning the column id, not the dataset id.

reesercollins avatar Aug 16 '22 19:08 reesercollins

The issue is the /dataset/related/... endpoint gives no control over what data is returned from it. It is hard-coded to return the primary key of the data model (In this case, the TableColumn) and the name. That means the endpoint is returning the column id, not the dataset id.

I see what you mean. @dpgaspar would it be weird for an api class to overwrite _get_result_from_rows to make this response more flexible to each model?

eschutho avatar Aug 17 '22 00:08 eschutho

@reesercollins I spoke with @dpgaspar and I would recommend using the /related endpoint, but overwriting _get_result_from_rows to include more information (key/values) from the columns, including the dataset_id that you need. But to keep it backward compatible, I would leave the "value" and "text" keys as is for anyone who may be using this api. But let me know if you run into any complications with this approach. 🙏

eschutho avatar Aug 25 '22 16:08 eschutho

Seems like the conversation here tapered off... is there any intent to get this across the finish line? I'll close/reopen to jumpstart CI for good measure.

rusackas avatar Feb 06 '24 19:02 rusackas

PR looks busted with 5000+ files. Closing for now, feel free to reopen once the branch is rebased properly

mistercrunch avatar Mar 08 '24 01:03 mistercrunch