flytekit
flytekit copied to clipboard
Set default format of structured dataset to empty
Signed-off-by: Kevin Su [email protected]
TL;DR
https://github.com/flyteorg/flyte/issues/2864
The current default format type is Parquet, and it leads to some issues
- SD transformer will always convert dataframe to parquet instead of cls.DEFAULT_FORMATS[df_type]
def t1() -> StructuredDataset: # The default format of structured dataset is Parquet here
- Failed to run BQ task if the cache is enabled because type validation is failing.
@task(cache=True, cache_version="1.0")
def t1() -> StructuredDataset: # The default format of structured dataset is Parquet here
df = pd.DataFrame({"len": [len(sd.open(pd.DataFrame).all())]})
return StructuredDataset(df, uri=bq_uri) # The format of structured dataset is ""
In this PR, we set the default format to ""
def t1() -> StructuredDataset: # The default format of structured dataset is "" here
- By default, we use
handlers[df_type][protocol][""]to encode a dataframe. ifhandlers[df_type][protocol][""]doesn't exist, then usehandlers[df_type][protocol][cls.DEFAULT_FORMATS[df_type]].
Type
- [x] Bug Fix
- [ ] Feature
- [ ] Plugin
Are all requirements met?
- [x] Code completed
- [x] Smoke tested
- [x] Unit tests added
- [ ] Code documentation added
- [ ] Any pending items have an associated Issue
Complete description
^^^
Tracking Issue
^^^
Follow-up issue
NA
Codecov Report
Merging #1159 (beec6a7) into master (2ccaed7) will increase coverage by
0.58%. The diff coverage is76.78%.
@@ Coverage Diff @@
## master #1159 +/- ##
==========================================
+ Coverage 68.51% 69.09% +0.58%
==========================================
Files 288 295 +7
Lines 26085 26983 +898
Branches 2918 2537 -381
==========================================
+ Hits 17871 18644 +773
- Misses 7735 7843 +108
- Partials 479 496 +17
| Impacted Files | Coverage Δ | |
|---|---|---|
| flytekit/clis/helpers.py | 94.59% <ø> (ø) |
|
| flytekit/clis/sdk_in_container/init.py | 100.00% <ø> (+33.33%) |
:arrow_up: |
| flytekit/configuration/internal.py | 16.43% <0.00%> (+0.43%) |
:arrow_up: |
| flytekit/core/context_manager.py | 39.61% <0.00%> (ø) |
|
| flytekit/core/launch_plan.py | 57.89% <0.00%> (ø) |
|
| flytekit/core/map_task.py | 43.11% <0.00%> (ø) |
|
| flytekit/deck/__init__.py | 0.00% <ø> (ø) |
|
| flytekit/extras/tensorflow/__init__.py | 0.00% <0.00%> (ø) |
|
| flytekit/models/literals.py | 40.28% <0.00%> (ø) |
|
| flytekit/tools/fast_registration.py | 81.53% <0.00%> (-7.53%) |
:arrow_down: |
| ... and 101 more |
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.
We are going to hold off on this change for a while, need to get the propeller change out and people migrated first. https://github.com/flyteorg/flytepropeller/pull/483