Daft icon indicating copy to clipboard operation
Daft copied to clipboard

[FEAT] Implement str.substr Expression

Open danila-b opened this issue 9 months ago • 1 comments

State: DRAFT No tests and no docs yet until the implementation is finalized.

Adds Expressions.str.substr implementation. Resolves issue: https://github.com/Eventual-Inc/Daft/issues/1934

danila-b avatar May 12 '24 10:05 danila-b

@kevinzwang can you take a look when you get a chance?

samster25 avatar May 13 '24 17:05 samster25

The function turned out a bit lengthier than I expected, I'll see if I can shorten or refactor it but the general functionality should stay the same.

danila-b avatar May 18 '24 09:05 danila-b

@danila-b is this PR ready for review?

kevinzwang avatar May 20 '24 21:05 kevinzwang

@danila-b is this PR ready for review?

Yes, it is

danila-b avatar May 21 '24 13:05 danila-b

@danila-b are you still working on this? Your PR is almost in a mergeable state. Let me know if you'd like me to push it past the finish line

kevinzwang avatar May 30 '24 00:05 kevinzwang

@danila-b are you still working on this? Your PR is almost in a mergeable state. Let me know if you'd like me to push it past the finish line

Hey! Yeah I will try to implement the function with impl Iterator how you suggested one more time this weekend, and if it won't work out then feel free to take it over to get it across the finish line.

danila-b avatar May 30 '24 09:05 danila-b

Codecov Report

Attention: Patch coverage is 83.17308% with 35 lines in your changes are missing coverage. Please review.

Please upload report for BASE (main@e836290). Learn more about missing BASE report. Report is 2 commits behind head on main.

Additional details and impacted files

Impacted file tree graph

@@           Coverage Diff           @@
##             main    #2269   +/-   ##
=======================================
  Coverage        ?   79.16%           
=======================================
  Files           ?      472           
  Lines           ?    54685           
  Branches        ?        0           
=======================================
  Hits            ?    43293           
  Misses          ?    11392           
  Partials        ?        0           
Files Coverage Δ
daft/expressions/expressions.py 92.93% <100.00%> (ø)
src/daft-core/src/python/series.rs 94.43% <100.00%> (ø)
src/daft-dsl/src/functions/utf8/mod.rs 99.49% <100.00%> (ø)
src/daft-dsl/src/python.rs 91.57% <100.00%> (ø)
daft/series.py 91.88% <88.88%> (ø)
src/daft-core/src/series/ops/utf8.rs 93.10% <83.33%> (ø)
src/daft-dsl/src/functions/utf8/substr.rs 63.33% <63.33%> (ø)
src/daft-core/src/array/ops/utf8.rs 93.66% <84.55%> (ø)

codecov[bot] avatar May 30 '24 20:05 codecov[bot]

I realize there is no super clean solution, since you would need to box the iterators to not do the start_iter, start_repeat, length_iter, length_repeat matching, which comes at a performance cost. If you are happy with this, I'm ready to merge it!

Happy to merge it as it is :+1:

danila-b avatar Jun 03 '24 21:06 danila-b

Great, awesome work with this PR!

kevinzwang avatar Jun 03 '24 21:06 kevinzwang