[DOCS] quickstart-revision

Open avriiil opened this issue 1 year ago • 2 comments

This revision involves:

  • minor restructuring for better flow
  • more intro text to provide context and highlight Daft features
  • expanded and more fun/relatable code example that includes ML classification with UDFs

Still to do:

  • [ ] upload Parquet file to public S3 bucket
  • [x] store images on stable URL / check licensing
  • [x] update all doc links
  • [x] perfection revision of owner/dog name combos :D

avriiil avatar Apr 15 '24 14:04 avriiil

Thanks for your thoughts @jaychia! I agree with most of your points. I've added commits for (1) a restructure to bring Expressions higher up,
(2) trimmed some sections that feel superfluous to me, (3) fixed nits**

LMK what you think :)

** I'm not sure about nit (5). Filtering right now flows nicely into Query Planning, which was an intentional bridge. We could also trim the Query Planning section and instead point to this docs page or to a new, separate page on Query Planning in Daft (which would be great to have)?

avriiil avatar Apr 16 '24 14:04 avriiil

df = df.with_column("has_dog", df["has_dog"].apply(lambda x: True, return_dtype=DataType.bool()))

this is probably not what we want? I think we'd want like a fillnull or something similar.

Right, I was looking for something like that but couldn't find it in the docs. Maybe something like: if expr.is_null, then set a value?
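
To make the concern concrete, here is a minimal plain-Python sketch (not Daft code) of the difference between overwriting every row and filling only the nulls. The sample values are hypothetical, None stands in for a null, and it assumes the function passed to .apply is called once per row.

```python
# Plain-Python stand-in for a nullable boolean column; None plays the role of null.
has_dog = [True, None, False, None]

# Roughly what applying `lambda x: True` to every row does:
# all values, including the known False, become True.
overwritten = [True for _ in has_dog]

# What a null fill does: only missing values receive the default.
filled = [True if value is None else value for value in has_dog]

print(overwritten)  # [True, True, True, True]
print(filled)       # [True, True, False, True]
```

The second result keeps the known False, which is the behaviour a fill-null style expression is meant to give.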

avriiil avatar Apr 19 '24 09:04 avriiil

@jaychia - this should be good to go now!

avriiil avatar Jul 09 '24 12:07 avriiil

Looks good to go mostly, some more things we should address before merge:

Closing ": image

We've received some comments around not knowing how to use .select() for running expressions. Perhaps this section can be expanded a little to show that you can use expressions in a .select(): image

For this example, we shouldn't just be using .apply to set everything to True. Instead we can show an if_else: image

I think this should work:

df["has_dog"].is_null().if_else(True, df["has_dog"])

jaychia avatar Jul 15 '24 21:07 jaychia

Thanks for the sharp eye @jaychia, fixes made!

avriiil avatar Jul 22 '24 11:07 avriiil