Daft [DOCS] quickstart-revision

This revision involves:

minor restructuring for better flow
more intro text to provide context and highlight Daft features
expanded and more fun/relatable code example that includes ML classification with UDFs

Still to do:
[ ] upload Parquet file to public S3 bucket
[x] store images on stable URL / check licensing
[x] update all doc links
[x] perfection revision of owner/dog name combos :D

Apr 15 '24 14:04 avriiil

Thanks for your thoughts @jaychia! I agree with most of your points. I've added commits that (1) restructure the page to bring Expressions higher up,
(2) trim some sections that feel superfluous to me, and (3) fix the nits**

LMK what you think :)

** I'm not sure about nit (5). Filtering right now flows nicely into Query Planning, which was an intentional bridge. We could also trim the Query Planning section and instead point to this docs page or a new, separate page on Query Planning in Daft (which would be great to have)?

Apr 16 '24 14:04 avriiil

df = df.with_column("has_dog", df["has_dog"].apply(lambda x: True, return_dtype=DataType.bool()))

this is probably not what we want? I think we'd want like a fillnull or something similar.

right, I was looking for something like that but couldn't find it in the docs. Maybe something like if expr.is_null > set value?

Apr 19 '24 09:04 avriiil

@jaychia - this should be good to go now!

Jul 09 '24 12:07 avriiil

Looks mostly good to go. A few more things we should address before merge:

Closing ":

We've received some comments around not knowing how to use .select() for running expressions. Perhaps this section can be expanded a little to show that you can use expressions in a .select():

For this example, we shouldn't just be using .apply to set everything to True. Instead we can show an if_else:

I think this should work:

df["has_dog"].is_null().if_else(True, df["has_dog"])

Jul 15 '24 21:07 jaychia

Thanks for the sharp eye @jaychia, fixes made!

Jul 22 '24 11:07 avriiil