Add `tail` method on DataFrame
Is there a better way we could do this? Maybe add something upstream if necessary?
Thinking about it more, I don't know that this operation is necessarily well defined. Just as calling `limit` multiple times on a large dataframe can give different results, I would expect different results from multiple calls here.
If we do put this in, I would suggest adding more text to the description to explain why this is an expensive operation: it performs a collect to determine the size of the dataframe.
Originally posted by @timsaucer in https://github.com/apache/datafusion-python/pull/915#discussion_r1798327215
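For context, here is a minimal sketch of the approach the comment describes, not the code from the PR. It assumes a DataFrame-like object exposing `count()` and `limit(count, offset)`; the `tail` helper and the `ListFrame` stand-in are hypothetical names used only so the example runs without datafusion installed.

```python
def tail(df, n=5):
    """Return the last n rows of df (hypothetical helper, not the PR's code)."""
    # count() has to execute the whole plan to learn the total row count;
    # this is the expensive step the comment refers to.
    total = df.count()
    # Skip everything except the final n rows (never a negative offset).
    offset = max(total - n, 0)
    return df.limit(n, offset)


class ListFrame:
    """Tiny in-memory stand-in for a DataFrame (assumption, for illustration)."""

    def __init__(self, rows):
        self.rows = list(rows)

    def count(self):
        return len(self.rows)

    def limit(self, count, offset=0):
        return ListFrame(self.rows[offset:offset + count])


last_three = tail(ListFrame(range(10)), 3)
print(last_three.rows)
```

The stand-in has a fixed row order, so repeated calls agree. On a real query without an explicit sort, row order is not guaranteed, which is why repeated calls could return different rows.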