Aleksei Smirnov

Results 16 comments of Aleksei Smirnov

Jake, you are right: currently we store null information in the validity buffer using 1 bit per value. That is the reason why using BooleanDataFrameColumn takes more time, that just...
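To illustrate the 1-bit-per-value layout mentioned in this comment, here is a minimal Python sketch. It is not the DataFrame implementation. The helper names `pack_validity` and `is_valid` are hypothetical, and the sketch assumes the Apache Arrow convention of least-significant-bit ordering, where a set bit marks a non-null value.

```python
def pack_validity(values):
    """Pack null flags into a bitmap, 1 bit per value (assumed Arrow-style, LSB first)."""
    bitmap = bytearray((len(values) + 7) // 8)  # round up to whole bytes
    for i, v in enumerate(values):
        if v is not None:
            bitmap[i // 8] |= 1 << (i % 8)      # set bit i: value is present
    return bytes(bitmap)


def is_valid(bitmap, i):
    """Return True if the value at index i is non-null."""
    return (bitmap[i // 8] >> (i % 8)) & 1 == 1


# Nine nullable booleans fit into two bytes of validity information.
values = [True, None, False, True, None, None, True, True, False]
bitmap = pack_validity(values)
flags = [is_valid(bitmap, i) for i in range(len(values))]
```

Reading a single flag takes a byte lookup plus a shift and a mask, rather than a direct read of one element per value.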

> @asmirnov82 Can you please help review this? Additionally, I'd love your comments on what the difference between Decimal128 and Decimal256 Arrow type handling would be in a DataFrame? As...

@davesearle DataFrame was designed to handle situations where all the data is in memory, so streaming was not the primary goal. However, DataFrame allows conversion to a collection of Arrow...

The TestTokenizerUsingExternalVocab test fails because the external vocabulary at https://pythia.blob.core.windows.net/public/encoding/gpt2.tiktoken is not available.

@tarekgh, I implemented the fix. Now all tokenizer tests pass. Thank you for your help.