Norbert Orzechowicz

Results 68 issues of Norbert Orzechowicz

https://parquet.apache.org/docs/file-format/data-pages/encodings/#a-namedeltaencadelta-encoding-delta_binary_packed--5

lib-parquet

Parquet comes with a very handy mechanism called "Column Statistics", which records, for example, the min/max values, the total number of null values, etc. By reading those statistics we won't...
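As a rough illustration of why min/max statistics are useful, the sketch below assumes one common use: deciding which row groups can be skipped when looking for a specific value. The `ColumnStatistics` class, the `mayContain` function and the row group names are hypothetical and are not part of the lib-parquet API.

```php
<?php

declare(strict_types=1);

// Hypothetical value object; the field names are illustrative only.
final class ColumnStatistics
{
    public function __construct(
        public readonly int $min,
        public readonly int $max,
        public readonly int $nullCount,
    ) {
    }
}

// Returns true when the row group might contain $value and so has to be read.
// A value outside of [min, max] can never be present in that row group.
function mayContain(ColumnStatistics $stats, int $value): bool
{
    return $value >= $stats->min && $value <= $stats->max;
}

// Made-up statistics for three row groups of a single integer column.
$rowGroups = [
    'group-0' => new ColumnStatistics(min: 1, max: 100, nullCount: 0),
    'group-1' => new ColumnStatistics(min: 101, max: 200, nullCount: 3),
    'group-2' => new ColumnStatistics(min: 201, max: 300, nullCount: 0),
];

// Keep only the row groups whose range covers the value we are looking for.
$toRead = array_keys(array_filter(
    $rowGroups,
    static fn (ColumnStatistics $stats): bool => mayContain($stats, 150)
));

var_dump($toRead); // only 'group-1' is left
```

The check is conservative: a row group that passes it may still not contain the value, but a row group that fails it can be skipped without decoding any pages.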

Currently, Flow can only read XML, but we should start looking into saving XML files as well. There are two libraries that could help us achieve that:

- https://github.com/veewee/xml
- ...

adapter-xml

Currently, PagesBuilder decides which type of encoding applies to each column, and we only support two types:

- RLE_DICTIONARY
- PLAIN

We should allow users to override this, but...

Currently, the run method clones the DataFrame internally. The idea behind that was to always duplicate the DataFrame in order to make it reusable. However, PHP's clone is shallow. I tried...
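The shallow clone behavior can be reproduced with a minimal sketch. `Pipeline`, `Frame` and `DeepFrame` below are hypothetical stand-ins, not Flow classes; they only show that `clone` copies object handles rather than the nested objects, and that a `__clone` method is one way to get a deeper copy.

```php
<?php

declare(strict_types=1);

// Hypothetical nested object that holds mutable state.
final class Pipeline
{
    /** @var list<string> */
    public array $steps = [];
}

// Hypothetical frame relying on the default (shallow) clone.
final class Frame
{
    public function __construct(public Pipeline $pipeline)
    {
    }
}

// Same shape, but __clone also duplicates the nested object.
final class DeepFrame
{
    public function __construct(public Pipeline $pipeline)
    {
    }

    public function __clone()
    {
        $this->pipeline = clone $this->pipeline;
    }
}

$original = new Frame(new Pipeline());
$copy = clone $original;

// The clone is a new Frame, but both frames share one Pipeline instance.
$copy->pipeline->steps[] = 'filter';

var_dump($original !== $copy);                     // bool(true)
var_dump($original->pipeline === $copy->pipeline); // bool(true)
var_dump($original->pipeline->steps);              // contains 'filter'

$deep = new DeepFrame(new Pipeline());
$deepCopy = clone $deep;

// With __clone in place, changes to the copy no longer leak into the original.
$deepCopy->pipeline->steps[] = 'sort';

var_dump($deep->pipeline->steps); // still empty
```

Arrays of scalars are copied by value on clone, so the problem only shows up for properties that hold other objects.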

It would be super handy to have an ErrorHandler that does the same thing as the SkipRows handler, but first logs the entire rows collection to PsrLogger.

It would be nice to have static analysis detect the following scenario as invalid:

```php
$array = [];

if (\count($array)) {
    foreach ($array as $element) {
        // some logic
    }
}
```

...

To support AWS S3 remote storage we should implement the following components first:

- AWS SDK - (covering only authentication and S3 storage operations)
- Filesystem AWS Bridge - providing...

library-filesystem
hard

https://parquet.apache.org/docs/file-format/data-pages/encryption/

> Parquet files containing sensitive information can be protected by the modular encryption mechanism that encrypts and authenticates the file data and metadata - while allowing for a regular...