Daniel Lemire

Results 293 issues of Daniel Lemire

https://github.com/uniVocity/csv-parsers-comparison

This should be a bit better. My own scores are about +5%.

Terminology: Selecting a subset of columns is called a projection. Since extracting the indexes is expensive, you may want to only pick some of the indexes, never committing to memory...
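As a rough illustration of a projection, here is a minimal sketch in C++. The helper `project_columns` is hypothetical and does not come from any of the parsers being compared. It assumes a plain comma-separated record with no quoted fields, and it assumes the wanted column indexes are given in strictly ascending order. Only the selected fields are copied, and scanning stops once the last wanted column has been found.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper: keeps only the requested columns of one CSV record.
// Assumptions: no quoted fields or escapes; `wanted` is strictly ascending.
std::vector<std::string> project_columns(std::string_view line,
                                         const std::vector<std::size_t> &wanted) {
  std::vector<std::string> out;
  out.reserve(wanted.size());
  std::size_t next = 0;   // position in `wanted` of the next column to keep
  std::size_t column = 0; // index of the field currently being scanned
  std::size_t start = 0;  // offset where the current field begins
  while (next < wanted.size()) {
    std::size_t comma = line.find(',', start);
    bool last = (comma == std::string_view::npos);
    std::size_t end = last ? line.size() : comma;
    if (column == wanted[next]) {
      // Only selected fields are ever copied into memory.
      out.emplace_back(line.substr(start, end - start));
      ++next;
    }
    if (last) break; // no more fields; remaining wanted columns are absent
    start = comma + 1;
    ++column;
  }
  return out;
}
```

For example, `project_columns("a,b,c,d", {0, 2})` yields the two fields `a` and `c`, and the fourth field is never examined.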

The caveat with Intel random number generation instructions is that you need to trust Intel's implementation. That is, you must have faith that Intel did not collaborate with the NSA...

There is now a C/C++ implementation of Roaring that might be of interest... https://github.com/RoaringBitmap/CRoaring

https://github.com/FastFilter/fastfilter_cpp

We support JSON Pointer, but we should support JSON Path. It is more work, but also more useful. Currently, a limited subset of JSONPath is supported, see https://github.com/simdjson/simdjson/pull/2127

help wanted
good first issue
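One way to support a limited JSONPath subset on top of existing JSON Pointer support is to translate the path into a pointer. The sketch below is an assumption about how such a translation could look, not the code from the linked pull request and not part of simdjson's API. The function name `json_path_to_pointer` is hypothetical. It handles only the root `$`, `.key` children and `[index]` array accesses, and rejects everything else.

```cpp
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper, not simdjson's API: converts a small JSONPath subset
// ($, .key, [index]) into an RFC 6901 JSON Pointer. Returns std::nullopt for
// anything outside that subset (wildcards, recursive descent, filters, ...).
std::optional<std::string> json_path_to_pointer(std::string_view path) {
  if (path.empty() || path[0] != '$') return std::nullopt;
  std::string pointer; // "$" alone maps to "", the whole-document pointer
  std::size_t i = 1;
  while (i < path.size()) {
    if (path[i] == '.') {
      // Child by name: read up to the next '.' or '['.
      std::size_t start = ++i;
      while (i < path.size() && path[i] != '.' && path[i] != '[') ++i;
      if (i == start) return std::nullopt; // empty key, including ".."
      std::string_view key = path.substr(start, i - start);
      if (key == "*") return std::nullopt; // wildcard is not supported here
      pointer += '/';
      for (char c : key) {
        // RFC 6901 escaping: '~' becomes "~0" and '/' becomes "~1".
        if (c == '~') pointer += "~0";
        else if (c == '/') pointer += "~1";
        else pointer += c;
      }
    } else if (path[i] == '[') {
      // Array access: digits only, followed by a closing bracket.
      std::size_t start = ++i;
      while (i < path.size() &&
             std::isdigit(static_cast<unsigned char>(path[i]))) ++i;
      if (i == start || i == path.size() || path[i] != ']') return std::nullopt;
      pointer += '/';
      pointer.append(path.substr(start, i - start));
      ++i; // skip ']'
    } else {
      return std::nullopt;
    }
  }
  return pointer;
}
```

With this sketch, `$.store.book[0].title` becomes `/store/book/0/title`, while a path such as `$..author` is rejected because recursive descent has no JSON Pointer equivalent.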

For better documentation of our interface, we should adopt concepts when C++20 is available. https://lemire.me/blog/2023/04/18/defining-interfaces-in-c-with-concepts-c20/

Currently, our documentation does not address raw_json(). Furthermore, it appears to be a consuming operation, which we may want to rethink.

The simdjson library has support for JSON Pointers. [JSON Path](https://goessner.net/articles/JsonPath/) is a much more powerful query language. It seems that it could be efficiently implemented with On Demand. cc @jkeiser