Sem
Parquet is a problem because it is not an append-friendly format, so we would lose the ability to do out-of-memory generation. We could add Avro as an option, it...
> is writing the parquet file in multiple parts so that when the full directory is read

In this case it may affect benchmarking significantly, because the number of files...
@ShockleysxX Do you need any kind of help with this issue?
Based on my experience with Apache Spark itself, the protobuf approach can be painful:
- you need to incorporate it into CI;
- the generated code is huge (for java...
@acezen Since this is a very large change, should we discuss it on the mailing list first, for example with a vote? Because it is a really big change, pros and cons are...
Could we add `buf` as a build tool? It is licensed under Apache 2.0, and we actually need only the binaries. https://github.com/bufbuild/buf
@acezen Could we merge it into a separate branch? I want to try adding `buf` and some additional scripts, especially for the Python part.
I would also like to add something about neighborhood retrieval. For example, stats (min, max, etc.) for an ego-network.
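
To make the idea concrete, here is a rough sketch in plain Python of what such ego-network stats could look like. It assumes a 1-hop, undirected neighborhood and a single numeric vertex property; the names `ego_network` and `ego_stats` and the sample data are hypothetical and not part of any existing API.

```python
from collections import defaultdict


def ego_network(edges, center):
    """Return the center vertex plus its direct (1-hop) neighbors.

    Edges are treated as undirected (an assumption of this sketch).
    """
    adjacency = defaultdict(set)
    for src, dst in edges:
        adjacency[src].add(dst)
        adjacency[dst].add(src)
    return {center} | adjacency[center]


def ego_stats(edges, values, center):
    """Compute count/min/max/mean of a numeric property over the ego network."""
    members = ego_network(edges, center)
    # Skip vertices that have no value for the property.
    observed = [values[v] for v in members if v in values]
    if not observed:
        return None
    return {
        "count": len(observed),
        "min": min(observed),
        "max": max(observed),
        "mean": sum(observed) / len(observed),
    }


# Hypothetical sample data: an edge list and an "age" property per vertex.
edges = [(1, 2), (1, 3), (2, 4), (5, 1)]
ages = {1: 30, 2: 25, 3: 41, 4: 19, 5: 36}

stats = ego_stats(edges, ages, center=1)
print(stats)
```

In a real implementation the same aggregation would presumably run over the stored adjacency and property chunks instead of in-memory Python collections.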

@acezen It seems to me that we forgot to add PySpark to the coverage report.