Vadym Doroshenko

Results: 26 issues by Vadym Doroshenko

It is based on this [PR](https://github.com/OpenMined/PyDP/commit/eadd992310a9606a5772bc99d570f52fafc8ba22).

This PR implements a wrapper for the [QuantileTree](https://github.com/google/differential-privacy/blob/main/cc/algorithms/quantile-tree.h) algorithm from Google's C++ DP building-block library.

# Context

## Definitions

_Contribution bounding_ is the process of limiting the contributions of a single individual (or an entity represented by a privacy key) to the output dataset or its...
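The preview is cut off, but the core idea of contribution bounding can be illustrated in a few lines of plain Python. This is a hedged sketch, not PipelineDP's implementation; the function name and `max_per_key` parameter are invented for illustration:

```python
from collections import defaultdict

def bound_contributions(records, max_per_key):
    """Keep at most `max_per_key` records per privacy key.

    `records` is an iterable of (privacy_key, value) pairs. Capping how
    many values any single key contributes limits that key's influence
    on downstream aggregates, which is what makes the DP noise
    calibration possible.
    """
    kept = []
    counts = defaultdict(int)
    for key, value in records:
        if counts[key] < max_per_key:
            counts[key] += 1
            kept.append((key, value))
    return kept

records = [("alice", 1), ("alice", 2), ("alice", 3), ("bob", 5)]
print(bound_contributions(records, max_per_key=2))
# → [('alice', 1), ('alice', 2), ('bob', 5)]
```

Real implementations typically sample which contributions to keep rather than taking the first ones, but the bounding guarantee is the same.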

Type: New Feature :heavy_plus_sign:

# Context

The workflow for computing DP aggregations with PipelineDP is the following (steps that are not important here are omitted; see [the full example](https://github.com/OpenMined/PipelineDP/blob/41b70a3c7e19b82024e2d0f44842aaab570440bd/examples/quickstart.ipynb)):

```
# Define the total budget.
budget_accountant =...
```
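The snippet above is truncated, but the reason the budget accountant comes first is that every subsequent aggregation requests a share of the total privacy budget. A minimal sketch of that bookkeeping in plain Python (class and method names are invented for illustration; PipelineDP's real accountant is considerably more elaborate):

```python
class BudgetAccountant:
    """Toy accountant: splits a total epsilon evenly across all requests."""

    def __init__(self, total_epsilon):
        self.total_epsilon = total_epsilon
        self._requests = []

    def request_budget(self):
        # Return a mutable holder; the final share is known only after
        # all aggregations have registered their requests.
        holder = {"eps": None}
        self._requests.append(holder)
        return holder

    def compute_budgets(self):
        share = self.total_epsilon / len(self._requests)
        for holder in self._requests:
            holder["eps"] = share

acct = BudgetAccountant(total_epsilon=1.0)
first = acct.request_budget()   # e.g. a DP count
second = acct.request_budget()  # e.g. a DP sum
acct.compute_budgets()
print(first["eps"], second["eps"])
# → 0.5 0.5
```

This two-phase shape (register requests, then finalize) is why the total budget must be defined before any aggregation is declared.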

Good first issue :mortar_board:
Type: New Feature :heavy_plus_sign:

# Context

[DPEngine.aggregate](https://github.com/OpenMined/PipelineDP/blob/66012f04a94720ba2e5499c1c96edb3399b83287/pipeline_dp/dp_engine.py#L53) is an API function that performs DP aggregations. It takes [AggregateParams](https://github.com/OpenMined/PipelineDP/blob/66012f04a94720ba2e5499c1c96edb3399b83287/pipeline_dp/aggregate_params.py#L57) as an argument. `AggregateParams` is a dataclass that specifies the details of the computation to be performed. It...
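Since the preview is cut off: `AggregateParams` bundles the knobs of one DP aggregation into a single object. A stripped-down sketch of the same pattern (the field names are loosely modelled on the linked dataclass, not copied from it):

```python
from dataclasses import dataclass
from typing import List

@dataclass
class AggregateParams:
    # Which DP metrics to compute for each partition.
    metrics: List[str]
    # Contribution bounds used to calibrate the noise.
    max_partitions_contributed: int
    max_contributions_per_partition: int

params = AggregateParams(
    metrics=["count", "sum"],
    max_partitions_contributed=2,
    max_contributions_per_partition=1,
)
print(params.metrics)
# → ['count', 'sum']
```

Passing one dataclass instead of many positional arguments keeps `DPEngine.aggregate`'s signature stable as new options are added.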

Good first issue :mortar_board:
Type: Testing :test_tube:

# Context

PipelineDP supports anonymization with the Beam API ([example](https://github.com/OpenMined/PipelineDP/blob/main/examples/movie_view_ratings/run_on_spark.py)). It would be interesting to add support for the [Beam SQL API](https://beam.apache.org/documentation/dsls/sql/overview/).

# Goal

To investigate and design a BeamSQL API...

Type: New Feature :heavy_plus_sign:
Type: Research :microscope:

# Context

PipelineDP supports anonymization with the Spark RDD API ([example](https://github.com/OpenMined/PipelineDP/blob/main/examples/movie_view_ratings/run_on_spark.py)). It would be interesting to add support for the [Spark SQL API](https://spark.apache.org/sql/).

# Goal

To investigate and design a SparkSQL API...

Type: New Feature :heavy_plus_sign:
Type: Research :microscope:

# Context

Terminology can be found [here](https://pipelinedp.io/key-definitions/). There are 2 APIs:

1. **Core APIs** - represented by the public functions of DPEngine ([example of usage](https://github.com/OpenMined/PipelineDP/blob/main/examples/movie_view_ratings/run_all_frameworks.py))
2. **High-level APIs**: `PrivatePCollection` (Beam),...
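The core/high-level distinction is a common wrapper pattern, and can be sketched in plain Python. This is only an illustration of the pattern, not PipelineDP code; the names below are invented:

```python
# Core-style API: an engine function that takes the collection explicitly,
# like DPEngine's public methods do.
def aggregate(engine_state, collection, params):
    """Stand-in aggregation: count the values in each partition."""
    return [(key, len(values)) for key, values in collection.items()]

# High-level-style API: the collection is wrapped in an object, so calls
# chain off it, in the spirit of PrivatePCollection.
class PrivateCollection:
    def __init__(self, collection):
        self._collection = collection

    def aggregate(self, params):
        # Delegates to the core API under the hood.
        return aggregate(None, self._collection, params)

data = {"p1": [1, 2], "p2": [3]}
print(aggregate(None, data, params=None))
# → [('p1', 2), ('p2', 1)]
print(PrivateCollection(data).aggregate(params=None))
# → [('p1', 2), ('p2', 1)]
```

The high-level form reads more naturally in pipelines, while the core form keeps all state explicit; both compute the same thing.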

Good first issue :mortar_board:
Type: Documentation :books: