[Feature] Add license filtering for code modules

Open Bytes-Explorer opened this issue 1 year ago • 4 comments

Search before asking

  • [X] I searched the issues and found no similar issues.

Component

Transforms/code/code_quality

Feature

Capability to filter by permissive licenses for any new code data as a new module.

Are you willing to submit a PR?

  • [x] Yes I am willing to submit a PR!

Bytes-Explorer May 16 '24 14:05

Needed for release 1

Bytes-Explorer Jun 17 '24 14:06

@Bytes-Explorer I caved on proglang_select, but now we're repeating this pattern for a different column name? We can't have a new transform for every attribute we want to annotate or filter on. I recommend one of the following:

  1. Extend the filter transform so that, instead of removing rows, it can optionally keep all rows and add a new column that is set to True for rows that "would have been filtered". cc: @cmadam
  2. Create a new transform that generalizes what proglang_select does for an arbitrary column and set of values.

I would strongly recommend one of these alternatives to creating a new special purpose transform. We're a toolkit after all.
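As a rough illustration of the two alternatives, here is a minimal sketch in plain Python. It is not the toolkit's API: the function name `select_by_column`, the `annotate_only` switch, the `would_be_filtered` flag column, and the `license` column with its sample values are all hypothetical, and rows are modeled as simple dicts rather than whatever table type the transforms actually use.

```python
from typing import Any, Iterable

Row = dict[str, Any]


def select_by_column(
    rows: Iterable[Row],
    column: str,
    allowed: Iterable[Any],
    annotate_only: bool = False,
    flag_column: str = "would_be_filtered",
) -> list[Row]:
    """Keep rows whose `column` value is in `allowed` (hypothetical helper).

    With annotate_only=True, no row is dropped; instead each row gets a
    boolean `flag_column` that is True when the row would have been removed.
    Rows missing `column` are treated as not allowed.
    """
    allowed_set = set(allowed)  # assumes the values are hashable
    result: list[Row] = []
    for row in rows:
        keep = row.get(column) in allowed_set
        if annotate_only:
            # Alternative 1: keep everything, record the filter decision.
            result.append({**row, flag_column: not keep})
        elif keep:
            # Alternative 2: generic select on an arbitrary column.
            result.append(row)
    return result


# Illustrative data only; the license identifiers are sample values.
sample = [
    {"path": "a.py", "license": "MIT"},
    {"path": "b.py", "license": "GPL-3.0"},
    {"path": "c.py"},
]
permissive = {"MIT", "Apache-2.0"}

filtered = select_by_column(sample, "license", permissive)
annotated = select_by_column(sample, "license", permissive, annotate_only=True)
```

Under these assumptions, the same function would cover both a programming-language select and a license select just by changing `column` and `allowed`, which is the point of the generalization argued for above.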

daw3rd Jun 18 '24 18:06

These are two separate functionalities, and we need to think about how an end user will use them. Many transforms may apply filtering as their final step. However, coupling all the functionalities together may make the toolkit hard for an end user to use, especially as the number of functionalities grows.

Bytes-Explorer Jun 19 '24 03:06

This is in progress in PR #257

daw3rd Sep 13 '24 16:09