zingg issues

Build helper methods for column prefix and zframe column rename in one single place

Right now the code has ColName.COL_PREFIX all over. We should see what's needed and then improve the code.
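A minimal sketch of what such a single helper could look like. The class `ColumnNames`, its method names, and the prefix value `"z_"` are all assumptions made for illustration; the real constant is `ColName.COL_PREFIX`, and its value is not stated in this issue.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical helper, not part of Zingg: keeps all prefix handling in one place. */
final class ColumnNames {
    // Assumed value for illustration only; real code would reference ColName.COL_PREFIX.
    static final String PREFIX = "z_";

    private ColumnNames() {}

    /** Adds the prefix unless the name already carries it. */
    static String withPrefix(String name) {
        return name.startsWith(PREFIX) ? name : PREFIX + name;
    }

    /** Removes the prefix if present, otherwise returns the name unchanged. */
    static String withoutPrefix(String name) {
        return name.startsWith(PREFIX) ? name.substring(PREFIX.length()) : name;
    }

    /** Applies withPrefix to every name, e.g. before a bulk column rename. */
    static List<String> withPrefixAll(List<String> names) {
        return names.stream().map(n -> withPrefix(n)).collect(Collectors.toList());
    }
}
```

With a helper like this, call sites would no longer concatenate the prefix themselves, so a change to the naming rule touches one class only.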

Graph scoring etc should move out of Matcher and into its own separate class

The current Matcher contains the graph scoring and other graph logic, which makes them tightly coupled. We should move the scoring to a different class. Also think through other graph stuff...

sonalgoyal

Move stop words to pre processor

vikasgupta78

enhancement

blocking heavily dependent on field order

3

Blocking algorithms are currently heavily dependent on field order, giving vastly different results when the field order in fieldDefinitions is changed. We should make them more consistent.

sonalgoyal

sim functions should have field+fn name

may have impact on enterprise also

vikasgupta78

keep arguments passed from user immutable

We add the dataframe to the pipe when we read it, which modifies the original args object. In a way that is ok as we are only enriching the args....
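One possible way to keep the user's arguments untouched is to treat them as immutable and return an enriched copy instead. The sketch below is only an illustration: `UserArgs`, `withPipe`, and the use of plain strings for pipes are hypothetical and do not reflect Zingg's actual arguments class.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical immutable arguments holder, not Zingg's real class. */
final class UserArgs {
    private final List<String> pipes;

    UserArgs(List<String> pipes) {
        // Defensive copy so later changes to the caller's list do not leak in.
        this.pipes = Collections.unmodifiableList(new ArrayList<>(pipes));
    }

    List<String> getPipes() {
        return pipes;
    }

    /** Returns a new, enriched instance; the original object is left as the user passed it. */
    UserArgs withPipe(String pipe) {
        List<String> copy = new ArrayList<>(pipes);
        copy.add(pipe);
        return new UserArgs(copy);
    }
}
```

The trade-off is an extra object per enrichment step, in exchange for being able to rely on the user's original arguments never changing.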

sonalgoyal

change header

sonalgoyal

Improve ZFrame interface

2

The current ZFrame has methods like drop(String, String..), which can be replaced with drop(String..).
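To illustrate why a single varargs signature is enough, here is a small sketch. `SimpleFrame` is a hypothetical stand-in that only tracks column names; it is not the ZFrame interface.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for a frame; it tracks column names only. */
final class SimpleFrame {
    private final List<String> columns;

    SimpleFrame(String... columns) {
        this.columns = new ArrayList<>(Arrays.asList(columns));
    }

    /** One varargs method covers drop("a"), drop("a", "b"), and drop(someArray). */
    SimpleFrame drop(String... names) {
        List<String> kept = new ArrayList<>(columns);
        kept.removeAll(Arrays.asList(names));
        return new SimpleFrame(kept.toArray(new String[0]));
    }

    List<String> columns() {
        return new ArrayList<>(columns);
    }
}
```

A call such as `new SimpleFrame("a", "b", "c").drop("a", "c")` leaves only column `b`, and callers holding a `String[]` can pass it directly.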

sonalgoyal

Revisit Row, Column, StructType and StructField interfaces for ZFrame

Currently we have implemented methods in ZFrame that should actually be in the Row, Column, StructType etc classes, e.g. getAsString. One thing to remember: StructField is not serializable in Snowpark so...

sonalgoyal

technicalDebt

Preprocessing to lower case

A preprocessing phase is needed that converts all data to lower case before the start of any phase. This is especially relevant for stop words and the recommender, as currently those are case...
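Lower-casing ahead of every phase could be sketched roughly as below. `LowerCasePreprocessor` and its methods are hypothetical names, and the sketch works on plain string lists rather than on a real dataframe.

```java
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

/** Hypothetical preprocessor, shown on plain strings instead of dataframe columns. */
final class LowerCasePreprocessor {
    private LowerCasePreprocessor() {}

    /** Lower-cases one value; Locale.ROOT keeps the result independent of the JVM's default locale. */
    static String normalize(String value) {
        return value == null ? null : value.toLowerCase(Locale.ROOT);
    }

    /** Lower-cases every value in a column-like list, passing nulls through. */
    static List<String> normalizeAll(List<String> values) {
        return values.stream().map(LowerCasePreprocessor::normalize).collect(Collectors.toList());
    }
}
```

Using `Locale.ROOT` is a deliberate choice here: locale-sensitive lower-casing can produce different strings on different machines, which would make matching results hard to reproduce.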

vikasgupta78
