Ruslan Dautkhanov
`java.lang.NoSuchMethodError: 'void org.apache.hadoop.security.HadoopKerberosName.setRuleMechanism` Not sure how this was tested. Perhaps hadoop and hadoop-common both need to be pinned to the same version, something like 3.3.6.
@candalfigomoro with the Koalas 1.0 release, which improves a bunch of APIs, do you think it's easier to make FT work on Koalas? https://koalas.readthedocs.io/en/latest/whatsnew/v1.0.0.html I didn't exactly follow the make_index issue...
@candalfigomoro @tuethan1999 Does the index have to be deterministic? Does it have to be sequential? If the answer is "no" to both, then `distributed` would be the most performant way to...
> The index column for a Featuretools entity does not need to be sequential, but it should be unique. `monotonically_increasing_id()` (used by `distributed` index_type) is guaranteed to be unique. Why...
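To illustrate why such IDs are unique but neither sequential nor deterministic, here is a minimal pure-Python sketch of the bit layout that the Spark documentation describes for `monotonically_increasing_id()` (partition ID in the upper 31 bits, record number within the partition in the lower 33 bits). It does not call Spark; `simulated_ids` is a hypothetical helper written for this illustration only.

```python
# Pure-Python model of the ID layout documented for Spark's
# monotonically_increasing_id(); it does not call Spark itself.
ROW_BITS = 33  # lower 33 bits hold the record number within a partition


def simulated_ids(partition_sizes):
    """Return the IDs for partitions with the given row counts (hypothetical helper)."""
    ids = []
    for partition_id, n_rows in enumerate(partition_sizes):
        for row_number in range(n_rows):
            # The partition ID goes in the upper bits, the row number in the lower bits.
            ids.append((partition_id << ROW_BITS) + row_number)
    return ids


# Six rows split evenly across two partitions.
ids = simulated_ids([3, 3])
print(ids)  # [0, 1, 2, 8589934592, 8589934593, 8589934594]

# The same six rows split differently get different IDs.
other_ids = simulated_ids([1, 5])
print(ids == other_ids)  # False
```

The IDs never collide, but they have gaps and change whenever the partitioning changes, so they suit an index that only needs to be unique.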
@candalfigomoro thanks - didn't realize the `index` column had to be "outside" of the df index column. Just for understanding - what are the use cases for sequential ids in FT?...
@thehomebrewnerd sorry for the delay in response. One way to make a column deterministic is to have it defined as something like `key_column = hash(list_of_key_columns)` or `key_column = concat(list_of_key_columns)` As...
Just to clarify - `list_of_key_columns` is whatever makes a primary key for a given dataset (provided by the user). Another option is to do something like

```python
F.hash(F.col("*"))
```

...
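The `key_column = hash(list_of_key_columns)` idea can be sketched in plain Python, without Spark or Koalas. This is only an illustration under assumptions: `make_key`, the separator, the sample rows, and the choice of SHA-256 truncated to 16 hex characters are all hypothetical and not taken from Featuretools or Koalas.

```python
import hashlib


def make_key(row, key_columns, sep="\x1f"):
    """Derive a deterministic ID from the primary-key columns of one row (hypothetical helper)."""
    # Join the values with a separator unlikely to appear in the data, so that
    # ("ab", "c") and ("a", "bc") do not collapse into the same string.
    raw = sep.join(str(row[col]) for col in key_columns)
    # Truncating to 16 hex characters (64 bits) is an arbitrary choice for this sketch.
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()[:16]


# Made-up sample data; the key columns are whatever the user declares as the primary key.
rows = [
    {"customer_id": 1, "order_date": "2020-08-11", "amount": 9.5},
    {"customer_id": 2, "order_date": "2020-08-11", "amount": 3.0},
]
list_of_key_columns = ["customer_id", "order_date"]

keys = [make_key(r, list_of_key_columns) for r in rows]

# Processing the rows in a different order yields the same key for each row.
keys_reversed = [make_key(r, list_of_key_columns) for r in reversed(rows)]
print(keys == keys_reversed[::-1])  # True
```

A cryptographic digest is used here because Python's built-in `hash()` is salted per process for strings, so it would not give the same key across runs. Any hash can collide in principle, so uniqueness still depends on the key columns really forming a primary key.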
@candalfigomoro `attach_id_column` uses `monotonically_increasing_id` under the hood https://github.com/databricks/koalas/blob/8d65308c43489f662a40d7c17a0cd9c6149ba00e/databricks/koalas/internal.py#L626 so it has the same semantics. @thehomebrewnerd it's great that Featuretools already has the capability to use an existing column for the index! I...
Hot off the press: @ueshin and team's article about Koalas. It nicely covers some of the items we discussed here https://databricks.com/blog/2020/08/11/interoperability-between-koalas-and-apache-spark.html Particularly, the different indexing strategies provided (if an index...
https://github.com/FeatureLabs/featuretools/releases/tag/v0.19.0 release 0.19 includes support for Koalas 🎉