"WindowExec: No Partition Defined for Window operation!" warnings on Spark EntitySets

Open · nicodv opened this issue 2 years ago · 1 comment

Both when I add Spark DataFrames to my EntitySet and when I call .dfs() on the Spark EntitySet, I see a flood of warnings:

22/04/26 16:41:09 WARN WindowExec: No Partition Defined for Window operation! Moving all data to a single partition, this can cause serious performance degradation.

It's even shown in the official documentation: https://featuretools.alteryx.com/en/stable/guides/using_spark_entitysets.html#Running-DFS

Is this an indication of some fundamental scaling issues when using Spark, or can I safely ignore it? What is the root cause of the warning?

Output of featuretools.show_info()

Featuretools version: 1.8.0

SYSTEM INFO

python: 3.9.11.final.0
python-bits: 64
OS: Darwin
OS-release: 21.4.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

INSTALLED VERSIONS

numpy: 1.22.3
pandas: 1.3.5
tqdm: 4.64.0
cloudpickle: 2.0.0
dask: 2022.4.1
distributed: 2022.4.1
psutil: 5.9.0
pip: 22.0.3
setuptools: 60.6.0

nicodv, Apr 27 '22 17:04

Hi @nicodv, we will look into this and get back to you.

gsheni, May 02 '22 14:05