
test_taar_similarity.test_compute_donors causing test failures

Open acmiyaguchi opened this issue 6 years ago • 3 comments

https://github.com/mozilla/python_mozetl/blob/0f8189f87f857f43e9c0142f9c612a0bcc28978c/tests/test_taar_similarity.py#L258-L263

________________________________________________ test_compute_donors ________________________________________________

spark = <pyspark.sql.session.SparkSession object at 0x7fa2c3b17f10>
addon_whitelist = ['system-addon-guid', 'var-0-guid-0', 'var-0-guid-1', 'var-0-guid-2', 'var-1-guid-0', 'var-1-guid-1', ...]
multi_clusters_df = DataFrame[client_id: string, normalized_channel: string, geo_city: array<strin...ar_parent_browser_engagement_unique_domains_count: array<struct<value:bigint>>]

    def test_compute_donors(spark, addon_whitelist, multi_clusters_df):
        multi_clusters_df.createOrReplaceTempView("longitudinal")
    
        # Perform the clustering on our test data. We expect
        # 3 clusters out of this and 10 donors.
>       _, donors_df = taar_similarity.get_donors(spark, 3, 10, addon_whitelist, random_seed=42)

tests/test_taar_similarity.py:263: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
mozetl/taar/taar_similarity.py:151: in get_donors
    clusters = compute_clusters(addons_df, num_clusters, random_seed)
mozetl/taar/taar_similarity.py:101: in compute_clusters
    model = pipeline.fit(addons_df)
.tox/py27/local/lib/python2.7/site-packages/pyspark/ml/base.py:132: in fit
    return self._fit(dataset)
.tox/py27/local/lib/python2.7/site-packages/pyspark/ml/pipeline.py:109: in _fit
    model = stage.fit(dataset)
.tox/py27/local/lib/python2.7/site-packages/pyspark/ml/base.py:132: in fit
    return self._fit(dataset)
.tox/py27/local/lib/python2.7/site-packages/pyspark/ml/wrapper.py:288: in _fit
    java_model = self._fit_java(dataset)
.tox/py27/local/lib/python2.7/site-packages/pyspark/ml/wrapper.py:285: in _fit_java
    return self._java_obj.fit(dataset._jdf)
.tox/py27/local/lib/python2.7/site-packages/py4j/java_gateway.py:1160: in __call__
    answer, self.gateway_client, self.target_id, self.name)
.tox/py27/local/lib/python2.7/site-packages/pyspark/sql/utils.py:63: in deco
    return f(*a, **kw)
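
The traceback stops at the `deco` frame in `pyspark/sql/utils.py`, so the underlying JVM error is not shown here. As a rough illustration only, a wrapper of that shape typically calls the wrapped function and translates a low-level bridge error into a more specific Python exception. The sketch below assumes that behavior; `GatewayError`, `TranslatedIllegalArgument`, `capture_gateway_exception`, and `fit` are hypothetical stand-ins, not PySpark or Py4J code.

```python
import functools


class GatewayError(Exception):
    """Hypothetical stand-in for a low-level error raised by a JVM bridge."""

    def __init__(self, java_class, message):
        super().__init__("{}: {}".format(java_class, message))
        self.java_class = java_class
        self.message = message


class TranslatedIllegalArgument(Exception):
    """Hypothetical Python-side exception for an illegal-argument failure."""


def capture_gateway_exception(f):
    """Wrap f so that selected bridge errors surface as Python exceptions."""

    @functools.wraps(f)
    def deco(*a, **kw):
        try:
            # This is the frame a traceback would show last on the Python side.
            return f(*a, **kw)
        except GatewayError as e:
            if e.java_class == "java.lang.IllegalArgumentException":
                raise TranslatedIllegalArgument(e.message)
            # Anything unrecognized is re-raised unchanged.
            raise

    return deco


@capture_gateway_exception
def fit(dataset):
    """Hypothetical fit call that fails on the 'JVM side' for empty input."""
    if not dataset:
        raise GatewayError("java.lang.IllegalArgumentException", "empty dataset")
    return len(dataset)
```

With a wrapper like this, the last Python frame is always the `return f(*a, **kw)` line, and the informative part is the exception raised after it, which is the piece missing from the output above.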

acmiyaguchi • Jun 22 '18 23:06