
Test failed on `pytest -s python/tests/spark/sql/codegen/test_mlflow_registry.py::test_mlflow_model_from_model_version`

Renkai opened this issue on Nov 29 '21 · 3 comments

Here is the error log from my macOS machine:

(rikai) ➜  rikai git:(master) ✗ sbt publishLocal

[info] welcome to sbt 1.4.9 (Homebrew Java 11.0.12)
[info] loading global plugins from /Users/renkaige/.sbt/1.0/plugins
[info] loading settings for project rikai-build from plugins.sbt,sbt-antlr4.sbt ...
[info] loading project definition from /Users/renkaige/renkai-lab/rikai/project
[info] loading settings for project rikai from build.sbt ...
[info] set current project to rikai (in build file:/Users/renkaige/renkai-lab/rikai/)
[info] Wrote /Users/renkaige/renkai-lab/rikai/target/scala-2.12/rikai_2.12-0.0.13-SNAPSHOT.pom
[info] :: delivering :: ai.eto#rikai_2.12;0.0.13-SNAPSHOT :: 0.0.13-SNAPSHOT :: integration :: Mon Nov 29 14:58:34 CST 2021
[info] 	delivering ivy file to /Users/renkaige/renkai-lab/rikai/target/scala-2.12/ivy-0.0.13-SNAPSHOT.xml
[info] 	published rikai_2.12 to /Users/renkaige/.ivy2/local/ai.eto/rikai_2.12/0.0.13-SNAPSHOT/poms/rikai_2.12.pom
[info] 	published rikai_2.12 to /Users/renkaige/.ivy2/local/ai.eto/rikai_2.12/0.0.13-SNAPSHOT/jars/rikai_2.12.jar
[info] 	published rikai_2.12 to /Users/renkaige/.ivy2/local/ai.eto/rikai_2.12/0.0.13-SNAPSHOT/srcs/rikai_2.12-sources.jar
[info] 	published rikai_2.12 to /Users/renkaige/.ivy2/local/ai.eto/rikai_2.12/0.0.13-SNAPSHOT/docs/rikai_2.12-javadoc.jar
[info] 	published ivy to /Users/renkaige/.ivy2/local/ai.eto/rikai_2.12/0.0.13-SNAPSHOT/ivys/ivy.xml
[success] Total time: 1 s, completed Nov 29, 2021, 2:58:34 PM
(rikai) ➜  rikai git:(master) ✗ pytest -s python/tests/spark/sql/codegen/test_mlflow_registry.py::test_mlflow_model_from_model_version
====================================================================== test session starts =======================================================================
platform darwin -- Python 3.8.12, pytest-6.2.5, py-1.11.0, pluggy-1.0.0
rootdir: /Users/renkaige/renkai-lab/rikai/python, configfile: pytest.ini
plugins: requests-mock-1.9.3, anyio-3.4.0, timeout-2.0.1
collected 1 item

python/tests/spark/sql/codegen/test_mlflow_registry.py Warning: Ignoring non-Spark config property: rikai.sql.ml.registry.test.impl
Warning: Ignoring non-Spark config property: fs.s3a.impl
Warning: Ignoring non-Spark config property: fs.s3a.aws.credentials.provider
Warning: Ignoring non-Spark config property: com.amazonaws.services.s3.enableV4
Warning: Ignoring non-Spark config property: fs.AbstractFileSystem.s3a.impl
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.apache.spark.unsafe.Platform (file:/Users/renkaige/miniconda/envs/rikai/lib/python3.8/site-packages/pyspark/jars/spark-unsafe_2.12-3.1.2.jar) to constructor java.nio.DirectByteBuffer(long,int)
WARNING: Please consider reporting this to the maintainers of org.apache.spark.unsafe.Platform
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
:: loading settings :: url = jar:file:/Users/renkaige/miniconda/envs/rikai/lib/python3.8/site-packages/pyspark/jars/ivy-2.4.0.jar!/org/apache/ivy/core/settings/ivysettings.xml
Ivy Default Cache set to: /Users/renkaige/.ivy2/cache
The jars for the packages stored in: /Users/renkaige/.ivy2/jars
org.apache.hadoop#hadoop-aws added as a dependency
ai.eto#rikai_2.12 added as a dependency
:: resolving dependencies :: org.apache.spark#spark-submit-parent-8bed0f45-38c7-4ca6-a3b7-726626cfb35e;1.0
	confs: [default]
	found org.apache.hadoop#hadoop-aws;3.2.0 in central
	found com.amazonaws#aws-java-sdk-bundle;1.11.375 in central
	found ai.eto#rikai_2.12;0.0.13-SNAPSHOT in local-ivy-cache
	found org.antlr#antlr4-runtime;4.8-1 in spark-list
	found org.xerial.snappy#snappy-java;1.1.8.4 in central
	found org.apache.logging.log4j#log4j-api-scala_2.12;12.0 in central
	found org.scala-lang#scala-reflect;2.12.10 in spark-list
	found org.apache.logging.log4j#log4j-api;2.13.2 in central
	found io.circe#circe-core_2.12;0.12.3 in central
	found io.circe#circe-numbers_2.12;0.12.3 in central
	found org.typelevel#cats-core_2.12;2.0.0 in central
	found org.typelevel#cats-macros_2.12;2.0.0 in central
	found org.typelevel#cats-kernel_2.12;2.0.0 in central
	found io.circe#circe-generic_2.12;0.12.3 in central
	found com.chuusai#shapeless_2.12;2.3.3 in spark-list
	found org.typelevel#macro-compat_2.12;1.1.1 in spark-list
	found io.circe#circe-parser_2.12;0.12.3 in central
	found io.circe#circe-jawn_2.12;0.12.3 in central
	found org.typelevel#jawn-parser_2.12;0.14.2 in central
	found org.apache.logging.log4j#log4j-core;2.13.0 in central
downloading /Users/renkaige/.ivy2/local/ai.eto/rikai_2.12/0.0.13-SNAPSHOT/jars/rikai_2.12.jar ...
	[SUCCESSFUL ] ai.eto#rikai_2.12;0.0.13-SNAPSHOT!rikai_2.12.jar (2ms)
:: resolution report :: resolve 3689ms :: artifacts dl 17ms
	:: modules in use:
	ai.eto#rikai_2.12;0.0.13-SNAPSHOT from local-ivy-cache in [default]
	com.amazonaws#aws-java-sdk-bundle;1.11.375 from central in [default]
	com.chuusai#shapeless_2.12;2.3.3 from spark-list in [default]
	io.circe#circe-core_2.12;0.12.3 from central in [default]
	io.circe#circe-generic_2.12;0.12.3 from central in [default]
	io.circe#circe-jawn_2.12;0.12.3 from central in [default]
	io.circe#circe-numbers_2.12;0.12.3 from central in [default]
	io.circe#circe-parser_2.12;0.12.3 from central in [default]
	org.antlr#antlr4-runtime;4.8-1 from spark-list in [default]
	org.apache.hadoop#hadoop-aws;3.2.0 from central in [default]
	org.apache.logging.log4j#log4j-api;2.13.2 from central in [default]
	org.apache.logging.log4j#log4j-api-scala_2.12;12.0 from central in [default]
	org.apache.logging.log4j#log4j-core;2.13.0 from central in [default]
	org.scala-lang#scala-reflect;2.12.10 from spark-list in [default]
	org.typelevel#cats-core_2.12;2.0.0 from central in [default]
	org.typelevel#cats-kernel_2.12;2.0.0 from central in [default]
	org.typelevel#cats-macros_2.12;2.0.0 from central in [default]
	org.typelevel#jawn-parser_2.12;0.14.2 from central in [default]
	org.typelevel#macro-compat_2.12;1.1.1 from spark-list in [default]
	org.xerial.snappy#snappy-java;1.1.8.4 from central in [default]
	---------------------------------------------------------------------
	|                  |            modules            ||   artifacts   |
	|       conf       | number| search|dwnlded|evicted|| number|dwnlded|
	---------------------------------------------------------------------
	|      default     |   20  |   2   |   2   |   0   ||   20  |   1   |
	---------------------------------------------------------------------

:: problems summary ::
:::: ERRORS
	unknown resolver null


:: USE VERBOSE OR DEBUG MESSAGE LEVEL FOR MORE DETAILS
:: retrieving :: org.apache.spark#spark-submit-parent-8bed0f45-38c7-4ca6-a3b7-726626cfb35e
	confs: [default]
	0 artifacts copied, 20 already retrieved (0kB/14ms)
21/11/29 14:58:48 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
2021-11-29 14:58:59,878 INFO Rikai (callback_service.py:54): Spark callback server started
2021-11-29 14:58:59,883 INFO Rikai (callback_service.py:113): Rikai Python CallbackService is registered to SparkSession
2021/11/29 14:59:01 INFO mlflow.store.db.utils: Creating initial MLflow database tables...
2021/11/29 14:59:01 INFO mlflow.store.db.utils: Updating database tables
INFO  [alembic.runtime.migration] Context impl SQLiteImpl.
INFO  [alembic.runtime.migration] Will assume non-transactional DDL.
INFO  [alembic.runtime.migration] Running upgrade  -> 451aebb31d03, add metric step
INFO  [alembic.runtime.migration] Running upgrade 451aebb31d03 -> 90e64c465722, migrate user column to tags
INFO  [alembic.runtime.migration] Running upgrade 90e64c465722 -> 181f10493468, allow nulls for metric values
INFO  [alembic.runtime.migration] Running upgrade 181f10493468 -> df50e92ffc5e, Add Experiment Tags Table
INFO  [alembic.runtime.migration] Running upgrade df50e92ffc5e -> 7ac759974ad8, Update run tags with larger limit
INFO  [alembic.runtime.migration] Running upgrade 7ac759974ad8 -> 89d4b8295536, create latest metrics table
INFO  [89d4b8295536_create_latest_metrics_table_py] Migration complete!
INFO  [alembic.runtime.migration] Running upgrade 89d4b8295536 -> 2b4d017a5e9b, add model registry tables to db
INFO  [2b4d017a5e9b_add_model_registry_tables_to_db_py] Adding registered_models and model_versions tables to database.
INFO  [2b4d017a5e9b_add_model_registry_tables_to_db_py] Migration complete!
INFO  [alembic.runtime.migration] Running upgrade 2b4d017a5e9b -> cfd24bdc0731, Update run status constraint with killed
INFO  [alembic.runtime.migration] Running upgrade cfd24bdc0731 -> 0a8213491aaa, drop_duplicate_killed_constraint
WARNI [0a8213491aaa_drop_duplicate_killed_constraint_py] Failed to drop check constraint. Dropping check constraints may not be supported by your SQL database. Exception content: No support for ALTER of constraints in SQLite dialectPlease refer to the batch mode feature which allows for SQLite migrations using a copy-and-move strategy.
INFO  [alembic.runtime.migration] Running upgrade 0a8213491aaa -> 728d730b5ebd, add registered model tags table
INFO  [alembic.runtime.migration] Running upgrade 728d730b5ebd -> 27a6a02d2cf1, add model version tags table
INFO  [alembic.runtime.migration] Running upgrade 27a6a02d2cf1 -> 84291f40a231, add run_link to model_version
INFO  [alembic.runtime.migration] Running upgrade 84291f40a231 -> a8c4a736bde6, allow nulls for run_id
INFO  [alembic.runtime.migration] Running upgrade a8c4a736bde6 -> 39d1c3be5f05, add_is_nan_constraint_for_metrics_tables_if_necessary
INFO  [alembic.runtime.migration] Running upgrade 39d1c3be5f05 -> c48cb773bb87, reset_default_value_for_is_nan_in_metrics_table_for_mysql
INFO  [alembic.runtime.migration] Context impl SQLiteImpl.
INFO  [alembic.runtime.migration] Will assume non-transactional DDL.
Successfully registered model 'rikai-test'.
2021/11/29 14:59:05 INFO mlflow.tracking._model_registry.client: Waiting up to 300 seconds for model version to finish creation.                     Model name: rikai-test, version 1
Created version '1' of model 'rikai-test'.
Successfully registered model 'vanilla-mlflow'.
2021/11/29 14:59:08 INFO mlflow.tracking._model_registry.client: Waiting up to 300 seconds for model version to finish creation.                     Model name: vanilla-mlflow, version 1
Created version '1' of model 'vanilla-mlflow'.
Successfully registered model 'vanilla-mlflow-no-tags'.
2021/11/29 14:59:11 INFO mlflow.tracking._model_registry.client: Waiting up to 300 seconds for model version to finish creation.                     Model name: vanilla-mlflow-no-tags, version 1
Created version '1' of model 'vanilla-mlflow-no-tags'.
Successfully registered model 'vanilla-mlflow-wrong-tags'.
2021/11/29 14:59:13 INFO mlflow.tracking._model_registry.client: Waiting up to 300 seconds for model version to finish creation.                     Model name: vanilla-mlflow-wrong-tags, version 1
Created version '1' of model 'vanilla-mlflow-wrong-tags'.
2021-11-29 14:59:15,196 INFO Rikai (mlflow_registry.py:223): Resolving model resnet_m_fizz from mlflow:/rikai-test/1
2021-11-29 14:59:15,355 INFO Rikai (base.py:207): Created model inference pandas_udf with name resnet_m_fizz_a78a3cc5
/Users/renkaige/miniconda/envs/rikai/lib/python3.8/site-packages/torch/functional.py:445: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at  ../aten/src/ATen/native/TensorShape.cpp:2157.)
  return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
[Stage 1:>                                                          (0 + 1) / 1]
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ Timeout +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Stack of Thread-3 (123145423126528) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  File "/Users/renkaige/miniconda/envs/rikai/lib/python3.8/threading.py", line 890, in _bootstrap
    self._bootstrap_inner()
  File "/Users/renkaige/miniconda/envs/rikai/lib/python3.8/threading.py", line 932, in _bootstrap_inner
    self.run()
  File "/Users/renkaige/miniconda/envs/rikai/lib/python3.8/site-packages/py4j/java_gateway.py", line 2375, in run
    command = smart_decode(self.input.readline())[:-1]
  File "/Users/renkaige/miniconda/envs/rikai/lib/python3.8/socket.py", line 669, in readinto
    return self._sock.recv_into(b)

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Stack of Thread-2 (123145402580992) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  File "/Users/renkaige/miniconda/envs/rikai/lib/python3.8/threading.py", line 890, in _bootstrap
    self._bootstrap_inner()
  File "/Users/renkaige/miniconda/envs/rikai/lib/python3.8/threading.py", line 932, in _bootstrap_inner
    self.run()
  File "/Users/renkaige/miniconda/envs/rikai/lib/python3.8/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/Users/renkaige/miniconda/envs/rikai/lib/python3.8/site-packages/py4j/java_gateway.py", line 2259, in run
    readable, writable, errored = select.select(

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Stack of Thread-1 (123145385791488) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  File "/Users/renkaige/miniconda/envs/rikai/lib/python3.8/threading.py", line 890, in _bootstrap
    self._bootstrap_inner()
  File "/Users/renkaige/miniconda/envs/rikai/lib/python3.8/threading.py", line 932, in _bootstrap_inner
    self.run()
  File "/Users/renkaige/miniconda/envs/rikai/lib/python3.8/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/Users/renkaige/miniconda/envs/rikai/lib/python3.8/socketserver.py", line 237, in serve_forever
    self._handle_request_noblock()
  File "/Users/renkaige/miniconda/envs/rikai/lib/python3.8/socketserver.py", line 316, in _handle_request_noblock
    self.process_request(request, client_address)
  File "/Users/renkaige/miniconda/envs/rikai/lib/python3.8/socketserver.py", line 347, in process_request
    self.finish_request(request, client_address)
  File "/Users/renkaige/miniconda/envs/rikai/lib/python3.8/socketserver.py", line 360, in finish_request
    self.RequestHandlerClass(request, client_address, self)
  File "/Users/renkaige/miniconda/envs/rikai/lib/python3.8/socketserver.py", line 747, in __init__
    self.handle()
  File "/Users/renkaige/miniconda/envs/rikai/lib/python3.8/site-packages/pyspark/accumulators.py", line 262, in handle
    poll(accum_updates)
  File "/Users/renkaige/miniconda/envs/rikai/lib/python3.8/site-packages/pyspark/accumulators.py", line 233, in poll
    r, _, _ = select.select([self.rfile], [], [], 1)

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ Timeout +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
F

============================================================================ FAILURES ============================================================================
______________________________________________________________ test_mlflow_model_from_model_version ______________________________________________________________

spark = <pyspark.sql.session.SparkSession object at 0x7ffcb080f460>, mlflow_client = <mlflow.tracking.client.MlflowClient object at 0x7ffc62852610>

    @pytest.mark.timeout(60)
    def test_mlflow_model_from_model_version(
        spark: SparkSession, mlflow_client: MlflowClient
    ):
        # peg to a particular version of a model
        spark.sql("CREATE MODEL resnet_m_fizz USING 'mlflow:/rikai-test/1'")
>       check_ml_predict(spark, "resnet_m_fizz")

python/tests/spark/sql/codegen/test_mlflow_registry.py:136:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
python/tests/spark/sql/codegen/utils.py:48: in check_ml_predict
    predictions.show()
../../miniconda/envs/rikai/lib/python3.8/site-packages/pyspark/sql/dataframe.py:484: in show
    print(self._jdf.showString(n, 20, vertical))
../../miniconda/envs/rikai/lib/python3.8/site-packages/py4j/java_gateway.py:1303: in __call__
    answer = self.gateway_client.send_command(command)
../../miniconda/envs/rikai/lib/python3.8/site-packages/py4j/java_gateway.py:1033: in send_command
    response = connection.send_command(command)
../../miniconda/envs/rikai/lib/python3.8/site-packages/py4j/java_gateway.py:1200: in send_command
    answer = smart_decode(self.stream.readline()[:-1])
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <socket.SocketIO object at 0x7ffcb07fd370>, b = <memory at 0x7ffc6470e7c0>

    def readinto(self, b):
        """Read up to len(b) bytes into the writable buffer *b* and return
        the number of bytes read.  If the socket is non-blocking and no bytes
        are available, None is returned.

        If *b* is non-empty, a 0 return value indicates that the connection
        was shutdown at the other end.
        """
        self._checkClosed()
        self._checkReadable()
        if self._timeout_occurred:
            raise OSError("cannot read from timed out object")
        while True:
            try:
>               return self._sock.recv_into(b)
E               Failed: Timeout >60.0s

../../miniconda/envs/rikai/lib/python3.8/socket.py:669: Failed
======================================================================== warnings summary ========================================================================
../../miniconda/envs/rikai/lib/python3.8/site-packages/pkg_resources/__init__.py:1130
tests/spark/sql/codegen/test_mlflow_registry.py::test_mlflow_model_from_model_version
  /Users/renkaige/miniconda/envs/rikai/lib/python3.8/site-packages/pkg_resources/__init__.py:1130: DeprecationWarning: Use of .. or absolute path in a resource path is not allowed and will raise exceptions in a future release.
    return get_provider(package_or_requirement).get_resource_filename(

../../miniconda/envs/rikai/lib/python3.8/site-packages/mlflow/types/schema.py:49
  /Users/renkaige/miniconda/envs/rikai/lib/python3.8/site-packages/mlflow/types/schema.py:49: DeprecationWarning: `np.object` is a deprecated alias for the builtin `object`. To silence this warning, use `object` by itself. Doing this will not modify any behavior and is safe.
  Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
    binary = (7, np.dtype("bytes"), "BinaryType", np.object)

tests/spark/sql/codegen/test_mlflow_registry.py: 51 warnings
  /Users/renkaige/miniconda/envs/rikai/lib/python3.8/contextlib.py:120: SADeprecationWarning: The Column.copy() method is deprecated and will be removed in a future release. (deprecated since: 1.4)
    next(self.gen)

tests/spark/sql/codegen/test_mlflow_registry.py::test_mlflow_model_from_model_version
tests/spark/sql/codegen/test_mlflow_registry.py::test_mlflow_model_from_model_version
tests/spark/sql/codegen/test_mlflow_registry.py::test_mlflow_model_from_model_version
tests/spark/sql/codegen/test_mlflow_registry.py::test_mlflow_model_from_model_version
tests/spark/sql/codegen/test_mlflow_registry.py::test_mlflow_model_from_model_version
tests/spark/sql/codegen/test_mlflow_registry.py::test_mlflow_model_from_model_version
tests/spark/sql/codegen/test_mlflow_registry.py::test_mlflow_model_from_model_version
  /Users/renkaige/miniconda/envs/rikai/lib/python3.8/contextlib.py:120: SADeprecationWarning: The ColumnCollectionConstraint.copy() method is deprecated and will be removed in a future release. (deprecated since: 1.4)
    next(self.gen)

tests/spark/sql/codegen/test_mlflow_registry.py::test_mlflow_model_from_model_version
tests/spark/sql/codegen/test_mlflow_registry.py::test_mlflow_model_from_model_version
tests/spark/sql/codegen/test_mlflow_registry.py::test_mlflow_model_from_model_version
tests/spark/sql/codegen/test_mlflow_registry.py::test_mlflow_model_from_model_version
tests/spark/sql/codegen/test_mlflow_registry.py::test_mlflow_model_from_model_version
tests/spark/sql/codegen/test_mlflow_registry.py::test_mlflow_model_from_model_version
tests/spark/sql/codegen/test_mlflow_registry.py::test_mlflow_model_from_model_version
  /Users/renkaige/miniconda/envs/rikai/lib/python3.8/contextlib.py:120: SADeprecationWarning: The ForeignKeyConstraint.copy() method is deprecated and will be removed in a future release. (deprecated since: 1.4)
    next(self.gen)

tests/spark/sql/codegen/test_mlflow_registry.py::test_mlflow_model_from_model_version
tests/spark/sql/codegen/test_mlflow_registry.py::test_mlflow_model_from_model_version
tests/spark/sql/codegen/test_mlflow_registry.py::test_mlflow_model_from_model_version
  /Users/renkaige/miniconda/envs/rikai/lib/python3.8/contextlib.py:120: SADeprecationWarning: The CheckConstraint.copy() method is deprecated and will be removed in a future release. (deprecated since: 1.4)
    next(self.gen)

-- Docs: https://docs.pytest.org/en/stable/warnings.html
==================================================================== short test summary info =====================================================================
FAILED python/tests/spark/sql/codegen/test_mlflow_registry.py::test_mlflow_model_from_model_version - Failed: Timeout >60.0s
=========================================================== 1 failed, 71 warnings in 61.03s (0:01:01) ============================================================
21/11/29 14:59:43 ERROR TaskContextImpl: Error in TaskCompletionListener
java.lang.IllegalStateException: Block broadcast_1 not found
	at org.apache.spark.storage.BlockInfoManager.$anonfun$unlock$3(BlockInfoManager.scala:293)
	at scala.Option.getOrElse(Option.scala:189)
	at org.apache.spark.storage.BlockInfoManager.unlock(BlockInfoManager.scala:293)
	at org.apache.spark.storage.BlockManager.releaseLock(BlockManager.scala:1196)
	at org.apache.spark.broadcast.TorrentBroadcast.$anonfun$releaseBlockManagerLock$1(TorrentBroadcast.scala:287)
	at org.apache.spark.broadcast.TorrentBroadcast.$anonfun$releaseBlockManagerLock$1$adapted(TorrentBroadcast.scala:287)
	at org.apache.spark.TaskContext$$anon$1.onTaskCompletion(TaskContext.scala:125)
	at org.apache.spark.TaskContextImpl.$anonfun$markTaskCompleted$1(TaskContextImpl.scala:124)
	at org.apache.spark.TaskContextImpl.$anonfun$markTaskCompleted$1$adapted(TaskContextImpl.scala:124)
	at org.apache.spark.TaskContextImpl.$anonfun$invokeListeners$1(TaskContextImpl.scala:137)
	at org.apache.spark.TaskContextImpl.$anonfun$invokeListeners$1$adapted(TaskContextImpl.scala:135)
	at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
	at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
	at org.apache.spark.TaskContextImpl.invokeListeners(TaskContextImpl.scala:135)
	at org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:124)
	at org.apache.spark.scheduler.Task.run(Task.scala:141)
	at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:497)
	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1439)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:500)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	at java.base/java.lang.Thread.run(Thread.java:829)
21/11/29 14:59:43 ERROR Utils: Uncaught exception in thread Executor task launch worker for task 0.0 in stage 1.0 (TID 1)
java.lang.NullPointerException
	at org.apache.spark.scheduler.Task.$anonfun$run$2(Task.scala:152)
	at org.apache.spark.util.Utils$.tryLogNonFatalError(Utils.scala:1419)
	at org.apache.spark.scheduler.Task.run(Task.scala:150)
	at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:497)
	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1439)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:500)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	at java.base/java.lang.Thread.run(Thread.java:829)
(rikai) ➜  rikai git:(master) ✗ 21/11/29 14:59:43 ERROR Executor: Exception in task 0.0 in stage 1.0 (TID 1): Block broadcast_1 not found

Previous exception in task: Python worker exited unexpectedly (crashed)
	org.apache.spark.api.python.BasePythonRunner$ReaderIterator$$anonfun$1.applyOrElse(PythonRunner.scala:550)
	org.apache.spark.api.python.BasePythonRunner$ReaderIterator$$anonfun$1.applyOrElse(PythonRunner.scala:539)
	scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:38)
	org.apache.spark.sql.execution.python.PythonArrowOutput$$anon$1.read(PythonArrowOutput.scala:105)
	org.apache.spark.sql.execution.python.PythonArrowOutput$$anon$1.read(PythonArrowOutput.scala:49)
	org.apache.spark.api.python.BasePythonRunner$ReaderIterator.hasNext(PythonRunner.scala:470)
	org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
	scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:489)
	scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:458)
	org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage2.processNext(Unknown Source)
	org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
	org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:755)
	org.apache.spark.sql.execution.SparkPlan.$anonfun$getByteArrayRdd$1(SparkPlan.scala:345)
	org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:898)
	org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:898)
	org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
	org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
	org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
	org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
	org.apache.spark.scheduler.Task.run(Task.scala:131)
	org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:497)
	org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1439)
	org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:500)
	java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	java.base/java.lang.Thread.run(Thread.java:829)

— Renkai · Nov 29 '21, 07:11

sbt publishLocal
cd python
pip install -e .[all]
pytest -s tests/spark/sql/codegen/test_mlflow_registry.py::test_mlflow_model_from_model_version

Works fine for me.

The unit tests download a pre-trained model from the internet, so make sure your network connection is stable.
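
If the download is what eats the 60-second budget, one workaround is to warm the cache before running the test. A minimal sketch, assuming the suite pulls a torchvision ResNet through torch hub (the model name resnet_m_fizz in the log suggests a ResNet; the exact model the suite uses may differ):

import torchvision.models as models

# Hypothetical warm-up: the first call downloads the weights to
# ~/.cache/torch/hub/checkpoints/; subsequent runs load from that cache,
# so the test no longer spends its timeout budget on the network.
models.resnet50(pretrained=True)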

— da-tubi · Dec 06 '21, 03:12

Is there a way to check whether the pre-trained model downloaded correctly?

— Renkai · Dec 13 '21, 13:12
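
One simple check is to try deserializing whatever checkpoints are already cached; a truncated or corrupt download will raise on load. A sketch, assuming the weights sit in torch hub's default cache directory:

import glob
import os

import torch

# Hypothetical integrity check: torch.load raises on a corrupt or
# truncated checkpoint, so a clean pass means the cached bytes are intact.
for path in glob.glob(os.path.expanduser("~/.cache/torch/hub/checkpoints/*.pth")):
    torch.load(path, map_location="cpu")
    print(f"OK: {path}")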

I found that it passes after I increased the time limit.

pytest -s tests/spark/sql/codegen/test_mlflow_registry.py::test_mlflow_model_from_model_version

=========================================================== 1 passed, 71 warnings in 98.96s (0:01:38) ============================================================
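
For anyone hitting the same limit: the test pins its timeout with a pytest-timeout marker, and the per-test marker takes precedence over the plugin's --timeout command-line option, so the decorator in the test file is the place to raise it. A sketch with an arbitrary larger value:

# In python/tests/spark/sql/codegen/test_mlflow_registry.py; 180 s is an
# arbitrary choice, large enough to cover a cold model download.
@pytest.mark.timeout(180)
def test_mlflow_model_from_model_version(
    spark: SparkSession, mlflow_client: MlflowClient
):
    # peg to a particular version of a model
    spark.sql("CREATE MODEL resnet_m_fizz USING 'mlflow:/rikai-test/1'")
    check_ml_predict(spark, "resnet_m_fizz")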

— Renkai · Dec 13 '21, 13:12