[BUG] mlflow R package only returns `Error in wait_for`
Issues Policy acknowledgement
- [X] I have read and agree to submit bug reports in accordance with the issues policy
Where did you encounter this bug?
Local machine
Willingness to contribute
No. I cannot contribute a bug fix at this time.
MLflow version
- Client: version 2.11.3
System information
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Windows 11 Pro
- Python version: 3.11.3
Describe the problem
Nearly all R commands return the following error:
Error in wait_for(function() mlflow_rest("experiments", "search", client = client, :
Operation failed after waiting for 10 seconds
This occurs on a fresh install of mlflow using `pip install mlflow` and a fresh install of mlflow for R using `install.packages("mlflow")`.
The R package version is 2.11.1.
Tracking information
MLflow version: 2.11.3
Tracking URI: file:///C:/Users/XXX/Git/model-development/mlruns
Artifact URI: mlflow-artifacts:/0/63092125fe6b4133af90ef19d22f2a0f/artifacts
System information: Windows 10.0.22631
Python version: 3.11.3
MLflow version: 2.11.3
MLflow module location: C:\PYTHON~1\Lib\site-packages\mlflow\__init__.py
Tracking URI: file:///C:/Users/XXX/Git/model-development/mlruns
Registry URI: file:///C:/Users/XXX/Git/model-development/mlruns
Active experiment ID: 0
Active run ID: 63092125fe6b4133af90ef19d22f2a0f
Active run artifact URI: mlflow-artifacts:/0/63092125fe6b4133af90ef19d22f2a0f/artifacts
MLflow dependencies:
Flask: 3.0.0
Jinja2: 3.1.2
aiohttp: 3.8.6
alembic: 1.13.1
boto3: 1.26.148
botocore: 1.29.148
click: 8.1.3
cloudpickle: 3.0.0
docker: 7.0.0
entrypoints: 0.4
fastapi: 0.104.1
gitpython: 3.1.40
graphene: 3.3
importlib-metadata: 6.8.0
markdown: 3.6
matplotlib: 3.8.2
numpy: 1.24.3
packaging: 23.1
pandas: 2.0.1
protobuf: 4.25.0
pyarrow: 14.0.0
pydantic: 1.10.13
pytz: 2023.3
pyyaml: 5.4.1
querystring-parser: 1.2.4
requests: 2.31.0
scikit-learn: 1.4.2
scipy: 1.10.1
sqlalchemy: 2.0.23
sqlparse: 0.4.4
tiktoken: 0.4.0
uvicorn: 0.24.0.post1
virtualenv: 20.23.0
waitress: 3.0.0
watchfiles: 0.21.0
Code to reproduce issue
library(mlflow)
server <- mlflow::mlflow_server()
mlflow::mlflow_ui()
Stack trace
6: stop("Operation failed after waiting for ", wait, " seconds")
5: wait_for(function() mlflow_rest("experiments", "search", client = client,
verb = "POST", data = list(max_results = 1)), getOption("mlflow.connect.wait",
10), getOption("mlflow.connect.sleep", 1))
4: mlflow_validate_server(client)
3: mlflow_client()
2: mlflow_ui.NULL()
1: mlflow::mlflow_ui()
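The trace shows `mlflow_validate_server` polling the `experiments/search` endpoint through `wait_for`, with the timeout and retry interval read from the `mlflow.connect.wait` (default 10) and `mlflow.connect.sleep` (default 1) options. The Python sketch below illustrates that retry-until-timeout pattern. It is not the R package's source; the function body is an assumption based only on the trace above.

```python
import time


def wait_for(check, wait=10.0, sleep=1.0):
    """Call check() until it stops raising, or give up after `wait` seconds.

    Illustrative only: mirrors the behaviour suggested by the stack trace,
    not the actual implementation in the mlflow R package.
    """
    deadline = time.monotonic() + wait
    while True:
        try:
            return check()
        except Exception:
            # Give up once the deadline has passed, otherwise retry.
            if time.monotonic() >= deadline:
                raise RuntimeError(
                    f"Operation failed after waiting for {wait:g} seconds"
                )
            time.sleep(sleep)
```

If the server is only slow to start, raising the `mlflow.connect.wait` option in R before creating the client may be enough. If nothing is listening at the server URL at all, a longer wait will not help.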
Other info / logs
R version 4.3.2 (2023-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 11 x64 (build 22631)
Matrix products: default
locale:
[1] LC_COLLATE=English_United States.utf8 LC_CTYPE=English_United States.utf8
[3] LC_MONETARY=English_United States.utf8 LC_NUMERIC=C
[5] LC_TIME=English_United States.utf8
time zone: America/New_York
tzcode source: internal
attached base packages:
[1] stats graphics grDevices datasets utils methods base
other attached packages:
[1] mlflow_2.11.1
loaded via a namespace (and not attached):
[1] vctrs_0.6.5 swagger_3.33.1 httr_1.4.7 cli_3.6.2 rlang_1.1.3
[6] zeallot_0.1.0 processx_3.8.3 png_0.1-8 purrr_1.0.2 renv_1.0.5
[11] promises_1.2.1 jsonlite_1.8.8 glue_1.7.0 openssl_2.1.1 forge_0.2.0
[16] askpass_1.2.0 httpuv_1.6.14 ps_1.7.6 fansi_1.0.6 grid_4.3.2
[21] ini_0.3.1 base64enc_0.1-3 yaml_2.3.8 lifecycle_1.0.4 compiler_4.3.2
[26] fs_1.6.3 Rcpp_1.0.12 rstudioapi_0.15.0 later_1.3.2 lattice_0.21-9
[31] R6_2.5.1 utf8_1.2.4 reticulate_1.35.0 pillar_1.9.0 curl_5.2.1
[36] magrittr_2.0.3 Matrix_1.6-5 tools_4.3.2 withr_3.0.0
What component(s) does this bug affect?
- [ ] `area/artifacts`: Artifact stores and artifact logging
- [ ] `area/build`: Build and test infrastructure for MLflow
- [ ] `area/deployments`: MLflow Deployments client APIs, server, and third-party Deployments integrations
- [ ] `area/docs`: MLflow documentation pages
- [ ] `area/examples`: Example code
- [ ] `area/model-registry`: Model Registry service, APIs, and the fluent client calls for Model Registry
- [ ] `area/models`: MLmodel format, model serialization/deserialization, flavors
- [ ] `area/recipes`: Recipes, Recipe APIs, Recipe configs, Recipe Templates
- [ ] `area/projects`: MLproject format, project running backends
- [ ] `area/scoring`: MLflow Model server, model deployment tools, Spark UDFs
- [ ] `area/server-infra`: MLflow Tracking server backend
- [ ] `area/tracking`: Tracking Service, tracking client APIs, autologging
What interface(s) does this bug affect?
- [ ] `area/uiux`: Front-end, user experience, plotting, JavaScript, JavaScript dev server
- [ ] `area/docker`: Docker use across MLflow's components, such as MLflow Projects and MLflow Models
- [ ] `area/sqlalchemy`: Use of SQLAlchemy in the Tracking Service or Model Registry
- [ ] `area/windows`: Windows support
What language(s) does this bug affect?
- [X] `language/r`: R APIs and clients
- [ ] `language/java`: Java APIs and clients
- [ ] `language/new`: Proposals for new client languages
What integration(s) does this bug affect?
- [ ] `integrations/azure`: Azure and Azure ML integrations
- [ ] `integrations/sagemaker`: SageMaker integrations
- [ ] `integrations/databricks`: Databricks integrations
@acircleda I can't repro your problem on MLflow 2.11.3, could you double check you installed it correctly? Are the environment variables (MLFLOW_BIN, MLFLOW_PYTHON_BIN) set correctly?
I can correctly open mlflow ui with below command:
I suspect (from some previous issues) this may be a Windows-specific issue.
I believe the environment variables are set correctly:
> Sys.which('python')
python
"C:\\PYTHON~1\\python.exe"
> Sys.which('mlflow')
mlflow
"C:\\PYTHON~1\\Scripts\\mlflow.exe"
Checking these paths, I can confirm that both executables exist.
Some additional info:
If I run `server <- mlflow::mlflow_server()`, the following is returned:
$server_url
[1] "http://127.0.0.1:5000"
$handle
PROCESS 'mlflow.exe', running, pid 22388.
$file_store
[1] "file://C:/Users/XXX/Documents/mlruns"
attr(,"class")
[1] "mlflow_server"
However, no other commands work and http://127.0.0.1:5000 is not accessible. If I switch to the terminal and run `mlflow server --host 127.0.0.1 --port 8080`, the server starts, after which the UI is accessible.
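One way to separate "the process is running" from "the server is actually listening" is to test the TCP port directly. The following is a minimal Python sketch, assuming the two addresses mentioned above (port 5000 for the server started from R, port 8080 for the one started from the terminal); `port_is_open` is a hypothetical helper name, not part of MLflow.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

For example, `port_is_open("127.0.0.1", 5000)` returning `False` while the R handle still reports the process as running would suggest the server started from R never bound to its port.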
Just as a follow-up, I ran the following code:
Sys.setenv(MLFLOW_BIN=Sys.which("mlflow"))
Sys.setenv(MLFLOW_PYTHON_BIN=Sys.which("python"))
and verified using `Sys.getenv()` that these existed in the system environment, which they did. Running the following commands still resulted in the same error:
library(mlflow)
mlflow_client(tracking_uri = NULL)
@acircleda Could you check whether the Python code below works? I want to see whether your tracking server is broken or whether it's an R installation problem.
import mlflow
with mlflow.start_run():
    mlflow.log_param("test", "test")
This seems to work. I did a fresh install into a venv-controlled environment on python 3.11.3
>>> import mlflow
>>> with mlflow.start_run():
...     mlflow.log_param("test", "test")
...
'test'
I also ran this same Python code in RStudio (via reticulate) and got the same result.
Tried the same command in R and got the same error reported in this ticket.
library(mlflow)
mlflow::mlflow_log_param("test", "test")
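To check the server independently of the R client, the same probe that appears in the stack trace (a POST to `experiments/search` with `max_results = 1`) can be sent by hand. This is a hedged Python sketch: the `/api/2.0/mlflow/` prefix is the MLflow REST API path as I understand it and is not shown anywhere in this thread, and `build_probe` and `probe` are hypothetical helper names.

```python
import json
import urllib.error
import urllib.request


def build_probe(server_url):
    """Build the POST request for experiments/search with max_results=1."""
    # Assumed REST prefix; adjust if your server is mounted elsewhere.
    url = server_url.rstrip("/") + "/api/2.0/mlflow/experiments/search"
    body = json.dumps({"max_results": 1}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def probe(server_url, timeout=5.0):
    """Send the probe; return the HTTP status code, or None if unreachable."""
    try:
        with urllib.request.urlopen(build_probe(server_url), timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status.
        return err.code
    except OSError:
        # Connection refused, timeout, or DNS failure.
        return None
```

A `None` result from `probe("http://127.0.0.1:5000")` would point at the server not being reachable at all, whereas any status code means the server is answering and the problem is more likely on the R client side.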
What if you run R in a terminal within the same Python venv and try those R commands?
Using `R -e '...'` commands in the terminal, I get the same error as above.
@mlflow/mlflow-team Please assign a maintainer and start triaging this issue.
Let me know if there is any other information you would like me to provide.