[BUG] - conda-store builds taking forever.
Describe the bug
I'm specifying a conda-store build in our deployed instance, and it appears to take forever (this particular one has been running for 7 hours) without any logs. When looking at the worker pod, there are a bunch of messages:
Worker pod messages
conda-store-worker celery.exceptions.ChordError: Dependency build-1-constructor-installer raised CalledProcessError(1, ['python', '-m', 'constructor', '--help'])
conda-store-worker [2025-06-27 01:14:31,811: INFO/ForkPoolWorker-4] Task task_build_conda_pack[build-1-conda-pack] succeeded in 72.24953944900017s: None
conda-store-worker [2025-06-28 00:33:58,738: INFO/MainProcess] Task task_update_storage_metrics[469226ac-8a0d-4e49-b73c-5c1d1dee2b92] received
conda-store-worker [2025-06-28 00:33:58,765: INFO/ForkPoolWorker-2] Task task_update_storage_metrics[469226ac-8a0d-4e49-b73c-5c1d1dee2b92] succeeded in 0.02669937099562958s: None
conda-store-worker [2025-06-28 00:33:58,765: INFO/MainProcess] Task task_build_conda_environment[build-3-environment] received
conda-store-worker [2025-06-28 00:34:00,745: WARNING/ForkPoolWorker-2] CONDA_FLAGS=--strict-channel-priority
conda-store-worker [2025-06-28 00:34:00,750: WARNING/ForkPoolWorker-2] Locking dependencies for ['linux-64']...
conda-store-worker [2025-06-28 00:34:00,751: INFO/ForkPoolWorker-2] linux-64 using specs ['dandi', 'datalad', 'ipykernel']
conda-store-worker [2025-06-28 00:34:16,598: WARNING/ForkPoolWorker-2] - Install lock using:
conda-store-worker [2025-06-28 00:34:16,598: WARNING/ForkPoolWorker-2]
conda-store-worker [2025-06-28 00:34:16,598: WARNING/ForkPoolWorker-2] conda-lock install --name YOURENV /tmp/tmpr8mbu496/conda-lock.yaml
conda-store-worker [2025-06-28 00:34:16,598: WARNING/ForkPoolWorker-2] Rendering lockfile(s) for linux-64...
conda-store-worker [2025-06-28 00:34:16,600: WARNING/ForkPoolWorker-2] - Install lock using :
conda-store-worker [2025-06-28 00:34:16,600: WARNING/ForkPoolWorker-2]
conda-store-worker [2025-06-28 00:34:16,600: WARNING/ForkPoolWorker-2] conda create --name YOURENV --file conda-linux-64.lock
conda-store-worker /opt/conda/lib/python3.12/site-packages/conda/base/context.py:198: FutureWarning: Adding 'defaults' to channel list implicitly is deprecated and will be removed in 25.3.
conda-store-worker
conda-store-worker To remove this warning, please choose a default channel explicitly with conda's regular configuration system, e.g. by adding 'defaults' to the list of channels:
conda-store-worker
conda-store-worker conda config --add channels defaults
conda-store-worker
conda-store-worker For more information see https://docs.conda.io/projects/conda/en/stable/user-guide/configuration/use-condarc.html
conda-store-worker
conda-store-worker deprecated.topic(
conda-store-worker [2025-06-28 00:36:23,765: INFO/ForkPoolWorker-2] building conda_prefix=/home/conda/yarikoptic/124bb6d6-1751070838-3-test-1 took 144.961 [s]
conda-store-worker [2025-06-28 00:36:29,140: INFO/MainProcess] Task task_build_conda_env_export[build-3-conda-env-export] received
conda-store-worker [2025-06-28 00:36:29,141: INFO/MainProcess] Task task_build_conda_pack[build-3-conda-pack] received
conda-store-worker [2025-06-28 00:36:29,142: INFO/MainProcess] Task task_build_constructor_installer[build-3-constructor-installer] received
conda-store-worker [2025-06-28 00:36:29,143: INFO/ForkPoolWorker-2] Task task_build_conda_environment[build-3-environment] succeeded in 150.37720980399172s: None
conda-store-worker [2025-06-28 00:36:32,191: INFO/ForkPoolWorker-4] Task task_build_conda_env_export[build-3-conda-env-export] succeeded in 3.0499828989995876s: None
conda-store-worker [2025-06-28 00:36:52,170: INFO/ForkPoolWorker-3] packaging archive of conda environment=/home/conda/yarikoptic/124bb6d6-1751070838-3-test-1 took 23.025 [s]
conda-store-worker [2025-06-28 00:36:52,171: ERROR/ForkPoolWorker-3] Chord '6a78a37e-f335-46b6-8a40-04d54d29f0bd' raised: ChordError("Dependency build-3-constructor-installer raised CalledProcessError(1, ['python', '-m', 'constructor', '--help'])")
conda-store-worker Traceback (most recent call last):
conda-store-worker   File "/opt/conda/envs/conda-store-server/lib/python3.12/site-packages/celery/backends/redis.py", line 528, in on_chord_part_return
conda-store-worker     resl = [unpack(tup, decode) for tup in resl]
conda-store-worker            ^^^^^^^^^^^^^^^^^^^
conda-store-worker   File "/opt/conda/envs/conda-store-server/lib/python3.12/site-packages/celery/backends/redis.py", line 434, in _unpack_chord_result
conda-store-worker     raise ChordError(f'Dependency {tid} raised {retval!r}')
conda-store-worker celery.exceptions.ChordError: Dependency build-3-constructor-installer raised CalledProcessError(1, ['python', '-m', 'constructor', '--help'])
conda-store-worker [2025-06-28 00:36:52,173: INFO/ForkPoolWorker-3] Task task_build_conda_pack[build-3-conda-pack] succeeded in 23.03208869199443s: None
conda-store-worker [2025-06-28 23:32:51,063: INFO/MainProcess] Task task_update_storage_metrics[1c984d85-15d7-4ba2-970a-07ce711083a0] received
conda-store-worker [2025-06-28 23:32:51,071: INFO/ForkPoolWorker-2] Task task_update_storage_metrics[1c984d85-15d7-4ba2-970a-07ce711083a0] succeeded in 0.007957488007377833s: None
conda-store-worker [2025-06-28 23:32:51,071: INFO/MainProcess] Task task_build_conda_environment[build-4-environment] received
conda-store-worker [2025-06-28 23:32:53,091: WARNING/ForkPoolWorker-2] CONDA_FLAGS=--strict-channel-priority
conda-store-worker [2025-06-28 23:32:56,799: WARNING/ForkPoolWorker-2] Locking dependencies for ['linux-64']...
conda-store-worker [2025-06-28 23:32:56,800: INFO/ForkPoolWorker-2] linux-64 using specs ['python >=3.13', 'ipykernel', 'ipywidgets', 'pip *']
Expected behavior
A build completes, provides build log output, and reports its status.
OS and architecture in which you are running Nebari
macOS (ARM)
How to Reproduce the problem?
Added this environment spec through the conda-store UI:
channels:
- conda-forge
- defaults
dependencies:
- python>=3.13
- ipykernel
- ipywidgets
- pip
- pip:
  - dandi
Command output
Versions and dependencies used.
conda 23.11.0
❯ kubectl version
Client Version: v1.32.2
Kustomize Version: v5.5.0
Server Version: v1.31.9-eks-5d4a308
Compute environment
AWS
Integrations
conda-store
Anything else?
In general, we would like to have a few builds available to all users. Being able to monitor the builds (and their failures) would be useful.
I would check the conda-store logs: https://www.nebari.dev/docs/how-tos/access-logs-loki#conda-store-logs
Also confirm that you haven't run out of conda storage. If you have, you can delete some environments, or delete individual builds through the admin UI.
The conda-store admin UI is at /conda-store/admin. You may need to click login.
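If you prefer the CLI over Loki, you can also tail the worker logs directly with kubectl. A minimal sketch, assuming the default `dev` namespace and a worker deployment named `nebari-conda-store-worker` (both are assumptions; adjust to your deployment):

# Sketch: find and tail the conda-store worker logs (namespace/deployment names are assumptions)
kubectl -n dev get pods | grep conda-store-worker
kubectl -n dev logs deploy/nebari-conda-store-worker --tail=200 -f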
Thank you @kcpevey.
Only 3% of storage is being used.
Regarding the logs, I see the build being received by the worker in the worker pod logs, but that's it.
On the admin UI, only the following is shown in the logs:
UI log
starting build of conda environment 2025-07-02 14:09:11.885524 UTC
plugin-conda-lock: lock_environment entrypoint for conda-lock
plugin-conda-lock: Note that the output of `conda config --show` displayed below only reflects settings in the conda configuration file, which might be overridden by variables required to be set by conda-store via the environment. Overridden settings: CONDA_FLAGS=--strict-channel-priority
plugin-conda-lock: Running command: ['mamba', 'info']
plugin-conda-lock: /opt/conda/lib/python3.12/site-packages/conda/base/context.py:198: FutureWarning: Adding 'defaults' to channel list implicitly is deprecated and will be removed in 25.3.
plugin-conda-lock: To remove this warning, please choose a default channel explicitly with conda's regular configuration system, e.g. by adding 'defaults' to the list of channels:
plugin-conda-lock: conda config --add channels defaults
plugin-conda-lock: For more information see https://docs.conda.io/projects/conda/en/stable/user-guide/configuration/use-condarc.html
plugin-conda-lock: deprecated.topic(
plugin-conda-lock: mamba version : 1.5.9
plugin-conda-lock: active environment : None
plugin-conda-lock: user config file : /root/.condarc
plugin-conda-lock: populated config files : /opt/conda/.condarc
plugin-conda-lock: conda version : 24.9.2
plugin-conda-lock: conda-build version : not installed
plugin-conda-lock: python version : 3.12.7.final.0
plugin-conda-lock: solver : libmamba (default)
plugin-conda-lock: virtual packages : __archspec=1=zen2
plugin-conda-lock: __conda=24.9.2=0
plugin-conda-lock: __glibc=2.31=0
plugin-conda-lock: __linux=5.10.237=0
plugin-conda-lock: __unix=0=0
plugin-conda-lock: base environment : /opt/conda (writable)
plugin-conda-lock: conda av data dir : /opt/conda/etc/conda
plugin-conda-lock: conda av metadata url : None
plugin-conda-lock: channel URLs : https://repo.anaconda.com/pkgs/main/linux-64
plugin-conda-lock: https://repo.anaconda.com/pkgs/main/noarch
plugin-conda-lock: https://repo.anaconda.com/pkgs/r/linux-64
plugin-conda-lock: https://repo.anaconda.com/pkgs/r/noarch
plugin-conda-lock: package cache : /opt/conda/pkgs
plugin-conda-lock: /root/.conda/pkgs
plugin-conda-lock: envs directories : /opt/conda/envs
plugin-conda-lock: /root/.conda/envs
plugin-conda-lock: platform : linux-64
plugin-conda-lock: user-agent : conda/24.9.2 requests/2.32.3 CPython/3.12.7 Linux/5.10.237-230.949.amzn2.x86_64 ubuntu/20.04.6 glibc/2.31 solver/libmamba conda-libmamba-solver/24.9.0 libmambapy/1.5.9
plugin-conda-lock: UID:GID : 0:0
plugin-conda-lock: netrc file : None
plugin-conda-lock: offline mode : False
plugin-conda-lock: Running command: ['conda', 'config', '--show']
plugin-conda-lock: /opt/conda/lib/python3.12/site-packages/conda/base/context.py:198: FutureWarning: Adding 'defaults' to channel list implicitly is deprecated and will be removed in 25.3.
plugin-conda-lock: To remove this warning, please choose a default channel explicitly with conda's regular configuration system, e.g. by adding 'defaults' to the list of channels:
plugin-conda-lock: conda config --add channels defaults
plugin-conda-lock: For more information see https://docs.conda.io/projects/conda/en/stable/user-guide/configuration/use-condarc.html
plugin-conda-lock: deprecated.topic(
plugin-conda-lock: add_anaconda_token: True
plugin-conda-lock: add_pip_as_python_dependency: True
plugin-conda-lock: aggressive_update_packages:
plugin-conda-lock: - ca-certificates
plugin-conda-lock: - certifi
plugin-conda-lock: - openssl
plugin-conda-lock: allow_conda_downgrades: False
plugin-conda-lock: allow_cycles: True
plugin-conda-lock: allow_non_channel_urls: False
plugin-conda-lock: allow_softlinks: False
plugin-conda-lock: allowlist_channels: []
plugin-conda-lock: always_copy: False
plugin-conda-lock: always_softlink: False
plugin-conda-lock: always_yes: None
plugin-conda-lock: anaconda_upload: None
plugin-conda-lock: auto_activate_base: True
plugin-conda-lock: auto_stack: 0
plugin-conda-lock: auto_update_conda: True
plugin-conda-lock: bld_path:
plugin-conda-lock: changeps1: True
plugin-conda-lock: channel_alias: https://conda.anaconda.org
plugin-conda-lock: channel_priority: flexible
plugin-conda-lock: channel_settings: []
plugin-conda-lock: channels:
plugin-conda-lock: - defaults
plugin-conda-lock: client_ssl_cert: None
plugin-conda-lock: client_ssl_cert_key: None
plugin-conda-lock: clobber: False
plugin-conda-lock: conda_build: {}
plugin-conda-lock: create_default_packages: []
plugin-conda-lock: croot: /opt/conda/conda-bld
plugin-conda-lock: custom_channels:
plugin-conda-lock: pkgs/main: https://repo.anaconda.com
plugin-conda-lock: pkgs/r: https://repo.anaconda.com
plugin-conda-lock: pkgs/pro: https://repo.anaconda.com
plugin-conda-lock: custom_multichannels:
plugin-conda-lock: defaults:
plugin-conda-lock: - https://repo.anaconda.com/pkgs/main
plugin-conda-lock: - https://repo.anaconda.com/pkgs/r
plugin-conda-lock: local:
plugin-conda-lock: debug: False
plugin-conda-lock: default_channels:
plugin-conda-lock: - https://repo.anaconda.com/pkgs/main
plugin-conda-lock: - https://repo.anaconda.com/pkgs/r
plugin-conda-lock: default_python: 3.12
plugin-conda-lock: default_threads: None
plugin-conda-lock: denylist_channels: []
plugin-conda-lock: deps_modifier: not_set
plugin-conda-lock: dev: False
plugin-conda-lock: disallowed_packages: []
plugin-conda-lock: download_only: False
plugin-conda-lock: dry_run: False
plugin-conda-lock: enable_private_envs: False
plugin-conda-lock: env_prompt: ({default_env})
plugin-conda-lock: envs_dirs:
plugin-conda-lock: - /opt/conda/envs
plugin-conda-lock: - /root/.conda/envs
plugin-conda-lock: envvars_force_uppercase: True
plugin-conda-lock: error_upload_url: https://conda.io/conda-post/unexpected-error
plugin-conda-lock: execute_threads: 1
plugin-conda-lock: experimental: []
plugin-conda-lock: extra_safety_checks: False
plugin-conda-lock: fetch_threads: 5
plugin-conda-lock: force: False
plugin-conda-lock: force_32bit: False
plugin-conda-lock: force_reinstall: False
plugin-conda-lock: force_remove: False
plugin-conda-lock: ignore_pinned: False
plugin-conda-lock: json: False
plugin-conda-lock: local_repodata_ttl: 1
plugin-conda-lock: migrated_channel_aliases: []
plugin-conda-lock: migrated_custom_channels: {}
plugin-conda-lock: no_lock: False
plugin-conda-lock: no_plugins: False
plugin-conda-lock: non_admin_enabled: True
plugin-conda-lock: notify_outdated_conda: True
plugin-conda-lock: number_channel_notices: 5
plugin-conda-lock: offline: False
plugin-conda-lock: override_channels_enabled: True
plugin-conda-lock: path_conflict: clobber
plugin-conda-lock: pinned_packages: []
plugin-conda-lock: pip_interop_enabled: False
plugin-conda-lock: pkgs_dirs:
plugin-conda-lock: - /opt/conda/pkgs
plugin-conda-lock: - /root/.conda/pkgs
plugin-conda-lock: proxy_servers: {}
plugin-conda-lock: quiet: False
plugin-conda-lock: register_envs: True
plugin-conda-lock: remote_backoff_factor: 1
plugin-conda-lock: remote_connect_timeout_secs: 9.15
plugin-conda-lock: remote_max_retries: 3
plugin-conda-lock: remote_read_timeout_secs: 60.0
plugin-conda-lock: repodata_fns:
plugin-conda-lock: - current_repodata.json
plugin-conda-lock: - repodata.json
plugin-conda-lock: repodata_threads: None
plugin-conda-lock: repodata_use_zst: True
plugin-conda-lock: report_errors: None
plugin-conda-lock: reporters:
plugin-conda-lock: - {'backend': 'console', 'output': 'stdout', 'verbosity': 0, 'quiet': False}
plugin-conda-lock: restore_free_channel: False
plugin-conda-lock: rollback_enabled: True
plugin-conda-lock: root_prefix: /opt/conda
plugin-conda-lock: safety_checks: warn
plugin-conda-lock: sat_solver: pycosat
plugin-conda-lock: separate_format_cache: False
plugin-conda-lock: shortcuts: True
plugin-conda-lock: shortcuts_only: []
plugin-conda-lock: show_channel_urls: None
plugin-conda-lock: signing_metadata_url_base: None
plugin-conda-lock: solver: libmamba
plugin-conda-lock: solver_ignore_timestamps: False
plugin-conda-lock: ssl_verify: True
plugin-conda-lock: subdir: linux-64
plugin-conda-lock: subdirs:
plugin-conda-lock: - linux-64
plugin-conda-lock: - noarch
plugin-conda-lock: target_prefix_override:
plugin-conda-lock: trace: False
plugin-conda-lock: track_features: []
plugin-conda-lock: unsatisfiable_hints: True
plugin-conda-lock: unsatisfiable_hints_check_depth: 2
plugin-conda-lock: update_modifier: update_specs
plugin-conda-lock: use_index_cache: False
plugin-conda-lock: use_local: False
plugin-conda-lock: use_only_tar_bz2: None
plugin-conda-lock: verbosity: 0
plugin-conda-lock: verify_threads: 1
plugin-conda-lock: Running command: ['conda', 'config', '--show-sources']
plugin-conda-lock: ==> /opt/conda/.condarc <==
plugin-conda-lock: channels: []
It's been building for 20+ minutes at this point with no output shown anywhere to track progress. In k9s, the conda-store worker only shows that it has received the build.
conda-store-worker [2025-07-02 14:09:11,863: INFO/MainProcess] Task task_build_conda_environment[build-4-environment] received
conda-store-worker [2025-07-02 14:09:14,041: WARNING/ForkPoolWorker-3] CONDA_FLAGS=--strict-channel-priority
conda-store-worker [2025-07-02 14:09:17,158: WARNING/ForkPoolWorker-3] Locking dependencies for ['linux-64']...
conda-store-worker [2025-07-02 14:09:17,159: INFO/ForkPoolWorker-3] linux-64 using specs ['python 3.12.*', 'ipykernel', 'ipywidgets', 'pip *']
conda-store-worker [2025-07-02 14:09:39,240: INFO/MainProcess] Terminating build-3-environment (15)
conda-store-worker [2025-07-02 14:09:39,256: INFO/MainProcess] Task task_cleanup_builds[a30c3028-9b2f-4024-994f-bdfc47b3b763] received
conda-store-worker [2025-07-02 14:09:45,336: WARNING/ForkPoolWorker-6] marking build 3 as CANCELED since stuck in BUILDING state and not present on workers
conda-store-worker [2025-07-02 14:09:45,359: INFO/ForkPoolWorker-6] Task task_cleanup_builds[a30c3028-9b2f-4024-994f-bdfc47b3b763] succeeded in 1.1121771300095133s: None
I was able to build a different environment. I suspect this may be a conda resolution issue that is taking a long time.
There is a high chance that this was the case. The logs are divided into three parts, but they need to be successfully processed by the worker before appearing in the UI (they are stored in MinIO as artifacts). One quick check I sometimes do when this happens is to build the environment locally, since my main computer usually has more resources available; if it takes a while to run on my machine, it will take roughly twice as long on conda-store.
That said, 7 hours is too much. I suggest restarting the conda-store worker pod and retrying with minimal changes to the env. Building it in parts might also help, in case one of the dependencies is adding complexity to the solve (I tend to cut the spec in half, then add the remaining deps back and re-solve).
Additionally, regarding conda-forge: it is generally not a good idea to mix defaults and conda-forge, as many packages may end up being resolved from the defaults channel, leading to broken environments and intractable issues. I recommend sticking to conda-forge only if possible (see the sketch below).
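For illustration only, a sketch of a conda-forge-only version of the spec above; the `nodefaults` entry is the standard conda way to ignore the implicit defaults channel:

# Sketch: write a conda-forge-only environment spec ('nodefaults' blocks the implicit defaults channel)
cat > environment.yml <<'EOF'
channels:
  - conda-forge
  - nodefaults
dependencies:
  - python>=3.13
  - ipykernel
  - ipywidgets
  - pip
  - pip:
      - dandi
EOF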
Thanks @viniciusdc. I did end up restricting to conda-forge only, but still no dice. And since the other envs built (e.g. pytorch), I knew the worker pods were fine. Is there a corresponding Docker image in which I could test the setup locally in a shell, so I can see the lock file creation process output?
Next I was going to try exporting the environment from a local Linux install and feeding that to the builder.
> next i was going to try exporting the environment from a local linux install and feeding that to the builder.
I know this was closed out, but I would try this first, since there might be an issue with the dependency resolving stage itself. Finding that might help in understanding why it does not show up in the logs. You can also run conda-store locally: https://conda.store/conda-store/how-tos/install-standalone
But under the hood conda-store runs conda/mamba, so you should also be able to run the env build locally through that (see the sketch below).
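For example, something along these lines on a local linux-64 machine would exercise the same solve (a rough sketch, not the exact commands conda-store runs; `environment.yml` here stands in for the spec submitted through the UI):

# Sketch: reproduce the solve/build locally to see where the time goes
time mamba env create -n test-resolve -f environment.yml   # solve + download + link
mamba env remove -n test-resolve                            # clean up afterwards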
I am checking this right now. Based on my findings so far, the worker node is throttling while performing the build. Usually Kubernetes would give it more resources since there is no limiting factor on the container config; the issue is that, since it's running on the general node group, there simply might not be any more resources available.
You opened a new issue regarding the node instance type changes (https://github.com/nebari-dev/nebari/issues/3093), so I assume you increased the available resources. After re-running your deployment (after ~5 minutes, Keycloak should have been running again), did you try this build again?
I will re-open this issue, because it is a good example of why we need auto-scaling for the conda-store workers.
It wasn't for increasing the size of resources, but for another issue with the ELB failing a health check.
If I can specify resources for the conda-store workers (as in the draft PR you shared), it should trigger a scale-up of one of the other node groups if there is not enough on general. I can also up the max nodes to 2 for general, so that k8s could spread out pods if needed.
> i can also up the max nodes to 2 for general
I don't recommend this one. AWS has an issue with multiple availability zones and PV mounting: if your other node ends up in a different zone than the first one, you will start seeing pods stuck pending due to errors while mounting the volumes.
As a quick "fix" for this env, I recommend building locally and exporting the conda-lock file to pass directly to conda-store, which tells it to skip the solving process completely and should not increase CPU usage (sketch below).
For the general issue with conda-store, that PR will help redirect the load to other nodes that have more resources available. In the meantime, I will re-raise the need for an auto-scaling option for conda-store (we considered KEDA in the past for that).
I was able to build by supplying the conda-lock file, but had to do a few retries through the admin interface.
It seems that if it runs into this connection error, it does not retry.
Error log about the connection error:
action_fetch_and_extract_conda_packages: DOWNLOAD python-3.11.0-he550d4f_1_cpython.conda | 53 of 146
Traceback (most recent call last):
File "/opt/conda/envs/conda-store-server/lib/python3.12/site-packages/urllib3/connectionpool.py", line 716, in urlopen
httplib_response = self._make_request(
^^^^^^^^^^^^^^^^^^^
File "/opt/conda/envs/conda-store-server/lib/python3.12/site-packages/urllib3/connectionpool.py", line 468, in _make_request
six.raise_from(e, None)
File "<string>", line 3, in raise_from
File "/opt/conda/envs/conda-store-server/lib/python3.12/site-packages/urllib3/connectionpool.py", line 463, in _make_request
httplib_response = conn.getresponse()
^^^^^^^^^^^^^^^^^^
File "/opt/conda/envs/conda-store-server/lib/python3.12/http/client.py", line 1428, in getresponse
response.begin()
File "/opt/conda/envs/conda-store-server/lib/python3.12/http/client.py", line 331, in begin
version, status, reason = self._read_status()
^^^^^^^^^^^^^^^^^^^
File "/opt/conda/envs/conda-store-server/lib/python3.12/http/client.py", line 300, in _read_status
raise RemoteDisconnected("Remote end closed connection without"
http.client.RemoteDisconnected: Remote end closed connection without response
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/conda/envs/conda-store-server/lib/python3.12/site-packages/requests/adapters.py", line 667, in send
resp = conn.urlopen(
^^^^^^^^^^^^^
File "/opt/conda/envs/conda-store-server/lib/python3.12/site-packages/urllib3/connectionpool.py", line 802, in urlopen
retries = retries.increment(
^^^^^^^^^^^^^^^^^^
File "/opt/conda/envs/conda-store-server/lib/python3.12/site-packages/urllib3/util/retry.py", line 552, in increment
raise six.reraise(type(error), error, _stacktrace)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/conda/envs/conda-store-server/lib/python3.12/site-packages/urllib3/packages/six.py", line 769, in reraise
raise value.with_traceback(tb)
File "/opt/conda/envs/conda-store-server/lib/python3.12/site-packages/urllib3/connectionpool.py", line 716, in urlopen
httplib_response = self._make_request(
^^^^^^^^^^^^^^^^^^^
File "/opt/conda/envs/conda-store-server/lib/python3.12/site-packages/urllib3/connectionpool.py", line 468, in _make_request
six.raise_from(e, None)
File "<string>", line 3, in raise_from
File "/opt/conda/envs/conda-store-server/lib/python3.12/site-packages/urllib3/connectionpool.py", line 463, in _make_request
httplib_response = conn.getresponse()
^^^^^^^^^^^^^^^^^^
File "/opt/conda/envs/conda-store-server/lib/python3.12/http/client.py", line 1428, in getresponse
response.begin()
File "/opt/conda/envs/conda-store-server/lib/python3.12/http/client.py", line 331, in begin
version, status, reason = self._read_status()
^^^^^^^^^^^^^^^^^^^
File "/opt/conda/envs/conda-store-server/lib/python3.12/http/client.py", line 300, in _read_status
raise RemoteDisconnected("Remote end closed connection without"
urllib3.exceptions.ProtocolError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/conda/envs/conda-store-server/lib/python3.12/site-packages/conda_store_server/_internal/worker/build.py", line 256, in build_conda_environment
context = action.action_fetch_and_extract_conda_packages(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/conda/envs/conda-store-server/lib/python3.12/site-packages/conda_store_server/_internal/action/base.py", line 38, in wrapper
action_context.result = f(action_context, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/conda/envs/conda-store-server/lib/python3.12/site-packages/conda_store_server/_internal/action/download_packages.py", line 88, in action_fetch_and_extract_conda_packages
) = conda_package_streaming.url.conda_reader_for_url(url)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/conda/envs/conda-store-server/lib/python3.12/site-packages/conda_package_streaming/url.py", line 75, in conda_reader_for_url
conda = LazyConda(url, session)
^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/conda/envs/conda-store-server/lib/python3.12/site-packages/conda_package_streaming/lazy_wheel.py", line 50, in __init__
tail = self._stream_response(start="", end=CONTENT_CHUNK_SIZE)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/conda/envs/conda-store-server/lib/python3.12/site-packages/conda_package_streaming/lazy_wheel.py", line 190, in _stream_response
response = self._session.get(self._url, headers=headers, stream=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/conda/envs/conda-store-server/lib/python3.12/site-packages/requests/sessions.py", line 602, in get
return self.request("GET", url, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/conda/envs/conda-store-server/lib/python3.12/site-packages/requests/sessions.py", line 589, in request
resp = self.send(prep, **send_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/conda/envs/conda-store-server/lib/python3.12/site-packages/requests/sessions.py", line 703, in send
r = adapter.send(request, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/conda/envs/conda-store-server/lib/python3.12/site-packages/requests/adapters.py", line 682, in send
raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
Hmm, that's indeed intriguing. May I suggest you check both the resource consumption of the workers through Grafana, and install Flower on another pod connected to the same DB as the conda-store worker?
apiVersion: apps/v1
kind: Deployment
metadata:
  name: celery-dashboard
spec:
  replicas: 1
  selector:
    matchLabels:
      app: celery-dashboard
  template:
    metadata:
      labels:
        app: celery-dashboard
    spec:
      containers:
        - name: celery-dashboard
          image: "mher/flower:latest"
          command: ["celery", "flower"]
          env:
            - name: FLOWER_BROKER_API
              value: "redis://:******@nebari-conda-store-redis:6379/0"
            - name: SERVER_PORT
              value: "5555"
          ports:
            - containerPort: 5555
              name: flower
          resources:
            requests:
              cpu: "100m"
              memory: "128Mi"
            limits:
              cpu: "1"
              memory: "2Gi"
You will need to find the Redis URL conda-store is using; it should be inside the conda-store secret. This should allow you to see the available workers and the overall status of the executed tasks.
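A rough sketch of how to dig that out and reach the Flower UI; the `dev` namespace and the secret name are assumptions, so check what actually exists in your cluster first:

# Sketch: locate the conda-store secret and decode the Redis URL (names are assumptions)
kubectl -n dev get secrets | grep conda-store
kubectl -n dev get secret <conda-store-secret-name> -o jsonpath='{.data}'
echo '<base64-value-from-above>' | base64 -d

# Then port-forward the Flower deployment defined above and open http://localhost:5555
kubectl -n dev port-forward deploy/celery-dashboard 5555:5555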
I am writing a guide on that, and it will live in the docs soon.
Hey @satra, I am not sure if we checked this, but can you verify the disk usage inside the conda-store worker pod?
df -h .
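For example, from outside the pod (the pod name is a placeholder, and the `dev` namespace and the /home/conda path are assumptions based on the paths in the logs above):

# Sketch: check disk usage from inside the running worker pod
kubectl -n dev exec -it <conda-store-worker-pod> -- df -h /home/conda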
Pinging @asmacdo, who has been checking this.
FWIW, this can still happen. (And in our case, "building forever" eventually kills the node.) On our side, we haven't seen this for a long time, because we avoid it via documentation: users are requested to build conda environments in their own userspace. When we build shared environments, we pre-generate the conda lock file.
My previous (mis)understanding was that creating an environment from a spec (not a lockfile) ran forever when we installed dandi via pip. However, I saw this again today when attempting to install this (NOTE: dependency resolution made that impossible, but it still should have failed instead of exploding).