
MXNET for cuda 11.0 Nvidia 450

Open i55code opened this issue 4 years ago • 29 comments

Is there a version of MXNet for CUDA 11.0, NVIDIA driver 450, and Ubuntu 18.04? Something like pip install --pre mxnet_cu102mkl -f ... ?

Thank you!

i55code avatar Jul 02 '20 23:07 i55code

With cuda 11.0 of course comes support for A100 and TensorFloat-32 (TF32). FYI, I'm preparing a PR that refactors how our unittests handle tolerances, so that the proper tolerances are seamlessly applied across dtypes and contexts, including A100 contexts. Stay tuned.

DickJC123 avatar Jul 04 '20 03:07 DickJC123

Thank you @DickJC123 !

i55code avatar Jul 07 '20 15:07 i55code

Yes, thank you!

chrisroat avatar Sep 11 '20 16:09 chrisroat

Hi, I have installed WSL2 on Windows 10 and things appear to be working. WSL2 requires CUDA 11. Is there a pip install mxnet-cu110 or similar for using MXNet on this new platform?

Thanks, Mickey

tadam98 avatar Sep 13 '20 19:09 tadam98

@tadam98 not yet. I will open a PR to enable this build

szha avatar Sep 13 '20 20:09 szha

Please do. WSL2 is now working quite well with my 2080 Ti Super GPU, and some of my work uses MXNet.


tadam98 avatar Sep 13 '20 20:09 tadam98

Would CUDA 9.2 or 10.x work in WSL2?


tadam98 avatar Sep 13 '20 20:09 tadam98

Is there a way I can assist with the PR?

tadam98 avatar Sep 23 '20 19:09 tadam98

@tadam98 that would be really helpful. Thanks for offering to help. The changes needed are basically the same as other cuda build variants. For example, here are all the mentions of cu102 build: https://github.com/apache/incubator-mxnet/search?q=cu102. These occurrences include changes to build logic and build config, CI/CD, and package information.

Let me know if you have any questions, or otherwise feel free to ping me when you open a PR.

szha avatar Sep 23 '20 21:09 szha

@szha Sorry to bug you, but it's been 4 months. Are there any updates?

ruro avatar Nov 26 '20 17:11 ruro

@RuRo sorry for lack of update on this issue. The nightly build version now has the cu110 package (for development purpose only): https://dist.mxnet.io/python

As for releasing on PyPI, there are some ongoing license issues that are being sorted out.
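
For readers who want to try the nightly build, here is a minimal sketch that only assembles a pip command and does not run it. It assumes the package naming seen in this thread (cu101, cu102, cu110) and the index URL above; the helper names `mxnet_package_name` and `nightly_install_command` are hypothetical, and the nightly package may not exist for every CUDA version or platform.

```python
import sys

# Index URL quoted in this thread; nightly builds are for development only.
NIGHTLY_INDEX = "https://dist.mxnet.io/python"


def mxnet_package_name(cuda_version):
    """Map a CUDA version such as "11.0" to a name such as "mxnet-cu110".

    Assumption: the "cu" + major + minor pattern seen in this thread.
    """
    major, minor = cuda_version.split(".")[:2]
    return "mxnet-cu{}{}".format(major, minor)


def nightly_install_command(cuda_version):
    """Build (but do not run) a pip command for a pre-release build."""
    return [
        sys.executable, "-m", "pip", "install",
        "--pre",                      # allow pre-release versions
        mxnet_package_name(cuda_version),
        "-f", NIGHTLY_INDEX,          # extra location to search for wheels
    ]


if __name__ == "__main__":
    print(" ".join(nightly_install_command("11.0")))
```

The returned list can be passed to `subprocess.run` once the package name has been checked against what the index actually offers.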

szha avatar Nov 26 '20 17:11 szha

@szha Hi, I've installed the latest mxnet_cu11 ("mxnet_cu110-2.0.0b20201214-py3-none-manylinux2014_x86_64.whl") and the following error has happened: AttributeError: module 'mxnet' has no attribute 'mod'

Working with: Python 3.7.8 Debian 10 GPU: NVIDIA Tesla P100 CUDA 11

How can I solve this? Thanks in advance

Joaquin-aliaga avatar Dec 15 '20 11:12 Joaquin-aliaga

@Joaquin-aliaga it's probably due to an unclean installation. Try the following:

  • remove all previous installations
  • (if you have a local copy of the mxnet source code) invoke python from a location that does not contain it
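
One way to check the second point is to ask Python where it would load mxnet from before importing it. This is a minimal sketch using only the standard library; the helper name `locate_package` is hypothetical. If the reported path points into a source checkout rather than site-packages, the local copy is shadowing the installed wheel.

```python
import importlib.util


def locate_package(name):
    """Return where a top-level package would be imported from, or None.

    For a top-level name, find_spec locates the package without importing it.
    """
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    if spec.origin not in (None, "namespace"):
        return spec.origin
    # A directory without __init__.py is found as a namespace package,
    # which has no single origin file; report its search locations instead.
    return ", ".join(spec.submodule_search_locations or [])


if __name__ == "__main__":
    origin = locate_package("mxnet")
    if origin is None:
        print("mxnet is not importable from this directory")
    else:
        print("mxnet would be imported from:", origin)
```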

szha avatar Dec 15 '20 23:12 szha

@szha I've created a new Google Cloud Platform instance (so no mxnet library was installed), then installed mxnet_cu110-2.0.0b20201218-py3-none-manylinux2014_x86_64.whl and the same error has happened.

module 'mxnet' has no attribute 'mod'

Working with: Python 3.7.8 Debian 10 GPU: NVIDIA Tesla P100 CUDA 11

FYI: same code with mxnet-cu101 works fine, but GCP has Cuda11 by default.

Joaquin-aliaga avatar Dec 19 '20 22:12 Joaquin-aliaga

Running pip install mxnet-cu110 gives me: ERROR: No matching distribution found for mxnet-cu110

@szha

ggzzzzz628 avatar Dec 23 '20 08:12 ggzzzzz628

Are there any updates on the progress of this?

angus-lherrou avatar Feb 09 '21 15:02 angus-lherrou

@angus-lherrou details in https://github.com/apache/incubator-mxnet/pull/19764

lgg avatar Feb 25 '21 15:02 lgg

@szha @i55code guess we can close this issue with 1.8.0 release? https://github.com/apache/incubator-mxnet/releases/tag/1.8.0

lgg avatar Mar 03 '21 14:03 lgg

mxnet-cu11* doesn't seem to be available in pip/pypi.org?

ruro avatar Mar 03 '21 15:03 ruro

Working on it.

szha avatar Mar 03 '21 15:03 szha

How can I install it without pip?

M-Tonin avatar Mar 09 '21 16:03 M-Tonin

@M-Tonin you can follow the guide for building from source

szha avatar Mar 09 '21 22:03 szha

Dear All,

I am trying to get mxnet for CUDA 11.0 installed. I am using:

pip install mxnet-cu110

Unfortunately I get the following:

ERROR: Could not find a version that satisfies the requirement mxnet-cu110
ERROR: No matching distribution found for mxnet-cu110

pip 21.0.1 with python 3.7.6

Do you have any suggestions on how to solve the issue?

Thank you

Carlo

cberri avatar Mar 26 '21 13:03 cberri

@cberri what platform are you on? That's strange.

Try the constructor here to find a version of mxnet that is available for your platform: https://mxnet.apache.org/versions/1.8.0/get_started

mxnet-cu110 is available on PyPI: https://pypi.org/project/mxnet-cu110/

lgg avatar Mar 26 '21 13:03 lgg

@lgg thanks for your quick reply. I am trying this on Windows 10 in a conda environment.

Not sure what is going on!

cberri avatar Mar 26 '21 13:03 cberri

@cberri did you try the constructor (https://mxnet.apache.org/versions/1.8.0/get_started)?

image

Unfortunately I can't help you with Windows, because I use Linux. But I guess there is no mxnet-cu110 for Windows yet (just check the URL and image above).

lgg avatar Mar 26 '21 16:03 lgg

@yajiedesign is helping with the wheels for windows for 1.8. Those would likely only appear in dist.mxnet.io because of the size limit.

szha avatar Mar 26 '21 16:03 szha

@lgg, excellent point. I get the same error message when I run:

$ pip install mxnet-cu102

I also tried:

$ pip -v install mxnet-cu102

And this is the output:

Using pip 21.0.1 from C:\Users\CarloBeretta\anaconda3\envs\cellpose\lib\site-packages\pip (python 3.7)
Non-user install because site-packages writeable
Created temporary directory: C:\Users\CARLOB~1\AppData\Local\Temp\pip-ephem-wheel-cache-sd6o9034
Created temporary directory: C:\Users\CARLOB~1\AppData\Local\Temp\pip-req-tracker-ypjed3bp
Initialized build tracking at C:\Users\CARLOB~1\AppData\Local\Temp\pip-req-tracker-ypjed3bp
Created build tracker: C:\Users\CARLOB~1\AppData\Local\Temp\pip-req-tracker-ypjed3bp
Entered build tracker: C:\Users\CARLOB~1\AppData\Local\Temp\pip-req-tracker-ypjed3bp
Created temporary directory: C:\Users\CARLOB~1\AppData\Local\Temp\pip-install-x7f8mpk8
1 location(s) to search for versions of mxnet-cu102:
* https://pypi.org/simple/mxnet-cu102/
Fetching project page and analyzing links: https://pypi.org/simple/mxnet-cu102/
Getting page https://pypi.org/simple/mxnet-cu102/
Found index url https://pypi.org/simple
Looking up "https://pypi.org/simple/mxnet-cu102/" in the cache
Request header has "max_age" as 0, cache bypassed
Starting new HTTPS connection (1): pypi.org:443
https://pypi.org:443 "GET /simple/mxnet-cu102/ HTTP/1.1" 304 0
  Skipping link: none of the wheel's tags match: py2-none-manylinux1_x86_64, py3-none-manylinux1_x86_64: https://files.pythonhosted.org/packages/5f/d9/05d00c8f148af5a5cfd3f905419b5dc1cd43aeb0b91e26c00941d0cd9f98/mxnet_cu102-1.6.0.post0-py2.py3-none-manylinux1_x86_64.whl#sha256=b7f3284f28b40e3ad5c098e385dab4f4a2baccf25795d6ec97d33ee2e2d5fe93 (from https://pypi.org/simple/mxnet-cu102/)
  Skipping link: none of the wheel's tags match: py2-none-manylinux2014_x86_64, py3-none-manylinux2014_x86_64: https://files.pythonhosted.org/packages/46/a4/7c81a3ddd2d406bd1e13aa9f2b7a1dc8480eacb7f92a43484d7866ba8b89/mxnet_cu102-1.7.0-py2.py3-none-manylinux2014_x86_64.whl#sha256=6dac6f3d758d3991e4a6b188e994c1deed1c01ba158d830ed24b4737a2311334 (from https://pypi.org/simple/mxnet-cu102/)
  Skipping link: none of the wheel's tags match: py2-none-manylinux2014_x86_64, py3-none-manylinux2014_x86_64: https://files.pythonhosted.org/packages/20/18/f8d1f2ca0433ed37c140315426a97e4537a9e13e6071bdd387081ea1a1a3/mxnet_cu102-1.7.0.post0-py2.py3-none-manylinux2014_x86_64.whl#sha256=35c402c3886b15a22c1c459670bb34f5bb423caac9c46a76fa4a1be24167c3dc (from https://pypi.org/simple/mxnet-cu102/)
  Skipping link: none of the wheel's tags match: py2-none-manylinux2014_x86_64, py3-none-manylinux2014_x86_64: https://files.pythonhosted.org/packages/95/56/92b23233314ac91fa25c7198772d54f2b099a7dddd7fcc117c83eb2817a8/mxnet_cu102-1.7.0.post1-py2.py3-none-manylinux2014_x86_64.whl#sha256=98a791bd9a3bb008af95abe2e0586abef1b9491da67c24944a433007160e0e54 (from https://pypi.org/simple/mxnet-cu102/)
  Skipping link: none of the wheel's tags match: py2-none-manylinux2014_x86_64, py3-none-manylinux2014_x86_64: https://files.pythonhosted.org/packages/d1/0f/675a35918b9538d1a51fb27de897c3c277d512bb946b059248a360e26401/mxnet_cu102-1.8.0-py2.py3-none-manylinux2014_x86_64.whl#sha256=9ac48084ccd0673ca657eca787141212cde8d8b28501b273cdd102e9d0489db2 (from https://pypi.org/simple/mxnet-cu102/)
Given no hashes to check 0 links for project 'mxnet-cu102': discarding no candidates
ERROR: Could not find a version that satisfies the requirement mxnet-cu102
ERROR: No matching distribution found for mxnet-cu102
Exception information:
Traceback (most recent call last):
  File "C:\Users\CarloBeretta\anaconda3\envs\cellpose\lib\site-packages\pip\_vendor\resolvelib\resolvers.py", line 171, in _merge_into_criterion
    crit = self.state.criteria[name]
KeyError: 'mxnet-cu102'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\CarloBeretta\anaconda3\envs\cellpose\lib\site-packages\pip\_vendor\resolvelib\resolvers.py", line 318, in resolve
    name, crit = self._merge_into_criterion(r, parent=None)
  File "C:\Users\CarloBeretta\anaconda3\envs\cellpose\lib\site-packages\pip\_vendor\resolvelib\resolvers.py", line 173, in _merge_into_criterion
    crit = Criterion.from_requirement(self._p, requirement, parent)
  File "C:\Users\CarloBeretta\anaconda3\envs\cellpose\lib\site-packages\pip\_vendor\resolvelib\resolvers.py", line 83, in from_requirement
    raise RequirementsConflicted(criterion)
pip._vendor.resolvelib.resolvers.RequirementsConflicted: Requirements conflict: SpecifierRequirement('mxnet-cu102')

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\CarloBeretta\anaconda3\envs\cellpose\lib\site-packages\pip\_internal\resolution\resolvelib\resolver.py", line 122, in resolve
    requirements, max_rounds=try_to_avoid_resolution_too_deep,
  File "C:\Users\CarloBeretta\anaconda3\envs\cellpose\lib\site-packages\pip\_vendor\resolvelib\resolvers.py", line 453, in resolve
    state = resolution.resolve(requirements, max_rounds=max_rounds)
  File "C:\Users\CarloBeretta\anaconda3\envs\cellpose\lib\site-packages\pip\_vendor\resolvelib\resolvers.py", line 320, in resolve
    raise ResolutionImpossible(e.criterion.information)
pip._vendor.resolvelib.resolvers.ResolutionImpossible: [RequirementInformation(requirement=SpecifierRequirement('mxnet-cu102'), parent=None)]

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\Users\CarloBeretta\anaconda3\envs\cellpose\lib\site-packages\pip\_internal\cli\base_command.py", line 189, in _main
    status = self.run(options, args)
  File "C:\Users\CarloBeretta\anaconda3\envs\cellpose\lib\site-packages\pip\_internal\cli\req_command.py", line 178, in wrapper
    return func(self, options, args)
  File "C:\Users\CarloBeretta\anaconda3\envs\cellpose\lib\site-packages\pip\_internal\commands\install.py", line 317, in run
    reqs, check_supported_wheels=not options.target_dir
  File "C:\Users\CarloBeretta\anaconda3\envs\cellpose\lib\site-packages\pip\_internal\resolution\resolvelib\resolver.py", line 127, in resolve
    six.raise_from(error, e)
  File "<string>", line 3, in raise_from
pip._internal.exceptions.DistributionNotFound: No matching distribution found for mxnet-cu102
Removed build tracker: 'C:\\Users\\CARLOB~1\\AppData\\Local\\Temp\\pip-req-tracker-ypjed3bp'

Maybe this helps to find the issue.
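
The "Skipping link" lines in the log show that every mxnet-cu102 wheel offered by PyPI carries a manylinux platform tag, which pip on Windows cannot use. As a rough illustration, the sketch below reads the platform tag from a wheel filename. The helper names are hypothetical, and the check is a simplification of pip's real tag matching, which also compares the Python and ABI tags.

```python
def wheel_platform_tags(filename):
    """Return the platform tags encoded in a wheel filename.

    Wheel names look like name-version[-build]-python-abi-platform.whl.
    """
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel filename: " + filename)
    parts = filename[: -len(".whl")].split("-")
    if len(parts) not in (5, 6):
        raise ValueError("unexpected wheel filename: " + filename)
    # The last component holds one or more platform tags joined by dots.
    return parts[-1].split(".")


def can_install_on_windows(filename):
    """Simplified check: accept Windows tags or the platform-independent tag."""
    return any(
        tag.startswith("win") or tag == "any"
        for tag in wheel_platform_tags(filename)
    )


if __name__ == "__main__":
    for name in (
        "mxnet_cu102-1.8.0-py2.py3-none-manylinux2014_x86_64.whl",
        "mxnet_cu102-1.7.0-py2.py3-none-win_amd64.whl",
    ):
        print(name, can_install_on_windows(name))
```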

cberri avatar Mar 26 '21 16:03 cberri

@cberri hello, try installing the wheel from here: https://dist.mxnet.io/python/cu102/

e.g.: pip install mxnet-cu102 -f https://dist.mxnet.io/python

or pip install https://repo.mxnet.io/dist/python/cu102/mxnet_cu102-1.7.0-py2.py3-none-win_amd64.whl

source: https://discuss.mxnet.apache.org/t/mxnet-version-for-cuda-10-2-in-windows-10/5919/3

also check this: https://github.com/apache/incubator-mxnet/issues/17963 and this https://github.com/apache/incubator-mxnet/issues/19581

lgg avatar Mar 29 '21 13:03 lgg