
--background option does not work for me

Open semeion opened this issue 5 years ago • 22 comments

When I set -B or --background, xmrig-nvidia returns: "WARNING: NVIDIA GPU 0: cannot be selected."

I am using Arch Linux with a 1050 Ti, xmrig-nvidia v2.14.4.

The xmrig CPU version works fine.

How can I fix it?

semeion avatar Jun 30 '19 22:06 semeion

I've never used that feature, so I doubt anyone does (maybe on Windows?). Make a systemd service file instead, like a normal background process on Linux; it's not much tougher than a bash script.

Put this in /etc/systemd/system/xmrig-nvidia.service:

[Unit]
Description=xmrig-nvidia
After=network-online.target
Wants=network-online.target
AssertFileNotEmpty=/opt/xmrig-nvidia/config.json

[Service]
Type=simple
Environment=GPU_FORCE_64BIT_PTR=1
Environment=GPU_MAX_HEAP_SIZE=100
Environment=GPU_USE_SYNC_OBJECTS=1
Environment=GPU_MAX_ALLOC_PERCENT=100
Environment=GPU_SINGLE_ALLOC_PERCENT=100
Environment=CUDA_DEVICE_ORDER=PCI_BUS_ID
SyslogIdentifier=xmrig-nvidia
WorkingDirectory=/opt/xmrig-nvidia
ExecStart=/opt/xmrig-nvidia/xmrig-nvidia
Restart=always
KillSignal=SIGQUIT
User=root
Group=root
Nice=19
LimitMEMLOCK=256M

[Install]
WantedBy=multi-user.target

Then run systemctl daemon-reload (only needed when editing or adding systemd unit files), then systemctl enable xmrig-nvidia (also only needed once), then systemctl start xmrig-nvidia; to watch its log, use journalctl -xafu xmrig-nvidia. It will fire up as soon as networking is up on reboot, and day to day you should only need the journal command.
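For reference, the same steps as a command sequence (run as root, or prefix with sudo):

systemctl daemon-reload          # pick up the new/edited unit file
systemctl enable xmrig-nvidia    # start on boot (one-time)
systemctl start xmrig-nvidia     # start it now
journalctl -xafu xmrig-nvidia    # follow the live log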

Spudz76 avatar Jul 02 '19 18:07 Spudz76

Obviously the example assumes you've put it in /opt/xmrig-nvidia/ and also have a config.json there. If you put all the command-line arguments in the ExecStart line, then you have to daemon-reload every time you want to tweak something. With config.json you just edit the JSON and then systemctl restart xmrig-nvidia, because all systemd knows is that the JSON file exists, not its content.
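A minimal config.json sketch, just to show the shape (the pool URL and wallet are placeholders, and exact field names can vary between xmrig-nvidia versions, so check the config your release ships with):

{
    "pools": [
        {
            "url": "pool.example.com:3333",
            "user": "YOUR_WALLET_ADDRESS",
            "pass": "x"
        }
    ]
}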

Spudz76 avatar Jul 02 '19 18:07 Spudz76

Wow!

What are those environment variables? And what is LimitMEMLOCK=256M? Will those variables increase performance in some way?

And thank you for this nice answer.

semeion avatar Jul 02 '19 23:07 semeion

Worked like a charm! Thanks!

BTW, do you know any tips for overclocking the GTX 1050 Ti in Linux? Every tutorial I have followed doesn't work... Seems like the 1050 Ti doesn't accept OC... idk...

semeion avatar Jul 03 '19 00:07 semeion

The env stuff:

  • The GPU_* variables are probably only effective for OpenCL miners (and even then probably only AMD), but I tend to set them anyway; I copy this general skeleton out and modify it to launch other miners, so I prefer that it carries the env that works everywhere with everything.
  • CUDA_DEVICE_ORDER=PCI_BUS_ID forces CUDA to enumerate multiple GPUs in slot-number order. Otherwise, with mixed cards, CUDA puts the fastest card first (as 0), whereas nvidia-smi and Xorg generally refer to them in slot order. This forces the index in miner apps to match the one you see when inspecting with nvidia-smi and clocking via Xorg with nvidia-settings, etc. (you can verify the order with the nvidia-smi query below). It has no effect if you have one GPU, or if all of them are the same model (then they end up in PCI order anyway).
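To see the slot order that nvidia-smi itself uses, a query like this works:

nvidia-smi --query-gpu=index,pci.bus_id,name --format=csv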

To clock NVIDIA cards in Linux you must fire up Xorg. If you don't want a desktop, you don't need one: you can fire up just xinit and maybe an xterm, with no window manager, to save wasted VRAM and CPU cycles (it would be like GDM running with nobody able to see it to log in anyway). Once you have Xorg going and set to allow overclocking (google "coolbits Xorg"), nvidia-settings will apply clocks. BUT: you will be forced into P2 mode in Linux, and there is no known unlock to get P0 back. In Windows you can tweak the driver to let you use CUDA in P0, which usually means better clocks; it depends on your card's particular BIOS, and sometimes P2 == P0 and you're OK. Also, sometimes you can only set offsets for the P0 clocks and P2 clocking doesn't work; again, this is card- and manufacturer-dependent.

I have some MSI 1060s which suck at P2, so that rig is one of the only Windows ones; it gets 33% more hashrate when P0 is unlocked (and clocks can be set). But these PNY 1060s act just like P0 even in P2 mode, so those do full speed and can be clocked in Linux just fine. It comes down to how the manufacturer decided to set up the BIOS profiles for each mode.

nvidia-settings -c :0 -q all generally dumps everything you might be able to know about a GPU. If you do have a screen hooked up (or redirect X through ssh to some other Xorg, or even cygwin/x), then you can launch nvidia-settings with no args to get the GUI (and better check out what things your card might let you do).
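For the ssh redirection just mentioned, a minimal sketch (miner-host is a placeholder for your rig's hostname):

ssh -X miner-host          # forward X11 back to your desktop
nvidia-settings -c :0      # GUI renders on your screen, controls the rig's local :0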

Spudz76 avatar Jul 03 '19 02:07 Spudz76

This /etc/X11/xorg.conf is an example of how I make Xorg fire up just enough to clock, without robbing me of resources. I think I needed to install xterm (so that vanilla xinit would have something to launch and hold the session open) and also xorg-input-void (Xorg won't launch without inputs, and we have no inputs, so this void driver works around that and Xorg doesn't even bother to bind USB).

Section "ServerLayout"
    Identifier     "Layout0"
    Screen      0  "Screen0" 0 0
    Screen      1  "Screen1" RightOf "Screen0"
    Screen      2  "Screen2" RightOf "Screen1"
    InputDevice    "Keyboard0" "CoreKeyboard"
    InputDevice    "Mouse0" "CorePointer"
    Option         "AutoAddDevices" "false"
    Option         "AutoEnableDevices" "false"
    Option         "AutoAddGPU" "false"
EndSection

Section "Files"  
EndSection

Section "Module" 
    Load           "glx"
    Disable        "evdev"
    Disable        "vesa"
    Disable        "fbdev"
    Disable        "modesetting"
    Disable        "ati"
    Disable        "amdgpu"
    Disable        "fglrx"
    Disable        "mga"
    Disable        "nouveau"
EndSection

Section "InputDevice"
    Identifier     "Mouse0"
    Driver         "void"
EndSection

Section "InputDevice"
    Identifier     "Keyboard0"
    Driver         "void"
EndSection

Section "Monitor"
    Identifier     "Monitor0"
    VendorName     "Unknown"
    ModelName      "Unknown"
    HorizSync       28.0 - 33.0
    VertRefresh     43.0 - 72.0
    Option         "DPMS"
EndSection

Section "Monitor"
    Identifier     "Monitor1"
    VendorName     "Unknown"
    ModelName      "Unknown"
    HorizSync       28.0 - 33.0
    VertRefresh     43.0 - 72.0
    Option         "DPMS"
EndSection

Section "Monitor"
    Identifier     "Monitor2"
    VendorName     "Unknown"
    ModelName      "Unknown"
    HorizSync       28.0 - 33.0
    VertRefresh     43.0 - 72.0
    Option         "DPMS"
EndSection

Section "Device" 
    Identifier     "Device0"
    Driver         "nvidia"
    VendorName     "NVIDIA Corporation"
    BoardName      "GeForce GTX 770"
    BusID          "PCI:1:0:0"
EndSection

Section "Device" 
    Identifier     "Device1"
    Driver         "nvidia"
    VendorName     "NVIDIA Corporation"
    BoardName      "GeForce GTX 770"
    BusID          "PCI:2:0:0"
EndSection

Section "Device" 
    Identifier     "Device2"
    Driver         "nvidia"
    VendorName     "NVIDIA Corporation"
    BoardName      "GeForce GTX 970"
    BusID          "PCI:3:0:0"
EndSection

Section "Screen" 
    Identifier     "Screen0"
    Device         "Device0"
    Monitor        "Monitor0"
    DefaultDepth    24
    Option         "AllowEmptyInitialConfiguration" "True"
    Option         "Coolbits" "28"
    Option         "Accel" "False"
    Option         "NoLogo" "True"
    Option         "UseDisplayDevice" "none"
    Option         "Interactive" "False"
    SubSection     "Display"
        Depth       24
        Modes      "640x480"
    EndSubSection
EndSection

Section "Screen" 
    Identifier     "Screen1"
    Device         "Device1"
    Monitor        "Monitor1"
    DefaultDepth    24
    Option         "AllowEmptyInitialConfiguration" "True"
    Option         "Coolbits" "28"
    Option         "Accel" "False"
    Option         "NoLogo" "True"
    Option         "UseDisplayDevice" "none"
    Option         "Interactive" "False"
    SubSection     "Display"
        Depth       24
        Modes      "640x480"
    EndSubSection
EndSection

Section "Screen" 
    Identifier     "Screen2"
    Device         "Device2"
    Monitor        "Monitor2"
    DefaultDepth    24
    Option         "AllowEmptyInitialConfiguration" "True"
    Option         "Coolbits" "28"
    Option         "Accel" "False"
    Option         "NoLogo" "True"
    Option         "UseDisplayDevice" "none"
    Option         "Interactive" "False"
    SubSection     "Display"
        Depth       24
        Modes      "640x480"
    EndSubSection
EndSection

And then I use this /etc/systemd/system/xorg-headless.service to launch a really stripped-down Xorg:

[Unit]
Description=Headless Xorg Server (for GPU control)

[Service]
ExecStart=/usr/bin/xinit

[Install]
WantedBy=multi-user.target

And I either disable or remove gdm (or any other display manager with its Xorg-launching capability), since all my rigs are headless anyway, SSH only.
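Something like this, assuming GDM is the display manager installed (adjust for whatever yours is):

systemctl disable gdm              # keep the display manager from claiming Xorg
systemctl daemon-reload            # pick up the new unit file
systemctl enable --now xorg-headless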

Once all this is working (check /var/log/Xorg.0.log or similar for Xorg launch problems), nvidia-settings should work as long as the environment has DISPLAY=:0 and you use the -c :0 argument.
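A quick sanity check that Xorg and nvidia-settings are talking (this query also gives you the base clocks used for offsets later):

nvidia-settings -c :0 -q GPUCurrentClockFreqs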

Spudz76 avatar Jul 03 '19 03:07 Spudz76

Oh, thank you very much for the amazing explanation of the OC process.

Link bookmarked!

For sure someone else will read this and learn too.

semeion avatar Jul 03 '19 03:07 semeion

I also use commands such as these in a bash script for clocking:

nvidia-smi -i 0 -pm 1
nvidia-smi -i 0 -c 0
nvidia-smi -i 0 -pl 90

In order:

  • -pm 1 sets persistence mode, so settings stick even if Xorg and all other client apps release the GPU (otherwise it returns to defaults); this also uses nvidia-persistenced, a service that comes with the driver.
  • -c 0 sets compute mode to default (in case something else set it weird).
  • -pl 90 sets the power limit in watts (example: 90 W). Note this is different from the percentage Windows takes, so people saying they run 70% or whatever would translate to 0.7 * your_card_default_watts for the equivalent Linux setting. Mine default to 120 W, so 90 is the same as 75% on Windows.

nvidia-settings -c :0 -a [gpu:0]/GPUFanControlState=1
nvidia-settings -c :0 -a [fan:0]/GPUTargetFanSpeed=100
nvidia-settings -c :0 -a [gpu:0]/GPUPowerMizerMode=1

In order: set fans to manual control (fixed speed), set the speed to max, and set "Performance" mode for the PowerMizer rules.

nvidia-settings -c :0 -a [gpu:0]/GPUGraphicsClockOffset[3]=48
nvidia-settings -c :0 -a [gpu:0]/GPUMemoryTransferRateOffset[3]=512

These set the P0 clock offsets, which are added to the base clocks (which also depend on your BIOS). To get the right offsets, inspect the default clocks (nvidia-settings -c :0 -q GPUCurrentClockFreqs) and then figure the offsets from there to the desired actual clock speeds.
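A hypothetical worked example (the numbers are illustrative, not recommendations):

# GPUCurrentClockFreqs reports e.g. 1733 MHz graphics; target ~1781 MHz
# offset = 1781 - 1733 = +48, hence the value in the command above
nvidia-settings -c :0 -a [gpu:0]/GPUGraphicsClockOffset[3]=48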

If you are stuck in P2, then you may need to either use [2] as the array index in the above commands (to edit perf=1, which is P2, rather than perf=2, which is P0) or try it with no array brackets at all (some card types use the non-array notation).
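To make those two variants concrete (the offset value is illustrative):

nvidia-settings -c :0 -a [gpu:0]/GPUGraphicsClockOffset[2]=48   # explicit perf-level index
nvidia-settings -c :0 -a [gpu:0]/GPUGraphicsClockOffset=48      # non-array notation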

Some cards block P2 clock offsetting. You can check which modes have editable clocks with nvidia-settings -c :0 -q GPUPerfModes, which lists each perf level, its base clocks, and whether it accepts offsetting. The perf number defines each section, and the sections are separated by ;. perf=0 is P8 (sleep/idle), perf=1 is P2 (CUDA mode), and perf=2 is P0 (highest perf, plus always-editable clocks).

Running nvidia-smi while mining will tell you what P-mode you're in.
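For example, to print just the P-state, a query like this works:

nvidia-smi --query-gpu=pstate --format=csv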

Spudz76 avatar Jul 03 '19 03:07 Spudz76

I am trying to figure out all the info to do it. BTW, my current (not OC'd) nvidia-smi reports P0:

 nvidia-smi
Wed Jul  3 00:44:41 2019       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 430.26       Driver Version: 430.26       CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 105...  Off  | 00000000:01:00.0 Off |                  N/A |
| 36%   57C    P0    N/A /  72W |   2116MiB /  4040MiB |    100%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0     14478      C   /usr/bin/xmrig-nvidia                       2106MiB |
+-----------------------------------------------------------------------------+

semeion avatar Jul 03 '19 03:07 semeion

I don't have X11 up, so that command doesn't work right now, but I can run this:

nvidia-smi -q -i 0 -d CLOCK

==============NVSMI LOG==============

Timestamp                           : Wed Jul  3 00:57:31 2019
Driver Version                      : 430.26
CUDA Version                        : 10.2

Attached GPUs                       : 1
GPU 00000000:01:00.0
    Clocks
        Graphics                    : 1733 MHz
        SM                          : 1733 MHz
        Memory                      : 3504 MHz
        Video                       : 1556 MHz
    Applications Clocks
        Graphics                    : N/A
        Memory                      : N/A
    Default Applications Clocks
        Graphics                    : N/A
        Memory                      : N/A
    Max Clocks
        Graphics                    : 1936 MHz
        SM                          : 1936 MHz
        Memory                      : 3504 MHz
        Video                       : 1708 MHz
    Max Customer Boost Clocks
        Graphics                    : N/A
    SM Clock Samples
        Duration                    : 294.85 sec
        Number of Samples           : 24
        Max                         : 1746 MHz
        Min                         : 139 MHz
        Avg                         : 1725 MHz
    Memory Clock Samples
        Duration                    : 295.27 sec
        Number of Samples           : 24
        Max                         : 3504 MHz
        Min                         : 405 MHz
        Avg                         : 3498 MHz
    Clock Policy
        Auto Boost                  : N/A
        Auto Boost Default          : N/A

semeion avatar Jul 03 '19 03:07 semeion

Yeah, see how it won't allow any application clocks? That's because it's not a compute-only GPU (the expensive Tesla pro models with no display connections at all). There was a flash hack for the GTX 970 that made it look like the equivalent Tesla card, and then all of that works without Xorg (and I think it killed all the video outputs, too). But I don't think there are any similar mods for any Pascal-based cards, due to signed flash and other problems/locks.

But for consumer cards there is no way other than a working Xorg for nvidia-settings to be able to set clocks. And even then, if you can't get P0 you might not be allowed to clock.

Spudz76 avatar Jul 03 '19 18:07 Spudz76

And you have to set up Xorg and make nvidia-settings work even to find out whether your card's BIOS will allow P2 clock editing, by dumping the GPUPerfModes setting:

  Attribute 'GPUPerfModes' (tpad:0[gpu:0]): perf=0, nvclock=135, nvclockmin=135, nvclockmax=405, nvclockeditable=0, memclock=405, memclockmin=405, memclockmax=405,
  memclockeditable=0, memTransferRate=810, memTransferRatemin=810, memTransferRatemax=810, memTransferRateeditable=0 ; perf=1, nvclock=135, nvclockmin=135, nvclockmax=840,
  nvclockeditable=0, memclock=800, memclockmin=800, memclockmax=800, memclockeditable=0, memTransferRate=1600, memTransferRatemin=1600, memTransferRatemax=1600,
  memTransferRateeditable=0 ; perf=2, nvclock=135, nvclockmin=135, nvclockmax=840, nvclockeditable=1, memclock=1733, memclockmin=1733, memclockmax=1733, memclockeditable=1,
  memTransferRate=3466, memTransferRatemin=3466, memTransferRatemax=3466, memTransferRateeditable=1

Formatted better, this breaks down into:

perf=0, nvclock=135, nvclockmin=135, nvclockmax=405, nvclockeditable=0, memclock=405, memclockmin=405, memclockmax=405, memclockeditable=0, memTransferRate=810, memTransferRatemin=810, memTransferRatemax=810, memTransferRateeditable=0

perf=1, nvclock=135, nvclockmin=135, nvclockmax=840, nvclockeditable=0, memclock=800, memclockmin=800, memclockmax=800, memclockeditable=0, memTransferRate=1600, memTransferRatemin=1600, memTransferRatemax=1600, memTransferRateeditable=0

perf=2, nvclock=135, nvclockmin=135, nvclockmax=840, nvclockeditable=1, memclock=1733, memclockmin=1733, memclockmax=1733, memclockeditable=1, memTransferRate=3466, memTransferRatemin=3466, memTransferRatemax=3466, memTransferRateeditable=1

But the important bits are nvclockeditable and memclockeditable: if they say =0, then you can't clock that P-mode.

perf=0 is P8, perf=1 is P2, perf=2 is P0.

Thus, where it only has editable=1 in the perf=2 line, it means I can only clock this GPU in P0 (under Linux). Note how the perf=1 line has all editable=0, which means locked.

It's unfortunate that you have to get Xorg all set up (go through the hassle) just to find out you probably can't clock P2 (unless the manufacturer actually took some time to tweak their BIOS, like PNY did on the 1060 6GB dual-fan cards). P2 being non-clockable is the NVIDIA default, so the manufacturer has to bother to "fix" that issue; most don't.

Windows also lets you clock P2 somehow (the app has an "unlock min/max" button), and I don't know of any equivalent of that in Linux either. But then, if you've got P2 clocked up to the sky, when you exit the miner and the GPU flips up to P0 momentarily, it will crash/lock the system because the P2 offsets are too high as P0 offsets (instant GPU freeze).

The best option is Windows, so you can disable the P2 lock and get P0 for real, then clock as normal.

Spudz76 avatar Jul 03 '19 19:07 Spudz76

Also note this example (Quadro K1100M) is from before Pascal and runs P0 in Linux just fine with no changes. So I can clock this one, no problem, but it predates the whole P2-for-compute locking idea and isn't a Pascal core.

It may become unclockable and locked to P2 if I ran a newer driver; I'm intentionally on 390.116, and I think I had problems clocking on the newer drivers. I don't think I tested the oldest possible Pascal-supporting driver version (on a Pascal) to see if it may not have had the lock in it yet. You could try 390.116 and see if it computes in P0 or P2... but I am pretty sure Pascal chips were P2-locked from the start.

Spudz76 avatar Jul 03 '19 19:07 Spudz76

What command did you use to dump the GPUPerfModes setting?

semeion avatar Jul 03 '19 19:07 semeion

nvidia-settings -c :0 -q GPUPerfModes

Again, this can't be done until you make Xorg work, so that nvidia-settings works (even from the command line).

Spudz76 avatar Jul 03 '19 22:07 Spudz76

However, I did just notice you are in P0, so if you do set up Xorg it should actually be clockable.

Spudz76 avatar Jul 03 '19 22:07 Spudz76

If you run this:

nvidia-xconfig --allow-empty-initial-configuration --enable-all-gpus --cool-bits=28 --separate-x-screens

it should build you a basic /etc/X11/xorg.conf without the memory savings of forcing 640x480 and turning off acceleration, but it will work fine.

I trim the Xorg config a lot so that 1GB or even 512MB cards still have room for mining jobs, but it's not really needed if you have 4GB (note it only uses a little over half for full mining speed anyway, as-is).

Spudz76 avatar Jul 03 '19 22:07 Spudz76

Oh, and I live life as root, so half this stuff might need sudo (I wouldn't know).

Spudz76 avatar Jul 03 '19 22:07 Spudz76

I can't reboot my PC right now, but I will do it on the next reboot and put everything you told me into practice! A lot of info!

Thank you very much! Whether I get it working or not, I will post here :+1:

semeion avatar Jul 03 '19 22:07 semeion

Note, however, that when RandomX comes in soon you might need to trim, since you will need 2080MB plus the scratchpads, which will use a little over 75% of your total 4GB (you will probably need to recover some).
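A rough, hypothetical illustration (the scratchpad count depends on the launch config; RandomX scratchpads are 2 MiB each): 2080 MiB dataset + 512 threads × 2 MiB = 3104 MiB, which is about 77% of the 4040 MiB this card reports.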

Spudz76 avatar Jul 03 '19 22:07 Spudz76

It is editable, but when I change the clock offset to +200 it does change, yet xmrig loses like 10 H/s, which is weird...

Another thing: it seems to have "Adaptive Clocking: enabled"; maybe that is working against the OC and making things worse, idk...

Attribute 'GPUPerfModes' (blackbird:0.1): perf=0, nvclock=139, nvclockmin=139, nvclockmax=607, nvclockeditable=1, memclock=405, memclockmin=405, memclockmax=405, memclockeditable=1, memTransferRate=810, memTransferRatemin=810,
  memTransferRatemax=810, memTransferRateeditable=1 ; perf=1, nvclock=139, nvclockmin=139, nvclockmax=1911, nvclockeditable=1, memclock=810, memclockmin=810, memclockmax=810, memclockeditable=1, memTransferRate=1620,
  memTransferRatemin=1620, memTransferRatemax=1620, memTransferRateeditable=1 ; perf=2, nvclock=164, nvclockmin=164, nvclockmax=1936, nvclockeditable=1, memclock=3504, memclockmin=3504, memclockmax=3504, memclockeditable=1,
  memTransferRate=7008, memTransferRatemin=7008, memTransferRatemax=7008, memTransferRateeditable=1

semeion avatar Jul 03 '19 23:07 semeion

About your /etc/systemd/system/xmrig-nvidia.service: why are you using Nice=19 (low priority)?

Would using Nice=-20 (high priority) increase the hashrate?

semeion avatar Jul 15 '19 02:07 semeion