fusioninventory-agent

Storage inventory triggers kernel ERRORs on some RAID controllers (patch)

Open 48kRAM opened this issue 4 years ago • 6 comments

Hello,

We've discovered an issue on some of our systems where the storage inventory triggers kernel ERRORs when querying SCSI devices:

kernel: 3w-sas: scsi1: ERROR: (0x03:0x0101): Invalid command opcode:opcode=0x85.
kernel: 3w-sas: scsi1: ERROR: (0x03:0x0101): Invalid command opcode:opcode=0x85.
kernel: 3w-9xxx: scsi0: ERROR: (0x03:0x0101): Invalid command opcode:opcode=0x85.
kernel: 3w-9xxx: scsi0: ERROR: (0x03:0x0101): Invalid command opcode:opcode=0x85.

This appears to be due to unimplemented commands in the driver for these controllers; nevertheless, we would like to avoid them in the logs.

The issue appears to occur from using hdparm to query for device info; running hdparm outside of fusionInventory-agent also causes the error. I have a patch which skips this check on certain controllers that have this issue. Would it be possible to include this patch so we don't have to continue rolling our own version of this agent?
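For readers following along, the idea of skipping hdparm on affected controllers can be sketched as below. This is not the attached patch (its contents are not shown in this thread), and the agent itself is written in Perl; the function names and the skip list are hypothetical. The only things taken from the report are the two driver names in the log lines. Opcode 0x85 is the SCSI ATA PASS-THROUGH (16) command, which is presumably what these drivers reject.

```python
# Hypothetical skip list; the driver names come from the kernel log lines
# quoted in this issue, not from the actual patch.
HDPARM_SKIP_DRIVERS = frozenset({"3w-sas", "3w-9xxx"})


def should_run_hdparm(driver_name, skip_drivers=HDPARM_SKIP_DRIVERS):
    """Return False when the controller driver is on the skip list."""
    return driver_name not in skip_drivers


def hdparm_candidates(devices):
    """Keep only the device paths that are safe to query with hdparm.

    `devices` is an iterable of (device_path, driver_name) pairs; how the
    driver name is discovered is left out of this sketch.
    """
    return [path for path, driver in devices if should_run_hdparm(driver)]
```

With this shape, a device behind a 3ware controller is dropped from the hdparm pass while other devices are still queried.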

Thanks,

-Josh Malone Computing and Info Services National Radio Astronomy Observatory

Storages.diff.txt

48kRAM avatar Apr 13 '21 18:04 48kRAM

Hi. Is there any chance of this getting added to the storage module? Is there anything I can do to help the process?

Thanks,

-Josh

48kRAM avatar Nov 29 '21 20:11 48kRAM

I will check it

ddurieux avatar Nov 29 '21 20:11 ddurieux

So the problem is only that it displays errors? Does it crash the agent?

ddurieux avatar Nov 30 '21 07:11 ddurieux

Maybe @po1vo can help, as he is the author of this code, where the idea was to retrieve some useful information from smartctl or hdparm.

Can you provide the output of hdparm -V so we can check the hdparm version?

Anyway @48kRAM, are you comfortable enough with the source code to run t/tasks/inventory/linux/storages.t --dump 3ware-issue-910? This generates a dump file that can be included in the unit tests, and it can also help find the best way to skip the error. You'll then only need to attach the resources/linux/storages/3ware-issue-910.dump file.

g-bougard avatar Nov 30 '21 09:11 g-bougard

@48kRAM, if all you need is to avoid unpleasant logs, there are easier options:

  • Remove hdparm - the easiest
  • Filter at syslog level
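As an illustration of the second option, the matching logic for such a filter could look like the sketch below. It is written in Python only to show the pattern; a real setup would express the same match in the syslog daemon's own configuration. The regular expression is built from the four log lines quoted in the report, and the function name is hypothetical.

```python
import re

# Pattern derived from the kernel messages quoted in this issue: a 3ware
# driver rejecting opcode 0x85 on any SCSI host number.
NOISE_RE = re.compile(
    r"3w-(?:sas|9xxx): scsi\d+: ERROR: \(0x03:0x0101\): "
    r"Invalid command opcode:opcode=0x85\."
)


def drop_hdparm_noise(lines):
    """Return the log lines that do not match the known hdparm-triggered error."""
    return [line for line in lines if not NOISE_RE.search(line)]
```

The match is kept deliberately narrow (driver, error code and opcode), so other errors from the same controllers would still be logged.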

@g-bougard, hdparm was the original tool used to retrieve this information; I merely made smartctl the primary tool for the task and left hdparm as a backup. A better solution to the problem (is it even a problem?) would be to disable hdparm for SCSI devices, since hdparm is mostly focused on ATA devices, as its manual says.

po1vo avatar Nov 30 '21 12:11 po1vo

We're going to look into removing the hdparm tool on affected systems.

48kRAM avatar Dec 02 '21 16:12 48kRAM