rebuild runner needs to read error messages from osd.py(runner)

Open jschmid1 opened this issue 4 years ago • 2 comments

When a salt-run rebuild.node operation fails after zapping the drive but before unmounting it, osd.list (minion module) returns stale data.

The osd.remove function will then raise an exception (OSDNotFound) that needs to be handled by the rebuild runner.

Otherwise the rebuild runner just exits with:

[ERROR   ] Failed to remove OSD(s)... skipping data1
The following minions were skipped:
data1

Resolve any issues and run
 salt-run rebuild.nodes data1
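
A minimal sketch of how the runner could report the underlying error instead of only the generic "Failed to remove OSD(s)" line. Everything here is hypothetical and for illustration only: `remove_osd`, `rebuild_minion`, the `known_osds` set and the return shape are assumptions, not DeepSea's actual code, and `OSDNotFound` is a local stand-in for the exception named above.

```python
# Hypothetical sketch: names below are illustrative, not DeepSea's actual API.
import logging

log = logging.getLogger(__name__)


class OSDNotFound(Exception):
    """Stand-in for the exception the issue says osd.remove raises."""


def remove_osd(osd_id, known_osds):
    """Hypothetical removal step that fails on a stale OSD id."""
    if osd_id not in known_osds:
        raise OSDNotFound(f"OSD {osd_id} not found (stale entry from osd.list?)")
    known_osds.remove(osd_id)
    return True


def rebuild_minion(minion, osd_ids, known_osds):
    """Remove each OSD and collect error messages instead of a bare failure."""
    errors = []
    for osd_id in osd_ids:
        try:
            remove_osd(osd_id, known_osds)
        except OSDNotFound as exc:
            # Log the specific reason so the operator sees more than "skipping".
            log.error("Failed to remove OSD %s on %s: %s", osd_id, minion, exc)
            errors.append(str(exc))
    return {"minion": minion, "success": not errors, "errors": errors}
```

With this shape, the caller can print the collected `errors` next to the "skipped minions" summary, so the operator knows which OSD id was stale.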

I also noted that the rebuild runner in master is not the same as in SES6. That needs to be forward-ported eventually.

Sep 16 '19 10:09 jschmid1

Shouldn't the exception get caught by a try...except in osd.remove, since exceptions do not propagate across runners? Also, the osd.remove module is user-facing and should give a reasonable return error independent of rebuild.node.

The _check_return is a summary of all the operations for a minion. Errors for a specific osd should come from osd.remove.
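
As a rough illustration of that split, assuming a result-dict convention that is not taken from DeepSea: the user-facing removal call catches its own exception and returns a per-OSD result, while a per-minion summary only aggregates. The names `_remove`, `osd_remove`, `check_return` and the `present` set are hypothetical stand-ins.

```python
# Hypothetical sketch: function names and return shape are assumptions.


class OSDNotFound(Exception):
    """Stand-in for the exception mentioned in the issue."""


def _remove(osd_id, present):
    """Hypothetical low-level removal that raises on an unknown id."""
    if osd_id not in present:
        raise OSDNotFound(f"OSD {osd_id} not found")
    present.discard(osd_id)


def osd_remove(osd_id, present):
    """User-facing call: catches OSDNotFound and returns a result dict."""
    try:
        _remove(osd_id, present)
    except OSDNotFound as exc:
        return {"osd": osd_id, "ok": False, "error": str(exc)}
    return {"osd": osd_id, "ok": True, "error": None}


def check_return(minion, results):
    """Per-minion summary; the details stay in the per-OSD results."""
    failed = [r["osd"] for r in results if not r["ok"]]
    return {"minion": minion, "ok": not failed, "failed_osds": failed}
```

Because the error text travels in the return value rather than as an exception, it remains usable both when the removal is called directly and when it is driven from a rebuild.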

Sep 16 '19 14:09 swiftgist

> Shouldn't the exception get caught by a try...except in osd.remove, since exceptions do not propagate across runners? Also, the osd.remove module is user-facing and should give a reasonable return error independent of rebuild.node.

Right, and I'm inclined to rework that.

> The _check_return is a summary of all the operations for a minion. Errors for a specific osd should come from osd.remove.

It still needs context. I think the discussion we'll have about module return types for deepsea-next will have some influence on that.

Sep 16 '19 14:09 jschmid1