`om assign-{multi-,}stemcell` fails, while assigning with `om curl` works

Open hoegaarden opened this issue 2 years ago • 6 comments

Observed behavior

We had multiple stemcells on Opsman (621.151, 621.141 and others) and could verify that with om curl -p /api/v0/stemcell_associations.

The Opsman Stemcell Library UI didn't let us select any stemcell other than 621.151. The dropdown didn't show 621.141 as an option at all; it only showed the 621.151 version.

Also om assign-multi-stemcell fails

# om assign-multi-stemcell --config config-stemcell-multi.yaml
finding available stemcells for product: "pivotal-container-service"...
validating that stemcell exists in Ops Manager...
2021/09/14 13:13:05 stemcell version 621.141 for ubuntu-xenial not found in Ops Manager.
Available Stemcells for "pivotal-container-service": ubuntu-xenial 621.151

So does om assign-stemcell:

# om assign-stemcell --config config-stemcell-single.yaml
finding available stemcells for product: "pivotal-container-service"...
validating that stemcell exists in Ops Manager...
2021/09/14 13:19:04 stemcell version %!s(float64=621.141) not found in Ops Manager.
        Available Stemcells for "pivotal-container-service": 621.151

We've also tried to use om assign-stemcell / om assign-multi-stemcell with -s/-p instead of the config file, but that showed the same errors.

But if we do the assignment via om curl, the assignment works correctly:

# om --trace curl -p /api/v0/stemcell_associations -x PATCH -d "$(cat config-stemcell.json)"
PATCH /api/v0/stemcell_associations HTTP/1.1
Content-Type: application/json

{
  "products": [
    {
      "guid": "pivotal-container-service-12766d0d3957a0e6b99a",
      "staged_stemcells": [
        {
          "os": "ubuntu-xenial",
          "version": "621.141"
        }
      ]
    }
  ]
}
HTTP/1.1 200 OK
Transfer-Encoding: chunked
Cache-Control: no-cache, no-store
Connection: keep-alive
Content-Type: application/json; charset=utf-8
Date: Tue, 14 Sep 2021 11:27:06 GMT
Etag: W/"44136fa355b3678a1146ad16f7e8649e"
Expires: Fri, 01 Jan 1990 00:00:00 GMT
Pragma: no-cache
Referrer-Policy: strict-origin-when-cross-origin
Server: Ops Manager
Strict-Transport-Security: max-age=31536000; includeSubDomains
X-Content-Type-Options: nosniff
X-Download-Options: noopen
X-Frame-Options: SAMEORIGIN
X-Permitted-Cross-Domain-Policies: none
X-Request-Id: 310243a9-dfee-4522-bed5-56ad049015dd
X-Runtime: 0.237057
X-Xss-Protection: 1; mode=block

2
{}
0


Status: 200 OK
Cache-Control: no-cache, no-store
Connection: keep-alive
Content-Type: application/json; charset=utf-8
Date: Tue, 14 Sep 2021 11:27:06 GMT
Etag: W/"44136fa355b3678a1146ad16f7e8649e"
Expires: Fri, 01 Jan 1990 00:00:00 GMT
Pragma: no-cache
Referrer-Policy: strict-origin-when-cross-origin
Server: Ops Manager
Strict-Transport-Security: max-age=31536000; includeSubDomains
X-Content-Type-Options: nosniff
X-Download-Options: noopen
X-Frame-Options: SAMEORIGIN
X-Permitted-Cross-Domain-Policies: none
X-Request-Id: 310243a9-dfee-4522-bed5-56ad049015dd
X-Runtime: 0.237057
X-Xss-Protection: 1; mode=block
{}

Expected behaviour

  • Assigning a stemcell in Opsman's WebUI should work
  • Assigning a stemcell with om assign-multi-stemcell should work
  • Assigning a stemcell with om assign-stemcell should work (or should it? Is this dependent on the om / Opsman version?)

Workaround

We've created a custom task for our Concourse pipelines that does the stemcell assignment with the om curl call shown above.
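
A minimal sketch of what such a task might do, assuming Python is available in the task image. It builds the same request body as in the trace above and the argument list for om curl, using only the flags shown there. The helper names `build_payload` and `om_curl_args` are made up for illustration, and authentication to Ops Manager is assumed to be configured separately.

```python
import json


def build_payload(product_guid: str, os_name: str, version: str) -> str:
    """Return the JSON body for PATCH /api/v0/stemcell_associations."""
    body = {
        "products": [
            {
                "guid": product_guid,
                # The version is passed as a string, as in the trace above.
                "staged_stemcells": [{"os": os_name, "version": version}],
            }
        ]
    }
    return json.dumps(body)


def om_curl_args(payload: str) -> list:
    """Argument list mirroring the om curl call shown in the trace."""
    return [
        "om", "curl",
        "-p", "/api/v0/stemcell_associations",
        "-x", "PATCH",
        "-d", payload,
    ]


if __name__ == "__main__":
    payload = build_payload(
        "pivotal-container-service-12766d0d3957a0e6b99a",
        "ubuntu-xenial",
        "621.141",
    )
    # Print the command instead of running it; hand the list to
    # subprocess.run(..., check=True) to actually execute it.
    print(om_curl_args(payload))
```

Keeping the version a string end to end avoids any numeric reinterpretation before the request is sent.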

Additional notes:

  • Because we could not select the 621.141 stemcell in Opsman's stemcell library, maybe this is not a bug in om; maybe this is an issue with Opsman? Or both?
  • It would be great to have an om subcommand that shows the available stemcells, e.g. one that uses the stemcell_associations API and just presents the .stemcell_library[] part of the response. Ideally we'd have something like bosh stemcells.
  • The log output from assign-stemcell "stemcell version %!s(float64=621.141) not found in Ops Manager" looks fishy
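
The second note could be prototyped outside om along these lines. This is a sketch, not an existing om subcommand: it takes the JSON text returned by the stemcell_associations endpoint and lists the `stemcell_library` entries. The `os` and `version` keys inside each entry are an assumption based on the request body shown above; the real entries may carry other fields.

```python
import json


def list_stemcells(response_text: str) -> list:
    """Return sorted 'os version' strings from the stemcell_library part."""
    data = json.loads(response_text)
    rows = []
    for entry in data.get("stemcell_library", []):
        # "os" and "version" are assumed key names; missing keys show as "?".
        rows.append(f'{entry.get("os", "?")} {entry.get("version", "?")}')
    return sorted(rows)


if __name__ == "__main__":
    import sys

    # Assumes the response body of the stemcell_associations call
    # arrives on stdin.
    for row in list_stemcells(sys.stdin.read()):
        print(row)
```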

Versions and other stuffs

  • Ops Manager: v2.10.16-build.269
  • om 7.3.1

hoegaarden avatar Sep 14 '21 16:09 hoegaarden

Try putting the stemcell version in quotes in your config file. It appears the YAML value is being read as a float instead of a string.
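
To illustrate why the quoting matters: an unquoted 621.141 is a number to most parsers, and a number does not compare equal to the string "621.141". The sketch below uses Python's `json` module as a stand-in for the YAML parser, since an unquoted numeric value is treated the same way in both formats. It is not om's code, and the `version` key is only illustrative.

```python
import json

unquoted = json.loads('{"version": 621.141}')["version"]
quoted = json.loads('{"version": "621.141"}')["version"]

print(type(unquoted).__name__)  # float
print(type(quoted).__name__)    # str

# A float never equals the string form that a version list would hold.
print(unquoted == "621.141")    # False
print(quoted == "621.141")      # True

# Floats also drop trailing zeros, so a version such as 621.10 is altered.
print(str(json.loads("621.10")))  # 621.1
```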

jtarchie avatar Sep 15 '21 02:09 jtarchie

Try putting the stemcell version in quotes in your config file. It appears the YAML value is being read as a float instead of a string.

Right. That might explain the issue with om assign-stemcell --config ..., because we indeed did not quote it there. However, the assignment also did not work when we used:

  • om assign-stemcell -p ... -s ...
  • om assign-multi-stemcell --config ... where we had the stemcell in the config set to "ubuntu-xenial:621.141"
  • om assign-multi-stemcell -p ... -s ...
  • the Opsman WebUI

hoegaarden avatar Sep 15 '21 06:09 hoegaarden

Hi @hoegaarden ,

Thanks for creating this issue, our team will be taking a look at this issue.

jaristiz avatar Oct 20 '21 19:10 jaristiz

I had the very same issue again with:

  • om from the platform automation 5.0.19
  • Opsman 2.10.24-build.360

I suspect that once you have rolled out a newer stemcell, neither Opsman (in the UI) nor om assign-stemcell or om assign-multi-stemcell allows you to go back.

So we again had to introduce a custom task that uses the API directly (via om curl) to downgrade the stemcell.

hoegaarden avatar Mar 16 '22 12:03 hoegaarden

Hi, I have the same issue with opsman 2.10.33

FlorentFlament avatar Mar 22 '22 11:03 FlorentFlament

Just a little update here:

We think there are two things going on with this.

  1. Ops Manager does not list stemcell downgrades in the list of available stemcell associations for a product, or at least, it didn't. (As of May 2022, a fix to this is nominally shipped, in response to this issue even, but I haven't tested to see if this resolves the problem.)
  2. om validates the requested stemcell is on the associations list before attempting to assign it, in a (potentially overly aggressive, not landing perfectly in this unanticipated case) attempt to validate, fail fast, and provide strong error messaging.

Item 1 might be resolved in Ops Manager now. If it is, I'm not sure it's worth revisiting item 2. If we don't hear of anyone else having this problem for a while, it might be time to close this issue.
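
How the two points interact can be shown with a small sketch. It illustrates the behavior described above and is not om's actual implementation; the function name and data shapes are invented for the example. A pre-flight check against the reported list rejects a stemcell that the list omits, even though the om curl trace earlier in the thread shows the API accepting that same assignment.

```python
def validate_stemcell(requested: str, available: list) -> None:
    """Fail fast if the requested version is not in the reported list."""
    if requested not in available:
        raise ValueError(
            f"stemcell version {requested} not found in Ops Manager. "
            f"Available: {', '.join(available)}"
        )


# What Ops Manager reported for the product in this thread: only the newer one.
reported = ["621.151"]

try:
    validate_stemcell("621.141", reported)
except ValueError as err:
    # The downgrade is rejected before any PATCH request is attempted.
    print(err)

# If the list included downgrades, the same check would pass silently.
validate_stemcell("621.141", ["621.151", "621.141"])
```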

anEXPer avatar Aug 02 '22 18:08 anEXPer