GAM
GAM errors out printing devices if it takes > 1 hour to fetch them all
Steps to reproduce
- Run gam print devices on a domain with a large number of devices (150,000 or more). After running for one hour, GAM will fail with an error like: ERROR: 400 - Request contains an invalid argument. - 400
- For a domain with fewer enrolled devices, it's possible to simulate this issue by setting pageSize=1 here and then adding a sleep(600) at the bottom of the while True: loop here. This tells GAM to retrieve only one device per page (the default is 100) and to sleep 10 minutes between page retrievals. Assuming you have at least 6 devices, the process should fail in just over an hour, just like the issue above in a large domain.
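The simulation above can also be modelled without touching a real domain. The following is a minimal sketch, not GAM's actual code: FakeDevicesApi, FakeClock, print_devices and the one-second request latency are all hypothetical names and assumptions made for illustration, and the clock is virtual so nothing really sleeps.

```python
class PageTokenExpired(Exception):
    """Stand-in for the HTTP 400 'Request contains an invalid argument' error."""


class FakeClock:
    """Virtual clock so the example finishes instantly instead of sleeping."""

    def __init__(self):
        self.now = 0.0

    def sleep(self, seconds):
        self.now += seconds


class FakeDevicesApi:
    """Hypothetical paginated API whose page tokens stop working after an hour."""

    TOKEN_LIFETIME = 3600  # seconds, per the behaviour described in this issue
    REQUEST_LATENCY = 1    # assumed seconds spent on each request

    def __init__(self, device_count, clock):
        self.devices = [f"device-{i}" for i in range(device_count)]
        self.clock = clock
        self.first_page_at = None

    def list(self, page_size, page_token=None):
        self.clock.sleep(self.REQUEST_LATENCY)
        if page_token is None:
            # A request without a token starts a new page sequence.
            self.first_page_at = self.clock.now
            start = 0
        elif self.clock.now - self.first_page_at > self.TOKEN_LIFETIME:
            raise PageTokenExpired("400 - Request contains an invalid argument.")
        else:
            start = int(page_token)
        end = start + page_size
        response = {"devices": self.devices[start:end]}
        if end < len(self.devices):
            response["nextPageToken"] = str(end)
        return response


def print_devices(api, clock, page_size=100, delay=0):
    """Page through the fake API, optionally pausing between pages."""
    devices, page_token = [], None
    while True:
        response = api.list(page_size, page_token)
        devices.extend(response["devices"])
        page_token = response.get("nextPageToken")
        if not page_token:
            return devices
        clock.sleep(delay)  # plays the role of the sleep(600) added to the loop
```

Calling print_devices(api, clock, page_size=1, delay=600) against a fake domain of ten devices raises PageTokenExpired on the first request made more than an hour after the first page, while the default arguments return all ten devices.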
Further detail
This is Google internal bug 237397223. It seems that the sequence of pages a script retrieves by following nextPageToken values (a book, if you will) is only good for an hour. If you try to retrieve a page in that sequence more than one hour after the first page was retrieved (the request where pageToken was not set), the API call fails with the above error.
Most customers don't hit this because an hour is plenty of time to retrieve tens of thousands of devices, but large customers with more than 150,000 devices will run into it.
It's also worth noting that deviceUsers.list() faces the same issue.
Other Google API list() calls may face similar issues and should probably be tested.
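As a rough back-of-the-envelope check on the figures above, the sketch below works out how many pages a domain needs and how long each page may take on average before the one-hour limit is hit. The helper name page_budget is hypothetical, and the numbers assume the 100-device page size and 150,000-device threshold mentioned in this issue; real request latency varies.

```python
def page_budget(device_count, page_size=100, token_lifetime=3600):
    """Return (pages needed, average seconds available per page)."""
    pages = -(-device_count // page_size)  # ceiling division
    return pages, token_lifetime / pages


pages, seconds_per_page = page_budget(150_000)
```

Under those assumptions, 150,000 devices means 1,500 pages, so each page has to come back in about 2.4 seconds on average for the whole listing to finish inside the hour.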
Workaround
To work around this issue we can try:
- Retrieve the first hour of pages from the devices.list() API with the orderBy=creation_time parameter set. That parameter ensures we retrieve the oldest devices first.
- As we retrieve pages of devices in this first hour, look at the createTime field of each retrieved device and keep track of the newest device we see.
- When we do get the 400 error, start over with another devices.list() API call and a new set of pages, but this time also set the filter parameter. The filter can be set to something like filter=register:<create_time_of_newest_device>, using the create time of the newest device we've already retrieved. This essentially lets us pick up where we left off when the Google servers threw the error. We do need to de-dupe the results, since Google will send us the last device again, but that can be handled relatively easily.
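The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not GAM's implementation: list_page is a hypothetical callable standing in for devices.list() (the caller would map created_since onto the real filter and orderBy parameters), PageTokenExpired stands in for the 400 error, and FakeDevicesApi is a simulated API with a virtual clock, used only to exercise the loop.

```python
class PageTokenExpired(Exception):
    """Stand-in for the HTTP 400 'Request contains an invalid argument' error."""


def fetch_all_devices(list_page):
    """Collect every device, restarting with a filter when the token expires.

    list_page(page_token, created_since) is a hypothetical callable that
    returns a dict with 'devices' (oldest first) and an optional
    'nextPageToken', and raises PageTokenExpired for a stale token.
    """
    devices = {}        # keyed by name so the re-sent device is de-duped
    newest = None       # createTime of the newest device seen so far
    resume_from = None  # filter value used for the current page sequence
    page_token = None
    while True:
        try:
            response = list_page(page_token=page_token, created_since=resume_from)
        except PageTokenExpired:
            if page_token is None:
                raise   # a first page carries no token, so this is another error
            resume_from, page_token = newest, None  # new sequence, same position
            continue
        for device in response.get("devices", []):
            devices[device["name"]] = device
            # Assumes createTime strings share one format, so they sort as text.
            if newest is None or device["createTime"] > newest:
                newest = device["createTime"]
        page_token = response.get("nextPageToken")
        if not page_token:
            return list(devices.values())


class FakeDevicesApi:
    """Simulated API: each call takes seconds_per_page of virtual time."""

    def __init__(self, devices, seconds_per_page, page_size=2):
        self.devices = sorted(devices, key=lambda d: d["createTime"])
        self.seconds_per_page = seconds_per_page
        self.page_size = page_size
        self.now = 0
        self.sequence_started = 0
        self.sequences = 0  # number of page sequences started
        self.matching = []

    def list_page(self, page_token=None, created_since=None):
        self.now += self.seconds_per_page
        if page_token is None:
            self.sequence_started = self.now
            self.sequences += 1
            self.matching = [d for d in self.devices
                             if created_since is None
                             or d["createTime"] >= created_since]
            start = 0
        elif self.now - self.sequence_started > 3600:
            raise PageTokenExpired("400 - Request contains an invalid argument.")
        else:
            start = int(page_token)
        end = start + self.page_size
        response = {"devices": self.matching[start:end]}
        if end < len(self.matching):
            response["nextPageToken"] = str(end)
        return response


fleet = [{"name": f"devices/{i}", "createTime": f"2022-01-{i + 1:02d}T00:00:00Z"}
         for i in range(10)]
api = FakeDevicesApi(fleet, seconds_per_page=1500)
result = fetch_all_devices(api.list_page)
```

Keying the results by device name is one simple way to handle the de-dupe step: the device that marks the resume point comes back a second time at the start of the new sequence and simply overwrites its earlier entry. In this simulated run the token expires once, a second page sequence is started, and all ten devices are returned exactly once.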
This "bug" drove me crazy last year. :) https://groups.google.com/g/google-apps-manager/c/hMywF_k1FxI/m/lYXSGlJXCwAJ
I just tried to export all browsers with gam print browsers fields browsers and it still seems to behave the same for me (ending with an error after one hour). Was the fix also supposed to cover browser export, or just ChromeOS devices?
This issue should be fixed on Google's end now. The pageToken should continue to work even after 1 hour. Work remains to back out GAM's complicated workarounds that were needed to deal with the bug.
@taers232c fyi
There was a similar bug when printing cros telemetry, do you know if that is fixed as well?
Thanks