
Rolling / Docker release

Timple opened this issue 2 years ago • 34 comments

Feature request

Feature description

Bloom-release the master branch into Rolling, as master is now targeted at Rolling.

Implementation considerations

For people targeting Rolling but having no interest (yet) in compiling nav2 from source, it would save a lot of compilation effort in workspaces and CI if one could apt install ros-rolling-nav2-etc. Of course, one can still build from source by cloning the repository into their workspace.
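For context, the day-to-day difference looks roughly like the following (a sketch only; the binary package names are assumptions based on the usual ros-&lt;distro&gt;-&lt;package&gt; naming, since nothing is released to Rolling yet):

# With binaries released to Rolling (hypothetical package names):
sudo apt install ros-rolling-navigation2 ros-rolling-nav2-bringup

# Today: clone and build everything from source instead.
mkdir -p ~/nav2_ws/src && cd ~/nav2_ws/src
git clone https://github.com/ros-planning/navigation2.git
cd ~/nav2_ws && rosdep install --from-paths src --ignore-src -r -y
colcon build --symlink-install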

Additional considerations

Perhaps this could even be an automated process.

Timple avatar Oct 26 '21 13:10 Timple

Historically, I've held off on doing this since it adds some additional administrative overhead, but I think I need to take some time and reassess that position since a lot of our process has changed since then.

I wanted to avoid email dumps on build failures for small API changes requiring me to aggressively fix an issue and push an update. Since Nav2 has about two dozen packages, I'd be getting hundreds of emails over a week when an API-breaking change was made. Obviously we'd fix it, but it puts that burden 100% onto me and very immediately (interrupting whatever I was doing). The build farm notification system for build failures makes sense if I mess up in an established distro that I shouldn't expect to randomly start failing, but for Rolling, where changes are expected, it's too aggressive.

I can certainly appreciate the user convenience that would provide, but I'm also trying to balance maintenance with new development with such a small core contribution team.

SteveMacenski avatar Oct 26 '21 17:10 SteveMacenski

Well, those are some valid points.

I'd be getting hundreds of emails over a week when an API-breaking change was made.

Won't you get lots of GitHub issues anyhow, since master is broken in that case? The policy now states that master targets Rolling.

I can certainly appreciate the user convenience that would provide, but I'm also trying to balance maintenance with new development with such a small core contribution team.

I'll leave that decision up to you 🙂

Timple avatar Oct 27 '21 14:10 Timple

Won't you get lots of GitHub issues anyhow, since master is broken in that case?

Far fewer than 24+ a day! And usually someone will actually just make the quick fix and submit a PR, but only after apt updating, so it's less immediate. I don't mind a notification of an error to fix, but I want a single-digit number of notifications, not my entire inbox flooded every few hours.

@nuclearsandwich: I'm curious, would a change to Rolling failure notifications be possible? Rolling is unique among distributions in that notifications about failures due to upstream packages should be less liberal, since we expect failures and expect them to happen often. Perhaps one email per repo (e.g. one rosdistro entry with N packages) per day? If the notifications were more reasonable, I'd gladly release to Rolling as I have done with other mono-repos like robot localization and slam toolbox. For my sanity, I can't wake up to 30 emails in my inbox on each failure due to a namespace change or something silly.

SteveMacenski avatar Oct 27 '21 17:10 SteveMacenski

@nuclearsandwich: I'm curious, would a change to Rolling failure notifications be possible? Rolling is unique among distributions in that notifications about failures due to upstream packages should be less liberal, since we expect failures and expect them to happen often. Perhaps one email per repo (e.g. one rosdistro entry with N packages) per day? If the notifications were more reasonable, I'd gladly release to Rolling as I have done with other mono-repos like robot localization and slam toolbox. For my sanity, I can't wake up to 30 emails in my inbox on each failure due to a namespace change or something silly.

As someone who gets emails (pre-sorted into labels and largely not in my inbox) for all build farm failures across all distributions, I completely agree with your point: the deluge is not that helpful. However, the build farm emails come directly from Jenkins, which doesn't support batching or digesting via the existing email plugins I know of. I think that you could add source entries for navigation2 without doing a full bloom into Rolling. This would give you a single CI job for navigation2 based on ROS 2 Rolling binaries for its dependencies and thus one email per build failure. However, it would mean that there would be no binary packages in Rolling for navigation2, which wouldn't necessarily meet @Timple's request.
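For concreteness, a source-only entry for navigation2 in the Rolling distribution.yaml of ros/rosdistro would look roughly like this (a sketch; the branch mirrors the master branch discussed above, and no release section is added):

navigation2:
  source:
    type: git
    url: https://github.com/ros-planning/navigation2.git
    version: master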

Another option that would work today would be to release into Rolling and then consign any build farm emails referencing Rbin jobs to /dev/null via filter. But that's less than ideal. I've just been grepping through the ros_buildfarm source and conferring with @cottsay, and it does not seem like disabling email notifications for binary jobs on a per-repository basis is currently possible, but it's something we may add based on feedback like this. However, I can't commit to an exact timeline for when we'd be able to implement that and deploy it on the build farm. But the feedback is definitely heard and it's something we'll look at. I do think it's likely to be an all-or-nothing setting, either per-package or per-repository, rather than being able to digest emails without resorting to a custom mailer implementation.

nuclearsandwich avatar Oct 27 '21 22:10 nuclearsandwich

However, it would mean that there would be no binary packages in Rolling for navigation2

Yeah, that wouldn't help us; we already have CI set up for Rolling with some heavy caching (thanks to @ruffsl), without the ROS build farm. I think @Timple really wants easy-access binaries.

Another option that would work today would be to release into Rolling and then consign any build farm emails referencing Rbin jobs to /dev/null via filter. But that's less than ideal.

Can you expand on that a little more? Where would we set up this filter? Sounds like this may be the best middle-road solution: not really ideal, but it would meet everyone's needs. I could add a field for Rolling in the Nav2 readme table of build statuses; that way, even without emails, there would be a place to see it (plus I assume I'd get emails from people anyway if I didn't notice).

I CCed @nuclearsandwich on whether that was possible, but it doesn't seem like it is right now. If I were to generalize the request, in case you were going to spend some time and actually build something new for this, it would be:

  • In bloom runs / rosdistro, it would be nice to specify how to respond to build failures
  • One such option should be to batch failure emails by repository / root rosdistro IDs rather than per-package failures, then send emails to the spanning set of people relevant in the package.xml files.
  • Another nice-to-have would be to only send emails once a week regarding Rolling-related failures, rather than every time a build tries to turn over.

Then that could be made default ON for Rolling and OFF for actual releases.

If there are failures in Rolling, it's "more OK" than in other situations, since that's kind of the point of Rolling. I think the notification / seriousness of it should be diluted a bit. Or hell, I'd also take the following instead:

  • Send me emails for every package's build failure in Rolling
  • But only send out notifications right before or right after a new Rolling sync -- since that's when a new API change would land and break things, just notify me once about it, and then again each time a sync is made, i.e. only at the Rolling update / release frequency. I'd take 30 emails a month in those situations.

SteveMacenski avatar Oct 28 '21 01:10 SteveMacenski

emails referencing Rbin jobs to /dev/null via filter

Sounds like a personal email rule which simply trashes the rolling nav2 jobs. That, together with this:

a field for Rolling in the Nav2 readme table of build statuses

would be kind of similar to the current situation: upon build failure, @SteveMacenski won't see any automated emails, but people who notice can file a GitHub issue or even a PR with a fix.

Timple avatar Oct 28 '21 06:10 Timple

SteveMacenski: Can you expand on that a little more? Where would we set up this filter?

Timple: Sounds like a personal email rule which simply trashes the rolling nav2 jobs. That, together with this:

Yeah, I could help provide filter templates for Google Mail, procmail, or Sieve, which would allow the navigation maintainers to ignore received messages regarding navigation2 packages or any Rolling binary jobs while still receiving other build farm emails. But they would have to be applied and maintained by each individual maintainer.

If I were to generalize the request, in case you were going to spend some time and actually build something new for this.

Personally, email administration and management is one of my more perverse and arcane hobbies, so building out a more robust and application-specific buildfarm mailer is tempting, but I don't think it's something I can spend Open Robotics time on, and I won't make any public commitments using my own personal time.

What is readily achievable within the current ros_buildfarm Jenkins automation is a config flag, in either the distribution.yaml file in ros/rosdistro or the release-build.yaml file in ros2/ros_buildfarm_config, that overrides the default maintainer_emails on/off setting for each release-build file. The end result would be that personal email filters would not be required for maintainers who don't wish to get Rolling binary failure messages, but it would still be an all-or-nothing option rather than a bulk or digest option.
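To make that concrete, a hypothetical sketch of how such an override could sit alongside the per-build-file default in a release-build.yaml (the key names here are illustrative only, not an existing option):

notifications:
  maintainers: true          # the build-file-wide on/off default mentioned above
  maintainer_overrides:      # hypothetical per-repository override
    navigation2: false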

nuclearsandwich avatar Oct 29 '21 20:10 nuclearsandwich

Ah ok. I suppose that works as well! Totally understand.

So what's the route forward here? Should I release and then get some rule or how do you want to play this?

SteveMacenski avatar Nov 08 '21 23:11 SteveMacenski

Should I release and then get some rule or how do you want to play this?

Well, I actually sat down to export one of my existing Gmail filters and write a script that generates a search query for a specific set of packages. Due to query length limits in Gmail, and the fact that partial words are not searchable on that platform, it essentially takes one filter per individual package to capture the jobs for each platform.

Here's a quick script that generates a subject query matching each job name that would need to be filtered, for all packages found under the current working directory (it assumes that the package.xml dirname == package name).

If I paste the generated string into Gmail for more than one package at a time, it gives me an error.

I can rewrite the script so that it generates a full Gmail mailFilters.xml file with one filter for each package, but before I get all fancy I want to 1. verify that, at least for @SteveMacenski, Gmail is the correct target mail platform, and 2. see how easy it is to suppress maintainer emails from being sent rather than just dropping them on receipt.

#!/usr/bin/env bash

# Build farm job name templates for ROS 2 Rolling; the PACKAGE placeholder
# is substituted with each package name found below.
JOBS=(
	Rbin_rhel_el864__PACKAGE__rhel_8_x86_64__binary
	Rbin_uF64__PACKAGE__ubuntu_focal_amd64__binary
	Rbin_ufv8_uFv8__PACKAGE__ubuntu_focal_arm64__binary
	Rsrc_el8__PACKAGE__rhel_8__source
	Rsrc_uF__PACKAGE__ubuntu_focal__source
)

# Collect package names from every package.xml under the current directory
# (assumes each package.xml's directory name equals the package name).
PKGS=($(find . -name package.xml | awk -F/ '{ print $(NF - 1) }'))

keywords=()

# Expand every job template for every package.
for pkg in "${PKGS[@]}"; do
	for job in "${JOBS[@]}"; do
		keywords+=("$(printf '%s' "$job" | sed "s:PACKAGE:$pkg:")")
	done
done

# Print a Gmail subject query containing all of the generated job names.
echo "subject:(${keywords[@]})"

nuclearsandwich avatar Nov 11 '21 22:11 nuclearsandwich

Hi,

  1. Gmail is the right platform.
  2. Is there anything I can do to help with that?

SteveMacenski avatar Nov 15 '21 20:11 SteveMacenski

@nuclearsandwich @SteveMacenski Any updates on this?

wep21 avatar Dec 17 '21 00:12 wep21

I haven't been back to this since Steve's confirmation. I'll provide an updated script as it's not trivial to make this alteration in the build farm configurations.

nuclearsandwich avatar Jan 05 '22 18:01 nuclearsandwich

Here's a gist containing a script that generates the full XML format for exported Gmail filters, as well as a sample of the generated XML from my local (and several-months-out-of-date) clone of navigation2.

https://gist.github.com/nuclearsandwich/174a442ccbae8e066af4e05ce1b26138

nuclearsandwich avatar Jan 05 '22 23:01 nuclearsandwich

@SteveMacenski Do you have any time to handle the rolling release?

wep21 avatar Mar 01 '22 18:03 wep21

It's on my queue to look into in March, but we're also still waiting for the Rolling -> 22.04 transition to be fully completed so we're not battling multiple issues at once. Right now, Nav2's CI is down for the same reason.

SteveMacenski avatar Mar 01 '22 19:03 SteveMacenski

@SteveMacenski I guess ros-rolling-ompl will be available in the next sync. https://discourse.ros.org/t/preparing-for-rolling-sync-2022-03-03/24521 I will also try to build nav2 on Ubuntu 22.04 and fix CI.

wep21 avatar Mar 02 '22 01:03 wep21

The Docker images for ros:rolling-ros-base-jammy have finally been pushed (there were some upstream issues with rospkg and some unrelated seccomp bumps that slowed things down), so we have the base images ready for building CI images:

https://github.com/docker-library/official-images/pull/11917

I've tried building Nav2 locally by commenting out some unreleased leaf dependencies and building ompl from source in our underlay, but encountered another snag with ament:

https://github.com/ompl/ompl/issues/883
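For anyone following along, this is roughly the kind of underlay setup being described (paths and branch are illustrative; ompl builds as a plain CMake package under colcon):

mkdir -p ~/underlay_ws/src && cd ~/underlay_ws/src
git clone https://github.com/ompl/ompl.git
cd ~/underlay_ws && colcon build --packages-select ompl
source ~/underlay_ws/install/setup.bash   # then build the Nav2 workspace on top of this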

As for Gazebo, it looks like we should try migrating to Ignition as well, given Gazebo v11 doesn't currently target Jammy:

https://github.com/ignitionrobotics/ros_ign/issues/219#issuecomment-1053946152

For WIP, see:

https://github.com/ros-planning/navigation2/pull/2838

ruffsl avatar Mar 02 '22 17:03 ruffsl

As for Gazebo, it looks like we should try migrating to Ignition as well, given Gazebo v11 doesn't currently target Jammy:

My colleagues will glare daggers at me if I discourage a migration to Ignition, so I'm definitely encouraging a migration to Ignition Gazebo as the way forward, but I'll point out that Jammy is still missing Ignition Gazebo as well (although we should have the rest of the packages imported soon :tm:). But ROS 2 will be able to use Gazebo 11 from the Ubuntu repositories. https://github.com/ros/rosdistro/pull/31560 is an update to the rosdep keys which should enable this. However, Gazebo 11 on Ubuntu Jammy isn't an Open Robotics project and is supported by the Debian and Ubuntu communities.

nuclearsandwich avatar Mar 09 '22 23:03 nuclearsandwich

We're also a bit blocked on the Rolling release by some other issues tangential to our CI build issues with Rolling. This is blocked for the immediate future until we're able to build/test on current Rolling in 22.04.

SteveMacenski avatar Mar 24 '22 00:03 SteveMacenski

Ping @SteveMacenski

I heard the CI issues have been resolved for some time now in e.g. geometry2.

Flova avatar Jun 25 '22 17:06 Flova

This is true, though we're still working through some 22.04-related problems described in https://discourse.ros.org/t/nav2-issues-with-humble-binaries-due-to-fast-dds-rmw-regression/26128. We're still not at a point where it would be wise to release to Rolling.

This is a ticket I frequently visit, it is definitely not forgotten :wink:

SteveMacenski avatar Jun 26 '22 04:06 SteveMacenski

Nice :+1:

I just gave the linked threads a read, and it helped me find a bug in our code where most message callbacks of a specific node never fired! The node was pretty standard, with the default single-threaded executor and so on. I've been on this for a few weeks, and it turns out it is most likely not our fault, as switching to cyclone_dds magically fixed it. The issue is that we cannot use Cyclone because of performance issues with the default executor, so we use an executor from iRobot, and this seems to depend on fastrtps...
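For anyone wanting to try the same comparison, switching the RMW implementation is just an environment variable once the alternative RMW package is installed (the Rolling package name below is only an example; pick the one matching your distro):

sudo apt install ros-rolling-rmw-cyclonedds-cpp
export RMW_IMPLEMENTATION=rmw_cyclonedds_cpp
# relaunch the nodes afterwards; every node that talks to the others should
# use the same RMW implementation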

Sadly, I am quite frustrated with ROS 2 :(, as we are experiencing severe performance degradation, a lot of boilerplate code, and the overall feeling of it being unfinished (e.g. no callback groups for Python, which messes with sim time and tf quite a bit).

BUT I am looking forward to nav2 as it seems to be a big improvement compared to movebase. :)

Flova avatar Jun 26 '22 16:06 Flova