Semantic labeling in maps

Open SteveMacenski opened this issue 4 years ago • 54 comments

Create a basic demo, and perhaps some standards, around enabling semantic information in the map yaml to label places:

  • point and size,
  • pixels,
  • polygons
  • others?

With info like:

  • name string
  • class type
  • geometric information
  • other properties?

Then capabilities to load that information, and some capabilities enabled on top of it.

Ex:

  • navigate to a label (go to dishwasher)
  • avoid a label (stay out of conference room)
  • get distance to label (are you within 10ft of the front door?)
  • where are you (in Steve’s cubicle)
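As a rough illustration of the capabilities listed above, here is a minimal Python sketch. The `Label` class, its fields, and the helper names are hypothetical and not part of any existing Nav2 API; areas are simplified to axis-aligned rectangles, which is an assumption made only to keep the example short.

```python
import math
from dataclasses import dataclass


@dataclass
class Label:
    name: str            # e.g. "dishwasher"
    class_type: str      # e.g. "appliance" or "room"
    x: float             # centre of the label in map coordinates (metres)
    y: float
    width: float = 0.0   # axis-aligned extent; 0 means a pure point
    height: float = 0.0

    def contains(self, px: float, py: float) -> bool:
        """True if the pose lies inside this label's rectangle."""
        return (abs(px - self.x) <= self.width / 2
                and abs(py - self.y) <= self.height / 2)


def goal_for(labels, name):
    """Navigate to a label: return its centre as an (x, y) goal."""
    label = labels[name]
    return (label.x, label.y)


def distance_to(labels, name, px, py):
    """Distance from a pose to the centre of a label."""
    label = labels[name]
    return math.hypot(label.x - px, label.y - py)


def where_am_i(labels, px, py):
    """Names of all area labels that contain the pose."""
    return [lab.name for lab in labels.values()
            if lab.width > 0 and lab.contains(px, py)]


labels = {lab.name: lab for lab in [
    Label("dishwasher", "appliance", 4.0, 3.0),
    Label("conference room", "room", 10.0, 5.0, width=6.0, height=4.0),
]}
```

Avoiding a label could reuse `contains` to reject poses inside an area, for example `labels["conference room"].contains(x, y)`.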

As such I propose we discuss the following, in this order

  • [x] File formats for points, lines, and areas: YAML, XML, OSM-XML, or other
  • [x] What we want out of masks and how to embed that information: pgm/png, XML, or other
  • [ ] A couple potential designs for multi-floor mapping to come to the consensus on how it makes most sense to represent the multiple maps and the "gateways" between them. The goal of this will be only to determine if the multi-map work is a consumer of this semantic work, or a direct member of it needing to be fully designed in tandem.
  • [ ] A couple of potential designs around using the major classes of objects: point, area, mask (e.g. dock or waypoint or elevator; zone or section; speed or keepout) to motivate the design of tools or servers required for this development.
  • [ ] Tools: server to read like map server, services to get all or some of the semantic information, wrappers for calling to get it and parsing the result for the specific objects it wants, IO tools.
  • [ ] Once we have some pretty good understandings of how we want the data formatted and how we want them to be used, we can then discuss the GUI element of this. Though I think we're all in agreement here.

SteveMacenski avatar Mar 11 '20 15:03 SteveMacenski

Following is our conversation on Slack regarding this issue:

@shrijitsingh99 We have put our thoughts, as well as a rough roadmap, in the document attached below. It would be great if we could have some discussion and feedback on it. Semantic Maps.pdf

@SteveMacenski Can you give me some idea on what you're looking for out of semantic maps? I think a good place to start is with a list of things we might want to label or include in our maps to help drive the design. Things like keep outs, speed zones, etc are great and relatively straight forward since they're 1:1 mappings of labels, but I'm also thinking about things like marking doors, elevators, docks, certain common destination points, etc

My initial thought on that would be to add a new map_name_semantics.yaml file to the existing map.yaml file (like how we embed the pgms) which could contain the labels / types / poses of those objects. I think an interesting question is how to represent not just positions but more general areas like naming a room or "in front of a XYZ". Does the OSM format help with this at all?

I love your idea on making the map editors so that it's more user friendly. I think either a Qt or rviz based solution is best, but I'll leave it to you guys to discuss. I'm open to either (maybe Qt is best?). Any thought into how we can use this information in the stack? I think that's an open question with potentially multiple reasonable solutions.

@sarath18 Here are a few ideas we have regarding the question you had.

Semantic Maps Discussion.pdf

@shrijitsingh99

Does the OSM format help with this at all?

Yeah, it is used for representing such information. So whatever data is currently supported in Google Maps can be represented using the OSM file format.

I love your idea on making the map editors so that it's more user friendly.

That's one of the main aims: to make it as easy to use as possible so that more people use it.

I think either a Qt or rviz based solution is best, but I’ll leave it to you guys to discuss.

More research and discussion are needed before we decide which to use.

Any thought into how we can use this information in the stack?

As you said, this will require more discussion and suggestions from everyone. We have some ideas in mind and will follow up on it soon.

@sarath18 Here's the link to the Layered Semantic Maps in Google Maps in case the link in the pdf is not working.

@SteveMacenski The nice thing about yaml is that someone can modify it relatively easily if they want to add something and not use this particular gui tool (e.g. build their own tool or have something automated to add points). OSM format appears to just be XML (https://wiki.openstreetmap.org/wiki/OSM_XML) which could also be fine. We just use yaml for other things so it makes sense to try to keep things consistent across the toolset. I wouldn't want to have some things in XML, others in YAML, and others in OSM. I want to keep the types of formats that affect maps down to just 1. After thinking a little, I think Qt is the best option. Rviz requires ROS to be running, whereas a Qt application can be run anywhere. Plus, if we make a setup assistant, that will likely be PyQt as well, so this could make it easier to integrate those ideas down the line into a single interface. Alexey has not joined the slack yet, I am encouraging him to do so.

I think before starting this work we should have a couple of concrete ideas in mind about how we use (what I see as) three classes of annotation: points, areas, and masks (points meaning a dock or elevator, a discrete thing; areas meaning a room or a zone; and a mask meaning keepout / speed masks over the map)

@shrijitsingh99 Agreed, we should only look into implementation once we have a solid plan. This requires extensive discussion to cover all use cases.

@sarath18 I agree with you that a unified format will be much easier than having parts in YAML and parts in XML. But these maps can scale really quickly, and handling that in YAML might get a little tricky. On the other hand, an OSM-based (XML type) format can provide much better readability and easier parsing. A more extensive discussion of the pros and cons of these formats should be done before coming to a conclusion.

Here's a rough example of what the format would look like. https://gist.github.com/shrijitsingh99/8f24584edb11cda02227dbe21b0cb334

shrijitsingh99 avatar Apr 30 '20 16:04 shrijitsingh99

@AlexeyMerzlyakov I think you should join in on this (I've also sent you some Riot messages that it doesn't look like you've seen on this).

They're looking to collaborate to work on this stuff together and they have some really good ideas to add semantic information like the costmap filters and more into the navigation stack for use.

SteveMacenski avatar Apr 30 '20 22:04 SteveMacenski

The map XML looks pretty good. I'm mostly just wanting to keep things consistent - if we use an XML for this, we should consider moving the map.yaml file to an xml as well. I don't want multiple formats floating around (especially multiple formats within the same section of the stack).

This format doesn't look like it can handle masks - how would you propose having masks as well for certain zones like a speed zone?

XML, YAML, etc, that's a decision to be made but do-able either way. Same with how we represent masks. The largest open question from this proposal is how to use this information. I'm hoping that @AlexeyMerzlyakov and I can help with figuring that out. For things like keep outs / speed there's a clearer way to handle this. It's less obvious for things like docks or rooms.

An example would be to host a latched map metadata topic like the /map but with this information. A subscriber can get it and pick out the information it wants (costmap get keep out polygons, an autonomy stack takes the waypoints to follow, a planner gets the rooms for commands to go somewhere, etc). We should at least be able to make a demo using the large classes of things (docks, rooms, zones, waypoints).

SteveMacenski avatar Apr 30 '20 22:04 SteveMacenski

Another idea could be using the costmap for encoding the semantic info. We could use a .pgm and a .yaml to associate each value in the map to a semantic meaning.

fmrico avatar Apr 30 '20 22:04 fmrico

The map XML looks pretty good. I'm mostly just wanting to keep things consistent - if we use an XML for this, we should consider moving the map.yaml file to an xml as well. I don't want multiple formats floating around (especially multiple formats within the same section of the stack).

Yeah, I too support only one format for maps, whether it is XML, YAML, or something else. I will make a list of pros and cons for both and we could discuss more on that then.

Another idea could be using the costmap for encoding the semantic info. We could use a .pgm and a .yaml to associate each value in the map to semantic meaning.

There are lots of limitations to this. One simple roadblock you will hit: what if a cell carries more than one piece of semantic information? There are other issues as well, such as not being able to query information like "In which room is the fridge?".
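To make the limitation concrete, here is a small Python sketch of the value-per-cell idea. The `VALUE_TO_LABEL` table and the 3x3 grid are made up for illustration; in practice the table would live in the .yaml and the grid in the .pgm.

```python
# Hypothetical value-to-meaning table that would sit in the .yaml next to the .pgm.
VALUE_TO_LABEL = {0: "free", 1: "kitchen", 2: "keepout"}

# A tiny 3x3 "image"; each cell can hold exactly one value.
grid = [
    [1, 1, 0],
    [1, 2, 0],
    [0, 0, 0],
]


def label_at(row, col):
    """Semantic meaning of a single cell."""
    return VALUE_TO_LABEL[grid[row][col]]


def cells_with(label):
    """All (row, col) cells carrying the given label."""
    return [(r, c)
            for r, row in enumerate(grid)
            for c, value in enumerate(row)
            if VALUE_TO_LABEL[value] == label]
```

Cell (1, 1) is marked as keepout, so it can no longer also say that it belongs to the kitchen. Relational questions such as "which room is the fridge in" have no natural place in this encoding either.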

This format doesn't look like it can handle masks - how would you propose having masks as well for certain zones like a speed zone?

A mask would be like any other polygon like a room. The difference would be that it would contain an attribute or property tag specifying it is a mask and the mask name as well as other mask properties.

The largest open-question from this proposal is how to use this information. An example would be to host a latched map metadata topic like the /map but with this information. A subscriber can get it and pick out the information it wants (costmap get keep out polygons, an autonomy stack takes the waypoints to follow, a planner gets the rooms for commands to go somewhere, etc). We should at least be able to make a demo using the large classes of things (docks, rooms, zones, waypoints).

Yeah, so we also had thought of building somewhat of a query server where different entities can request certain data from the map and get the appropriate response. The examples you gave were pretty much what we had in mind.

Having such a query server would essentially cater to the needs of all the different types of applications, whether it's costmaps, normal goal-based planners, or something like a topological planner, and it can even be used to provide predefined paths.

We were looking into data structures like R-Tree which would enable response to such queries.

We will try to get a more concrete specification format proposal together over the weekend.

shrijitsingh99 avatar Apr 30 '20 23:04 shrijitsingh99

A mask would be like any other polygon like a room. The difference would be that it would contain an attribute or property tag specifying it is a mask and the mask name as well as other mask properties.

But a mask isn't a polygon. The situation I'm thinking of is if you want to have a speed zone mask. The values of the pixels are either an absolute speed (1.4m/s) or a percentage (43%) and you have gradients over spaces to ease into and out of restrictions. Or having slower speeds closer to static obstacles and faster elsewhere, like an inflation layer applied to speed. In this case, there's no clear polygon because it's a gradient, unless you'd represent each individual pixel as a point. At that point, you'd really just be describing an image and it would make sense to just store it as an image for both compression and visual inspection (and modification).
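A minimal sketch of the percentage interpretation, in Python. The function name, the 0-100 pixel convention, and the 1.4 m/s default are assumptions for illustration only; nothing here is an existing costmap API.

```python
def speed_limit(mask, row, col, max_speed=1.4):
    """Scale max_speed (m/s) by the mask pixel, read as a percentage 0-100."""
    percent = mask[row][col]
    return max_speed * percent / 100.0


# One row of a mask easing from full speed into a 40% zone and back out.
mask = [[100, 80, 60, 40, 40, 60, 80, 100]]
```

Each pixel carries its own value, so the ramp in and out of the restriction is stored directly in the image rather than derived from a polygon boundary.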

Yeah, so we also had thought of building somewhat of a query server where different entities can request certain data from the map and get the appropriate response. The examples you gave were pretty much what we had in mind.

I think having some concrete examples of how we can integrate this capability into a behavior tree, planner, or wherever we decide is best should be a top priority to figure out. I don't think it necessarily affects the storage, loading/saving, GUI, or map server changes-- but it does affect the usefulness of adding such capability. We're on the same page on the mechanics of "how" (maybe not the exact details yet) to have the information, now's the time for "why".

We were looking into data structures like R-Tree which would enable response to such queries.

That's certainly one direction. If the robot is at point A and you request going to point B, there's no promise or implied relationship that A and B are anywhere near each other topologically. I'm not sure that type of data structure would be particularly well suited for this application unless we were querying in local neighborhoods. We could also just format this like a TF listener where we have some magic topic we use to broadcast this information, and we have an object that listens to it and can be used to get information as needed. I think a hash table would be fine and find via name fields. Another option. There exist many. It depends if we want to have a single XML-reader server (like map server) that broadcasts or each user of that information reads it itself.
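The listener-plus-hash-table option could look roughly like the following Python sketch. `SemanticListener` and the annotation dictionaries are hypothetical; the ROS subscription itself is left out so the example stays self-contained.

```python
class SemanticListener:
    """Caches the latest broadcast of annotations, keyed by name."""

    def __init__(self):
        self._by_name = {}

    def on_broadcast(self, annotations):
        # Replace the cache wholesale, as a subscriber to a latched topic would
        # each time a new message arrives.
        self._by_name = {a["name"]: a for a in annotations}

    def lookup(self, name):
        # Average-case constant-time hash lookup; None if the name is unknown.
        return self._by_name.get(name)


listener = SemanticListener()
listener.on_broadcast([
    {"name": "dock1", "type": "dock", "pose": (1.0, 2.0, 0.0)},
    {"name": "kitchen", "type": "room"},
])
```

A call such as `listener.lookup("dock1")` then returns the cached annotation without any spatial index.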

SteveMacenski avatar Apr 30 '20 23:04 SteveMacenski

Also reposting this that Carlos had posted in the topological navigation ticket: https://www.cpp.edu/~ftang/courses/CS521/notes/topological%20path%20planning.pdf

It brings up a good point if we want to structure this as a sparse graph and how we relate annotated points/polygons to the map (associative or relational)

SteveMacenski avatar Apr 30 '20 23:04 SteveMacenski

The idea seems very interesting. It is closely related to the zones [#1263] and lanes [#1522] tasks that I am currently working on. It is also close to the multi-map/multi-floor tasks. So I think I ought to be involved, and we surely need to collaborate on this.

First, I fully agree with Steve's opinion that we need to clearly identify the use cases of the proposal. Many big areas have already been touched on in the ticket, each with its own application. I think it is reasonable to discuss what is required at this step, before we move on to the system design discussion. This relates to the question "Why?". For example, I see the following use cases here:

  • Costmap Filters (zones and lanes)
  • Multi-floors/multi-maps case
  • Labeled points for navigation (e.g. we are saying to navigator: GoTo point "B", instead of: GoTo {10,20})

I think we definitely need to specify all of them.

The next step is to understand what format could be used for storing and publishing semantic maps. On one hand, OSM/XML or YAML is good for points, labels and objects, but not suitable for items like zones, lanes and so on. On the other hand, we can't do this task using only costmaps (because of the limit of one object per point). Possibly, it might be required to keep both formats in one map: PGM files for map(s) layers, zones/lanes, etc. and say OSM for preserving map metadata. Therefore it is important to understand what we ultimately want.

The open question here is map publishing. If we have many-layered maps and OSM, we need to think over how the ROS2 topics relate. Is it possible to have one OSM master topic and dynamically configured /map_layer_i topics?

Regarding R-Tree queries, I think this is a rather big task involving the map server, the BT navigator, and the topological path planner. In my opinion it would be better to move it to a separate ticket. @SteveMacenski what do you think about this point?

About QT/RViz/Web-based application: my opinion is that QT or RViz are possibly the best solutions, especially a QT-based map editor GUI. This will allow developers to draw their own maps without running the ROS2 infrastructure (AFAIK, ROS2 already requires QT libraries to be installed onboard, so QT is not a problem). Also, I have some experience with window drawing in QT, so I can help here as well.

Unfortunately, this week I have limited access to the Internet. Starting next Monday I will be able to discuss it in more detail.

AlexeyMerzlyakov avatar May 01 '20 16:05 AlexeyMerzlyakov

This is what I had in mind for the map/query server. So that map server would be reading and parsing the XML then storing it in a suitable data-structure (which can be discussed at a later stage). The map server then offers a set of endpoints that can be accessed by using ROS service for a query where the client can request some data such as "location of docking station", "location of waypoint B" or "current room robot is in". This service can also be used to alter the states of various regions such as disabling a zone, marking room as closed temporarily, marking docking station as occupied, etc. This dynamic nature will allow modifying the map at run time. This will allow huge growth potential in the future, one future feature could be defining new zones or waypoints at runtime.
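A rough Python sketch of that idea, without any ROS plumbing. The class, method names, and the `enabled` flag are all hypothetical; in a real system each method would sit behind a ROS service.

```python
class SemanticMapServer:
    """Holds parsed map features and answers queries about them."""

    def __init__(self, features):
        # features: name -> dict with at least a "type" entry
        self._features = {name: dict(f) for name, f in features.items()}

    def query(self, feature_type, only_enabled=True):
        """Return sorted names of features of a type, e.g. all docks."""
        return sorted(
            name for name, f in self._features.items()
            if f["type"] == feature_type
            and (not only_enabled or f.get("enabled", True))
        )

    def set_enabled(self, name, enabled):
        """Runtime state change, e.g. closing a room or disabling a zone."""
        self._features[name]["enabled"] = enabled

    def add_feature(self, name, feature):
        """Define a new zone or waypoint at runtime."""
        self._features[name] = dict(feature)


server = SemanticMapServer({
    "dock_a": {"type": "dock", "pose": (0.0, 0.0)},
    "dock_b": {"type": "dock", "pose": (5.0, 1.0)},
    "lab": {"type": "room"},
})
server.set_enabled("dock_b", False)  # e.g. dock_b is currently occupied
```

After the last line, `server.query("dock")` only reports `dock_a`, which is the kind of dynamic behaviour described above.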

Also reposting this that Carlos had posted in the topological navigation ticket: https://www.cpp.edu/~ftang/courses/CS521/notes/topological%20path%20planning.pdf

It brings up a good point if we want to structure this as a sparse graph and how we relate annotated points/polygons to the map (associative or relational)

I have had a quick glance over this, will look into it in more detail. Both methods associative or relational will need more detailed analysis and discussion to decide what would be a better approach.

We're on the same page on the mechanics of "how" (maybe not the exact details yet) to have the information, now's the time for "why".

Yeah, I have listed a few points above but will try and formalize everything into a document so that we can keep track of all the use cases. What do you think? This should help streamline discussions and ideas.

We could also just format this like a TF listener where we have some magic topic we use to broadcast this information, and we have an object that listens to it and can be used to get information as needed. I think a hash table would be fine and find via name fields. Another option. There exist many.

We can even consider something like a database (though it might be overkill) for this. TBH I think even a brute-force search will be sufficiently fast, since the maps we build will be pretty small compared to OSM, which stores a map of the entire world and so has good reason to optimize search time.

It depends if we want to have a single XML-reader server (like map server) that broadcasts or each user of that information reads it itself.

I am more inclined to a single server since individual queries will be pretty small so each user reading the information itself will add unnecessary overhead.

It is closely related to the zones [#1263] and lanes [#1522] tasks that I am currently working on. It is also close to the multi-map/multi-floor tasks. So I think I ought to be involved, and we surely need to collaborate on this.

Definitely, collaboration is certainly needed.

The open question here is map publishing. If we have many-layered maps and OSM, we need to think over how the ROS2 topics relate. Is it possible to have one OSM master topic and dynamically configured /map_layer_i topics?

Having a separate topic for each zone seems like a good idea, but maybe having a map server that dynamically generates the map layer and returns it would be more general if we are creating the query-response type of system that I mentioned at the beginning of this post.

About QT/RViz/Web-based application: my opinion is that QT or RViz are possibly the best solutions.

QT seems like a general consensus.

We can also look into Lanelet2, it is widely used in HD Maps and is built on top of the OSM specification.

shrijitsingh99 avatar May 01 '20 20:05 shrijitsingh99

Possibly, it might be required to keep both formats in one map:

I agree with you on the point that we might need both types of formats in one map to avoid being limited to one object per point. For multistoried maps, all the data in one level can be encapsulated along with the reference to the .pgm file that describes the map of that particular level.

In this case, there's no clear polygon because it's a gradient, unless you'd represent each individual pixel as a point.

Speaking about masks, having gradient masks for zones seems like a really good idea. I believe adding masks to zones in XML files will be a better option and will have the following advantages over masks defined through pgm files:

  • Gradients for the masks will be generated during runtime using the gradient functions like linear, radial, etc. defined as attributes to the mask property of a zone.
  • Multiple masks can be associated with the same zone without defining the geometry for the zone again and again. On the other hand, we need to create separate pgm file to define each type of mask.
  • A pgm file needs to be edited every time we need to tweak the gradient mask.
  • These gradient generators can be reconfigured dynamically during runtime which is not possible when loading static gradient maps in pgm files.

For example, gradient masks defining directions will work hand in hand with the DirectedLanes mentioned in #1263. The preferred direction can be changed based on parameters such as time of day, emergencies, crowded places, etc.

<zone id="corridor">
  <!-- Node references defining a zone -->

  <!-- Zone Masks-->
  <mask type="speed" gradient="radial" percentage="0.6"/>
  <mask type="direction" gradient="linear"/>
</zone>
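To show what generating a gradient at runtime could mean, here is a small Python sketch of a radial gradient function. The function name, the square grid, and the linear blend from centre to corner are assumptions for illustration; the `percentage="0.6"` attribute is read here as the value at the centre of the zone.

```python
import math


def radial_gradient(size, centre_value, edge_value=1.0):
    """Return a size x size grid blending centre_value to edge_value by distance."""
    c = (size - 1) / 2.0
    max_dist = math.hypot(c, c) or 1.0  # avoid dividing by zero for size == 1
    grid = []
    for r in range(size):
        row = []
        for col in range(size):
            # t is 0 at the centre of the grid and 1 at the corners.
            t = math.hypot(r - c, col - c) / max_dist
            row.append(centre_value + t * (edge_value - centre_value))
        grid.append(row)
    return grid


mask = radial_gradient(5, centre_value=0.6)
```

Changing `centre_value` or swapping the blend function is a parameter change rather than an image edit, which is the reconfigurability argument made in the list above.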

About QT/RViz/Web-based application: my opinion is that QT or RViz are possibly the best solutions.

From all the above discussion, I think we all agree that Qt is the best option for developing the semantic maps editor. These are a few resources/examples that will help in the development of the editor.

Sarath18 avatar May 01 '20 21:05 Sarath18

I think we need to potentially back up and look at solving one problem at a time, there are too many sub-topics being discussed.

I think we're all on the same page about the annotations for points, areas, and lines (rooms, docks, elevator locations, etc). We have some disagreement on the masks, there's some talk of multi-floor mapping, and the topic of how do we use this information. Let's go one step at a time and build on each other. It makes it easier than these long comments hitting on a bunch of different topics.

As such I propose we discuss the following, in this order

  • [x] File formats for points, lines, and areas: YAML, XML, OSM-XML, or other
  • [ ] What we want out of masks and how to embed that information: pgm/png, XML, or other
  • [ ] A couple potential designs for multi-floor mapping to come to the consensus on how it makes most sense to represent the multiple maps and the "gateways" between them. The goal of this will be only to determine if the multi-map work is a consumer of this semantic work, or a direct member of it needing to be fully designed in tandem.
  • [ ] A couple of potential designs around using the major classes of objects: point, area, mask (e.g. dock or waypoint or elevator; zone or section; speed or keepout) to motivate the design of tools or servers required for this development.
  • [ ] Once we have some pretty good understandings of how we want the data formatted and how we want them to be used, we can then discuss the GUI element of this. Though I think we're all in agreement here.

Let's start simple on bullet 1: how do we want to embed information for the lines, points, and areas.

@shrijitsingh99 and @Sarath18 have proposed OSM-XML. Can you give us a brief overview about why you like this format over other XMLs or other open standards? This seems to relate more to the autonomous car space, so I would just like you guys to, for the record, give us the reasons that you came to that decision. Also, does OSM let us create our own object attributes (e.g. can I add a new field whatever-thing as an XML field for a point?) I'm wondering if this standard restricts us from expanding later potentially.

The other alternative is YAML which the existing maps and ROS configuration files are in. YAML is slower to load than XML due to its more complex structure, but that's not overly concerning to me. I've worked with YAML with tens of thousands of annotations on maps without too much of a worry. The current map structure looks like

map.yaml

image: turtlebot3_world.pgm
resolution: 0.050000
origin: [-10.000000, -10.000000, 0.000000]
negate: 0
occupied_thresh: 0.65
free_thresh: 0.196

I assume these would be trivial to move to OSM if we wanted it to be. That would, however, break backwards compatibility completely with existing maps and all ROS1 users.

No matter which format we decide, I think that there should be a new field named annotations, labels, or similar in the map metadata file that points to a separate file (the way image points to turtlebot3_world.pgm), rather than placing the annotations directly into the map metadata file. This is to allow for multiple annotations for the same map to be used (and also backwards compatibility).
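A minimal Python sketch of how a loader might treat such a field. It works on an already-parsed map.yaml dictionary; the field name `annotations`, the file names, and the helper itself are hypothetical.

```python
from pathlib import PurePosixPath


def resolve_annotations(metadata, map_dir):
    """Return annotation file paths referenced by a parsed map.yaml dict."""
    entry = metadata.get("annotations", [])  # hypothetical field name
    if isinstance(entry, str):               # allow a single file or a list
        entry = [entry]
    return [str(PurePosixPath(map_dir) / name) for name in entry]


# An existing map file has no such field and keeps working unchanged.
legacy = {"image": "turtlebot3_world.pgm", "resolution": 0.05}

# The same map with two independent annotation sets attached.
annotated = dict(legacy, annotations=["docks.yaml", "rooms.yaml"])
```

A map without the field yields an empty list, so older maps load as before, while the annotated one resolves to one path per annotation file.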

SteveMacenski avatar May 01 '20 21:05 SteveMacenski

@shrijitsingh99 and @Sarath18 have proposed OSM-XML. Can you give us a brief overview about why you like this format over other XMLs or other open standards?

The OSM specification is a very mature and widely used standard; it is primarily used by OSM as well as liblanelet. Our use case will most likely be a subset of the features that OSM offers, with some minor additions. It is already used for defining geometries, routing, and multi-level maps.

Each element defines tags which have a key and value:

<tag k="name" v="abc" />
<tag k="description" v="xyz" />

So properties or attributes can be added using this.

Also, does OSM let us create our own object attributes (e.g. can I add a new field whatever-thing as an XML field for a point?) I'm wondering if this standard restricts us from expanding later potentially.

It is designed to be generalizable and highly scalable. New features can be added very easily without modifying the parser or the GUI tool at all, since it doesn't define extra XML tags for new functionality like masks. It uses something called a relation for adding properties to geometric features (i.e. points, lines, polygons, etc.).

The example @Sarath18 gave:

<zone id="corridor">
  <!-- Node references defining a zone -->

  <!-- Zone Masks-->
  <mask type="speed" gradient="radial" percentage="0.6"/>
  <mask type="direction" gradient="linear"/>
</zone>

in OSM would look like:

<osm>
  <relation id="56688" user="shrijit" uid="12345" visible="true" version="28" changeset="6947637" timestamp="2011-01-12T14:23:49Z">
    <member type="node" ref="294942404" role=""/>
    ...

    <tag k="name" v="Meeting Room No Go Zone"/>
  </relation>

  <relation id="56689" user="shrijit" uid="12345" visible="true" version="28" changeset="6947537" timestamp="2011-01-12T14:23:49Z">
    <member type="relation" ref="56688" role=""/>
    <tag k="type" v="mask:speed"/>
    <tag k="gradient" v="radial"/>
    <tag k="percentage" v="60%"/>
  </relation>

  <relation id="56690" user="shrijit" uid="12345" visible="true" version="28" changeset="6947537" timestamp="2011-01-12T14:23:49Z">
    <member type="relation" ref="56688" role=""/>
    <tag k="type" v="mask:direction"/>
    <tag k="gradient" v="linear"/>
  </relation>
</osm>

So adding something like a route at a later stage can be done using a relation with <tag k="type" v="route"/>. This will not require modification of the GUI tool or the parser; you just have to add logic to process the new data.
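The point about the parser not changing can be sketched with Python's standard `xml.etree.ElementTree`. The snippet below is a trimmed, made-up OSM fragment and `relations_by_type` is a hypothetical helper; the parsing code is the same whether the `type` tag says `mask:speed` or `route`.

```python
import xml.etree.ElementTree as ET

OSM_SNIPPET = """
<osm>
  <relation id="56688">
    <member type="node" ref="294942404" role=""/>
    <tag k="name" v="Meeting Room No Go Zone"/>
  </relation>
  <relation id="56689">
    <member type="relation" ref="56688" role=""/>
    <tag k="type" v="mask:speed"/>
    <tag k="gradient" v="radial"/>
  </relation>
  <relation id="56690">
    <member type="relation" ref="56688" role=""/>
    <tag k="type" v="route"/>
  </relation>
</osm>
"""


def relations_by_type(xml_text):
    """Group relation ids by their 'type' tag; untyped relations go under None."""
    grouped = {}
    root = ET.fromstring(xml_text.strip())
    for rel in root.iter("relation"):
        tags = {t.get("k"): t.get("v") for t in rel.findall("tag")}
        grouped.setdefault(tags.get("type"), []).append(rel.get("id"))
    return grouped


grouped = relations_by_type(OSM_SNIPPET)
```

Only the code that consumes `grouped["route"]` would be new when routes are introduced.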

OSM currently cannot be directly used for our purpose and would require a minor modification to work in the local cartesian system. It currently works in the geographic coordinate system. Still, even after this modification, we can leverage existing OSM tools like GUI tools, parsers, inter-format conversion tools, etc. without minor modifications to the code.

OSM being a specification supports a multitude of formats, OSM-XML being one of them. So if needed we can store data in multiple formats.

The simplicity of the OSM format having only few tags is one major factor that appeals to me. Choosing the OSM spec will allow us to focus on 'why what and how' without being bogged down by discussion on the format every time we add a new feature.

Whether to use YAML, XML or JSON can be discussed, but it doesn't really affect the specification. In my view, XML has the advantage of being able to define tag attributes, but it makes everything look clunky. YAML looks very clean and is already used in ROS, but messed-up indentation can become a pain to fix.

There are already several articles online comparing the different formats in detail, so I won't go into that here.

That would, however, break backwards compatibility completely with existing maps and all ROS1 users.

A break might be inevitable since the current format is only for single floor. When we add multi-floor we will have to add multiple pgm files to the YAML file.

This is to allow for multiple annotations for the same map to be used

I don't really see a scenario for this to happen. I am not currently a big fan of having multiple files for maps. map.yaml feels more of a parameter file for map_server and not really related to the map actually so we should treat it like another configuration file for a node.

shrijitsingh99 avatar May 02 '20 07:05 shrijitsingh99

A break might be inevitable since the current format is only for single floor. When we add multi-floor we will have to add multiple pgm files to the YAML file.

Backward compatibility might be provided by introducing a new tag in the YAML, e.g. called map. If there is no such tag, the map.yaml is treated as old-map compatible; if it exists, we may use the new multi-floor/multi-map model. For example, the map.yaml for a 2-floor configuration might look like:

map:
  FL1:
    image: world_A.pgm
    labels: world_A.osm
    resolution: 0.050000
    origin: [-10.000000, -10.000000, 0.000000]
    negate: 0
    occupied_thresh: 0.65
    free_thresh: 0.196
  FL2:
    image: world_B.pgm
    labels: world_B.osm
    resolution: 0.150000
    origin: [-15.000000, 20.000000, -5.000000]
    negate: 0
    occupied_thresh: 0.7
    free_thresh: 0.1

This map could also reference OSM files (if they exist) containing labels. The main shortcoming of this approach is that it produces a paradigm with many file types: PGM+OSM+YAML, which I think we need to avoid.
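The compatibility rule could be sketched in Python as below. It assumes the YAML has already been parsed into a dictionary and that the `map` entry maps floor ids to per-floor settings; the function name and the `"default"` floor id are hypothetical.

```python
def normalize_map_config(config):
    """Return {floor_id: settings} for both legacy and multi-floor configs."""
    if "map" in config:
        # New style: the "map" entry holds one settings block per floor.
        return {floor: dict(settings) for floor, settings in config["map"].items()}
    # Legacy style: the whole file describes a single, unnamed floor.
    return {"default": dict(config)}


legacy = {"image": "turtlebot3_world.pgm", "resolution": 0.05}
multi = {"map": {
    "FL1": {"image": "world_A.pgm", "labels": "world_A.osm"},
    "FL2": {"image": "world_B.pgm", "labels": "world_B.osm"},
}}
```

Downstream code then only ever deals with the `{floor_id: settings}` form, regardless of which style of file was loaded.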

As an alternative, the map server API could support metadata in either YAML or OSM files. YAML would remain for backward compatibility with ROS1 and the current configuration; OSM would be used for newer types of (multi)maps. I think there is no problem converting the YAML-compatible format into OSM and vice versa dynamically.

If we choose OSM as the metadata type, it is not clear how we could publish and subscribe to OSM data over a ROS2 topic.

AlexeyMerzlyakov avatar May 04 '20 15:05 AlexeyMerzlyakov

OSM currently cannot be directly used for our purpose and would require a minor modification to work in the local cartesian system. It currently works in the geographic coordinate system.

How do we overcome that?

Still, even after this modification, we can leverage existing OSM tools like GUI tools, parsers, inter-format conversion tools, etc. without minor modifications to the code.

I assume you mean with minor modification, so we're at least working in cartesian coordinates (and probably some buttons like drop dock / waypoints / etc that are nav specific)

OSM recommends using PBF rather than OSM-XML. Why not that? By the time we're changing formats and if we're going to add a GUI and such, why not make it high performance?

A break might be inevitable since the current format is only for single floor. When we add multi-floor we will have to add multiple pgm files to the YAML file.

That's not correct. If we continue with yaml, we would only need to add more fields for different floor map locations, not removing any information. This is 100% backwards compatible for the case of a single floor. We must provide conversion scripts if we go this route, but I think that messing with this specification could be very detrimental for getting users to move to Nav2 if their entire library of maps are in an incompatible format. We may need to support both, even.

I am not currently a big fan of having multiple files for maps. map.yaml feels more of a parameter file for map_server and not really related to the map actually so we should treat like another configuration file for a node.

From real-world experience, this separation is important. There should be a central file for telling it where to find other stuff if necessary. While I wasn't around in the Willow days when the map.yaml was created, my guess is it's not because they couldn't embed that information into the header of the pgm. The map.yaml is not a configuration for the map_server because it's associating data with hyperparameters required to interpret that data.

@AlexeyMerzlyakov's suggestion looks as reasonable as anything else. I may have preferred a mapping of {floor_id: /path/to/osm} but we don't need to get bogged down in details on that. I mirror his thoughts on backwards compatibility of files and/or conversions and/or backwards support of yaml.

I think per @shrijitsingh99's comment, there is some benefit drawn from OSM (or at least conversions to and from it) in use of GUI tools for annotation, if it's possible to modify them to be useful for our needs. I would be OK with using OSM for the labels / annotations if we can show that we can embed custom fields on points and the necessary annotations we may need for navigation. A few examples:

  • Point that has type of "Dock" with a name of "Dock1" with coordinates (X, Y, Theta) and priority value 0.52
  • Point of type Waypoint with name "waypoint 71" with coordinates (X, Y, Theta) and a tolerance of 0.42
  • Point of type Viapoint with name "Front Door" with coordinates in GPS with tolerance of 0.42 and action type "take photo".
  • An area with label living room and a vector list of items in the room (e.g. TV, couch, etc)
  • A region labelled no go zone with name "Chemical Spill"
  • How to support multiple levels. I imagine we have a base XML with just includes to the other XMLs with a name for "Floor 1", "Floor 3". etc.
  • Global settings of the map like the path (relative or absolute) to the map image relative/path/to/map.png, time created, location, map frame origin, number of fields in the OSM file, etc.

Is there an example GUI that you think we could use as a starting point to make some GUI editor program if we use this format that can support custom fields like above? I imagine in a simple case we have the map in a Qt window with a side bar of buttons like "drop dock" to drag and drop or set exact coordinates, which then opens a menu to input other metadata about that object. Or for the areas, drag a box or make line segments to create a shape. It would be great if, for basic primitives like dock, waypoint, door, elevator, etc., there were just direct buttons with pre-configured metadata profiles.

If so, then we'd just need to backwards support .yaml files in the code I suppose and probably also provide a conversion script (which would be really simple with only 5-6 entries). My hesitations are making sure we don't have YAML + OSM + PNG like Alexey says, making sure there's a way for existing yaml users to use Nav2 through native support or conversions, making sure that the standard can cleanly support the types of data we require (see above), and that there's some tooling in the ecosystem for that format that makes it valuable to use for our needs. If we satisfy those, I'm OK with using OSM for the points / regions / connections annotations.
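As a rough illustration of how small such a conversion script could be, here is a minimal Python sketch. It assumes the usual map_server YAML fields (image, resolution, origin, negate, occupied_thresh, free_thresh) and stores them as key/value tags under a root element; the root element name `map` and the use of `<tag k= v=>` for global settings are assumptions for illustration, not an agreed format.

```python
import xml.etree.ElementTree as ET

# Fields of a classic map.yaml, written as a plain dict so the sketch needs
# no YAML library. The values are made up for illustration.
legacy_map = {
    "image": "relative/path/to/map.png",
    "resolution": 0.05,
    "origin": [-10.0, -10.0, 0.0],
    "negate": 0,
    "occupied_thresh": 0.65,
    "free_thresh": 0.196,
}


def yaml_dict_to_xml(meta):
    """Store every map.yaml entry as a <tag k=... v=.../> under the root."""
    root = ET.Element("map")  # hypothetical root element name
    for key, value in meta.items():
        if isinstance(value, (list, tuple)):
            value = " ".join(str(v) for v in value)
        ET.SubElement(root, "tag", k=key, v=str(value))
    return root


def xml_to_yaml_dict(root):
    """Inverse direction: read the root-level tags back as strings."""
    return {t.get("k"): t.get("v") for t in root.findall("tag")}


root = yaml_dict_to_xml(legacy_map)
print(ET.tostring(root, encoding="unicode"))
```

Values come back as strings in this sketch, so a real loader would still need to cast them to the types map_server expects.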

SteveMacenski avatar May 04 '20 19:05 SteveMacenski

The backward compatibility might be provided by introducing a new tag in YAML, e.g. called map.

Sounds good

As an alternative, the map server API could support both YAML and OSM metadata files.

Yeah, this will be a good approach. Converting between either types will be pretty straightforward.

If we choose the OSM metadata type, it is not clear how we could subscribe/publish OSM using a ROS2 topic.

So this requires further discussion. I had mentioned some stuff here:

This is what I had in mind for the map/query server. So that map server would be reading and parsing the XML then storing it in a suitable data-structure (which can be discussed at a later stage). The map server then offers a set of endpoints that can be accessed by using ROS service for a query where the client can request some data such as "location of docking station", "location of waypoint B" or "current room robot is in". This service can also be used to alter the states of various regions such as disabling a zone, marking room as closed temporarily, marking docking station as occupied, etc. This dynamic nature will allow modifying the map at run time. This will allow huge growth potential in the future, one future feature could be defining new zones or waypoints at runtime.
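To make the idea concrete, here is a minimal sketch in plain Python of the kind of in-memory structure such a server could sit on top of. It is not a ROS service; the class and method names (`Feature`, `SemanticStore`, `location_of`, `of_kind`, `set_enabled`) are all hypothetical and only illustrate the query and state-change endpoints described above.

```python
from dataclasses import dataclass, field


@dataclass
class Feature:
    name: str
    kind: str  # e.g. "Dock", "Waypoint", "Zone"
    x: float
    y: float
    enabled: bool = True
    props: dict = field(default_factory=dict)


class SemanticStore:
    """In-memory stand-in for the data structure behind the query service."""

    def __init__(self):
        self._features = {}

    def add(self, feature):
        self._features[feature.name] = feature

    def location_of(self, name):
        f = self._features[name]
        return (f.x, f.y)

    def of_kind(self, kind):
        # Only features that are currently enabled are reported.
        return [f.name for f in self._features.values()
                if f.kind == kind and f.enabled]

    def set_enabled(self, name, enabled):
        self._features[name].enabled = enabled


store = SemanticStore()
store.add(Feature("Dock1", "Dock", 1.0, 1.0, props={"occupied": False}))
store.add(Feature("waypoint B", "Waypoint", 5.0, 6.0))
store.add(Feature("Chemical Spill", "Zone", 7.5, 7.5))

print(store.location_of("Dock1"))
store.set_enabled("Chemical Spill", False)  # e.g. disabling a zone at run time
print(store.of_kind("Zone"))
```

Each method would map to one service request in the real server; the runtime `set_enabled` call is what gives the dynamic behaviour mentioned above.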

So building on this, for zone maps we can either internally in the map_server convert to occupancy grids and publish or just publish out the polygon and gradient information in a custom message type.

Something along these lines; I would like to hear opinions regarding this mechanism. I can expand on the above as I might not have been very clear on this.

How do we overcome that?

Haven't gone deep into this, but from the surface, since it is just a specification, we are free to modify it and add stuff to it, as we will be interpreting the format however we find suitable.

My main point was to use the OSM specification as a base to build our own spec, not to exactly use OSM.

OSM recommends using PBF rather than OSM-XML. Why not that? By the time we're changing formats and if we're going to add a GUI and such, why not make it high performance?

Yeah, it supports multiple formats, so we can choose whatever we think fits. I was stressing building off the specification; the file format can be anything, be it PBF, XML, or YAML, since they are all interconvertible.

There should be a central file for telling it where to find other stuff if necessary.

That makes sense.

  • Point that has type of "Dock" with a name of "Dock1" with coordinates (X, Y, Theta) and priority value 0.52

  • Point of type Waypoint with name "waypoint 71" with coordinates (X, Y, Theta) and a tolerance of 0.42

  • Point of type Viapoint with name "Front Door" with coordinates in GPS with tolerance of 0.42 and action type "take photo".

  • An area with label living room and a vector list of items in the room (e.g. TV, couch, etc)

  • A region labelled no go zone with name "Chemical Spill"

You can represent the above stuff for sure.

  • How to support multiple levels. I imagine we have a base XML with just includes to the other XMLs with a name for "Floor 1", "Floor 3". etc.

They do have support for multi-levels, you can see it on their maps, but haven't looked into this in detail. Will look it up.

  • Global settings of the map like the path (relative or absolute) to the map image relative/path/to/map.png, time created, location, map frame origin, number of fields in the OSM file, etc.

Can be added with custom tags, not sure if they're in the current spec.

I think there are 2 discussions going on and getting mixed up. From my perspective, the specification of how you represent the semantic data (like the OSM specification, what tags to have, etc.) is separate from which file format we use, because you can use any file format given a spec.

I might have been unclear about this, so stressing it again: I am not saying we use OSM directly, but that we use it as a base, incorporating its core features (how it represents lines, polygons, properties, etc.) to build our own specification.

shrijitsingh99 avatar May 04 '20 21:05 shrijitsingh99

My main point was to use the OSM specification as a base to build our own spec, not to exactly use OSM.

In that case, aren't we just talking about XML then? If we build from it, then we probably won't have direct access to the annotation tools.

You can represent the above stuff for sure.

Can you provide snippet examples of these in the spec? I'm looking for direct validation that these can be supported.

I think there are 2 discussion going on and getting mixed up. From my perspective the specification of how you represent the semantic data (like the OSM specification, what tags to have etc.) is separate from which file format we use, because you can use any file format given a spec.

OSM is the metadata; the png / pgm / etc. are the actual map images. What I was looking for from that line is just the ability to have some global parameters, one of which can be a filepath to the map image. I'm not asking that OSM knows or cares about what this is. We just need to be able to globally embed the same information as in the map.yaml in this map.osm file.

I might have been unclear about this so stressing on it again: I am not saying we use the OSM directly but use it as a base incorporating its core features (how it represents lines, polygons, properties etc.) to build own specification.

I think that's where you lose me a bit. If we're not using this spec to use the tools that it works with, why use it at all? I can see value in using an existing standard and also then reaping the benefits of tooling available. If we're going to change the standard, then we're really just talking about XML.

SteveMacenski avatar May 04 '20 21:05 SteveMacenski

Yeah, this will be a good approach. Converting between either types will be pretty straightforward.

Looks like we are now on the same page about backward compatibility. Great.

So this requires further discussion. I had mentioned some stuff here ...

I agree that having a server with pointy service queries is a good practice. However, this does not cover all cases. Let's imagine I am writing a new path planner that takes into account all the features we are discussing here (dock stations, doors, room types, etc.). This planner wants to have all the map information, including the metadata, in one place in order to do its job. If the path planner starts sending the map server a bunch of pointy requests per object (location of docking station, location of the door, etc.) on each path planning iteration (requests have to be repeated because of the dynamism of the world, as mentioned above), this may significantly affect overall system performance.

Therefore we may additionally need the ability to share the whole metadata dynamically in one place through the map server. The new imaginary path planner could parse the metadata file itself and select the necessary features from it, but I consider this to be the map server's responsibility. In this case we can continuously share the whole metadata through a ROS2 topic or have aggregating service requests. However, both msg and srv formats are rather restricted to specific types of data. Adding a new type of object to the metadata will require rebuilding services and/or messages and will break compatibility with previous versions of messages or requests, which does not suit the flexibility we are looking for.

Another way is to share a hash-table[] via a /topic with integer hashes (for objects' keys) and their values. Just a brainstorm. Anyway, I think it is an open question for today.

Another open question is the one @SteveMacenski raised: why not prefer XML over OSM if OSM is not fully suitable for us and we need to adjust/modify the OSM format along with the tools for it to fit our needs? XML or even YAML looks more straightforward for that.

AlexeyMerzlyakov avatar May 05 '20 16:05 AlexeyMerzlyakov

@AlexeyMerzlyakov please keep the conversation on topic; we're not yet at discussing how we use the information. See the bulleted list up the thread. We'll be talking forever and never doing if we don't keep focused on one issue at a time. Right now we're discussing the format to save points, areas, and map metadata. (But we could easily have a service on map_server to get all metadata or metadata matching some regex by feature name/type/area.)

Why not prefer XML over OSM if OSM is not fully suitable for us and we need to adjust/modify the OSM format along with the tools for it to fit our needs? XML or even YAML looks more straightforward for that.

This is the big question for me. OSM as a spec makes sense to use if we can use it and use tools from it, but if we're going to change it and not be able to use the tools, then it's just XML. There's not a problem with that. I'm just trying to make sure we're making a decision with all the facts based on how we'll end up functionally using this. I really don't want to keep on this discussion for another week.

@shrijitsingh99 can you comment on whether it's just XML or whether OSM means something here we need to consider? With XML we can obviously replicate anything in the existing yaml, so I have no concerns. We'll just need a yaml (for backwards support) and an xml parsing library. If it's just XML and @shrijitsingh99 this is the way we want to go, I approve.

SteveMacenski avatar May 05 '20 19:05 SteveMacenski

Hello everyone, I have talked to @shrijitsingh99 regarding the file format, and what I believe he wants to convey is that we use the standards specified in these formats and then use them for creating semantic maps. By standards, we mean how semantic information is defined in these formats and the relationships among them. For example:

  • OSM: nodes, ways/paths, zones, relations
  • GeoJSON: type, geometry, properties

and other related formats used in mapping applications.

We can use the maturity of all these standards and build on top of it to create our own semantic information. Among these, we thought OSM was the best to build upon. The following links:

define the basic elements and standards used to represent data and not the file format XML or PBF. Our implementation will boil down to using XML and YAML as file formats (which hopefully everyone at this point agrees on).

By using these standards we can leverage the power of already existing tools and build on top of them to support our use case.

Now by using XML with YAML for semantic maps, I would like to summarize all the key features we have discussed:

  • Addition of semantic information to existing PGM/image files
  • Backward compatibility
  • The tools that will be used (GUI and parsers) are already present in the ecosystem, i.e. YAML is used in map configs and XML in BehaviorTrees
  • Using these file formats will provide high readability and configuration capabilities. PBF (compressed) files will have no readability.
  • We can convert the data storage into any format we want (JSON, PBF) and highly compressed file formats like PBF can be used to publish map metadata.

Sarath18 avatar May 05 '20 20:05 Sarath18

@shrijitsingh99 can you comment on whether it's just XML or whether OSM means something here we need to consider? With XML we can obviously replicate anything in the existing yaml, so I have no concerns. We'll just need a yaml (for backwards support) and an xml parsing library. If it's just XML and @shrijitsingh99 this is the way we want to go, I approve.

I think we all are nearly on the same page. @Sarath18 summarized what I wanted to say more clearly.

So if by XML you mean creating a new standard which uses some of the concepts of the OSM spec (namely points, ways and relations) then I guess we are on the same page.

Coming to the file format, do we want to use XML, YAML, or something compressed? I am fine with any, but like @Sarath18 said, if we use a compressed format there will be no readability.

Backward compatibility is 100% agreed upon, how we are going to make it backward compatible exactly needs one final discussion.

shrijitsingh99 avatar May 05 '20 20:05 shrijitsingh99

Point that has type of "Dock" with a name of "Dock1" with coordinates (X, Y, Theta) and priority value 0.52

<node id="2" x="1.0" y="1.0" z="0.0" yaw="1.57" >
    <tag k="type" v="Dock" />
    <tag k="name" v="Dock1" />
    <tag k="priority" v="0.52" />
</node>

Point of type Waypoint with name "waypoint 71" with coordinates (X, Y, Theta) and a tolerance of 0.42

<node id="8" x="5.0" y="6.0" z="0.0" yaw="1.57" >
    <tag k="type" v="Waypoint" />
    <tag k="name" v="waypoint 71" />
    <tag k="tolerance" v="0.42" />
</node>
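For validation, a minimal Python sketch of reading such nodes with the standard library parser. It assumes the coordinate attributes are `x`, `y`, `z` and `yaw` and that the nodes sit under some root element (called `map` here, which is a hypothetical name).

```python
import xml.etree.ElementTree as ET

# The two point examples, wrapped in a hypothetical <map> root element.
SNIPPET = """
<map>
  <node id="2" x="1.0" y="1.0" z="0.0" yaw="1.57">
    <tag k="type" v="Dock" />
    <tag k="name" v="Dock1" />
    <tag k="priority" v="0.52" />
  </node>
  <node id="8" x="5.0" y="6.0" z="0.0" yaw="1.57">
    <tag k="type" v="Waypoint" />
    <tag k="name" v="waypoint 71" />
    <tag k="tolerance" v="0.42" />
  </node>
</map>
"""


def parse_nodes(xml_text):
    """Return {node id: {"pose": (x, y, yaw), "tags": {key: value}}}."""
    root = ET.fromstring(xml_text)
    nodes = {}
    for node in root.iter("node"):
        tags = {t.get("k"): t.get("v") for t in node.findall("tag")}
        pose = (
            float(node.get("x")),
            float(node.get("y")),
            float(node.get("yaw", 0.0)),  # yaw is treated as optional
        )
        nodes[int(node.get("id"))] = {"pose": pose, "tags": tags}
    return nodes


nodes = parse_nodes(SNIPPET)
for node_id, info in nodes.items():
    print(node_id, info["tags"]["type"], info["pose"])
```

Custom fields such as `priority` and `tolerance` come through as ordinary tags, so the parser does not need to know about them in advance.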

Point of type Viapoint with name "Front Door" with coordinates in GPS with tolerance of 0.42 and action type "take photo".

This one is not very straightforward as you will need to define a GPS reference point in the map.

<reference id="10" lat="54.0889580" lon="12.2487570" x="0" y="0" />

<node id="9" lat="13.74534" lon="14.86546" >
    <tag k="type" v="Viapoint" />
    <tag k="name" v="Front Door" />
    <tag k="tolerance" v="0.42" />
</node>
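A minimal sketch of how a reference point like this could be used to place a GPS node in the local map frame. It uses a simple equirectangular approximation and assumes the map x axis points east and y points north with no rotation, which a real map would not guarantee; the function name `gps_to_local` is hypothetical. The approximation only holds for points close to the reference.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius, spherical approximation


def gps_to_local(lat, lon, ref_lat, ref_lon, ref_x=0.0, ref_y=0.0):
    """Project (lat, lon) to local (x, y) in metres around a reference point.

    Equirectangular approximation: only reasonable near the reference,
    and it assumes x = east and y = north.
    """
    d_lat = math.radians(lat - ref_lat)
    d_lon = math.radians(lon - ref_lon)
    x = ref_x + EARTH_RADIUS_M * d_lon * math.cos(math.radians(ref_lat))
    y = ref_y + EARTH_RADIUS_M * d_lat
    return x, y


# Reference taken from the <reference> element; the offset point is made up.
REF_LAT, REF_LON = 54.0889580, 12.2487570
print(gps_to_local(REF_LAT, REF_LON, REF_LAT, REF_LON))
print(gps_to_local(REF_LAT + 0.001, REF_LON, REF_LAT, REF_LON))
```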

An area with label living room and a vector list of items in the room (e.g. TV, couch, etc)

<node id="213" x="5.0" y="10.0" />
<node id="214" x="5.0" y="5.0" />
<node id="215" x="10.0" y="5.0" />
<node id="216" x="10.0" y="10.0" />

<node id="2" x="5.0" y="6.0" z="0.0" yaw="1.57" >
    <tag k="name" v="TV" />
</node>
<node id="3" x="7.0" y="6.0" z="0.0" yaw="1.57" >
    <tag k="name" v="Couch" />
</node>
<node id="4" x="8.0" y="6.0" z="0.0" yaw="1.57" >
    <tag k="name" v="Table" />
</node>

<way id="312">
    <nd ref="213" />
    <nd ref="214" />
    <nd ref="215" />
    <nd ref="216" />
    <nd ref="213" />
    <tag k="name" v="Living Room" />
</way>

You get the list of items in the room by querying all the items defined within the room's bounding area, so there is no need for an explicit relation between the contents of the room. Nonetheless, if you do need an explicit relation between these entities, you can do it as below:

<relation id="416">
 <member type="Node" ref="2" />
 <member type="Node" ref="3" />
 <member type="Node" ref="4" />
 <tag k ="name" v="Living Room Contents" />
</relation>
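The implicit variant (querying which items fall inside the room polygon) could look roughly like the Python sketch below. It uses a standard ray-casting point-in-polygon test; the item coordinates assume the second attribute of each item node is `y`, and "Fridge" is a made-up item outside the room added only for contrast.

```python
def point_in_polygon(px, py, polygon):
    """Ray casting: count how many edges a ray going in +x from the point crosses."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > py) != (y2 > py):
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if px < x_cross:
                inside = not inside
    return inside


# Corner nodes 213..216 of the "Living Room" way.
living_room = [(5.0, 10.0), (5.0, 5.0), (10.0, 5.0), (10.0, 10.0)]

# Points exactly on an edge (like a TV at x=5.0) depend on the convention of
# the test, so they are left out of this illustration.
items = {"Couch": (7.0, 6.0), "Table": (8.0, 6.0), "Fridge": (12.0, 6.0)}

contents = [name for name, (x, y) in items.items()
            if point_in_polygon(x, y, living_room)]
print(contents)
```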

A region labelled no go zone with name "Chemical Spill"

<node id="213" x="5.0" y="10.0" />
<node id="214" x="5.0" y="5.0" />
<node id="215" x="10.0" y="5.0" />
<node id="216" x="10.0" y="10.0" />

<way id="312">
    <nd ref="213" />
    <nd ref="214" />
    <nd ref="215" />
    <nd ref="216" />
</way>


<relation id="415">
    <member type="Way" ref="312" />
    <tag k="type" v="No Go Zone" />
    <tag k="name" v="Chemical Spill" />
</relation>
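A minimal Python sketch of how a consumer might resolve such a relation back to polygon coordinates (relation, then member way, then node positions). The `map` root element and the function name `zones_of_type` are hypothetical.

```python
import xml.etree.ElementTree as ET

DOC = """
<map>
  <node id="213" x="5.0" y="10.0" />
  <node id="214" x="5.0" y="5.0" />
  <node id="215" x="10.0" y="5.0" />
  <node id="216" x="10.0" y="10.0" />
  <way id="312">
    <nd ref="213" />
    <nd ref="214" />
    <nd ref="215" />
    <nd ref="216" />
  </way>
  <relation id="415">
    <member type="Way" ref="312" />
    <tag k="type" v="No Go Zone" />
    <tag k="name" v="Chemical Spill" />
  </relation>
</map>
"""


def zones_of_type(xml_text, zone_type):
    """Return {zone name: [polygon, ...]} for relations with the given type tag."""
    root = ET.fromstring(xml_text)
    coords = {n.get("id"): (float(n.get("x")), float(n.get("y")))
              for n in root.findall("node")}
    ways = {w.get("id"): [coords[nd.get("ref")] for nd in w.findall("nd")]
            for w in root.findall("way")}
    zones = {}
    for rel in root.findall("relation"):
        tags = {t.get("k"): t.get("v") for t in rel.findall("tag")}
        if tags.get("type") != zone_type:
            continue
        zones[tags.get("name")] = [ways[m.get("ref")]
                                   for m in rel.findall("member")
                                   if m.get("type") == "Way"]
    return zones


no_go = zones_of_type(DOC, "No Go Zone")
print(no_go)
```

Because the relation only references the way by id, a second relation with a different type could reuse the same polygon without duplicating its nodes.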

How to support multiple levels. I imagine we have a base XML with just includes to the other XMLs with a name for "Floor 1", "Floor 3". etc.

There is a way for this, but haven't looked at it. Will go through the docs and update this.

shrijitsingh99 avatar May 05 '20 22:05 shrijitsingh99

We can use the maturity of all the standards and build on top of it to create our own semantic information

So to be clear, when you say something like "We want to use OSM standard" that means that you choose the OSM standard and the application will comply with that standard. Ex. "I follow the ISO 26262 standard" doesn't mean that you pick the things you like and build off of it / change the specifications to fit your needs. I think what you mean to say is that you see some OSM standard as an example set of tags and structures to borrow to build either a new standard or a new format based on XML.

We should be clear about that language moving forward. Saying something like "We use the OSM format" would be incorrect and misleading, unless the specification allows us to work completely within it. If the spec allows you to create custom nodes according to some standards and we comply with those standards, then we could say that we are OSM compliant. Else, we are an XML format taking inspiration from OSM.

Let us know if we're taking inspiration from OSM or complying with OSM for the requirements laid out above.

By using these standards we can leverage the power of already existing tools and build on top of them to support our use case.

[Citation needed]. Can you give us specific examples of tools that we can use from OSM in using this format with our own custom extensions? For me, the #1 reason to look at OSM was to use their tools, and it's no longer clear to me if what we'd make is compliant with the standard to use them.

I don't care one way or another if they're PBF, XML, or YAML. I want there to be an engineering-reasoned rationale for the choice. I would prefer XML or YAML for human readability and being able to use general YAML / XML parsing tools for making new tools. But if there's a structured reason that PBF is the best, let's do it. It sounds, though, like we're swaying towards something human readable.

This one is not very straightforward as you will need to define a GPS reference point in the map.

Would you though? If your robot had a GPS fix at Lat1 Long1 and you have a wp in the XML as Lat2 Long2, do you need a reference?

<relation id="416">
 <member type="Node" ref="2" />
 <member type="Node" ref="3" />
 <member type="Node" ref="4" />
 <tag k ="name" v="Living Room Contents" />
</relation>

This I'm curious about: how does this work? So if you had 4 objects (node tags), why define the relationship? Can't you query whatever parses this for all nodes in the area set out by the way that you're using to describe rooms? Do you have to define all internal relationships explicitly? It's also not clear to me why the living room is a way and the chemical spill is a relation. They look to contain the same types of information.

To recap:

  • Looking for formalization on what the proposed XML-like structure is
  • Either way, an engineering-rationalized argument for it (if OSM, example tools that are helpful are a good reason; if just XML, then why XML over YAML)
  • Still on same page for having some GUI tools, whose scope will be discussed later
  • Supporting a YAML backend for backwards compatibility (map server can load, map saver only does xml, and GUI should also be able to load yaml but export xml)

Once we have that settled, we can write that up in the design doc that we're using XYZ format for ABC reasons, with 123 examples for [I'm out of canonical sequences] types of geometries covering intentional scope of {dock, wp cartesian, wp gps coord, elevator, door, room, area, zone, lane (?), arbitrary object, keep naming important things we need to make sure are covered well}. Then move on to discussing the masks, which I think should be short and to the point.

lane(?): is the way in OSM supposed to be a lane? https://wiki.openstreetmap.org/wiki/Way makes it seem like it should be a center graph or something. Is there a more accurate tag to use for a closed space rather than a traversing way? Seems like a way should be used to define routes and lanes.

SteveMacenski avatar May 05 '20 22:05 SteveMacenski

We should be clear about that language moving forward.

Got it, will be more explicit going forward.

Let us know if we're taking inspiration from OSM or complying with OSM for the requirements laid out above.

Inspiration

Complying won't be possible even if we wanted to since it requires lat and long for defining position.

[Citation needed]. Can you give us specific examples of tools that we can use from OSM in using this format with our own custom extensions? For me, the #1 reason to look at OSM was to use their tools, and it's no longer clear to me if what we'd make is compliant with the standard to use them.

Editors: https://wiki.openstreetmap.org/wiki/Comparison_of_editors JOSM (Java based), Merkaartor (Qt), iD (Web) being the commonly used ones.

Database Tools: https://wiki.openstreetmap.org/wiki/Databases_and_data_access_APIs

Format Conversion Tools: https://wiki.openstreetmap.org/wiki/Converting_map_data_between_formats

Multiple Supported File Formats: https://wiki.openstreetmap.org/wiki/OSM_file_formats

Not sure how useful this is if we go the inspiration route, maybe by modifying some of the tools, especially the GUI ones.

It sounds like though we're swaying towards something human readable.

Yeah, compressed makes more sense if the maps are larger like spanning km. Indoor maps are pretty small.

Would you though? If your robot had a GPS fix at Lat1 Long1 and you have a wp in the XML as Lat2 Long2, do you need a reference?

In that case, we might have to come up with something for this; even adding such points in the GUI won't be as simple as dragging and dropping a point, since it's not on the local XY coordinate system.

This I'm curious about: how does this work? So if you had 4 objects (node tags), why define the relationship? Can't you query whatever parses this for all nodes in the area set out by the way that you're using to describe rooms? Do you have to define all internal relationships explicitly?

You don't need to define explicitly:

You get the list of items in the room by querying all the items defined within the room's bounding area, so there is no need for an explicit relation between the contents of the room. Nonetheless, if you do need an explicit relation between these entities, you can do it as below:


It's also not clear to me why the living room is a way and the chemical spill is a relation. They look to contain the same types of information.

The living room was just a polygon, so I defined it using a way. You could do this using a relation if you defined a type called room; then a relation would make sense. Without that, "living room" is just a name for a polygon.

Chemical Spill might have multiple properties like No Go Zone, Radiation Zone, etc. associated with it, so you might need to reuse the way for each of them.

  • Looking for formalization on what the proposed XML-like structure is
  • Either way, an engineering-rationalized argument for it (if OSM, example tools that are helpful are a good reason; if just XML, then why XML over YAML)
  • Still on same page for having some GUI tools, whose scope will be discussed later
  • Supporting a YAML backend for backwards compatibility (map server can load, map saver only does xml, and GUI should also be able to load yaml but export xml)

👍 on all the points

shrijitsingh99 avatar May 05 '20 23:05 shrijitsingh99

Not sure how useful this is if we go the inspiration route, maybe by modifying some of the tools, especially the GUI ones.

Ok, that was what I was looking for: confirmation of whether we can use them. The answer is probably not, but they're a good starting point to fork from.

In that case, we might have to come up with something for this; even adding such points in the GUI won't be as simple as dragging and dropping a point, since it's not on the local XY coordinate system.

I'm imagining some of these things being automatically added by other tools than a GUI. We support what we support in the GUI, but want to make sure that the typical navigation2 use-cases can be embedded in the format.


So last point would then be "why OSM-inspired XML over defining our own XML spec or XML". I'll say that YAML is slow to load and these could have a lot of information in it (thousands of waypoints, rooms, etc). It doesn't sound like anyone is strongly championing for YAML, though it would be nice to have everything in the same file format. You did bring up that the BT are XML as well which I didn't think about, so we're already mixed. That resolves a bunch of my issues with that.

@AlexeyMerzlyakov any objections with a OSM-inspired XML given this discussion?

If he's OK with it, I think the next step is to open a WIP PR to make a markdown under docs/design/. Starting with a quick blurb that we can refine later about what is semantic navigation/labeling. What we just finished was the file formats, we should write up a summary of our example support cases (dock, etc) and geometries (points, regions, routes), options of formats considered, why this OSM-inspired XML over the others, and how we will embed our support cases in the file (specific XML examples).


Onto the next line item: masks & non-geometric/discrete annotations (like gradients in a speed zone, an image mask for keep outs, multidirectional lane areas, etc). You guys did the heavy lifting on the XML discussion, we'll do that for this. This discussion is only around the file formats we want to store this stuff in. We'll decide in another bullet on the list how we want to actually use it.

What we propose is using images like the existing maps (pgm, png, anything a typical image loader can load, etc). This allows for visual inspection and modification of these files in common image editing tools, which are currently part of commercial workflows for some of these types of things. It also allows editing in many tools vs forcing users to use specific tools we create that might limit what they want to do with it. Given that folks could want to embed rough percentages, exact values, or odd shapes, I think it would be good to keep this visual and general. This is in line with the design work that Alexey and I have been thinking about and that he has begun to implement with the costmap filter tickets for enabling speed zones and keep out zones. Though it would be good if we were able to include annotating this type of thing in the GUI as well. Future topic.

I think from our discussions on directional lanes, it would make more sense to use the XML format to embed the directional graph data with a way tag (which is actually what it's intended for). For non-directional hard-constraint lanes, the image may also be an option. So this method of embedding information is only used for spatial information that's relational to the map image itself.

The XML would have tags in the root for add ons to load (like <tag keepout file="/path/to/file.png"/>) whatever the tag we use, we'd make it able to be parsed automatically to find these masks and load them onto a specified topic or into memory or whatever the application wants.
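As a sketch of what "parsed automatically" could mean, the Python below scans root-level entries for mask files. The element and attribute names (`mask`, `type`, `file`) and the `map` root are placeholders, since the actual tag is still undecided; the paths are dummies.

```python
import xml.etree.ElementTree as ET

# Hypothetical root-level layout: one <mask> entry per add-on image.
ROOT_XML = """
<map>
  <mask type="keepout" file="/path/to/keepout.png" />
  <mask type="speed" file="/path/to/speed.png" />
  <node id="2" x="1.0" y="1.0" />
</map>
"""


def find_masks(xml_text):
    """Return {mask type: file path} for masks declared directly under the root."""
    root = ET.fromstring(xml_text)
    return {m.get("type"): m.get("file") for m in root.findall("mask")}


masks = find_masks(ROOT_XML)
for mask_type, path in masks.items():
    # A real server would load the image here and publish it on a topic.
    print(mask_type, "->", path)
```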

SteveMacenski avatar May 05 '20 23:05 SteveMacenski

https://github.com/osrf/rmf_demos You should click on the video and watch it. It looks like OSRF has created some annotation tooling, framework, and integrations. I don't know much more on it right now beyond that video. The documentation is limited and I don't see a lot of the code I would have expected to see to make something like this (so either I'm missing something, there are repos not publicly available to recreate, or some of this capability was fudged for a demo).

We may want to consider aligning with this project and making integrations with it if it looks sufficiently mature and this project is going to stick around for a while. Please take a look and give me your thoughts, and I can ping folks at OR with our plans / thoughts and see what they say. There are a few RMF repos under that org, so take a quick glance through them.

https://github.com/osrf/traffic_editor https://github.com/osrf/rmf_schedule_visualizer https://github.com/osrf/rmf_core

SteveMacenski avatar May 06 '20 07:05 SteveMacenski

@AlexeyMerzlyakov any objections with a OSM-inspired XML given this discussion?

No, there are no objections. XML is a human-readable and widely used format that everyone knows. There are tons of existing tools for parsing and producing XML that everyone can use: parsers (tinyxml/tinyxml2, libxml2, libexpat, pugixml, rapidxml, xerces?, etc.), GUIs (CAM Editor, BaseX), and plugins (VEX for Eclipse; Visual Studio itself has XML tools on board; many others for vim and emacs). XML is already widely used in the Navigation2 stack and ROS2 (with tinyxml/tinyxml2). Also, OSM-inspired XML will be more useful than just our homebrew XML because we can use OSM to OSM-inspired XML conversion in the future to import OSM maps into Navigation2. So, I think we are in agreement there.

AlexeyMerzlyakov avatar May 06 '20 09:05 AlexeyMerzlyakov

Regarding "masks & non-geometric/discrete annotations": we have the following use-cases today:

  • Keep-out zones
  • Speed limit zones
  • Preferred (undirected) lanes
  • Directed lanes
  • Costmap & robot move forcing gradients

For the first two bullets, and for any other zone-related filters there may be, the most convenient format is any raster graphics format (be it PGM, PNG, BMP or something else). The main advantages of raster images over vector shapes described in XML are:

  1. Ability to make zones of any odd shape
  2. Simple, visual editing of zone masks in any preferred graphics editor (e.g. GIMP)
  3. Algorithmic simplicity for better CPU performance. This point relates more to the question "How?", which is out of scope of the current bullet.
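As a rough illustration of the third point, checking whether a position lies in a keep-out zone on a raster mask is a single array lookup, regardless of how odd the zone shape is. The sketch below assumes the mask is a plain list of rows with row 0 at the map origin (image flipping is ignored), that any non-zero cell means "keep out", and that `origin` and `resolution` have the same meaning as in the map yaml; the function names are made up for the example.

```python
import math


def world_to_cell(wx, wy, origin_x, origin_y, resolution):
    """Convert world coordinates (metres) to integer (col, row) mask indices."""
    col = math.floor((wx - origin_x) / resolution)
    row = math.floor((wy - origin_y) / resolution)
    return col, row


def in_keepout(mask, wx, wy, origin_x=0.0, origin_y=0.0, resolution=0.05):
    """Return True if the world point falls on a non-zero mask cell."""
    col, row = world_to_cell(wx, wy, origin_x, origin_y, resolution)
    if 0 <= row < len(mask) and 0 <= col < len(mask[0]):
        return mask[row][col] != 0
    # Points outside the mask are treated as unrestricted in this sketch.
    return False


# 2 x 3 mask with 1 m cells; the right-hand column is a keep-out zone.
mask = [
    [0, 0, 1],
    [0, 0, 1],
]
print(in_keepout(mask, 2.5, 0.5, resolution=1.0))
print(in_keepout(mask, 0.5, 0.5, resolution=1.0))
```

A vector description would need a point-in-polygon test per zone for the same query.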

Regarding speed limit zones - I see no problem with identifying zones by color, plus an XML description per color giving its speed limit as a percentage or as an absolute value.
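A minimal sketch of that color-to-limit lookup, assuming a hypothetical `<speed_zones>` XML layout where each `<zone>` maps a mask pixel value to a limit in either percent or m/s. The element names, attribute names, and functions here are invented for illustration and are not an existing Navigation2 format.

```python
import xml.etree.ElementTree as ET

# Hypothetical per-color description; names and units are assumptions.
SPEED_ZONES_XML = """
<speed_zones>
  <zone color="50" limit="50" unit="percent"/>
  <zone color="100" limit="0.3" unit="m/s"/>
</speed_zones>
"""


def load_speed_zones(xml_text):
    """Build a {pixel_value: (limit, unit)} table from the XML description."""
    table = {}
    for zone in ET.fromstring(xml_text.strip()).findall("zone"):
        table[int(zone.get("color"))] = (float(zone.get("limit")), zone.get("unit"))
    return table


def speed_limit_at(pixel_value, table, max_speed):
    """Return the allowed speed (m/s) for a mask pixel value."""
    if pixel_value not in table:
        return max_speed  # unlabeled pixels carry no restriction
    limit, unit = table[pixel_value]
    if unit == "percent":
        return max_speed * limit / 100.0
    # Absolute limits never raise the speed above the robot's maximum.
    return min(limit, max_speed)


table = load_speed_zones(SPEED_ZONES_XML)
print(speed_limit_at(50, table, max_speed=1.0))
print(speed_limit_at(100, table, max_speed=1.0))
print(speed_limit_at(0, table, max_speed=1.0))
```

Keeping the limits in a side file means the mask image stays a plain grayscale picture that can be edited in any graphics editor.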

For lanes and gradients, raster and vector formats are both suitable. Raster formats have the same advantages as for zones, but vector formats are also OK here as long as we don't need odd-shaped lanes or gradients.

Summarizing all points: if we keep a unified mask format for all costmap filters (again, to avoid multiple formats for the one "costmap filters" task), I think it is more reasonable to choose raster images over vector.

AlexeyMerzlyakov avatar May 06 '20 10:05 AlexeyMerzlyakov

So, having gone through the repos, here is my takeaway, solely from a semantic map point of view. This is in no way a thorough analysis, and it would be great to hear more about the direction of RMF from people working on it.

I have highlighted the cons; @Sarath18 mentions the pros:

  • Focused on fleet management with predefined paths, whereas we are more focused on the general navigation problem. We also don't want to bind ourselves to a specific robotics middleware.
  • The format focuses on the elements needed to build a simulation environment and predefined paths; it does not represent complete semantic information.
  • The GUI is still pretty basic and in the early stages of development compared to some of the editors used in OSM.
  • Some features of the GUI will not be useful for us, and extra stuff needs to be added.
  • Only predefined objects having respective Gazebo models can be defined.
  • No support for features like masks and zones.
  • Waypoints, GPS navigation, and via-point navigation are not supported; the only navigation support is through predefined paths.
  • Driveable areas are represented as lines with a predefined width; you cannot specify a custom shape for them.
  • The concept of floors exists, but only to define the texture of the flooring. No native support for zones and areas.
  • We would depend on them for the format and new features; otherwise we would essentially just be building a format on their base, which can be done with OSM as well.
  • No unique ids for identifying elements.
  • Pretty new and not very mature (around 7 months, it seems); we don't want to tie semantic maps to a specific implementation of a robotics middleware.

As a robotics middleware, I can see its potential benefits, but besides the GUI and parts of the format, the other things are not that useful to us. Still, we should find out more about their progress and direction to get a better idea.

One thing I do think we should do is add conversion scripts to their format, since our format will most likely be a superset of it. That way we can support their framework too.

shrijitsingh99 avatar May 06 '20 20:05 shrijitsingh99

The way I look at traffic editor is that it has great potential, and some of its features overlap with our current interests. Since it's fairly new, we can try combining both projects or have some kind of cross-compatibility support, which would benefit both. Some key features of using traffic_editor include:

  • Aligns with some of our goals
  • Has a good editor as a base to start with.
  • Format reuses point definitions like OSM.
  • Supports the main features that we also want to support, like walls, paths, objects, elevators, etc.
  • Similar floorplan idea and multilevel support.
  • Uses Qt and YAML. These tools are already in the ecosystem.
  • Provides the ability to create maps by drawing on top of images.
  • Simulation can be loaded/generated directly from the map data
  • Integrated with RMF which could be deployed for robot fleets.

I agree with @shrijitsingh99 on a few points. It might not be a good middleware to use for semantic maps, but it has a great editor that we can build upon. For our use case, I think we need the ability to assign a tag to any element in the map instead of adding predefined objects. In that respect, the GUI lacks entity description support.

I would like to describe more on the traffic editor GUI when we start our discussion on it.

Sarath18 avatar May 06 '20 20:05 Sarath18

So I looked at it from the standpoint of their tooling (GUI) and data format standards. The other stuff seems multi-robot specific, and we could potentially make some integrations there down the road. For now, using a format that another ROS project in mobile-robotics-land already uses could be useful, especially if they're looking to use the nav stack for real-world demos. Users could use a single file for both sets if they wanted to; that seems powerful to me. Plus, aligning in open source is a force multiplier: more hands and eyes on a thing to debug and develop.

On cons: focus on fleet management, masks, etc. - totally understand. They do some stuff we don't need right now and they don't support stuff we do. But we can work off that or merge in the updates we need. If they can support lines, points, and areas, those are the primitives we need. They use "line"-like things to describe paths to follow for their demos, but that doesn't mean there isn't a concept of a point in their specification that we can add names or other attributes to for navigation. Looking a little past just the demo of capabilities they show, they are defining areas / lines (and points?) to do "stuff" through a GUI. Same with us. What those things are and what they're being used for are different, I agree.

Only predefined objects having respective Gazebo models can be defined

Now that's a real reason to potentially not use it - so you can't just draw shapes on a map, they have to be part of Gazebo models? Can we work with just a map, or do we have to work with a simulation world?

On pros: it sounds then like they can do points, so that means we have basically the same starting point as OSM for our "OSM-inspired" format.

Provides the ability to create maps by drawing on top of images.

Also sounds like we can work with just image files.

It might not be a good middleware to use for semantic maps but has a great editor that we can build upon.

I'm confused why it's not a good thing to potentially use their standards for file formats / GUI. It sounds like they have a lot of the capabilities we're looking for, already enabled in ROS2. I'm not suggesting we use their fleet tools, simulation, or multi-robot stuff; I'm just looking at their map-editing tools and formats. It seems like they have similar semantic information, just aimed at a different goal.

@codebot (Morgan), I believe, is leading this effort out of the OR Singapore office. Maybe he can share a roadmap or his thoughts. I'm also a little confused as to why this is under OSRF and not a ros-* org, but I don't know the history of this project or whether it is intended for long-term support / adoption.

SteveMacenski avatar May 07 '20 00:05 SteveMacenski