IAWG: First pass at publish_lookup chapter
IAWG's first pass at changes to the Publish/Lookup chapter. This is a WIP, but far enough along that people may be interested in seeing what we are working on. Commits are not yet signed-off.
Please use emoji reactions ON THIS COMMENT to indicate your position on this proposal.
- You do not need to vote on every proposal
- If you have no opinion, don't vote - that is also useful data
- If you've already commented on this issue, please still vote so we know your current thoughts
- Not all proposals solve exactly the same problem, so we may end up accepting proposals that appear to have some overlap
- This is not a binding majority-rule vote, but it will be a very significant input into the corresponding ASC decision.
Here are the meanings for the emojis:
- Hooray or Rocket: I support this so strongly that I want to be an advocate for it
- Heart: I think this is an ideal solution
- Thumbs up: I'd be happy with this solution
- Confused: I'd rather we not do this, but I can tolerate it
- Thumbs down: I'd be actively unhappy, and may even consider other technologies instead
If you want to explain in more detail, feel free to add another
comment, but please also vote on this comment.
@dsolt Are all of the comments in this PR resolved? (If so then can you mark them as resolved?)
For the meeting today, attached is a PDF rendering of the current state of this PR.
I worked through the chapter and made some recommended changes.
Ralph
On Aug 9, 2022, at 7:31 AM, Josh Hursey wrote:
@jjhursey commented on this pull request.
In Chap_API_Publish.tex https://github.com/pmix/pmix-standard/pull/398#discussion_r941417726:
\refconst{PMIX_ERR_DUPLICATE_KEY} error.
+publishing to a \refconst{PMIX_RANGE_CUSTOM}
Suggested change:
-publishing to a \refconst{PMIX_RANGE_CUSTOM}
+Publishing to a \refconst{PMIX_RANGE_CUSTOM}
3Q 2022:
- Reading turned up a few items that need further discussion.
- Will follow up in the IAWG to fine-tune the last few items.
- Intention to bring this back for presentation during the next quarterly meeting.
Here is the latest draft with the changes from the first reading (i.e. introduce a PMIx_Publish2, get rid of PMIX_DATA_TO_PUBLISH, and specify that the range of lookup is used to determine eligible publishers).
Will update the PR soon after the WG has had a chance to review.
PMIx ASC 4Q 2022 Meeting presented a straw poll regarding:
Advice to the working group on how best to proceed with the Lookup semantics related to the range specified by the lookup and the scenario where there are duplicate keys published in separate ranges.
Three options were presented:
- Option 1: Change the lookup range to specify where the data was published, not by whom it was published.
- Option 2: If no range is specified in the lookup and multiple matches for a key exist in different ranges, then return a specific status code (e.g., PMIX_MULTIPLE_MATCH) and the full set of matching keys with associated range information.
- Option 3: Do not allow publishing on ranges other than the publisher's namespace or global. Lookup would specify one or the other range.
- Other
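For illustration, Option 2 could behave roughly like the sketch below. This is not PMIx code: `range_t`, `entry_t`, `lookup`, and the `LOOKUP_*` status values are hypothetical names, and `LOOKUP_MULTIPLE_MATCH` stands in for the example status code (`PMIX_MULTIPLE_MATCH`) mentioned in the option. It assumes a flat in-memory table and treats `RANGE_UNDEF` as "no range specified".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical range tags; real PMIx defines its own range constants. */
typedef enum { RANGE_UNDEF = 0, RANGE_NAMESPACE, RANGE_SESSION, RANGE_GLOBAL } range_t;

/* Hypothetical status codes for this sketch only. */
typedef enum { LOOKUP_OK = 0, LOOKUP_NOT_FOUND, LOOKUP_MULTIPLE_MATCH } lookup_status_t;

typedef struct {
    const char *key;
    const char *value;
    range_t range;      /* range the entry was published to */
} entry_t;

/* Collect up to `max` entries matching `key`.
 * If `range` is RANGE_UNDEF, all ranges are searched; when the matches
 * span more than one range, LOOKUP_MULTIPLE_MATCH is returned and the
 * caller can inspect each returned entry's `range` field. */
lookup_status_t lookup(const entry_t *store, size_t n, const char *key,
                       range_t range, const entry_t **out, size_t max,
                       size_t *nout)
{
    size_t found = 0;
    int spans_ranges = 0;
    range_t first = RANGE_UNDEF;

    assert(store != NULL && key != NULL && out != NULL && nout != NULL);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(store[i].key, key) != 0) continue;
        if (range != RANGE_UNDEF && store[i].range != range) continue;
        if (found == 0) first = store[i].range;
        else if (store[i].range != first) spans_ranges = 1;
        if (found < max) out[found] = &store[i];
        found++;
    }
    *nout = found < max ? found : max;
    if (found == 0) return LOOKUP_NOT_FOUND;
    return spans_ranges ? LOOKUP_MULTIPLE_MATCH : LOOKUP_OK;
}
```

With `"port"` published once in a namespace range and once in the global range, a call with `RANGE_UNDEF` would return `LOOKUP_MULTIPLE_MATCH` plus both entries, while a call with `RANGE_GLOBAL` would return `LOOKUP_OK` and only the global entry.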
Sorted by choice preference
| Option | First Choice | Second Choice | Third Choice | No | Abstain |
|---|---|---|---|---|---|
| Option 2 | 6 | 3 | 2 | 1 | 0 |
| Option 1 | 1 | 8 | 1 | 1 | 1 |
| Option 3 | 3 | 0 | 4 | 4 | 1 |
| Other | 3 | 0 | 0 | 1 | 8 |
Other comments:
- "I'd like to present a modified form of Option 2 that might provide an acceptable landing spot."
- "Seemed like Option-3 required users to encode into key to make it functional, so that is why I abstained on that item, b/c I was not entirely sure that was satisfactory option."
- "Use the approach given in the current PR which requires the lookup process to know the namespace of the publisher."
- "If options 1 and 2 are compatible, then I think the best case is to do both."
The PMIx ASC 4Q 2022 day 2 meeting discussed this further and outlined a path forward. See the notes below:
- https://github.com/pmix/pmix-standard/wiki/ASC-Q4-2022-Meeting#day-2-oct-27-2022
Notes from IAWG meeting Oct. 31, 2022
- Introduce new APIs for Publish/Lookup/Unpublish (suffix with _data/_ds/_datastore)
- Leave the old APIs as they are, with all their ambiguities
- The same key can be published into the datastore by the same or different publishers
- Mechanism for lookup
  - Filter by access rules
  - Filter by source rules
  - Filter to find key
  - More than one value may be returned as an array of values (with qualifiers)
- Data store unique tuple
  - <key, value, publisher, access permissions, timestamp, handler id>
- Publish
  - Put a key/value pair into the datastore
  - Access permissions: Can specify the range of recipients that can access it (range)
  - Maybe we have a default?
  - Publisher can publish the same key multiple times - accumulates in the data store (duplicates allowed)
  - If a publisher publishes the same thing 10 times, then the lookup sees all 10 values
  - FUTURE WORK: Could add a 'REPLACE' attribute to replace given the handle?
  - Handed back a handler/id to reference when unpublishing
  - No limitations beyond this
- Lookup
  - Ask for a 'key'
  - Filtered by access rules defined by the publisher
  - Source rules: Can specify the range of publishers from which to get the data, since multiple matches can happen
  - If multiple matches for a 'key', then you get multiple values back with qualifiers
  - Always get back an array - even if it is an array of one item
  - Qualifiers: publisher, timestamp, access permissions, others?
- Unpublish
  - Only the original publisher can unpublish
  - Pass the handle from publishing to reference the data that was published
  - FUTURE WORK: Maybe add a qualifier to unpublish anything from me with this key (may match more than one)
  - Add 'access permissions'/range to modify the access permissions?
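The datastore semantics in these notes can be sketched in a few lines of C. Everything below is hypothetical and only meant to make the notes concrete: `proc_t`, `ds_entry_t`, `datastore_t`, and the `ds_*` functions are not PMIx names. The sketch assumes a fixed-size table, models access permissions as a single readable namespace (`NULL` meaning anyone), and models the source rule as an optional publisher-namespace filter.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

#define DS_MAX 64

/* Hypothetical process identity for this sketch. */
typedef struct { const char *nspace; int rank; } proc_t;

/* One stored tuple:
 * <key, value, publisher, access permissions, timestamp, handle id>. */
typedef struct {
    const char *key;            /* strings are borrowed, not copied */
    const char *value;
    proc_t publisher;
    const char *access_nspace;  /* namespace allowed to read; NULL = anyone */
    time_t timestamp;
    unsigned long id;           /* handle from ds_publish; 0 = free slot */
} ds_entry_t;

/* Must be zero-initialized before first use. */
typedef struct { ds_entry_t slot[DS_MAX]; unsigned long next_id; } datastore_t;

/* Publish never rejects a duplicate key: each call adds a new tuple.
 * Returns the handle to use for unpublishing, or 0 if the table is full. */
unsigned long ds_publish(datastore_t *ds, proc_t who, const char *key,
                         const char *value, const char *access_nspace)
{
    assert(ds != NULL && key != NULL && value != NULL);
    for (size_t i = 0; i < DS_MAX; i++) {
        ds_entry_t *e = &ds->slot[i];
        if (e->id != 0) continue;           /* slot in use */
        e->key = key;
        e->value = value;
        e->publisher = who;
        e->access_nspace = access_nspace;
        e->timestamp = time(NULL);
        e->id = ++ds->next_id;
        return e->id;
    }
    return 0;
}

/* Lookup filters by access rule, then source rule, then key, and always
 * fills an array (possibly with one item). Returns the number of matches. */
size_t ds_lookup(const datastore_t *ds, proc_t caller, const char *source_nspace,
                 const char *key, const ds_entry_t **out, size_t max)
{
    size_t n = 0;
    assert(ds != NULL && key != NULL && out != NULL);
    for (size_t i = 0; i < DS_MAX && n < max; i++) {
        const ds_entry_t *e = &ds->slot[i];
        if (e->id == 0) continue;
        if (e->access_nspace && strcmp(e->access_nspace, caller.nspace) != 0) continue;
        if (source_nspace && strcmp(source_nspace, e->publisher.nspace) != 0) continue;
        if (strcmp(e->key, key) != 0) continue;
        out[n++] = e;
    }
    return n;
}

/* Only the original publisher may unpublish, and only by handle.
 * Returns 1 on success, 0 otherwise. */
int ds_unpublish(datastore_t *ds, proc_t caller, unsigned long id)
{
    assert(ds != NULL);
    if (id == 0) return 0;
    for (size_t i = 0; i < DS_MAX; i++) {
        ds_entry_t *e = &ds->slot[i];
        if (e->id != id) continue;
        if (strcmp(e->publisher.nspace, caller.nspace) != 0 ||
            e->publisher.rank != caller.rank) return 0;
        e->id = 0;                          /* free the slot */
        return 1;
    }
    return 0;
}
```

In this model, publishing `"k"` twice from the same process leaves two tuples in the store, a lookup returns both along with their publisher and timestamp qualifiers, and an unpublish attempt from any other process is refused.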
List of items to deal with:
- Define PMIX_PUBLISH_IDLEN.
- Do we want something like PMIX_PUBLISH_ID_ALL_MINE (like PMIX_PUBLISH_ID_ALL, but only for caller)
- Create API to translate a publish-id to a proc + epoch number
- Macros for creating/freeing pmix_pdsdata_t's
- Discuss how lookup values are released/freed (these are arrays of values for each key and they are allocated by the system)
- Lookup blocking and non-blocking are so different. Should we separate keys from return values in the blocking call so it looks more like the non-blocking call?
- Define PMIX_PDSDATA type (pmix_data_type_t) to describe the type pmix_pdsdata_t
- Define PMIX_PUBLISH_ID type (pmix_data_type_t) to describe the type pmix_publish_id_t
I added a new commit for:
- Lookup blocking and non-blocking are so different. Should we separate keys from return values in the blocking call so it looks more like the non-blocking call?
PMIx ASC 1Q 2023 Plenary Notes (see Notes Day 1 for more details)
- Noted some use cases
  - Rendezvous with failed Publisher processes
    - Publisher A publishes 'find_me_key'
    - Publisher A dies
    - Publisher B takes over Publisher A's role
    - Publisher B unpublishes 'find_me_key'
    - Publisher B publishes 'find_me_key'
  - Published key for the first consumer
    - Publisher A publishes 'for_you'
    - Consumer Z looks up 'for_you'
    - Consumer Z unpublishes 'for_you'
    - Make it automatic on first lookup - so only 1 gets it
    - Publisher and Consumer must have the same uid/gid (owner)
  - Lookup should have a sense of consistency
    - Publisher A publishes 'key'
    - Consumer Z looks up 'key'
    - Publisher B publishes 'key'
    - Consumer Y looks up 'key'
    - Consumers must see at least Publisher A's key
- The IAWG will need to have a reference implementation before voting.
  - Once semantics are ironed out, someone will need to work on the implementation
- Is the old publish compatible with the new publish_datastore?
  - No, they would be separate storages and not share information. This is for the simplicity of the RM implementation and the semantics from the client side.
- On lookup, how do you know which scope the value was published in?
  - Add a scope attribute on the lookup to define the scope. There will be a default.
  - You only get values exactly at that specified level.
  - Scopes are not nested. If not published at the 'session' level you won't see it even if it was published at the 'namespace' level.
- Next Step
  - Finish the text for the presentation in the next quarterly meeting
  - Discussion of implementation
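The "scopes are not nested" answer in the notes above can be shown with a small sketch. All names here (`scope_t`, `scoped_entry_t`, `scoped_lookup`, `SCOPE_DEFAULT`) are hypothetical, and the choice of default scope is an assumption, since the notes only say that a default will exist. For brevity the sketch returns the first match rather than an array of values.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical scope levels; not PMIx constants. */
typedef enum { SCOPE_NAMESPACE, SCOPE_SESSION, SCOPE_GLOBAL } scope_t;

/* Assumption: the notes do not say which scope is the default. */
#define SCOPE_DEFAULT SCOPE_SESSION

typedef struct {
    const char *key;
    const char *value;
    scope_t scope;      /* scope the value was published in */
} scoped_entry_t;

/* Return the first value published under `key` at exactly `scope`,
 * or NULL. There is no fallback to wider or narrower scopes. */
const char *scoped_lookup(const scoped_entry_t *store, size_t n,
                          const char *key, scope_t scope)
{
    assert(store != NULL && key != NULL);
    for (size_t i = 0; i < n; i++) {
        if (store[i].scope == scope && strcmp(store[i].key, key) == 0) {
            return store[i].value;
        }
    }
    return NULL;
}
```

A value published only at `SCOPE_NAMESPACE` is returned for a namespace-scoped lookup, but a lookup at `SCOPE_SESSION` (or at `SCOPE_DEFAULT`, under the assumption above) finds nothing.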