irods
Hidden requirement on multi-1247 parallel transfer against replication resource
- [ ] main
- [ ] 4-2-stable
The Hidden Requirement
After experimenting with and comparing the behavior of the C++, Python, and Jargon parallel transfer implementations, I found that the server appears to have a hidden requirement regarding closing a replica and updating the catalog via `rx_replica_close()`.
The requirement: the first stream to open a replica for parallel transfer must also be the final stream to close the replica and perform the catalog updates.
Testing was performed against a slightly modified 4.2.11 iRODS server.
The following resource hierarchy was used:
repl:replication
├── ufs0:unixfilesystem
├── ufs1:unixfilesystem
└── ufs2:unixfilesystem
The zone does not contain the target data object prior to running the test.
The test creates a brand new data object targeting the `repl` resource.
Findings
- The C++ and Python implementations trigger replication correctly.
  - This is because they force the first stream to handle the catalog updates (i.e. the first stream waits for sibling replicas to complete their transfers).
- The Jargon implementation leaves updating of the catalog to the stream which finished its transfer last.
  - This results in hierarchy resolution issues (see tables below).
    - The hierarchy resolution operation is `CREATE` for all streams instead of `OPEN` for overlapping secondary streams.
    - The voting operation is `WRITE` instead of `OPEN` for all overlapping secondary streams.
    - Resolution results in an empty replica list.
    - The replica state table issues a log warning about not having original replica statuses for restoration. (I'm not sure if this warning always appears.)
  - This scheme makes a lot of sense; however, it is problematic for certain scenarios. For example:
    - NFSRODS is driven by the user. If the user runs a command such as `dd`, NFSRODS is at the mercy of how `dd` decides to write bytes. Jargon attempts to detect overlapping streams to simplify things for the developer, but this presents a problem because overlap may not happen due to how the JVM/OS decides to schedule threads.
      - How do we solve this for applications similar to NFSRODS?
  - A simple multi-threaded Jargon application, using latches to enforce the first-stream requirement, resulted in replication working consistently.
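The first-stream requirement described in the findings can be sketched as a small client-side coordination pattern. This is only an illustration in Python, not the code used in the tests: `parallel_transfer`, `transfer_chunk`, and `finalize` are hypothetical names, and no real iRODS or Jargon calls are made. The first stream signals that the replica is open, secondary streams wait for that signal, and the first stream blocks until every sibling has closed before it finalizes.

```python
import threading

def parallel_transfer(num_streams, transfer_chunk, finalize):
    """Run num_streams workers so that stream 0 opens first and closes last.

    transfer_chunk(i) and finalize() are hypothetical callbacks standing in
    for the per-stream write and the final close with catalog update.
    Returns the order in which the streams closed.
    """
    first_opened = threading.Event()         # set once stream 0 has opened the replica
    siblings_done = threading.Semaphore(0)   # released once per closed secondary stream
    close_order = []
    lock = threading.Lock()

    def record_close(stream_id):
        with lock:
            close_order.append(stream_id)

    def first_stream():
        first_opened.set()                   # assumption: the open/create happens here
        transfer_chunk(0)
        for _ in range(num_streams - 1):     # wait for every sibling stream to close
            siblings_done.acquire()
        finalize()                           # final close: catalog update, replication
        record_close(0)

    def secondary_stream(stream_id):
        first_opened.wait()                  # never open before the first stream
        transfer_chunk(stream_id)
        record_close(stream_id)              # close without touching the catalog
        siblings_done.release()

    threads = [threading.Thread(target=first_stream)]
    threads += [threading.Thread(target=secondary_stream, args=(i,))
                for i in range(1, num_streams)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return close_order
```

Because each secondary stream records its close before releasing the semaphore, stream 0 is always the last entry in the returned list, regardless of how the threads are scheduled.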
Additional Information regarding Hierarchy Resolution
The tables below show the values for three agents performing a parallel write of a zero-length file. The first row of each table represents the first stream to open the replica.
C++ Implementation
pid | open flags | hier resolution op | voting for | open type | function calls |
---|---|---|---|---|---|
31010 | O_WRONLY \| O_CREAT | CREATE | CREATE | CREATE_TYPE | repl_file_create, invoke_file_modified, repl_file_modified |
31045 | O_RDWR | OPEN | OPEN | OPEN_FOR_WRITE_TYPE | repl_file_open, repl_file_close |
31032 | O_RDWR | OPEN | OPEN | OPEN_FOR_WRITE_TYPE | repl_file_open, repl_file_close |
Jargon Implementation
pid | open flags | hier resolution op | voting for | open type | function calls |
---|---|---|---|---|---|
14078 | O_RDWR \| O_CREAT | CREATE | CREATE | CREATE_TYPE | repl_file_create, repl_file_close |
14077 | O_RDWR \| O_CREAT | CREATE | WRITE | OPEN_FOR_WRITE_TYPE | repl_file_open, repl_file_close |
14079 | O_RDWR \| O_CREAT | CREATE | WRITE | OPEN_FOR_WRITE_TYPE | repl_file_open, warning (repl status restore failed - no original status) |
I notice that your first Jargon row is not the smallest Jargon pid.
Adjusted statements immediately above the tables to say the first row represents the first stream/agent to open the replica.
Ooooh, this is good detective work.
I wonder if #6109 would be related or help here?
Is this now handled with 4.3.2.5-SNAPSHOT?
No.
To resolve this issue, we have to claim that the requirement is part of the design or we lift the requirement so that the last stream to close is responsible for updating the catalog and triggering policy.
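For illustration, the second option (the last stream to close performs the catalog update) can be modeled as a reference count on a shared handle. This is a hedged sketch only, not the server's implementation: `SharedReplicaHandle` and its methods are hypothetical names, and it assumes every stream has opened the replica before any stream closes it.

```python
import threading

class SharedReplicaHandle:
    """Hypothetical model: whichever stream closes last runs finalize()."""

    def __init__(self, finalize):
        self._lock = threading.Lock()
        self._open_streams = 0
        self._finalize = finalize        # stands in for catalog update and policy
        self.finalized_by = None

    def open(self):
        with self._lock:
            self._open_streams += 1

    def close(self, stream_id):
        with self._lock:
            self._open_streams -= 1
            is_last = self._open_streams == 0
        if is_last:                      # only the final closer updates the catalog
            self._finalize()
            self.finalized_by = stream_id
```

If streams do not overlap, the count drops to zero after each one and `finalize` would run more than once, which is the same scheduling concern raised for NFSRODS above.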