TileDB
Lums/sc 19398/edge class
This PR adds an Edge class to the TileDB task graph library.
There are several substantial changes to the library as part of this PR (the Edge class itself is quite small in fact).
- All supporting classes for the finite state machine have been moved to a new state_machine subdirectory. The code for the finite state machine remains in the file fsm.h.
- The PortStateMachine class can now support two-stage and three-stage data transfer. The former is when a Source is directly connected to a Sink. The latter is when a Source is connected to a Sink via an Edge.
- There are two enum classes representing the states for the two-stage and three-stage cases.
- The PortStateMachine is parameterized by the type of the states (two_stage or three_stage) as well as the policy class that implements the actions for the state transitions. A policy class inherits from the PortStateMachine class using CRTP, making its implementation of the state transition actions directly available to the PortStateMachine class.
- Policy classes are contained in the file policies.h. Policy classes are parameterized by a Mover and by a PortState (one of two_stage or three_stage). The Mover inherits from the policy class, again via CRTP. The Mover class provides the actual data movement actions for the policy class (the policy class implements the synchronization between threads running the Source and threads running the Sink).
- There are a number of Policy classes implemented, but the primary ones of interest are the AsyncPolicy and the UnifiedAsyncPolicy. These policy classes implement the wait and notify functions using condition variables. Other policies are in place primarily for testing other parts of the task graph library.
- Data movement from Source to Sink is managed by an ItemMover class, defined in item_mover.h. The ItemMover inherits from a policy class using CRTP.
- The ItemMover class inherits from a specialized base class BaseMover, one specialized for two_stage and one for three_stage. The BaseMover maintains pointers to the data items from the Source, Sink, and in the case of three_stage data movement, the Edge. Data movement is effected by swapping the data being pointed to in order to move it along the pipeline. Sources and Sinks inherit from a DataMover class and use its API for sending data from a Source to a Sink.
- The Source and Sink are class templates that take a Mover as a parameter (actually as a template template parameter, along with the type of data being moved). The Mover instantiated with the Source or the Sink is expected to take a single parameter, the type being moved.
- Unit tests are included to exercise all of these different classes. Particular unit tests are included as well to test transferring DataBlocks.
- An Edge inherits from both Source and Sink. The Edge constructor takes a Source and a Sink and connects its internal Sink to the passed Source and its internal Source to the passed Sink. Edges are also parameterized by Mover type and the type being passed.
- Although these classes are intended to work together, there are no include dependencies among the header files where they are declared.
- The most important tests included in the various unit tests asynchronously send a large number of numbers from a Source to a Sink and verify that all numbers were sent correctly (as well as verifying that all intermediate states of the state machine are correct). This kind of test is repeated for the finite state machine, Source and Sink ports, and for pseudo graph nodes containing Sources and Sinks. The tests are conducted for directly-connected Sources and Sinks as well as for Sources and Sinks connected by Edges.
- The test in ports/test/unit_concurrency.cpp has a very crude diagnostic output showing overlap of operations for a two_stage data mover. A future PR will include this for the three_stage data mover as well.
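To make the CRTP relationship between the state machine and its policy concrete, here is a minimal single-threaded sketch. It is not the code in fsm.h or policies.h: the names StateMachineSketch, LoggingPolicySketch, two_stage_sketch, and on_source_move are hypothetical, the transition table is reduced to three events, and there is no waiting or notification.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical two-stage states: the first digit is the source slot,
// the second the sink slot (1 = holds an item, 0 = empty).
enum class two_stage_sketch { st_00, st_01, st_10, st_11 };

// Base state machine. It knows the transitions but not the actions;
// the actions come from the Policy, reached through a CRTP downcast.
template <class Policy>
class StateMachineSketch {
  using S = two_stage_sketch;
  S state_{S::st_00};

  Policy& policy() { return static_cast<Policy&>(*this); }

 public:
  S state() const { return state_; }

  // The source has placed an item in its (previously empty) slot.
  void source_fill() {
    assert(state_ == S::st_00 || state_ == S::st_01);
    state_ = (state_ == S::st_00) ? S::st_10 : S::st_11;
  }

  // The source asks for its item to be handed to the sink.
  // This sketch only acts when the sink slot is free.
  void source_push() {
    if (state_ == S::st_10) {
      policy().on_source_move();  // action supplied by the policy
      state_ = S::st_01;
    }
  }

  // The sink has consumed the item in its slot.
  void sink_drain() {
    assert(state_ == S::st_01 || state_ == S::st_11);
    state_ = (state_ == S::st_01) ? S::st_00 : S::st_10;
  }
};

// A policy inherits from the state machine and supplies the actions.
// This one only records them, as a testing policy might.
class LoggingPolicySketch : public StateMachineSketch<LoggingPolicySketch> {
 public:
  std::vector<std::string> log;
  void on_source_move() { log.push_back("move"); }
};
```

Calling source_fill(), source_push(), and sink_drain() in turn on a LoggingPolicySketch walks the states st_10, st_01, st_00 and records one "move". An asynchronous policy would presumably block in source_push() while the sink slot is occupied, rather than returning as this sketch does.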
Example:
The following is an example of instantiating a Source and a Sink. Note that in practice, many of these will be predefined so that users will not need to define the whole stack of types. Note that all of the classes in this stack are parameterized simply by the type being passed.
// Define a two_stage item mover with asynchronous policy, parameterized by the type being used
template <class T>
using AsyncMover2 = ItemMover<AsyncPolicy, two_stage, T>;
// Define an asynchronous policy based on the two stage item mover
template <class T>
using AsyncPolicy2 = AsyncPolicy<AsyncMover2<T>, two_stage>;
// Define a state machine based on the asynchronous policy
template <class T>
using AsyncStateMachine2 = PortFiniteStateMachine<AsyncPolicy2<T>, two_stage>;
// Create Source and Sink objects using the two stage asynchronous mover
Source<AsyncMover2, size_t> left;
Sink<AsyncMover2, size_t> right;
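The sketch below illustrates the two ideas behind that type stack: ports that take the mover as a template template parameter, and data movement done by swapping the slots the mover points to. It is a simplified, single-threaded stand-in and not the library's implementation. SwapMoverSketch, SourceSketch, SinkSketch, attach_sketch, and all member names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <utility>

// Hypothetical stand-in for a two-stage mover: it holds pointers to the
// source's and sink's item slots and moves data by swapping them.
template <class T>
class SwapMoverSketch {
  std::optional<T>* source_item_{nullptr};
  std::optional<T>* sink_item_{nullptr};

 public:
  void register_source(std::optional<T>& item) { source_item_ = &item; }
  void register_sink(std::optional<T>& item) { sink_item_ = &item; }

  // Move the item along the pipeline; returns false if nothing could move.
  bool do_move() {
    if (!source_item_ || !sink_item_) return false;
    if (!source_item_->has_value() || sink_item_->has_value()) return false;
    std::swap(*source_item_, *sink_item_);
    return true;
  }
};

// Ports take the mover as a template template parameter plus the item
// type, so both ends agree on Mover<T> by construction.
template <template <class> class Mover, class T>
struct SourceSketch {
  std::optional<T> item;
  Mover<T>* mover{nullptr};
  void inject(const T& value) { item = value; }
};

template <template <class> class Mover, class T>
struct SinkSketch {
  std::optional<T> item;
  Mover<T>* mover{nullptr};
  // Take the item out of the slot, leaving the slot empty.
  std::optional<T> extract() {
    std::optional<T> out = std::move(item);
    item.reset();
    return out;
  }
};

// Hypothetical helper: point both ports at one shared mover.
template <template <class> class Mover, class T>
void attach_sketch(
    SourceSketch<Mover, T>& src, SinkSketch<Mover, T>& snk, Mover<T>& mover) {
  mover.register_source(src.item);
  mover.register_sink(snk.item);
  src.mover = &mover;
  snk.mover = &mover;
}
```

With a SourceSketch<SwapMoverSketch, std::size_t> and a matching SinkSketch attached to one mover, inject() followed by do_move() leaves the source slot empty and the value available through extract() on the sink. Swapping slots rather than copying means that moving a large item costs the same as moving a small one.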
An Edge requires a three stage Mover and would be used as follows:
Source<AsyncMover3, size_t> source;
Sink<AsyncMover3, size_t> sink;
Edge<AsyncMover3, size_t> edge(source, sink);
(Note that upcoming PRs will make more effective use of CTAD for Edge creation so that the type arguments for the Mover and the datatype being moved will not have to be specified.)
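As a rough illustration of what CTAD-based Edge creation could look like, the sketch below relies on the implicit deduction guide generated from a constructor that takes the two ports by reference (C++17). Whether the library will do it this way is an assumption. MoverSketch, PortSourceSketch, PortSinkSketch, and EdgeSketch are hypothetical, empty stand-ins and not the library's classes.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical empty stand-ins for a mover and the two port types.
template <class T>
struct MoverSketch {};

template <template <class> class Mover, class T>
struct PortSourceSketch {};

template <template <class> class Mover, class T>
struct PortSinkSketch {};

template <template <class> class Mover, class T>
class EdgeSketch {
  PortSourceSketch<Mover, T>* upstream_;
  PortSinkSketch<Mover, T>* downstream_;

 public:
  // Both template parameters appear in the constructor's parameter types,
  // so the compiler can deduce them from the arguments.
  EdgeSketch(PortSourceSketch<Mover, T>& source, PortSinkSketch<Mover, T>& sink)
      : upstream_(&source)
      , downstream_(&sink) {
  }

  bool connected() const { return upstream_ != nullptr && downstream_ != nullptr; }
};
```

Given a PortSourceSketch<MoverSketch, int> named source and a matching sink, the declaration `EdgeSketch edge(source, sink);` compiles without explicit template arguments and has type EdgeSketch<MoverSketch, int>.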
TYPE: FEATURE
DESC: Adds Edge class to the TileDB task graph library.
This pull request has been linked to Shortcut Story #19398: Edge: class Edge separately (no nodes).