Research:ChaNGaRefactor
From Astronomy Facility Wiki
Contents |
Objective
The code is being refactored to promote the reuse of code, to trigger the removal of redundant pieces of it and to make it more modular. The latter should give the code a plug-and-play feel, where different algorithms for various basic tasks of the tree-based gravity calculation can be plugged in without a major rewriting of the entire code.
Components
The new code treats the task of gravity computation as comprising various orthogonal concerns, such as walking the tree, performing computations on the nodes encountered therein and aiding optimizations such as, among others, the prefetching of subtrees and the splitting of work into remote and local components. To this end, the new code consists of abstract classes embodying these subtasks. The interactions among different components in the new framework are decidedly abstract, and the actual code is written by refining the interface between these components. Below are some of the components whose interactions bring out the desired behavior of the gravity solver.
- TreeWalk. This class represents the traversal of a piece of the global tree that is local to a TreePiece object. A TreePiece object can therefore be thought of as owning a TreeWalk. The class is abstract, and doesn't in itself specify a way to traverse trees. Rather, successor classes that inherit from TreeWalk flesh out the abstract method walk(...), thereby defining the way in which the tree in question is traversed. Subclasses of TreeWalk include:
- TopDownTreeWalk. This class performs a depth-first traversal beginning at the node it operates upon.
- DoubleTreeWalk. The specific walk type required in the construction of interaction lists. It traverses the local and global trees simultaneously, aiding the construction of partial interaction lists for each node encountered on the path from prefetch roots to individual buckets.
Writing the code for a new tree traversal algorithm should therefore be as easy as writing a class inheriting from TreeWalk and overriding the walk(...) method
- Compute. Compute objects encapsulate the computation that is performed on nodes encountered during a tree traversal. The type of computation performed varies with the kind of compute object that is invoked. Subclasses override the doWork(...) method provided by this abstract class to perform different kinds of computation.
- GravityCompute. As the name suggests, these objects calculate gravity forces. A GravityCompute is seeded with a bucket at the time of creation. The particles of this bucket serve as targets for the forces computed.
- PrefetchCompute. These objects perform the prefetch of remote tree chunks. They are given an array of bounding boxes at instantiation to decide which parts of the global tree need to be fetched before computation on the chunk can begin.
- Opt. These are Optimizer objects that encode the various optimizations in the original code. They guide and inform computation objects' decisions during tree traversal. Compute objects are instantiated with pointers to instances of subclasses of this abstract class. Given a node and a boolean representing an openingCriterion satisfaction, an Opt object tells its corresponding compute object whether it should perform a computation with the node in hand, request its TreeWalk object to descend further down the tree, return up the tree, etc.
As an example, when a Local Opt object encounters an Internal node that is "far enough," according to the openingCriterion test, it tells its Compute object to perform a computation on the node. Currently, there are 3 different kinds of Opt objects:
- Local. Guide different kinds of compute objects when they perform local work.
- Remote. Ditto, but for remote work.
- Prefetch. Specifically in place to guide PrefetchComputes during the prefetch of remote tree chunks.
- State. Objects of this type capture state information during tree traversal. They are associated with TreeWalk objects and operated upon by compute objects. An example of the utility of these objects is in the computation of interaction lists for buckets of particles. The DoubleTreeWalk object passes nodes to a compute object that modifies the plist, clist and checklists of each level. These lists are encapsulated in a class that inherits from State and together constitute the state of the computation.
Methods
Objects provide interfaces through which certain actions can be initiated. We enumerate these below, grouping the methods of each class together.
- TreeWalk
virtual void init(Compute *c, TreePiece *owner)
Sets up the TreeWalk object with a Compute and an owner TreePiece. Should be called before the walk commences via the TreeWalk::walk(...) method.
TreePiece *getOwnerTP()
Returns the TreePiece object this object was instantiated with. Enables access to TreePiece methods such as requestNode()
virtual void walk(GenericTreeNode *gtn, State *state, int chunk, int reqID) = 0
An abstract method that defines how the walk is carried out by a specific kind of TreeWalk. A TopDownTreeWalk : public TreeWalk object, for example, performs a depth first traversal of the tree rooted at node `gtn'.
WalkType getSelfType()
Returns the TreeWalk's type. Used mostly when trying to debug.
Notes: The code also provides an ancestorCheck(node,...) function. This method is currently used only in the context of remote gravity computations and checks whether the specified tree node has an ancestor that will remain unopened during a local computation that is initiated after the remote walk. It helps avoid duplicate computations and will be obviated by proposed changes to the local computation, forcing local computation to start from the prefetch roots instead of the global root.
The TopDownTreeWalk::walk(...) function should not be called with a NULL-gtn argument. That is to say, the node at which a tree walk starts must be legitimate. Successor nodes, even if they are NULL-valued, are automatically fetched and traversed if such is the decision arrived at by the openCriterion.
- Compute
A Compute object contains a pointer to a (void *)computeEntity that represents a unit of computation, for example a bucket (in the case of a GravityCompute) or a structure containing prefetch information (a set of bounding boxes representing all the particles of the TreePiece, for a PrefetchCompute.) A Compute also has a pointer to an Opt object. This helps distinguish between contexts of computation - Remote, Local, etc. Compute methods are listed below.
virtual int doWork(GenericTreeNode *, TreeWalk *tw, State *state, int chunk, int reqID, bool isRoot, bool &didcomp) = 0
Performs the computation work on the node in question. The TreeWalk that invoked this method on the Compute is passed as an argument. Also passed in is a pointer to the State object encapsulating the state of the computation. The Compute first consults the Opt object by presenting it with the result of the openCriterion test and the node type. The Compute checks the value returned by the Opt and decides on a course of action, which could include computation using the computeEntity and the node passed in, dumping the subtree rooted at the node passed in, descending further into the tree, etc.
bool openCriterion(TreePiece *ownerTP, GenericTreeNode *node, int reqID);
Checks whether the node needs to be opened.
Besides these, there are a number of book-keeping methods that are invoked on the Compute by the TreeWalk upon the occurrence of different events. This list may be expanded upon in the future. As an example of what these methods do in different contexts, the GravityCompute::nodeMissedEvent(owner,chunk) increments the remainingChunk[chunk] value whereas the PrefetchCompute::nodeMissedEvent(owner,chunk) does nothing .
int startNodeProcessEvent(TreePiece *owner){
}
int finishNodeProcessEvent(TreePiece *owner){
}
int nodeMissedEvent(TreePiece *owner, int chunk){
}
- Opt
The optimization objects return an action to be performed on the node in question given a value for the openCriterion() test. This is done through the Opt::action(...) method:
int action(bool openDecision, GenericTreeNode *node){
return action_array[openDecision][node->getType()];
}
Example Interactions between Components
In this section, we give an example of how algorithms can be assembled in a piecemeal manner. We look at the construction of a local gravity computation.
- TreeWalk.C. We start with an outline of the code in place for a TopDownTreeWalk. The pertinent methods read thus:
<source lang="c++"> void TopDownTreeWalk::walk(GenericTreeNode *startNode, State *state, int chunk, int reqID){
dft(startNode, state, chunk, reqID, true);
} </source>
// perform depth-first traversal on node
void TopDownTreeWalk::dft(GenericTreeNode *node, State *state, int chunk, int reqID, bool isRoot){
int ret;
GenericTreeNode *tmp;
if(node == NULL){ // something went wrong here
ckerr << "TopDownTreeWalk recvd. null node - chunk("<<chunk<<"), reqID("<<reqID<<"), isRoot("<<isRoot<<")" << endl;
CkAbort("Abort");
}
currentGlobalKey = node->getKey();
// process this node
// didcomp is helpful while debugging, but not otherwise
bool didcomp = false;
ret = comp->doWork(node, this, state, chunk, reqID, isRoot, didcomp);
if(ret == KEEP){ // descend further down tree
for(int i = 0; i < node->numChildren(); i++){
GenericTreeNode *child = node->getChildren(i);
currentGlobalKey = node->getChildKey(i);
comp->startNodeProcessEvent(ownerTP);
// check whether child is NULL and get from cache if necessary/possible
if(child == NULL){
// nodeMissed(...) will automatically request node, if it is unavailable in the software cache
child = ownerTP->nodeMissed(reqID, currentGlobalKey, chunk, comp->getSelfType() == Prefetch, state, getSelfType(), comp->getSelfType(), comp->getOptType());
if(child == NULL){ // missed in cache, skip node for now
comp->nodeMissedEvent(ownerTP, chunk);
continue;
}
}// end check NULL node
/* process children recursively
the next can't be the first node we are processing, so isRoot = false
*/
dft(child, state, chunk, reqID, false);
}// end for each child
}// end if KEEP
//else DUMP - don't need the node anymore, return up the tree
comp->finishNodeProcessEvent(ownerTP);
return;
}
- Compute.C. Next, we look at the innards of a GravityCompute object. Note that in the following, the response of the method to the DEFER signal from the Opt, while correct, is quite inelegant and can be done without if the local walk is begun from prefetch roots instead of the global tree root. This will be done in future versions of the code. Keeping these points in mind, we don't list the corresponding section of the code.
int GravityCompute::doWork(GenericTreeNode *node, TreeWalk *tw,
State *state, int chunk, int reqID, bool isRoot, bool &didcomp){
// ignores state because we don't maintain it across the entire walk
TreePiece *tp = tw->getOwnerTP();
// so that we have a quick return in case of empty nodes
if(node->getType() == Empty || node->getType() == CachedEmpty){
return DUMP;
}
// check opening criterion
bool open = openCriterion(tp, node, reqID);
.
.
.
int action = opt->action(open, node);
if(action == KEEP){
return KEEP;
}
else if(action == COMPUTE){
didcomp = true;
// performs actual computation involving 'computeEntity' and 'node'
int computed = nodeBucketForce(node,
(GenericTreeNode *)computeEntity,
tp->getParticles(),
tp->decodeOffset(reqID),
activeRung);
// book-keeping
if(getOptType() == Remote){
tp->addToNodeInterRemote(chunk, computed);
}
else if(getOptType() == Local){
tp->addToNodeInterLocal(computed);
}
// we don't need to descend further down the tree
return DUMP;
}
else if(action == KEEP_LOCAL_BUCKET){
didcomp = true;
// since this is a local bucket, we should have the particles at hand
GravityParticle *part = node->particlePointer;
int computed = 0;
// do actual computation
for(int i = node->firstParticle; i <= node->lastParticle; i++){
computed += partBucketForce(
&part[i-node->firstParticle],
(GenericTreeNode *)computeEntity,
tp->getParticles(),
tp->decodeOffset(reqID),
activeRung);
}
if(getOptType() == Remote){
tp->addToParticleInterRemote(chunk, computed);
}
else if(getOptType() == Local){
tp->addToParticleInterLocal(computed);
}
// can not descend past a bucket
return DUMP;
}
else if(action == KEEP_REMOTE_BUCKET){
didcomp = true;
// fetch particles and compute.
ExternalGravityParticle *part;
part =
tp->requestParticles(node->getKey(),
chunk,
node->remoteIndex,
node->firstParticle,
node->lastParticle,
reqID, false);
if(part){ // particles available; compute
int computed = computeParticleForces(tp, node, part, reqID);
if(getOptType() == Remote){
tp->addToParticleInterRemote(chunk, computed);
}
else if(getOptType() == Local){
tp->addToParticleInterLocal(computed);
}
}
else{ // particles missed; record this fact
if(getOptType() == Remote){
tp->addToRemainingChunk(chunk, node->lastParticle-node->firstParticle+1);
}
}
return DUMP;
}
else if(action == DUMP){
return DUMP;
}
.
.
.
}
- TreePiece.cpp. Finally, we can put all the pieces together in the correct place. In the TreePiece::startNextBucket() method, we do the following:
.
.
.
TreeWalk *tw = new TopDownTreeWalk();
Compute *gc = new GravityCompute();
/* This walk does not need to maintain state across different
invocations of the walk(...) method
That is to say, computations involving one bucket are
independent of others. This is not the case, for example,
in the interaction list based algorithm, where the list of
interactions is maintained across the entire traversal at
different depths of the tree.
*/
State *nullstate = new NullState();
// We will be performing a local gravity computation
Opt *opt = new LocalOpt();
// Initialize the Compute and TreeWalk objects
gc->init(bucketList[currentBucket], activeRung, opt);
tw->init(gc, this);
.
.
.
// start the tree walk from the tree built in the cache
if (bucketList[currentBucket]->rungs >= activeRung) {
for(int x = -nReplicas; x <= nReplicas; x++) {
for(int y = -nReplicas; y <= nReplicas; y++) {
for(int z = -nReplicas; z <= nReplicas; z++) {
// In the following, the chunk num is set to -1: local computation,
// should never request remote nodes
#ifdef CACHE_TREE
tw->walk(localCache->getRoot(), nullstate, -1, encodeOffset(currentBucket,x,y,z));
#else
tw->walk(root, nullstate, -1, encodeOffset(currentBucket,x,y,z));
#endif
}
}
}
// compute the ewald component of the force
if(bEwald) {
BucketEwald(bucketList[currentBucket], nReplicas, fEwCut);
}
}
bucketReqs[currentBucket].finished = 1;
finishBucket(currentBucket);
// delete allocated objects
delete gc;
/* Note: although it is alright to delete the state object in this case
since it won't be used anyway, in the remote computation phase of the
interaction list algorithm, for example, state will be saved in the
CacheManager when a node is missed. This state will later be retrieved
and used when the node in question is made available to the TreePiece.
Therefore, it would be incorrect to delete the state object here.
*/
delete nullstate;
delete opt;
.
.
.
