Research:ChaNGaOptions

From Astronomy Facility Wiki


Here you can find a description of all the parameters supported by Research:ChaNGa. This list is probably incomplete, so please update this page if you notice something missing.

Normally parameters are set by a keyword in the parameter file. Note that these keywords follow a Hungarian Notation style where the first few characters indicate the type of variable. Parameters can be set (or overwritten) by command line switches as well.
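As a rough illustration of the naming convention, the sketch below guesses a parameter's type from its prefix. The prefix table is inferred only from the keywords listed on this page, and the type names on the right-hand side are assumptions rather than anything taken from the ChaNGa source.

```python
# Prefix-to-type table inferred from the keywords on this page.
# The type names are an assumption, not taken from ChaNGa's source.
PREFIX_TYPES = [
    ("ach", "string"),   # e.g. achInFile, achOutFile
    ("b", "boolean"),    # e.g. bPeriodic, bEwald
    ("d", "float"),      # e.g. dDelta, dTheta
    ("i", "integer"),    # e.g. iOutInterval
    ("n", "integer"),    # e.g. nSteps, nBucket
]


def guess_type(keyword):
    """Return the type suggested by a keyword's Hungarian prefix, or None."""
    for prefix, type_name in PREFIX_TYPES:
        rest = keyword[len(prefix):]
        # The prefix must be followed by an upper-case letter (e.g. dDelta).
        if keyword.startswith(prefix) and rest[:1].isupper():
            return type_name
    return None
```

For example, `guess_type("achOutFile")` gives `"string"` and `guess_type("nBucket")` gives `"integer"`; a switch such as `+balancer` does not follow the convention and gives `None`.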


Parallel Performance Options

One of the big differences between PKDGRAV and ChaNGa is ChaNGa's use of the CHARM load balancing system for parallel performance. The CHARM strategy is overdecomposition: divide the work into many more pieces than you have processors, and let the runtime system balance the load by assigning pieces to real processors. The following options control this process:

  • nTreePieces: the total number of pieces into which the computation should be split. A rule of thumb is that CHARM needs 8 to 16 times more pieces than processors to do reasonable load balancing, e.g. 512 pieces for a 64 processor run. However, fewer than 1000 particles per piece will result in too much bookkeeping overhead for good performance. The command line version of this is -p <Npieces>.
  • nDomainDecompose: strategy for breaking computation into pieces. The command line version of this is -D <nDomain>. Options are
    • 0: Space Filling Curve (SFC). Morton order the particles and split evenly among pieces.
    • 1: Oct tree. Divide along cubical boundaries.
    • 2: Orthogonal Recursive Bisection (ORB).
    • 3: Peano-Hilbert Space Filling Curve. Otherwise similar to 0.
    ORB is generally a bad choice. Ironically this is the method PKDGRAV uses. SFC generates more communication but is easier to load balance. Oct might need more pieces for good load balancing.
  • +balancer <balancer>: specify load balancing scheme. See the CHARM manual for all the possibilities. Reasonable choices at the moment are:
    • GreedyLB: use the Greedy algorithm to assign pieces to processors. It works OK with Oct domain decomposition, but generates a lot of communication with SFC.
    • OrbLB: Use an ORB tree to assign pieces to processors. It works OK with either SFC or Oct domain decomposition.
    Note that more work needs to be done in this area. New balancers are currently being developed. There are also a number of flags that can fine tune the load balancing. See the load balancer options in the CHARM manual. One useful option is
    • +LBObjOnly: ignore background loads when making migration decisions.
  • bPrefetch: turn on cache prefetching. The command line version of this is +f. ChaNGa will do a preliminary tree walk to preload the cache before doing the gravity calculation. We suggest that this should always be set.
  • nCacheDepth: depth of tree to fetch upon a cache miss. I.e., if a node is fetched, fetch its children (with a value of 1) or grandchildren (2) etc. 4 seems to be a good compromise. The command line version of this is -d <depth>.
  • nBucket: Maximum number of particles in the leaves of the tree. This is -b on the command line. This is a time-space trade-off. Raise this number and you will save space because the tree is smaller, but gravity calculations will be more expensive. Lower this number and gravity calculations will be faster. If the number is too low, gravity can slow down again due to less processor pipelining.
  • nYield: allow CHARM to interrupt computation to prioritize communication. A lower number results in a higher interrupt frequency. This is -y on the command line.
  • nChunks: split the global work to allow better pipelining. A suggested value is 16 for large simulations (or even 32 for huge simulations).
  • +rand: randomly order the chunks in the computation pipeline. This is the default, as it helps avoid bursts of communication to a single processor.
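The nTreePieces rule of thumb above can be turned into a small calculation. This is only a sketch of that arithmetic: the function name and its arguments are hypothetical, and ChaNGa itself does not pick this value for you.

```python
def suggest_tree_pieces(n_processors, n_particles,
                        pieces_per_proc=8, min_particles_per_piece=1000):
    """Suggest a value for nTreePieces from the rule of thumb.

    pieces_per_proc should be between 8 and 16. The result is capped so
    that each piece holds at least min_particles_per_piece particles.
    """
    wanted = pieces_per_proc * n_processors
    # Upper limit set by the bookkeeping-overhead constraint.
    cap = max(1, n_particles // min_particles_per_piece)
    return min(wanted, cap)


print(suggest_tree_pieces(64, 10_000_000))  # 512, as in the example above
print(suggest_tree_pieces(64, 100_000))     # 100, limited by particles per piece
```

When the particle count is the limiting factor, as in the second call, the run has too few particles per processor for the recommended degree of overdecomposition.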

Boundary condition options

  • bPeriodic: Use periodic boundary conditions. The command line version is +per.
  • nReplicas: Number of periodic replicas in each direction. The default is 1 for periodic boundary conditions.
  • dPeriod: Box size for periodic boundary conditions. The default is 1.0. Periodic volumes are centered on 0.0.
  • bEwald: Use Ewald expansion for periodic forces.
  • dEwCut: Real space cutoff in the Ewald expansion. The default is 2.6.
  • dEwhCut: Fourier space cutoff in the Ewald expansion. The default is 2.8.
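Because periodic volumes are centered on 0.0, a coordinate in a box of size dPeriod lies between -dPeriod/2 and +dPeriod/2. The following sketch shows one way to wrap a coordinate into that range, for example when preparing initial conditions. The function is hypothetical and is not ChaNGa's own wrapping code.

```python
def wrap_periodic(x, period=1.0):
    """Map coordinate x into the interval [-period/2, period/2)."""
    half = 0.5 * period
    # Shift so the box starts at 0, wrap with modulo, then shift back.
    return (x + half) % period - half
```

With the default dPeriod of 1.0, `wrap_periodic(0.75)` returns `-0.25`.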

Cosmology options

  • bComove: Use cosmological comoving coordinates for integration. The default is off.
  • dHubble0: Hubble constant in system units. The default is 0.0.
  • dOmega0: Comoving density of matter in units of the current critical density.
  • dLambda0: Cosmological constant in units of the current critical density.
  • dOmegaRad: Comoving density of radiation in units of the current critical density.
  • dOmegab: Comoving density of baryons in units of the current critical density.
  • dQuintess: Comoving density of a "Quintessence" field (with equation of state parameter w = -1/2) in units of the current critical density.
  • dRedTo: Final redshift to which to run the simulation.

Timestepping options

  • dDelta : The timestep to use when integrating the equations of motion in the simulation. The command line version of this is -dt. If multistepping is used, this is the largest allowed timestep that a particle can have.
  • nSteps : The total number of steps (of size dDelta) to complete. The command line option for this is -n.
  • dEta : The eta parameter is a dimensionless prefactor that controls the scale of the timestep picking criterion. That is, to pick smaller, more conservative timesteps, use a smaller eta. The command line version of this is -eta.

The timestep picked by this criterion is \Delta t = \eta \sqrt{\epsilon/a}, where \epsilon is the particle's softening length and a is the magnitude of the particle's acceleration.
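As a worked example of the criterion, the sketch below evaluates the formula for one particle and caps the result at dDelta, which is the largest timestep a particle may have. The function and argument names are hypothetical, and how ChaNGa assigns the result to a multistepping level is not covered here.

```python
import math


def timestep(eta, softening, accel_mag, d_delta):
    """Evaluate eta * sqrt(softening / accel_mag), capped at d_delta."""
    if accel_mag == 0.0:
        # No acceleration: the criterion gives no limit, so use dDelta.
        return d_delta
    dt = eta * math.sqrt(softening / accel_mag)
    return min(dt, d_delta)
```

For instance, with eta = 0.2, a softening length of 0.01 and an acceleration of 4.0 (all in system units), the criterion gives 0.2 × sqrt(0.0025) = 0.01.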

Force accuracy options

  • dTheta : The opening angle that determines when to open a branch in the tree. Smaller dTheta means more nodes will be opened, resulting in a more accurate, slower, calculation. The command line version of this is -theta.
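To illustrate how an opening angle trades accuracy for speed, here is a sketch of the classical Barnes-Hut opening test, in which a node is opened when its size divided by its distance exceeds theta. This is the textbook criterion, shown as an assumption for illustration; ChaNGa's actual opening test may differ in detail.

```python
import math


def should_open(node_size, node_center, point, theta):
    """Classical Barnes-Hut test: open the node if size / distance > theta."""
    distance = math.dist(node_center, point)
    if distance == 0.0:
        # The point sits at the node center, so the node must be opened.
        return True
    return node_size / distance > theta
```

A node of size 1.0 at distance 10.0 is accepted as a single interaction with theta = 0.7, but is opened with theta = 0.05, which is why a smaller dTheta is slower and more accurate.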

Input/Output options

  • bBenchmark : Benchmark only mode: no output or checkpoints are written. The command line version of this is +bench. Default is off.
  • achInFile : The filename of the initial conditions (in tipsy format). The command line version of this is -I.
  • achOutFile : The prefix of the output filenames. The command line version of this is -o. The actual outputs will be tipsy standard format files, with filenames consisting of this prefix and a suffix of six digits (left zero-padded) giving the timestep for that output. The log file name also starts with this prefix.
  • iOutInterval : This determines how often to write an output of the simulation. The command line version of this is -oi. An output file will be written every iOutInterval timesteps, and will have the name achOutFile.current_timestep_padded_to_six_digits. Output files are in tipsy standard format.
  • iLogInterval : This determines how often an entry will be made into the log file, which has the filename achOutFile.log. You can learn about the contents of the log files on the Research:LogFileContents page. The command line version of this is -ol.
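The naming scheme described above can be sketched as follows, which is handy when a post-processing script needs to locate outputs. The helper names are hypothetical, and `output_steps` assumes outputs fall on exact multiples of iOutInterval, which this page does not state explicitly.

```python
def output_filename(ach_out_file, step):
    """Build achOutFile.<timestep padded to six digits>."""
    return f"{ach_out_file}.{step:06d}"


def log_filename(ach_out_file):
    """Build the log file name, achOutFile.log."""
    return f"{ach_out_file}.log"


def output_steps(n_steps, i_out_interval):
    """Steps at which an output is expected (assumes exact multiples)."""
    return list(range(i_out_interval, n_steps + 1, i_out_interval))


print(output_filename("run1", 250))  # run1.000250
```

Here `run1` is a placeholder prefix; with nSteps = 1000 and iOutInterval = 250, `output_steps` lists steps 250, 500, 750 and 1000.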

Checkpoint/Restart options

  • iCheckInterval : This determines how often to write checkpoint files. The command line version of this is -oc. A checkpoint is written every iCheckInterval major timesteps. The checkpoint gets written to a directory basename.chk? where "?" alternates between "0" and "1". After the checkpoint completes, a file "lastcheckpoint" is written to indicate the latest clean checkpoint.
  • iWallRunTime : Number of wall clock minutes available for running. The command line version of this is -wall. ChaNGa will attempt to limit its run time to less than this number of minutes. When time is running out, it will write a checkpoint and exit.
  • +restart directory : this is only a command line option. Note that if this option is given, all other command arguments and the parameter file will be ignored. The calculation will continue from the checkpoint written in directory using all parameters from that checkpoint.

If a file named STOP is created in the directory in which ChaNGa is running, it will write a checkpoint and exit at the end of the currently running major timestep.
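Requesting a clean shutdown from a script only requires creating that file. A minimal sketch, assuming the script knows the directory in which ChaNGa is running (the function name is hypothetical):

```python
from pathlib import Path


def request_stop(run_dir="."):
    """Create an empty file named STOP in the run directory."""
    stop_path = Path(run_dir) / "STOP"
    stop_path.touch()
    return stop_path
```

Remove any leftover STOP file by hand before starting the next run if it is still present in the run directory.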

Movie Making options

ChaNGa has the ability to generate movie frames on the fly in PPM format. The interval between frames is controlled by either:

  • dDumpFrameTime: simulation time interval between frames.
  • dDumpFrameStep: number of timesteps between frames.

The rendering of the frames is controlled by the .director file. See the movie/director.README file in the ChaNGa distribution for an explanation of this file.
