Transtomo

Compile and Run Transtomo on Deepthought2
Tolulope Olugboji, olugboji@umd.edu
March 31, 2016


Overview

This documentation outlines the steps for compiling and running the extended version of transtomo on the deepthought2 cluster at the University of Maryland. This version is a personal adaptation and improvement of the freely distributed copy available from iEarth geophysics scientific software. It includes the following improvements:

  • Functionality to load, initialize, and restart chains from models stored in ASCII files
  • Functionality to resample and save full chain history in ASCII files
  • Functionality to lock cells that do not participate in the update of travel time data

Work is also ongoing to extend transtomo to invert for azimuthal anisotropy. This documentation will be updated with details when that extension is available. Feel free to contact me with any other questions about using this code or setting it up on the deepthought2 cluster at the University of Maryland.

Directory Listing

Here is a listing of the important directories that need to be set up on the cluster.

<Home-directory>
/homes/username(olugboji)
<Code-data-directory>
/lustre/username(olugboji)/transdimsurfacewavetomography/
<Code-binary-directory>
/home/deepthought2/username(olugboji)/bin/
<Run-shell-execute-directory>
/homes/username(olugboji)/binShell
<Input/Output-Data/Exhaust-directory>
/lustre/username(olugboji)/tomoParamFiles/
<rjMcMC-library-directory>
/home/deepthought2/olugboji

% Data that need to be backed up from the local store to the cluster directory ..

<local-prjDir>
:/Documents/UMD Seismo/
<local-code-store>
<local-prjDir>iEarthSoftware/transdimsurfacewavetomography
<local-data-store>
<local-prjDir>transdimUSANT/All Ta Sta/
<local-restrt-models>
<local-prjDir>transdimUSANT/All Ta Sta/VorModels/inModels

%.. Data from cluster exhaust to local directory
0.11
0.22
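Before compiling anything, it can help to confirm that the cluster directories in the listing actually exist. The following is a minimal Python 3 sketch, not part of transtomo: the function name `missing_directories` and the dictionary `CLUSTER_DIRS` are hypothetical, and the paths assume that `username(olugboji)` in the listing means the username `olugboji`. Replace them with your own.

```python
import os

# Placeholder names follow the directory listing; the paths are the
# author's examples (assumed username: olugboji) and must be adapted.
CLUSTER_DIRS = {
    "Home-directory": "/homes/olugboji",
    "Code-data-directory": "/lustre/olugboji/transdimsurfacewavetomography/",
    "Code-binary-directory": "/home/deepthought2/olugboji/bin/",
    "Run-shell-execute-directory": "/homes/olugboji/binShell",
    "Input/Output-Data/Exhaust-directory": "/lustre/olugboji/tomoParamFiles/",
    "rjMcMC-library-directory": "/home/deepthought2/olugboji",
}


def missing_directories(dirs):
    """Return the placeholder names whose paths are not existing directories."""
    return [name for name, path in sorted(dirs.items())
            if not os.path.isdir(path)]


if __name__ == "__main__":
    # Report every directory from the listing that still needs to be created.
    for name in missing_directories(CLUSTER_DIRS):
        print("missing: <%s> -> %s" % (name, CLUSTER_DIRS[name]))
```

Run on a login node, the script prints one line per directory that is still missing and prints nothing when the layout is complete.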


Log in to deepthought2

ssh -X username@login.deepthought2.umd.edu


Updating and compiling source code

The following steps compile and install the required libraries and other source code files:

  • Update the <code-data-directory> with the latest version of the source code, available in <local-prjDir> or at <gitHub directory> (see the Directory Listing above).

  • Compile and install the RJMCMC library (<code-data-directory>/RJMCMC_1.0.11/); see also its README file. On the cluster this involves the following steps:

> cd <code-data-directory>/RJMCMC_1.0.11/
> module load openmpi
> ./configure --prefix=<rjMcMC-library-directory>
> make
> make install

This is the only way to install the RJMCMC library on the cluster. Attempting a sudo make install will fail because users do not have root access on the cluster.

  • After the RJMCMC library installs successfully, compile the tomo_mpi and tomo code and copy the binaries into the <code-binary-directory>. The SBATCH shell scripts in <run-shell-execute-directory> load the binary executables from this directory. Here are the steps:

> cd <code-data-directory>/tomo-0.9.16
> setenv PKG_CONFIG_PATH /home/deepthought2/olugboji/lib/pkg-config
> module load openmpi
> ./configure -RJMCMC_FLAGS=-I<rjMcMC-library-directory> -I<$MPI_INC> -RJMCMC_LIBS=-L<$MPI_LIB> -lm -lmpi -lrjmcmc
> make
> cp tomo* <code-binary-directory>

  • Note: if the configure command fails, run it with empty values for the RJMCMC_FLAGS and RJMCMC_LIBS parameters, then edit the makefile and update the relevant flags with the include and link flags specified above. Also note that the two MPI environment variables, $MPI_INC and $MPI_LIB, are only set after the module load openmpi command has been run. These two environment variables are essential for compiling the MPI version of the code, i.e. tomo_mpi.
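Because the configure step depends on the two MPI environment variables, a quick check before running it can save a failed build. The sketch below is a minimal Python 3 illustration, not part of the transtomo distribution: the function name `missing_mpi_variables` is hypothetical, and the variable names are written as `MPI_INC` and `MPI_LIB` on the assumption that the MPI module on deepthought2 exports them with underscores.

```python
import os


def missing_mpi_variables(environ=None, required=("MPI_INC", "MPI_LIB")):
    """Return the required variable names that are unset or empty.

    The names in `required` are assumptions taken from the text; adjust
    them if the MPI module on your cluster exports different variables.
    """
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    missing = missing_mpi_variables()
    if missing:
        print("Load the MPI module first; unset: " + ", ".join(missing))
    else:
        print("MPI environment variables are set; configure can proceed.")
```

Run it in the same shell session in which the MPI module was loaded, since the check reads that session's environment.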

Build parameter files

Run the run_pythonSbatch.sh script. It submits one SLURM job per input file; each job runs the single-job shell script (runSingle_PhaseExpt.sh), which calls the Python code.

run_pythonSbatch.sh:

#!/bin/tcsh
#
# Author: Tolulope Olugboji
# Date:  March 6, 2015
#
#  Used to build parameter files by running them on UMD's deepthought2

module load python/2.7.8

# Replace with the following output directory for 3 expts - 1. Expt1_All 2. Expt2_RadialRal 3. Expt3_Seasons


set expts = (newExpts/RadialRal/ newExpts/Summer/ newExpts/Winter/)
set saveAs = (Expt2_RadialRal/  Expt3_Seasons/Sum/  Expt4_Seasons/Win/)
set use4slrm = (expt2 expt3 expt4)
set indxExpts = `seq 1 $#expts`  # experiment iterator

echo "Single in Directory ..." $expts[1]
echo $saveAs[1]
echo $#expts

foreach iExpt ($indxExpts)

    set inDirX = '/lustre/olugboji/buildParamFiles/USANT15/Measure/'
    set outDirX = '/lustre/olugboji/buildParamFiles/USANT15/THBIParams/'

    echo "START !!!!! Expt " $iExpt "-----------------------------------------------------"
    echo "Input Dir ..." $inDirX$expts[$iExpt]
    echo "Output Dir    " $outDirX$saveAs[$iExpt]

    set inDir = $inDirX$expts[$iExpt]
    set outDir = $outDirX$saveAs[$iExpt]

    set inLove = `ls $inDir`
    set indxPhase = `seq 1 $#inLove`
    #set indxPhase = `seq 12 21`

    echo "all Love" $inLove[1] "length" $#inLove
    echo $indxPhase

    foreach iPhase ($indxPhase)

        set file = $inLove[$iPhase]
        echo $iPhase " : " $inLove[$iPhase]
        setenv JOBNAME "$file"
        setenv INDIR "$inDir"
        setenv OUTDIR "$outDir"

        set slurmOut = "/lustre/olugboji/buildParamFiles/USANT15/mpiOUT/slurm-$use4slrm[$iExpt]-$file.txt"

        #sbatch --job-name=$JOBNAME --time=2-0 --output=$slurmOut --export=JOBNAME ./runSingle_PhaseExpt.sh
        sbatch --job-name=$JOBNAME --time=2-0 --output=$slurmOut --export=ALL ./runSingle_PhaseExpt.sh
    end

    echo "END!!!!! Expt " $iExpt "-----------------------------------------------------"
end
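The submission loop in the script above can be mirrored as a dry run, which is useful for checking job names and SLURM log paths before anything reaches the queue. This is a minimal Python 3 sketch under stated assumptions: `build_jobs` and its parameters are hypothetical names, and only the sbatch options already used in the tcsh script appear here.

```python
import os


def build_jobs(expts, in_root, out_root, slurm_dir, list_files=None,
               job_script="./runSingle_PhaseExpt.sh"):
    """Return one job description per input file, without submitting anything.

    expts is a sequence of (input_subdir, output_subdir, slurm_tag) tuples,
    mirroring the expts, saveAs and use4slrm arrays in the tcsh script.
    """
    if list_files is None:
        # Sorted directory listing, similar to what `ls` returns.
        list_files = lambda d: sorted(os.listdir(d))
    jobs = []
    for in_sub, out_sub, tag in expts:
        in_dir = in_root + in_sub
        out_dir = out_root + out_sub
        for name in list_files(in_dir):
            slurm_out = "%s/slurm-%s-%s.txt" % (slurm_dir.rstrip("/"), tag, name)
            jobs.append({
                # Variables the tcsh script exports with setenv.
                "env": {"JOBNAME": name, "INDIR": in_dir, "OUTDIR": out_dir},
                # Same options as the sbatch line in the tcsh script.
                "cmd": ["sbatch", "--job-name=" + name, "--time=2-0",
                        "--output=" + slurm_out, "--export=ALL", job_script],
            })
    return jobs
```

Printing `job["cmd"]` for each entry shows exactly what would be submitted. To submit for real, one option is `subprocess.call(job["cmd"], env=dict(os.environ, **job["env"]))`, so that `--export=ALL` passes JOBNAME, INDIR and OUTDIR on to the job as the tcsh version does.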

Setting up parameter and data files

Now that the libraries and binary executables are in place, the next stage is to set up the input parameters and data needed to stage the code on the cluster. Two directories are involved:

The first is the <Input/Output-Data/Exhaust-directory>, which stores the input data, the parameter files needed to start the tomo_mpi code, and the output generated after the analysis completes successfully. The datasets are organized by phase velocity, e.g. <Input/Output-Data/Exhaust-directory>/<folder>, where <folder> is one of the 22 phase velocity datasets: L05-L40 and R05-R40. Here is a listing of the data directory:

  • <Input/Output-Data/Exhaust-directory>/<folder>

Holds the important observation data, source-receiver configuration and path location files e.g. observations.dat, paths.dat, sources.dat, etc.

  • <Input/Output-Data/Exhaust-directory>/<folder>/results

Stores the saved output of a completed run, e.g. output mean, Voronoi-partition location and velocity history, misfit history, model save state, etc.

  • <Input/Output-Data/Exhaust-directory>/<folder>/restart

Stores the model states (i.e. high-resolution tessellated model states or intermediate model states) to be used when restarting a chain.
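The layout described above can be checked per dataset folder before a run is submitted. The following is a minimal Python 3 sketch: `check_dataset_folder` is a hypothetical helper, and the expected names are only the examples given in the text, which ends its file list with "etc.", so the check is not exhaustive.

```python
import os

# Example names taken from the text; the real folders may hold more files.
EXPECTED_FILES = ("observations.dat", "paths.dat", "sources.dat")
EXPECTED_SUBDIRS = ("results", "restart")


def check_dataset_folder(folder):
    """Return the expected entries that are missing from `folder`.

    Missing subdirectories are reported with a trailing slash.
    """
    missing = [f for f in EXPECTED_FILES
               if not os.path.isfile(os.path.join(folder, f))]
    missing += [d + "/" for d in EXPECTED_SUBDIRS
                if not os.path.isdir(os.path.join(folder, d))]
    return missing
```

Calling it on each <folder> under the <Input/Output-Data/Exhaust-directory> returns an empty list when the folder is ready for a run.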

The second is the <run-shell-execute-directory>, which holds the SBATCH shell scripts for staging the code on the cluster. Here are the files in this directory:

  • <run-shell-execute-directory>/tomo_sbatch.sh:

Runs tomo_mpi for all 22 phase velocity maps. You can tweak this script to run single maps, to specify an initial number of partitions, or to restart chains from set stages or particular input models. It is a wrapper around tomo_single.sh, which does all the heavy lifting.

  • <run-shell-execute-directory>/tomo_single.sh:

Used by tomo_sbatch.sh above to run tomo_mpi on a single phase velocity map. It also sets up all the required environment variables, determines how many chains to run, and sets the required flags.

  • <run-shell-execute-directory>/tomoParamMPI.nml:

Parameter file used by tomo_single.sh to set all the input parameters required by tomo_mpi, such as file paths, number of steps, distribution range, perturbation parameters, seed values for the probability distribution, and the state of the starting models. A second parameter file, <run-shell-execute-directory>/rstrtTomoParamMPI.nml, sets the parameters when the chain is to be restarted from a specified initial model state (typically a high-resolution near-mean state).

  • <run-shell-execute-directory>/dwnldRslts.sh:

Script used to download results data from the remote deepthought2 cluster machine to the <outputData-prjDir> on the local machine for post-processing and visualization.