Compiling

Instructions below assume that MODULEPATH is currently pointing to the Genoa software stack (default). Further details are available in Section Software Modules.

However, any code executed on the landing shell of mc2 must also be compiled on the Head Node. Here you must use software modules compiled for the Cascadelake architecture (Intel® Xeon® Gold 6252). They can be made available via

[user@hn]$ module unuse $MODULEPATH
[user@hn]$ module use $MODULEPATH_CASCADELAKE
[user@hn]$ module load <module>

where <module> is the name of a specific software package, for instance a compiler or a library. For example, to compile a code with OpenMP support using the GNU Fortran compiler:

[user@hn]$ module unuse $MODULEPATH
[user@hn]$ module use $MODULEPATH_CASCADELAKE
[user@hn]$ module load GCC
[user@hn]$ OPT="-O2 -fopenmp -march=cascadelake -ffpe-summary=none"
[user@hn]$ gfortran $OPT sample.f

We can also use -march=native (instead of cascadelake), leaving it up to the compiler to determine which optimized instructions it can generate.
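
As a quick end-to-end test, we can create a minimal OpenMP program and build it with the options above. Note that the contents of sample.f shown here are only an illustration, not a file provided on the cluster:

[user@hn]$ cat > sample.f << 'EOF'
      program hello
!$omp parallel
      print *, 'hello from an OpenMP thread'
!$omp end parallel
      end program hello
EOF
[user@hn]$ gfortran $OPT sample.f -o hello
[user@hn]$ OMP_NUM_THREADS=2 ./hello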

Important: do not carry out heavy computational tasks on the Head Node, so that other users are not affected by your activity.

Software destined to be executed on Compute Nodes should be compiled on those nodes by launching an interactive job within Slurm (this is further explained below). In this case you must use software modules compiled for the Genoa architecture (AMD® EPYC™ 9454). Also note that compilation will be faster if you move the source files to Compute Space (under /compute), or even better, if you place your files under /tmp, thus avoiding slow I/O via NFS to your home space. You will understand this better with the help of the mini compute cluster hardware diagram.
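
For example, a build staged under /tmp could look like this (a minimal sketch: the directory name build is arbitrary, and copying the binary back to your home is only needed because /tmp is local to the node):

[user@cn1]$ mkdir -p /tmp/$USER/build && cp ~/sample.f /tmp/$USER/build && cd /tmp/$USER/build
[user@cn1]$ module load GCC
[user@cn1]$ gfortran -O2 -march=znver4 sample.f -o sample
[user@cn1]$ cp sample ~/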

Finally, make sure that you load the right modules before using compilers and libraries; that is, make sure that MODULEPATH points to a software stack compiled consistently for your CPU target (see the Software Modules section).

Available Compilers

Distribution  Language  MPI  Command   Module
GNU           Fortran   No   gfortran  GCCcore
GNU           C         No   gcc       GCCcore
GNU           C++       No   g++       GCCcore
GNU           Fortran   Yes  mpifort   OpenMPI
GNU           C         Yes  mpicc     OpenMPI
GNU           C++       Yes  mpic++    OpenMPI
Intel         Fortran   No   ifx       intel-compilers
Intel         C         No   icx       intel-compilers
Intel         C++       No   icpx      intel-compilers
Intel         Fortran   Yes  mpiifx    impi
Intel         C         Yes  mpiicx    impi
Intel         C++       Yes  mpiicpx   impi
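
For instance, to make the GNU MPI wrappers from the table available and check which underlying compiler they invoke (module load and --version are standard checks):

[user@cn1]$ module load OpenMPI
[user@cn1]$ mpifort --version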

Compiling code for Genoa CPUs (Compute Nodes)

Direct user access to compute nodes (for instance via ssh) is not allowed. It is however possible to launch an interactive shell like bash on a Compute Node using Slurm. In the example below we request one node to launch bash, and once the shell is running on cn1, we may load the necessary software modules and run the compiler interactively:

[user@hn]$ srun --nodes=1 --pty bash -i
[user@cn1]$ module load GCC
[user@cn1]$ FFLAGS="-O2 -ftree-vectorize -march=znver4 -fno-math-errno"
[user@cn1]$ gfortran $FFLAGS sample.f
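
If the build itself runs in parallel (for instance make -j), it can help to request several CPUs for the interactive shell. A minimal sketch using standard Slurm options (adjust the CPU count to your site's limits; the Makefile here is assumed to exist in your project):

[user@hn]$ srun --nodes=1 --cpus-per-task=8 --pty bash -i
[user@cn1]$ make -j $SLURM_CPUS_PER_TASK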

Also note that we did not have to point MODULEPATH to the Genoa stack (that is already done by default).
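
If in doubt, the active stack can be checked by inspecting the variable itself:

[user@cn1]$ echo $MODULEPATH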

GNU Compilers

Command line options can be found on the GCC online documentation web site. The recommended command-line optimization options are:

  • -O2, Considers nearly all supported optimizations that do not involve a space-speed tradeoff.
  • -ftree-vectorize, Perform vectorization on trees.
  • -march=native, Choose the target-specific optimizations. You may replace the target native by cascadelake for the Head Node or znver4 for Compute Nodes.
  • -fno-math-errno, Do not set errno after calling math functions that are executed with a single instruction.

Examples

Compile serial binary on a Compute Node:

[user@cn1]$ module load GCC
[user@cn1]$ FFLAGS="-O2 -ftree-vectorize -march=znver4 -fno-math-errno"
[user@cn1]$ gfortran $FFLAGS sample.f

Compile a parallel binary on a Compute Node:

[user@cn1]$ module load OpenMPI
[user@cn1]$ FFLAGS="-O2 -ftree-vectorize -march=znver4 -fno-math-errno"
[user@cn1]$ mpifort $FFLAGS sample.f
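
Once compiled, the resulting MPI binary (a.out by default) can be started from the same interactive shell; a minimal sketch, assuming four ranks fit within your allocation:

[user@cn1]$ mpirun -np 4 ./a.out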

Intel® Compilers

Command line options can be found on the oneAPI Documentation web site. The recommended command-line optimization options are:

  • -O2, Enables optimizations for speed. Vectorization is enabled.
  • -march=core-avx2, Tells the compiler to generate code with Advanced Vector Extensions 2 (Intel® AVX2). This option is valid when compiling on the Compute Nodes (AMD EPYC processors). For the Head Node it should be replaced by -march=cascadelake or -xCORE-AVX2.
  • -ftz, Flush denormal results to zero.
  • -fp-speculation=safe, Tells the compiler to disable speculation on floating-point operations if there is a possibility that the speculation may cause a floating-point exception.
  • -fp-model precise, Disables optimizations that are not value-safe on floating-point data.

Examples

Compile serial binary on a Compute Node:

[user@cn1]$ module load intel-compilers
[user@cn1]$ FFLAGS="-O2 -march=core-avx2 -ftz -fp-speculation=safe -fp-model precise"
[user@cn1]$ ifx $FFLAGS sample.f

Compile a parallel binary on a Compute Node:

[user@cn1]$ module load impi
[user@cn1]$ FFLAGS="-O2 -march=core-avx2 -ftz -fp-speculation=safe -fp-model precise"
[user@cn1]$ mpiifx $FFLAGS sample.f

Compiling code for Cascadelake CPUs (Head Node)

In this case we have to activate the Cascadelake software stack:

[user@hn]$ module unuse $MODULEPATH
[user@hn]$ module use $MODULEPATH_CASCADELAKE

Subsequent steps are identical to those on Compute Nodes: load compiler/library modules, and simply run the compiler.

Intel® Compilers

Compile a binary on the Head Node using OpenMP parallelization:

[user@hn]$ module load intel-compilers
[user@hn]$ FFLAGS="-O2 -xHost -qopenmp -ftz -fp-speculation=safe -fp-model precise"
[user@hn]$ ifx $FFLAGS sample.f

GNU Compilers

Compile a serial binary on the Head Node:

[user@hn]$ module load GCC
[user@hn]$ FFLAGS="-O2 -ftree-vectorize -march=cascadelake -ffpe-summary=none"
[user@hn]$ gfortran $FFLAGS sample.f

You can also use -march=native and let the compiler determine the best architecture-dependent instructions to use.
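
To see which target -march=native actually resolves to on a given node, GCC can be queried directly (standard GCC options):

[user@hn]$ gcc -march=native -Q --help=target | grep -- '-march='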