Modules

Introduction

A variety of apps is available to mc2 users, in some cases in several versions. In the present context, the term apps refers to compilers, tools, libraries, data, etc. They are managed at the user level with Lmod environment modules, meaning that users can make apps accessible or inaccessible with simple commands.

Apps are made available to your current shell or to your Slurm job environment by loading app modules. Modules update shell environment variables (like PATH or LD_LIBRARY_PATH) so that the system can find and use the apps of your choice. Modules remove the need to create or change setup scripts like .bashrc or .bash_profile.

For further details about Lmod, see the article Environment Modules Using Lmod by Jeff Layton in Admin Magazine, which gives an excellent explanation of how Lmod works, with nice examples.

How it works

Within Lmod, the environment variable MODULEPATH is paramount. It consists of a sequence of colon-separated paths pointing to directories of module files, each file defining a working environment for a specific app. MODULEPATH therefore defines the software that is currently reachable. For example, the current MODULEPATH can be printed with:

[user@hn ~]$ echo $MODULEPATH
/opt/apps/stack/modules/all:/opt/apps/stack/modules/compiler:/opt/apps/stack/modules/toolchain
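Since MODULEPATH is just a colon-separated list, it is easier to read with one path per line. The following is a minimal sketch; the function name split_modulepath is hypothetical and not part of Lmod. It only reads the variable and never modifies it.

```shell
# Print each entry of a colon-separated path list on its own line.
# With no argument, the current MODULEPATH is used.
split_modulepath() {
    printf '%s\n' "${1-${MODULEPATH-}}" | tr ':' '\n'
}

split_modulepath
```

Passing an explicit list, as in split_modulepath "$MODULEPATH_GENOA", inspects another stack without touching the active one.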

You may customize MODULEPATH using the module use command (see the section devoted to this topic further below).

We may list the available modules, keeping only those whose name contains the word "intel":

[user@hn ~]$ module avail intel

-------------- /opt/apps/stack/modules/all --------------
   impi/2021.13.0-intel-compilers-2024.2.0 (L)
   intel-compilers/2024.2.0                (L,D)
   intel/2024a                             (L,D)

----------- /opt/apps/stack/modules/compiler ------------
   intel-compilers/2024.2.0

----------- /opt/apps/stack/modules/toolchain -----------
   intel/2024a
...

Modules are named in the form name/version. Importantly, the version is not necessarily a simple numeric tag; it may also carry important information, such as the compiler used to build the app. From the listing above we see that we are offered modules that provide the Intel MPI libraries (impi), the Intel compilers (intel-compilers), and the Intel compilers + Intel MPI + MKL + FFTW libraries (intel). The first is a simple module, the second belongs to the compiler class, while the third is classified as a toolchain.

Modules have to be loaded before using their executables/libraries. Apps made available by a module may depend on software (provided by additional modules), and all of that must be loaded for proper execution.

For instance, we can try to run the Intel Fortran compiler before and after loading the appropriate module:

[user@hn ~]$ ifx --version
bash: ifx: command not found...
[user@hn ~]$ module load intel-compilers
[user@hn ~]$ ifx --version
ifx (IFX) 2024.2.0 20240602
Copyright (C) 1985-2024 Intel Corporation. All rights reserved.
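The same check can be scripted, for example at the top of a build script. The sketch below loads a module only when the command it should provide is missing. The helper name ensure_command is hypothetical; it relies only on module load and the standard command -v.

```shell
# Make sure a command is usable, loading a module if necessary.
#   $1: command to look for
#   $2: module expected to provide it
ensure_command() {
    if command -v "$1" >/dev/null 2>&1; then
        return 0                      # already on PATH, nothing to do
    fi
    if ! command -v module >/dev/null 2>&1; then
        echo "Lmod is not available in this shell" >&2
        return 1
    fi
    # Load the module, then confirm that the command really appeared.
    module load "$2" && command -v "$1" >/dev/null 2>&1
}

ensure_command ifx intel-compilers || echo "ifx is still unavailable" >&2
```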

We can also list all currently loaded modules:

[user@hn ~]$ module list

Currently Loaded Modules:
  1) GCCcore/13.3.0
  2) zlib/1.3.1-GCCcore-13.3.0
  3) binutils/2.42-GCCcore-13.3.0
  4) intel-compilers/2024.2.0

Although we loaded intel-compilers only, other modules (GCCcore, zlib, binutils) were loaded automatically in order to satisfy dependencies (the Intel compilers were built using these apps).

Software Stacks

There are two independent sets of apps in mc2, hereafter referred to as Stacks. These are named Cascadelake and Genoa stacks, and they are located under the following paths:

  • /opt/apps/stack - on the Head Node and compiled for the Intel Cascadelake CPU;
  • /mnt/beegfs/apps/stack - on the parallel filesystem and compiled for the AMD Genoa CPU.

Apps from the Genoa software stack will NOT run on the Head Node. Conversely, you CANNOT run software compiled for the Cascadelake CPU on Compute Nodes.

By default MODULEPATH points to the Cascadelake software stack, meaning that we can readily run the software upon logging into the Head Node. On the other hand, MODULEPATH will point to the Genoa stack when we submit a job to Slurm (via the sbatch command), or when we land on a Compute Node for an interactive Slurm job (more on that here).

The system offers the pre-defined variables MODULEPATH_CASCADELAKE and MODULEPATH_GENOA, which can be used to redefine MODULEPATH. For example, suppose that we have just logged into the system. Printing MODULEPATH shows the Cascadelake stack:

[user@hn]$ echo $MODULEPATH
/opt/apps/stack/modules/all:/opt/apps/stack/modules/compiler:/opt/apps/stack/modules/toolchain

Now compare it with the Genoa stack:

[user@hn]$ echo $MODULEPATH_GENOA
/mnt/beegfs/apps/stack/modules/all:/mnt/beegfs/apps/stack/modules/compiler:/mnt/beegfs/apps/stack/modules/toolchain

You may change the active stack by unsetting the current path and setting a new one (more on this below):

[user@hn]$ module unuse $MODULEPATH
[user@hn]$ module use $MODULEPATH_GENOA
[user@hn]$ echo $MODULEPATH
/mnt/beegfs/apps/stack/modules/all:/mnt/beegfs/apps/stack/modules/compiler:/mnt/beegfs/apps/stack/modules/toolchain

Of course, the above is of limited use on the Head Node, except for instance when we need to check which software exists in the Genoa stack (for running on Compute Nodes).
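One way to look at the Genoa stack without changing the current session is to run the unuse/use sequence inside a subshell. This is a sketch only; the function name peek_genoa is hypothetical. Any arguments are passed on to module avail as a search string.

```shell
# List the modules of the Genoa stack without altering the calling shell.
peek_genoa() {
    if ! command -v module >/dev/null 2>&1; then
        echo "Lmod is not available in this shell" >&2
        return 1
    fi
    # The parentheses start a subshell: changes made to MODULEPATH
    # inside it are discarded when the subshell exits.
    (
        module unuse "$MODULEPATH"
        module use "$MODULEPATH_GENOA"
        module avail "$@"
    )
}

peek_genoa || true
```

After the call, echo $MODULEPATH still prints the stack that was active before.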

Customize your available modules

The module use command is not an assignment. It adds a path to whatever already exists in MODULEPATH. Hence, this variable must be emptied first should we want to redefine it from scratch. This is achieved with the module unuse command. So for instance:

[user@hn]$ module unuse $MODULEPATH
[user@hn]$ module use $MODULEPATH_CASCADELAKE

will switch to the Cascadelake stack, irrespective of the previously active stack. In the above, the first command removes every path listed in $MODULEPATH from the variable MODULEPATH, effectively emptying it. The second command then adds the paths in MODULEPATH_CASCADELAKE to MODULEPATH.
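The unuse/use sequence can be wrapped in a small shell function so that the stack is selected by name. This is a minimal sketch: the function switch_stack is hypothetical and assumes that MODULEPATH_CASCADELAKE and MODULEPATH_GENOA are defined as described above.

```shell
# Switch the active software stack by name.
#   $1: cascadelake | genoa
switch_stack() {
    local new_path
    case "$1" in
        cascadelake) new_path="$MODULEPATH_CASCADELAKE" ;;
        genoa)       new_path="$MODULEPATH_GENOA" ;;
        *) echo "usage: switch_stack cascadelake|genoa" >&2; return 2 ;;
    esac
    if [ -z "$new_path" ]; then
        echo "no module path is defined for stack '$1'" >&2
        return 1
    fi
    # Empty MODULEPATH through Lmod, then add the paths of the new stack.
    module unuse "$MODULEPATH"
    module use "$new_path"
}
```

With the function defined in the current shell, switch_stack genoa selects the Genoa stack and switch_stack cascadelake returns to the default one.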

Note that we can also remove individual paths from MODULEPATH:

[user@hn]$ echo $MODULEPATH
/path1:/path2:/path3
[user@hn]$ module unuse /path2
[user@hn]$ echo $MODULEPATH
/path1:/path3

Important 1: Never change the variable MODULEPATH with a direct assignment; always do it with a module unuse/use command sequence. Direct assignments like MODULEPATH=/my/path1:/my/path2 are likely to break the internal workings of Lmod, leading to unexpected results.

Important 2: Do not add module load commands to your login scripts like .bashrc or .bash_profile. These are likely to be forgotten and to cause unwanted effects that are difficult to pin down. Instead, create a script with your favorite module commands and call it manually whenever you need it.
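Such a script could look like the sketch below. The file name ~/my_modules.sh is hypothetical, and the module names are only examples taken from the module avail intel listing shown earlier. The script must be sourced (source ~/my_modules.sh) rather than executed, so that the modules are loaded into the current shell.

```shell
# Hypothetical file ~/my_modules.sh; use it with:  source ~/my_modules.sh
# Example module names; replace them with the ones you need.
MY_MODULES="intel-compilers/2024.2.0 impi/2021.13.0-intel-compilers-2024.2.0"

if command -v module >/dev/null 2>&1; then
    module purge                 # start from a clean environment
    # MY_MODULES is left unquoted on purpose: each name becomes one argument.
    module load $MY_MODULES
    module list                  # show what ended up loaded
else
    echo "Lmod is not available in this shell" >&2
fi
```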

If you want to customize your MODULEPATH permanently for every session, for instance if you only run parallel jobs on compute nodes (Genoa), you may want to add the following lines to your ~/.bashrc file:

# User-defined (permanent) definition of MODULEPATH
if [ -z "$BASHRC_READ" ]; then
   export BASHRC_READ=1
   # Place any module commands below this point, for example:
   module unuse $MODULEPATH
   module use $MODULEPATH_GENOA
fi

The lines enclosed by if/fi will only be executed once for each login, no matter how many shells you launch.

Some useful module commands

List the currently loaded modules:

[user@hn]$ module list

Currently Loaded Modules:
  1) GCCcore/13.3.0              3) binutils/2.42-GCCcore-13.3.0
  2) zlib/1.3.1-GCCcore-13.3.0   4) GCC/13.3.0

Find out what modules are available to be loaded:

[user@hn]$ module avail

--------------------- /mnt/apps/modules/toolchain ---------------------
   foss/2024.05 (D)    gompi/2024.05 (D)

------------------------ /mnt/apps/modules/all ------------------------
   Autoconf/2.72-GCCcore-13.3.0
   Automake/1.16.5-GCCcore-13.3.0
   Autotools/20231222-GCCcore-13.3.0
   BLIS/1.0-GCC-13.3.0
   Bison/3.8.2-GCCcore-13.3.0
   Bison/3.8.2                        (D)
.
.
.

The module avail command has search capabilities. The following command finds all available modules whose name contains the string mpi:

[user@hn]$ module avail mpi

--------------------- /mnt/apps/modules/toolchain ---------------------
   gompi/2024.05 (D)

------------------------ /mnt/apps/modules/all ------------------------
   FFTW.MPI/3.3.10-gompi-2024.05    ScaLAPACK/2.2.0-gompi-2024.05-fb
   OpenMPI/5.0.3-GCC-13.3.0         gompi/2024.05
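For use in scripts, the listing can be reduced to one module per line and filtered with grep. The sketch below assumes two Lmod behaviours: the -t (terse) option, and listings being written to standard error by default, which is why 2>&1 is needed before the pipe. The function name find_modules is hypothetical.

```shell
# Print the available modules whose name matches a pattern, one per line.
#   $1: pattern (grep syntax, matched case-insensitively)
find_modules() {
    if ! command -v module >/dev/null 2>&1; then
        echo "Lmod is not available in this shell" >&2
        return 1
    fi
    # Terse output has one entry per line; merge stderr into the pipe.
    module -t avail 2>&1 | grep -i -- "$1"
}
```

For example, find_modules mpi lists the same modules as module avail mpi, in a form that is easy to process further.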

If there are many modules on a system, the following command provides a concise listing:

[user@hn]$ module overview

--------------------- /mnt/apps/modules/toolchain ---------------------
foss (1)   gompi (1)

------------------------ /mnt/apps/modules/all ------------------------
Autoconf  (1)   Ninja     (1)   UnZip      (1)   libfabric    (1)
Automake  (1)   OpenBLAS  (1)   XZ         (1)   libffi       (1)
Autotools (1)   OpenMPI   (1)   binutils   (3)   libpciaccess (1)
BLIS      (1)   OpenSSL   (1)   bzip2      (1)   libreadline  (1)
Bison     (2)   PMIx      (1)   cURL       (1)   libtool      (1)
CMake     (1)   PRRTE     (1)   flex       (2)   libxml2      (1)
FFTW.MPI  (1)   Perl      (2)   foss       (1)   make         (1)
FFTW      (1)   Python    (1)   gettext    (1)   ncurses      (2)
FlexiBLAS (1)   SQLite    (1)   gompi      (1)   numactl      (1)
GCC       (1)   ScaLAPACK (1)   help2man   (1)   pkgconf      (2)
GCCcore   (1)   Tcl       (1)   hwloc      (1)   xorg-macros  (1)
M4        (2)   UCC       (1)   libarchive (1)   zlib         (3)
Meson     (1)   UCX       (1)   libevent   (1)

---------------- /usr/share/lmod/lmod/modulefiles/Core ----------------
lmod (1)   settarg (1)

List full information about the modules whose name/version contains a given string (the argument):

[user@hn]$ module spider GCC

----------------------------------------------------------------------
  GCC: GCC/13.3.0
----------------------------------------------------------------------
    Description:
      The GNU Compiler Collection includes front ends for C, C++,
      Objective-C, Fortran, Java, and Ada, as well as libraries for
      these languages (libstdc++, libgcj,...).


     Other possible modules matches:
        GCCcore

    This module can be loaded directly: module load GCC/13.3.0

    Help:
      Description
      ===========
      The GNU Compiler Collection includes front ends for C, C++, 
      Objective-C, Fortran, Java, and Ada, as well as libraries for 
      these languages (libstdc++, libgcj,...).

      More information
      ================
       - Homepage: https://gcc.gnu.org/

Load and unload modules:

[user@hn]$ module load package1 package2
[user@hn]$ module unload package2

Unload (discard) all loaded modules:

[user@hn]$ module purge

Forcibly unload all loaded modules, including sticky modules:

[user@hn]$ module --force purge

Modulefiles can contain internal help documentation. To access a module’s help, pass the module name to module help:

[user@hn]$ module help GCC/13.3.0

Prepend a user defined path to the MODULEPATH list:

[user@hn]$ module use /user/defined/path
[user@hn]$ echo $MODULEPATH
/user/defined/path:/my/previous/module/path

Append a user defined path to the MODULEPATH list:

[user@hn]$ module use -a /user/defined/path
[user@hn]$ echo $MODULEPATH
/my/previous/module/path:/user/defined/path

Users may save their favorite collection of modules...

[user@hn]$ module save

... and restore them on a subsequent terminal session:

[user@hn]$ module restore

They can also save their module collection under a specified name:

[user@hn]$ module save my_modules

Verify the list of saved module collections:

[user@hn]$ module savelist

   1) my_modules

And load the collection in the future:

[user@hn]$ module restore my_modules
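Collections are convenient in scripts as well. The sketch below restores a named collection and, if that fails, loads the given modules and saves them under that name for next time. It assumes that module restore returns a non-zero exit status when the collection does not exist; the function name restore_or_create is hypothetical.

```shell
# Restore a saved collection, creating it on first use.
#   $1: collection name
#   remaining arguments: modules to load if the collection is missing
restore_or_create() {
    local name="$1"
    shift
    if module restore "$name"; then
        return 0
    fi
    # First use: load the requested modules and save them as a collection.
    module load "$@" && module save "$name"
}
```

A call such as restore_or_create my_modules GCC/13.3.0 loads and saves the module the first time, and simply restores the collection afterwards.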