Benchmarking BRGM’s EFISPEC3D earthquake simulation application on AWS

HPC
Written by Damien Dubuc on 09 February 2019

Lilia ZIANE KHODJA (ANEO), Damien DUBUC (ANEO), Florent DE MARTIN (BRGM), Faïza BOULAHYA (BRGM), Steve MESSENGER (AWS), Gilles TOURPE (AWS), Diego BAILON HUMPERT (Intel), Loubna WORTLEY (Intel)

The Bureau de Recherches Géologiques et Minières (BRGM, French Geological Survey) is France’s leading public institution for Earth Science applications for the management of surface and sub-surface resources with a view to sustainable development. Under partnerships with numerous public and private stakeholders, it focuses on scientific research, providing scientifically-validated information to support public policy development and international cooperation.

To this end, BRGM needs to carry out high performance computing (HPC) simulations to assess various quantities of interest for public policy, such as sea level rise, seismic hazard or building vulnerability. For example, large-scale, physics-based earthquake scenarios are performed to assess seismic hazard as well as to quantify their degree of confidence with respect to the epistemic or aleatoric uncertainties. This approach requires tremendous computing resources that are generally provided by national (Tier-1) or European (Tier-0) infrastructures.

This post reports on the performance of EFISPEC3D, BRGM’s HPC application to compute earthquake scenarios on AWS.

EFISPEC3D is a scientific computer program that solves the three-dimensional equations of motion using a continuous Galerkin spectral finite element method. The code is parallelized using the Message Passing Interface (MPI).

“EFISPEC3D is optimized to scale over several thousands of cores for a standard mesh size (e.g., 2 million spectral elements) thanks to its non-blocking communication scheme and its three levels of MPI communicators. The performance results depicted hereafter, using a mesh of 1,906,624 spectral elements partitioned onto 27,036 physical cores, show fair scaling down to 70 spectral elements per physical core.” EFISPEC3D is used in different areas, such as seismic hazard assessment for critical industrial facilities like nuclear power plants, urgent computing, sensitivity analysis via machine learning, and seismic wave propagation in complex geological media (e.g., sedimentary valleys, volcanoes).

EFISPEC3D has been developed by the French Geological Survey (BRGM) since 2009 in collaboration with Intel Corp., the Institute of Earth Sciences (ISTerre), and the Fundamental Computer Science Laboratory of Orléans (LIFO). It has been, and remains, part of several research projects funded by the French National Research Agency (ANR).





Different views of the mesh of test case n°2, designed with two levels of refinement: one from 315 m to 105 m and one from 105 m to 35 m, in order to account for seismic wavelengths of at least 35 meters. Within the contour of the basin, the mesh is refined over a constant depth that includes the bottommost point of the sedimentary basin under investigation. The mesh contains 7,053,889 unstructured hexahedral elements (approximately 1.4 billion degrees of freedom) on which the weak form of the equations of motion is solved.


Performance Results

The goal here is to show how AWS services can be used to run BRGM's predictive simulations. A typical simulation that would take about 300 days on a single core runs in about 25 minutes on AWS, using 27k physical cores with hyperthreading disabled.

AWS has made available several types of Intel-based Amazon Elastic Compute Cloud (Amazon EC2) instances. The 27k cores used for this performance report come from c5n.18xlarge instances, with Intel® Xeon® Platinum 8000 series processors (Skylake-SP) running at up to 3.5 GHz, and m5zn.12xlarge instances, powered by custom 2nd generation Intel® Xeon® Scalable processors (Cascade Lake) running at up to 4.5 GHz. Both the c5n and m5zn instances use Elastic Fabric Adapter (EFA) for inter-node communication. These two instance types were chosen because they offer different balances of cost and performance.

Instance        Processor     vCPU   RAM (GiB)   Network bandwidth (Gbps)
c5n.18xlarge    Skylake-SP    72     192         100
m5zn.12xlarge   Cascade Lake  48     192         100

EFISPEC3D is implemented in Fortran 2008 and uses Intel MPI to solve the 3D equations of motion with the spectral finite element method.
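
For reproducibility, the sketch below shows the kind of environment settings typically used to run Intel MPI over EFA on these instances. The exact values depend on the Intel MPI and libfabric versions installed, so treat this as an illustrative assumption rather than the exact configuration used for this benchmark:

# Illustrative (assumed) environment for Intel MPI over EFA;
# exact settings depend on the Intel MPI / libfabric versions installed.
export FI_PROVIDER=efa          # ask libfabric for the EFA provider
export I_MPI_FABRICS=shm:ofi    # shared memory intra-node, libfabric (OFI) inter-node
export I_MPI_DEBUG=5            # print the selected fabric at startup to verify EFA is used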

Strong scaling simulations

In the first experiments, we test the strong scaling of the EFISPEC3D application on AWS c5n.18xlarge and m5zn.12xlarge instances.

The following is an example of a batch script to run a simulation on 16,384 cores of c5n.18xlarge instances:

#!/bin/bash
#SBATCH --account=account-strong-scaling-16384
#SBATCH --job-name=strong-scaling-16384
#SBATCH --partition=batch-c5n
#SBATCH --hint=nomultithread        # one task per physical core, no hyperthreading
#SBATCH --mem-per-cpu=2048M
#SBATCH --nodes=456
#SBATCH --ntasks=16384
#SBATCH --ntasks-per-node=36        # 36 physical cores per c5n.18xlarge instance
#SBATCH --ntasks-per-core=1
#SBATCH --output=strong-scaling-16384-%x_%j.out

ulimit -s unlimited                 # remove the stack size limit for large Fortran arrays
mpiexec.hydra -n 16384 /fsx/applications/EFISPEC3D-c5n-18xlarge/TEST-1-STRONG-SCALING/EFISPEC3d-ASYNC/bin/efispec3d_1.1_avx.exe

A mesh of about 2 million hexahedra, requiring about 77 GiB of RAM, is run on configurations ranging from a single physical core up to 27,036 physical cores.

Number of cores   Number of instances   Avg. hexahedra per physical core   Avg. RAM per core (MB)   Elapsed time for initialization
1                 1                     1,906,624                          77,145.555               181.56 s
32                1                     59,582                             2,420.925                29.62 s
64                2                     26,480                             1,077.098                25.91 s
128               4                     13,240                             539.555                  28.89 s
256               8                     6,620                              270.070                  39.14 s
512               15                    3,530                              144.265                  37.57 s
1,024             29                    1,826                              74.870                   61.39 s
2,048             57                    930                                38.236                   97.45 s
4,096             114                   464                                19.192                   182.35 s
8,192             228                   233                                9.700                    442.14 s
16,384            456                   117                                4.898                    154.01 s
27,036            751                   71                                 2.988                    276.20 s

Table 1: Information about the simulations run on c5n.18xlarge instances

The following tables (Table 2, Table 3, Table 4 and Table 5) present the strong scaling of EFISPEC3D on C5n and M5zn instances:

Number of cores   Blocking communications   Non-blocking communications
1                 2542.38 s                 2522.54 s
32                92.13 s                   106.09 s
64                45.19 s                   47.50 s
128               22.79 s                   24.93 s
256               12.18 s                   12.51 s
512               6.66 s                    6.63 s
1,024             3.57 s                    3.34 s
2,048             2.17 s                    1.60 s
4,096             1.43 s                    0.79 s
8,192             1.74 s                    0.40 s
16,384            n/a                       0.20 s
27,036            n/a                       0.14 s

Table 2: Elapsed time for the time loop computation (in seconds) on c5n.18xlarge instances
Number of cores   Blocking communications   Non-blocking communications
1                 1.0x                      1.0x
32                27.59x                    23.78x
64                56.26x                    53.10x
128               111.56x                   101.19x
256               208.79x                   201.67x
512               381.94x                   380.59x
1,024             711.14x                   754.92x
2,048             1,171.72x                 1,576.72x
4,096             1,782.18x                 3,187.30x
8,192             1,457.32x                 6,339.85x
16,384            n/a                       12,484.81x
27,036            n/a                       18,030.57x

Table 3: Speedups on AWS c5n.18xlarge instances
Number of cores   Number of instances   Blocking communications   Non-blocking communications
1                 1                     2026.03 s                 2027.19 s
32                2                     50.25 s                   54.49 s
64                3                     33.28 s                   36.82 s
128               6                     16.77 s                   17.70 s
256               11                    9.47 s                    9.56 s
512               22                    4.77 s                    4.91 s
1,024             43                    2.98 s                    2.49 s

Table 4: Elapsed time for the time loop computation (in seconds) on AWS m5zn.12xlarge instances
Number of cores   Blocking communications   Non-blocking communications
32                40.31x                    37.20x
64                60.88x                    55.06x
128               120.82x                   114.51x
256               213.94x                   211.99x
512               424.32x                   413.09x
1,024             680.74x                   812.20x

Table 5: Speedups on AWS m5zn.12xlarge instances

Figure 1: Strong scaling on AWS c5n.18xlarge instances
Figure 2: Strong scaling on AWS m5zn.12xlarge instances

The results show the elapsed times for the time loop computation (in seconds) of EFISPEC3D using blocking or non-blocking communications, and the speedups achieved with respect to the simulation run on a single physical core. On the AWS cluster with EFA, the non-blocking communications version reaches a speedup of 18,030 on 27,036 physical cores, i.e., a parallel efficiency of about 67% (see Figure 1). Such a speedup makes it possible to compute a standard earthquake simulation in about 25 minutes instead of about 300 days on a single core.

When blocking communications are used, performance drops beyond about 2,048 cores because of the irreducible time spent in MPI send/receive calls. By overlapping communications with computations, this drop is pushed back to 27,036 cores.

7M-hexahedra simulations

In this second experiment, we enabled reading from and writing to the Amazon FSx for Lustre file system in EFISPEC3D, which writes its output via Intel MPI. We then set out to measure the performance of Amazon FSx for Lustre.

The objective was to test the write performance of the file system when 2,048 tasks write their results to it at the same time. For this test, we used a cluster of 57 c5n.18xlarge instances (2,052 physical cores in total).

Running the application generates 2 TB of data, mostly condensed in a single file. The performance of the file system depends on two main parameters: the stripe size and the striping factor. They are, respectively, the size of each chunk of data written and the number of chunks that the application can write simultaneously. Configuring the file system consists of matching the on-disk layout to the write sizes issued by the application and to the number of parallel write streams the file system allows.

To analyze the performance of several file system configurations in a reasonable time, we reduced the output file size to 206 GB. This allowed us to evaluate many more configurations of the Amazon FSx for Lustre file system while preserving the reliability of the performance results.

Figure 3: Amazon FSx for Lustre performance when writing a file with MPI. Impact of the Lustre file system parameters on the bandwidth measured while writing the data (206 GB). The stripe size varies from 1 MB to 10 MB and the striping factor varies from 5 to 80.

In Figure 3, each bar represents one execution with a different configuration of the Amazon FSx for Lustre parameters. For example, "5S – 1M" means the test was run with a striping factor of 5 and a stripe size of 1 MB.
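
As an illustration, Lustre striping is typically set with the standard lfs tool on the directory where the application writes; files created there inherit the layout. The path below is a placeholder, and we assume the "5S – 1M" configuration of Figure 3 would be applied this way:

# Assumed example: apply the "5S - 1M" configuration to a results directory.
lfs setstripe -c 5 -S 1M /fsx/results   # -c: striping factor (stripe count), -S: stripe size
lfs getstripe /fsx/results              # verify the layout actually applied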

These tests were run using a 96 TiB FSx for Lustre volume with 80 OSTs, with the "Scratch 2" deployment type of FSx for Lustre. This deployment type offers a theoretical throughput of 200 MB/s per TiB plus 1.3 GB/s of burst bandwidth. As deployed, the file system had a baseline throughput of 16 GB/s with a burst capability of 104 GB/s.

Following these six executions, we observe that the best performance with EFISPEC3D is reached when the file system is configured with a striping factor of 80 and a stripe size of 10 MB.
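
With the same tool, the best-performing configuration found here would read, for example:

lfs setstripe -c 80 -S 10M /fsx/results   # striping factor 80, 10M stripe size (placeholder path)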

Figure 4: Overall bandwidth, in gigabytes per second, for writing files during the post-processing phase.

During the post-processing phases of the EFISPEC3D application, the bandwidth went up to 10.3 GB/s when writing a 206 GB result file.

Public cost study for the 7M-hexahedra execution

We studied the cost of running the EFISPEC3D application on the 7-million-hexahedra simulation, based on AWS public prices in the US East (N. Virginia) Region.

As a reminder, the selected infrastructure is based on 2,048 physical cores of c5n.18xlarge instances, each providing 72 vCPUs of Intel® Xeon® Platinum processors. This means starting 57 c5n.18xlarge instances in the same Region, namely us-east-1.

The total cost of these instances for 25 minutes comes to a public price of $93. The chosen infrastructure also uses an FSx for Lustre file system, at $30 per day per deployed cluster for 3.6 TB of storage.

In total, the cost of the cluster for this type of execution comes to a public price of $123 for a 25-minute run of the application on 57 c5n.18xlarge instances.
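
As a sanity check, this is consistent with a back-of-the-envelope calculation, assuming an on-demand rate of about $3.888 per hour for c5n.18xlarge in us-east-1 (the public rate at the time; check current pricing):

# 57 instances x ~$3.888/hour x 25 minutes of compute:
echo "57 * 3.888 * 25 / 60" | bc -l   # ~92.3 USD, matching the ~$93 quoted above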

Conclusion

During the R&D and benchmarking phase, we saw how easy it is to deploy, install, and configure our HPC environment using AWS ParallelCluster, an AWS-supported open-source cluster management tool that makes it easy to deploy and manage High Performance Computing (HPC) clusters on AWS. We also saw that C5n instances are readily available in AWS Regions across the globe. The M5zn instances are quite different in a number of respects: they have fewer, faster cores and are based on the Intel Cascade Lake architecture. Over 400 different instance types are available on AWS. As always with HPC applications, there is no substitute for testing the real application and data to see which systems offer the best blend of cost and performance for any given application.
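
As an illustration, with recent versions of the AWS ParallelCluster CLI, deploying such a cluster comes down to a single command against a configuration file describing the head node, the Slurm compute queues, and the FSx for Lustre storage. The cluster name and configuration file below are placeholders:

pip install aws-parallelcluster   # install the ParallelCluster command-line interface
pcluster create-cluster \
  --cluster-name efispec3d-bench \
  --cluster-configuration cluster-config.yaml   # defines instance types, Slurm queues, FSx storage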

During the strong scaling tests, we observed very good performance from the EFISPEC3D application, which demonstrates the capability of the C5n and M5zn instances for HPC workloads.

Concerning the Amazon FSx for Lustre file system, we reached a bandwidth above 10 GB/s, which more than met the requirements of our application, in which all compute nodes write intensively to disk.

EFISPEC3D is featured in an HPC workshop. The workshop guides the user through deploying an HPC cluster and running the EFISPEC3D software. It can be found at this location:

https://hpc.news/efispec3dworkshop

More information about EFISPEC3D can be found here:

https://www.brgm.fr/en/software/efispec3d-numerical-ground-motion-predictions-seismic-risk-assessments