## High-performance computing (HPC), or the oldest loophole of the IT sector

The first computers were designed in the 1940s and 1950s by researchers working on military projects, notably to calculate ballistic trajectories. At the time, these researchers were simultaneously physicists, mathematicians, electronics specialists, logicians… Together, they thought about the physical problem, the mathematical formulation needed to solve it and the machine capable of carrying out the solution. Software was not yet a topic: the first computers, such as ENIAC, were wired to solve one problem and one problem only. Computing resources were indeed very limited: let us not forget that the computers used to prepare the Apollo missions had less computing power than a middle schooler’s calculator today! Every means was therefore used to make the best of these resources, and some physical hypotheses were translated directly into the architecture of the computing components. Tackling another problem required changing the architecture of the machine: rewiring it, or even dismantling it and reassembling it differently, which could take several days.

Fortunately, scientists did not start from scratch each time: some components, needed for every scientific calculation, were eventually standardized. The representation of numbers was standardized so that existing designs could be reused whenever the hardware was updated: a de facto standard for many years, the representation of floating-point numbers (the “floats” and “doubles”) as we know them was formalized in 1985 (IEEE 754) and extended in 2008. The computing units were standardized as well. In 1967, for instance, IBM delivered the *IBM System/360 Model 91*, a machine built around one of the first dedicated floating-point units (FPU).
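To make the standardized representation concrete, here is a minimal sketch that splits a number into the three fields defined by IEEE 754: sign, biased exponent and fraction. The helper names `decompose_float32` and `decompose_float64` are illustrative and do not come from any library; only the standard `struct` module is used.

```python
import struct
from typing import Tuple


def decompose_float32(x: float) -> Tuple[int, int, int]:
    """Return (sign, biased exponent, fraction) of x stored as a 32-bit "float"."""
    # Pack x as an IEEE 754 binary32 value, then reread the same 4 bytes as an integer.
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                # 1 bit
    exponent = (bits >> 23) & 0xFF   # 8 bits, biased by 127
    fraction = bits & 0x7FFFFF       # 23 bits
    return sign, exponent, fraction


def decompose_float64(x: float) -> Tuple[int, int, int]:
    """Return (sign, biased exponent, fraction) of x stored as a 64-bit "double"."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    sign = bits >> 63                    # 1 bit
    exponent = (bits >> 52) & 0x7FF      # 11 bits, biased by 1023
    fraction = bits & ((1 << 52) - 1)    # 52 bits
    return sign, exponent, fraction


if __name__ == "__main__":
    for value in (1.0, -2.0, 0.15625):
        print(value, decompose_float32(value), decompose_float64(value))
```

Because the layout is fixed by the standard, the same decomposition holds on any IEEE 754 compliant hardware, which is precisely what makes existing numerical code reusable from one machine generation to the next.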

This standardization of the hardware went hand in hand with the emergence of high-level approaches to describing the problems to solve. Among the noteworthy examples is the FORmula TRANslator, better known as FORTRAN, whose first version was published in 1954 and which was officially standardized for the first time in 1966 by the American Standards Association (now ANSI). Lisp, a language that left its mark on artificial intelligence research, comes from the same period: its first version dates back to 1958. These standardizations were an opportunity to specialize the roles within teams and eventually led to new disciplines: some people designed the computing hardware while others wrote the software that used it. The link between the two was provided by specialized software called compilers or interpreters. Of course, the people who wrote these compilers had to know both the programming language and the architecture of the hardware. However, they no longer needed to know the physics of the problems to solve or the associated mathematics, nor, at the other end, the semiconductor physics used to design the processors. Language theory, graph theory, information theory, artificial intelligence and combinatorial optimization are among the scientific fields that made huge progress during the 1970s and 1980s thanks to these standardization efforts.

The 1980s also saw the democratization of IT tools in the research centers of the most advanced companies. This democratization led to the emergence of training courses for IT jobs: IT was no longer the private preserve of scientists. It became the common base of a real industry that had an impact on cinema (computer-generated images appeared in Star Wars as early as 1977), on games, which became video games, on banking services, on communications, etc. Driven by these economic and societal stakes, part of the community naturally invested its time in tools and methods that offered at the same time easier learning, increased productivity and cheaper maintenance. MATLAB was released in 1984 to make mathematical software easier to learn and to run: the goal was for students to become operational quickly enough to use it in their mathematics courses. Started at the beginning of the 1990s, Java is a very good example of a language and environment aiming for productivity (easy to pick up because it is kept minimal, object-oriented, managed, strongly typed). C#, which appeared a few years later, followed the same trend.

This move towards easier development naturally came at a cost: these abstractions had a large impact on performance. Thanks to the increase in the computing power of processors, this drawback gradually became marginal compared to the gains in development and maintenance time. Today, it is rare to come across applications for which a real optimization effort is necessary. In fact, most of the applications in use today could run on half the computing power or less; the optimization effort is simply not worth it because computers (and cell phones) are powerful enough to run non-optimized software. For an optimization effort to be justified, its cost must be lower than the cost of the time it saves. Spending time optimizing the code of a game like Solitaire is therefore totally absurd: the value of gaining a few milliseconds here or there is strictly zero. Conversely, investing time in optimizing the performance of 3D rendering software is fully justified in the field of animated movies: the hardware needed to render a movie is very expensive, and the savings provide a quick return on investment.
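The break-even rule stated above can be written down as a back-of-the-envelope calculation. The sketch below is only an illustration: the function name `optimization_gain` and every figure in the two scenarios (day rate, number of runs, price of a compute hour) are assumptions chosen for the example, not measurements.

```python
def optimization_gain(dev_days: float,
                      day_rate: float,
                      seconds_saved_per_run: float,
                      runs: float,
                      cost_per_compute_hour: float) -> float:
    """Return savings minus cost; a positive value means the effort pays off."""
    # Cost of the optimization effort itself.
    cost = dev_days * day_rate
    # Value of the compute time that the optimization gives back.
    hours_saved = seconds_saved_per_run * runs / 3600.0
    savings = hours_saved * cost_per_compute_hour
    return savings - cost


# Hypothetical card game: 5 ms saved per game over one million games.
solitaire = optimization_gain(dev_days=5, day_rate=500.0,
                              seconds_saved_per_run=0.005, runs=1_000_000,
                              cost_per_compute_hour=0.05)

# Hypothetical animated movie: 10 minutes saved per frame,
# 90 minutes of film at 24 frames per second (129,600 frames).
rendering = optimization_gain(dev_days=20, day_rate=500.0,
                              seconds_saved_per_run=600.0, runs=90 * 60 * 24,
                              cost_per_compute_hour=1.0)

if __name__ == "__main__":
    print(f"Solitaire net gain: {solitaire:.2f}")
    print(f"Rendering net gain: {rendering:.2f}")
```

With these assumed figures the first scenario comes out negative and the second positive, which mirrors the contrast between the card game and the rendering farm.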

To optimize the performance of software, state-of-the-art approaches require working simultaneously on the modeling, the algorithms and the implementation, while taking into account the characteristics of the target hardware architecture. In other words, they involve going back, at least partially, on the abstractions introduced over the last decades. Some of the most efficient computing libraries to date, such as GotoBLAS, a linear algebra library, are in fact written in assembly language, the lowest-level programming language there is.

Today, more and more applications require this high level of performance (physics, weather forecasting, oil exploration, crash-test simulation, financial computing, molecular chemistry, pharmaceuticals, cosmetics, cinema, video games, etc.) and more and more effort goes into delivering both a high level of abstraction and performance. The recent evolutions of the C++ language, for example, were designed to reconcile performance with a high level of abstraction. This trend also extends to languages that were not originally designed for high performance. In 2015, for instance, Microsoft extended the .NET platform so that C# developers can exploit the vector (SIMD) units of processors.

To meet these requirements, more and more tools optimize their behavior according to the hardware. In the field of linear algebra, for instance, the C++ libraries Eigen and NT² make it possible to write code that is much closer to the mathematical notation while taking care of the low-level optimization. Machine learning is another example, and it may well be the field that raises awareness of these issues among the largest number of people. The stakes are such that the web giants have all developed their own optimized tools and made them available to the whole community (*TensorFlow* from Google and *Cognitive Toolkit* from Microsoft are two examples). These tools meet both the performance requirements and the maintainability requirements of applications. They therefore allow data scientists to concentrate on their work and to worry less about performance issues.

This approach, which consists in designing tools specific to a given application domain that reconcile performance and ease of use, is part of the added value of ANEO. For instance, this type of tool allows quants to make efficient use of hardware architectures as diverse as x86 (multithreaded and vectorized), GPUs and the Xeon Phi from a single domain-level source code. It also allows numerical engineers to express their linear algebra problems and to target x86 or GPUs efficiently.

We are currently experimenting with the generalization of these approaches to distributed architectures through two projects:

- A project that aims to let users dynamically compose image processing and machine learning pipelines targeting both x86 CPUs and GPUs.
- A project that aims to design a seismic wave propagation code for distributed architectures based on x86, ARM or Power8 CPUs.

We will discuss both projects in dedicated publications. We will also talk very soon about FPGAs and their value for computing.