Scientific computing’s future: Can any coding language top a 1950s behemoth?

Cutting-edge research still universally involves Fortran; a trio of challengers wants in.

“I don't know what the language of the year 2000 will look like, but I know it will be called Fortran.” —Tony Hoare, winner of the 1980 Turing Award, in 1982.

Take a tour through the research laboratories at any university physics department or national lab, and much of what you will see defines “cutting edge.” “Research,” after all, means seeing what has never been seen before—looking deeper, measuring more precisely, thinking about problems in new ways.

A large research project in the physical sciences usually involves experimenters, theorists, and people carrying out calculations with computers. There are computers and terminals everywhere. Some of the people hunched over these screens are writing papers, some are analyzing data, and some are working on simulations. These simulations are also quite often on the cutting edge, pushing the world’s fastest supercomputers, with their thousands of networked processors, to the limit. But almost universally, the language in which these simulation codes are written is Fortran, a relic from the 1950s.

Wherever you see giant simulations of the type that run for days on the world’s most massive supercomputers, you are likely to see Fortran code. Some examples are atmospheric modeling and weather prediction carried out by the National Center for Atmospheric Research; classified nuclear weapons and laser fusion codes at Los Alamos and Lawrence Livermore National Labs; NASA models of global climate change; and the codes of an international consortium of Quantum Chromodynamics researchers, which calculate the behavior of quarks, the constituents of protons and neutrons. These projects are just a few random examples from a large computational universe, but all use some version of Fortran as the main language.

This state of affairs seems paradoxical. Why, in a temple of modernity employing research instruments at the bleeding edge of technology, does a language from the very earliest days of the electronic computer continue to dominate? When Fortran was created, our ancestors were required to enter their programs by punching holes in cardboard rectangles: one statement per card, with a tall stack of these constituting the code. There was no vim or emacs. If you made a typo, you had to punch a new card and give the stack to the computer operator again. Your output came to you on a heavy pile of paper. The computers themselves, with only a tiny fraction of the power of today’s smartphones, were giant installations that required entire buildings.

A Hollerith card that, when punched, will contain one Fortran statement.
Public domain

In the 60 years since the creation of the first Fortran compiler, there has been tremendous activity in the field of programming languages and computer science. Entire paradigms of language design and program organization have arisen and done battle with each other. Fortran has remained serenely detached, now and then incorporating an idea or two into a new version of the language.

While structured programming, object orientation, functional programming, and logic programming all arose to solve various problems that supposedly were not solved by primitive Fortran, none of the languages that embodied these new organizing principles came close to supplanting Fortran in the realm for which it was invented: scientific and numerical computing. This remains true up to the present, as shown by the examples above and by the content of courses and textbooks on the subject. Introduction to High Performance Computing for Scientists and Engineers, for example (published in 2010), contains most of its code samples in Fortran.

A few years after the appearance of Fortran, a very different kind of language was invented: Lisp, or, as it was originally called, LISP. “Originally specified in 1958, Lisp is the second-oldest high-level programming language in widespread use today; only Fortran is older,” writes Dr. Joey Paquet in his lecture notes for the Comparative Studies of Programming Languages class [PDF] at Concordia University. Although Lisp became popular with artificial intelligence researchers, it never caught on with physical scientists. Two main reasons for this are speed and weirdness. Speed, because although some versions of Lisp attained respectable runtime efficiency, they were not in the same class as Fortran for heavy numerical work. Weirdness, because the prefix notation used by Lisp made expressions in that language look a lot less like normal mathematics than did math rendered in Fortran. (Fortran stood, after all, for FORmula TRANslator.) A normal chemist or engineer is far more comfortable with y = (a + b)/c than with (setf y (/ (+ a b) c)).

This was, in a sense, unfortunate. The abstract power of the Lisp family, which is closely connected with its peculiar notation and its functional nature, can provide elegant solutions to problems of parallelism that Fortran eventually grappled with using a clumsy agglomeration of ad-hoc libraries and compiler directives hiding in code comments. But, as we'll cover, Lisp has been reborn in a modern form that is beginning to attract some numericists due to its powerful approach to concurrency.

Including the modern iteration of Lisp, three recently developed languages may actually have a chance to step out from the vast shadow of Fortran. Each hopes to capture the hearts of different segments of the scientific computing community.

Meet the candidates

Computer science and computing languages have been highly active fields of research since before Fortran Zero was unleashed. In that time, scores of languages have been designed and put to use, many of which provide powerful abstractions and facilities to the programmer that have little chance of ever making it into any version of Fortran. This is because many of the key ideas of these other languages are incompatible with the imperative, data-mutating design implicit in the core of Fortran. And these ideas are not merely academic curiosities; they afford real, practical, and provable advantages to the implementer of complex numerical algorithms that need to run correctly and efficiently in a multi-processor computing environment.

Haskell—the elder statesman

Of the three young languages with the potential to move beyond Fortran, the oldest is called Haskell. The language attained its first “stable variant” in 1998. While C, Fortran, and Pascal are part of the Turing machine branch of the language world, Haskell, along with Lisp, is a member of the lambda calculus branch. Programs in these languages are conceived as the composition of functions rather than as a series of explicit steps.
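To give a rough sense of what “composition of functions” means in practice, here is a minimal sketch. The function name sumEvenSquares is invented for illustration and does not come from any particular codebase. Where an imperative program would loop over an array and update an accumulator, this definition simply chains three existing functions together.

```haskell
-- Hypothetical example: the name sumEvenSquares is illustrative only.

-- Sum of the squares of the even numbers in a list, written as a
-- pipeline of functions joined with the composition operator (.).
-- Read right to left: keep the evens, square each one, add them up.
sumEvenSquares :: [Integer] -> Integer
sumEvenSquares = sum . map (^ 2) . filter even
```

Loaded into an interpreter such as GHCi, sumEvenSquares [1 .. 10] keeps 2, 4, 6, 8, and 10, squares them, and adds the results. No variable is ever modified along the way.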

It’s difficult to convey the feel of programming in Haskell; one really needs to learn a bit of the language and try it out. And it’s worth doing this for the unique experience of an entirely unfamiliar style of programming, even if you never actually use the language.

While Fortran provides a comfortable translation of mathematical formulas, Haskell code begins to resemble mathematics itself. With its elaborate type system and pattern matching, the beginnings of function definitions in Haskell look like the setups of proofs in mathematical monographs, with definitions and axioms carefully established.

The Haskell logo.

A simple example of this is a naive definition for a function that returns the Nth number in the Fibonacci sequence (recall that the Fibonacci sequence can be defined recursively as F_0 = 1; F_1 = 1; F_N = F_(N-1) + F_(N-2), i.e., each term is the sum of the previous two terms):

fib :: Integer -> Integer
fib 0 = 1
fib 1 = 1
fib n = fib (n-1) + fib (n-2)

This may look like a definition extracted from a mathematics textbook, but it’s legal Haskell code.
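The definition is called naive because it recomputes the same terms many times over, so its running time grows exponentially with n. One common alternative, sketched here under the same convention that the sequence starts 1, 1, leans on Haskell’s lazy evaluation to describe the whole infinite sequence as a single list. The names fibs and fib' are chosen for this illustration, and n is assumed to be non-negative.

```haskell
-- Illustrative sketch; the names fibs and fib' are not from the article.

-- The infinite Fibonacci sequence 1, 1, 2, 3, 5, ... as a lazy list.
-- After the first two entries, each element is the sum of the elements
-- at the same position in fibs and in the tail of fibs.
fibs :: [Integer]
fibs = 1 : 1 : zipWith (+) fibs (tail fibs)

-- Look up the nth term (counting from 0, n assumed non-negative).
-- Because the list is shared, each term is computed only once.
fib' :: Int -> Integer
fib' n = fibs !! n
```

Only as much of the list as is actually requested gets computed, so take 8 fibs yields the first eight terms and leaves the rest unevaluated.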

Haskell excels in situations where program correctness needs to be ensured, if possible before run time. Its purely functional nature allows programs to be constructed with a high degree of confidence in their correctness. In fact, it is a common experience for Haskell programmers to find that their code runs correctly the first time after all the compiler errors have been dealt with. This is a rare experience for users of almost any other language, and it's due to Haskell’s sophisticated type system.
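As a small, hedged illustration of the kind of mistake the type system can rule out before a program ever runs, consider tagging physical quantities with distinct types. The types and the speed function below are invented for this sketch. This does not prove a program correct; it only shows one class of error being caught at compile time.

```haskell
-- Hypothetical unit-tagged types; all names here are illustrative.
newtype Meters          = Meters Double          deriving (Show, Eq)
newtype Seconds         = Seconds Double         deriving (Show, Eq)
newtype MetersPerSecond = MetersPerSecond Double deriving (Show, Eq)

-- A speed can only be built from a distance and a duration, in that order.
speed :: Meters -> Seconds -> MetersPerSecond
speed (Meters d) (Seconds t) = MetersPerSecond (d / t)

-- Accepted by the compiler: the arguments have the expected types.
sprint :: MetersPerSecond
sprint = speed (Meters 100) (Seconds 9.58)

-- Rejected by the compiler if uncommented, because the arguments are swapped:
-- wrong = speed (Seconds 9.58) (Meters 100)
```

In a language where both quantities are plain floating-point numbers, the swapped call would compile and quietly produce a wrong answer.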
