AI has solved a key mathematical puzzle for understanding our world


Unless you're a physicist or an engineer, there's not much reason for you to know about partial differential equations. I know. After years of poring over them as a mechanical engineering student, I haven't used them in the real world since.

But partial differential equations, or PDEs, are also kind of magical. They are a category of mathematical equations that are really good at describing change over space and time, and therefore very useful for describing the physical phenomena in our universe. They can be used to model anything from planetary orbits to plate tectonics to the air turbulence that disturbs a flight, which in turn allows us to do practical things like predict seismic activity and design safe aircraft.

The catch is that PDEs are notoriously difficult to solve. And here the meaning of "solve" is perhaps best illustrated with an example. Say you're trying to simulate air turbulence to test a new aircraft design. There is a well-known PDE called Navier-Stokes that describes the motion of any fluid, air included. By "solving" Navier-Stokes you can take a snapshot of the air's motion (a.k.a. wind conditions) at any point in time and model how it will continue to move, or how it was moving before.
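To make "take a snapshot and model how it moves next" a bit more concrete, here is a minimal sketch of a traditional solver stepping a snapshot forward in time. It deliberately swaps Navier-Stokes for a much simpler PDE, the one-dimensional diffusion equation u_t = nu * u_xx, and the grid size, time step and initial bump are all assumptions chosen for illustration.

```python
import numpy as np

def step_diffusion(u, nu, dx, dt):
    """Advance u by one explicit time step of u_t = nu * u_xx (periodic domain)."""
    laplacian = (np.roll(u, -1) - 2.0 * u + np.roll(u, 1)) / dx**2
    return u + dt * nu * laplacian

def simulate(u0, nu=0.1, dx=0.1, dt=0.01, steps=200):
    """Repeatedly step a snapshot forward in time and return the final state."""
    # This explicit scheme is only stable when nu * dt / dx**2 <= 0.5.
    assert nu * dt / dx**2 <= 0.5, "time step too large for stability"
    u = u0.copy()
    for _ in range(steps):
        u = step_diffusion(u, nu, dx, dt)
    return u

x = np.arange(0.0, 10.0, 0.1)
u0 = np.exp(-((x - 5.0) ** 2))  # initial snapshot: a single bump (assumed)
u_final = simulate(u0)          # the same bump, spread out by diffusion
```

Even this toy needs hundreds of tiny steps on a one-dimensional grid. Real turbulence simulations do the same kind of stepping in three dimensions on far finer grids, which is where the cost comes from.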

These calculations are highly complex and computationally intensive, which is why disciplines that rely heavily on PDEs often use supercomputers to do the math. It is also why the AI field has taken a special interest in these equations. If we could use deep learning to speed up the solving process, it could do a lot of good for scientific research and engineering.

Now, Caltech researchers have introduced a new deep learning technique for solving PDEs that is dramatically more accurate than previously developed deep learning methods. It is also much more general, capable of solving entire families of PDEs (such as the Navier-Stokes equation for any type of fluid) without needing to be retrained. Finally, it is 1,000 times faster than traditional mathematical methods, which would ease our reliance on supercomputers and increase our computational capacity to model even bigger problems. That's right. Bring it on.

Hammer time

Before we get into how the researchers did this, let's first appreciate the results. In the GIF below you can see an impressive demonstration. The first column shows two snapshots of a fluid's motion. The second shows how the fluid continued to move in real life, and the third shows how the neural network predicted the fluid would move. It looks basically identical to the second.

The paper caused quite a stir on Twitter and even got a shout-out from rapper MC Hammer. Yes, really.

Fourier Neural Operator for Parametric Partial Differential Equations # Hamm400aos

– MC HAMMER (@MCHammer), October 22, 2020

Okay, back to how they did it.

If the function fits

The first thing to understand here is that neural networks are fundamentally function approximators. (Say what?) When they train on a set of paired inputs and outputs, they are actually calculating the function, or series of math operations, that turns one into the other. Think about building a cat detector. You train the neural network by feeding it lots of pictures of cats and things that aren't cats (the inputs) and labeling each group with a 1 and a 0 (the outputs), respectively. The neural network then looks for the best function that can convert every image of a cat to a 1 and every image of everything else to a 0. That is how it can look at a new picture and tell you whether it's a cat. It uses the function it found to calculate its answer, and if its training was good, it will be right most of the time.
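For readers who like to see it run, here is a minimal sketch of that idea. It is not a real cat detector: the "images" are hypothetical two-number feature vectors drawn at random, and the "network" is a single-layer logistic model, the simplest function approximator I could pick. Every name and number in it is an assumption made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for images: two-number feature vectors.
cats = rng.normal(loc=2.0, scale=1.0, size=(200, 2))       # labeled 1
not_cats = rng.normal(loc=-2.0, scale=1.0, size=(200, 2))  # labeled 0
X = np.vstack([cats, not_cats])
y = np.concatenate([np.ones(200), np.zeros(200)])

def predict(X, w, b):
    """The learned function: maps inputs to a score between 0 and 1."""
    return 1.0 / (1.0 + np.exp(-(X @ w + b)))

# Training: nudge w and b by gradient descent so outputs match the labels.
w = np.zeros(2)
b = 0.0
learning_rate = 0.1
for _ in range(500):
    p = predict(X, w, b)
    w -= learning_rate * (X.T @ (p - y)) / len(y)
    b -= learning_rate * np.mean(p - y)

accuracy = np.mean((predict(X, w, b) > 0.5) == (y == 1))
new_score = predict(np.array([[2.0, 2.0]]), w, b)[0]  # a new, cat-like input
```

After training, `w` and `b` are the function the model found. Scoring a new input is just a matter of evaluating that function.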

Conveniently, this function approximation process is exactly what we need to solve a PDE. Ultimately, we are trying to find a function that best describes, say, the motion of air particles across physical space and time.

Here is the gist of the paper. Neural networks are usually trained to approximate functions between inputs and outputs defined in Euclidean space, your classic graph with x, y and z axes. This time, however, the researchers decided to define the inputs and outputs in Fourier space, a special type of graph for plotting wave frequencies. The intuition they drew on from work in other fields is that something like the motion of air can actually be described as a combination of wave frequencies, says Anima Anandkumar, a Caltech professor who supervised the research along with her colleagues, Professors Andrew Stuart and Kaushik Bhattacharya. The general direction of the wind at the macro level is like a low frequency with very long, lethargic waves, while the small eddies that form at the micro level are like high frequencies with very short and rapid waves.
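A small sketch can show what "a combination of wave frequencies" means in practice. The signal below is hypothetical: one slow, large wave standing in for the general wind direction, plus one fast, small ripple standing in for the eddies. NumPy's fast Fourier transform then pulls the two ingredients apart again.

```python
import numpy as np

n = 256
x = np.linspace(0.0, 1.0, n, endpoint=False)

# Hypothetical signal: a slow, large wave plus a fast, small ripple.
large_scale = 1.0 * np.sin(2 * np.pi * 2 * x)    # 2 cycles across the domain
small_eddies = 0.2 * np.sin(2 * np.pi * 40 * x)  # 40 cycles across the domain
signal = large_scale + small_eddies

# rfft returns one complex coefficient per frequency (0, 1, 2, ... cycles).
coeffs = np.fft.rfft(signal)
amplitudes = 2.0 * np.abs(coeffs) / n

# The two strongest frequencies recover the two ingredients.
strongest = sorted(np.argsort(amplitudes)[-2:].tolist())
```

In Fourier space, 256 sample values collapse into two meaningful numbers: a strong component at 2 cycles and a weak one at 40.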

Why is that important? Because it is much easier to approximate a Fourier function in Fourier space than to wrangle with PDEs in Euclidean space, which greatly simplifies the neural network's job. The result is major accuracy and efficiency gains: in addition to its enormous speed advantage over traditional methods, their technique achieves a 30% lower error rate when solving Navier-Stokes than previous deep learning methods.
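What might "working in Fourier space" look like inside a network? The sketch below is my own illustration of the general idea, not the researchers' architecture: transform the input to Fourier space, multiply a handful of the lowest frequencies by weights, and transform back. The function name `spectral_layer`, the number of retained modes and the random weights are all assumptions; in a trained model the weights would be learned from data.

```python
import numpy as np

def spectral_layer(u, weights):
    """Transform to Fourier space, reweight the lowest modes, transform back."""
    n_modes = len(weights)
    u_hat = np.fft.rfft(u)
    out_hat = np.zeros_like(u_hat)
    out_hat[:n_modes] = weights * u_hat[:n_modes]  # higher modes are dropped
    return np.fft.irfft(out_hat, n=len(u))

rng = np.random.default_rng(0)
n, n_modes = 128, 8
x = np.linspace(0.0, 1.0, n, endpoint=False)
u = np.sin(2 * np.pi * 3 * x) + 0.1 * np.sin(2 * np.pi * 50 * x)

# Stand-in for learned parameters: random complex weights (assumed).
weights = rng.normal(size=n_modes) + 1j * rng.normal(size=n_modes)
v = spectral_layer(u, weights)

# With all weights set to 1 the layer acts as a plain low-pass filter.
smooth = spectral_layer(u, np.ones(n_modes))
```

The appeal of this design is that a few numbers per frequency can reshape the whole signal at once, instead of the layer having to reason about every grid point separately.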

The whole thing is extremely clever and also makes the method more generalizable. Previous deep learning methods had to be trained separately for every type of fluid, whereas this one only needs to be trained once to handle all of them, as the researchers' experiments confirmed. Though they have not yet tried extending this to other examples, it should also be able to handle every type of Earth composition when solving PDEs related to seismic activity, or every type of material when solving PDEs related to thermal conductivity.

Super simulation

The professors and their PhD students did not do this research just for the theoretical fun of it. They want to bring AI into more scientific disciplines. It was through conversations with various collaborators in climate science, seismology and materials science that Anandkumar first decided to tackle the PDE challenge with her colleagues and students. They are now working to put their method into practice with other researchers at Caltech and the Lawrence Berkeley National Laboratory.

One research topic Anandkumar is particularly interested in is climate change. Navier-Stokes is not only good at modeling air turbulence; it is also used to model weather patterns. "Good, fine-grained weather forecasts on a global scale are such a challenging problem," she says, "and even on the largest supercomputers we cannot do this on a global scale today. If we can use these methods to speed up the entire pipeline, it would be hugely powerful."

There are many, many other uses too, she adds. "With that in mind, the sky is the limit as we have a common way to speed up all of these applications."


Steven Gregory