April 27, 2024

The Next Einstein: New AI Can Develop New Theories of Physics

Researchers at Forschungszentrum Jülich have developed an AI capable of formulating physical theories by recognizing patterns in complex data sets, a feat historically achieved by great physicists like Isaac Newton and Albert Einstein. This AI, part of the “Physics of AI” initiative, simplifies complex interactions in data to develop new theories, differing from conventional approaches by making the theories explainable and grounded in the language of physics. Credit: SciTechDaily.com

The development of a new theory is typically associated with the greats of physics. You might think of Isaac Newton or Albert Einstein, for example. Many Nobel Prizes have already been awarded for new theories. Researchers at Forschungszentrum Jülich have now programmed an artificial intelligence that has also mastered this feat.

Their AI is able to recognize patterns in complex data sets and to formulate them in a physical theory.

In the following interview, Prof. Moritz Helias from Forschungszentrum Jülich’s Institute for Advanced Simulation (IAS-6) explains what the “Physics of AI” is all about and to what extent it differs from conventional approaches.

How do physicists come up with a new theory?

You usually start with observations of the system before attempting to propose how the different system components interact with each other in order to explain the observed behavior. New predictions are then derived from this and put to the test. A well-known example is Isaac Newton’s law of gravitation. It not only describes the gravitational force on Earth, but it can also be used to predict the movements of planets, moons, and comets – as well as the orbits of modern satellites – fairly accurately.
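To make the satellite example concrete, here is a minimal sketch of how Newton's law yields a prediction. It assumes a circular orbit, for which setting the gravitational force equal to the centripetal force gives the period T = 2π·sqrt(r³/GM). The function name and the 400 km altitude are illustrative choices, not values from the interview.

```python
import math

# Standard gravitational parameter of Earth (G times Earth's mass), in m^3/s^2.
GM_EARTH = 3.986004418e14
# Mean radius of Earth in meters.
EARTH_RADIUS_M = 6.371e6


def circular_orbit_period(altitude_m):
    """Period in seconds of a circular orbit at the given altitude.

    Follows from G*M*m/r**2 = m*v**2/r together with v = 2*pi*r/T.
    """
    r = EARTH_RADIUS_M + altitude_m
    return 2 * math.pi * math.sqrt(r**3 / GM_EARTH)


# Illustrative altitude of 400 km, typical of a low Earth orbit.
period_min = circular_orbit_period(400e3) / 60
print(f"Orbital period: {period_min:.1f} minutes")
```

For this altitude the formula gives a period of roughly an hour and a half, which is the kind of testable prediction the answer above refers to.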

However, the way in which such hypotheses are reached always differs. You can start with general principles and basic equations of physics and derive the hypothesis from them, or you can choose a phenomenological approach, limiting yourself to describing observations as accurately as possible without explaining their causes. The difficulty lies in selecting a good approach from the numerous approaches possible, adapting it if necessary, and simplifying it.

What approach are you taking with AI?

In general, it involves an approach known as “physics for machine learning”. In our working group, we use methods of physics to analyze and understand the complex function of an AI.

The crucial new idea developed by Claudia Merger from our research group was to first use a neural network that learns to accurately map the observed complex behavior to a simpler system. In other words, the AI aims to simplify all the complex interactions we observe between system components. We then use the simplified system and create an inverse mapping with the trained AI. Returning from the simplified system to the complex one, we then develop the new theory. On the way back, the complex interactions are built up piece by piece from the simplified ones. Ultimately, the approach is therefore not so different from that of a physicist, with the difference being that the way in which the interactions are assembled is now read from the parameters of the AI. This perspective on the world – explaining it from interactions between its various parts that follow certain laws – is the basis of physics, hence the term “physics of AI”.
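The idea of reading interactions off the parameters of a mapping can be illustrated with a deliberately tiny example. The sketch below is not the network from the study. It assumes a hypothetical system of two variables, a hand-written invertible map z1 = x1, z2 = x2 + a·x1² with a single "learned" parameter a, and a simple system in which z1 and z2 are independent standard Gaussian variables. Because this particular map preserves volume, the negative log-probability of the complex system (up to a constant) is S(x) = z1²/2 + z2²/2, and expanding it in powers of x1 and x2 exposes the interaction terms.

```python
from collections import defaultdict

# A polynomial in (x1, x2) is stored as {(power_of_x1, power_of_x2): coefficient}.


def poly_mul(p, q):
    """Multiply two polynomials term by term."""
    out = defaultdict(float)
    for (a1, a2), ca in p.items():
        for (b1, b2), cb in q.items():
            out[(a1 + b1, a2 + b2)] += ca * cb
    return dict(out)


def poly_scale(p, c):
    """Multiply every coefficient by the constant c."""
    return {powers: c * coeff for powers, coeff in p.items()}


def poly_add(p, q):
    """Add two polynomials."""
    out = defaultdict(float, p)
    for powers, coeff in q.items():
        out[powers] += coeff
    return dict(out)


def learned_action(a):
    """Expand S(x) = z1**2/2 + z2**2/2 for the hypothetical map
    z1 = x1, z2 = x2 + a*x1**2 (Jacobian determinant 1, so no extra term)."""
    z1 = {(1, 0): 1.0}
    z2 = {(0, 1): 1.0, (2, 0): a}
    return poly_add(poly_scale(poly_mul(z1, z1), 0.5),
                    poly_scale(poly_mul(z2, z2), 0.5))


action = learned_action(a=0.3)
for powers, coeff in sorted(action.items()):
    print(f"x1^{powers[0]} * x2^{powers[1]}: {coeff:.3f}")
```

In this toy setting the coefficient of x1²·x2 equals a, and the coefficient of x1⁴ equals a²/2. Both interaction terms vanish when a = 0, so the parameter of the mapping directly encodes how strongly the two components interact.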

In which applications was the AI used?

We used a data set of black and white images with handwritten numbers, for example, which is often used in research when working with neural networks. As part of her doctoral thesis, Claudia Merger investigated how small substructures in the images, such as the edges of the numbers, are made up of interactions between pixels. Groups of pixels are found that tend to be brighter together and thus contribute to the shape of the edge of the number.
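What "pixels that tend to be brighter together" means can be sketched with the simplest possible statistic, the covariance between pairs of pixels. The example below is an assumption-laden toy: it uses synthetic 4×4 binary images with one bright vertical stroke and some noise instead of real handwritten digits, and it looks only at pairs, whereas the study extracts interactions from a trained network.

```python
import random

SIZE = 4  # toy images are SIZE x SIZE pixels (hypothetical, not the real data set)


def make_image(rng, flip_prob=0.05):
    """Flattened binary image: one bright vertical stroke plus a little pixel noise."""
    column = rng.randrange(SIZE)
    pixels = []
    for row in range(SIZE):
        for col in range(SIZE):
            bright = 1 if col == column else 0
            if rng.random() < flip_prob:
                bright = 1 - bright  # noise: flip this pixel
            pixels.append(bright)
    return pixels


def covariance(images, i, j):
    """Sample covariance between pixel i and pixel j across all images."""
    n = len(images)
    mean_i = sum(img[i] for img in images) / n
    mean_j = sum(img[j] for img in images) / n
    return sum(img[i] * img[j] for img in images) / n - mean_i * mean_j


def index(row, col):
    """Position of pixel (row, col) in the flattened image."""
    return row * SIZE + col


rng = random.Random(0)  # fixed seed so the run is reproducible
images = [make_image(rng) for _ in range(2000)]

same_stroke = covariance(images, index(0, 1), index(1, 1))
different_strokes = covariance(images, index(0, 1), index(0, 2))
print(f"same column: {same_stroke:+.3f}, neighboring columns: {different_strokes:+.3f}")
```

Pixels in the same column come out positively correlated, because they light up together whenever the stroke falls on that column, while pixels in different columns are negatively correlated. Groups of co-varying pixels like these are the raw material from which an edge is assembled.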

How high is the computational effort?

The use of AI is a trick that makes the calculations possible in the first place. You very quickly reach a very large number of possible interactions. Without using this trick, you could only look at very small systems. Nevertheless, the computational effort involved is still high, because the number of possible interactions grows rapidly with the number of components. However, we can efficiently parameterize these interactions so that we can now examine systems with around 1,000 interacting components, i.e. image areas with up to 1,000 pixels. In the future, much larger systems should also be possible through further optimization.
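A quick count shows why the number of possible interactions becomes so large. The sketch below assumes that an interaction of order k is simply a choice of k distinct components, so their number is the binomial coefficient "n choose k". The function name is illustrative.

```python
import math


def interaction_count(n_components, order):
    """Number of distinct groups of `order` components among `n_components`."""
    return math.comb(n_components, order)


for n in (10, 100, 1000):
    pairs = interaction_count(n, 2)
    triplets = interaction_count(n, 3)
    quadruplets = interaction_count(n, 4)
    print(f"{n:>5} components: {pairs:,} pairs, "
          f"{triplets:,} triplets, {quadruplets:,} quadruplets")
```

For 1,000 components there are already 499,500 possible pairs and more than 166 million possible triplets, which is why a brute-force enumeration is not feasible and an efficient parameterization is needed.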

How does this approach differ from other AIs such as ChatGPT?

Many AIs aim to learn a theory of the data used to train the AI. However, the theories that the AIs learn usually cannot be interpreted. Instead, they are implicitly hidden in the parameters of the trained AI. In contrast, our approach extracts the learned theory and formulates it in the language of interactions between system components, which underlies physics. It thus belongs to the field of explainable AI, specifically the “physics of AI”, as we use the language of physics to explain what the AI has learned. We can use the language of interactions to build a bridge between the complex inner workings of AI and theories that humans can understand.

Reference: “Learning Interacting Theories from Data” by Claudia Merger, Alexandre René, Kirsten Fischer, Peter Bouss, Sandra Nestler, David Dahmen, Carsten Honerkamp and Moritz Helias, 20 November 2023, Physical Review X.
DOI: 10.1103/PhysRevX.13.041033