1 Introduction

Artificial intelligence (AI) research has tried many different approaches since its founding. In the first decades of the 21st century, AI research has been dominated by highly mathematical, statistical machine learning (ML), which has proved highly successful and has helped solve many challenging real-world problems.

Many problems in AI can, in principle, be solved by searching through many possible solutions: reasoning can be reduced to performing a search. Simple exhaustive searches, however, are rarely sufficient for real-world problems. For many problems the remedy is to use "heuristics" or "rules of thumb" that prioritize choices more likely to reach a goal. A very different kind of search came to prominence in the 1990s, based on the mathematical theory of optimization, and modern machine learning is built on these methods. Instead of using detailed explanations to guide the search, it uses a combination of [1]: (a) general architectures; (b) trying trillions of possibilities, guided by simple ideas (like gradient descent) for improvement; and (c) the ability to recognize progress.
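As a minimal illustration of point (b), the following sketch applies plain gradient descent to a simple quadratic loss. The loss function, step size, and iteration count are illustrative choices made for this example, not taken from any particular system discussed here.

# Minimal sketch of gradient descent on an illustrative quadratic loss.
# Loss f(x) = (x - 3)^2; its minimizer is x = 3.

def loss(x):
    return (x - 3.0) ** 2

def grad(x):
    # Analytic derivative of the quadratic loss above.
    return 2.0 * (x - 3.0)

x = 0.0      # initial guess
step = 0.1   # step size (learning rate), chosen for illustration
for i in range(100):
    x = x - step * grad(x)   # move against the gradient

print(x, loss(x))   # x approaches 3 and the loss approaches 0

The decreasing loss is the "recognizable progress" of point (c): the same idea carries over when x is replaced by millions of network weights and the gradient is computed by backpropagation.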

I am interested in applying machine learning to problems in computational physics that traditional numerical methods cannot easily handle, either because the computational cost is too high or because the traditional algorithms are too complicated to implement easily.

Enrico Fermi once criticized a model containing many free parameters by quoting Johnny von Neumann: “With four parameters I can fit an elephant, and with five I can make him wiggle his trunk.”

Fermi's point is that it is easy to fit existing data; what matters is a model with predictive capability, that is, one that also fits data not yet seen. Artificial neural networks tackle this difficulty by increasing the number of free parameters to millions, in the hope of obtaining predictive capability.
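To give a sense of the scale involved (a hedged illustration with arbitrary example layer sizes, not an architecture used in this work): even a small fully connected network already has on the order of a million free parameters, since a layer mapping n_in inputs to n_out outputs has n_in * n_out weights plus n_out biases.

# Parameter count of a small fully connected network (illustrative layer sizes only).
layers = [(784, 1024), (1024, 1024), (1024, 10)]   # example (n_in, n_out) pairs

total = sum(n_in * n_out + n_out for n_in, n_out in layers)
print(total)   # 1,863,690 free parameters for this toy architecture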