The Crisis in Validating Artificial Intelligence Research

Image Courtesy: Christine Daniloff/MIT

For a scientific study to be considered robust and accurate, others must be able to replicate it and produce similar results. Replication shows that the results were not a fluke, and it makes it possible to compare different studies. However, replicating research in the field of artificial intelligence (AI) is proving difficult because researchers often don’t share their code.

To make replication possible, scientific papers include a section describing the methods used to carry out the research. In AI, that description is often incomplete. Science has reported that in a sample of 400 papers submitted to top AI conferences, only 6% shared their algorithm’s code, a third shared the data used to test the algorithm, and only half shared the pseudocode (a simplified description of the algorithm).

Without access to these vital pieces of information, replicating the results of the study poses a major challenge.

In AI, small changes in how an algorithm is run and tested can make a huge difference to the results. In machine learning, for instance, data is used to train the algorithm to perform different tasks. An algorithm required to predict the cost of houses will do so by analysing a dataset containing the characteristics of different houses and their prices. The choice of dataset can lead the algorithm to make different predictions. Thus, when a research paper doesn’t share its data, replicating the results becomes even harder.
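The effect of dataset choice can be shown with a small sketch. The example below is illustrative only: the two datasets, their numbers, and the helper names `fit_line` and `predict` are assumptions made for this illustration and do not come from any study mentioned here. The same simple algorithm (a straight-line fit of price against floor area) is trained on two different sets of houses and then asked to price the same house.

```python
def fit_line(areas, prices):
    """Least-squares fit of price = slope * area + intercept."""
    n = len(areas)
    mean_x = sum(areas) / n
    mean_y = sum(prices) / n
    sxx = sum((x - mean_x) ** 2 for x in areas)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(areas, prices))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict(model, area):
    """Apply a fitted (slope, intercept) pair to one floor area."""
    slope, intercept = model
    return slope * area + intercept


# Hypothetical training sets: floor area in square metres, price in thousands.
city_areas, city_prices = [50, 70, 90, 110], [300, 400, 500, 600]
town_areas, town_prices = [50, 70, 90, 110], [100, 130, 160, 190]

city_model = fit_line(city_areas, city_prices)
town_model = fit_line(town_areas, town_prices)

# Same algorithm, same house, different training data.
city_estimate = predict(city_model, 80)
town_estimate = predict(town_model, 80)
print(city_estimate, town_estimate)
```

The code is identical in both cases, yet the two estimates for the same house differ widely. Someone trying to replicate a result without the original data has no way to tell which of these models the authors actually trained.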

Rosemary Ke, a PhD student at the University of Montreal, faced this problem when she tried to replicate a speech recognition algorithm. Ke and her team recreated the code for the algorithm from its description, but they did not have the data used and so failed to get the same results. “There’s randomness from one run to another,” she says. You can get “really, really lucky and have one run with a really good number. That’s usually what people report,” she adds.
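Ke’s point about run-to-run randomness can be sketched in a few lines. This is a toy simulation rather than a real training run: the function `train_and_evaluate`, the assumed underlying accuracy of 0.80 and the noise level of 0.03 are all hypothetical values chosen for illustration. It compares the single best run with the average over several runs, each started from a different random seed.

```python
import random
import statistics


def train_and_evaluate(seed):
    """Hypothetical stand-in for one training run; returns a noisy accuracy."""
    rng = random.Random(seed)      # a fixed seed makes this run repeatable
    underlying_accuracy = 0.80     # assumed value, for illustration only
    return underlying_accuracy + rng.gauss(0, 0.03)


seeds = range(10)
scores = [train_and_evaluate(seed) for seed in seeds]

best = max(scores)                 # the "lucky" run
mean = statistics.mean(scores)     # a fairer summary of typical performance
spread = statistics.stdev(scores)  # how much the runs disagree

print(f"best run: {best:.3f}")
print(f"mean over {len(scores)} runs: {mean:.3f} +/- {spread:.3f}")
```

Reporting the mean and spread over several seeds, and publishing the seeds themselves, gives other researchers a realistic target to match. Reporting only the best run does not.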

Researchers have several reasons for not sharing their code. The code could be incomplete or owned by a company, or the researcher might be unwilling to share it in order to stay ahead of the competition.

However, withholding code seriously impedes the progress of artificial intelligence research and reduces transparency. Computer scientist Odd Erik Gundersen said at a meeting of the Association for the Advancement of Artificial Intelligence (AAAI) that change is necessary as AI grows. “It’s not about shaming,” he told Science. “It’s just about being honest.”

Efforts are being made to encourage researchers to reproduce the results of studies. ReScience, a computer science journal dedicated solely to replications, was launched in 2015. A “reproducibility challenge” is also being organised, in which most of the participants are students who will receive academic credit for their work.

Despite these efforts, the pressure to publish results quickly means that many researchers do not test the robustness of their algorithms by rerunning them with different parameters. Many also do not report failed replications.
