AI learns to write computer code in ‘stunning’ advance
Generative AI has taken the world by storm and continues to be a source of deep public fascination.
Over the past few decades, the world has increasingly come to run on software, and the next iteration appears to be AI-written software. Software already underpins many aspects of our lives, allowing us to interact with and control everything from smartphones and electric vehicles to nuclear power plants.
Setting aside the potentially terrifying scenario of software that can think for itself, there may be a future where plain English is reliably translated into functioning lines of code, or even a full program.
An AI model called AlphaCode brings humanity one step closer to that vision, according to a recent study. Researchers say the DeepMind system could soon assist experienced coders, but it’s unlikely to fully replace them.
Armando Solar-Lezama, head of the computer-assisted programming group at the Massachusetts Institute of Technology, said: “It’s very impressive, the performance they’re able to achieve on some pretty challenging problems.”
AlphaCode goes beyond the industry standard: Codex, a 2021 system launched by the non-profit research organisation OpenAI. The non-profit had already developed GPT-3, a “large language model” that imitates and interprets text after being trained on billions of books, articles, and other internet pages. By unleashing GPT-3 on more than 100 gigabytes of code from the online code repository GitHub, OpenAI came up with Codex. The software writes code when prompted with descriptions of what to do.
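To see what that looks like in practice, consider the kind of exchange such a system is meant to handle: a plain-English request and a short program in response. The snippet below is a hand-written illustration of that interaction, not actual Codex output.

```python
# Prompt given to the model: a plain-English description of the task.
#
#   "Write a function that returns the n-th Fibonacci number."
#
# The kind of completion a code-writing model is meant to produce:
def fibonacci(n: int) -> int:
    """Return the n-th Fibonacci number (0-indexed)."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a
```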
However, such systems still stumble when problems get tricky, and AlphaCode’s developers focused on those shortcomings. Like the Codex researchers, they began by feeding a language model many gigabytes of code from GitHub to familiarise it with coding syntax and conventions. They then trained it to translate plain-language problem descriptions into code, using thousands of problems collected from online programming competitions. When faced with a fresh problem, AlphaCode generated many candidate solutions and then filtered out the unpromising ones. This is standard procedure, but whereas Codex generated tens or hundreds of candidates, DeepMind had AlphaCode generate over a million.
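In outline, that generation step boils down to sampling an enormous pool of candidate programs for a single problem statement. The sketch below is purely illustrative; `model.generate` and the problem format are hypothetical stand-ins, not DeepMind’s published interface.

```python
import random

def generate_candidates(model, problem_statement, num_samples=1_000_000):
    """Sample a large pool of candidate programs for one problem statement.

    `model.generate` is a hypothetical stand-in for whatever sampling
    interface a large code model exposes; it is not DeepMind's actual API.
    """
    candidates = []
    for _ in range(num_samples):
        # Vary the sampling temperature so candidates explore different
        # coding tactics rather than repeating one solution.
        temperature = random.uniform(0.1, 1.0)
        candidates.append(model.generate(problem_statement, temperature=temperature))
    return candidates
```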
To filter the candidate solutions, AlphaCode kept only the roughly 1% of programs that passed the example test cases supplied with each problem. To narrow the field further, it clustered the survivors, grouping programs that produced the same outputs on automatically generated inputs. It then submitted programs from these clusters, one by one, until it reached 10 submissions (the maximum that humans submit in the competitions). This clustering step lets the AI cover a wide range of tactics, and it is the most innovative part of AlphaCode’s process, according to Kevin Ellis, a computer scientist at Cornell University.
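A minimal sketch of that filter-and-cluster step might look like the code below, assuming two hypothetical helpers, `passes_example_tests` and `run_program`, that execute a candidate program; neither is part of any published AlphaCode API.

```python
from collections import defaultdict

def select_submissions(candidates, passes_example_tests, run_program,
                       example_tests, generated_inputs, limit=10):
    """Pick up to `limit` submissions from a pool of candidate programs.

    `passes_example_tests` and `run_program` are assumed callables that
    execute a candidate program; they stand in for real execution machinery.
    """
    # 1. Keep only the candidates that pass the example tests shipped
    #    with the problem statement (in practice roughly 1% survive).
    survivors = [c for c in candidates if passes_example_tests(c, example_tests)]

    # 2. Cluster the survivors: programs producing identical outputs on the
    #    generated inputs are treated as behaviourally equivalent.
    clusters = defaultdict(list)
    for program in survivors:
        signature = tuple(run_program(program, x) for x in generated_inputs)
        clusters[signature].append(program)

    # 3. Take one representative from each cluster, largest clusters first,
    #    until the submission budget is used up.
    ordered = sorted(clusters.values(), key=len, reverse=True)
    return [cluster[0] for cluster in ordered[:limit]]
```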
Following training, AlphaCode solved 34% of assigned problems, DeepMind reports. In another demonstration, DeepMind entered AlphaCode into online coding competitions. Competing against 5,000 participants, the system outperformed 45.7% of the programmers. The researchers also found that AlphaCode did not simply duplicate large portions of code or logic from its training data, a result Ellis called surprising.
“It continues to be impressive how well machine-learning methods do when you scale them up,” he said. The results are “stunning,” added Wojciech Zaremba, a co-founder of OpenAI and co-author of its Codex paper.
Artificial-intelligence coding might soon have applications beyond winning competitions, too, said DeepMind computer scientist and paper co-author Yujia Li. For example, it could take on routine programming chores, freeing up development time. Alternatively, it could help people without coding experience create simple programs.
David Choi, another study co-author at DeepMind, has raised the possibility of running the models in reverse: translating code into plain-language descriptions of what it does. That could benefit programmers trying to understand other people’s code. For now, though, DeepMind is focused on reducing the system’s errors: impressive as it is, AlphaCode still has bugs that need fixing. For instance, it has a tendency to create variables it never uses.
Another problem is the sheer scale of computation involved: the tens of billions of trillions of operations the system needs are available only to large tech companies, and tackling real-world programming problems outside the test environment would demand substantially more power, along with a deeper understanding of the software being produced.
The study notes some long-term risks of software that can recursively improve itself. Some experts warn that continued iteration of this kind could eventually lead to a superintelligent AI capable of destroying the world. While the scenario sounds like science fiction, researchers are keen to build guardrails and ethics standards into coding systems to mitigate such an outcome.
“Even if this kind of technology becomes super successful, you would want to treat it the same way you treat a programmer within an organization,” Solar-Lezama said. “You never want an organization where a single programmer could bring the whole organization down.”
In effect, researchers are arguing for a decentralised model of oversight that protects against bad or catastrophic outcomes.