On Tuesday, Meta AI announced the development of Cicero, which it claims is the first AI to achieve human-level performance in Diplomacy, a strategic board game. This is a remarkable achievement because the game hinges on deep interpersonal negotiation, which means Cicero had to master the language skills needed to win.
Even before Deep Blue beat Garry Kasparov at chess in 1997, board games were a useful measure of AI achievement. In 2016, another barrier fell when AlphaGo defeated Go master Lee Sedol. Both games follow fairly clear analytical rules (although Go's rules are often simplified for computer play).
Diplomacy is different: a big part of the gameplay involves social skills. Players must show empathy, use natural language, and build relationships to win, a difficult task for a computer player. With this in mind, Meta asked, “Can we build more efficient and flexible agents that can use language to negotiate, persuade, and work with people to achieve strategic goals similar to how humans do?”
According to Meta, the answer is yes. Cicero learned its skills by playing an online version of Diplomacy on webDiplomacy.net. Over time, it became a master at the game, achieving “more than double the average score” of human players and ranking in the top 10 percent of participants who played more than one game.
To create Cicero, Meta combined AI models for strategic reasoning (similar to AlphaGo) and natural language processing (similar to GPT-3) and assembled them into a single agent. In each game, Cicero analyzes the state of the game board and the chat history and predicts how other players will act. It then devises a plan that it executes through a language model that can generate human-like dialogue, allowing it to coordinate with other players.
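The loop described above, where strategic reasoning predicts and plans while a dialogue model turns the plan into messages, can be sketched in a few lines. This is a hypothetical illustration only: all function names, the toy board representation, and the trivial heuristics are assumptions for clarity, not Meta's actual implementation.

```python
# Hypothetical sketch of a Cicero-style agent turn. The real system uses
# learned models for prediction, planning, and dialogue; these stubs just
# show how the pieces fit together.

def predict_moves(board_state: dict, chat_history: list) -> dict:
    # Stub planner: assume every power simply holds its position.
    return {power: "hold" for power in board_state}

def plan_moves(predictions: dict, power: str) -> str:
    # Stub strategy: offer support to the first other power.
    allies = [p for p in predictions if p != power]
    return f"support {allies[0]}" if allies else "hold"

def generate_dialogue(plan: str, power: str) -> str:
    # Stub for the language model: verbalize the plan as a chat message.
    return f"{power}: This turn I intend to {plan}. Shall we coordinate?"

def run_turn(board_state: dict, chat_history: list, power: str):
    predictions = predict_moves(board_state, chat_history)   # read the board and chat
    plan = plan_moves(predictions, power)                    # decide our own moves
    message = generate_dialogue(plan, power)                 # negotiate in natural language
    return plan, message

plan, message = run_turn({"FRANCE": 3, "ENGLAND": 2}, [], power="FRANCE")
print(plan)  # support ENGLAND
```

The key design point, per Meta's description, is that the message is generated from the plan, so what Cicero says stays consistent with what it intends to do.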
Meta calls Cicero’s natural language skills a “controllable dialogue model,” and therein lies the core of Cicero’s personality. Like GPT-3, Cicero draws on a large text corpus mined from the web. “To build a controllable dialogue model, we started with a 2.7 billion-parameter BART-like language model pre-trained on text from the Internet and fine-tuned on over 40,000 human games on webDiplomacy.net,” writes Meta.
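“Controllable” here means the dialogue model is conditioned on a control signal, the planned action, so its messages match the strategy. One simple way to picture that conditioning is prepending the plan to the dialogue history before generation. The token format, tags, and order notation below are illustrative assumptions, not the actual training format:

```python
# Hypothetical illustration of conditioning a dialogue model on a plan.
# The <plan> tag and "SPEAKER -> RECIPIENT:" framing are invented for
# this sketch; a real model would use its own learned input format.

def build_prompt(plan: str, history: list, speaker: str, recipient: str) -> str:
    control = f"<plan> {plan} </plan>"        # the control signal
    dialogue = "\n".join(history)             # prior chat messages
    return f"{control}\n{dialogue}\n{speaker} -> {recipient}:"

prompt = build_prompt(
    plan="FRANCE: A PAR supports A LON - BEL",
    history=["ENGLAND -> FRANCE: Will you back my move to Belgium?"],
    speaker="FRANCE",
    recipient="ENGLAND",
)
print(prompt)
```

Because the plan sits in the model's input, sampled continuations tend to propose messages that are faithful to it rather than free-floating chatter.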
The resulting model mastered the intricacies of a complex game. “Cicero can deduce, for example, that it will need a particular player’s help later in the game,” says Meta, “and then devise a strategy to win that person’s favor, as well as recognize the risks and opportunities that player sees from their particular perspective.”
Meta’s Cicero research was published in Science under the title “Human-level play in the game of Diplomacy by combining language models with strategic reasoning”.
As for broader applications, Meta suggests that its Cicero research could ease communication barriers between humans and AI, such as maintaining a long-term conversation to teach someone a new skill. Or it could power a video game in which NPCs talk like humans, understanding the player’s motivations and adjusting along the way.
At the same time, this technology could be used to manipulate humans by impersonating people and deceiving them in potentially dangerous ways, depending on the context. Along those lines, Meta hopes other researchers will build on its code “responsibly,” and says it has taken steps to detect and remove “toxic messages in this new domain,” likely referring to the conversational patterns Cicero learned from the Internet text it ingested, always a danger for large language models.
Meta has set up a dedicated site to explain how Cicero works and has also open-sourced Cicero’s code on GitHub. Online Diplomacy fans, and maybe the rest of us, should keep an eye out.