Monday 2 March 2015

Algorithm Teaches Itself To Be a Better Gamer than You

Playing Breakout on an old Atari 2600 might not seem like cutting-edge computing, but it is when a computer algorithm learns on its own to play that and other games as well as humans do. In a paper published Thursday in the journal Nature, researchers from Google-owned DeepMind describe how their "deep Q-network," or DQN, did better than any previous machine-learning algorithm in mastering 43 of 49 classic Atari video games.
Starting with just the pixels on the game screen, a set of available actions and a reward system as an incentive for earning higher game scores, DQN was able to figure out such games as Breakout, Enduro racing, Pong, Space Invaders, River Raid and Q*bert. In half of the games, the algorithm "learned" how to play at "more than 75 percent of the level of a professional human player."
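One way to read the "75 percent of the level of a professional human player" figure is as a normalized score, where 0 percent corresponds to random play and 100 percent to the human tester. The sketch below illustrates that calculation under this assumption; the function name and the sample numbers are hypothetical and are not taken from the paper.

```python
def human_normalized_score(agent_score, random_score, human_score):
    """Return the agent's score as a percentage of human-level play.

    0 means no better than random play; 100 means equal to the human.
    All three inputs are raw game scores for the same game.
    """
    if human_score == random_score:
        raise ValueError("human and random scores must differ")
    return 100.0 * (agent_score - random_score) / (human_score - random_score)


# Hypothetical numbers, for illustration only.
example = human_normalized_score(agent_score=80.0, random_score=0.0, human_score=100.0)
```

On this scale, a result above 75 clears the threshold quoted above, and a result above 100 means the algorithm outscored the human player.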
DeepMind, founded in 2011 and based in London, was acquired by Google in early 2014 (reports put the sale price at between $400 million and $650 million). The company researches machine learning and artificial intelligence, fields in which Google has long been interested.
An Eye on Smarter Google Apps
Describing the new game-learning research Wednesday in a post on Google's Research Blog, DeepMind's Dharshan Kumaran and Demis Hassabis said DQN could help lead to smarter computing with practical, daily applications for people.
"This work offers the first demonstration of a general purpose learning agent that can be trained end-to-end to handle a wide variety of challenging tasks, taking in only raw pixels as inputs and transforming these into actions that can be executed in real-time," Kumaran and Hassabis said. "This kind of technology should help us build more useful products -- imagine if you could ask the Google app to complete any kind of complex task ('Okay, Google, plan me a great backpacking trip through Europe!')."
We caught up with Hassabis, who is vice president for engineering at DeepMind, to elaborate on future uses.
"From a more concrete applications point of view, our team is generally interested in things like Search and other core Google efforts -- baking better 'smarts' into services," Hassabis told us. "Ultimately, we'd like to help tackle bigger problems, too, like helping researchers make sense of the incredibly complex systems in climate science, medicine, genomics, etc."
Despite such potentially useful applications, the rapid advances in machine learning in recent years have led even a few of science's and technology's top minds -- including Stephen Hawking, Bill Gates and Elon Musk -- to describe artificial intelligence as a possible threat to humanity. DeepMind has also given the implications of its research some thought: around the time of Google's acquisition, members of the DeepMind team reportedly pushed for Google to establish an AI ethics board.
AI Pinball Wizard
DQN, Kumaran and Hassabis wrote, achieved its latest successes through the combination of artificial neural networks -- called deep neural networks -- and reinforcement learning, a framework that gave the algorithm the goal of maximizing future rewards by earning higher scores. To enable the algorithm to "learn" video-game-playing skills effectively, DeepMind also had to find a way to emulate another human condition: sleep.
During the learning phase, Kumaran and Hassabis said DQN was "trained on samples drawn from a pool of stored episodes," a mechanism called "experience replay." That process is similar to how the human hippocampus draws on declarative and episodic memories for dreams during sleep.
In fact, if DQN could not "sleep" or "dream," it could not improve its gaming skills nearly as well.
"The incorporation of experience replay was critical to the success of DQN: disabling this function caused a severe deterioration in performance," Kumaran and Hassabis said.
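The idea of training on "samples drawn from a pool of stored episodes" can be sketched in a few lines. The example below is a simplified illustration, not DeepMind's implementation: it swaps the deep neural network for a plain lookup table of action values, and every name and parameter in it (ReplayBuffer, replay_update, alpha, gamma, batch_size) is an assumption made for the sketch.

```python
import random
from collections import defaultdict, deque


class ReplayBuffer:
    """Fixed-size pool of past transitions; the oldest are dropped first."""

    def __init__(self, capacity):
        self.episodes = deque(maxlen=capacity)

    def store(self, state, action, reward, next_state, done):
        self.episodes.append((state, action, reward, next_state, done))

    def sample(self, batch_size, rng=random):
        # Draw stored transitions at random rather than in the order played.
        pool = list(self.episodes)
        return rng.sample(pool, min(batch_size, len(pool)))

    def __len__(self):
        return len(self.episodes)


def replay_update(q_values, buffer, actions, batch_size=32,
                  alpha=0.1, gamma=0.99, rng=random):
    """Apply one round of Q-learning updates using replayed transitions.

    q_values maps (state, action) to an estimate of future reward.
    A table stands in here for the deep network that DQN uses.
    """
    for state, action, reward, next_state, done in buffer.sample(batch_size, rng):
        # Estimated future reward: zero if the game ended on this step.
        best_next = 0.0 if done else max(q_values[(next_state, a)] for a in actions)
        target = reward + gamma * best_next
        q_values[(state, action)] += alpha * (target - q_values[(state, action)])


# Hypothetical usage: one stored transition, replayed once.
q_values = defaultdict(float)
buffer = ReplayBuffer(capacity=1000)
buffer.store("ball_left", "move_left", 1.0, "ball_center", False)
replay_update(q_values, buffer, actions=["move_left", "move_right"])
```

Sampling at random from the pool, rather than learning only from the most recent frame, lets the same experience be reused many times and mixes old and new situations in each round of updates.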
Among the games DQN did best at -- "human-level or above" -- were Video Pinball, Boxing, Breakout, Star Gunner, Robotank, Atlantis, Crazy Climber and Gopher. Games where its brand of machine learning didn't work so well, on the other hand, included Montezuma's Revenge, Private Eye, Gravitar, Frostbite, Ms. Pac-Man and Bowling.
