Quantum-enhanced deliberation of learning agents using trapped ions
Vedran Dunjko, Nicolai Friis, & Hans Briegel, Austrian Academy of Sciences, Innsbruck University & Ruder Boskovic Institute :: arXiv:1407.2830v1 [quant-ph] 10 Jul 2014
Summary and review of the above paper
The authors apply quantum mechanics to the design of artificial learning agents. Their scheme is built on the projective simulation (PS) model, in which the agent has a memory system that is explored by random walks and is amenable to quantisation. Replacing the classical random walk with a quantum walk allows a speed-up in the agent's deliberation. The authors propose that this could be implemented in a system of trapped ions.
The memory system used by the PS agent is central to this process. It is called ‘episodic and compositional memory’ (ECM), and it simulates possible future actions before any real action is taken. The units of the memory system are recalled percepts and actions, together with the rewards that followed them. The ECM represents the prior experiences of the PS learning agent, and the random walk through it is how the agent reaches its decisions.
A perceptual input initiates the random walk, and so replays existing memories. Each percept gives rise to its own random walk, which terminates when an action is reached; that action can then be carried out as a real action. Each interaction with the environment is either rewarded or not rewarded, and the memory network is updated accordingly, so that it reflects the likelihood of future rewards or the lack of them. A key feature is that the transition probabilities of the walk are continually adjusted as new inputs arrive. The agent therefore learns while it acts: every action it outputs is followed by an update of the memory network.
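The loop described above (percept, random walk, action, reward, memory update) can be illustrated with a minimal classical sketch. It is not the authors' implementation and contains nothing quantum. It assumes the simplest possible memory, in which every percept is linked directly to every action, so the walk is a single hop. The class name, the toy environment, and the update rule (add the reward to the weight of the edge just used, with optional damping back towards 1) are assumptions chosen for illustration, following a rule commonly used in the PS literature rather than anything stated in this summary.

```python
import random


class TwoLayerPSAgent:
    """Toy PS-style agent (hypothetical): percepts link directly to actions."""

    def __init__(self, percepts, actions, damping=0.0, seed=None):
        self.actions = list(actions)
        self.damping = damping  # assumed forgetting rate, 0 means no forgetting
        self.rng = random.Random(seed)
        # Unnormalised edge weights; all start at 1, so the first walk is uniform.
        self.h = {p: {a: 1.0 for a in self.actions} for p in percepts}

    def probabilities(self, percept):
        """Transition probabilities out of a percept: normalised edge weights."""
        weights = self.h[percept]
        total = sum(weights.values())
        return {a: w / total for a, w in weights.items()}

    def deliberate(self, percept):
        """One-hop random walk from the percept to an action."""
        probs = self.probabilities(percept)
        weights = [probs[a] for a in self.actions]
        return self.rng.choices(self.actions, weights=weights)[0]

    def learn(self, percept, action, reward):
        """Damp all weights towards 1, then strengthen the edge just used."""
        for edges in self.h.values():
            for a in edges:
                edges[a] -= self.damping * (edges[a] - 1.0)
        self.h[percept][action] += reward


def train(agent, steps, seed=0):
    """Toy environment (assumption): reward 1 when the action matches the percept."""
    env_rng = random.Random(seed)
    correct = {"left": "go_left", "right": "go_right"}
    for _ in range(steps):
        percept = env_rng.choice(list(correct))
        action = agent.deliberate(percept)
        reward = 1.0 if action == correct[percept] else 0.0
        agent.learn(percept, action, reward)
    return agent
```

Calling train(TwoLayerPSAgent(["left", "right"], ["go_left", "go_right"], seed=1), 200) and then inspecting probabilities("left") shows the walk becoming biased towards the rewarded action. In a fuller ECM the walk could pass through intermediate memory units before reaching an action, and it is this walk that the paper proposes to speed up with a quantum walk.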