What do you get when you cross a monkey with a robot?

[Image: robot with monkey head]

Well, no. This isn't quite what it would look like, but I'm sure you get the idea.



"Cybernetic Monkey" is recent nickname given to an ongoing project that was started in August, 2003. The project involves the design and construction of a robot as a test bed for artificial intelligence experimentation. Despite the name, the robot doesn't resemble a monkey except for the head. As the project develops, the robot will become an autonomous agent, achieve mobility with several degrees of freedom, and react to its environment with myriad sensors.

The purpose of this project is manifold:


First and foremost: feeding the creative process. Like any artistic endeavor, it's fueled by expression, and meant to be a provocative and captivating experience.


Further research into robot-human interactions. The deeper a computer's understanding of the world around it, particularly a world with humans in it, the more meaningful the communication between computers and humans can be. In a sense, a richer means of communication between humans and computers could finally allow computers to do what we "mean", not just "exactly what we say".


Further research into natural language. The verbal languages that we humans speak are varied, rich and intricate, but they're all built upon a more primal foundation of natural language. Elements of natural language are recognizable in many other species. These elements range from the simple (e.g., joint attention, posture, facial expression, vocal tone) to the more complex (e.g., directed attention, vocal articulation, symbolic gesture).


Drinking beer and building robots is fun. Of course, the beer aspect is beyond the scope of this project.


Project Goals

The overall goal for this project is to create an autonomous agent which can explore its environment and participate in limited interaction with people and pets. Russell & Norvig (1995) define an autonomous agent as any self-contained system (hardware and/or software) which operates under independent control, interacting with and adapting to its environment (p. 35). Michael Wooldridge (1999) defines agents as:

"...systems that can decide for themselves what they need to do in order to satisfy their design objectives. Such computer systems are known as agents. Agents that must operate robustly in rapidly changing, unpredictable, or open environments, where there is a significant possibility that actions can fail are known as intelligent agents, or sometimes autonomous agents. (Wooldridge, p. 27)"

Another reason for this project is to explore the dynamics of simulated emotions and drives. This aspect of design in artificial intelligence is fascinating because it gives an agent more life-like behaviors, and could potentially lead to more socially intelligent robotic agents.




I was inspired by the Kismet project of Dr. Cynthia Breazeal and others at the MIT Artificial Intelligence Lab. Their goal was to develop a robot that could communicate with people in terms of natural language, such as facial expressions and body posturing. To facilitate this behavior, Kismet's operating system includes attention and emotion simulators.

Kismet's attention simulator is based on a fantastically intricate sensory system. The engineers designed a video processing system that can detect human eyes, and even follow their gaze (i.e., focus on the same target) to simulate joint attention. It also detects whole faces and brightly colored toys. The robot's audio processing system is designed to detect the tone of voice of nearby human subjects. The attention simulator is designed to continually oscillate between interest and fatigue, so that Kismet's camera-eyes or microphone-ears focus on typical attention-grabbing targets, such as shiny objects or sharp sounds, and then move on.
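The interest-and-fatigue oscillation can be illustrated with a small sketch. This is not Kismet's actual implementation; the function name, the salience values, and the fatigue and recovery rates are all assumptions chosen for illustration. Each target has a base salience, the attended target accumulates fatigue, and unattended targets slowly recover, so attention drifts from one target to another.

```python
def attend(saliences, steps, fatigue_gain=0.3, recovery=0.1):
    """Return the target attended at each step.

    saliences: dict mapping a target name to its base salience
               (hypothetical values, not taken from Kismet).
    """
    fatigue = {name: 0.0 for name in saliences}
    history = []
    for _ in range(steps):
        # Effective interest is base salience minus accumulated fatigue.
        target = max(saliences, key=lambda n: saliences[n] - fatigue[n])
        history.append(target)
        for name in fatigue:
            if name == target:
                # Attending to a target makes it less interesting over time.
                fatigue[name] += fatigue_gain
            else:
                # Ignored targets gradually become interesting again.
                fatigue[name] = max(0.0, fatigue[name] - recovery)
    return history


history = attend({"toy": 1.0, "face": 0.8}, steps=6)
print(history)
```

With these assumed numbers, the toy wins the first step and the face takes over on the second, once the toy's fatigue has lowered its effective interest.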

Kismet's emotion simulator continually oscillates between affect extremes, seeking equilibrium, as people interact with it. For example, once Kismet has had its fill of playing, it withdraws; then, once it has had its fill of quiet, it seeks play again. As these emotional and social factors play out, the robot's animatronic facial features display the corresponding emotional expressions. Further, as Kismet's social interaction with humans alternates between seeking and withdrawal, the corresponding postures and behaviors are displayed.
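The seek-and-withdraw cycle described above resembles a simple drive with two thresholds. The following is a minimal sketch under that assumption, not a description of Kismet's real emotion system; the function name, thresholds, and rate are hypothetical. A stimulation level rises while the robot plays and falls while it rests, and the behavior flips whenever the level reaches an extreme.

```python
def simulate_social_drive(steps, upper=1.0, lower=-1.0, rate=0.5):
    """Return a list of (level, behavior) pairs, one per time step.

    All parameters are illustrative assumptions: 0.0 stands for
    equilibrium, `upper` for "too much play", `lower` for "too much quiet".
    """
    level = 0.0
    behavior = "seek_play"
    trace = []
    for _ in range(steps):
        # Playing raises stimulation; withdrawing lets it fall.
        level += rate if behavior == "seek_play" else -rate
        if level >= upper:
            behavior = "withdraw"    # had its fill of play
        elif level <= lower:
            behavior = "seek_play"   # had its fill of quiet
        trace.append((level, behavior))
    return trace


for level, behavior in simulate_social_drive(8):
    print(f"{level:+.1f} {behavior}")
```

Using two thresholds rather than one keeps the behavior from flickering near equilibrium: the robot commits to play or to quiet until it has clearly had enough.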

The large team of designers and engineers at MIT who created Kismet is far above my ability (and paycheck); however, the general designs of the attention and emotion simulators are very inspiring.



Works Cited:

Mori, Masahiro. (1970). "The Uncanny Valley." Karl F. MacDorman and Takashi Minato (trans.). Energy, 7(4), pp. 33-35.

Russell, S. J. & Norvig, P. (1995). Artificial Intelligence: A Modern Approach. New Jersey: Prentice Hall. pp. 31-49.

Wooldridge, M. (1999). "Intelligent Agents" in Multiagent Systems: A Modern Approach to Distributed Artificial Intelligence. Gerhard Weiss (ed.). Massachusetts: MIT Press. p. 27.
