Researchers have built an algorithm that can capture the emotions that an image evokes.
Experts in artificial intelligence have gotten quite good at creating computers that can “see” the world around them—recognizing objects, animals, and activities within their purview. These have become the foundational technologies for autonomous cars, planes, and security systems of the future.
But now a team of researchers is working to teach computers to recognize not just what objects are in an image, but how those images make people feel—i.e., algorithms with emotional intelligence.
“This ability will be key to making artificial intelligence not just more intelligent, but more human, so to speak,” says Panos Achlioptas, a doctoral candidate in computer science at Stanford University who worked with collaborators in France and Saudi Arabia.
To get to this goal, Achlioptas and his team collected a new dataset, called ArtEmis, which was recently published in an arXiv pre-print.
The dataset is built on 81,000 WikiArt paintings and consists of 440,000 written responses from more than 6,500 people, each indicating how a painting makes them feel and explaining why they chose that emotion. Using those responses, Achlioptas and the team, headed by Stanford engineering professor Leonidas Guibas, trained neural speakers (AI that responds in written words) that allow computers to generate emotional responses to visual art and justify those emotions in language.
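To give a sense of the dataset's scale, the short Python sketch below works out the averages implied by the figures quoted above. The `Annotation` record and its field names are hypothetical, chosen only to mirror the description of a response (a painting, a chosen emotion, and a written explanation); they are not the published ArtEmis schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """One written response, as described in the article (hypothetical layout)."""
    painting_id: str   # assumed identifier for a WikiArt painting
    emotion: str       # the emotion the annotator chose
    explanation: str   # the annotator's written reason for that choice


# Rounded figures quoted in the article.
NUM_PAINTINGS = 81_000
NUM_RESPONSES = 440_000
NUM_ANNOTATORS = 6_500  # the article says "over 6,500", so this is a lower bound


def average(total: int, count: int) -> float:
    """Plain arithmetic mean of `total` items spread over `count` groups."""
    return total / count


responses_per_painting = average(NUM_RESPONSES, NUM_PAINTINGS)
# Because the annotator count is a lower bound, this is an upper bound.
responses_per_annotator = average(NUM_RESPONSES, NUM_ANNOTATORS)

print(f"about {responses_per_painting:.1f} responses per painting")
print(f"at most about {responses_per_annotator:.0f} responses per annotator")
```

On these rounded numbers, each painting received a little over five responses on average, which is what makes it possible to compare how different people react to the same work.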
The researchers chose to use art specifically, as an artist’s goal is to elicit emotion in the viewer. ArtEmis works regardless of the subject matter, from still life to human portraits to abstraction.
The work is a new approach in computer vision, notes Guibas, a faculty member of the AI lab and the Stanford Institute for Human-Centered Artificial Intelligence.
“Classical computer vision captioning work has been about literal content,” Guibas says. “There are three dogs in the image, or someone is drinking coffee from a cup. Instead, we needed descriptions that defined emotional content.”
8 emotional categories
The algorithm categorizes the artist’s work into one of eight emotional categories—ranging from awe to amusement to fear to sadness—and then explains in written text what it is in the image that justifies the emotional read.
“The computer is doing this,” says Achlioptas. “We can show it a new image it has never seen, and it will tell us how a human might feel.”
Remarkably, the researchers say, the captions accurately reflect the abstract content of the image in ways that go well beyond the capabilities of existing computer vision algorithms derived from documentary photographic datasets, such as COCO.
What’s more, the algorithm does not simply capture the broad emotional experience of a complete image; it can also decipher differing emotions within a given painting. For instance, in the famous Rembrandt painting of the beheading of John the Baptist, ArtEmis distinguishes not only the pain on John the Baptist’s severed head, but also the “contentment” on the face of Salome, the woman to whom the head is presented.
Achlioptas points out that, while ArtEmis is sophisticated enough to gauge that an artist’s intent can differ within a single image, the tool also accounts for the subjectivity and variability of human response.
“Not every person sees and feels the same thing seeing a work of art,” he adds. For instance, “I can feel happy upon seeing the Mona Lisa, but Professor Guibas might feel sad. ArtEmis can distinguish these differences.”
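One simple way to picture this variability is to treat the responses to a single painting as a distribution over emotions rather than a single label. The Python sketch below is an illustration only: the responses are invented, the function name is hypothetical, and it is not a description of how the researchers' models are actually trained.

```python
from collections import Counter


def emotion_distribution(emotions):
    """Return the share of responses that chose each emotion."""
    counts = Counter(emotions)
    total = sum(counts.values())
    return {emotion: count / total for emotion, count in counts.items()}


# Invented responses from five viewers of the same painting (illustration only).
responses = ["awe", "awe", "sadness", "contentment", "awe"]

distribution = emotion_distribution(responses)
# The most common reaction, and how many viewers felt something else.
dominant = max(distribution, key=distribution.get)
disagreement = 1.0 - distribution[dominant]

print(distribution)
print(f"dominant emotion: {dominant}, share who felt otherwise: {disagreement:.0%}")
```

Keeping the whole distribution, instead of only the most common answer, preserves the fact that two viewers can reasonably react to the same work in different ways.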
A tool for artists
In the near term, the researchers anticipate ArtEmis could become a tool for artists to evaluate their works during creation and ensure they are having the desired impact.
“It could provide guidance and inspiration to ‘steer’ the artist’s work as desired,” Achlioptas says. A graphic artist working on a new logo might use ArtEmis to check that it has the intended emotional effect, for example.
Down the road, after additional research and refinements, Achlioptas can foresee emotion-based algorithms helping bring emotional awareness to artificial intelligence applications such as chatbots and conversational AI agents.
“I see ArtEmis bringing insights from human psychology to artificial intelligence,” Achlioptas says. “I want to make AI more personal and to improve the human experience with it.”
Source: Stanford University