A new algorithm enables robots to put pen to paper, writing words using stroke patterns similar to human handwriting.
It’s a step, the researchers say, toward robots that are able to communicate more fluently with human coworkers and collaborators.
“Just by looking at a target image of a word or sketch, the robot can reproduce each stroke as one continuous action,” says Atsunobu Kotani, an undergraduate student at Brown University who led the algorithm’s development.
“That makes it hard for people to distinguish if it was written by the robot or actually written by a human.”
Copy cat
The algorithm makes use of deep learning networks that analyze images of handwritten words or sketches and can deduce the likely series of pen strokes that created them. The robot can then reproduce the words or sketches using the pen strokes it learned.
In a paper to be presented at this month’s International Conference on Robotics and Automation, the researchers demonstrate a robot that was able to write “hello” in 10 languages that employ different character sets. The robot was also able to reproduce rough sketches, including one of the Mona Lisa.
Stefanie Tellex, an assistant professor of computer science and Kotani’s advisor, says that what makes this work unique is the ability of the robot to learn stroke order from scratch.
“A lot of the existing work in this area requires the robot to have information about the stroke order in advance,” Tellex says.
“If you wanted the robot to write something, somebody would have to program the stroke orders each time. With what Atsu has done, you can draw whatever you want and the robot can reproduce it. It doesn’t always do the perfect stroke order, but it gets pretty close.”
Another remarkable aspect of the work, Tellex says, is how the algorithm can generalize its ability to reproduce strokes.
Kotani trained his deep learning algorithm using a set of Japanese characters, and showed that it could reproduce the characters and the strokes that created them with around 93 percent accuracy. But much to the researchers’ surprise, the algorithm wound up being able to reproduce very different character types it had never seen before—English print and cursive, for example.
“We would have been happy if it had only learned the Japanese characters,” Tellex says. “But once it started working on English, we were amazed. Then we decided to see how far we could take it.”
Tellex and Kotani asked everyone who works in Tellex’s Humans to Robots lab to write “hello” in their native languages, which included Greek, Hindi, Urdu, Chinese, and Yiddish, among others. The robot was able to reproduce them all with reasonable stroke accuracy.
“I feel like there’s something really beautiful about the robot writing in so many different languages,” Tellex says. “I thought that was really cool.”
Whiteboard Mona Lisa
The system’s masterwork may be its copy of Kotani’s Mona Lisa sketch. He drew his sketch on a dry erase board in Tellex’s lab, and then allowed the robot to copy it—fairly faithfully—on the same board just below Kotani’s original.
“It was early morning that our robot finally drew the Mona Lisa on the whiteboard,” Kotani says. “When I came back to the lab, everybody was standing around the whiteboard looking at the Mona Lisa and asking me if [the robot] drew this. They couldn’t believe it.”
It was a big moment for Kotani because “it was the moment that our robot defined what’s beyond mere printing.” An inkjet printer can recreate an image, but it does so with a print head that goes back and forth, building the image line by line. But this was the robot creating an image with human-like strokes, which to Kotani is “something much more humane and expressive.”
Human-robot communication
Key to making the system work, Kotani says, is that the algorithm uses two distinct models of the image it’s trying to reproduce. Using a global model that considers the image as a whole, the algorithm identifies a likely starting point for making the first stroke. Once that stroke has begun, the algorithm zooms in, looking at the image pixel by pixel to determine where that stroke should go and how long it should be. When it reaches the end of the stroke, the algorithm again calls the global model to determine where the next stroke should start, then it’s back to the zoomed-in model. This process is repeated until the image is complete.
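The alternation between the two models can be illustrated with a toy sketch in Python. This is not the researchers' deep learning system: both "models" here are simple hand-written stand-ins, and every name (`global_start`, `local_step`, `trace_strokes`) is hypothetical. The global stand-in scans the whole image for the first ink pixel not yet drawn, and the local stand-in looks only at the pixels adjacent to the pen to decide where the stroke goes next.

```python
from typing import List, Optional, Set, Tuple

Pixel = Tuple[int, int]  # (row, col)
Image = List[List[int]]  # 1 = ink, 0 = blank

# 8-connected neighbourhood, checked in a fixed order (an arbitrary, assumed choice).
NEIGHBORS = [(-1, 0), (1, 0), (0, -1), (0, 1), (-1, -1), (-1, 1), (1, -1), (1, 1)]


def global_start(image: Image, drawn: Set[Pixel]) -> Optional[Pixel]:
    """Stand-in for the global model: look at the whole image and
    return the first ink pixel that has not been drawn yet."""
    for r, row in enumerate(image):
        for c, ink in enumerate(row):
            if ink and (r, c) not in drawn:
                return (r, c)
    return None  # nothing left to draw


def local_step(image: Image, drawn: Set[Pixel], pen: Pixel) -> Optional[Pixel]:
    """Stand-in for the zoomed-in model: look only at pixels next to the
    pen and return an undrawn ink pixel, or None when the stroke ends."""
    r, c = pen
    for dr, dc in NEIGHBORS:
        nr, nc = r + dr, c + dc
        in_bounds = 0 <= nr < len(image) and 0 <= nc < len(image[nr])
        if in_bounds and image[nr][nc] and (nr, nc) not in drawn:
            return (nr, nc)
    return None


def trace_strokes(image: Image) -> List[List[Pixel]]:
    """Alternate between the two stand-ins until every ink pixel is drawn."""
    drawn: Set[Pixel] = set()
    strokes: List[List[Pixel]] = []
    while True:
        start = global_start(image, drawn)  # where should the next stroke begin?
        if start is None:
            break
        stroke = [start]
        drawn.add(start)
        nxt = local_step(image, drawn, start)
        while nxt is not None:  # follow the stroke pixel by pixel
            stroke.append(nxt)
            drawn.add(nxt)
            nxt = local_step(image, drawn, nxt)
        strokes.append(stroke)
    return strokes
```

Given a small grid with a horizontal bar above a separate vertical bar, `trace_strokes` returns two strokes, one per bar, each as an ordered list of pixels. The real system replaces both stand-ins with learned networks, which is what lets it infer plausible human-like stroke paths rather than follow a fixed scanning rule.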
Both Kotani and Tellex say the work is a step toward better communication between people and robots. Ultimately, they envision robots that can leave Post-it Notes, take dictation, or sketch diagrams for their human coworkers and collaborators.
“I want a robot to be able to do everything a person can do,” Tellex says. “I’m particularly interested in a robot that can use language. Writing is a way that people use language, so we thought we should try this.”
Source: Brown University