It’s possible to curb online hate speech by inducing empathy for those affected, research on “counterspeech” finds.
In contrast, the use of humor or warnings of possible consequences has little effect, the researchers say.
To moderate hateful comments, many social media platforms have developed sophisticated filters. These alone, however, are not sufficient to fix the problem. For example, Facebook estimates (according to internal documents leaked in October 2021) that it is unable to delete more than 5% of the hate comments that users post. In addition, automatic filters are imprecise and risk curtailing freedom of speech.
An alternative to deleting problematic comments is targeted counterspeech, which numerous organizations already use to tackle online hate speech. Until now, though, little has been known about which counterspeech strategies are most effective in addressing online hostility.
A team of researchers led by Dominik Hangartner, professor of public policy at ETH Zurich, joined forces with colleagues at the University of Zurich to investigate what kind of messages could encourage authors of hate speech to cut it out.
Using machine learning methods, the researchers identified 1,350 English-speaking Twitter users who had published racist or xenophobic content. They randomly assigned these accounts to a control group or to one of the following three often-used counterspeech strategies: messages that elicit empathy with the group targeted by racism, humor, or a warning of possible consequences.
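As a rough illustration of this randomized design (a minimal sketch, not the study's actual code; the account list, arm names, and function are hypothetical), flagged accounts can be shuffled and assigned evenly across the control group and the three treatment arms:

```python
import random

# Illustrative sketch only: not the researchers' actual pipeline.
# Assumes `accounts` is a list of user IDs already flagged by a
# hate-speech classifier; arm names mirror the study's conditions.
ARMS = ["control", "empathy", "humor", "consequences"]

def assign_arms(accounts, seed=42):
    """Randomly assign each account to one of the four arms."""
    rng = random.Random(seed)      # fixed seed for reproducibility
    shuffled = accounts[:]
    rng.shuffle(shuffled)
    # Cycle through the arms so group sizes stay balanced.
    return {user: ARMS[i % len(ARMS)] for i, user in enumerate(shuffled)}

if __name__ == "__main__":
    demo_accounts = [f"user_{n}" for n in range(8)]
    for user, arm in assign_arms(demo_accounts).items():
        print(user, "->", arm)
```

Comparing each treatment arm against the control group is what lets the researchers attribute any change in posting behavior to the counterspeech message itself.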
The results, which appear in the Proceedings of the National Academy of Sciences, are clear: only counterspeech messages that elicit empathy for the people affected by hate speech are likely to persuade the senders to change their behavior.
An example of such a response could be: “Your post is very painful for Jewish people to read…” Compared to the control group, the authors of hate tweets posted around one-third fewer racist or xenophobic comments after such an empathy-inducing intervention. Additionally, the probability that a user would delete their hate tweet increased significantly.
In contrast, the authors of hate tweets barely reacted to humorous counterspeech. Even messages reminding senders that their family, friends, and colleagues could also see their hateful comments were not effective. This is striking because both strategies are frequently used by organizations committed to combating hate speech.
“We have certainly not found a panacea against hate speech on the internet, but we have uncovered important clues about which strategies might work, and which do not,” says Hangartner. What remains to be studied is whether all empathy-based responses work similarly well, or whether particular messages are more effective. For example, hate speech authors could be encouraged to put themselves in the victim’s shoes, or asked to adopt an analogous perspective (“How would you feel if people talked about you like that?”).
The research is part of a more comprehensive project to develop algorithms that detect hate speech, and to test and refine further counterspeech strategies. To this end, the research team is collaborating with the Swiss women’s umbrella organization alliance F, which has initiated the civil society project Stop Hate Speech. Through this collaboration, the scientists can provide an empirical basis for alliance F to optimize the design and content of its counterspeech messages.
“The research findings make me very optimistic. For the first time, we now have experimental evidence showing the efficacy of counterspeech in real-life conditions,” says Sophie Achermann, executive director of alliance F and co-initiator of Stop Hate Speech.
The Swiss innovation agency Innosuisse sponsored the work, which also involved media companies Ringier and TX Group via their newspapers Blick and 20 Minuten respectively.
Source: Simon Gemperli for ETH Zurich