Using machine learning to slow the spread of hate speech.

You can’t stop what you don’t understand.

The first key to countering hate speech is to have a clear definition of what it is.

We adapted Dr. Gregory Stanton’s 10 Stages of Genocide to create a structure for hate speech identification. Originally presented in a briefing to the U.S. Department of State in 1996, the report helped us understand the process of classification and dehumanization.

We condensed the stages and removed those that aren’t relevant to Twitter (e.g. extermination). We also added contemporary phenomena found on social media (e.g. coded language).

Our hate speech classifications

Mode
Output


5. Intention
Incitement to genocide
Incitement to general violence
Incitement to specific violence
Incitement to degrade and discriminate

4. Polarization
Inculpation of target group
Historical negationism
Promotion of known hate groups
Exclusion of target group

3. Dehumanization
Propagation of stereotype
Derogatory language against target group

2. Classification
Target group comparison
Target group identification

1. Coded Language
Innuendo signaling in-group/out-group nationalism
Innuendo implicating a target group
Innuendo excluding a target group
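
For readers who want to work with these classifications in code, here is one way the list above could be encoded. This is only a sketch: the names TAXONOMY, OUTPUT_TO_MODE and mode_for are hypothetical and are not taken from our platform.

```python
from typing import Dict, List, Tuple

# Hypothetical encoding of the classifications listed above.
# Keys are the mode numbers; values are (mode name, list of outputs).
TAXONOMY: Dict[int, Tuple[str, List[str]]] = {
    5: ("Intention", [
        "Incitement to genocide",
        "Incitement to general violence",
        "Incitement to specific violence",
        "Incitement to degrade and discriminate",
    ]),
    4: ("Polarization", [
        "Inculpation of target group",
        "Historical negationism",
        "Promotion of known hate groups",
        "Exclusion of target group",
    ]),
    3: ("Dehumanization", [
        "Propagation of stereotype",
        "Derogatory language against target group",
    ]),
    2: ("Classification", [
        "Target group comparison",
        "Target group identification",
    ]),
    1: ("Coded Language", [
        "Innuendo signaling in-group/out-group nationalism",
        "Innuendo implicating a target group",
        "Innuendo excluding a target group",
    ]),
}

# Reverse index: output label -> (mode number, mode name).
OUTPUT_TO_MODE: Dict[str, Tuple[int, str]] = {
    output: (number, mode)
    for number, (mode, outputs) in TAXONOMY.items()
    for output in outputs
}


def mode_for(output: str) -> Tuple[int, str]:
    """Return the (mode number, mode name) an output label belongs to."""
    return OUTPUT_TO_MODE[output]
```

Keeping the reverse index next to the table means a label such as "Historical negationism" can be traced back to its mode in a single lookup.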

“When there is such a volume, we have to ask ourselves what can we do? What can the Internet service providers do? What can vast segments of society do? So that we hold people accountable and create safe spaces online the way we expect those spaces to be in the real world.”

— Oren Segal | Director of ADL's Center on Extremism

We teach machines to help us.

The power of machine learning is that it allows us to analyze thousands of tweets and return hate classifications within milliseconds. The flexibility of our platform allows us to continually adapt our model to constantly evolving terminologies used by hate groups on social media.

Step 1: Build a Machine

We use the Natural Language Processing and Image Recognition APIs of enterprise-level AI platforms to digest and interpret messages as they are posted, in near real time.

Step 2: Train the Machine

Our Machine needs to be good at sniffing out one thing: hate speech. So we need to feed it a stream of hate speech from social media to break down and learn from. We use Spredfast, an intelligent social listening platform, to moderate incoming messages and categorize them into streams of hate speech. Those streams are fed into our Machine on an ongoing basis so it can begin learning the linguistic nuances.
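
To make the idea of learning from labeled streams concrete, here is a toy sketch. It is not our production model: it swaps in a small Naive Bayes classifier written from scratch, and the class name TinyNaiveBayes, the labels "hateful" and "not_hate", and the tiny training set are all assumptions made for illustration. The training fragments are shortened from the example tweets shown further down this page.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace; a real pipeline would do far more."""
    return text.lower().split()


class TinyNaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (illustrative only)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.label_counts = Counter()            # label -> number of examples
        self.vocab = set()

    def train(self, text, label):
        self.label_counts[label] += 1
        for word in tokenize(text):
            self.word_counts[label][word] += 1
            self.vocab.add(word)

    def predict(self, text):
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.label_counts.items():
            # Log prior plus smoothed log likelihood of each word.
            score = math.log(count / total)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for word in tokenize(text):
                score += math.log((self.word_counts[label][word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical moderated stream: (message fragment, moderator label).
stream = [
    ("third world series title in as many years", "not_hate"),
    ("about to have the third world series game 7", "not_hate"),
    ("third-worlders they have to go back", "hateful"),
    ("white people don't break our laws the way third-worlders do", "hateful"),
]

machine = TinyNaiveBayes()
for text, label in stream:
    machine.train(text, label)
```

With only four training fragments this proves nothing about accuracy. It only shows the shape of the process: moderated, labeled messages go in, and a model that can label new messages comes out.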

But even with artificial intelligence, there are challenges in identifying hate speech online.

Machines have trouble understanding the subjectivity and nuance of hate speech. See the examples below, all referencing "third world" in different ways.

Not Hate Speech

Folks like to say baseball is boring but we're about to have the third World Series Game 7 in four years. This one needed to go the distance
— Michael Lee (@MrMichaelLee) November 1, 2021

The year is 2029.

Yankees SS Kevin Maitan just won his forth-consecutive MVP award after leading his team to their third World Series title in as many years.

The Atlanta Braves finished 52-110, their sixth straight losing season.

Braves manager Larry Jones has been fired.
— fdu (@FakeDanUggla) November 21, 2021

Hateful

Not a single white country. White people don't break our laws the way third-worlders do.

They Have To Go Back#DACA #mepolitics #MAGA 🇺🇸 pic.twitter.com/c274AqZawm
— Maine First Media (@MaineFirstMedia) September 18, 2021

Uber Hateful

We defeated Hitler so we could pay for endless third worlders who gang rape our women while bearded “women” demand to be called women and Berlin could pay for a teaching manual that provides instruction for teachers on how to teach gender diversity issues to pre-school children
— ☩ EMPRESS WIFE 👑 (@EmpresWife) February 10, 2022

Our solution was to train our A.I. to understand and grade hate by distinct hate speech categorizations.
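
One way to picture grading by category is sketched below. It assumes a model that returns a confidence score per mode, and it assumes that a higher mode number should be treated as more severe. Both are illustrative assumptions, as are the names MODE_LEVEL and grade and the 0.5 threshold.

```python
from typing import Dict, Optional, Tuple

# Mode numbers from our classifications. Treating a higher number as
# more severe is an assumption made for this sketch.
MODE_LEVEL: Dict[str, int] = {
    "Coded Language": 1,
    "Classification": 2,
    "Dehumanization": 3,
    "Polarization": 4,
    "Intention": 5,
}


def grade(scores: Dict[str, float],
          threshold: float = 0.5) -> Optional[Tuple[str, int]]:
    """Return the highest-numbered mode whose score clears the threshold.

    `scores` maps mode names to hypothetical model confidences in [0, 1].
    Returns None when no known mode reaches the threshold.
    """
    flagged = [
        mode for mode, score in scores.items()
        if mode in MODE_LEVEL and score >= threshold
    ]
    if not flagged:
        return None
    top = max(flagged, key=lambda mode: MODE_LEVEL[mode])
    return top, MODE_LEVEL[top]
```

For example, grade({"Dehumanization": 0.8, "Polarization": 0.6, "Intention": 0.1}) returns ("Polarization", 4): two modes clear the threshold, and the higher-numbered one sets the grade.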

2017 Trends in hate speech categories

Key moments:

February: Travel Ban / Dehumanization
August: Charlottesville Protests / Polarization
September: DACA Debate and NFL Protests / Dehumanization
October: "It's Okay to be White" Movement / Coded Language
November: #KatesWall / Dehumanization

Supervised machine learning.

We Counter Hate is a human-moderated platform.

Our machine learning platform continuously finds hate speech for us to counter, and our moderators continuously give it feedback on what it returns. This loop steadily refines our framework, increasing the reliability of the hate speech we "counter."
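
The feedback loop can be sketched as follows. This is a simplified illustration, not our actual moderation tooling: the Prediction and FeedbackLoop classes and their fields are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Prediction:
    text: str
    predicted_mode: Optional[str]  # None means "not hate speech"


@dataclass
class FeedbackLoop:
    # Moderator-approved examples that would feed the next training run.
    training_set: List[Tuple[str, Optional[str]]] = field(default_factory=list)
    corrections: int = 0

    def review(self, prediction: Prediction,
               moderator_mode: Optional[str]) -> bool:
        """Record a moderator's verdict; return True if the machine agreed."""
        agreed = prediction.predicted_mode == moderator_mode
        if not agreed:
            self.corrections += 1
        # The moderator's label, not the machine's, becomes training data.
        self.training_set.append((prediction.text, moderator_mode))
        return agreed


loop = FeedbackLoop()
first = loop.review(Prediction("message A", "Polarization"), "Polarization")
second = loop.review(Prediction("message B", "Dehumanization"), None)
```

Every reviewed message ends up in the training set with the human label, so both confirmations and corrections can inform the next round of training.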


References:

http://genocidewatch.net/genocide-2/8-stages-of-genocide/

https://www.undispatch.com/the-8-stages-of-genocide-against-burmas-rohingya/

https://www.forbes.com/sites/ewelinaochab/2017/09/02/can-genocide-ever-be-prevented/amp/
