This new method of training artificial intelligence can curb online harassment


For about six months, Nina Nørgaard met with seven people for one hour a week to discuss sexism and violent language used against women on social media. Nørgaard, a doctoral student at the IT University of Copenhagen, and her discussion group were taking part in an unusual effort to better identify online misogyny. The researchers paid the seven to examine thousands of Facebook, Reddit, and Twitter posts and decide whether they showed evidence of sexism, stereotypes, or harassment. Once a week, the researchers brought the group together, with Nørgaard as the facilitator, to discuss the tough calls on which they disagreed.

Misogyny is a scourge that shapes the way women behave online. A 2020 Plan International study, one of the largest surveys of its kind, found that more than half of women in 22 countries said they had been harassed or abused online. One in five women who experienced abuse said they changed their behavior as a result, cutting back on or stopping their use of the internet.

Social media companies use artificial intelligence to identify and delete posts that demean, harass, or threaten violence against women, but it is a tricky problem. Among researchers, there is no standard for identifying sexist or misogynistic posts; one recent paper proposed four categories of troublesome content, while another identified 23. Most research is conducted in English, which leaves people working in other languages and cultures with even less guidance for difficult and often subjective decisions.

So the Danish researchers tried a new approach: they hired Nørgaard and the seven people to review and label posts full-time, instead of relying on part-time contractors paid by the post. They deliberately chose people of different ages, nationalities, and political views to reduce the chance of bias from a single worldview. The labelers included a software designer, a climate activist, an actress, and a health care worker. Nørgaard’s task was to bring them to a consensus.

“The best thing is that they disagree. We don’t want tunnel vision. We don’t want everyone to have the same idea,” Nørgaard said. She said her goal is “to let them have discussions among themselves or between teams.”

Nørgaard saw her job as helping the labelers “find their own answers.” Over time, she got to know each of the seven, including, for example, who talked more than the others. She tried to make sure that no one dominated the conversation, because it was meant to be a discussion, not a debate.

The toughest calls involved posts with irony, jokes, or sarcasm; they became major topics of conversation. However, over time, “meetings have become shorter and shorter, and people have less and less discussions, so I think this is a good thing,” Nørgaard said.

The researchers behind the project call it a success. They say the conversations led to more accurately labeled data for training an artificial intelligence algorithm. The researchers say that an AI model fine-tuned on the data set can identify misogyny on popular social media platforms 85% of the time. A year earlier, the most advanced misogyny detection algorithm was accurate about 75% of the time. In total, the team reviewed nearly 30,000 posts, of which 7,500 were deemed abusive.

The posts were written in Danish, but the researchers say their method can be applied to any language. “I think if you want to annotate misogyny, you have to follow a method that contains at least most of our elements. Otherwise, you will be at risk of low-quality data, which will ruin everything,” said Leon Derczynski, a co-author of the study and an associate professor at the IT University of Copenhagen.

These findings may be useful beyond social media. Companies are beginning to use artificial intelligence to screen job listings and public-facing text, such as press releases, for sexism. If women exclude themselves from online conversations in order to avoid harassment, it will stifle the democratic process.

“If you plan to turn a blind eye to threats and aggression against half of the population, then you will not have the best possible democratic online space,” Derczynski said.

