National Research University Higher School of Economics, Russian Federation

Olessia Koltsova; Sergey Koltcov; Sergey Nikolenko; Svetlana Alexeeva; Oleg Nagorny

Authors

Olessia Koltsova National Research University Higher School of Economics
Sergey Koltcov National Research University Higher School of Economics
Sergey Nikolenko National Research University Higher School of Economics
Svetlana Alexeeva National Research University Higher School of Economics
Oleg Nagorny National Research University Higher School of Economics

Keywords:

social media, ethnicity, monitoring, big data, attitudes

Abstract

The ability of social media to rapidly disseminate judgements on ethnicity to wide publics and to influence offline ethnic relations creates demand for methods of automatic monitoring of ethnicity-related online content (Burnap & Williams 2015). In this study we seek to measure the overall volume of ethnicity-related discussion in Russian-language social media and to develop an approach that automatically detects various aspects of judgements on ethnicity. We develop a comprehensive list of ethnonyms covering 100 post-Soviet ethnic groups and obtain all messages containing at least one of those items from a two-year period from all Russian-language social media (N=2,850,947 texts). We find meaningful regional variation in the volume of attention to different ethnicities. We hand-code a sample of 7,181 messages in which rare ethnicities are over-represented and train a number of classifiers (logistic regressions) to recognize different text features. We reach good quality in detecting the presence of intergroup conflict, positive intergroup contact, and overall negative and positive sentiment, as well as fair quality in predicting the general attitude to an ethnic group. Relevance to the topic of ethnicity is the least well predicted, while some aspects, such as calls for violence against an ethnic group, are not sufficiently present in the data to be predicted. Unlike previous studies (Bessudnov 2016), we find that various Central Asians, not Caucasians, take the lead in negative representation. Caucasians lead in producing their own discourse, which most likely shifts their scores up. Finally, Ukrainians are among the most negatively represented because of the recent military conflict.
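The ethnonym-based selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the lexicon `ETHNONYMS`, the helper names, and the English sample messages are hypothetical, and plain word-boundary matching ignores the inflected forms that Russian-language ethnonyms would require.

```python
import re
from collections import Counter

# Hypothetical toy lexicon: ethnic group -> surface forms.
# The study's actual list covers 100 groups and is not reproduced here.
ETHNONYMS = {
    "Uzbek": ["uzbek", "uzbeks"],
    "Armenian": ["armenian", "armenians"],
}


def build_pattern(lexicon):
    """Compile one case-insensitive, whole-word pattern over all surface forms."""
    forms = sorted({f for fs in lexicon.values() for f in fs}, key=len, reverse=True)
    return re.compile(r"\b(" + "|".join(map(re.escape, forms)) + r")\b", re.IGNORECASE)


def match_groups(text, lexicon, pattern):
    """Return the set of groups whose ethnonyms occur in the text."""
    found = {m.group(1).lower() for m in pattern.finditer(text)}
    return {group for group, forms in lexicon.items() if found & set(forms)}


def filter_messages(messages, lexicon):
    """Keep messages mentioning at least one ethnonym; count messages per group."""
    pattern = build_pattern(lexicon)
    kept, counts = [], Counter()
    for msg in messages:
        groups = match_groups(msg, lexicon, pattern)
        if groups:
            kept.append(msg)
            counts.update(groups)  # each group is counted once per message
    return kept, counts


if __name__ == "__main__":
    sample = ["Uzbeks and Armenians met", "nothing here", "An ARMENIAN cafe"]
    kept, counts = filter_messages(sample, ETHNONYMS)
    print(len(kept), dict(counts))
```

Per-group message counts of this kind are the sort of quantity that could then be compared across regions or ethnic groups.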
