Predicting ethnicity with first names in online social media networks

Big Data and Society 5 (1) (2018)

Abstract

Social scientists increasingly use social media data to illuminate long-standing substantive questions in social science research. However, a key challenge of analyzing such data is their lower level of individual detail compared to highly detailed survey data. This limits the scope of substantive questions that can be addressed with these data. In this study, we provide a method to upgrade individual detail in terms of ethnicity in data gathered from social media via the use of register data. Our research aim is twofold: first, we predict the most likely value of ethnicity, given one's first name, and second, we show how one can test hypotheses with the predicted values for ethnicity as an independent variable while simultaneously accounting for the uncertainty in these predictions. We apply our method to social network data collected from Facebook. We illustrate our approach and provide an example of hypothesis testing using our procedure, i.e., estimating the relation between predicted network ethnic homogeneity on Facebook and trust in institutions. In a comparison of our method with two other methods, we find that our method provides the most conservative tests of hypotheses. We discuss the promise of our approach and pinpoint future research directions.
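The two steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure, which the abstract does not specify in detail. It assumes that register data yield counts of ethnicity per first name, that the point prediction is the most frequent ethnicity for a name, and that uncertainty is carried forward by repeatedly drawing ethnicities from the estimated distribution rather than fixing each person at the most likely value. All names, groups, counts, and function names below are hypothetical.

```python
import random
from collections import Counter, defaultdict

# Hypothetical register extract of (first_name, ethnicity) pairs.
# All names, groups, and counts are invented for illustration.
REGISTER = (
    [("name_1", "group_a")] * 8 + [("name_1", "group_b")] * 2
    + [("name_2", "group_a")] * 3 + [("name_2", "group_b")] * 7
)


def name_distributions(register):
    """Estimate P(ethnicity | first name) from register counts."""
    counts = defaultdict(Counter)
    for name, ethnicity in register:
        counts[name][ethnicity] += 1
    return {
        name: {eth: n / sum(c.values()) for eth, n in c.items()}
        for name, c in counts.items()
    }


def most_likely(dist):
    """Point prediction: the ethnicity with the highest estimated probability."""
    return max(dist, key=dist.get)


def draw(dist, rng):
    """Draw one ethnicity at random according to the estimated probabilities."""
    groups = list(dist)
    return rng.choices(groups, weights=[dist[g] for g in groups])[0]


def homogeneity_draws(ego_name, alter_names, dists, n_draws=1000, seed=0):
    """Share of alters matching the ego's ethnicity, recomputed per random draw.

    The spread of the returned values reflects prediction uncertainty
    (an assumed simulation scheme, not the paper's stated estimator).
    """
    rng = random.Random(seed)
    shares = []
    for _ in range(n_draws):
        ego = draw(dists[ego_name], rng)
        alters = [draw(dists[a], rng) for a in alter_names]
        shares.append(sum(a == ego for a in alters) / len(alters))
    return shares


DISTS = name_distributions(REGISTER)
point = most_likely(DISTS["name_1"])
shares = homogeneity_draws("name_1", ["name_1", "name_2", "name_2"], DISTS)
mean_share = sum(shares) / len(shares)
```

In this sketch, a downstream analysis would use the whole set of simulated homogeneity values, not only their mean, so that uncertain name-based predictions widen the resulting interval instead of being treated as observed data.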


Similar books and articles

Sentiment analysis on online social network. Vijaya Abhinandan - forthcoming - International Journal of Computer Science, Information Technology, and Security.
Four Pillars of Internet Research Ethics with Web 2.0. Barry Rooke - 2013 - Journal of Academic Ethics 11 (4):265-268.
Cybervetting job applicants on social media: the new normal? Jenna Jacobson & Anatoliy Gruzd - 2020 - Ethics and Information Technology 22 (2):175-195.

Analytics

Added to PP
2020-11-24

