How biased is the sample? Reverse engineering the ranking algorithm of Facebook’s Graph application programming interface

Big Data and Society 7 (1) (2020)

Abstract

Facebook research has proliferated in recent years. However, since November 2017, Facebook has limited the maximum number of page posts retrievable through its Graph application programming interface, and there is limited documentation on how these posts are selected. This paper compares two datasets of the same Facebook page, a full dataset obtained before the introduction of the limitation and a partial dataset obtained after, and employs a bootstrapping technique to assess the bias caused by the new limitation. This paper demonstrates that posts with high user engagement, Photo posts, and Video posts are over-represented, while Link posts are under-represented. Top-term analysis reveals significant differences in the most prominent terms between the full and partial datasets. This paper also reverse engineers the new application programming interface’s ranking algorithm to identify the features of a post that affect its odds of being selected. Sentiment analysis reveals significant differences in sentiment word usage between the selected and non-selected posts. These findings have significant implications for the representativeness of research that uses Facebook page data collected after the introduction of the limitation.
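
The bootstrap comparison described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the exact resampling procedure, so everything here is an assumption: the function names (`bootstrap_share_interval`, `representation`), the percentile-interval approach, the resampling with replacement, and the toy data are hypothetical and are not taken from the paper. The idea is to ask how large the share of a given post type would be if posts were drawn at random from the full dataset, and then check whether the share observed in the partial dataset falls outside that range.

```python
import random


def bootstrap_share_interval(full_types, sample_size, post_type,
                             n_boot=2000, alpha=0.05, seed=0):
    """Percentile interval for the share of `post_type` in random samples of
    `sample_size` posts drawn with replacement from the full dataset.
    (Assumed procedure; the paper's own method may differ.)"""
    rng = random.Random(seed)
    shares = []
    for _ in range(n_boot):
        draw = rng.choices(full_types, k=sample_size)
        shares.append(sum(t == post_type for t in draw) / sample_size)
    shares.sort()
    lower = shares[int((alpha / 2) * n_boot)]
    upper = shares[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


def representation(full_types, partial_types, post_type, **kwargs):
    """Label a post type by comparing its observed share in the partial
    dataset with the bootstrap interval expected under random selection."""
    lower, upper = bootstrap_share_interval(
        full_types, len(partial_types), post_type, **kwargs)
    observed = partial_types.count(post_type) / len(partial_types)
    if observed > upper:
        return "over-represented"
    if observed < lower:
        return "under-represented"
    return "consistent with random selection"


# Toy data, invented purely for illustration.
full_types = ["Link"] * 600 + ["Photo"] * 300 + ["Video"] * 100
partial_types = ["Link"] * 60 + ["Photo"] * 100 + ["Video"] * 40

for post_type in ("Link", "Photo", "Video"):
    print(post_type, representation(full_types, partial_types, post_type))
```

The same comparison could be applied to engagement measures by replacing the share of a post type with, for example, a mean or median of a numeric column.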


Similar books and articles

Impact of excessive use of Facebook on the youth of Karachi. Yasmeen Sultana, Sadaf Ghaffar & Samia Saman - 2019 - Journal of Social Sciences and Humanities 58 (2):137-161.
Collision: Fakebook. Rich Andrew - 2012 - Evental Aesthetics 1 (2):49-55.
Contextual gaps: privacy issues on Facebook. Gordon Hull, Heather Richter Lipford & Celine Latulipe - 2011 - Ethics and Information Technology 13 (4):289-302.
