Meaningful measures of human society in the twenty-first century

  • 1.

    Pechenick, E. A., Danforth, C. M. & Dodds, P. S. Characterizing the Google Books Corpus: strong limits to inferences of socio-cultural and linguistic evolution. PLoS ONE 10, e0137041 (2015).


  • 2.

    Dietrich, B. J., Hayes, M. & O’Brien, D. Z. Pitch perfect: vocal pitch and the emotional intensity of congressional speech. Am. Polit. Sci. Rev. 113, 941–962 (2019).


  • 3.

    Dietrich, B. J. Using motion detection to measure social polarization in the U.S. House of Representatives. Polit. Anal. 29, 250–259 (2021).


  • 4.

    Michel, J.-B. et al. Quantitative analysis of culture using millions of digitized books. Science 331, 176–182 (2011). In this study, 4% of all books that have been published were digitized and used to examine changes in phonology, word use and the adoption of new technologies over long periods of time.


  • 5.

    Merton, R. K. in Social Theory and Social Structure 39–72 (Free Press, 1968).

  • 6.

    Watts, D. J. Everything Is Obvious: Once You Know the Answer (Crown Business, 2011).

  • 7.

    Simon, H. A. Bandwagon and underdog effects and the possibility of election predictions. Public Opin. Q. 18, 245–253 (1954).


  • 8.

    Mutz, D. C. Impersonal Influence in American Politics (Cambridge Univ. Press, 1998).

  • 9.

    Westwood, S. J., Messing, S. & Lelkes, Y. Projecting confidence: how the probabilistic horse race confuses and demobilizes the public. J. Polit. 82, 1530–1544 (2020).


  • 10.

    O’Neil, C. Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy (Crown, 2016).

  • 11.

    Obermeyer, Z., Powers, B., Vogeli, C. & Mullainathan, S. Dissecting racial bias in an algorithm used to manage the health of populations. Science 366, 447–453 (2019).


  • 12.

    Landsberger, H. A. Hawthorne Revisited (The New York State School of Industrial and Labor Relations, 1958).

  • 13.

    Mayo, E. The Human Problems of an Industrial Civilization (Routledge, 2004).

  • 14.

    Lazer, D., Kennedy, R., King, G. & Vespignani, A. The parable of Google Flu: traps in big data analysis. Science 343, 1203–1205 (2014). This paper shows that the increasing over-prediction of flu prevalence of Google Flu Trends was largely the result of changes to Google’s search algorithm, which altered the terms that people used to find flu-related information.


  • 15.

    Brunton, F. & Nissenbaum, H. Obfuscation: A User’s Guide for Privacy and Protest (MIT Press, 2015).

  • 16.

    Davis, D. W. The direction of race of interviewer effects among African-Americans: donning the Black mask. Am. J. Pol. Sci. 41, 309–322 (1997).


  • 17.

    American National Election Studies. 1978 Time Series Study https://electionstudies.org/wp-content/uploads/2018/03/anes_timeseries_1978_qnaire_post.pdf (1978).

  • 18.

    Salganik, M. J. Bit by Bit: Social Research in the Digital Age (Princeton Univ. Press, 2017).

  • 19.

    Patty, J. W. & Penn, E. M. Analyzing big data: social choice and measurement. PS Polit. Sci. Polit. 48, 95–101 (2015).


  • 20.

    Kraemer, M. U. G. et al. The effect of human mobility and control measures on the COVID-19 epidemic in China. Science 368, 493–497 (2020).


  • 21.

    Jia, J. S. et al. Population flow drives spatio-temporal distribution of COVID-19 in China. Nature 582, 389–394 (2020).


  • 22.

    Badr, H. S. et al. Association between mobility patterns and COVID-19 transmission in the USA: a mathematical modelling study. Lancet Infect. Dis. 20, 1247–1254 (2020).


  • 23.

    Munger, K. The limited value of non-replicable field experiments in contexts with low temporal validity. Soc. Media Soc. 5, 1–4 (2019).



  • 24.

    Deaton, A. & Cartwright, N. Understanding and misunderstanding randomized controlled trials. Soc. Sci. Med. 210, 2–21 (2018).


  • 25.

    Vraga, E. K., Bode, L., Smithson, A.-B. & Troller-Renfree, S. Accidentally attentive: comparing visual, close-ended, and open-ended measures of attention on social media. Comput. Human Behav. 99, 235–244 (2019).


  • 26.

    Guess, A., Munger, K., Nagler, J. & Tucker, J. How accurate are survey responses on social media and politics? Polit. Commun. 36, 241–258 (2019).


  • 27.

    Aleta, A. et al. Modelling the impact of testing, contact tracing and household quarantine on second waves of COVID-19. Nat. Hum. Behav. 4, 964–971 (2020).


  • 28.

    Echeverría, J. et al. LOBO: evaluation of generalization deficiencies in Twitter bot classifiers. In Proc. 34th Annual Computer Security Applications Conference 137–146 (ACM, 2018).

  • 29.

    Ferrara, E., Varol, O., Davis, C., Menczer, F. & Flammini, A. The rise of social bots. Commun. ACM 59, 96–104 (2016).


  • 30.

    Hughes, A. G. et al. Using administrative records and survey data to construct samples of Tweeters and Tweets. Public Opin. Q. https://doi.org/10.1093/poq/nfab020 (2021).

  • 31.

    Napoli, P. M. Audience Evolution: New Technologies and the Transformation of Media Audiences (Columbia Univ. Press, 2011).

  • 32.

    Yang, T., Majó-Vázquez, S., Nielsen, R. K. & González-Bailón, S. Exposure to news grows less fragmented with an increase in mobile access. Proc. Natl Acad. Sci. USA 117, 28678–28683 (2020). This study tracked the news consumption of users across mobile and desktop devices and found that most individuals do not self-sort their news consumption by partisanship but, instead, consume news from a diversity of sources including partisan and nonpartisan ones.


  • 33.

    Haythornthwaite, C. Exploring multiplexity: social network structures in a computer-supported distance learning class. Inf. Soc. 17, 211–226 (2001).


  • 34.

    Campbell, K. E. & Lee, B. A. Name generators in surveys of personal networks. Soc. Netw. 13, 203–221 (1991).


  • 35.

    Wagner, C. Measuring algorithmically infused societies. Nature https://doi.org/10.1038/s41586-021-03666-1 (2021).

  • 36.

    Healy, K. The performativity of networks. Eur. J. Sociol. 56, 175–205 (2015).


  • 37.

    Rahwan, I. et al. Machine behaviour. Nature 568, 477–486 (2019).


  • 38.

    Neuendorf, K. A. The Content Analysis Guidebook (Sage, 2017).

  • 39.

    Davidov, D., Tsur, O. & Rappoport, A. Semi-supervised recognition of sarcasm in Twitter and Amazon. In Proc. 14th Conference on Computational Natural Language Learning 107–116 (Association for Computational Linguistics, 2010).

  • 40.

    Groves, R. M. Nonresponse rates and nonresponse bias in household surveys. Public Opin. Q. 70, 646–675 (2006).


  • 41.

    Hargittai, E. Potential biases in big data: omitted voices on social media. Soc. Sci. Comput. Rev. 38, 10–24 (2020). Using survey data, this study finds that younger, wealthier and more technically skilled people tend to use social media and that there were substantial gender and education differences in which platforms people used.


  • 42.

    Lazer, D. & Radford, J. Data ex machina: introduction to big data. Annu. Rev. Sociol. 43, 19–39 (2017).


  • 43.

    Correa, T. & Valenzuela, S. A trend study in the stratification of social media use among urban youth: Chile 2009–2019. J. Quant. Descr. Digit. Media 1, https://doi.org/10.51685/jqd.2021.009 (2021).

  • 44.

    Mellon, J. & Prosser, C. Twitter and Facebook are not representative of the general population: political attitudes and demographics of British social media users. Res. Polit. 4, 1–9 (2017).



  • 45.

    Beisch, N. & Schäfer, C. Internetnutzung mit großer Dynamik: Medien, Kommunikation, Social Media. AS&S https://www.ard-werbung.de/media-perspektiven/fachzeitschrift/2020/detailseite-2020/internetnutzung-mit-grosser-dynamik-medien-kommunikation-social-media/ (2020).

  • 46.

    Hargittai, E. & Litt, E. The Tweet smell of celebrity success: explaining variation in Twitter adoption among a diverse group of young adults. New Media Soc. 13, 824–842 (2011).


  • 47.

    Henrich, J., Heine, S. J. & Norenzayan, A. Most people are not WEIRD. Nature 466, 29 (2010).


  • 48.

    Wang, W., Rothschild, D., Goel, S. & Gelman, A. Forecasting elections with non-representative polls. Int. J. Forecast. 31, 980–991 (2015).


  • 49.

    Grinberg, N., Joseph, K., Friedland, L., Swire-Thompson, B. & Lazer, D. Fake news on Twitter during the 2016 U.S. presidential election. Science 363, 374–378 (2019).


  • 50.

    Bakshy, E., Messing, S. & Adamic, L. A. Exposure to ideologically diverse news and opinion on Facebook. Science 348, 1130–1132 (2015).


  • 51.

    Meng, X.-L. Statistical paradises and paradoxes in big data (I): law of large populations, big data paradox, and the 2016 US presidential election. Ann. Appl. Stat. 12, 685–726 (2018).


  • 52.

    Hargittai, E., Füchslin, T. & Schäfer, M. S. How do young adults engage with science and research on social media? Some preliminary findings and an agenda for future research. Soc. Media Soc. 4, 1–10 (2018).



  • 53.

    Blumenstock, J. Don’t forget people in the use of big data for development. Nature 561, 170–172 (2018).


  • 54.

    Battle-Baptiste, W. & Rusert, B. (eds) W. E. B. Du Bois’s Data Portraits: Visualizing Black America (Princeton Architectural Press, 2018).

  • 55.

    Siegel, A. A. et al. Trumping hate on Twitter? Online hate speech in the 2016 US election campaign and its aftermath. Quart. J. Polit. Sci. 16, 71–104 (2021).


  • 56.

    Allen, J., Howland, B., Mobius, M., Rothschild, D. & Watts, D. J. Evaluating the fake news problem at the scale of the information ecosystem. Sci. Adv. 6, eaay3539 (2020).


  • 57.

    Foucault Welles, B. On minorities and outliers: the case for making big data small. Big Data Soc. 1, 1–2 (2014).



  • 58.

    Newman, M. E. J. Power laws, Pareto distributions and Zipf’s law. Contemp. Phys. 46, 323–351 (2005).


  • 59.

    González-Bailón, S. Decoding the Social World: Data Science and the Unintended Consequences of Communication (MIT Press, 2017).

  • 60.

    Stopczynski, A. et al. Measuring large-scale social networks with high resolution. PLoS ONE 9, e95978 (2014).


  • 61.

    Lazer, D. Studying human attention on the Internet. Proc. Natl Acad. Sci. USA 117, 21–22 (2020).


  • 62.

    Aral, S. & Eckles, D. Protecting elections from social media manipulation. Science 365, 858–861 (2019).


  • 63.

    Puschmann, C. & Burgess, J. The politics of Twitter data. HIIG Discussion Paper Series No. 2013-01 http://www.ssrn.com/abstract=2206225 (2013).

  • 64.

    Chen, W. & Quan-Haase, A. Big data ethics and politics: toward new understandings. Soc. Sci. Comput. Rev. 38, 3–9 (2020).


  • 65.

    Breuer, J., Bishop, L. & Kinder-Kurlanda, K. The practical and ethical challenges in acquiring and sharing digital trace data: negotiating public–private partnerships. New Media Soc. 22, 2058–2080 (2020).


  • 66.

    Zook, M. et al. Ten simple rules for responsible big data research. PLOS Comput. Biol. 13, e1005399 (2017).


  • 67.

    Greenberg, A. An absurdly basic bug let anyone grab all of Parler’s data. Wired (12 January 2021).

  • 68.

    Valentino-DeVries, J., Singer, N., Keller, M. H. & Krolik, A. Your apps know where you were last night, and they’re not keeping it secret. The New York Times https://www.nytimes.com/interactive/2018/12/10/business/location-data-privacy-apps.html (10 December 2018).

  • 69.

    Sweeney, L. Simple demographics often identify people uniquely. Privacy Working Paper 3 https://dataprivacylab.org/projects/identifiability/paper1.pdf (Carnegie Mellon University, 2000). Using census data, this paper shows that 87% of the US population could be uniquely identified by date of birth, postal code and gender, demonstrating the ease with which study respondents can be re-identified from ostensibly anonymous data.

  • 70.

    Wood, A. et al. Differential privacy: a primer for a non-technical audience. Vanderbilt J. Entertain. Technol. Law 21, 209–276 (2019).



  • 71.

    Dwork, C. & Roth, A. The algorithmic foundations of differential privacy. Found. Trends Theor. Comput. Sci. 9, 211–407 (2013).


  • 72.

    King, G. & Persily, N. A new model for industry–academic partnerships. PS Polit. Sci. Polit. 53, 703–709 (2020).


  • 73.

    Bruckman, A., Luther, K. & Fiesler, C. in Digital Research Confidential: The Secrets of Studying Behavior Online (eds Hargittai, E. & Sandvig, C.) 243–258 (MIT Press, 2015).

  • 74.

    Marwick, A. E. & boyd, d. Networked privacy: how teenagers negotiate context in social media. New Media Soc. 16, 1051–1067 (2014).


  • 75.

    Bieber, F. R., Brenner, C. H. & Lazer, D. Finding criminals through DNA of their relatives. Science 312, 1315–1316 (2006).


  • 76.

    Zheleva, E. & Getoor, L. To join or not to join: the illusion of privacy in social networks with mixed public and private user profiles. In Proc. 18th International Conference on World Wide Web 531–540 (2009).

  • 77.

    Miller, G. As U.S. election nears, researchers are following the trail of fake news. Science (26 October 2020).

  • 78.

    Merton, R. K. The self-fulfilling prophecy. Antioch Rev. 8, 193–210 (1948).