First Monday

A response to reconciling a media sensation with data



I was recently e–mailed a copy of the Pasek, More, and Hargittai (2009) publication in this month’s issue of First Monday. Mr. Pasek was kind enough to send me the manuscript roughly two days prior to its publication in preparation for what may come. This commentary is meant to discuss my small, exploratory study, and provide a critique of the current study by Pasek and colleagues (2009).

As acknowledged repeatedly in interviews, my exploratory study and subsequent poster presentation were very basic. I merely planned to do this for a conference (i.e., the annual American Educational Research Association conference) to get some ideas and network with more experienced researchers in this area. I wanted to have a dialogue with others who are examining similar phenomena. The media, however, completely sensationalized it. Pasek and colleagues (2009) nonetheless implied that I abetted the media frenzy by only offering “minor caveats,” implying “a causal and directional influence of Facebook use on academic performance,” and stating that “‘… there’s a disconnect between students’ claim that Facebook use doesn’t impact their studies, and our finding showing that they had lower grades.’” Perhaps the last statement could easily be misconstrued, but it was accurate. Indeed, 79 percent of students in my study did not feel that Facebook interfered with their academic performance. My correlational results were not in accord with the students’ claims. If they had been in accord, I would have found no covariation, or a higher percentage of students would have indicated they felt Facebook did indeed have an impact on their academic performance. Stating that there is a disconnect, in this case, does not mean that I was claiming causation. Pasek, et al. (2009) also claimed that I did not implement proper statistical controls and could have produced a spurious relationship by not considering that STEM students, who were more likely to use Facebook, probably had lower GPAs than others. I conducted additional analyses. STEM students had lower GPAs than other students, but the difference was not statistically significant, and including STEM status as a covariate changed neither the significance nor the direction of my results. Therefore, their speculation that STEM status was the culprit for my findings simply was not true in my sample.

In their study, Pasek and colleagues (2009) attempt to “set the record straight,” and “to discern whether or not a relationship indeed exists between Facebook use and grade point averages” by using three “representative” samples and by conducting regression analyses with available covariates. Given their bold claims and promise to set the record straight, it is vital to hold their study and the inferences drawn from the results to a high methodological and statistical standard. Unfortunately, their study has serious methodological and statistical flaws that render the results of their claims extremely suspect.

First, let us review the evidence that they provide to indicate that their samples are representative. The authors first examined a “representative” cross–sectional sample of students from University of Illinois at Chicago (UIC). The authors never state in the manuscript which population this sample represents. If they were referring to the population of first–year UIC students, one would hardly think that this is the target population about which one would draw inferences to “set the record straight.” If they are referring to a population of first–year students nationally, sampling from UIC dramatically over–represents minority, urban, and commuter students. Moreover, whereas I had only one freshman and five sophomore students in my sample, their UIC study consisted entirely of first–year students.

The two other samples used by Pasek, et al. (2009) came from the National Annenberg Survey of Youth (NASY), which recruited respondents through random–digit dialing. Parental permission was required to interview those under 18. Besides asking questions about Internet usage, the in–depth phone interviews were used to gather information about both risky and protective behaviors as well as potential targets of intervention. Perhaps the combination of innocuous questions (e.g., Internet usage) and questions of more sensitive matters (e.g., risky behaviors) contributed to the 45 percent response rate of initially sampled parents who agreed to allow their children to participate. This response rate may be acceptable, as in the CDC study, if one is studying risky behaviors, but when the primary purpose of the study is to investigate the effects of Facebook usage, a 45 percent response rate could be considered unacceptable. The authors provide no evidence to indicate that non–respondents did not differ systematically from participants. If non–respondents do differ systematically from those who are currently in the sample, their absence could influence the results. A more definitive study would entail a sample with a higher response rate, which could result from informing parents that their children will be asked less sensitive questions (e.g., Internet or Facebook usage).

More serious methodological and statistical flaws are present in the manner in which Pasek and colleagues (2009) designed and interpreted their measures and conducted their regression analyses. From the very confusing description of how GPA was coded, it appears that the researchers may have collapsed GPA into a dichotomous 0 to 1 code. In the UIC dataset, by recoding GPA into a dichotomous 0 to 1 variable with 1’s indicating mostly A’s, they found that 76 percent of students were coded as having a high GPA. This coding method discarded true variability in GPA, increasing measurement error and decreasing statistical power. The authors were considerably less clear about how GPA was coded in the NASY data. The authors state in their cryptic description, “In the NASY studies, GPA was coded on a four–point scale from ‘D or less’ (0) to ‘A’ (1).” In Table 1, the authors report the GPA means on a 4–point scale, but their footnote indicates that “GPA is used as a 0–1 variable in the text” (which is an indication of what was done in the analyses as well). This coding is particularly damaging in the NASY panel analysis, where the researchers were attempting to reflect year–to–year GPA change on a dichotomous variable.
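
The loss of information from dichotomizing can be illustrated with a minimal simulation sketch in Python. Everything in it is a hypothetical assumption chosen for illustration (a true correlation of -0.3 between a usage score and GPA, 5,000 simulated students, standard normal scores). Only the 76 percent share coded as high GPA is taken from the description above. It is not a reanalysis of either study's data.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation computed from first principles."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def simulate(n=5000, true_r=-0.3, share_high=0.76, seed=0):
    """Compare the correlation before and after dichotomizing GPA.

    All parameter values are hypothetical, chosen only for illustration.
    """
    rng = random.Random(seed)
    usage, gpa = [], []
    for _ in range(n):
        u = rng.gauss(0, 1)
        noise = rng.gauss(0, 1)
        # Construct GPA so its population correlation with usage is true_r.
        g = true_r * u + math.sqrt(1 - true_r ** 2) * noise
        usage.append(u)
        gpa.append(g)
    # Choose the cutoff so roughly share_high of students are coded 1.
    cutoff = sorted(gpa)[int(n * (1 - share_high))]
    gpa_binary = [1 if g >= cutoff else 0 for g in gpa]
    return pearson(usage, gpa), pearson(usage, gpa_binary)


r_continuous, r_binary = simulate()
print(f"continuous GPA:   r = {r_continuous:.3f}")
print(f"dichotomized GPA: r = {r_binary:.3f}")
```

Under these assumptions the correlation with the dichotomized variable is weaker in magnitude than the correlation with the continuous one, which is the attenuation that costs statistical power.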

Pasek, et al. (2009) then conducted covariate regression analyses “to ensure that these results were not spurious.” Based on the manuscript that I received, nowhere in the paper do they state the type of regression analyses that they used. If GPA was dichotomized, logistic regression would have been most appropriate to mitigate any potential threats to statistical conclusion validity; however, if most students fell into one of the two categories, it is unlikely that GPA would be correlated with any other variable. If, on the other hand, GPA is an ordinal variable with four categories in their study, then the ordinal nature of the dependent variable warrants the consideration of other regression analyses such as the cumulative odds model, the continuation ratio model, the ordered probit model, or the stereotype model. Although it is common practice to pretend there is an interval scale underlying ordinal data, more appropriate ordinal regression models are specifically tailored to examine such data. Additionally, no theoretical justification was offered for why they decided to include gender, ethnicity, parental education (in the UIC study), imputed income by zip code (in the NASY studies), and education level (also in the NASY studies) as covariates. Sprinkling in available covariates without a rationale can itself produce spurious results. Because the authors did not provide the customary supporting information to decipher the effects (e.g., a zero–order correlation matrix among the variables), it is really difficult to determine whether their results are meaningful or a statistical artifact.
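
For readers less familiar with ordinal models, the following minimal Python sketch shows how the cumulative odds model turns a predictor into probabilities for four ordered GPA categories, using logit P(Y <= j | x) = theta_j - beta * x. The thresholds, the coefficient, and the 0/1 Facebook indicator are all hypothetical values invented for illustration. They are not estimates from either study, and the sketch only evaluates the model rather than fitting it to data.

```python
import math


def logistic(z):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-z))


def category_probabilities(x, beta, thresholds):
    """Category probabilities under a cumulative odds model.

    The model is logit P(Y <= j | x) = thresholds[j] - beta * x,
    where thresholds must be strictly increasing.
    """
    cumulative = [logistic(t - beta * x) for t in thresholds] + [1.0]
    probs = []
    previous = 0.0
    for c in cumulative:
        # P(Y = j) is the difference of adjacent cumulative probabilities.
        probs.append(c - previous)
        previous = c
    return probs


# Hypothetical parameters: four ordered categories need three thresholds.
labels = ["D or less", "C", "B", "A"]
thresholds = [-2.0, -0.5, 1.0]
beta = -0.4  # hypothetical: a negative value shifts users toward lower grades

nonuser = category_probabilities(0, beta, thresholds)
user = category_probabilities(1, beta, thresholds)

for label, p0, p1 in zip(labels, nonuser, user):
    print(f"{label:>9}: non-user {p0:.3f}, user {p1:.3f}")
```

A single coefficient shifts all cumulative probabilities together, which is the proportional odds assumption. The other models named above relax or replace that assumption in different ways.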

Using the NASY panel data, the authors attempted to examine GPA change as a function of Facebook usage. Such an analysis strongly implies a temporal sequence approximating the following: (1) A lagged prior–to–Facebook–use GPA is obtained; (2) Facebook use and nonuse is assessed; and, (3) A post–GPA score is collected. However, as the authors described in the NASY panel study, participants were simply asked if they currently use Facebook. From such limited data, it is impossible to establish the temporal sequence implied by the authors. Again, adjusting for apparent “prior” GPA is potentially misleading. Many Facebook users could have already been using Facebook prior to the “prior” GPA. This exemplifies the misuse of covariates without any theoretical or methodological rationale.

Research is a process of examination, peer review, replication, and many other steps along the way. Beginning at the exploratory end of the research spectrum is good practice and should not be dismissed. Acknowledgement of limitations is a component of responsible research, which I have done repeatedly in the past few months, and especially the past few weeks. I am fully aware of the limitations of my study, and merely want personnel at universities, researchers, parents, students, and others to think about this intricate relationship (or lack thereof) and start a dialogue. Indeed, the authors did “set the record straight,” that is, the questioned relationship between Facebook and GPA should still be under investigation using a more rigorous experimental design. Neither my study nor their study sets any record straight.


About the author

Aryn C. Karpinski is a doctoral student in measurement at The Ohio State University in the College of Education and Human Ecology, in the School of Educational Policy and Leadership, in the Quantitative Research, Evaluation, and Measurement section. She has publications in Sleep Medicine and most recently in the Journal of Applied Developmental Psychology (in press). She has presented at meetings for the American Educational Research Association, American Evaluation Association, SLEEP, and the Society for Research in Child Development. Aryn is currently a graduate teaching associate and also has held graduate research associate positions.


Editorial history

Paper received 30 April 2009; accepted 30 April 2009.

Creative Commons License
“A response to reconciling a media sensation with data” by Aryn C. Karpinski is licensed under a Creative Commons Attribution 3.0 United States License.

A response to reconciling a media sensation with data
by Aryn C. Karpinski
First Monday, Volume 14, Number 5 - 4 May 2009