
Editor's Introduction
Reducing the Gender Achievement Gap in College Science: A Classroom Study of Values Affirmation
There are many stereotypes about the differences between women and men that are widely shared. For example, many people believe that men are better at physics than women. There are more men than women in science, technology, engineering and math professions, and women who pursue these fields in college tend to perform worse than their male colleagues. However, research shows that there is no difference in talent between men and women, so why do women continue to face barriers to success? Is there a way to close this “gender gap” in physics and empower women to perform to their full potential?
Paper Details
Abstract
In many science, technology, engineering, and mathematics disciplines, women are outperformed by men in test scores, jeopardizing their success in science-oriented courses and careers. The current study tested the effectiveness of a psychological intervention, called values affirmation, in reducing the gender achievement gap in a college-level introductory physics class. In this randomized double-blind study, 399 students either wrote about their most important values or not, twice at the beginning of the 15-week course. Values affirmation reduced the male-female performance and learning difference substantially and elevated women's modal grades from the C to B range. Benefits were strongest for women who tended to endorse the stereotype that men do better than women in physics. A brief psychological intervention may be a promising way to address the gender gap in science performance and learning.
Report
The substantial underrepresentation of women in science, technology, engineering, and mathematics (STEM) disciplines has long concerned policy-makers and the educational community (1, 2). In 2006, women earned only 28% of Ph.D.s in physical sciences, 25% in mathematics and computer science, and 20% in engineering in the United States (3). Although women made up 47% of the North American workforce in 2009, the percentage of women in lucrative technical professions, such as “computer and mathematical occupations” and “architecture and engineering occupations,” reached only 25% and 14%, respectively (4). Similar underrepresentation of women in STEM-related professions is also evident in other parts of the world (5).
The gender gap in STEM disciplines goes beyond the limited representation of women. In college physics—the field studied in the present investigation—women earn lower exam grades and lower scores on standardized tests of conceptual mastery (6, 7). Students’ prior background and preparation in mathematics and physics, identified as a major contributor to performance in introductory physics (8), can account for as much as 60% of the gender gap in exam performance at our institution, the University of Colorado, but background and preparation do not fully account for the gap (9).
Because gender achievement gaps have important educational and societal implications, several attempts have been made to reduce them. For example, in physics, interactive techniques such as peer instruction (10), where students discuss the answers to conceptual questions in small groups during lectures, and curricular materials, such as tutorials in introductory physics (11) and context-rich problems (12), can reduce the gender gap in college physics classrooms (13). Larger-scale attempts to reduce the gender gap in physics include restructuring the entire physics course (7, 14, 15) or introducing mentoring programs focused on women (16).
Although some of these attempts have reduced gender gaps, interventions in science education have focused mostly on instructional methods. Many have not taken into account social-psychological and cognitive processes that lead to gender differences in performance and learning. One such process involves psychological threat tied to the groups with which a person identifies. The fear of being devalued based on a group identity, such as becoming aware that one could be seen in light of a negative stereotype about one’s group, has been shown to undermine performance on difficult tests (17). For example, women’s performance on difficult math and science tests can suffer insofar as they worry that their poor performance could be seen to confirm a negative gender stereotype (18, 19). Although such identity threat has been shown to affect performance in lab experiments and classrooms (20, 21), attempts to reduce identity threat in authentic classroom contexts have been limited (22, 23).
Here, we report a large-scale classroom study that tested the effectiveness of a theoretically motivated psychological intervention, called values affirmation (24, 25), in reducing the gender gap in exam scores and in a standardized test of conceptual physics. Values affirmation could provide a complementary yet comparatively simple strategy to address gender differences in performance and learning in college science classrooms. A novel feature of this study is that it provides a formal assessment of the degree of student learning over a semester.
Physics is a challenging subject for many college students. Learning the material in lectures and textbooks and demonstrating understanding on exams put substantial pressure on students. Because of the stereotype that men are better than women at math and science (26), such pressure can prove more severe for women than men, especially for women who think that the stereotype might be valid and worry that it could apply or be applied to them (20, 27). Values affirmation, in which people reflect on self-defining values, can buffer people against such psychological threat. When they affirm their core values in a threatening environment, people reestablish a perception of personal integrity and worth, which in turn can provide them with the internal resources needed for coping effectively (24, 25, 28). Indeed, lab studies show that such affirmations lessen evaluative stress (29) and improve the performance of stereotype-threatened individuals (30).
The values-affirmation intervention used in this study involves writing about personally important values (such as friends and family). The writing exercise is brief (10 to 15 min) and is unrelated to the subject matter of the course. Nevertheless, it has been found effective in improving the grades of ethnic minority middle-school students and closing the racial achievement gap (23). Moreover, this benefit persisted in a 2-year follow-up study (31).
In this study, we applied the intervention to an entirely different context: the gender gap in college-level science. We tested whether values affirmation would reduce the gender achievement gap in a 15-week introductory physics course for STEM majors. Because the course had already implemented pedagogical practices aimed at lessening gender gaps (10, 11), this setup provided a strong test of the effectiveness of values affirmation to further reduce the gap. Moreover, compared to the African-American middle-school students in previous classroom intervention studies (23, 31), these students were relatively high-achieving (taking college physics and most planning to be STEM majors) and were from more-advantaged backgrounds.
In this randomized double-blind study, 399 students (283 men and 116 women) were randomly assigned to either the values-affirmation group or the control group (32). Students in the affirmation group selected their most important values from a list (such as relationships with friends and family or learning or gaining knowledge) and, in response to structured prompts, wrote about why these values were important to them. Students in the control group selected their least important values from the same list and wrote why these values might be important to other people. Thus, both groups wrote about values and their importance, but the exercise was self-relevant only for the affirmation group. This 15-min writing exercise was integrated into the class and was given once in the first recitation of the semester (week 1) and once in an online homework assignment (week 4) shortly before the first midterm exam (week 5). Each student was assigned to the same condition at both administrations. The course instructor and teaching assistants were unaware of students’ condition assignments, and the teaching assistants and students were unaware of the purpose of the writing exercises. As part of an online survey typically given in the course (week 2), students also indicated their endorsement of the stereotype that men perform better than women in physics.
The main outcome measure was scores on in-class exams (three midterms and one final) that consisted entirely of objective multiple-choice questions and constituted 75% of the final course grade. To assess the effect of values affirmation on learning, we also examined scores on an objective, nationally normed standardized test of conceptual physics [the Force and Motion Conceptual Evaluation (FMCE)] (33). Historically, men have substantially outperformed women on exams and the FMCE (9) in this course. In semesters when exams are heavily weighted (as in the semester of the present study), course grades have also shown a gender gap (9).
We predicted a reduced gender gap in performance for women who completed the values affirmation. Moreover, because people who endorse negative stereotypes about their group are most vulnerable to identity threat (20, 27), we expected the intervention to be particularly beneficial for women tending to endorse the gender stereotype.
The values-affirmation intervention succeeded in reducing the gender gap in performance. As shown in Fig. 1A, men outperformed women in the control condition on overall exam scores (the average of the percent correct for the four exams). However, the gender gap was significantly smaller in the affirmation condition than in the control condition, resulting in a significant gender × condition interaction [β = 0.16, t(387) = 3.08, P < 0.01] (34). The effect size for the observed gender gap was substantial in the control condition (Cohen’s d = 0.93) [F(1, 387) = 36.71, P < 0.01] but much smaller in the affirmation condition (d = 0.18) [F(1, 387) = 2.35, P = 0.13]. Course grades, based substantially (75%) on the exam scores, showed a similar pattern [supporting online material (SOM) text].
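To make the effect-size measure concrete, here is a minimal sketch of one common way to compute Cohen's d, the difference between two group means divided by the pooled standard deviation. The function name `cohens_d` and the scores are hypothetical and invented for illustration; the paper does not state which exact formula was used, and its reported values come from covariate-adjusted analyses that this sketch does not reproduce.

```python
import math
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled sample standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    # Pooled variance weights each group's sample variance by its degrees of freedom.
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)


# Hypothetical exam percentages, NOT data from the study.
men_control = [78.0, 82.0, 74.0, 80.0, 76.0]
women_control = [70.0, 74.0, 66.0, 72.0, 68.0]

d = cohens_d(men_control, women_control)
print(f"Cohen's d = {d:.2f}")
```

By a common rule of thumb, values near 0.2 are read as small, 0.5 as medium, and 0.8 or above as large, which is why the paper describes d = 0.93 as substantial and d = 0.18 as much smaller.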

Exam scores
Panel A shows the average exam score for men and women who did not receive the values affirmation intervention (control) and those who did. The gap in scores between men and women decreased in the values affirmation condition.
Panel B shows the average score on the end-of-semester Force and Motion Conceptual Evaluation (a national exam) for men and women in the control and values affirmation conditions. Again, values affirmation closed the gap between men and women.
Covariate adjustment
Students' scores on in-class exams covaried with their scores on the SATs or ACTs. This means that students who did well on the SATs or ACTs tended to do well on exams and students who did poorly on the SATs or ACTs tended to do worse on the exams.
The researchers used mathematical regression to compute adjusted scores that took into account each student's background performance (that is, their SAT or ACT score). The researchers used a similar procedure to adjust final FMCE scores based on each student's initial FMCE score (taken at the start of the course).
Covariate adjustment allows the researchers to remove any effects caused by covariation of the outcome measure with the other measures and just assess differences caused by the intervention.
For a more in-depth discussion on covariate adjustment, read more at TSG.
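The idea behind covariate adjustment can be sketched with a single covariate. The example below fits a least-squares line predicting exam score from a background score and then shifts each student's exam score to what it would be at the average background score. The function name `adjust_for_covariate` and all numbers are hypothetical; the researchers fit full regression models (described in the SOM), so this is a simplified illustration rather than their procedure.

```python
from statistics import mean


def adjust_for_covariate(outcome, covariate):
    """Return outcome scores adjusted to the mean covariate value (simple linear regression)."""
    x_bar, y_bar = mean(covariate), mean(outcome)
    # Least-squares slope of the outcome on the covariate.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(covariate, outcome))
    sxx = sum((x - x_bar) ** 2 for x in covariate)
    slope = sxy / sxx
    # Subtract the part of each score predicted by the student's distance
    # from the average covariate value.
    return [y - slope * (x - x_bar) for x, y in zip(covariate, outcome)]


# Hypothetical background (SAT Math) and exam scores, NOT data from the study.
sat_math = [600.0, 650.0, 700.0, 750.0]
exam_pct = [60.0, 70.0, 72.0, 86.0]

adjusted = adjust_for_covariate(exam_pct, sat_math)
print(adjusted)
```

After adjustment, the scores keep the same overall mean but no longer vary linearly with the background score, so remaining group differences cannot be attributed to a linear effect of that covariate.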
Although the second affirmation exercise was completed shortly before the first midterm exam, its benefits were not confined to that exam. The reduction in the gender gap remained evident on the final cumulative exam (table S1), as indicated by a significant gender × condition interaction [β = 0.19, t(387) = 3.45, P < 0.01; affirmation effect for women, F(1, 387) = 12.49, P < 0.01], even though no additional values-affirmation exercises took place beyond the fourth week of class.
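A gender × condition interaction asks whether the size of the gender gap depends on the condition. In a two-by-two design with 0/1 coding and no covariates, the unstandardized interaction coefficient equals a difference of differences between cell means, which the sketch below computes. The function name `interaction_contrast`, the dictionary layout, and the scores are hypothetical; the β values in the paper are standardized coefficients from models that also control for prior test scores, so this is only an unadjusted analogue.

```python
from statistics import mean


def interaction_contrast(cells):
    """Return (gap in control, gap in affirmation, difference between the two gaps).

    Each gap is the mean score for men minus the mean score for women.
    """
    gap_control = mean(cells[("men", "control")]) - mean(cells[("women", "control")])
    gap_affirm = mean(cells[("men", "affirmation")]) - mean(cells[("women", "affirmation")])
    return gap_control, gap_affirm, gap_control - gap_affirm


# Hypothetical exam percentages per cell, NOT data from the study.
cells = {
    ("men", "control"): [78.0, 82.0, 74.0],
    ("women", "control"): [68.0, 72.0, 64.0],
    ("men", "affirmation"): [76.0, 80.0, 72.0],
    ("women", "affirmation"): [73.0, 77.0, 72.0],
}

gap_control, gap_affirm, contrast = interaction_contrast(cells)
print(gap_control, gap_affirm, contrast)
```

A contrast near zero would mean the gap is about the same in both conditions; a clearly positive contrast, as in these made-up numbers, means the gap is smaller under affirmation.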
The distribution of final letter grades indicated that values affirmation was particularly effective in improving women's performance from average (C) to above average (B). As shown in Fig. 2, women in the two groups differed primarily in the B-to-C range; more women earned B’s in the affirmation group than in the control group, whereas more women earned C’s in the control group than in the affirmation group [χ²(1, N = 91) = 4.07, P = 0.04]. There was no such difference in grade distribution for men [χ²(1, N = 202) = 0.02, P = 0.88].

Overall grades
Women got many more Cs than Bs in the course in the control condition. In comparison, women in the values affirmation condition were similarly likely to get a C or a B. This suggests that values affirmation helps push women from Cs to Bs overall.
Chi-square statistics
The researchers used the chi-square test to determine whether the distribution of B and C grades differed between women in the control condition and women in the values affirmation condition.
Learn more about the Chi-Square Statistic at Khan Academy: https://www.khanacademy.org/math/statistics-probability/inference-categorical-data-chi-square-tests/chi-square-goodness-of-fit-tests/v/chi-square-statistic
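As a rough illustration of how such a test works, the sketch below computes a Pearson chi-square statistic for a two-by-two table of counts (condition by letter grade) and its P value for 1 degree of freedom. The function name `chi_square_2x2` and the counts are hypothetical; they are not the counts behind Fig. 2, and the sketch assumes no continuity correction, which the paper does not specify.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2x2 table (no continuity correction)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if grade were independent of condition.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # With 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts, NOT data from the study.
# Rows: affirmation, control. Columns: number of B grades, number of C grades.
counts = [[20, 10],
          [10, 20]]

chi2, p_value = chi_square_2x2(counts)
print(f"chi-square = {chi2:.2f}, P = {p_value:.3f}")
```

The larger the gap between observed and expected counts, the larger the statistic and the smaller the P value.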
Reimagining the data
What would the grade distributions look like for women in the values affirmation condition if values affirmation did not close the gender gap in overall course grades (assume performance for all other groups matches what is presented in the figure)?
What would the distribution look like if the effect of values affirmation was something other than causing a shift from Cs to Bs (for example, if it caused a global increase in scores, or it caused a shift from Bs to As)?
The benefit of values affirmation for women was also observed on the FMCE. It was administered twice as part of the course, once at the beginning of the semester (week 1) and once at the end (week 15), to assess learning of physics concepts over the semester. Students were explicitly told that their performance would not influence their grades. Although there was no main or interactive effect of condition on the beginning-of-semester score [t’s < 1, NS], there was an effect at the end of the semester. Figure 1B illustrates the end-of-semester FMCE scores, controlling for beginning-of-semester scores to isolate effects on learning. The gender gap in the learning of physics concepts was substantial in the control condition (d = 0.46) [F(1, 304) = 6.23, P = 0.01], indicating that men improved their FMCE scores more than women over the semester. In the affirmation condition, however, this gender learning gap entirely disappeared (d = –0.12) [F(1, 304) = 0.96, P = 0.33], resulting in a significant gender × condition interaction [β = 0.12, t(296) = 2.13, P = 0.03]. That the benefit of affirmation was evidenced on the end-of-semester FMCE scores with beginning-of-semester scores controlled suggests that the intervention facilitated women’s learning of scientific concepts over the semester (SOM text).
Unexpectedly, affirmation negatively affected men’s exam scores (Fig. 1A), but, unlike the positive effect for women, this effect was not predicted, was not replicated for the end-of-semester FMCE score (Fig. 1B), and did not change men’s letter grade distribution (Fig. 2) (35). In contrast, the affirmation’s positive effect on women was significant for all outcome variables (SOM text), suggesting that the reduced gender gap observed in this study is based more robustly on the affirmation’s positive impact on women than on its negative impact on men.
Finally, the values affirmation was particularly beneficial for women who tended to endorse the gender stereotype. This moderation effect is illustrated in Fig. 3, with the exam (Fig. 3A) and end-of-semester FMCE (Fig. 3B) data. Although women as a group did not strongly endorse the negative gender stereotype (20), even a moderate level of stereotype endorsement was costly for women in the control condition, with their exam scores decreasing as a function of stereotype endorsement [β = –0.50, t(387) = –3.29, P < 0.01] (Fig. 3A). Affirmation, however, buffered women against this identity threat, eliminating the negative relation between stereotype endorsement and exam scores [β = 0.12, t(387) = 0.94, P = 0.35]. Moreover, among women expressing higher levels of stereotype endorsement (defined as 0.75 SDs above the mean here), affirmation improved the exam scores relative to the control condition [t(115) = 3.04, P < 0.01]. In contrast, men's exam scores were little affected by stereotype endorsement, regardless of condition [β = –0.08, t(387) = –1.70, P = 0.09, for the affirmation condition, and β = –0.07, t(387) = –0.92, P = 0.36, for the control condition]. These differential patterns for men and women resulted in a gender × condition × stereotype endorsement interaction [β = 0.16, t(387) = 2.74, P < 0.01].

Average exam scores
Panel A shows that average exam scores did not change for men regardless of their level of endorsement of the stereotype that men are better than women at physics.
In contrast, women who believed the gender stereotype did much worse than women who did not believe the stereotype.
Values affirmation eliminated this effect in women.
National exam score
Panel B shows that endorsement of the gender stereotype did not affect men's performance on the FMCE. Women who believed the gender stereotype did worse on the FMCE than those who did not believe the stereotype. Values affirmation eliminated this effect.
Types of variables
There are many types of variables, which largely fall into two "buckets": continuous and categorical. Stereotype endorsement, as used in this figure, is treated as a continuous variable in the researchers' statistical analysis. Read about more types of variables in this resource from the University of Minnesota.
Error
Error bars in the plot represent the standard error of the mean.
Learn more about standard error at Khan Academy: https://www.khanacademy.org/math/ap-statistics/sampling-distribution-ap/sampling-distribution-mean/v/standard-error-of-the-mean
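For readers who want to see the calculation behind such error bars, here is a minimal sketch of the standard error of the mean: the sample standard deviation divided by the square root of the sample size. The function name `standard_error` and the scores are hypothetical and are not taken from the study.

```python
import math
from statistics import stdev


def standard_error(scores):
    """Standard error of the mean: sample standard deviation over sqrt(n)."""
    return stdev(scores) / math.sqrt(len(scores))


# Hypothetical exam percentages for one group, NOT data from the study.
scores = [70.0, 74.0, 66.0, 72.0, 68.0]

se = standard_error(scores)
print(f"mean = {sum(scores) / len(scores):.1f}, standard error = {se:.2f}")
```

Because the sample size sits in the denominator, larger groups give smaller standard errors and therefore shorter error bars for the same spread of scores.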
A similar three-way interaction was found for the FMCE scores (Fig. 3B) [β = 0.15, t(296) = 2.45, P = 0.02]. Among women, there was a negative relationship between stereotype endorsement and end-of-semester FMCE scores in the control condition [β = –0.39, t(296) = –2.55, P = 0.01], but not in the affirmation condition [β = 0.22, t(296) = 1.54, P = 0.13]. Moreover, among women with higher levels of stereotype endorsement, the end-of-semester FMCE scores were higher in the affirmation condition than in the control condition [t(115) = 3.01, P < 0.01]. No such relationship was observed for men in either the control condition [β = –0.13, t(296) = –1.33, P = 0.18] or the affirmation condition [β = –0.10, t(296) = –1.41, P = 0.16].
Overall, these results suggest that values affirmation is a promising intervention that can help reduce the gender achievement gap in physics. Although the intervention was brief and did not directly concern the course material, it nonetheless provided a meaningful boost for women, especially women who tended to endorse the gender stereotype, on two objective measures: in-class multiple-choice exams and a standardized test of conceptual mastery (FMCE). Moreover, the results on the end-of-semester FMCE provide evidence that identity threat and affirmation affect the learning of scientific concepts, not just performance (SOM text). After controlling for prior background (prior SAT/ACT Math or beginning-of-semester FMCE scores), the affirmation closed the “residual” gender gap on in-class exam scores by approximately 61% and entirely eliminated the gap on the FMCE. Although further efforts must aim to close the gap in prior preparation, the current results are promising in demonstrating that a brief psychological intervention can help close the residual gap, a problem of long-standing concern in science (9).
The introductory physics class we investigated was intended for STEM majors who have had success in STEM-related subjects before college and are motivated to do well in the course. Our results, therefore, demonstrate that, even among women who are relatively identified with and accomplished in science, a substantial gender gap exists, women’s performance is negatively related to stereotype endorsement, and gender differences can be reduced with a values-affirmation intervention.
Although previous attempts to reduce the gender achievement gap in science have focused mostly on instructional methods, the current results highlight the importance of social-psychological factors. One virtue of the affirmation is that it can be combined with instructional approaches that show promise in closing the gender gap, such as the interactive engagement approaches used in the present course (10). However, there is no reason to think that the effects of affirmation are confined to situations in which such instructional supports are already in place, given that the intervention was successful in reducing racial achievement differences among middle-school students in traditional classrooms (23, 31). Of course, even here, there were structural opportunities for learning in the form of a solid curriculum and qualified teachers; without such basic support, the efficacy of any psychological intervention would be limited (23).
Another virtue of the values-affirmation intervention is that it is not tied to a specific discipline. The psychological phenomenon that the intervention targets—identity threat—can prove an obstacle in other STEM disciplines where the underrepresentation and underperformance of women have been evident. Thus, the intervention has potential for broad applicability in many STEM disciplines.
Finally, the benefits of the affirmation may be long-lasting (25, 31) and could persist beyond the present course. The experience of lower identity threat, coupled with better mastery in a challenging science course, may encourage affirmed women to take other STEM courses and to pursue further education and even a career in STEM disciplines. More generally, the cumulative consequences of early performance—small differences at an early stage can get magnified over time—help explain how relatively brief interventions, when given early in a threatening environment, can have long-term effects. This snowballing effect may be particularly important in science, where subsequent learning builds on an earlier foundation of knowledge, making it increasingly difficult to catch up and enter a discipline later. Therefore, it may be important to intervene in gateway courses such as introductory physics. These courses are required for STEM majors, and performance in them can set long-term academic and career trajectories. Reducing the gender gap at gateways could not only benefit women’s performance in the short term but also encourage them to choose and persist in a scientific major and career path in STEM disciplines.
Supporting Online Material
www.sciencemag.org/cgi/content/full/330/6008/1234/DC1
Materials and Methods
SOM Text
Table S1
REFERENCES AND NOTES
1. National Academy of Sciences, Rising Above the Gathering Storm: Energizing and Employing America for a Brighter Future (National Academy Press, Washington, DC, 2005).
2. National Science Board, The Science and Engineering Workforce: Realizing America’s Potential [NSF03–69, National Science Foundation (NSF), Arlington, VA, 2003].
3. NSF, Division of Science Resources Statistics, Science and Engineering Degrees: 1966–2006 (detailed statistical tables in NSF 08-321, NSF, Arlington, VA, 2008); available at www.nsf.gov/statistics/nsf08321/.
4. U.S. Bureau of Labor Statistics (2009); available at ftp://ftp.bls.gov/pub/special.requests/lf/aat11.txt.
5. Anita Borg Institute for Women and Technology, The State of Women and Technology Fields Around the World; available at http://anitaborg.org/files/womenhightechworld.pdf.
6. S. J. Pollock, N. D. Finkelstein, L. E. Kost, Phys. Rev. Spec. Top. Phys. Ed. Res. 3, 010107 (2007).
7. E. Brewe et al., Phys. Rev. Spec. Top. Phys. Ed. Res. 6, 010106 (2010).
8. Z. Hazari, R. H. Tai, P. M. Sadler, Sci. Ed. 91, 847 (2007).
9. L. E. Kost, S. J. Pollock, N. D. Finkelstein, Phys. Rev. Spec. Top. Phys. Ed. Res. 5, 010101 (2009).
10. E. Mazur, Peer Instruction: A User’s Manual (Prentice-Hall, Upper Saddle River, NJ, 1997).
11. L. C. McDermott, P. S. Schaffer, Tutorials in Introductory Physics (Prentice-Hall, Upper Saddle River, NJ, 2002).
12. P. Heller, R. Keith, S. Anderson, Am. J. Phys. 60, 627 (1992).
13. M. Lorenzo, C. Crouch, E. Mazur, Am. J. Phys. 74, 118 (2006).
14. P. Laws, P. Rosborough, F. Poodry, Am. J. Phys. 67, S32 (1999).
15. S. Brahmia, E. Etkina, J. Coll. Sci. Teach. 31, 183 (2001).
16. M. Schneider, Phys. Teach. 39, 280 (2001).
17. C. M. Steele, S. J. Spencer, J. Aronson, in Advances in Experimental Social Psychology, M. P. Zanna, Ed. (Academic Press, San Diego, CA, 2002), pp. 183–242.
18. S. J. Spencer, C. M. Steele, D. M. Quinn, J. Exp. Soc. Psychol. 35, 4 (1999).
19. S. L. Beilock, R. J. Rydell, A. R. McConnell, J. Exp. Psychol. Gen. 136, 256 (2007).
20. T. Schmader, M. Johns, M. Barquissau, Sex Roles 50, 835 (2004).
21. T. Schmader, M. Johns, C. Forbes, Psychol. Rev. 115, 336 (2008).
22. C. Good, J. Aronson, M. Inzlicht, Appl. Dev. Psychol. 24, 645 (2003).
23. G. L. Cohen, J. Garcia, N. Apfel, A. Master, Science 313, 1307 (2006).
24. C. M. Steele, in Advances in Experimental Social Psychology, L. Berkowitz, Ed. (Academic Press, New York, 1988), pp. 261–302.
25. G. L. Cohen, J. Garcia, Curr. Dir. Psychol. Sci. 17, 365 (2008).
26. L. L. Schiebinger, Has Feminism Changed Science? (Harvard Univ. Press, Cambridge, MA, 2001).
27. J. Aronson, M. Inzlicht, Psychol. Sci. 15, 829 (2004).
28. D. K. Sherman, G. L. Cohen, in Advances in Experimental Social Psychology, M. P. Zanna, Ed. (Academic Press, San Diego, CA, 2006), pp. 183–242.
29. J. D. Creswell et al., Psychol. Sci. 16, 846 (2005).
30. A. Martens, M. Johns, J. Greenberg, J. Schimel, J. Exp. Soc. Psychol. 42, 236 (2006).
31. G. L. Cohen, J. Garcia, V. Purdie-Vaughns, N. Apfel, P. Brzustoski, Science 324, 400 (2009).
32. Materials and methods are available as supporting material on Science Online.
33. R. K. Thornton, D. R. Sokoloff, Am. J. Phys. 66, 338 (1998).
34. The effect of values affirmation on exam scores and end-of-semester FMCE scores was tested in regression models in which the outcome measures were regressed on gender, affirmation condition, and the gender × condition interaction. SAT/ACT Math scores were controlled for in the analysis of exam scores, and beginning-of-semester FMCE scores were controlled for in the analysis of end-of-semester FMCE scores. All β weights reported in this article are standardized weights. The full regression models are described in the SOM.
35. This unexpected finding for men's in-class exam scores is discussed in more detail in the SOM. In addition to the mixed evidence summarized here, this negative affirmation effect on men's exam scores was not significant when the analysis was conducted with the beginning-of-semester FMCE scores (instead of SAT/ACT Math scores) as the covariate (S22).
36. We are grateful to course instructor M. Dubson, the course teaching assistants, and the student participants. We also thank L. Newnes and N. Golaszewski for invaluable assistance with data collection and C. Judd and B. Park for their statistical advice. This research was supported by NSF grant DRL0910373.