Interventions to reduce the gender gap in STEM

Editor's Introduction

Reducing the Gender Achievement Gap in College Science: A Classroom Study of Values Affirmation

There are many stereotypes about the differences between women and men that are widely shared. For example, many people believe that men are better at physics than women. There are more men than women in science, technology, engineering and math professions, and women who pursue these fields in college tend to perform worse than their male colleagues. However, research shows that there is no difference in talent between men and women, so why do women continue to face barriers to success? Is there a way to close this “gender gap” in physics and empower women to perform to their full potential?

Paper Details

Original title
Reducing the Gender Achievement Gap in College Science: A Classroom Study of Values Affirmation
Original publication date
Vol. 330, no. 6008, pp. 1234-1237


In many science, technology, engineering, and mathematics disciplines, women are outperformed by men in test scores, jeopardizing their success in science-oriented courses and careers. The current study tested the effectiveness of a psychological intervention, called values affirmation, in reducing the gender achievement gap in a college-level introductory physics class. In this randomized double-blind study, 399 students either wrote about their most important values or not, twice at the beginning of the 15-week course. Values affirmation reduced the male-female performance and learning difference substantially and elevated women's modal grades from the C to B range. Benefits were strongest for women who tended to endorse the stereotype that men do better than women in physics. A brief psychological intervention may be a promising way to address the gender gap in science performance and learning.


The substantial underrepresentation of women in science, technology, engineering, and mathematics (STEM) disciplines has long concerned policy-makers and the educational community (1, 2). In 2006, women earned only 28% of Ph.D.s in physical sciences, 25% in mathematics and computer science, and 20% in engineering in the United States (3). Although women made up 47% of the North American workforce in 2009, the percentage of women in lucrative technical professions, such as “computer and mathematical occupations” and “architecture and engineering occupations,” reached only 25% and 14%, respectively (4). Similar underrepresentation of women in STEM-related professions is also evident in other parts of the world (5).

The gender gap in STEM disciplines goes beyond the limited representation of women. In college physics, the field studied in the present investigation, women earn lower exam grades and lower scores on standardized tests of conceptual mastery (6, 7). Students’ prior background and preparation in mathematics and physics, identified as a major contributor to performance in introductory physics (8), can account for as much as 60% of the gender gap in exam performance at our institution, the University of Colorado, but background and preparation do not fully account for the gap (9).

Because gender achievement gaps have important educational and societal implications, several attempts have been made to reduce them. For example, in physics, interactive techniques such as peer instruction (10), where students discuss the answers to conceptual questions in small groups during lectures, and curricular materials, such as tutorials in introductory physics (11) and context-rich problems (12), can reduce the gender gap in college physics classrooms (13). Larger-scale attempts to reduce the gender gap in physics include restructuring the entire physics course (7, 14, 15) or introducing mentoring programs focused on women (16).

Although some of these attempts have reduced gender gaps, interventions in science education have focused mostly on instructional methods. Many have not taken into account social-psychological and cognitive processes that lead to gender differences in performance and learning. One such process involves psychological threat tied to the groups with which a person identifies. The fear of being devalued based on a group identity, such as becoming aware that one could be seen in light of a negative stereotype about one’s group, has been shown to undermine performance on difficult tests (17). For example, women’s performance on difficult math and science tests can suffer insofar as they worry that their poor performance could be seen to confirm a negative gender stereotype (18, 19). Although such identity threat has been shown to affect performance in lab experiments and classrooms (20, 21), attempts to reduce identity threat in authentic classroom contexts have been limited (22, 23).

Here, we report a large-scale classroom study that tested the effectiveness of a theoretically motivated psychological intervention, called values affirmation (24, 25), in reducing the gender gap in exam scores and in a standardized test of conceptual physics. Values affirmation could provide a complementary yet comparatively simple strategy to address gender differences in performance and learning in college science classrooms. A novel feature of this study is that it provides a formal assessment of the degree of student learning over a semester.

Physics is a challenging subject for many college students. Learning the material in lectures and textbooks and demonstrating understanding on exams put substantial pressure on students. Because of the stereotype that men are better than women at math and science (26), such pressure can prove more severe for women than men, especially for women who think that the stereotype might be valid and worry that it could apply or be applied to them (20, 27). Values affirmation, in which people reflect on self-defining values, can buffer people against such psychological threat. When they affirm their core values in a threatening environment, people reestablish a perception of personal integrity and worth, which in turn can provide them with the internal resources needed for coping effectively (24, 25, 28). Indeed, lab studies show that such affirmations lessen evaluative stress (29) and improve the performance of stereotype-threatened individuals (30).

The values-affirmation intervention used in this study involves writing about personally important values (such as friends and family). The writing exercise is brief (10 to 15 min) and is unrelated to the subject matter of the course. Nevertheless, it has been found effective in improving the grades of ethnic minority middle-school students and closing the racial achievement gap (23). Moreover, this benefit persisted in a 2-year follow-up study (31).

In this study, we applied the intervention to an entirely different context: the gender gap in college-level science. We tested whether values affirmation would reduce the gender achievement gap in a 15-week introductory physics course for STEM majors. Because the course had already implemented pedagogical practices aimed at lessening gender gaps (10, 11), this setup provided a strong test of the effectiveness of values affirmation to further reduce the gap. Moreover, compared to the African-American middle-school students in previous classroom intervention studies (23, 31), these students were relatively high-achieving (taking college physics and most planning to be STEM majors) and were from more-advantaged backgrounds.

In this randomized double-blind study, 399 students (283 men and 116 women) were randomly assigned to either the values-affirmation group or the control group (32). Students in the affirmation group selected their most important values from a list (such as relationships with friends and family or learning or gaining knowledge) and, in response to structured prompts, wrote about why these values were important to them. Students in the control group selected their least important values from the same list and wrote why these values might be important to other people. Thus, both groups wrote about values and their importance, but the exercise was self-relevant only for the affirmation group. This 15-min writing exercise was integrated into the class and was given once in the first recitation of the semester (week 1) and once in an online homework assignment (week 4) shortly before the first midterm exam (week 5). Each student was assigned to the same condition at both administrations. The course instructor and teaching assistants were unaware of students’ condition assignments, and the teaching assistants and students were unaware of the purpose of the writing exercises. As part of an online survey typically given in the course (week 2), students also indicated their endorsement of the stereotype that men perform better than women in physics.

The main outcome measure was scores on in-class exams (three midterms and one final) that consisted entirely of objective multiple-choice questions and constituted 75% of the final course grade. To assess the effect of values affirmation on learning, we also examined scores on an objective, nationally normed standardized test of conceptual physics [the Force and Motion Conceptual Evaluation (FMCE)] (33). Historically, men have substantially outperformed women on exams and the FMCE (9) in this course. In semesters when exams are heavily weighted (as in the semester of the present study), course grades have also shown a gender gap (9).

We predicted a reduced gender gap in performance for women who completed the values affirmation. Moreover, because people who endorse negative stereotypes about their group are most vulnerable to identity threat (20, 27), we expected the intervention to be particularly beneficial for women tending to endorse the gender stereotype.

The values-affirmation intervention succeeded in reducing the gender gap in performance. As shown in Fig. 1A, men outperformed women in the control condition on overall exam scores (the average of the percent correct for the four exams). However, the gender gap was significantly smaller in the affirmation condition than in the control condition, resulting in a significant gender × condition interaction [β = 0.16, t(387) = 3.08, P < 0.01] (34). The effect size for the observed gender gap was substantial in the control condition (Cohen’s d = 0.93) [F(1, 387) = 36.71, P < 0.01] but much smaller in the affirmation condition (d = 0.18) [F(1, 387) = 2.35, P = 0.13]. Course grades, based substantially (75%) on the exam scores, showed a similar pattern [supporting online material (SOM) text].
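
As a rough illustration of what Cohen's d measures, the sketch below computes a standardized mean difference with a pooled standard deviation. The function name `cohens_d` and the score lists are hypothetical and are not data from the study; the d values reported above come from covariate-adjusted regression models, which this simplified calculation does not reproduce.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standardized mean difference (group_a minus group_b) using a pooled SD."""
    n_a, n_b = len(group_a), len(group_b)
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Hypothetical exam percentages, NOT data from the study.
men_scores = [72.0, 80.0, 68.0, 75.0, 85.0]
women_scores = [65.0, 70.0, 74.0, 60.0, 71.0]

d = cohens_d(men_scores, women_scores)
print(f"Cohen's d = {d:.2f}")
```

By the usual rule of thumb, a d near 0.2 is considered small and a d near 0.8 or above is considered large, which is why the drop from 0.93 to 0.18 is described as substantial.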


Fig. 1 Student performance on two outcome measures examined in this study as a function of gender (men versus women) and affirmation condition (values affirmation versus control). (A) The overall combined exam scores, derived by averaging the percent correct for the three midterm exams (weeks 5, 9, and 14) and the final exam for each student. (B) The end-of-semester (week 15) score of the FMCE, a standardized test of conceptual knowledge in physics (33). This test was administered twice (in weeks 1 and 15) in recitation sections to assess the learning of basic physics concepts over the course of the semester. Exam scores are adjusted based on baseline math performance (SAT/ACT Math scores), and the end-of-semester FMCE scores are adjusted based on beginning-of-semester FMCE scores (SOM text). Error bars represent ±1 SE.
Exam scores

Panel A shows the average exam score for men and women who did not receive the values affirmation intervention (control) and those who did. The gap in scores between men and women decreased in the values affirmation condition.

Panel B shows the average score on the end-of-semester Force and Motion Conceptual Evaluation (a national exam) for men and women in the control and values affirmation conditions. Again, values affirmation closed the gap between men and women.
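
One way to read a gender × condition interaction in a two-by-two design is as a "difference of differences": the gender gap in one condition minus the gender gap in the other. The sketch below shows that arithmetic with made-up cell means. All names and numbers are hypothetical, and the study itself tested the interaction with standardized regression coefficients and covariates rather than with this simple subtraction.

```python
# Hypothetical adjusted cell means (percent correct), NOT the study's values.
cell_means = {
    ("men", "control"): 74.0,
    ("women", "control"): 64.0,
    ("men", "affirmation"): 72.0,
    ("women", "affirmation"): 69.0,
}


def gender_gap(means, condition):
    """Men's mean minus women's mean within one condition."""
    return means[("men", condition)] - means[("women", condition)]


gap_control = gender_gap(cell_means, "control")
gap_affirmation = gender_gap(cell_means, "affirmation")

# The interaction contrast is how much the gap shrinks under affirmation.
interaction_contrast = gap_control - gap_affirmation

# Expressing the shrinkage relative to the control gap gives a percent reduction.
percent_reduction = 100.0 * interaction_contrast / gap_control

print(gap_control, gap_affirmation, interaction_contrast, percent_reduction)
```

A contrast of zero would mean the gap is the same in both conditions, which is what "no interaction" looks like in this design.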

Covariate adjustment

Students' scores on in-class exams covaried with their scores on the SATs or ACTs. This means that students who did well on the SATs or ACTs tended to do well on exams and students who did poorly on the SATs or ACTs tended to do worse on the exams.

The researchers used mathematical regression to compute adjusted scores that took into account each student's background performance (that is, their SAT or ACT score). The researchers used a similar procedure to adjust final FMCE scores based on each student's initial FMCE score (taken at the start of the course).

Covariate adjustment allows the researchers to remove any effects caused by covariation of the outcome measure with the other measures and just assess differences caused by the intervention.
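
The idea behind covariate adjustment can be sketched with a single least-squares line: estimate how strongly the outcome tracks the covariate, then shift each score to what it would be at the average covariate value. The function `adjust_for_covariate` and the data below are hypothetical. The study fitted the covariate together with gender and condition in one regression model, so this one-predictor version is a simplification.

```python
from statistics import mean


def adjust_for_covariate(outcomes, covariates):
    """Shift each outcome to its expected value at the mean covariate."""
    x_bar, y_bar = mean(covariates), mean(outcomes)
    # Least-squares slope of outcome on covariate.
    sxx = sum((x - x_bar) ** 2 for x in covariates)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(covariates, outcomes))
    slope = sxy / sxx
    # Remove the part of each score that is predicted by the covariate.
    return [y - slope * (x - x_bar) for x, y in zip(covariates, outcomes)]


# Hypothetical SAT Math scores and exam percentages, NOT data from the study.
sat_math = [600.0, 650.0, 700.0, 750.0]
exam_pct = [60.0, 68.0, 70.0, 82.0]

adjusted = adjust_for_covariate(exam_pct, sat_math)
print([round(score, 1) for score in adjusted])
```

The adjusted scores keep the same overall mean as the raw scores but no longer rise or fall with the covariate, so remaining group differences are not simply a reflection of prior preparation.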

Although the second affirmation exercise was completed shortly before the first midterm exam, its benefits were not confined to that exam. The reduction in the gender gap remained evident on the final cumulative exam (table S1), as indicated by a significant gender × condition interaction [β = 0.19, t(387) = 3.45, P < 0.01; affirmation effect for women, F(1, 387) = 12.49, P < 0.01], even though no additional values-affirmation exercises took place beyond the fourth week of class.

The distribution of final letter grades indicated that values affirmation was particularly effective in improving women's performance from average (C) to above average (B). As shown in Fig. 2, women in the two groups differed primarily in the B-to-C range; more women earned B’s in the affirmation group than in the control group, whereas more women earned C’s in the control group than in the affirmation group [χ²(1, N = 91) = 4.07, P = 0.04]. There was no such difference in grade distribution for men [χ²(1, N = 202) = 0.02, P = 0.88].


Fig. 2 Percentage of students receiving each letter grade (A, B, C, D, and F, combining letter grades with pluses and minuses) as a function of gender (men versus women) and affirmation condition (values affirmation versus control). The percentage was calculated separately for each values-affirmation condition within each gender.
Overall grades

Women got many more Cs than Bs in the course in the control condition. In comparison, women in the values affirmation condition were similarly likely to get a C or a B. This suggests that values affirmation helps push women from Cs to Bs overall.

Chi-square statistics

The researchers used the chi-square test to determine whether the distribution of grades differed between women in the control condition and women in the values affirmation condition.
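
A minimal sketch of a chi-square test on a two-by-two table (condition by letter grade) follows. The counts are hypothetical and are not the study's data, the helper names are illustrative, and no continuity correction is applied, since the article does not say whether one was used. For one degree of freedom, the P value can be computed from the complementary error function in Python's standard library.

```python
from math import erfc, sqrt


def chi_square_2x2(table):
    """Pearson chi-square statistic for a 2x2 table of counts."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    return statistic


def p_value_df1(statistic):
    """Upper-tail P value of a chi-square statistic with 1 degree of freedom."""
    return erfc(sqrt(statistic / 2.0))


# Hypothetical counts, NOT the study's data.
# Rows: affirmation, control. Columns: grade B, grade C.
counts = [[30, 20], [18, 32]]

chi2 = chi_square_2x2(counts)
print(f"chi-square = {chi2:.2f}, P = {p_value_df1(chi2):.3f}")
```

Passing the reported statistic of 4.07 to `p_value_df1` gives a P value of roughly 0.04, in line with the value reported for women's grades.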

Reimagining the data

What would the grade distributions look like for women in the values affirmation condition if values affirmation did not close the gender gap in overall course grades (assume performance for all other groups matches what is presented in the figure)?

What would the distribution look like if the effect of values affirmation was something other than causing a shift from Cs to Bs (for example, if it caused a global increase in scores, or it caused a shift from Bs to As)?

The benefit of values affirmation for women was also observed on the FMCE. It was administered twice as part of the course, once at the beginning of the semester (week 1) and once at the end (week 15), to assess learning of physics concepts over the semester. Students were explicitly told that their performance would not influence their grades. Although there was no main or interactive effect of condition on the beginning-of-semester score [t’s < 1, NS], there was an effect at the end of the semester. Figure 1B illustrates the end-of-semester FMCE scores, controlling for beginning-of-semester scores to isolate effects on learning. The gender gap in the learning of physics concepts was substantial in the control condition (d = 0.46) [F(1, 304) = 6.23, P = 0.01], indicating that men improved their FMCE scores more than women over the semester. In the affirmation condition, however, this gender learning gap entirely disappeared (d = –0.12) [F(1, 304) = 0.96, P = 0.33], resulting in a significant gender × condition interaction [β = 0.12, t(296) = 2.13, P = 0.03]. That the benefit of affirmation was evidenced on the end-of-semester FMCE scores with beginning-of-semester scores controlled suggests that the intervention facilitated women’s learning of scientific concepts over the semester (SOM text).

Unexpectedly, affirmation negatively affected men’s exam scores (Fig. 1A), but, unlike the positive effect for women, this effect was not predicted, was not replicated for the end-of-semester FMCE score (Fig. 1B), and did not change men’s letter grade distribution (Fig. 2) (35). In contrast, the affirmation’s positive effect on women was significant for all outcome variables (SOM text), suggesting that the reduced gender gap observed in this study is based more robustly on the affirmation’s positive impact on women than on its negative impact on men.

Finally, the values affirmation was particularly beneficial for women who tended to endorse the gender stereotype. This moderation effect is illustrated in Fig. 3, with the exam (Fig. 3A) and end-of-semester FMCE (Fig. 3B) data. Although women as a group did not strongly endorse the negative gender stereotype (20), even a moderate level of stereotype endorsement was costly for women in the control condition, with their exam scores decreasing as a function of stereotype endorsement [β = –0.50, t(387) = –3.29, P < 0.01] (Fig. 3A). Affirmation, however, buffered women against this identity threat, eliminating the negative relation between stereotype endorsement and exam scores [β = 0.12, t(387) = 0.94, P = 0.35]. Moreover, among women expressing higher levels of stereotype endorsement (defined as 0.75 SDs above the mean here), affirmation improved the exam scores relative to the control condition [t(115) = 3.04, P < 0.01]. In contrast, men's exam scores were little affected by stereotype endorsement, regardless of condition [β = –0.08, t(387) = –1.70, P = 0.09, for the affirmation condition, and β = –0.07, t(387) = –0.92, P = 0.36, for the control condition]. These differential patterns for men and women resulted in a gender × condition × stereotype endorsement interaction [β = 0.16, t(387) = 2.74, P < 0.01].


Fig. 3 Student performance on two outcome measures examined in this study as a function of gender (men versus women), affirmation condition (values affirmation versus control), and the level of stereotype endorsement. (A) The overall combined exam scores, computed by averaging the percentage scores for the four exams. (B) The end-of-semester (week 15) score on the FMCE, a standardized test of conceptual knowledge in physics (33). Stereotype endorsement, treated as a continuous variable in our statistical analysis, was measured by agreement with the statement, “According to my own personal beliefs, I expect men to generally do better in physics than women,” answered on a five-point scale ranging from “strongly disagree” to “strongly agree.” The level of stereotype endorsement plotted in the graph was standardized for all students and expressed in terms of z scores (20). The low and high levels of stereotype endorsement in the graph are represented by ±0.75 SD of the grand mean (SOM text). Error bars represent ±1 SE.
Average exam scores

Panel A shows that average exam scores did not change for men regardless of their level of endorsement of the stereotype that men are better than women at physics.

In contrast, women who believed the gender stereotype did much worse than women who did not believe the stereotype.

Values affirmation eliminated this effect in women.

National exam score

Panel B shows that endorsement of the gender stereotype did not affect men's performance on the FMCE. Women who believed the gender stereotype did worse on the FMCE than those who did not believe the stereotype. Values affirmation eliminated this effect.

Types of variables

There are many types of variables, most of which fall into two broad "buckets": continuous and categorical. Stereotype endorsement, as used in this figure, is treated as a continuous variable in the authors' statistical analysis.
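
To treat responses on a five-point scale as a continuous variable, they can be standardized into z scores (distance from the mean in standard-deviation units). The sketch below does this and then finds the raw-scale values that sit 0.75 SD below and above the mean, the "low" and "high" points plotted in Fig. 3. The responses are hypothetical, and the use of the population standard deviation (`pstdev`) is an assumption, since the article does not state which formula was used.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Convert raw values to z scores using the population SD."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


# Hypothetical responses (1 = strongly disagree ... 5 = strongly agree),
# NOT data from the study.
responses = [1, 2, 2, 3, 1, 4, 2, 1]

standardized = z_scores(responses)

# Raw-scale values corresponding to -0.75 SD and +0.75 SD around the mean.
mu, sigma = mean(responses), pstdev(responses)
low_point = mu - 0.75 * sigma
high_point = mu + 0.75 * sigma

print(standardized)
print(low_point, high_point)
```

After standardization the variable has a mean of 0 and a standard deviation of 1, so "0.75 SD above the mean" is simply a z score of 0.75.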


Error bars in the plot represent the standard error of the mean.
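
For a simple sample, the standard error of the mean is the sample standard deviation divided by the square root of the sample size. The sketch below shows that calculation on hypothetical scores. The error bars in the figures are based on covariate-adjusted values, so they would not necessarily match this unadjusted formula.

```python
from math import sqrt
from statistics import mean, stdev


def standard_error(values):
    """Standard error of the mean: sample SD divided by sqrt(n)."""
    return stdev(values) / sqrt(len(values))


# Hypothetical exam percentages, NOT data from the study.
scores = [70.0, 74.0, 78.0, 82.0]

se = standard_error(scores)
print(f"mean = {mean(scores):.1f}, SE = {se:.2f}")
```

Because the sample size is in the denominator, larger groups produce smaller standard errors and therefore shorter error bars.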

A similar three-way interaction was found for the FMCE scores (Fig. 3B) [β = 0.15, t(296) = 2.45, P = 0.02]. Among women, there was a negative relationship between stereotype endorsement and end-of-semester FMCE scores in the control condition [β = –0.39, t(296) = –2.55, P = 0.01], but not in the affirmation condition [β = 0.22, t(296) = 1.54, P = 0.13]. Moreover, among women with higher levels of stereotype endorsement, the end-of-semester FMCE scores were higher in the affirmation condition than in the control condition [t(115) = 3.01, P < 0.01]. No such relationship was observed for men in either the control condition [β = –0.13, t(296) = –1.33, P = 0.18] or the affirmation condition [β = –0.10, t(296) = –1.41, P = 0.16].

Overall, these results suggest that values affirmation is a promising intervention that can help reduce the gender achievement gap in physics. Although the intervention was brief and did not directly concern the course material, it nonetheless provided a meaningful boost for women (especially women who tended to endorse the gender stereotype) on two objective measures: in-class multiple-choice exams and a standardized test of conceptual mastery (FMCE). Moreover, the results on the end-of-semester FMCE provide evidence that identity threat and affirmation affect the learning of scientific concepts, not just performance (SOM text). After controlling for prior background (prior SAT/ACT Math or beginning-of-semester FMCE scores), the affirmation closed the “residual” gender gap on in-class exam scores by approximately 61% and entirely eliminated the gap on the FMCE. Although further efforts must aim to close the gap in prior preparation, the current results are promising in demonstrating that a brief psychological intervention can help close the residual gap, a problem of long-standing concern in science (9).

The introductory physics class we investigated was intended for STEM majors who have had success in STEM-related subjects before college and are motivated to do well in the course. Our results, therefore, demonstrate that, even among women who are relatively identified with and accomplished in science, a substantial gender gap exists, women’s performance is negatively related to stereotype endorsement, and gender differences can be reduced with a values-affirmation intervention.

Although previous attempts to reduce the gender achievement gap in science have focused mostly on instructional methods, the current results highlight the importance of social-psychological factors. One virtue of the affirmation is that it can be combined with instructional approaches that show promise in closing the gender gap, such as the interactive engagement approaches used in the present course (10). However, there is no reason to think that the effects of affirmation are confined to situations in which such instructional supports are already in place, given that the intervention was successful in reducing racial achievement differences among middle-school students in traditional classrooms (23, 31). Of course, even here, there were structural opportunities for learning in the form of a solid curriculum and qualified teachers; without such basic support, the efficacy of any psychological intervention would be limited (23).

Another virtue of the values-affirmation intervention is that it is not tied to a specific discipline. The psychological phenomenon that the intervention targets—identity threat—can prove an obstacle in other STEM disciplines where the underrepresentation and underperformance of women have been evident. Thus, the intervention has potential for broad applicability in many STEM disciplines.

Finally, the benefits of the affirmation may be long-lasting (25, 31) and could persist beyond the present course. The experience of lower identity threat, coupled with better mastery in a challenging science course, may encourage affirmed women to take other STEM courses and to pursue further education and even a career in STEM disciplines. More generally, the cumulative consequences of early performance (small differences at an early stage can get magnified over time) help explain how relatively brief interventions, when given early in a threatening environment, can have long-term effects. This snowballing effect may be particularly important in science, where subsequent learning builds on an earlier foundation of knowledge, making it increasingly difficult to catch up and enter a discipline later. Therefore, it may be important to intervene in gateway courses such as introductory physics. These courses are required for STEM majors, and performance in them can set long-term academic and career trajectories. Reducing the gender gap at gateways could not only benefit women’s performance in the short term but also encourage them to choose and persist in a scientific major and career path in STEM disciplines.

Supporting Online Material

Materials and Methods

SOM Text

Table S1




1. National Academy of Sciences, Rising Above the Gathering Storm: Energizing and Employing America for a Brighter Future (National Academy Press, Washington, DC, 2005).

2. National Science Board, The Science and Engineering Workforce: Realizing America’s Potential [NSF03–69, National Science Foundation (NSF), Arlington, VA, 2003].

3. NSF, Division of Science Resources Statistics, Science and Engineering Degrees: 1966–2006 (detailed statistical tables in NSF 08-321, NSF, Arlington, VA, 2008); available at

4. U.S. Bureau of Labor Statistics (2009); available at ftp://

5. Anita Borg Institute for Women and Technology, The State of Women and Technology Fields Around the World; available at womenhightechworld.pdf.

6. S. J. Pollock, N. D. Finkelstein, L. E. Kost, Phys. Rev. Spec. Top. Phys. Ed. Res. 3, 010107 (2007).

7. E. Brewe et al., Phys. Rev. Spec. Top. Phys. Ed. Res. 6, 010106 (2010).

8. Z. Hazari, R. H. Tai, P. M. Sadler, Sci. Ed. 91, 847 (2007).

9. L. E. Kost, S. J. Pollock, N. D. Finkelstein, Phys. Rev. Spec. Top. Phys. Ed. Res. 5, 010101 (2009).

10. E. Mazur, Peer Instruction: A User’s Manual (Prentice-Hall, Upper Saddle River, NJ, 1997).

11. L. C. McDermott, P. S. Schaffer, Tutorials in Introductory Physics (Prentice-Hall, Upper Saddle River, NJ, 2002).

12. P. Heller, R. Keith, S. Anderson, Am. J. Phys. 60, 627 (1992).

13. M. Lorenzo, C. Crouch, E. Mazur, Am. J. Phys. 74, 118 (2006).

14. P. Laws, P. Rosborough, F. Poodry, Am. J. Phys. 67, S32 (1999).

15. S. Brahmia, E. Etkina, J. Coll. Sci. Teach. 31, 183 (2001).

16. M. Schneider, Phys. Teach. 39, 280 (2001).

17. C. M. Steele, S. J. Spencer, J. Aronson, in Advances in Experimental Social Psychology, M. P. Zanna, Ed. (Academic Press, San Diego, CA, 2002), pp. 183–242.

18. S. J. Spencer, C. M. Steele, D. M. Quinn, J. Exp. Soc. Psychol. 35, 4 (1999).

19. S. L. Beilock, R. J. Rydell, A. R. McConnell, J. Exp. Psychol. Gen. 136, 256 (2007).

20. T. Schmader, M. Johns, M. Barquissau, Sex Roles 50, 835 (2004).

21. T. Schmader, M. Johns, C. Forbes, Psychol. Rev. 115, 336 (2008).

22. C. Good, J. Aronson, M. Inzlicht, Appl. Dev. Psychol. 24, 645 (2003).

23. G. L. Cohen, J. Garcia, N. Apfel, A. Master, Science 313, 1307 (2006).

24. C. M. Steele, in Advances in Experimental Social Psychology, L. Berkowitz, Ed. (Academic Press, New York, 1988), pp. 261–302.

25. G. L. Cohen, J. Garcia, Curr. Dir. Psychol. Sci. 17, 365 (2008).

26. L. L. Schiebinger, Has Feminism Changed Science? (Harvard Univ. Press, Cambridge, MA, 2001).

27. J. Aronson, M. Inzlicht, Psychol. Sci. 15, 829 (2004).

28. D. K. Sherman, G. L. Cohen, in Advances in Experimental Social Psychology, M. P. Zanna, Ed. (Academic Press, San Diego, CA, 2006), pp. 183–242.

29. J. D. Creswell et al., Psychol. Sci. 16, 846 (2005).

30. A. Martens, M. Johns, J. Greenberg, J. Schimel, J. Exp. Soc. Psychol. 42, 236 (2006).

31. G. L. Cohen, J. Garcia, V. Purdie-Vaughns, N. Apfel, P. Brzustoski, Science 324, 400 (2009).

32. Materials and methods are available as supporting material on Science Online.

33. R. K. Thornton, D. R. Sokoloff, Am. J. Phys. 66, 338 (1998).

34. The effect of values affirmation on exam scores and end-of-semester FMCE scores was tested in regression models in which the outcome measures were regressed on gender, affirmation condition, and the gender × condition interaction. SAT/ACT Math scores were controlled for in the analysis of exam scores, and beginning-of-semester FMCE scores were controlled for in the analysis of end-of-semester FMCE scores. All β weights reported in this article are standardized weights. The full regression models are described in the SOM.

35. This unexpected finding for men's in-class exam scores is discussed in more detail in the SOM. In addition to the mixed evidence summarized here, this negative affirmation effect on men's exam scores was not significant when the analysis was conducted with the beginning-of-semester FMCE scores (instead of SAT/ACT Math scores) as the covariate (S22).

36. We are grateful to course instructor M. Dubson, the course teaching assistants, and the student participants. We also thank L. Newnes and N. Golaszewski for invaluable assistance with data collection and C. Judd and B. Park for their statistical advice. This research was supported by NSF grant DRL0910373.