Measuring Gender Equality

Feb 13

Artist credit: Brianna Weir, 2019

Author: Marion Boulicault

Scientists report that, on average, men and women behave differently, display different personality traits, and have different interests and preferences.¹ What explains these differences? According to one popular view -- let’s call it the biological view -- the differences between men and women are innate, evolved biological differences. A different view -- the social view -- holds that these differences can be explained in social terms: Differences between men and women arise because of a lack of gender equality. If the social view is right, one might expect that as gender equality increases, gender differences will start to disappear. It turns out this isn’t the case. In what has been described as the “Gender Equality Paradox,” a number of studies report that as gender equality increases, gender differences, such as differences in personality traits and participation in STEM, also increase.² Does this mean that the social view of gender is wrong?

In this series, the GenderSci Lab is examining this seeming paradox from multiple angles, focusing on the case of women in STEM. In today’s post, I explore the paradox from the angle of measurement, drawing on my own area of expertise as a philosopher of quantification.

Investigation of the “Gender Equality Paradox” hypothesis requires the measurement of (at minimum) two variables: (1) sex/gender difference and (2) gender equality. The content of Variable (1) differs between studies, depending on which sex/gender difference is being investigated -- for example, some studies investigate differences in STEM participation,³ and others in personality traits.⁴ The meaning of Variable (2), however, remains constant across studies; no matter which version of the Gender Equality Paradox is being investigated, gender equality must be measured. So how exactly is gender equality measured?

What is Measurement?

To answer this question, it helps to understand a little about how measurement works. Let’s start with the kind of basic picture that you might find in a social sciences textbook.⁵ To measure something, you first have to know what it is you want to measure, a step called “defining the construct.” Imagine you want to measure the US population. You define your construct -- the US population -- as, say, the total number of living individuals within US borders at a specific time. You then use a measuring instrument (e.g. a census) to follow a systematic procedure (e.g. send a census survey to every household and analyze returned forms), resulting in the output of a number and unit: 300 million people. And voilà, you have measured the US population!

But of course, it’s never quite that simple in practice. For instance, the construct could be defined in many different ways -- maybe you should only include US citizens? -- and measured using many different instruments; in this case, you could use a census, or rely on birth and death records, or make estimates from tax records, each of which might lead to a different number. How do we know which way is best? What makes a measure of population a good measure of population? Generally, measurement theorists offer two criteria for good measurement: validity and reliability.⁶ A measure is valid if it measures what it’s intended to measure. So a census-based measure of population is valid if the number it outputs accurately represents the US population (however you defined it in step one). A measure is reliable if it can be used consistently by different people in different places and times. A census-based population measure is reliable if it, for example, produces the same number regardless of who administers the census, and if it can be used consistently to measure different populations, such as the US and Nepal’s populations, at different times and in different contexts.

Measuring Gender Equality: The GGGI

So how do “Gender Equality Paradox” researchers measure gender equality? Rather than creating new measures from scratch, researchers tend to make use of existing measures of nation-level gender equality, particularly those created and used by high-status international organizations like the United Nations and the World Economic Forum.

Gender Equality Indices

Researchers have a menu of country-level gender equality measures from which to choose. Options include the United Nations Development Programme’s Gender Inequality Index (GII), the United Nations Development Programme’s Gender Empowerment Measure (GEM), Social Watch’s Gender Equity Index (GEI), the Basic Index of Gender Inequality (BIGI) and the OECD’s Social Institutions and Gender Index (SIGI).⁷ Each of these measures is actually a compound measure: It measures gender equality by taking different individual measures (known as indicators) of specific phenomena -- e.g. literacy rate, life expectancy, labor force participation, and government positions -- and combining them into one compound measure, called an index. The indices differ with respect to the particular indicators they include (e.g. the SIGI includes only indicators related to gender equality in social institutions, whereas the GEI uses a much wider range of indicators), and with respect to how those indicators are combined and weighted.⁸

The Global Gender Gap Index

Among the many gender equality indices available, the World Economic Forum’s Global Gender Gap Index (GGGI) has been particularly popular in “Gender Equality Paradox” research.⁹ The GGGI measures gender equality by combining four sub-indices, each of which is intended to measure the “gap between men and women in four fundamental categories”: (1) Economic Participation and Opportunity, (2) Educational Attainment, (3) Health and Survival, and (4) Political Empowerment. Each sub-index aggregates different indicators (14 in total), as listed in Table 1 below:

Source: http://reports.weforum.org/global-gender-gap-report-2017/measuring-the-global-gender-gap/

The GGGI assigns different weights to each of the 14 indicators and combines them to produce a score between 0 and 1 for each country, where 1 indicates perfect gender equality, and 0 indicates perfect gender inequality. The GGGI is then used by the World Economic Forum to rank countries according to gender equality, and these rankings are published in an annual report.¹⁰
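
To make the idea of a compound index concrete, here is a minimal sketch of how weighted indicators can be combined into a single score between 0 and 1. It is an illustration only: the indicator names, values, and weights are made up, the function name `composite_score` is hypothetical, and the plain weighted average is an assumption, not the World Economic Forum's actual weighting scheme.

```python
def composite_score(scores, weights):
    """Weighted average of indicator scores, each already scaled to [0, 1]."""
    if scores.keys() != weights.keys():
        raise ValueError("scores and weights must cover the same indicators")
    total_weight = sum(weights.values())
    # Dividing by the total weight keeps the result in [0, 1].
    return sum(scores[name] * weights[name] for name in scores) / total_weight


# Hypothetical indicator scores and weights (NOT real GGGI data).
scores = {
    "labor_force_participation": 0.80,
    "literacy_rate": 1.00,
    "seats_in_parliament": 0.40,
}
weights = {
    "labor_force_participation": 2.0,
    "literacy_rate": 1.0,
    "seats_in_parliament": 1.0,
}

index = composite_score(scores, weights)
print(round(index, 3))
```

Changing either the list of indicators or the weights changes the final score, which is one reason different indices can rank the same country differently.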

Is the GGGI a Good Measure of Gender Equality? Measures as Tools

Is the GGGI a good measure of gender equality? In other words, is it a valid and reliable way to quantify the phenomenon of gender equality?

The answer to this question depends on the construct definition, i.e. on how “gender equality” is defined. The UN defines gender equality as the “full equality of rights and opportunities between men and women.”¹¹ However, between the ten words of this definition lie a plethora of details and complications. What does it mean in practice for men and women to have “full equality of rights and opportunities”? Does it matter whether men and women feel equal or is it enough that they have equal rights and opportunities? Should the equality of rights and opportunities be understood differently in different domains, for example in healthcare vs. politics? These kinds of questions have been heavily debated, leading to the identification of different dimensions and definitions of gender equality.¹²

These complexities are reflected in the ways gender equality is measured. One reason that so many gender equality measures exist (and that these measures are compound indices rather than uni-dimensional indicators) is precisely because gender equality is complex and can be conceptualized and defined, and therefore measured, in many different ways. As such, rather than seeing all these measures as strictly competing, it’s helpful to think of them as different tools, each suited to measuring different constructs or dimensions of gender equality. For instance, if you want to measure gender equality within social institutions, you won’t want to use the GGGI, which is intended to measure gender equality across four broad domains. Instead, an index like SIGI -- which is specifically created to measure gender equality (and gender discrimination) in social institutions -- would be the better tool for the job. In other words, just like you would use a thermometer over a meter stick to measure water temperature, you would use SIGI over the GGGI to measure gender equality in social institutions.

The right question isn't whether the GGGI is a good measure of gender equality in general... The right question is whether it's the right tool for the Gender Equality Paradox.

With this tool metaphor in hand, we can reframe our question. The right question isn’t whether the GGGI is a good measure of gender equality in general (there may be -- and I think there are -- many contexts and settings in which the GGGI is a good measure). The right question is whether it’s a good measure for investigating the “Gender Equality Paradox”. In other words: is it the right tool for the “Gender Equality Paradox” job?

To answer this question, we need to know two things: (1) What are “Gender Equality Paradox” researchers trying to measure when they measure gender equality?; and (2) what does the GGGI measure? If the answers to questions (1) and (2) don’t match up, then the GGGI may not be the right tool for “Gender Equality Paradox” research. I tackle each question in turn.

What are "Gender Equality Paradox" Researchers Trying to Measure?

Let’s look at a particular example of “Gender Equality Paradox” research: the recent prominent publication by Stoet & Geary, in which they set out to investigate whether a correlation exists between “the graduation gap [between men and women] in STEM” and gender equality.¹³

How do Stoet & Geary define the construct “gender equality” in their study? While they don’t give an explicit definition, they do offer some clues. First, they summarize their study’s main finding: “Paradoxically, countries with lower levels of gender equality had relatively more women among STEM graduates than did more gender-equal countries” (p. 10). They then explain what makes this finding paradoxical:

"This [their finding] is a paradox, because gender-equal countries are those that give girls and women more educational and empowerment opportunities and that generally promote girls’ and women’s engagement in STEM fields." (2018, 10, emphasis added).

In this quote, Stoet & Geary say two (highly related) things about what exactly they mean by “gender-equality.” In more gender-equal countries:

Girls and women are given “more educational and empowerment opportunities”
These countries “generally promote girls’ and women’s engagement in STEM fields.”

What Does the GGGI Measure?: A Tool for Benchmarking

What does the GGGI measure? According to the World Economic Forum’s website, the GGGI measures “the gap between men and women across four fundamental categories (subindexes): Economic Participation and Opportunity, Educational Attainment, Health and Survival and Political Empowerment.”¹⁴ The creators of the GGGI also emphasize three "underlying concepts" behind their index. These are that the GGGI measures:

Gaps not levels: "The Index is designed to measure gender-based gaps in access to resources and opportunities in countries, rather than the actual levels of available resources and opportunities in those countries."¹⁵ In other words, the GGGI only tells you about differences between men and women, and not about absolute levels. This means that the GGGI doesn't tell you how well off a woman in a given country is in absolute terms, only what her life is like as compared to men in that country. For example, one of the GGGI's health indicators is women's life expectancy relative to men's. A country where both men and women die at 40 would be ranked higher on this indicator than one where women die at 68 and men at 72.
Outputs not inputs: What this means is that the GGGI only measures the size of the gaps that exist between men and women in the four fundamental categories. It does not measure anything related to how those gaps came to be or any factors that affect the existence or size of the gaps. For instance, the GGGI measures gender differences in labor force participation, but it doesn’t measure why those differences exist.
Gender equality not women’s empowerment: In the words of the GGGI Report: “The Index rewards countries that reach the point where outcomes for women equal those for men, but it neither rewards nor penalizes cases in which women are outperforming men in particular indicators in some countries.”¹⁶ In other words, take two countries, one where outcomes for women are equal to those for men, and another where women are outperforming men. The GGGI doesn’t discriminate between these two countries; as long as women are at least at parity with men, the GGGI will assign a value of ‘1.’
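
The first and third of these concepts can be illustrated with a small sketch. It assumes that an indicator is scored as a female-to-male ratio capped at parity; the function name `gap_score` and the uniform benchmark of 1.0 are simplifying assumptions for illustration, and the actual index may use different benchmarks for some indicators. The life expectancy figures are the ones from the example above.

```python
def gap_score(female_value, male_value, parity_benchmark=1.0):
    """Female-to-male ratio, capped at the parity benchmark.

    Only the relative gap matters here; absolute levels drop out.
    """
    if male_value <= 0:
        raise ValueError("male_value must be positive")
    ratio = female_value / male_value
    # Capping means women outperforming men is neither rewarded nor penalized.
    return min(ratio, parity_benchmark)


# Gaps not levels: life expectancy example from the text.
low_level_country = gap_score(40, 40)    # both men and women die at 40
high_level_country = gap_score(68, 72)   # women die at 68, men at 72

# Equality not empowerment: hypothetical country where women outlive men.
women_ahead = gap_score(75, 70)

print(low_level_country, round(high_level_country, 3), women_ahead)
```

Under these assumptions the country where everyone dies at 40 scores higher than the one with longer lives but a four-year gap, and a country where women are ahead scores the same as one at exact parity.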

To understand why the GGGI creators defined gender equality in this way, we need to understand why the GGGI was created. The overarching goal behind the GGGI is to “create global awareness of the challenges posed by gender gaps and the opportunities created by reducing them.”¹⁷ In service of this goal, the GGGI was created as a tool for “capturing the magnitude of gender-based disparities and tracking their progress over time.”¹⁸ In other words, the GGGI, and the subsequent country rankings created from GGGI scores, are intended to serve as a tool for benchmarking, which is the practice of measuring something for the purposes of comparison: The GGGI “benchmarks national gender gaps on economic, education, health, and political criteria, and provides country rankings that allow for effective comparisons across regions and income groups.”¹⁹ Thus, the GGGI creators chose the construct definition they did (based on the above three underlying concepts) because they believed this would lead to the creation of a measure that would meet their goals: benchmarking gender-based disparities, and thereby creating global awareness of the challenges posed by gender gaps and the opportunities created by reducing them.

  
Evidence of Invalidity: The GGGI isn’t the Right Tool for Gender Equality Paradox Research

When you look carefully at what the GGGI measures, and then look at how “Gender Equality Paradox” researchers like Stoet & Geary define gender equality, it becomes clear that the GGGI is not the right tool for Gender Equality Paradox research.

Although Stoet & Geary don’t give an explicit construct definition, they make it clear that they define gender equality in such a way that a more gender-equal country is one where (a) there is greater promotion of women’s engagement in STEM and (b) girls and women are given more educational and empowerment opportunities.²⁰

However, this is not what the GGGI measures. The GGGI measures “the gap between men and women across four fundamental categories (subindexes): Economic Participation and Opportunity, Educational Attainment, Health and Survival and Political Empowerment.”²¹ First, none of these subindexes includes any indicator that measures the extent to which a country promotes women’s participation in STEM. Second, the GGGI does not measure education and empowerment “opportunities,” which are input measures; it only measures outcomes, i.e. it only measures the gap between men’s and women’s educational achievements, and doesn’t measure anything about why that gap does or does not exist (e.g. whether these gaps were caused by differences in opportunities). The GGGI is thus an invalid measure in the context of “Gender Equality Paradox” research. It is not the right tool for the job.

Implications

What are the implications of this invalidity? First, it must be emphasized that this invalidity does not indict the GGGI per se. No measure is ever valid in every context and for every purpose (just like there is no single tool for solving every problem), so the fact that the GGGI isn’t the right measure for “Gender Equality Paradox” research is certainly not a reason to abandon it entirely. There could be (and likely are) many contexts in which the GGGI is a valid measure -- contexts in which researchers are intending to measure those dimensions of gender equality captured by the GGGI.

The implications for Stoet & Geary’s conclusions about the “Gender Equality Paradox” hypothesis, however, are more dire. Stoet & Geary’s use of the GGGI is invalid and, unfortunately, this measurement problem doesn’t have an easy solution. Stoet & Geary couldn’t, for instance, simply achieve validity by revising their definition of gender equality so that it matches that of the GGGI (e.g. by responding that “by ‘gender equality,’ we didn’t mean anything to do with promotion of women’s engagement in STEM, we actually meant the gender gap in outcomes across the four fundamental categories”). Nor could they achieve validity by taking a neutral stance on the meaning of gender equality (e.g. by taking what’s known as an “operationalist” approach and defining “gender equality” as whatever it is that is measured by the GGGI). This is because their findings only appear to be evidence of a “Gender Equality Paradox” given their definition of gender equality. It is precisely because they define gender equality as greater access to educational and empowerment opportunities and, importantly, promotion of women’s engagement in STEM, that their findings of a negative correlation between gender equality and women’s participation in STEM appear so paradoxical.

In contrast, if you adopt the GGGI’s definition of gender equality, the paradox seems to disappear. As others have already noted, there may be nothing paradoxical about the existence of a negative correlation between the GGGI and women’s participation in STEM. There are, in fact, a whole host of hypotheses that could explain this negative correlation. For example, perhaps in countries with high GGGI scores, women’s participation in STEM is undermined by the existence of implicit bias (i.e. of relatively unconscious automatic mental associations between, for example, women and particular careers), or by the persistence of gendered stereotypes that discourage women from enrolling in STEM degrees -- factors which are not measured by any of the GGGI indicators. Perhaps in countries where there are smaller gaps between men and women in terms of access and opportunities, gendered stereotypes and biases actually increase (perhaps as a kind of “backlash” against increasing gender equality). If this were the case (and we’re not suggesting that it is the case, only that it is a live possibility not ruled out by Stoet & Geary’s study), countries that score high on the GGGI would actually be expected to have greater barriers to women’s participation in STEM -- i.e. they might be precisely those countries that don’t promote women’s participation in STEM as well as others do. If this were true, then the GGGI and Stoet & Geary’s definitions of gender equality would be entirely at odds, making Stoet & Geary’s use of the GGGI in their “Gender Equality Paradox” research highly problematic.

A Paradox Dissolves

Stoet & Geary claimed to have found a surprising paradox, one with worrisome implications for feminist efforts aimed at bringing equality to the workplace: The more a country promotes women’s engagement in STEM, the fewer women actually participate in STEM. However, it turns out that the measurement tool used by Stoet & Geary to measure gender equality does not actually measure the degree to which a country promotes women’s engagement in STEM. In fact, what it does measure -- the gap between men and women across four categories: Economic Participation and Opportunity, Educational Attainment, Health and Survival and Political Empowerment -- may not correlate at all with whether women feel encouraged to enter STEM fields. Stoet & Geary are using the wrong measurement tool, and once you realize this, you also realize that Stoet & Geary’s paradox may not, in fact, be a paradox at all.

Acknowledgments:

Thank you to Cosmo Grant and Milo Phillips-Brown from the MIT Philosophy Department for help thinking through some of these ideas.

Authorship Statement:

This blog series on the Gender Equality Paradox emerged from collective GenderSci Lab discussions. Each author outlined and drafted their own piece. GenderSci Lab members offered comments and authors integrated these revisions. Brianna Weir developed original artwork for the series. Maria Charles authored and approved the final version of her interview answers and provided images and figures for our use. Tyler Vigen developed a “women in STEM” spurious correlations widget for us and provided permission for the use of his findings in this blog series. Juanis Becerra and Nicole Noll assisted with formatting the blogs for the website. Heather Shattuck-Heidorn oversaw the blog series development, review, and publishing process. For the Psychological Science paper, Sarah Richardson drafted the manuscript. Meredith Reiches and Joe Bruch performed the data analysis. All authors (Richardson, Reiches, Bruch, Boulicault, Noll, and Shattuck-Heidorn) provided critical revisions and approved the final version of the manuscript for submission. Action editor Tim Pleskac shepherded the Corrigendum and Commentary through the peer review process at Psychological Science. We thank the anonymous peer reviewers and Gijsbert Stoet and David Geary for their contributions.

Recommended Citation:

Boulicault, Marion. “Measuring Gender Equality,” GenderSci Blog, February 13, 2020, https://www.genderscilab.org/blog/measuring-gender-equality-why-the-gggi-is-not-the-right-measure-for-gender-equality-paradox-research

Endnotes:

[1] Weisberg, Y. J., DeYoung, C. G., & Hirsh, J. B. (2011). Gender Differences in Personality across the Ten Aspects of the Big Five. Frontiers in Psychology, 2. https://doi.org/10.3389/fpsyg.2011.00178; Croson, R., & Gneezy, U. (2009). Gender Differences in Preferences. Journal of Economic Literature, 47(2), 448–474. https://doi.org/10.1257/jel.47.2.448; Falk, A., & Hermle, J. (2018). Relationship of gender differences in preferences to economic development and gender equality. Science, 362(6412), eaas9899. https://doi.org/10.1126/science.aas9899

[2] Giolla, E. M., & Kajonius, P. J. (2018). Sex differences in personality are larger in gender equal countries: Replicating and extending a surprising finding. International Journal of Psychology. https://doi.org/10.1002/ijop.12529; Stoet, G., & Geary, D. C. (2018). The Gender-Equality Paradox in Science, Technology, Engineering, and Mathematics Education. Psychological Science, 29(4), 581–593. https://doi.org/10.1177/0956797617741719

[3] Stoet, G., & Geary, D. C. (2018). The Gender-Equality Paradox in Science, Technology, Engineering, and Mathematics Education. Psychological Science, 29(4), 581–593. https://doi.org/10.1177/0956797617741719

[4] Giolla, E. M., & Kajonius, P. J. (2018). Sex differences in personality are larger in gender equal countries: Replicating and extending a surprising finding. International Journal of Psychology. https://doi.org/10.1002/ijop.12529

[5] Allen, M. J., & Yen, W. M. (1979). Introduction to Measurement Theory (1st ed.). Long Grove, IL: Waveland Press.

[6] Trochim, William M. (2006). The Research Methods Knowledge Base, 2nd Edition. Internet WWW page, at URL: http://www.socialresearchmethods.net/kb/ (version current as of October 20, 2006).

[7] Stoet, G., & Geary, D. C. (2019). A simplified approach to measuring national gender inequality. PLOS ONE, 14(1), e0205349. https://doi.org/10.1371/journal.pone.0205349; Hawken, A., & Munck, G. L. (2013). Cross-National Indices with Gender-Differentiated Data: What Do They Measure? How Valid Are They? Social Indicators Research, 111(3), 801–838. https://doi.org/10.1007/s11205-012-0035-7

[8] Hawken, A., & Munck, G. L. (2013). Cross-National Indices with Gender-Differentiated Data: What Do They Measure? How Valid Are They? Social Indicators Research, 111(3), 801–838. https://doi.org/10.1007/s11205-012-0035-7

[9] Stoet, G., & Geary, D. C. (2018). The Gender-Equality Paradox in Science, Technology, Engineering, and Mathematics Education. Psychological Science, 29(4), 581–593. https://doi.org/10.1177/0956797617741719; Giolla, E. M., & Kajonius, P. J. (2018). Sex differences in personality are larger in gender equal countries: Replicating and extending a surprising finding. International Journal of Psychology. https://doi.org/10.1002/ijop.12529.

[10] World Economic Forum. (2018). The global gender gap report 2018. Retrieved from http://www3.weforum.org/docs/WEF_GGGR_2018.pdf. Accessed June 13, 2019.

[11] United Nations. (2015, December 16). Gender Equality. Retrieved December 12, 2019, from https://www.un.org/en/sections/issues-depth/gender-equality/

[12] Sen, A. (2001). The Many Faces of Gender Inequality. New Republic, 225(12), 35–40.

[13] Stoet, G., & Geary, D. C. (2018). The Gender-Equality Paradox in Science, Technology, Engineering, and Mathematics Education. Psychological Science, 29(4), 581–593. https://doi.org/10.1177/0956797617741719

[14] World Economic Forum. (n.d.). Measuring the Global Gender Gap. Retrieved December 12, 2019, from Global Gender Gap Report 2018 website: http://reports.weforum.org/global-gender-gap-report-2018/measuring-the-global-gender-gap/

[15] World Economic Forum. (2018). The global gender gap report 2018, 4. Retrieved from http://www3.weforum.org/docs/WEF_GGGR_2018.pdf. Accessed June 13, 2019.

[16] World Economic Forum. (2018). The global gender gap report 2018, 4. Retrieved from http://www3.weforum.org/docs/WEF_GGGR_2018.pdf. Accessed June 13, 2019.

[17] World Economic Forum. (2018). The global gender gap report 2018, 3. Retrieved from http://www3.weforum.org/docs/WEF_GGGR_2018.pdf. Accessed June 13, 2019.

[18] World Economic Forum. (2018). The global gender gap report 2018, vii. Retrieved from http://www3.weforum.org/docs/WEF_GGGR_2018.pdf. Accessed June 13, 2019.

[19] World Economic Forum. (2018). The global gender gap report 2018, 3. Retrieved from http://www3.weforum.org/docs/WEF_GGGR_2018.pdf. Accessed June 13, 2019.

[20] Stoet, G., & Geary, D. C. (2018). The Gender-Equality Paradox in Science, Technology, Engineering, and Mathematics Education. Psychological Science, 29(4), 581–593. https://doi.org/10.1177/0956797617741719

[21] World Economic Forum. (2018). The global gender gap report 2018, 4. Retrieved from http://www3.weforum.org/docs/WEF_GGGR_2018.pdf. Accessed June 13, 2019.
