Gender bias can be tough to prove. When it comes to AI or big data, however, gender bias tends to be the norm. Take, for example, the voice recognition software that couldn’t recognize the higher register of women’s voices, the AI recruiting program that favored hiring men over women, or the facial recognition algorithm that was 34 percent less likely to recognize a black woman than a man.1 Such instances are so abundant that Gartner estimates 85 percent of today’s AI projects may deliver erroneous outcomes due to bias in data, algorithms, or the teams responsible for managing them.2

These gaps have serious potential to undermine the promise of big data and AI. Overlooking gender and other intersections3 of diversity can amplify and advance existing inequalities, creating short- and long-term consequences for the people, businesses and economies interacting with algorithm-based decision-making.

Businesses are keen to embed AI and other big data technologies into their operations. According to KPMG’s 2021 CEO Outlook, 60 percent of CEOs are investing more capital into data-rich technologies.4 While technological advancement may be critical to remaining competitive in the marketplace, investing in misinformed algorithms can lead to poor decision-making practices, with far-reaching impacts on business processes, profitability, brand and reputation.

As society looks to a future where big data and AI will become more ingrained in everything we do, closing the gender data gap will require a conscious effort. This starts with the make-up of data and technology teams, but also extends to the design and structure of data collection practices, and how it all comes together to deliver equitable, accessible, and representative outcomes.

To get there, though, we need to understand why the gender data gap exists today and what we can do about it.

Gender-blind data: A troubled tradition

There is a long history of men designing and leading research efforts, and women being left out. As a result, much of the data we depend on today doesn’t account for the biological sex or socio-cultural gender differences between men and women.

When gender isn’t a consideration – what is called gender-blind data – men become the default. This, in turn, creates blind spots that exacerbate existing gender gaps and introduce new inequalities with their own detrimental consequences. This can be seen in medical research, where men continue to represent the majority of participants in clinical trials. For example, after decades of male-focused research, cardiovascular disease is considered a man’s illness, despite it being the major cause of death for women over 65 years.5 Certain digital health apps today are therefore more likely to diagnose a woman experiencing chest pains and nausea with a panic attack or inflammation, while a male user displaying the same symptoms is told he may be having a heart attack and needs to seek medical help immediately.6

Men also serve as the baseline of industry-based studies. Seatbelt and airbag safety research uses crash test dummy drivers that are fashioned after the average physique of men and their seating positions. Women’s bodies, however – their size, stature, and physical attributes – are overlooked by design. As a result, women are 47 percent more likely to be seriously injured and 17 percent more likely to die than men in a similar accident.7

There are also socioeconomic consequences to gender-blind data. This has become especially apparent during the COVID-19 pandemic, where several data collection efforts have failed to recognize women’s unique experiences and needs. Failing to represent women in data collection efforts means discounting the serious and gendered impacts of the pandemic, including on women’s livelihoods, unpaid care burden, mental health and exposure to gender-based violence. With flawed data to support decision-making, gender-blind policy responses in many jurisdictions have rolled back progress and further widened gender gaps around health, education and economic opportunity.

The value of gender-disaggregated data

Gender-disaggregated data (gendered data) is a simple first step that allows the invisible to be counted. When made available, gendered data can allow for more tailored solutions and informed decision-making.
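
To make the idea concrete, here is a minimal sketch in Python of what disaggregation does to a single metric. The records, the field layout, and the helper names `overall_rate` and `rate_by_gender` are invented for illustration only and are not drawn from any study cited here. The point is simply that one overall figure can mask very different results for women and men.

```python
from collections import defaultdict

# Hypothetical, made-up records for illustration only:
# each row is (gender, whether the service met the person's need).
records = [
    ("woman", False), ("woman", False), ("woman", True), ("woman", False),
    ("man", True), ("man", True), ("man", True), ("man", False),
]

def overall_rate(rows):
    """Share of all rows with a positive outcome (the gender-blind view)."""
    return sum(1 for _, ok in rows if ok) / len(rows)

def rate_by_gender(rows):
    """Share of positive outcomes computed separately for each gender."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for gender, ok in rows:
        totals[gender] += 1
        if ok:
            positives[gender] += 1
    return {gender: positives[gender] / totals[gender] for gender in totals}

print(overall_rate(records))    # a single figure for everyone
print(rate_by_gender(records))  # one figure per group
```

In this toy example the overall rate is 0.5, while the disaggregated rates are 0.25 for women and 0.75 for men, a gap that the single figure hides.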

In health care, for example, gender-disaggregated data improves access to support and services, and thereby the outcomes of those encounters for women. The value of disaggregated data isn’t just the fact that it distinguishes between men and women. It can also prompt the collection and investigation of new data fields and data points that may not have been considered previously. This could include experience-based or user journey data, which offer richer and more measurable reporting on the outcomes of implemented policies across various intersections of a population.

In a workforce context, gendered data can lead to better talent retention and more effective decisions around employee programs, attracting consumers and driving company objectives. During the pandemic, for example, research found that many women experienced increased caregiving responsibilities. Companies that used these data points to tailor support initiatives were thereby able to retain talent that might otherwise have been forced to leave.

Despite better information and outcomes, gender-disaggregated data is still lacking globally. According to the WHO, only 39 percent of countries – mostly high- to middle-income nations – collected sex-disaggregated COVID-19 case and mortality data. Gendered data practices continue to be considered the exception rather than a best practice, leaving many to perceive them as time-consuming and costly.8 But making this a basic prerequisite within data practices, along with small and affordable changes, can lead to better results for all.

Investing in disaggregation creates real value and can pay real dividends in the long term. If urban planners, for example, adopt a ‘gender responsive’ approach in the creation of safe cities, the needs of women and girls can be designed into infrastructure.9 As companies continue to expand their use of AI technologies, their risk exposure increases. Analysis of phone use data in low- to middle-income countries is illuminating: ‘300 million fewer women than men access the internet on a mobile’10 and women in these countries are 20 percent less likely than men to own a smartphone.11 These technologies generate data about their users, so the fact that women have less access to them inherently skews datasets. Awareness of data collection gaps, as well as adoption of a gender lens in analyzing datasets that inform AI or machine learning algorithms, can be one small step towards creating a gender responsive approach.
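
One simple way to apply such a lens is to compare each group’s share of a dataset with its share of the population the dataset is meant to describe. The sketch below is illustrative only: the function name `representation_gap`, the counts, the reference shares, and the 5-percentage-point threshold are all assumptions made for the example, not figures or methods from the reports cited above.

```python
def representation_gap(dataset_counts, reference_shares):
    """Return (share in dataset - share in reference population) per group."""
    total = sum(dataset_counts.values())
    if total == 0:
        raise ValueError("dataset_counts must contain at least one record")
    return {
        group: dataset_counts.get(group, 0) / total - reference_shares[group]
        for group in reference_shares
    }

# Hypothetical numbers for illustration only.
dataset_counts = {"women": 300, "men": 700}     # records collected by an app
reference_shares = {"women": 0.5, "men": 0.5}   # assumed population shares

THRESHOLD = 0.05  # assumed tolerance; choose one that suits the use case

gaps = representation_gap(dataset_counts, reference_shares)
for group, gap in gaps.items():
    status = "under-represented" if gap < -THRESHOLD else "ok"
    print(f"{group}: {gap:+.2f} ({status})")
```

With these made-up inputs, women account for 30 percent of the records against an assumed 50 percent of the population, so the check flags them as under-represented. A check like this does not remove bias on its own, but it makes a collection gap visible before the data is used to train or evaluate a model.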

Bridging the gender data divide

With technologies still evolving, there’s an opportunity to be inclusive by design. To do this, the teams designing these systems should be seeking to better reflect the society their inventions are supposed to transform. That means bringing more women onto teams and offering them a real voice in the design and decision-making of data and technologies.12

Algorithms by themselves may not be biased. But the humans who write them can display unconscious bias. With women making up less than 22 percent of the AI workforce, and significantly fewer holding decision-making roles, it is no surprise that unconscious bias is still prevalent in the technologies we use.13

As businesses embed more data-driven technologies into their organizations, diverse teams that support decision-making and algorithm design will develop a more equitable balance in data collection and programming. Diverse teams will not only result in more appealing messaging; they also have the potential to increase the functionality of products and services for women and reduce unintended bias.

Much more is needed. Some of this is coming in the form of regulation, such as the EU calling for clearer directives to help ensure fairness along the entire AI value chain, starting in research and innovation. But with the right tools, organizations can get in front and be proactive about how they support more inclusive data practices.

As a start, KPMG offers three areas where businesses can begin to take action to build a more inclusive data culture:

Within your business
Establish an organization-wide mindset around gender equality. To ensure everyone is working from the same script, clearly define what gender equality, inclusion and fairness mean for your organization. Acknowledge gender bias and that it is everyone’s responsibility to challenge it. To ensure a common understanding across your organization, embed these dimensions within the company’s priorities for big data and AI.
Build diversity within big data teams. Ensure the teams developing and managing your data systems can embed and advance inclusion and equity. This means balancing the diversity of the teams themselves. Gender-diverse teams working to design, develop and implement data-driven applications can challenge underlying assumptions, leading to more inclusive and more effective decision-making.

Products, services, and customers
Design products and services with a gender lens. Apply a gender lens to the design and testing of products and services, monitoring for potential bias and unintended impacts on women and girls. Adopt participatory practices which include the worldviews of women in decision-making. Design processes and controls to govern the development and use of data-driven applications.

Supply chain and communities
Further refine supply chain data. Consider what data is available across the supply chain with particular attention to the wellbeing of women and girls in the supply chain and associated communities.

These actions offer a starting point to design, collect and leverage data that will more fairly and equitably represent and serve women. With more inclusive, high-quality data as a fundamental input, we can make more effective decisions and create truly transformational technologies that can advance gender equity in our communities, businesses, and economies.

Written by:

Ruth Lawrence PhD, Senior Executive, KPMG International
Oriana Vaccarino PhD, Manager, People & Change, KPMG in Canada
Genie Boericke, Director, KPMG Global Lighthouse
With special thanks to Nishtha Chadha for her research assistance.

1 “We Need to Close the Gender Data Gap By Including Women in Our Algorithms”, TIME, 2020
2 “Nearly Half of CIOs Are Planning to Deploy Artificial Intelligence”, Gartner, 13 February 2018
3 The term ‘intersectionality’ has been used to understand women’s experiences at the intersection of a number of simultaneous oppressions including [but not limited to] race, class, caste, gender, ethnicity, sexuality, disability, nationality, immigration status, geographical location, religion and so on. “The value of intersectionality in understanding violence against women and girls”, UN Women, January 2019
5 “Gender differences in coronary heart disease”, US National Library of Medicine, 18 December 2010
6 “It’s hysteria, not a heart attack, GP app Babylon tells women”, The Times, 13 October 2019
7 “AI Bias Could Put Women’s Lives At Risk - A Challenge For Regulators”, Forbes, 2 March 2020
8 “The need for gender data”, GIWPS, 7 March 2018
9 “Creating safe cities”, KPMG Global
10 “The mobile gender gap report 2021”, Mobile for Development, 2021
11 “The mobile gender gap report 2020”, Mobile for Development, 2020
12 “Women have stepped up to support each other in these roles creating a global network of women in Big Data”, Women in Big Data
13 “The Harsh Reality About Being a Woman in AI and Data Science”, Towards Data Science, 3 April 2021