
Why Less Than 2% of Human Genome Sequencing Comes from African Populations Is a Problem for Global Health, Ancient Mayans with Sickle Cell?



In the era of personalized medicine and groundbreaking advancements in genomics, the promise of better healthcare, early disease detection, and individualized treatment plans is closer than ever. Yet, this progress is not equally shared. A startling and often overlooked fact reveals a glaring disparity in scientific research: less than 2% of human genome sequencing data currently comes from individuals of African descent 1.


This figure is not just a statistic. It highlights a major blind spot in the foundation of modern genomic science, one that has significant consequences for global health equity, scientific discovery, and medical treatment.

Why African Genomes Matter More Than Most Think

Africa is widely acknowledged as the cradle of humanity. Genetic evidence confirms that modern humans (Homo sapiens) originated in Africa over 200,000 years ago. Because African populations have a deep evolutionary history and did not pass through the population bottlenecks that accompanied migration out of the continent, they harbor the greatest genetic diversity in the world 2. This diversity holds the potential to unlock crucial insights into:

  • Human evolution and migration

  • Disease susceptibility and resistance

  • Rare genetic variants with significant medical value

Yet paradoxically, the populations with the richest genomic information are the least studied.



The Global Genomic Bias

The vast majority of genome-wide association studies (GWAS) have been conducted on people of European ancestry. A 2021 analysis revealed that over 80% of genome sequencing data in global databases comes from European populations, while individuals of African descent, despite making up roughly 17% of the global population, contribute less than 2% 3.
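To make the size of that gap concrete, the two percentages quoted above can be turned into a simple representation ratio (share of data divided by share of population). This is only an illustrative calculation using the approximate figures from the text; the function name is made up for this example.

```python
def representation_ratio(data_share: float, population_share: float) -> float:
    """Share of sequencing data divided by share of world population.

    A value of 1.0 would mean a group is represented in proportion to its size.
    """
    if population_share <= 0:
        raise ValueError("population_share must be positive")
    return data_share / population_share


# Approximate figures quoted in the text (0.02 is the upper bound of "less than 2%").
african_data_share = 0.02
african_population_share = 0.17

ratio = representation_ratio(african_data_share, african_population_share)
print(f"Representation ratio: {ratio:.2f}")
```

A ratio of about 0.12 means people of African descent appear in the data at roughly one-eighth of the rate their population share would suggest. Because 2% is an upper bound, the true ratio is lower still.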

This lack of representation has several negative consequences:

  • Reduced accuracy of genetic risk assessments for non-European populations.

  • Missed opportunities to identify unique genetic variants protective against diseases like cancer, hypertension, or HIV.

  • Skewed understanding of the genetic architecture of diseases, potentially leading to ineffective treatments for underrepresented populations.


For instance, polygenic risk scores developed using European data often perform poorly when applied to African genomes due to differences in allele frequencies and linkage disequilibrium patterns 4.
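A polygenic risk score is, at its simplest, a weighted sum of a person's risk-allele counts. The sketch below uses made-up effect sizes and allele frequencies (not real data) to show one part of the problem: when allele frequencies differ between populations, the same weights give a different average score, so a cutoff calibrated in one population is misplaced in another. It does not model linkage disequilibrium, which is a further source of error.

```python
# Hypothetical per-variant effect sizes (weights), as if estimated in a European cohort.
effect_sizes = [0.30, 0.10, 0.25]

# Hypothetical risk-allele frequencies for the same three variants.
freq_european = [0.40, 0.20, 0.10]
freq_african = [0.10, 0.50, 0.10]


def polygenic_score(allele_counts, weights):
    """Weighted sum of risk-allele counts (0, 1, or 2 copies per variant)."""
    if len(allele_counts) != len(weights):
        raise ValueError("need one allele count per weight")
    return sum(count * weight for count, weight in zip(allele_counts, weights))


def expected_score(frequencies, weights):
    """Population-average score: a person carries 2 * frequency copies on average."""
    if len(frequencies) != len(weights):
        raise ValueError("need one frequency per weight")
    return sum(2 * freq * weight for freq, weight in zip(frequencies, weights))


person = [1, 0, 2]  # copies of the risk allele at each variant
print(f"Individual score: {polygenic_score(person, effect_sizes):.2f}")
print(f"European mean:    {expected_score(freq_european, effect_sizes):.2f}")
print(f"African mean:     {expected_score(freq_african, effect_sizes):.2f}")
```

With these illustrative numbers the two population means differ even though the weights are identical, which is one reason a score threshold tuned on European data cannot simply be reused elsewhere.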




The Consequences of a Skewed Genomic Landscape


Let’s consider an example. Sickle cell disease (SCD), a condition most prevalent in African populations, has received far less research funding and therapeutic attention than rarer disorders that mainly affect European populations. This disparity exists despite the well-known genetic underpinnings of SCD and the potential for genomics to guide treatment development 5.

Furthermore, underrepresentation limits the detection of pharmacogenetic variants (differences in genes that affect how individuals metabolize medications). Standard drug doses may therefore be too high, too low, or completely ineffective in African individuals compared to their European counterparts 6.

Why This Is Happening

Several factors contribute to the underrepresentation of African genomes in research:

  • Colonial legacies and distrust in medical research due to unethical past practices (e.g., the Tuskegee Study).

  • Lack of funding and infrastructure for genomic research in Africa.

  • Limited inclusion of African scientists and institutions in global research initiatives.

  • Inequities in data ownership and benefit sharing, which discourage participation.


Initiatives Addressing the Gap

Fortunately, several initiatives are rising to meet this challenge. For example:

  • H3Africa (Human Heredity and Health in Africa) is a consortium funded by the NIH and the Wellcome Trust to support African-led genomic research projects 7.

  • The African Genome Variation Project aims to characterize genetic diversity across the continent.

  • New startups and biotech firms within Africa, such as 54gene, are working to build infrastructure and promote genomic equity from within.


However, these efforts still require global collaboration, equitable funding, and respectful data sharing practices to succeed.

Footnotes

  1. Fatumo, S., et al. (2022). "A roadmap to increase diversity in genomic studies." Nature Medicine, 28(2), 243–250. https://doi.org/10.1038/s41591-021-01672-5

  2. Tishkoff, S. A., & Verrelli, B. C. (2003). "Patterns of human genetic diversity: Implications for human evolutionary history and disease." Annual Review of Genomics and Human Genetics, 4(1), 293–340. ↩

  3. Popejoy, A. B., & Fullerton, S. M. (2016). "Genomics is failing on diversity." Nature, 538(7624), 161–164. ↩

  4. Martin, A. R., et al. (2019). "Clinical use of current polygenic risk scores may exacerbate health disparities." Nature Genetics, 51, 584–591. ↩

  5. Piel, F. B., et al. (2017). "Sickle cell disease." New England Journal of Medicine, 376, 1561–1573. ↩

  6. Johnson, J. A. (2013). "Pharmacogenetics in clinical practice: How far have we come and where are we going?" Pharmacogenomics, 14(7), 835–843. ↩

  7. H3Africa Consortium. (2014). "Enabling the genomic revolution in Africa." Science, 344(6190), 1346–1348. ↩




Scientists warn African DNA is missing from the global genome map


African scientists say the world is missing vital medical insights by ignoring the continent’s vast genetic diversity. Less than 2 percent of human genome sequencing worldwide comes from African populations. https://www.rfi.fr/en/africa/20250404-scientists-warn-african-dna-is-missing-from-the-global-genome-map

Question: If less than 2% of human genome sequencing comes from African populations, how is it used for commercial genome purposes? Are they not using 98% of it?


If less than 2% of human genome sequencing data comes from African populations, yet companies and researchers are using genomic data to power a multi-billion-dollar industry (think: ancestry services, personalized medicine, drug development, genetic risk scores), what exactly are they working with?


The answer: They’re largely using data from people of European ancestry. That 98%+ comes primarily from individuals in the U.S., Europe, and other high-income countries. Here's how that plays out in the commercial genomics world:



🔬 1. Commercial DNA Tests and Ancestry Reports

Companies like 23andMe, AncestryDNA, and others have built massive genomic databases — and over 75–80% of their customer base is of European descent. So when they analyze your DNA to tell you where your ancestors are from or whether you're likely to have a certain trait, the algorithm is matching your data to a largely Eurocentric reference panel.


Consequence:

  • If you're of African descent, your ancestry breakdown may be less precise or incomplete, because the comparative data just isn’t there.

  • Traits or health reports may be less accurate, or simply unavailable.



💊 2. Pharma and Drug Development

Many commercial genomics companies license their data to pharmaceutical companies for drug target discovery and development. Since most of that data comes from European populations:

  • Drugs are being designed and tested based on Euro-centric genomes.

  • Drug responses in other populations, especially Africans, are under-researched and may be less effective or have unforeseen side effects.

This perpetuates a dangerous feedback loop: limited representation → biased research → less effective treatments for marginalized groups → more data collected from the same populations → rinse and repeat.


🧬 3. Polygenic Risk Scores (PRS)

PRS are used to estimate a person’s genetic risk for complex conditions like diabetes, heart disease, or Alzheimer’s.

But here's the kicker: PRS models trained on European genomes perform up to 5 times worse when applied to African genomes 1.

So for African or African-descended individuals:

  • These predictions may be inaccurate or misleading.

  • A low-risk score may not mean you’re low-risk.

  • A high-risk score might not mean much at all.


💸 So... How is that 2% being used?

Right now, it’s massively underutilized. That tiny fraction of African genomic data does exist in some research initiatives (e.g., H3Africa), but commercial players have largely not integrated it because:

  • It’s still a small data pool → not statistically strong enough for commercial modeling.

  • Ethical and data sovereignty concerns around African genomic data sharing → access is restricted.

  • There hasn't been enough investment in African-led sequencing at scale.



In short: the industry is built on the roughly 98% of data that is not African, and the diversity it is missing is exactly the problem.

💥 Bottom Line:

The global genomics market is currently built on a lopsided, non-representative foundation. And that’s risky — not just for equity, but for science itself. Ignoring African genetic diversity is like trying to write the history of humankind with only the last few pages of the book.


Footnotes

  1. Martin, A. R., et al. (2019). "Clinical use of current polygenic risk scores may exacerbate health disparities." Nature Genetics, 51, 584–591. ↩





Chatquan Question:

Human genome sequencing comes from which specific African populations?

Excellent question, and it gets to the heart of why “African DNA” is not a monolith. When people say "African populations" in genome sequencing, they’re usually referring to a small subset of communities, not the full, rich diversity of the continent.

Let’s break down who is actually represented in current African genome sequencing efforts:


🌍 1. The 1000 Genomes Project (one of the first big efforts)

This project included just a few African populations:

  • Yoruba in Ibadan, Nigeria (YRI) – one of the most widely used African reference populations.

  • Luhya in Webuye, Kenya (LWK) – East African Bantu-speaking population.

  • Gambian in Western Divisions, The Gambia (GWD)

  • Mende in Sierra Leone (MSL)

  • Esan in Nigeria (ESN)


🧬 These populations are all West or East African — and mostly from urban or accessible regions.



⚠️ Who’s Not Represented?

Despite these efforts, many African populations are still absent, including:

  • Pygmy and Khoisan groups (some data exists but is very limited)

  • Amazigh (Berber) populations of North Africa

  • Pastoralist communities like the Tuareg or Maasai

  • Ethnolinguistic minorities — of which Africa has thousands

  • Diaspora populations in the Americas and Caribbean



📊 Why This Matters

When genome sequencing efforts only include a handful of African groups:

  • Medical insights and drug responses might not apply to all Africans.

  • Ancestry tests lump vastly different groups into “West African” or “Sub-Saharan African.”

  • It reinforces a narrow, incomplete picture of African genetic diversity.


🧬 TL;DR:

Most African genome sequencing has come from:

  • West Africans: Yoruba, Mende, Gambians, Esan, Hausa

  • East Africans: Luhya, Ethiopians, Kenyans

  • Southern Africans: Zulu, Xhosa (still underrepresented)

  • Led by projects like 1000 Genomes, H3Africa, and AGVP

But this is just the beginning — Africa has over 2,000 distinct ethnolinguistic groups, and the vast majority still haven’t been sequenced.






Bone Change in the Sickle Cell Anemia With Similar Changes Observed In the Skull of Ancient Mayan Indians

S. Moore - 1929


 Sickle Cell Disease Simulating Advanced Rheumatoid Arthritis 

Report of a Case

 EUGENE L. COODLEY, M.D., and MORLEY J. KERT, M.D., Los Angeles









Recognition of sickle cell anemia in skeletal remains of children



Abstract

The present study discusses in detail the osteological changes associated with sickle cell anemia in children and their importance in differential diagnosis. Posterior calcaneal and specific articular surface disruptive metacarpal lesions are diagnostic for sickle cell anemia. Calvarial thickening, tibial and femoral cortical bone thickening, and bowing are of more limited utility in differential diagnosis. Granular osteoporosis, pelvic demineralization and rib broadening are nonspecific. Localized calvarial “ballooning,” previously not described, may have diagnostic significance. Bone marrow hyperplastic response (porotic hyperostosis) in sickle cell anemia produces minimal radiologic changes contrasted with that observed in thalassemia and blood loss/hemolytic phenomenon.

Two other issues, the osteological criteria for discriminating among the anemias and the purported relationship between porotic hyperostosis and iron deficiency anemia, are also discussed. There is sufficient information to properly diagnose the four major groups of anemias, and further, to establish that iron deficiency is only indirectly associated with porotic hyperostosis. The hyperproliferative bone marrow response (manifest as porotic hyperostosis) to blood loss or hemolysis exhausts iron stores, resulting in secondary iron deficiency. Am J Phys Anthropol 104:213–226, 1997. © 1997 Wiley-Liss, Inc.






FIRST TRIBE






 
 
 


©2021 by FIRST TRIBE ABORIGINAL. Proudly created with Wix.com
