Context: Recently, the Department of Biotechnology unveiled the Indian Genomic Data Set from the Genome India Project (GIP) and launched the ‘Framework for Exchange of Data (FeED)’ Protocols and the Indian Biological Data Centre (IBDC) Portals to make 10,000 whole genome samples accessible to researchers across India and the globe.
Relevance of the Topic: Prelims: Key facts about Genome, Genome Sequencing and its applications, Genome India Project.
What is Genome?
- A genome is an organism’s complete set of DNA. In humans, it comprises all the genes and the regions between them, contained in 23 pairs of chromosomes.
- Each chromosome is a contiguous stretch of DNA made up of millions of individual building blocks called nucleotides or bases: adenine (A), cytosine (C), guanine (G), and thymine (T).
- These bases (A, T, G and C) are arranged and repeated millions of times in different combinations.
- The genome contains all the information needed to describe the organism completely, acting essentially as a blueprint. It is read through a process called genome sequencing.
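Because a genome is, at its simplest, a long string over the four letters A, C, G and T, a short program can illustrate the idea. The following is a minimal sketch: the function name `base_composition` and the ten-base toy sequence are assumptions made for illustration, not real genomic data.

```python
from collections import Counter

VALID_BASES = "ACGT"  # adenine, cytosine, guanine, thymine


def base_composition(sequence: str) -> dict[str, int]:
    """Count how often each of the four bases occurs in a DNA string."""
    seq = sequence.upper()
    unknown = set(seq) - set(VALID_BASES)
    if unknown:
        raise ValueError(f"Unexpected characters in sequence: {sorted(unknown)}")
    counts = Counter(seq)
    # Report all four bases, including any that occur zero times.
    return {base: counts[base] for base in VALID_BASES}


# Hypothetical toy sequence; a real chromosome has millions of bases.
toy_sequence = "ATGCGATACG"
print(base_composition(toy_sequence))
```

A real genome is far too large to type out, but the representation is the same: only the order and count of the four letters change.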

What is Genome Sequencing?
Whole-genome sequencing is the decoding of the entire DNA present in the human cell, i.e., determining the precise order of the four bases and how they are arranged in chromosomes.
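Once the order of bases is known, it can be compared with a reference sequence to spot differences. The sketch below is a simplified illustration only: it assumes two toy sequences of equal length that are already aligned, and the function name `find_substitutions` is hypothetical. Actual sequencing pipelines rely on dedicated alignment and variant-calling software.

```python
def find_substitutions(reference: str, sample: str) -> list[tuple[int, str, str]]:
    """Return (position, reference_base, sample_base) for every mismatch.

    Positions are 1-based. Both sequences must have the same length,
    which is a simplifying assumption of this sketch.
    """
    if len(reference) != len(sample):
        raise ValueError("Sequences must be the same length in this sketch")
    return [
        (index + 1, ref_base, sample_base)
        for index, (ref_base, sample_base) in enumerate(zip(reference, sample))
        if ref_base != sample_base
    ]


# Hypothetical toy data: the sample differs from the reference at one base.
reference = "ATGCGATACG"
sample = "ATGCGTTACG"
print(find_substitutions(reference, sample))  # [(6, 'A', 'T')]
```

Here the sample carries a T where the reference has an A at position 6, which is the kind of single-base difference that a reference database of genetic variations records.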
Applications of Genome Sequencing:
- Disease Diagnosis and Treatment:
  - Genome sequencing can be used to identify genetic mutations and to evaluate rare disorders, predispositions to disorders, and even cancer from a genetic viewpoint. E.g., nearly 10,000 diseases (including cystic fibrosis and thalassemia) are the result of a single gene malfunctioning.
  - It can be used to read the genetic code of viruses, which helps in understanding how to combat a virus, tracking mutating variants, and developing vaccines. E.g., the development of COVID-19 vaccines.
- Personalised Drug Development: Can identify genetic targets for drug development and testing, leading to the development of more effective and personalised drugs.
- Prenatal Screening: Can be used as a tool for prenatal screening to investigate whether the foetus has genetic disorders/anomalies.
- Agriculture: Can help identify genes that contribute to desirable traits in plants and animals, allowing for the selective breeding of crops and livestock.
- Forensics: Genome sequencing can be used to identify suspects in criminal investigations and to establish paternity in cases of disputed parentage.
- Evolutionary Biology: Can help trace the evolutionary history of species and understand the mechanisms underlying evolution.
What is the Genome India Project?
- Genome India Project (GIP) was launched by the Department of Biotechnology (DBT) in January 2020.
- Aim: To execute whole genome sequencing of 10,000 Indians and create a comprehensive reference database of genetic variations prevalent in the Indian population.
- Success:
  - DBT announced the completion of the project in February 2024.
  - The project data was officially released in January 2025. The data is securely stored at the Indian Biological Data Centre (IBDC).
- Note:
  - The launch of the ‘Framework for Exchange of Data (FeED)’ Protocols under the Biotech-PRIDE Guidelines, 2021, ensures that the high-quality, nation-specific data will be shared in a transparent, fair, and responsible manner.
  - Biotech-PRIDE Guidelines were introduced in 2021 and are a testament to India’s commitment to ethical and secure data sharing.
Key Highlights of GIP
- Genome sequencing of 10,000 individuals: The project successfully sequenced the genomes of 10,074 samples, covering 99 ethnic groups.
- Creation of genetic database: Data is securely stored at the Indian Biological Data Centre (IBDC). IBDC will give researchers seamless access to this genetic information.
- Sample collection milestones: Over 19,000 blood samples have been collected, exceeding the initial target, and stored in the GenomeIndia Biobank for future research.
- Phase-1 analysis: Detailed quality checks and joint genotyping of 5,750 samples have uncovered rare genetic variations unique to Indian populations.
Need for GIP
- India has a population of around 1.4 billion, consisting of over 4,600 population groups, many of which are endogamous (disease-causing mutations are often amplified within some of these groups).
- Hence, GIP is needed to understand India’s unique genetic landscape and deal with the prevalence of rare diseases in the country.
Significance of GIP
The project has the potential to:
- Improve disease diagnosis and prevention: Identifying genetic markers associated with diseases can lead to earlier diagnosis, effective treatment, and preventive measures. E.g., identifying the prevalence of inherited disorders like sickle cell anaemia and thalassemia in certain tribal groups.
- Advance precision medicine: GIP's genomic data will be instrumental in implementing precision medicine, tailoring targeted therapies based on an individual's genetic profile.
- Empower genomic research: By sequencing genomes of a large and diverse group of individuals, the GIP will establish a baseline reference genome for the Indian population. This reference genome will be invaluable for researchers studying genomics in India and would contribute to the global understanding of human genetics.
Key Genome Sequencing Projects:
Human Genome Project:
- Launched in 1990 and completed in April 2003. Led by the USA.
- The Human Genome Project, for the first time, led to the decoding of the entire human genome.
IndiGen Project:
- Initiated by: Council of Scientific and Industrial Research (CSIR), from April 2019 to October 2019.
- Aim: To sequence the whole genomes of 1,029 individuals from diverse ethnic groups across India.
- The project was completed in 2019, and the results have been published in the scientific journal Nucleic Acids Research.
- Significance: The data can be used to study genetics of the Indian population and develop new treatments for diseases common in India.
