For a long time, the non-coding regions of the genome (about 98%) were dismissed as a ‘dark area’ or ‘junk DNA’. This is no longer the case.
They play a huge role in gene expression, the control of which has become an important area of study in recent times, called epigenetics.
Epigenetics deals with the processes that control how the genes are expressed.
We know that all the cells in our body have the same genome.
Further, there are about 37 trillion cells of roughly 200 different types, and the same code (genome) sits in the nucleus of each of these cells.
If the same code is present in all the cells, how is it that there are some 200 cell types making up 4 different tissue types and 78 different organs, all working in unison to make human life possible?
The answer is in gene expression.
Different genes are expressed in different cells, which is why those cells look different and perform different functions (like your heart cell and your kidney cell).
All this relates to gene regulation. There are different ways in which the expression of a gene is regulated.
Introns, Exons and RNA Splicing
As we have seen, mRNA carries a copy of only the coding part of a gene.
The coding part does not sit on the chromosome as one single continuous sequence.
It is spread out along the chromosome in pieces, and each piece is separated from the next by a non-coding stretch.
In fact, these non-coding stretches inside genes make up roughly 25% of the genome.
The non-coding stretches within a gene are called introns, and the coding pieces that end up in the mRNA are called exons.
Transcription first copies the whole gene, introns included, into a precursor RNA (pre-mRNA); the introns must then be cut out so that only the exons remain.
This cutting out of introns and joining of exons is called RNA splicing.
The set of all exons in the genome is called the exome (which represents only about 1.5% of the genome).
As you can appreciate, it is this final mRNA after splicing that matters for coding the protein.
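The cutting and joining described above can be pictured with a small Python sketch. It is only an illustration: the sequence is made up, the exon positions are handed to the function as an assumption (a real cell finds them through signals in the RNA itself), and the name `splice` is hypothetical.

```python
def splice(pre_mrna: str, exons: list[tuple[int, int]]) -> str:
    """Join the exon segments of a transcript and drop everything in between.

    `exons` holds (start, end) positions, 0-based with end exclusive, in order.
    """
    return "".join(pre_mrna[start:end] for start, end in exons)


# Made-up transcript: exons in upper case, introns in lower case for readability.
pre_mrna = "AUGGCC" + "guaagu" + "UUUGGA" + "gucag" + "UAA"
exons = [(0, 6), (12, 18), (23, 26)]  # assumed exon positions

mature_mrna = splice(pre_mrna, exons)
print(mature_mrna)  # AUGGCCUUUGGAUAA
```

Only the upper-case pieces survive, joined end to end, which is the spliced message that goes on to be translated.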
Muscular Dystrophy
One type of muscular dystrophy, a genetic disease, can result from a defect in the splicing of RNA copied from a gene on the X chromosome.
Since the gene lies on the X chromosome, the disease is more prevalent in males, who have only one X chromosome.
Regulation during transcription (DNA -> mRNA)
As we have seen, transcription happens in the nucleus, and only about 1.5% of the genome codes for proteins.
So, the mRNA needs to end up carrying only the coding part of the genome.
Also, only a particular gene or set of genes has to be copied, depending on the cell in which transcription is taking place.
The copying is carried out by an enzyme called RNA polymerase, which builds the mRNA.
But how does it know which part of the genome to transcribe, and when to start and stop copying?
This is where the regulatory factors of epigenetics come into the picture.
Promoter region
These are non-coding parts of the genome to which proteins bind in order to attract RNA polymerase to the required coding part of the genome.
Start and Stop signals
Besides these, there are regions in the genome that act like traffic signals for starting and stopping transcription.
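As a rough picture of how such landmarks could guide copying, here is a toy Python sketch. It is heavily simplified and rests on assumptions: the promoter is reduced to a single TATA-box-like motif, the stop signal `TTTTTT` is a purely hypothetical marker, copying starts immediately after the motif, and the DNA sequence is made up. Real start and stop signals are far more varied.

```python
PROMOTER = "TATAAA"    # TATA-box-like motif, used here as the "start" landmark
TERMINATOR = "TTTTTT"  # hypothetical "stop" marker, for illustration only


def transcribe(coding_strand: str) -> str:
    """Copy the stretch between the promoter and the terminator into RNA."""
    start = coding_strand.find(PROMOTER)
    if start == -1:
        return ""  # no promoter found, so nothing is transcribed
    start += len(PROMOTER)
    stop = coding_strand.find(TERMINATOR, start)
    if stop == -1:
        stop = len(coding_strand)  # no stop signal: copy to the end
    # The RNA matches the coding strand, with U in place of T.
    return coding_strand[start:stop].replace("T", "U")


dna = "GGCC" + PROMOTER + "ATGGCCTTAGGATAA" + TERMINATOR + "GCGC"
print(transcribe(dna))  # AUGGCCUUAGGAUAA
```

A sequence without the promoter motif yields an empty string, which mirrors the idea that a gene without an accessible promoter is not copied.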
Post-transcriptional regulation
Regulation during translation (mRNA -> Protein)
Translation, as we have seen, involves the reading of mRNA by tRNA in the ribosomes (the protein-making factories).
tRNA reads 3 letters at a time, and each group of 3 translates into one amino acid, the building block of proteins.
After reading 3 letters of mRNA (a codon), tRNA simply brings in the matching amino acid from those lying in the cell.
As in the case of transcription, there has to be some way for tRNA to know when to start and stop reading codons. These signals are called the Start Codon and the Stop Codon respectively.