DMD gene: the largest human gene, with 79 exons

The instructions for making proteins are stored in genes, which consist of DNA (Figure 1). DNA consists of two strands of building blocks called nucleotides. There are four different types of nucleotides (depicted as four differently coloured circles). Each nucleotide can bind to only one type of nucleotide in the opposite strand. In Figure 1, green circles always pair with yellow, while blue pairs with red. During so-called ‘transcription’, an RNA copy is generated from the DNA of a particular gene. RNA also consists of nucleotides, but these form only a single strand (depicted as squares). This RNA copy is then translated into protein. Each group of three RNA nucleotides (a triplet or codon) encodes one protein building block (a single amino acid). Proteins are built from 20 standard amino acids, and each protein consists of tens to thousands of amino acids.

Figure 1. From DNA to protein
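
The steps from DNA to protein can be sketched in a few lines of Python. This is a minimal illustration, not a bioinformatics tool: the short sequence is made up, the codon table is only a small excerpt of the standard genetic code, and the function names are chosen here for illustration. The letters A, T, G and C stand for the four DNA nucleotides shown as coloured circles in Figure 1 (A pairs with T, G pairs with C); in RNA, U takes the place of T.

```python
# Each DNA nucleotide binds only one partner in the opposite strand.
DNA_PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}

# Small excerpt of the standard genetic code (RNA codon -> amino acid).
CODON_TABLE = {
    "AUG": "Met", "GCU": "Ala", "AAA": "Lys", "GGC": "Gly", "UAA": "STOP",
}

def complement(strand):
    """Return the nucleotides that pair with the given DNA strand."""
    return "".join(DNA_PAIR[n] for n in strand)

def transcribe(coding_strand):
    """Make the RNA copy: same sequence as the coding strand, with U for T."""
    return coding_strand.replace("T", "U")

def translate(rna):
    """Read the RNA three nucleotides at a time until a stop codon."""
    protein = []
    for i in range(0, len(rna) - 2, 3):
        amino_acid = CODON_TABLE[rna[i:i + 3]]
        if amino_acid == "STOP":
            break
        protein.append(amino_acid)
    return protein

coding = "ATGGCTAAAGGCTAA"          # made-up example sequence
print(complement(coding))           # opposite DNA strand
rna = transcribe(coding)
print(rna)                          # single-stranded RNA copy
print(translate(rna))               # list of amino acids
```

Here 15 nucleotides give four amino acids: the first four codons each encode one amino acid, and the fifth codon is a stop signal.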

Mutations (mistakes) in the DNA of a gene can affect the function of the protein encoded by that gene. A deletion (the loss of nucleotides) can have different consequences. When the number of deleted nucleotides is divisible by 3 (3, 6, 9, 12 etc.), one or more amino acids will be missing from the protein. In Figure 2, the fourth (red) amino acid is missing due to a deletion of the 3 nucleotides that code for it. Depending on the number and the location of the missing amino acids, the consequences for protein function can range from negligible to disastrous.

Figure 2. Mutations that maintain the reading frame
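
The situation in Figure 2 can be reproduced with a small Python sketch. The RNA sequence, the excerpt of the codon table and the helper names (is_in_frame, delete, translate) are assumptions made for this illustration only.

```python
# Small excerpt of the standard genetic code (RNA codon -> amino acid).
CODON_TABLE = {
    "AUG": "Met", "GCU": "Ala", "AAA": "Lys",
    "GGC": "Gly", "UUU": "Phe", "UAA": "STOP",
}

def is_in_frame(deleted_nucleotides):
    """A deletion keeps the reading frame if its length is divisible by 3."""
    return deleted_nucleotides % 3 == 0

def delete(rna, start, length):
    """Remove `length` nucleotides beginning at position `start` (0-based)."""
    return rna[:start] + rna[start + length:]

def translate(rna):
    """Read the RNA three nucleotides at a time until a stop codon."""
    protein = []
    for i in range(0, len(rna) - 2, 3):
        amino_acid = CODON_TABLE[rna[i:i + 3]]
        if amino_acid == "STOP":
            break
        protein.append(amino_acid)
    return protein

normal = "AUGGCUAAAGGCUUUUAA"        # made-up example: 5 codons + stop
mutant = delete(normal, 9, 3)        # remove exactly the fourth codon (GGC)

print(is_in_frame(3))                # True: 3 is divisible by 3
print(translate(normal))             # ['Met', 'Ala', 'Lys', 'Gly', 'Phe']
print(translate(mutant))             # ['Met', 'Ala', 'Lys', 'Phe']
```

Only the fourth amino acid disappears; every amino acid before and after the deletion is unchanged because the remaining codons are still read in the same groups of three.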

When the number of deleted nucleotides is not divisible by three, the effect on the amino acid sequence is larger (Figure 3). The deletion disrupts the reading frame, so from the deletion onwards the wrong amino acids are incorporated into the protein (amino acid 4 and further). Such a disruption often leads to a premature stop codon, and with these mutations protein function is generally lost completely.

Figure 3. Mutation that disrupts the reading frame
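
A frame-disrupting deletion can be sketched in Python as well. The example below builds the full standard codon table (amino acids in one-letter code, with * marking a stop codon) and removes a single nucleotide from a made-up RNA sequence; the sequence and the function names are assumptions for illustration.

```python
from itertools import product

# Standard genetic code: the 64 codons in U, C, A, G order, with the
# matching amino acids in one-letter code ("*" marks a stop codon).
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def delete(rna, start, length):
    """Remove `length` nucleotides beginning at position `start` (0-based)."""
    return rna[:start] + rna[start + length:]

def translate(rna):
    """Read codons until a stop codon; leftover nucleotides are ignored."""
    protein = ""
    for i in range(0, len(rna) - 2, 3):
        aa = CODON_TABLE[rna[i:i + 3]]
        if aa == "*":
            break
        protein += aa
    return protein

normal = "AUGGCUAAAGGACUGACCUUGGAUUAA"   # made-up example: 8 codons + stop
mutant = delete(normal, 9, 1)             # one nucleotide lost in codon 4

print(translate(normal))   # MAKGLTLD
print(translate(mutant))   # MAKD
```

In this example the first three amino acids are unaffected, the fourth is wrong (D instead of G), and the shifted frame then runs into a stop codon, so the protein ends after four amino acids instead of eight.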