Understanding DNA Sequencing, Part 1

The basic unit of all living organisms, from bacteria to humans, is the cell. Cells contain a molecule called deoxyribonucleic acid (DNA); in eukaryotes such as humans, most of it is housed in the nucleus, while bacteria have no nucleus at all. Today, we know that DNA is the blueprint used to build an organism: our genetic makeup, or genotype, controls our phenotype (observable characteristics). The directions encoded by our genes control everything from growth and development to cell specification, neuronal function, and metabolism.

Fig. 1: The molecular structure of DNA.

A strand of DNA is composed of building blocks known as nucleotides (Fig. 1). Each deoxynucleotide (dNTP) comprises three basic parts: a phosphate group, a deoxyribose sugar, and a nitrogen-containing base (adenine, cytosine, guanine, or thymine, abbreviated as A, C, G, or T). The order of these nucleotides gives rise to genes, each with a unique sequence. The 3’ hydroxyl group on the sugar of one nucleotide forms a covalent bond with the 5’ phosphate group of its neighbor, making DNA a stable scaffold for genetic information. Because of this bond, every DNA strand has a distinct polarity, with a 5’ end and a 3’ end, which ensures that the strand is read in the correct direction.

After scientists identified DNA as the genetic material, a worldwide race began to unlock the secrets encoded in it. One breakthrough came in the late 1970s, when Frederick Sanger developed a method to determine the nucleotide sequence of DNA by creating a series of copies of the original in vitro. The method creates a set of copies that are complementary to the original DNA sequence using a DNA primer to target the site to be sequenced, the enzyme DNA polymerase I (DNA Pol I), and free nucleotides. DNA Pol I uses the primer to start synthesis of the new strand of DNA in the 5’-3’ direction, using the existing DNA as a template.

To this mixture, Sanger added dideoxynucleotides (ddNTPs). These nucleotide analogs lack the 3’ hydroxyl group (Fig. 2), making it impossible for the polymerase to add another nucleotide to the end of the growing strand. This creates a series of DNA fragments of differing sizes that can be used to map the location of each nucleotide in a given piece of DNA.

Fig. 2: Molecular structure of dNTPs versus ddNTPs

For example, to determine the location of each guanine nucleotide in a given sequence, we would assemble a sequencing reaction that contains the DNA template, a primer, a high concentration of the four free nucleotides, DNA polymerase I, and a low concentration of ddGTP. As DNA Pol I copies the DNA, ddGTP is randomly and infrequently incorporated into the growing DNA strands in place of dGTP. Once a ddGTP is incorporated into a chain, synthesis of that chain terminates, so the reaction produces a “nested set” of fragments that each end with a ddGTP. After the reaction is complete, the products are analyzed by polyacrylamide gel electrophoresis, which separates the mixture of DNA fragments by size. The length of each fragment reveals the position of a guanine in the newly synthesized strand (Fig. 3).
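
To make the “nested set” idea concrete, here is a minimal Python sketch of a ddGTP reaction. It is an illustration only: the nine-base template is made up, the function names are our own, and the primer is ignored, so fragment lengths are counted from the first copied base. Rather than simulating random incorporation, it simply lists every fragment that could form.

```python
# Base-pairing rules used by the polymerase when copying a template.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def synthesize(template_3_to_5: str) -> str:
    """Return the new strand (written 5'->3') copied from a template written 3'->5'."""
    return "".join(PAIRS[base] for base in template_3_to_5)


def terminated_fragments(new_strand: str, dd_base: str) -> list[str]:
    """List every fragment that ends where a ddNTP of type dd_base could be incorporated."""
    return [
        new_strand[: i + 1]
        for i, base in enumerate(new_strand)
        if base == dd_base
    ]


template = "TACGGTCAG"              # hypothetical template, written 3'->5'
new_strand = synthesize(template)   # the full-length copy, written 5'->3'
g_fragments = terminated_fragments(new_strand, "G")
g_lengths = [len(fragment) for fragment in g_fragments]

print(new_strand)
print(g_fragments)
print(g_lengths)
```

Each length in `g_lengths` marks a position where the new strand carries a G. Running the same function with "A", "C", and "T" would give the fragment sets for the other three reactions.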

Fig. 3: Principles of Sanger Sequencing

For a full sequence analysis, four separate enzymatic reactions are performed, each containing a different ddNTP. Each reaction also includes radiolabeled dATP, which labels the growing nucleotide chains. The completed reactions are loaded into depressions (or “wells”) in a polyacrylamide gel. Traditionally, the “A” sample is loaded in the first well, “C” in the second, and “G” and “T” in the third and fourth wells, respectively. Next, an electrical current is passed through the gel. Because the sugar-phosphate backbone of DNA carries a strong negative charge, the current drives the DNA through the gel toward the positive electrode.

At first glance, a polyacrylamide gel appears to be a solid at room temperature. On the molecular level, however, the gel contains small channels through which DNA can pass. Small DNA fragments move through these channels easily, while large fragments have a harder time squeezing through. Because fragments of different sizes travel at different speeds, they separate and form discrete “bands” within the gel. Polyacrylamide gels can resolve fragments that differ in size by a single nucleotide. The sequencing gel is very long, allowing a scientist to determine 500-800 nucleotides of sequence from a single set of sequencing reactions. Together, the four sequencing reactions contain fragments that span the entire stretch of DNA being sequenced, with each fragment corresponding to a different nucleotide position (Fig. 4).

Fig. 4: Analyzing Sanger sequencing by electrophoresis

After electrophoresis is complete, the radiolabeled DNA can be visualized by autoradiography. The polyacrylamide gel is placed in direct contact with a sheet of X-ray film. The DNA fragments, having incorporated the radioactive dATP, create dark exposure bands on the film that correspond to their positions in the gel. Because the smallest fragments move through the gel faster than the larger ones, the sequence is read from the bottom of the gel to the top, which gives the newly synthesized strand in the 5’-3’ direction. For example, the simulated autoradiograph (or “autorad”) in Figure 4 would result from Sanger sequencing analysis of the DNA in Figure 3.
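
Reading an autorad from bottom to top amounts to sorting the bands by fragment length and noting which lane each band sits in. The short Python sketch below shows that logic. The band positions are hypothetical values chosen for illustration (they are not taken from Figure 4), and the function name `read_gel` is our own.

```python
def read_gel(lanes: dict[str, list[int]]) -> str:
    """Read a sequence from a gel, given the fragment lengths found in each lane."""
    # Collect every band as a (fragment length, lane base) pair.
    bands = [
        (length, base)
        for base, lengths in lanes.items()
        for length in lengths
    ]
    # The shortest fragments travel farthest, so sorting by length
    # is the same as reading the gel from the bottom to the top.
    bands.sort()
    return "".join(base for _, base in bands)


# Hypothetical band positions for the four ddNTP reactions.
lanes = {
    "A": [1, 6],
    "C": [4, 5, 9],
    "G": [3, 7],
    "T": [2, 8],
}

sequence = read_gel(lanes)
print(sequence)
```

The result is the sequence of the newly synthesized strand, which is complementary to the original template.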

Check back next Monday to learn more about how DNA is sequenced in today’s laboratories. We will also discuss the interpretation of DNA sequence information using bioinformatics.