1. Introduction

Figure 1: A DNA sequence

Sequencing means taking a sample of DNA (anything from a small fragment to the entire genome of an organism) and determining the specific order of the A, T, C, and G nucleotides that make it up. This tutorial will explain one method by which that can be done.

The explanation that follows will be a lot easier if you’ve already done my previous tutorial about the polymerase chain reaction (because the two processes have a lot in common).

2. Sanger Sequencing Concept 1: Dideoxynucleotides

The first widely used sequencing method was developed by Frederick Sanger in 1977. This method, called Sanger sequencing, earned Sanger a share of the 1980 Nobel Prize in Chemistry, and was the basis of the techniques used to sequence the entire human genome. A working draft of that sequence was published in 2001, and the Human Genome Project was declared complete in 2003.

Sanger’s method involves in vitro DNA synthesis using dideoxynucleotides. In vitro means “in a test tube,” as opposed to in a living cell. The prefix “dideoxy” means “deoxygenated twice.” That’s because the monomers of DNA, deoxyribonucleotides (or just “deoxynucleotides”), have been deoxygenated once: they have one fewer oxygen atom than ribonucleotides, the monomers of RNA. To see this, take a look at molecule “B” below and find the number 2. That “2” represents a carbon atom in deoxyribose (the sugar in DNA). Notice that it has a hydrogen attached to it. In ribonucleotides (see molecule “C”), you’d find an –OH in that spot. So, deoxyribose is deoxygenated ribose. Now look at molecule “A,” which is a dideoxynucleotide. Notice that the number 3 carbon atom has a hydrogen atom attached to it. Compare it to the number 3 carbon in deoxyribose, which has an –OH attached to it. Hence, molecule “A” has been deoxygenated twice (compared to ribose). It’s a dideoxynucleotide.

Figure 2: Dideoxynucleotides (A), deoxynucleotides (B), and ribonucleotides (C)

Finally, note that all of the molecules above have three phosphate groups attached to them. When the nucleic acid polymers DNA and RNA are synthesized, two of these phosphate groups will be broken off and will form bonds with water molecules. That reaction liberates the free energy that can do important work for the cell: creating DNA or RNA.
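The comparison of the three sugars can be summarized as a small lookup table. The sketch below is only an illustration: the dictionary layout and the function name `oxygens_removed` are made up for this example, and the table records nothing beyond the groups on the number 2 and number 3 carbons described above.

```python
# Groups attached to the number 2 and number 3 carbons of each sugar,
# as described in the text. The layout of this table is illustrative only.
SUGAR_GROUPS = {
    "ribonucleotide": {"carbon 2": "OH", "carbon 3": "OH"},
    "deoxynucleotide": {"carbon 2": "H", "carbon 3": "OH"},
    "dideoxynucleotide": {"carbon 2": "H", "carbon 3": "H"},
}


def oxygens_removed(kind):
    """Count how many hydroxyl groups the sugar has lost compared to ribose."""
    return sum(1 for group in SUGAR_GROUPS[kind].values() if group == "H")


for kind in SUGAR_GROUPS:
    print(kind, oxygens_removed(kind))
```

Running it prints 0 for the ribonucleotide, 1 for the deoxynucleotide, and 2 for the dideoxynucleotide, which matches the "deoxygenated once" and "deoxygenated twice" naming.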

3. Sanger Sequencing Concept 2: ddNucleotides cause chain termination

McGovern, RA (2015). The Use of Genetic Sequencing Technologies to Determine HIV-1 Viral Tropism and to Evaluate the Effects of Maraviroc on Patient Viral Populations. (Doctoral dissertation, University of British Columbia, Vancouver, Canada). Retrieved from Research Gate.

The chemical difference between deoxynucleotides and dideoxynucleotides is at the heart of Sanger sequencing. That’s because when dideoxynucleotides (molecule “A,” at right) are incorporated into a newly synthesized DNA strand (d) by DNA polymerase (which is not shown), they can’t form a sugar-phosphate bond (e) between the dideoxynucleotide and the next nucleotide (b) at the 3′ end of the growing strand. That causes chain termination. As we’ll see in the next section, that chain termination is the basis of the Sanger sequencing method. But first, let’s check to make sure you’ve gotten everything so far.

4. Sanger Sequencing Quiz 1

[qwiz random = “true” qrecord_id=”sciencemusicvideosMeister1961-Sanger Sequencing 1 (ddNucleotides), M17″] [h]

Sanger sequencing quiz 1
[i]

Frederick Sanger, 1918-2013

[q] Figuring out the order of the nucleotide bases in DNA is called[hangman].

[c] sequencing

[f] Correct!

[q] In the diagram below, the dideoxynucleotide is shown at letter

[textentry single_char=”true”]

[c*] A

[f] Good Job! “A” is a dideoxynucleotide. 

[c] Enter word

[f] No, that’s not correct.

[c] *

[f] No. Dideoxynucleotides have lost two oxygen atoms compared to ribose (at “C”). Which of the molecules shown could be a dideoxynucleotide?

[q] In the diagram below, the type of nucleotide that you’d find in messenger RNA is

[textentry single_char=”true”]

[c*] C

[f] Excellent. “C” represents a ribonucleotide, the monomer of RNA.

[c] Enter word

[f] No, that’s not correct.

[c] *

[f] No. Here’s a hint. The monomers of RNA are ribonucleotides. The sugar that makes them up is ribose, and it has two hydroxyl (–OH) groups. 

[q] In the diagram below, the type of nucleotide that you’d find in the DNA making up your cells is

[textentry single_char=”true”]

[c*] B

[f] Excellent. “B” represents a deoxyribonucleotide, the monomer of DNA.

[c] Enter word

[f] No, that’s not correct.

[c] *

[f] No. Here’s a hint. The monomers of DNA are deoxyribonucleotides. They’ve lost one oxygen atom compared to ribose. You can find ribose by finding the nucleotide that has two hydroxyl groups. 

[q] In the diagram below, the type of nucleotide used by Sanger to bring about chain termination is

[textentry single_char=”true”]

[c*] A

[f] Nice job. “A” represents a dideoxynucleotide, and these bring about chain termination.

[c] Enter word

[f] No, that’s not correct.

[c] *

[f] No. Here’s a hint. Dideoxynucleotides don’t have hydroxyl groups attached to their 3′ carbon, and therefore can’t form sugar-phosphate bonds. 

[q] In the diagram below, the template strand is shown at

[textentry single_char=”true”]

[c*] C

[f] Excellent. “C” represents the template strand. 

[c] Enter word

[f] No, that’s not correct.

[c] *

[f] No. Here’s a hint. The template is the strand that’s “read” by DNA polymerase as it synthesizes a new strand in the 5′ to 3′ direction. 

[q] In the diagram below, a dideoxynucleotide is shown at

[textentry single_char=”true”]

[c*] A

[f] Nice job. “A” represents a dideoxynucleotide.  

[c] Enter word

[f] No, that’s not correct.

[c] *

[f] No. Here’s a hint. Incorporation of a dideoxynucleotide into a growing DNA strand results in strand termination. 

[q] In the diagram below, sugar-phosphate bonds are shown

[textentry single_char=”true”]

[c*] e

[f] Way to go! Letter “e” represents sugar phosphate bonds. 

[c] Enter word

[f] No, that’s not correct.

[c] *

[f] No. Sugar-phosphate bonds are the bonds that connect adjacent nucleotides. 

[q] Incorporation of a dideoxynucleotide into a growing DNA strand (as shown below) results in chain[hangman].

[c] termination

[q] Dideoxynucleotides can’t form sugar-[hangman] bonds at their 3′ end, resulting in chain termination.

[c] phosphate

[/qwiz]

5. Sanger Sequencing Concept 3: How ddnucleotide chain termination reveals DNA sequences

Pretend that you’re Frederick Sanger, trying to figure out the sequence of a strand of DNA that’s 9 nucleotides long. To assist your learning, I’m going to tell you what the sequence is, but the point is that by following the steps below, you could figure out the sequence (which is what Sanger did).

The sequence you’re interested in is shown in blue.

Sanger wasn’t concerned about 5′ ACAA 3′, shown in black. That’s because those nucleotides represent known primer binding sites, which I’ll explain more about below.

What Sanger did was to clone many copies of this sequence, using gene cloning techniques that you can learn about in the first tutorial in this module about genetic engineering. After 1983 this type of cloning could have been done using PCR.

With lots of DNA in hand, Sanger then divided the DNA into four fractions, which he placed into four separate reaction tubes. Each tube contained the DNA that Sanger was trying to sequence. Each tube also contained many copies of each of the necessary ingredients. These were:

  • A primer. This is a short stretch of DNA that will bind to the primer binding site. In this case, the primer 3′ TGTT 5′ will bind with 5′ ACAA 3′. The primer was created with radioactive nucleotides so that the DNA fragments with which it would bind could be visualized by exposing them to X-ray film after Southern blotting.
  • DNA polymerase. This is the key enzyme in DNA replication.
  • Free deoxynucleotides. A generalized formula for one of these is shown in Figure 2 above, at letter “B.” Sanger would have added deoxynucleotides with each of the four bases to each tube: dATP (with adenine), dTTP (with thymine), dGTP (with guanine), and dCTP (with cytosine). Note two things: 1) “dATP,” “dTTP,” and so on indicate the normal deoxyribonucleotides that get used during DNA synthesis. 2) The letters “TP” indicate that each nucleotide has three phosphate groups attached to it, enabling it to be incorporated by DNA polymerase into a new strand. For example, molecule “B” in Figure 2 above is deoxyguanosine triphosphate, or dGTP.

Each tube would also contain a different type of dideoxynucleotide. This is the form of nucleotide that is doubly deoxygenated, which causes chain termination. Each tube contained one of the following: ddATP (with adenine), ddTTP (with thymine), ddGTP (with guanine), or ddCTP (with cytosine).

Here’s what the whole setup would have looked like.

Now, let’s try to imagine what would have happened in tube # 4, the one with ddCTP.

The primer would bind with the primer binding site at the start of the template strand, at the 3′ end of the template.

When DNA Polymerase got to the first G (immediately after the primer) it could insert ddCTP. That would result in the shortest possible fragment: just one nucleotide past the primer, as shown below.

But it’s also possible that DNA polymerase would have inserted a dCTP (the “normal” nucleotide), and continued elongating the new strand until it got to the 7th nucleotide in the template strand, which is the 2nd “G.” At that point, DNA polymerase might insert another ddCTP, which would look like this:
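The same reasoning can be written as a short calculation. The sketch below is an illustration rather than part of Sanger's procedure: the function name `termination_lengths` is invented for this example, and it assumes the unknown region is 5′ CTGACTTCG 3′ (the sequence used throughout this tutorial), with the primer binding site left out. It lists every position where a ddnucleotide could end a chain.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def termination_lengths(template_5to3, dd_base):
    """Return the fragment lengths (past the primer) that one tube can produce.

    template_5to3: the unknown region written 5' to 3', primer site excluded.
    dd_base: the base on the tube's dideoxynucleotide, e.g. "C" for ddCTP.
    """
    # DNA polymerase reads the template 3' to 5', so walk it in reverse.
    read_order = template_5to3[::-1]
    lengths = []
    for position, template_base in enumerate(read_order, start=1):
        # A chain can stop here only if the incoming complementary
        # nucleotide is the one supplied in dideoxy form.
        if COMPLEMENT[template_base] == dd_base:
            lengths.append(position)
    return lengths


print(termination_lengths("CTGACTTCG", "C"))  # [1, 7]
```

The result, lengths 1 and 7, corresponds to the two fragments described for the ddCTP tube: one that stops at the first G in the template and one that stops at the second.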

6. Chain Termination, Checking Understanding

If you’re getting it, you should be able to predict what will happen in tube number 3, the tube with ddGTP. Answer the questions below.

[qwiz qrecord_id=”sciencemusicvideosMeister1961-Sanger Sequencing 2 (chain termination)M17″] [h]

Chain Termination in Reaction tube 3 (ddGTP): Checking Understanding

[q multiple_choice=”true”] You’re doing Sanger sequencing on the template 5’CTGACTTCG3′. In the reaction tube that has ddGTP, how many fragments will be produced?

[c] 1  [c] 2  [c*] 3  [c] 4

[f] No. The unknown template is  5’CTGACTTCG3′. In the ddGTP tube, the ddGTP molecules will be incorporated into the newly synthesized strand whenever there’s a “C” (cytosine) in the template. How many Cs are there?

[f] No. The unknown template is  5’CTGACTTCG3′. In the ddGTP tube, the ddGTP molecules will be incorporated into the newly synthesized strand whenever there’s a “C” (cytosine) in the template. How many Cs are there in the template?

[f] Excellent. The unknown template is  5’CTGACTTCG3′. In the ddGTP tube, the ddGTP molecules will be incorporated into the newly synthesized strand whenever there’s a “C” (cytosine) in the template. There are 3 cytosines, so there will be 3 fragments produced.

[f] No. The unknown template is  5’CTGACTTCG3′. In the ddGTP tube, the ddGTP molecules will be incorporated into the newly synthesized strand whenever there’s a “C” (cytosine) in the template. How many Cs are there?

[q] You’re doing Sanger sequencing on the template 5’CTGACTTCG3′. In the reaction tube that has ddG, the smallest fragment will be how many nucleotides long?
[textentry single_char=”true”]

[c*] 2

[f] Excellent! The smallest fragment will be two nucleotides long. That’s because the first cytosine is two nucleotides away from the 3′ end of the template strand, and that’s where DNA polymerase will insert the first ddG.

[c] Enter word

[f] No.

[c] *

[f] No. DNA polymerase is going to read the template DNA in the 3′ to 5′ direction, and lay down new complementary nucleotides as it synthesizes a new strand. In reaction tube 3, with ddG nucleotides, DNA polymerase is going to arrive at the first C (cytosine) that it encounters, and put in a complementary dGTP (guanine). However, because the reaction tube has ddG nucleotides, DNA polymerase might insert a ddGTP, resulting in chain termination. If you count how far that first C is from the 3′ end of the template, you’ll know how long that smallest ddG-terminated fragment will be.

[q] You’re doing Sanger sequencing on the template 5’CTGACTTCG3′. In the reaction tube that has ddGTP, the medium-sized fragment will be how many nucleotides long?
[textentry single_char=”true”]

[c*] 5

[f] Nice! The medium fragment will be five nucleotides long. That’s because the second cytosine is five nucleotides away from the 3′ end of the template strand, and that’s where DNA polymerase will insert the second ddG.

[c] Enter word

[f] No, that’s not correct.

[c] *

[f] No. DNA polymerase is going to read the template DNA in the 3′ to 5′ direction, and lay down new complementary nucleotides as it builds a new strand. In reaction tube 3, with ddGTP nucleotides, DNA polymerase is going to arrive at the first C (cytosine) that it encounters, and put in a complementary G (guanine). Assume, in this case, that DNA polymerase inserts a normal G nucleotide (a dGTP) where the first cytosine is on the template DNA, and then moves on to the second one. At this second position, DNA polymerase might insert a ddGTP, resulting in chain termination. If you count how far that second C is from the 3′ end of the template, you’ll know how long that medium sized ddG-terminated fragment will be.

Note that there will be three different fragments of three different lengths, depending where a ddGTP is first inserted.

[q] You’re doing Sanger sequencing on the template 5’CTGACTTCG3′. In the reaction tube that has ddGTP, the largest fragment will be how many nucleotides long?

[textentry single_char=”true”]

[c*] 9

[f] Nice! The largest fragment will be nine nucleotides long. That’s because the third cytosine is nine nucleotides away from the 3′ end of the template strand, and that’s where DNA polymerase will insert the third ddGTP.

[c] Enter word

[f] No.

[c] *

[f] No. DNA polymerase is going to read the template DNA in the 3′ to 5′ direction, and lay down new complementary nucleotides. Assume, in this case, that DNA polymerase inserts a normal G nucleotide where the first and second cytosines are on the template DNA, and then moves on to the third one. At this third position, DNA polymerase inserts a ddGTP, resulting in chain termination. If you count how far that third C is from the 3′ end of the template, you’ll know how long that largest ddG-terminated fragment will be.

[/qwiz]

SUMMARY

[qwiz]

[h]Reaction tube 3 (ddGTP)

[q]In tube number 3, the tube with ddGTP, you’d get three possible reaction products. Click “show the answer” to reveal what they are:

[c]Show the answer

[f]

[/qwiz]

Now, look one more time at the unknown template, and try to imagine the reaction products in tube 1 (with ddATP) and tube 2 (with ddTTP). You might want to sketch out your ideas on a sheet of paper before moving on to the next step.

7. Sanger Sequencing Concept 4: Electrophoresis of Reaction Products and Analysis of Results

Sanger’s next move was to take the reaction products in each tube and separate them by size using gel electrophoresis. If you need a refresher on how that works, consult my tutorial about DNA fingerprinting by clicking here. Otherwise, just remember that the smallest fragments will migrate the farthest.

Remember that Sanger didn’t know the sequence: he was figuring it out. What he did know was that by using dideoxynucleotides, he would get nucleotide chains that terminated with the ddnucleotide in that test tube.

What you’re trying to do is to predict the way the DNA segments produced in each reaction chamber would sort themselves out by size. Based on our results above, I’ll do the first one (the results from the ddCTP tube) for you. There were two segments produced: one was one nucleotide long (not including the primer) and one was seven nucleotides long.

Both of these fragments will start in the well near the negative pole of the gel. The fragment with one nucleotide will migrate to fragment length row 1. That’s because that fragment is (not including the primer) only one nucleotide long. The fragment that’s seven nucleotides long will migrate to fragment length row 7. You do the rest (and don’t be afraid to guess).
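Before you try the other lanes, here is one way to picture the ddCTP lane as text. This is a rough sketch: the function name `draw_gel` and the drawing style are invented for this illustration, and only the two ddCTP fragments worked out above are loaded.

```python
def draw_gel(lanes, max_length):
    """Return text rows for a virtual gel, longest fragments at the top.

    lanes: dict mapping a lane label to the fragment lengths loaded in it.
    max_length: the longest fragment length to draw a row for.
    """
    labels = list(lanes)
    rows = ["  ".join(f"{label:^5}" for label in labels) + "  length"]
    for length in range(max_length, 0, -1):
        # Draw a band where this lane holds a fragment of this length.
        cells = ["-----" if length in lanes[label] else "     " for label in labels]
        rows.append("  ".join(cells) + f"  {length}")
    return rows


# Only the ddCTP lane is filled in; the other lanes are left for you.
gel = draw_gel({"ddA": [], "ddT": [], "ddG": [], "ddC": [1, 7]}, max_length=9)
print("\n".join(gel))
```

The printout shows one band in row 7 and one in row 1 of the ddC lane, with the shorter fragment farther from the wells.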

[qwiz qrecord_id=”sciencemusicvideosMeister1961-Sanger Sequencing 3, Electrophoresis (M17)”]

[h]Sanger Sequencing: Analysis of Results of Electrophoresis

[q labels = “top”]The unknown sequence is 5’CTGACTTCG3′. Drag the fragments that end with ddnucleotides into the right place on the virtual gel below.

Negative pole
ddATP well ddTTP well ddGTP well ddCTP well Fragment Length
_________ _________ _________ _________ 9
_________ _________ _________ _________ 8
_________ _________ _________ ___ddC___ 7
_________ _________ _________ _________ 6
_________ _________ _________ _________ 5
_________ _________ _________ _________ 4
_________ _________ _________ _________ 3
_________ _________ _________ _________ 2
_________ _________ _________ ___ddC___ 1
Positive pole

[l]ddA

[fx] No, that’s not correct. Please try again.

[f*] Great!

[l]ddT

[fx] No. Please try again.

[f*] Correct!

[l]ddG

[fx] No, that’s not correct. Please try again.

[f*] Excellent!

[q labels = “bottom”]What Sanger saw after electrophoresis and Southern blotting was something like the gel shown below. You know that DNA is always synthesized in the 5′ to 3′ direction, and that every fragment ends with a ddnucleotide at its own 3′ end. Therefore, the shortest fragments end with the nucleotides closest to the 5′ end of the newly synthesized strand, and the longest fragments end with the nucleotides closest to its 3′ end. So, now your job is to use the gel to figure out the sequence of the newly synthesized strand. Once you’ve done that, just use the base pairing rules to determine the sequence of the original strand. I’ve given you two hints (so try not to look above in this page).

 

 

[l]A

[fx] No, that’s not correct. Please try again.

[f*] Good!

[l]T

[fx] No, that’s not correct. Please try again.

[f*] Good!

[l]C

[fx] No. Please try again.

[f*] Great!

[l]G

[fx] No. Please try again.

[f*] Great!

[/qwiz]

8. Sanger Sequencing: Application

That’s basically it for understanding how Sanger sequencing works. To recap:

  1. Sanger used dideoxynucleotides to bring about chain termination during in vitro DNA synthesis.
  2. Electrophoresis of the products in each reaction tube would sort the terminated chains by size.
  3. Analysis of the position of the fragments on the gel would reveal the sequence. The nucleotide at the 3′ end of the longest newly synthesized fragment would be complementary to the first nucleotide at the 5′ end of the unknown template. The next longest fragment would end with the complement of the next nucleotide, and so on.
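The third step can be sketched as a short routine. This is an illustration under stated assumptions, not a description of any real analysis software: the function name `read_gel` is made up, and the example gel is the one that the tutorial's template, 5′ CTGACTTCG 3′, would produce.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def read_gel(lanes):
    """Reconstruct the new strand and the template from fragment lengths.

    lanes: dict mapping each terminating base to the fragment lengths
           found in that base's lane.
    Returns (new_strand_5to3, template_5to3).
    """
    base_at_length = {}
    for base, lengths in lanes.items():
        for length in lengths:
            base_at_length[length] = base
    # The shortest fragment gives the 5' end of the new strand,
    # and the longest fragment gives its 3' end.
    new_strand = "".join(base_at_length[n] for n in sorted(base_at_length))
    # The template is the reverse complement of the new strand.
    template = "".join(COMPLEMENT[b] for b in reversed(new_strand))
    return new_strand, template


lanes = {"A": [3, 4, 8], "T": [6], "G": [2, 5, 9], "C": [1, 7]}
print(read_gel(lanes))  # ('CGAAGTCAG', 'CTGACTTCG')
```

Reading the bands from shortest to longest gives the new strand 5′ to 3′. Reversing that and swapping each base for its partner gives the template 5′ to 3′.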

Try predicting and analyzing a few Sanger sequencing gels.

[qwiz qrecord_id=”sciencemusicvideosMeister1961-Sanger Sequencing 3, Practice (M17)”] [h]

Sanger Sequencing Practice

[q labels = “bottom”]Here’s a sequence of DNA: 5’TTAGACCCGAT 3′. Drag the bar below (representing a DNA fragment) into the right place on the virtual gel below. Note the order of the reaction tubes (A, T, C, G).

Negative pole
ddATP well ddTTP well ddCTP well ddGTP well Fragment Length
_________ _________ _________ _________ 11
_________ _________ _________ _________ 10
_________ _________ _________ _________ 9
_________ _________ _________ _________ 8
_________ _________ _________ _________ 7
_________ _________ _________ _________ 6
_________ _________ _________ _________ 5
_________ _________ _________ _________ 4
_________ _________ _________ _________ 3
_________ _________ _________ _________ 2
_________ _________ _________ _________ 1
Positive Pole

[l]____

[f*] Good!

[fx] No. Please try again.

[q labels = “bottom”]Sanger sequencing of an unknown strand of DNA produces the gel shown below. What is the sequence of the newly synthesized strand, and what is the sequence of the original DNA template?

 

Newly synthesized DNA:           

3′_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 5′

Template DNA (what you’re sequencing)

5′_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 3′

 

[l]A

[f*] Great!

[fx] No. Please try again.

[l]T

[f*] Correct!

[fx] No. Please try again.

[l]C

[f*] Excellent!

[fx] No, that’s not correct. Please try again.

[l]G

[f*] Great!

[fx] No. Please try again.

[q multiple_choice=”true”] An unknown fragment of DNA is being processed for Sanger sequencing. Electrophoresis results in the gel below. What’s the sequence of this fragment of DNA?

[c] 5′ ATGCCAGTA 3′

[f]Here’s a hint: The longest fragment (which is at the top) ends in A. That means it was synthesized from a piece of DNA where the first nucleotide was 5′ T.

[c] 5′ TACGGTCAT 3′

[f]Here’s a hint: The longest fragment (which is at the top) ends in A. That means it was synthesized from a piece of DNA where the first nucleotide was 5′ T. The second longest fragment ended in T. That means it was synthesized from a piece of DNA where the second nucleotide was A. Keep on using the same method.

[c*] 5′ TACTGGCAT 3′

[f]Excellent. The gel above would be created during sequencing of 5′ TACTGGCAT 3′.

[q] An unknown fragment of DNA is being processed for Sanger sequencing. Electrophoresis results in the gel below. What’s the sequence of this fragment of DNA?

[c]5′ ACCGTAGAT 3′

[f]No. Reading from the top down, you can see that the longest synthesized strand ended in A. That means that T is the first nucleotide on the 5′ side.  You take it from there.

[c*]5′ TGGCATCTA 3′

[f]Excellent.

[c] 5′ TAGATGCCA 3′

[f]No. Reading from the top down, you can see that the longest synthesized strand ended in A. That means that T is the first nucleotide on the 5′ side. The next longest synthesized strand ended in C. That means that G was the next nucleotide. You take it from there.

[/qwiz]

9. Sequencing Advances: Sanger Method and Beyond

Sanger Sequencing is still in use today, though in an automated form that’s replaced four-lane electrophoresis with capillary gel electrophoresis. 

Sequencing DNA using dye terminators, University of Queensland, Australia

The process begins in much the same way as with Sanger’s original method: using ddnucleotides and in vitro DNA synthesis, fragments are created, each of which represents the position of a specific nucleotide. The ddnucleotides used in capillary gel electrophoresis, however, each have a special fluorescent pigment molecule attached, enabling them to be identified later. After fragment synthesis (step 1), all the fragments are fed into a gel-containing capillary tube (2). Electrophoresis happens within that tube, sorting the fragments by size. The fragments flow past a laser beam (3), which stimulates each ddnucleotide to fluoresce with a unique color. This color is read by a detector (4) and the sequence is recorded on a computer.
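The detector's job can be sketched in a few lines. This is a simplified illustration: the dye colors in `DYE_TO_BASE` and the function name `call_sequence` are assumptions made for this example (real instruments use their own dye sets and signal processing), and the input is an idealized list of colors in the order the fragments pass the laser.

```python
# Hypothetical dye-to-base assignment, chosen only for this example.
DYE_TO_BASE = {"green": "A", "red": "T", "blue": "C", "yellow": "G"}


def call_sequence(colors_in_arrival_order):
    """Translate detected colors into the new strand, written 5' to 3'.

    The shortest fragments reach the detector first, so the order of
    arrival is also the 5' to 3' order of the newly synthesized strand.
    """
    return "".join(DYE_TO_BASE[color] for color in colors_in_arrival_order)


detected = ["blue", "yellow", "green", "green", "yellow",
            "red", "blue", "green", "yellow"]
print(call_sequence(detected))  # CGAAGTCAG
```

Because every fragment runs in the same capillary, a single lane replaces the four separate lanes of the original method. The color alone identifies which ddnucleotide ended each fragment.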

Those advances, however, are just the start of advances in sequencing. Sanger sequencing, while still in use, is being supplanted by high-throughput techniques. If you’re interested, you can read about these at this page on Wikipedia, or you can even take this course in Next Generation sequencing offered by the European Bioinformatics Institute.

Links

At this point in your study of biology, the information on sequencing above is all you need to know. Click the links below to continue.

  1. Gene Editing Using CRISPR-Cas9 (the next tutorial in this series)
  2. Genetic Engineering and Biotechnology Main Menu