What is shotgun sequencing?

Shotgun sequencing involves randomly breaking DNA into many small fragments, sequencing each fragment, and then reassembling the original sequence by finding regions where the fragments overlap.

Large mammalian genomes are particularly difficult to clone, sequence and assemble because of their size and structural complexity. As a result, clone-by-clone sequencing, although reliable and methodical, takes a very long time. As sequencing has become cheaper and assembly software more sophisticated, researchers have increasingly relied on whole genome shotgun sequencing to tackle larger, more complex genomes.

  • Shotgun sequencing was originally used by Fred Sanger and his colleagues to sequence small genomes such as those of viruses and bacteria.
  • Whole genome shotgun sequencing bypasses the time-consuming mapping and cloning steps that make clone-by-clone sequencing so slow.
  • In whole genome shotgun sequencing the entire genome is broken up into small fragments of DNA for sequencing.
  • These fragments are often of varying sizes, ranging from 2-20 kilobases (2,000-20,000 base pairs) to 200-300 kilobases (200,000-300,000 base pairs).
  • These fragments are sequenced to determine the order of the DNA bases, A, C, G and T.
  • The sequenced fragments are then assembled together by computer programs that find where fragments overlap.
  • You can imagine shotgun sequencing as being a bit like shredding multiple copies of a book (which in this case is a genome), mixing up all the fragments and then reassembling the original text (genome) by finding fragments with text that overlap and piecing the book back together again.
  • This method of genome sequencing was used by Craig Venter, founder of the private company Celera Genomics, to sequence the human genome. Venter wanted to sequence the human genome faster than the publicly funded effort and felt that whole genome shotgun sequencing was the best way to do so. To assemble the sequence, Venter also used the publicly available clone-by-clone data from the Human Genome Project.
  • Now, as technologies are improving, whole genome shotgun sequencing is being used to improve the accuracy of existing genome sequences, such as the reference human genome.
  • It is used to remove errors, fill in gaps or correct parts of the sequence that were originally assembled incorrectly when clone-by-clone sequencing was used.
  • As a consequence, the reference human genome is constantly being improved to ensure that the genome sequence is of the highest possible standard.
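
The overlap-and-merge idea described in the list above can be sketched in a few lines of Python. This is a toy illustration only, not how real assemblers work: the function names (`overlap`, `greedy_assemble`), the `min_len` threshold and the 14-base example genome are all assumptions made for the example. It assumes error-free reads from a single strand and a genome with no repeats, and it repeatedly merges the pair of reads with the longest overlap.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Overlaps shorter than `min_len` are ignored and reported as 0.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Greedily merge the best-overlapping pair until no overlaps remain.

    Returns a list of contigs (one string if everything joins up).
    """
    # Drop exact duplicates and reads wholly contained in another read.
    reads = list(dict.fromkeys(reads))
    reads = [r for r in reads if not any(r != o and r in o for o in reads)]

    while len(reads) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # nothing overlaps: leave the rest as separate contigs
        merged = reads[best_i] + reads[best_j][best_k:]
        rest = [r for idx, r in enumerate(reads) if idx not in (best_i, best_j)]
        # Discard any read now fully contained in the merged contig.
        reads = [r for r in rest if r not in merged] + [merged]
    return reads


# Hypothetical 14-base "genome" AGCTTAGGCATCGA, shredded into three
# overlapping reads and presented out of order.
reads = ["GCATCGA", "AGCTTAG", "TTAGGCAT"]
print(greedy_assemble(reads))  # → ['AGCTTAGGCATCGA']
```

Comparing every read with every other read is far too slow for millions of reads, and a greedy merge can join the wrong fragments when the genome contains repeats, which is one reason real assembly needs so much computing power.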

What are the advantages of shotgun sequencing?

  • By removing the mapping stages, whole genome shotgun sequencing is a much faster process than clone-by-clone sequencing.
  • Whole genome shotgun sequencing uses a fraction of the DNA that clone-by-clone sequencing needs.
  • Whole genome shotgun sequencing is particularly efficient if there is an existing reference sequence. It is much easier to assemble the genome sequence by aligning it to an existing reference genome.
  • Shotgun sequencing is much faster and less expensive than methods requiring a genetic map.
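
To illustrate why an existing reference makes assembly easier, here is a minimal sketch of reference-guided assembly. Instead of comparing reads with each other, each read is placed at the position on the reference where it mismatches least, and a consensus is then called by majority vote at each position. The function names, the toy sequences and the brute-force search are assumptions for illustration; real aligners use indexed data structures and also handle insertions and deletions.

```python
from collections import Counter


def best_position(read, reference):
    """Return (offset, mismatches) for the placement with fewest mismatches.

    Returns None if the read is longer than the reference.
    """
    best = None
    for off in range(len(reference) - len(read) + 1):
        window = reference[off:off + len(read)]
        mismatches = sum(1 for x, y in zip(read, window) if x != y)
        if best is None or mismatches < best[1]:
            best = (off, mismatches)
    return best


def consensus_from_reference(reads, reference):
    """Place each read on the reference and call a majority-vote consensus.

    Positions not covered by any read are reported as 'N'.
    """
    columns = [Counter() for _ in reference]
    for read in reads:
        placement = best_position(read, reference)
        if placement is None:
            continue
        off, _ = placement
        for i, base in enumerate(read):
            columns[off + i][base] += 1
    return "".join(
        col.most_common(1)[0][0] if col else "N" for col in columns
    )


# Hypothetical reference, and reads from a sample that differs from it
# by a single base (G -> C at position 6, counting from 0).
reference = "AGCTTAGGCATCGA"
reads = ["AGCTTACG", "TTACGCAT", "CGCATCGA"]
print(consensus_from_reference(reads, reference))  # → AGCTTACGCATCGA
```

Each read is handled independently here, so the work grows with the number of reads rather than with the number of read pairs, which is the main saving over assembling from overlaps alone.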

What are the disadvantages of shotgun sequencing?

  • Vast amounts of computing power and sophisticated software are required to assemble shotgun sequences together. To sequence the genome from a mammal (billions of bases long), you need about 60 million individual DNA sequence reads.
  • Errors in assembly are more likely to be made because a genetic map is not used. However, these errors are generally easier to resolve than those in other methods, and they are minimised if a reference genome can be used.
  • Whole genome shotgun sequencing of a large genome is most practical when a reference genome is already available; without an existing genome to match the reads against, assembly is very difficult.
  • Whole genome shotgun sequencing can also lead to errors which need to be resolved by other, more labour-intensive types of sequencing, such as clone-by-clone sequencing.
  • Repetitive genomes and sequences can be more difficult to assemble.
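
The read count quoted above can be put in context with a rough coverage calculation: average coverage is the number of reads multiplied by the read length, divided by the genome length. The sketch below assumes a read length of 500 bases and a genome of 3 billion bases; neither figure comes from the text above, which only says "billions of bases", so treat both as illustrative. The second function uses the standard Lander-Waterman approximation that, with randomly placed reads, the fraction of the genome left uncovered is about e^(-coverage).

```python
import math


def coverage(n_reads, read_len, genome_len):
    """Average number of times each base is sequenced."""
    return n_reads * read_len / genome_len


def expected_uncovered_fraction(cov):
    """Lander-Waterman estimate of the fraction of bases no read covers."""
    return math.exp(-cov)


# Assumed values for illustration: 500-base reads, 3-billion-base genome.
cov = coverage(n_reads=60_000_000, read_len=500, genome_len=3_000_000_000)
print(cov)  # → 10.0
print(expected_uncovered_fraction(cov) * 3_000_000_000)  # bases expected to be missed
```

Under these assumptions, 60 million reads correspond to each base being read about ten times on average. That redundancy is what lets assembly software find reliable overlaps and outvote individual sequencing errors.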

This page was last updated on 2021-07-21