Human Genome Project Approaches

Clone-by-clone sequencing

In this approach, the chromosomes were first mapped and then split into sections. A rough map was drawn for each section, and the sections were then split into smaller pieces, with plenty of overlap between neighboring pieces. Each smaller piece was sequenced, and the overlaps were used to put the genome jigsaw back together again.
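
The idea can be sketched in a few lines of Python. This is a toy illustration, not the pipeline the project used: the function names, piece length, and overlap size are all assumptions, and the list order of the pieces stands in for the map.

```python
def split_with_overlap(sequence, piece_len, overlap):
    """Cut `sequence` into ordered pieces; each shares `overlap` bases with the next."""
    if not 0 < overlap < piece_len:
        raise ValueError("overlap must be positive and smaller than piece_len")
    step = piece_len - overlap
    pieces = []
    start = 0
    while True:
        pieces.append(sequence[start:start + piece_len])
        if start + piece_len >= len(sequence):
            break
        start += step
    return pieces


def merge_by_overlap(left, right, min_overlap):
    """Join two pieces at the longest suffix of `left` that is a prefix of `right`."""
    for size in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return left + right[size:]
    raise ValueError("pieces do not overlap")


def reassemble(pieces, min_overlap):
    """Rebuild the sequence; the order of `pieces` plays the role of the map."""
    assembled = pieces[0]
    for piece in pieces[1:]:
        assembled = merge_by_overlap(assembled, piece, min_overlap)
    return assembled
```

Because the map already says which piece follows which, each piece only has to be compared with its neighbor. Real sequences contain repeats, which can produce misleading overlaps that this sketch does not handle.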

The clone-by-clone approach has several advantages. First, by mapping the genome, researchers produce, at an early stage, a genetic resource that can be used to map genes. In addition, because every DNA sequence is derived from a known region, it is relatively easy to keep track of the project and to determine where there are gaps in the sequence. Moreover, assembling relatively short regions of DNA is an efficient step. However, mapping can be a time-consuming and costly process.

Whole genome shotgun sequencing

The alternative to the clone-by-clone approach is 'bottom-up' whole genome shotgun (WGS) sequencing. Shotgun sequencing was developed by Fred Sanger in 1982. All the DNA is first broken into fragments. The fragments are then sequenced at random and assembled by looking for overlaps.
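
A minimal sketch of assembly by overlap, assuming error-free reads and no repeats, is shown below. It uses a simple greedy strategy (always merge the pair of sequences with the longest overlap), chosen here for brevity rather than because any particular project used it. The function names and the `min_overlap` parameter are illustrative.

```python
from itertools import permutations


def overlap_length(a, b, min_overlap):
    """Length of the longest suffix of `a` that is a prefix of `b`, or 0 if too short."""
    for size in range(min(len(a), len(b)), max(min_overlap, 1) - 1, -1):
        if a.endswith(b[:size]):
            return size
    return 0


def greedy_assemble(reads, min_overlap):
    """Repeatedly merge the two sequences with the longest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        # A sequence lying wholly inside another adds no new information.
        contigs = [c for c in contigs
                   if not any(c != other and c in other for other in contigs)]
        best = (0, None, None)
        for a, b in permutations(contigs, 2):
            size = overlap_length(a, b, min_overlap)
            if size > best[0]:
                best = (size, a, b)
        size, a, b = best
        if size == 0:
            return contigs  # nothing left to merge
        contigs.remove(a)
        contigs.remove(b)
        contigs.append(a + b[size:])
```

With no map to say which reads are neighbors, every read has to be compared with every other read. That all-against-all comparison is what makes this approach computationally demanding for large genomes.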

In recent approaches, libraries have been made from DNA fragments of 2000 base pairs and of 10,000 base pairs in length. The two sizes of fragment provide complementary results; by sequencing the ends of the fragments, each provides information about DNA sequences separated by known distances. Computer analysis is used to search the sequences for overlaps.
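
The value of sequencing both ends of a fragment can be illustrated as follows. This is a simplified sketch: the helper names and the 50-base read length are assumptions, and the second read is modeled as the reverse complement of the fragment's far end, since the two ends are read from opposite strands.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the sequence of the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def end_reads(genome, start, insert_size, read_len):
    """Read both ends of a fragment of known length taken from `genome`."""
    fragment = genome[start:start + insert_size]
    forward = fragment[:read_len]
    reverse = reverse_complement(fragment[-read_len:])
    return forward, reverse


def implied_span(assembly, forward, reverse):
    """Distance in `assembly` from the start of one end read to the end of the other.

    Returns None if either read cannot be found.
    """
    left = assembly.find(forward)
    right = assembly.find(reverse_complement(reverse))
    if left < 0 or right < 0:
        return None
    return right + len(reverse) - left
```

If the span implied by an assembly disagrees with the known fragment length (2000 or 10,000 base pairs in the libraries described above), the region between the two reads has probably been assembled incorrectly.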

The advantage of the whole-genome shotgun is that it requires no prior mapping. Its disadvantage is that large genomes need vast amounts of computing power and sophisticated software to reassemble the genome from its fragments. To sequence a mammalian genome (billions of bases long), you need about 60,000,000 individual DNA sequence reads. Reassembling these sequenced fragments requires huge investments in IT, and, unlike the clone-by-clone approach, assemblies can't be produced until the end of the project.
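
A rough calculation shows where a figure of this size comes from. The read length (500 bases) and genome size (3 billion bases) below are assumptions chosen for illustration, not figures from the text. The estimate of uncovered sequence uses the standard Poisson approximation for randomly placed reads.

```python
import math


def coverage(n_reads, read_len, genome_size):
    """Average number of times each base is sequenced."""
    return n_reads * read_len / genome_size


def fraction_uncovered(cov):
    """Expected fraction of bases hit by no read, assuming reads land at random."""
    return math.exp(-cov)


# Assumed values: 500-base reads and a 3-billion-base genome.
depth = coverage(60_000_000, 500, 3_000_000_000)
print(depth)                      # 10.0
print(fraction_uncovered(depth))  # about 4.5e-05
```

Under these assumptions, 60 million reads cover each base about ten times on average. That redundancy is what allows overlaps to be found between randomly placed reads.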

Whole genome shotgun for large genomes is especially valuable if there is an existing 'scaffold' of organized sequences, localized to the genome, derived from other projects. When the whole genome shotgun data are laid on the 'scaffold' sequence, it is easier to resolve ambiguities. Today, whole genome shotgun is used for most bacterial genomes and as a 'top-up' of sequence data for many other genome projects.
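
The sketch below shows one way shotgun reads might be laid on an existing scaffold. It assumes exact matching, whereas real alignment tools tolerate sequencing errors and natural variation. The function names are illustrative.

```python
def find_all(scaffold, read):
    """Return every position at which `read` occurs in `scaffold`."""
    positions = []
    start = scaffold.find(read)
    while start != -1:
        positions.append(start)
        start = scaffold.find(read, start + 1)
    return positions


def place_reads(scaffold, reads):
    """Sort reads into uniquely placed, ambiguous (several hits), and unplaced."""
    placed, ambiguous, unplaced = [], [], []
    for read in reads:
        hits = find_all(scaffold, read)
        if len(hits) == 1:
            placed.append((hits[0], read))
        elif hits:
            ambiguous.append(read)
        else:
            unplaced.append(read)
    placed.sort()  # order by position on the scaffold
    return placed, ambiguous, unplaced
```

Reads that match the scaffold at exactly one position are ordered without any comparison against other reads. Only the ambiguous and unplaced reads need further work.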

[Figure labels: Clone-by-clone; Whole genome shotgun; overlap]