What happens to DNA sequence when it comes off a sequencing machine?
What’s the challenge?
- After we have sequenced a sample of DNA we need a process to check that:
- The laboratory stage of the process, that prepares the DNA sample for sequencing, has worked properly
- The instrument carrying out the sequencing itself has run properly
- The DNA sample is from a single source and has not been contaminated with DNA from another sample.
What do we need to do?
- Quality control is an extensive set of procedures carried out to ensure that the sample and DNA sequence are of good quality. It is used to check that:
- The DNA sequence is of suitable quality to be sent on and used for the scientific study.
- One way this is assessed is by counting how many clusters of DNA there are in every mm² of each lane of the sequencing machine.
- For a sample to be accepted, there should be hundreds of thousands to millions of clusters of DNA per mm² of each lane (depending on the sequencing machine being used). If the number of clusters is outside the accepted range for that machine, it indicates that something has gone wrong during sequencing and the sample will not be accepted for further processing.
- The strength of the signal from the DNA bases in the sequence is also measured. The signals should be as bright as possible, particularly for the first base in the sequence. If the signal is dim, something might have gone wrong or the camera on the machine might have been out of focus.
- The DNA sample is not contaminated with DNA from another sample.
- This is checked by aligning the DNA sequence against the reference genome for that organism and checking that it matches the expected species. For example, if you have sequenced a mouse genome you would expect to see a 98-99 per cent match to the reference mouse genome and much lower matches with other reference genomes. It will never be 100 per cent because there is always some genetic variation between individuals of the same species.
- Individual ‘tags’ are added to each DNA sample before sequencing. These tags are short sequences of DNA that act as barcodes to identify DNA fragments from the same individual, so that the fragments can be easily identified and sorted afterwards. After sequencing, if a tag does not appear in a sample when it should, it is a sign that something has gone wrong before or during sequencing. This may be a result of contamination or human error.
- Transferring the sequence data off the machines and carrying out primary analysis takes about three to four days, although the manual quality control process usually takes only about one hour.
- After this, the sample is either passed or failed.
- If the sample fails, its sequence is discarded and sequencing is carried out again.
- For all the samples that pass, the DNA sequence is stored in a large data ‘bucket’ along with additional information about the sample. This will include which sample the DNA sequence is from, which species it is from and which study the genome was sequenced for.
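The checks described above can be pictured as a simple pass/fail decision. The following Python sketch is illustrative only: the `SampleMetrics` record, its field names, the `qc_failures` function and the threshold values are all assumptions made for this example, not the software or the cut-offs that a sequencing centre actually uses. Real accepted ranges depend on the sequencing machine.

```python
from dataclasses import dataclass, field


@dataclass
class SampleMetrics:
    """Hypothetical summary of one sequenced sample (names are illustrative)."""
    sample_id: str
    clusters_per_mm2: float   # cluster density measured in the lane
    reference_match: float    # fraction (0 to 1) matching the expected reference genome
    expected_tags: set = field(default_factory=set)   # barcodes added before sequencing
    observed_tags: set = field(default_factory=set)   # barcodes found after sequencing


def qc_failures(metrics, density_range=(200_000, 2_000_000), min_match=0.98):
    """Return the reasons a sample fails QC; an empty list means it passes.

    The default thresholds are placeholders chosen for illustration.
    """
    failures = []

    # Check 1: cluster density must fall inside the range for the machine.
    low, high = density_range
    if not (low <= metrics.clusters_per_mm2 <= high):
        failures.append("cluster density outside accepted range")

    # Check 2: a low match to the expected reference suggests contamination
    # or a mix-up of samples.
    if metrics.reference_match < min_match:
        failures.append("low match to expected reference genome")

    # Check 3: every tag that was added should be seen again after sequencing.
    missing = metrics.expected_tags - metrics.observed_tags
    if missing:
        failures.append("missing tags: " + ", ".join(sorted(missing)))

    return failures


good = SampleMetrics("sample_A", 850_000, 0.985, {"ACGTAC"}, {"ACGTAC"})
bad = SampleMetrics("sample_B", 50_000, 0.62, {"ACGTAC", "TTGCAA"}, {"ACGTAC"})

for sample in (good, bad):
    reasons = qc_failures(sample)
    status = "PASS" if not reasons else "FAIL"
    print(sample.sample_id, status, reasons)
```

In this sketch a sample passes only if every check succeeds, which mirrors the idea that a failed sample is discarded and sequenced again. The signal brightness check is left out to keep the example short.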
This page was last updated on 2021-07-21