Glossary


a

adenine (A)
One of the four types of base that spells out our DNA code. When a base is attached to a phosphate and sugar, it makes up a nucleotide, one of DNA's 'building blocks'. In the double-helical structure of DNA, A (adenine) pairs with T (thymine). A is also present in RNA.
See also: base, DNA, thymine (T)
amino acid (AA)
The components or building blocks of protein. There are twenty different amino acids that join together in long chains to make proteins. These are (with three-letter and one-letter abbreviations in brackets): alanine (Ala, A), arginine (Arg, R), asparagine (Asn, N), aspartic acid (Asp, D), cysteine (Cys, C), glutamic acid (Glu, E), glutamine (Gln, Q), glycine (Gly, G), histidine (His, H), isoleucine (Ile, I), leucine (Leu, L), lysine (Lys, K), methionine (Met, M), phenylalanine (Phe, F), proline (Pro, P), serine (Ser, S), threonine (Thr, T), tryptophan (Trp, W), tyrosine (Tyr, Y) and valine (Val, V).
See also: protein
anticodon
A sequence of three unpaired nucleotides or bases on an 'adaptor' transfer RNA (tRNA) molecule. The anticodon can bind, following the rules of base pairing, to a triplet of nucleotides (a codon) in a messenger RNA (mRNA) molecule.
See also: codon, complementary, base, base pair

b

bacterial artificial chromosome (BAC)
A length of DNA with sequences that allow it to be maintained and reproduced in bacteria. DNA libraries for genome sequencing can be made using BACs: the DNA of interest is inserted into the BAC, and the package is taken up by bacteria. As the bacteria grow and divide, the BAC and the DNA of interest are also copied. BACs were the cornerstone of sequencing the human genome; at the time, they could hold 100,000 - 200,000 base pairs of DNA (100-200 kb); now they can hold double or triple that amount.
See also: DNA, sequencing
base
The basic unit of our genetic instructions: DNA instructions are encoded in the sequence of its chemical 'letters' or bases. There are four bases: adenine (A), cytosine (C), guanine (G) and thymine (T). Another base, uracil (U), replaces T in RNA.
See also: base pair, cytosine (C), adenine (A), guanine (G), thymine (T), uracil (U), DNA, RNA
base pair (bp)
Two bases on opposite strands of a DNA molecule that are held together by weak chemical bonds. In DNA's double helical structure, a base pair forms one 'rung' of the DNA 'ladder'. The rules of base pairing are that A always binds to T and C always binds to G. The bp is also the measurement unit of DNA; the human genome contains more than 3,000,000,000 bp. 1,000 bp = 1 kilobase pair and 1,000,000 bp make up 1 megabase pair. The word 'pair' is often dropped when referring to the genome sequence of an organism; a bacterium could be said to have 4.09 Mb (megabases) in its genome.
See also: base, DNA, RNA
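The unit arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function name format_length is made up for this example, and the gigabase step (one thousand million bases) is included for completeness.

```python
def format_length(bp: int) -> str:
    """Express a DNA length given in base pairs using the largest convenient unit."""
    if bp >= 1_000_000_000:
        return f"{bp / 1_000_000_000:g} Gb"  # gigabases
    if bp >= 1_000_000:
        return f"{bp / 1_000_000:g} Mb"      # megabases
    if bp >= 1_000:
        return f"{bp / 1_000:g} kb"          # kilobases
    return f"{bp} bp"


print(format_length(4_090_000))      # 4.09 Mb, the bacterial genome mentioned above
print(format_length(3_000_000_000))  # 3 Gb, roughly the human genome
print(format_length(150_000))        # 150 kb
```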
bioinformatics
The science of using computer technology to gather, store, analyse and merge biological data. Expertise in bioinformatics is key to handling the enormous amounts of data produced by the Human Genome Project and other sequencing projects, and serving it out to the researchers who use the data. Advances in computer research are as important as the biology for such large scale projects.

c

catalyst
A chemical that initiates or accelerates a chemical reaction. A biological catalyst is an enzyme (usually a protein) with active regions which bind the components of a reaction.
See also: enzyme
cell
The basic component of living things. Our bodies are made up of about 100 million million cells. In humans and other complex organisms, these watery bodies contain a range of chemicals, individual components called organelles and copies of the organism's genome.
See also: nucleus, cytoplasm
chromosome
A threadlike structure in our cells, made of a long DNA molecule, wrapped around a protein scaffold. Other organisms also have chromosomes: most bacterial chromosomes are loops or circles of DNA. The smallest human chromosome has around 50 million DNA letters and the largest has more than 240 million.
See also: DNA, RNA
clone
An exact copy of something. In sequencing work, this refers to copying molecules, such as DNA, rather than whole cells or organisms. Small pieces of DNA are stitched into carrier molecules (vectors) that can be copied in bacteria or cells. This means that many copies can be made of any DNA segment as the bacteria grow and divide. Each unique combination is called a clone.
codon
A sequence of three DNA bases that specifies one amino acid (building block) of a protein. The DNA code for proteins is written in 'words' of three letters. Four DNA letters provide 64 different combinations of 3 letter words; 61 of the 64 possible combinations code for an amino acid. The 'start codon' signals the beginning of amino acid chain production beginning with methionine (Met). The 'stop codon' signals the end of the assembly of amino acids. There are three different codons which can act as 'stop codons'.
See also: base, amino acid
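The counts in this entry can be checked by listing every three-letter word. The sketch below assumes the standard genetic code, in which the three stop codons are TAA, TAG and TGA when written as DNA; the variable names are illustrative only.

```python
from itertools import product

BASES = "ACGT"
# Assumption: the three stop codons of the standard genetic code, written as DNA.
STOP_CODONS = {"TAA", "TAG", "TGA"}

# Every possible three-letter word over a four-letter alphabet: 4 * 4 * 4 = 64.
all_codons = ["".join(letters) for letters in product(BASES, repeat=3)]
coding_codons = [codon for codon in all_codons if codon not in STOP_CODONS]

print(len(all_codons))         # 64
print(len(coding_codons))      # 61
print("ATG" in coding_codons)  # True: ATG is the start codon and codes for methionine
```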
complementary
The preferential binding of base A to T (or U) and of C to G in DNA or RNA. For example, if there is a GTC on the DNA strand, the complementary RNA or DNA sequence will be CAG. This complementarity maintains the double helical structure of DNA, and makes possible cellular processes such as transcription (when an RNA copy is made of a DNA strand) or replication (where complementary copies of the existing DNA strands are made).
See also: base, DNA, RNA
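As a rough illustration, base pairing can be written as a lookup table. The helper name complement is made up for this sketch. It pairs bases position by position, as in the GTC example above; because the two strands of a double helix run in opposite directions, the complementary strand is often written reversed in practice.

```python
# Pairing partners for each DNA base when the new strand is DNA or RNA.
DNA_COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}
RNA_COMPLEMENT = {"A": "U", "T": "A", "C": "G", "G": "C"}


def complement(dna: str, rna: bool = False) -> str:
    """Return the base-by-base complement of a DNA sequence as DNA or RNA."""
    table = RNA_COMPLEMENT if rna else DNA_COMPLEMENT
    return "".join(table[base] for base in dna.upper())


print(complement("GTC"))            # CAG
print(complement("GAT", rna=True))  # CUA
```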
copy number variation (CNV)
Copy number variation refers to differences in the number of copies of a particular DNA segment between two or more genomes. Variations in copy number have been found within individuals, between individuals and between humans and other mammals. CNVs can result from simple duplications of DNA segments or may involve complex gains or losses of similar sequences at many sites in the genome.
cystic fibrosis (CF)
An inherited disorder that makes it hard to breathe and digest food. CF is caused by mutations in a gene called the cystic fibrosis transmembrane conductance regulator (CFTR); a thick, sticky mucus clogs the lungs and digestive system of people with the faulty gene. In the UK, 1 in 2500 people have CF and although there is no cure, physiotherapy, medication and exercise allow most to live 30-40 years. CF is a recessive disorder: if a person has only one faulty copy of the gene, they are said to be a 'carrier'; if they have two faulty copies, they will have CF. All newborn babies in the UK are screened for CF, and a mouthwash test can show whether you are a carrier. The name cystic fibrosis refers to the characteristic 'fibrosis' (tissue scarring) and cyst formation within the pancreas, first recognized in the 1930s.
See also: CFTR
cystic fibrosis transmembrane conductance regulator (CFTR)
The product of the CFTR gene is a chloride ion channel important in creating sweat, digestive juices, and mucus. Alterations or mutations in the gene can be responsible for causing the inherited disorder cystic fibrosis (CF). In CF, the ion channel is not working properly, leading to a water shortage outside of the cells and a thick, sticky mucus which fails to move bacteria away, often resulting in infection and inflammation of the lungs. Most people have two working copies of the CFTR gene, but only one is needed to prevent cystic fibrosis. CF develops when neither copy works normally. Therefore, CF is considered an autosomal recessive disorder.
See also: cystic fibrosis
cytoplasm
The fluid-filled area of the cell in animals, plants and fungi that surrounds the nucleus and acts as the factory-floor of the cell. Instructions from the DNA and messages from outside the cell are interpreted by molecules in the cytoplasm.
See also: cell, nucleus
cytosine (C)
One of the four types of 'base' that spells out our DNA code. When a base is attached to a phosphate and sugar, it makes up a nucleotide, one of DNA's 'building blocks'. In the double-helical structure of DNA, C (cytosine) pairs with G (guanine). C is also present in RNA.
See also: base, DNA, guanine (G)

d

deoxyribonucleic acid (DNA)
The long molecule that encodes our genetic instructions. Two strands of DNA are twisted together into the double helix. The base-pairs between A and T and between C and G hold the two strands together. In our cells, DNA is usually packaged into chromosomes.
See also: base pair, helix, chromosome
diabetes
Diabetes mellitus is a metabolic disorder in which the pancreas is unable to produce sufficient insulin (a hormone that regulates carbohydrate metabolism) to maintain a healthy level of blood sugar. This results in hyperglycaemia (high blood sugar), which can increase the risk of heart disease, chronic kidney failure, blindness, nerve damage and damage to blood vessels.
See also: insulin
dominant
The 'stronger' version of a pair: this type of gene or characteristic will appear or be 'expressed' even if there is only one copy.
See also: recessive
Down's syndrome
Down's syndrome (Down syndrome or trisomy 21) is a genetic condition caused by the presence of all or part of an extra chromosome 21. It is named after John Langdon Down, a British doctor who described it in 1866. Often Down's syndrome is associated with delays in or impairment of learning and physical development as well as a distinctive facial appearance.
See also: chromosome

e

egg
A female reproductive cell. Also called an ovum. Combines with a male reproductive cell (sperm) during fertilisation.
See also: sperm, fertilisation
enzyme
An enzyme is usually a protein (although some exceptions are made from RNA) that can initiate, facilitate or speed up a reaction. Enzymes have a variety of functions in the body, including digesting food, transmitting nerve impulses and making our muscles work.
See also: polymerase, catalyst
exon
The part of a gene that has the instructions to make a protein.
See also: intron
expression
Process of making molecules from the instructions in DNA. The amount of RNA and protein produced from a gene is referred to as its level of expression.

f

fertilisation
Union of egg and sperm in the creation of a new being of the same species.
See also: egg, sperm
fingerprinting
A method that detects the variation in DNA sequence between individuals. Short, variable sequences in the DNA are reproduced and their sizes are compared. The differences are seen as bands on a gel.
See also: DNA, gel electrophoresis, polymerase chain reaction

g

gel electrophoresis
A technique which uses a thin block of jelly-like material (gel) to act as a sieve to separate molecules. Fragments of DNA or proteins are separated based on their size by applying an electrical field across the gel. The smaller molecules move more quickly than the larger molecules.
See also: fingerprinting
gene
A sequence of DNA that carries the information required to make a molecule, usually a protein. It includes sequences that define the beginning and end of the protein (start and stop codons) and signals that control when the protein should be made. In humans and other complex organisms, genes are split into coding (exons) and non-coding sequences (introns). These split sections allow some genes to make more than one type of protein.
See also: DNA
genetic code
The genetic code is written in 'words' of three letters in DNA (such as ATG, CCG, TAA and so on). This code must be 'transcribed' and then 'translated' by the cell into the building blocks of molecules such as proteins. Other parts of the code are 'switches' to turn genes on or off, or to increase or decrease the amount of protein produced.
See also: codon
genome
A copy of all the DNA instructions used to make an organism. Our genome is approximately 3,000,000,000 base-pairs or genetic letters; most of our cells carry two copies of our genome, packaged into 23 pairs of chromosomes. The genomes of some plants are five times bigger than ours, while some bacteria have only 1,500,000 base-pairs in one chromosome.
See also: base pair, chromosome
guanine (G)
One of the four types of 'base' that spells out our DNA code. When a base is attached to a phosphate and sugar, it makes up a nucleotide, one of DNA's 'building blocks'. In the double-helical structure of DNA, G (guanine) pairs with C (cytosine). G is also present in RNA.
See also: base, DNA, cytosine (C)

h

haemoglobin (Hb)
Protein which carries oxygen in the blood.
See also: sickle cell anaemia
haemophilia
A genetic disorder that reduces the ability of a person's blood to clot and can result in uncontrolled bleeding. Haemophilia usually occurs in males and less often in females as the genes mutated in the two forms of haemophilia are on the X chromosome.
helix
Twisted shape in the form of a spiral, coil or screw. Can turn to the right (clockwise) or left (anticlockwise), referred to as a right-handed or a left-handed helix, respectively. The most common form of DNA is a right-handed double helix.
See also: DNA
homology
In genetics, when two or more DNA sequences are highly similar.
human immunodeficiency virus (HIV)
Human immunodeficiency virus (HIV) is a retrovirus that can lead to acquired immunodeficiency syndrome (AIDS, a condition in humans in which the immune system begins to fail, leading to life-threatening opportunistic infections).
Huntington's Disease
A progressive neurological disorder that causes uncontrollable muscle spasms. Also known as Huntington's chorea. Huntington's disease is caused by an expanded run of 3-letter (CAG) repeats within the huntingtin (HD) gene. This expansion produces an altered form of the Htt protein, mutant Huntingtin (mHtt), which results in neuronal cell death in select areas of the brain.
hypertension
Condition of high blood pressure. Usually refers to arterial hypertension. Hypertension has been associated with a higher risk of heart attack or stroke.

i

insulin
A small protein hormone that regulates glucose levels in the body. Insulin is secreted by the pancreas and stimulates cells to take up glucose from the blood for use in energy production.
See also: diabetes
intron
Part of a gene that is not used to make protein and is cut out from the RNA between transcription and translation.
See also: exon
inversion
An inversion is a chromosome rearrangement in which a segment of a chromosome is reversed end to end. An inversion occurs when a single chromosome breaks and rearranges itself. If there's no missing or extra genetic information, inversions usually don't affect the cell or the organism.

j

junk
'Junk' DNA refers to regions of DNA sequence without known function; the term has usually been applied to DNA that does not contain instructions for making proteins. About 97% of the human genome has previously been designated as 'junk'. As researchers find out more about the function of this DNA, it is now more appropriately referred to as 'noncoding DNA'.

k

kilobase (kb)
One thousand bases, or pairs of bases (1000 b or 1000 bp). In molecular biology, commonly used to describe the length of a DNA/RNA molecule. A Mb (megabase) is one million bases and Gb (gigabase) is one thousand million bases.
See also: megabase, base pair, nucleotide

l

leukaemia
A cancer of the blood or bone marrow. Leukaemia, like other cancers, results from changes to the DNA which activate cancer-causing genes or deactivate cancer-suppressing genes. These changes may occur spontaneously or as a result of exposure to radiation or carcinogenic substances and can be influenced by genetic factors. Studies have linked exposure to petrochemicals, such as benzene, and hair dyes to the development of some forms of leukaemia.
library
In molecular biology, a collection of isolated DNA sequences is called a library (or bank). The DNA sequences are often held in bacteria, which serve as factories to store and make copies of the DNA.
See also: bacterial artificial chromosome (BAC)

m

malaria
Malaria is one of the most common infectious diseases and a global public-health challenge. The disease is caused by protozoan parasites of the genus Plasmodium. Malaria parasites are transmitted by female Anopheles mosquitoes. The parasites multiply within red blood cells, causing symptoms of anaemia (light headedness, shortness of breath, racing heartbeat), as well as other general symptoms such as fever, chills, nausea and flu-like illness and, in severe cases, coma and death.
mapping
Biologists identify specific sequences or 'landmarks' on chromosomes or the DNA they contain. These may be visible landmarks based on pictures of chromosomes that have been stained to show 'stripes', or landmarks based on DNA sequences that can be detected using fluorescent markers. The maps help us to work out where a particular DNA sequence of interest is located.
See also: sequencing
megabase (Mb)
One million bases or base pairs (1,000,000 b or 1,000,000 bp). In molecular biology, commonly used to describe the length of a DNA/RNA molecule. A kb (kilobase) is one thousand bases and a Gb (gigabase) is one thousand million bases.
See also: kilobase, nucleotide
messenger RNA (mRNA)
A single stranded molecule that acts as the template for protein assembly. When a protein is made, the DNA instructions are transcribed into messenger RNA (mRNA). The mRNA then moves out of the cell's nucleus to be read by molecular machines (called ribosomes) that assemble the protein. The product is a chain of protein building blocks (amino acids) that have been assembled according to the order of the bases along the RNA. In animals and plants, the mRNA copy is also 'edited' to remove sequences that won't become part of the protein (see intron).
See also: nucleus, cell, ribosome, amino acid, protein
mutation
A change in DNA sequence. Mutations are relatively common in our DNA, but most have no detectable effect.

n

nucleic acid
A complex molecule found in all living cells and viruses. Nucleic acids, in the form of DNA or RNA, carry the genetic information to make an organism, which is passed between generations.
See also: DNA, RNA
nucleotide (nt)
A nucleotide is a base, such as adenine, guanine, cytosine or thymine, attached to a sugar and one or more phosphate groups. These are the 'building blocks' of DNA and RNA.
See also: base
nucleus
The central subcompartment of a cell that contains genetic information.
See also: cell, cytoplasm

p

pharmacogenetics
Studying an individual's genetic make up in order to predict responses to a drug and guide prescription.
See also: pharmacogenomics
pharmacogenomics
Analysing entire genomes, across groups of individuals, to identify the genetic factors influencing responses to a drug.
See also: pharmacogenetics
polymerase
DNA polymerase is the enzyme that makes new copies of DNA from a template strand and available nucleotides. RNA polymerase is the enzyme that makes an RNA copy of a DNA strand during transcription.
polymerase chain reaction (PCR)
A technique to make large quantities of a specific fragment of DNA. It is often used in DNA testing, profiling and fingerprinting to amplify small amounts of DNA so that scientists have enough to work with. It is also used in some sequencing work.
See also: fingerprinting, sequencing
polypeptide
A protein is made from one or more chains of amino acids, known as polypeptides. Peptide bonds describe the links between amino acids as they are assembled to form a poly (meaning multiple) peptide.
See also: amino acid, protein
protein
A molecule made from chains of amino acid building blocks, the order of which is coded in DNA. The order of amino acids is different for different proteins - the sequence determines properties of the protein. Major proteins include keratin which makes hair and nails, actin and myosin which make muscle, globin which carries oxygen and makes blood red, and antibodies which protect us from disease.
See also: amino acid, ribosome

r

radioactive
A substance emitting ionizing radiation is said to be radioactive. This form of radiation can penetrate cells and create ions in the cell contents. These, in turn, can cause permanent alterations/mutations in DNA.
See also: x-ray, mutation
recessive
The 'weaker' one of a pair. A recessive disorder will only appear if two copies of the mutated gene are inherited. Otherwise, the characteristic encoded by this gene is overwhelmed by a dominant gene.
See also: dominant
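As a worked example, the possible combinations for a child of two parents can be enumerated. This sketch assumes simple single-gene inheritance in which each parent passes on one of their two copies with equal chance; the letters 'A' (working copy) and 'a' (mutated copy) and the function name are illustrative only.

```python
from collections import Counter
from itertools import product


def offspring_genotypes(parent1: str, parent2: str) -> dict:
    """Return the chance of each genotype when each parent passes on one copy."""
    # Sorting puts the capital letter first, so 'aA' and 'Aa' are counted together.
    counts = Counter("".join(sorted(a + b)) for a, b in product(parent1, parent2))
    total = sum(counts.values())
    return {genotype: n / total for genotype, n in counts.items()}


# Two carriers, each with one working and one mutated copy.
print(offspring_genotypes("Aa", "Aa"))  # {'AA': 0.25, 'Aa': 0.5, 'aa': 0.25}
```

Under these assumptions, a child of two carriers has a 1 in 4 chance of inheriting two mutated copies and so showing the recessive disorder.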
ribonucleic acid (RNA)
Ribonucleic acid (RNA) is composed of nucleotides, as is DNA, but it is usually a single strand rather than a double one. In transcription, a transportable copy of the code for a particular molecule or protein is produced, called messenger RNA. This means the code can move from the nucleus to other regions of the cell where the protein assembly occurs. RNA exists in different forms, including messenger RNA (mRNA), transfer RNA (tRNA) and ribosomal RNA (rRNA).
See also: base, codon, messenger RNA (mRNA), transfer RNA (tRNA)
ribosome
Protein-making factory of the cell. Made of protein and RNA, ribosomes coordinate the reading of the mRNA template. They add new amino acids to the translated protein chain according to the nucleotide sequence of the mRNA.
See also: amino acid, protein
rough endoplasmic reticulum (RER)
Subcellular structure in the cytoplasm of a cell where proteins destined for export are assembled.
See also: cell

s

sequencing
The method of determining the order of letters or bases in DNA, or the order of amino acids in a protein molecule.
See also: base, codon, DNA, RNA
sickle cell anaemia
A blood disorder in which red blood cells can become sickle-shaped under certain conditions. This is due to a mutation in the gene for haemoglobin, the protein which is responsible for binding oxygen. The condition is a recessive genetic disorder, meaning one copy of the mutated gene must be inherited from each parent for the individual to have sickle cell anaemia.
See also: haemoglobin
sperm
Male reproductive cell. Combines with a female reproductive cell (ovum) during fertilisation.
See also: fertilisation, egg

t

thymine (T)
One of the four types of 'base' that spells out our DNA code. When a base is attached to a phosphate and sugar, it makes up a nucleotide, one of DNA's 'building blocks'. In the double-helical structure of DNA, T (thymine) pairs with A (adenine).
See also: base, DNA, adenine (A)
transcription
The process of making a complementary (RNA) copy of a region of DNA. As in writing, transcription is essentially the copying of a string of text; in the cell, the information is copied into a form that can act in various ways. Some RNA copies -- of sections of DNA that encode proteins -- move out of the cell's nucleus to be translated. Other RNA copies -- non-coding RNAs -- can act directly on other parts of the DNA or become part of the protein-translation machinery.
See also: complementary, RNA, DNA
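A minimal sketch of the copying step, assuming the DNA strand being read (the template) is given base by base and that direction is ignored for simplicity. The function name transcribe and the example sequence are made up for illustration.

```python
# Each DNA base on the template strand is matched by its RNA partner.
TEMPLATE_TO_RNA = {"A": "U", "T": "A", "C": "G", "G": "C"}


def transcribe(template_dna: str) -> str:
    """Build the complementary RNA copy of a DNA template strand."""
    return "".join(TEMPLATE_TO_RNA[base] for base in template_dna.upper())


print(transcribe("TACAAA"))  # AUGUUU
```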
transfer RNA (tRNA)
Transfer RNA is a small RNA chain (73-93 nucleotides). It acts as a kind of 'adaptor molecule' during translation. Three RNA bases (the anticodon) of tRNA bind to a complementary sequence on the messenger RNA template to deliver a new amino acid to add to the growing polypeptide chain. Each type of tRNA molecule can be attached to only one type of amino acid.
See also: RNA, translation
translation
The process of turning the sequence of bases in a messenger RNA (mRNA) molecule into a protein. In a cell, translation occurs at structures called ribosomes in the cell's cytoplasm. There, the mRNA is read by 'adaptor molecules' called transfer RNAs (tRNAs) and amino acids are assembled according to the sequence of bases in mRNA.
See also: base, RNA, DNA, ribosome, messenger RNA (mRNA), transfer RNA (tRNA)
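The reading of an mRNA three bases at a time can be sketched as follows. The codon table here is deliberately partial (a handful of entries from the standard genetic code, enough for the example), and the function name and example sequence are illustrative assumptions, not a full implementation.

```python
# Partial codon table (standard genetic code); a real table has 64 entries.
CODON_TABLE = {
    "AUG": "Met", "UUU": "Phe", "UUC": "Phe", "GGC": "Gly",
    "GCU": "Ala", "AAA": "Lys", "UGG": "Trp",
    "UAA": "Stop", "UAG": "Stop", "UGA": "Stop",
}


def translate(mrna: str) -> list:
    """Read an mRNA from its first AUG, one codon at a time, until a stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return []  # no start codon, so nothing is made
    chain = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]  # KeyError if codon is not in the partial table
        if amino_acid == "Stop":
            break
        chain.append(amino_acid)
    return chain


print(translate("AUGUUUGGCAAAUGGUAA"))  # ['Met', 'Phe', 'Gly', 'Lys', 'Trp']
```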
translocation
A chromosome translocation is a mutation caused by rearrangement of parts between different chromosomes.
See also: mutation

u

uracil (U)
One of the four types of 'base' used to encode information in RNA. When a base is attached to a phosphate and sugar, it makes up a nucleotide, one of RNA's (and DNA's) 'building blocks'. RNA uses the three standard bases of A, C, G, but substitutes U for T. When RNA is bound to DNA, U (uracil) pairs with A (adenine).
See also: base, RNA, thymine (T)

v

virus
Infectious particle containing DNA or RNA and sometimes a protein coat. Generally not considered a living organism, as it must infect another organism's living cells in order to replicate.

x

x-ray
X-rays are a form of ionizing radiation primarily used for radiography and crystallography. They are a form of electromagnetic radiation with a wavelength in the range of 10 to 0.01 nanometers, corresponding to frequencies in the range 30 to 30,000 PHz (1 PHz = 10^15 hertz).
See also: radioactive
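The link between the wavelengths and frequencies quoted above is frequency = speed of light / wavelength. A short check, with a made-up function name and the speed of light in a vacuum as the only constant:

```python
SPEED_OF_LIGHT = 299_792_458  # metres per second in a vacuum


def frequency_phz(wavelength_nm: float) -> float:
    """Convert a wavelength in nanometres to a frequency in petahertz (10**15 Hz)."""
    wavelength_m = wavelength_nm * 1e-9
    return SPEED_OF_LIGHT / wavelength_m / 1e15


print(round(frequency_phz(10)))    # 30
print(round(frequency_phz(0.01)))  # 29979, that is about 30,000
```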