Association for Biology Laboratory Education

An Introduction to Bioinformatics

Robert J. Kosinski

Supplemental Materials

The table below contains text files of 17 DNA sequences (the DNA files) and 17 corresponding amino acid sequences (the Protein files) used in a bioinformatics exercise at Clemson University, South Carolina. The student must identify these sequences using NCBI BLAST. For each of these 17 genes, there are also sequences of genomic DNA (the Genomic files) that start 100,000 bp before the start of the gene and end 100,000 bp after its end. These allow the student to use BLAST to search for transcribed DNA in the neighborhood of each gene. These files are Microsoft Word files because they must be divided into separate, numbered pages.
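As a rough illustration of how the span of a Genomic file relates to its gene, the sketch below computes a window that starts 100,000 bp before a gene and ends 100,000 bp after it. The function name `genomic_window`, the example coordinates, and the clamping at the chromosome ends are assumptions made for this example; they do not describe how the files were actually prepared.

```python
FLANK_BP = 100_000  # flank length stated in the text above


def genomic_window(gene_start, gene_end, flank=FLANK_BP, chrom_length=None):
    """Return (start, end) of a gene plus its flanks, 1-based inclusive.

    Clamping to the chromosome ends is an assumption for this sketch;
    the source does not say how genes near a chromosome end were handled.
    """
    if gene_start > gene_end:
        raise ValueError("gene_start must not exceed gene_end")
    start = max(1, gene_start - flank)
    end = gene_end + flank
    if chrom_length is not None:
        end = min(end, chrom_length)
    return start, end


# Hypothetical gene coordinates, for illustration only.
window = genomic_window(250_000, 260_000)
print(window)
```

A gene occupying positions 250,000 to 260,000 would therefore sit inside a window running from 150,000 to 360,000.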

The 17 “Bioterrorism” files each contain 9 nonhuman DNA sequences isolated from patients after a hypothetical suspected bioterror attack. The student must identify each DNA’s source organism and then consult the CDC Bioterrorism Web site, which links to a list of bioterror organisms and viruses. If any of the DNA comes from one of these organisms, we will conclude that a bioterror attack has probably occurred.

Finally, there are 17 “Phylogeny” files that contain the amino acid sequences of homologous proteins from a range of organisms (for example, humans, chimps, rhesus monkeys, mice, chickens, clawed frogs, and Drosophila). These allow the student to use Phylogeny.fr to construct a phylogram and determine whether degree of relatedness to humans is a good predictor of the number of amino acid differences between humans and each organism. The proteins used here are closely related to the ones used in the Protein files, but are not always identical to them.
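The "number of amino acid differences" can be made concrete with a minimal sketch that counts mismatched positions between two sequences that have already been aligned to the same length. This is not how Phylogeny.fr works internally; the function name `count_differences`, the choice to skip gap positions, and the toy sequences are all assumptions invented for illustration.

```python
def count_differences(seq_a, seq_b, gap="-"):
    """Count positions where two pre-aligned protein sequences differ.

    Positions where either sequence has a gap are skipped; that choice
    is an assumption of this sketch, not a rule taken from the exercise.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for a, b in zip(seq_a, seq_b)
        if a != b and a != gap and b != gap
    )


# Toy aligned fragments (invented, not taken from the data files).
human = "MKTAYIAK"
other = "MRTAYLAK"
print(count_differences(human, other))
```

Repeating this count for each organism against the human sequence gives the numbers that the student compares with the branching order of the phylogram.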

Click on a file to download it.

The Data Files

 

DNA A Genomic A Protein A Bioterrorism A Phylogeny A
DNA B Genomic B Protein B Bioterrorism B Phylogeny B
DNA C Genomic C Protein C Bioterrorism C Phylogeny C
DNA D Genomic D Protein D Bioterrorism D Phylogeny D
DNA E Genomic E Protein E Bioterrorism E Phylogeny E
DNA F Genomic F Protein F Bioterrorism F Phylogeny F
DNA G Genomic G Protein G Bioterrorism G Phylogeny G
DNA H Genomic H Protein H Bioterrorism H Phylogeny H
DNA I Genomic I Protein I Bioterrorism I Phylogeny I
DNA J Genomic J Protein J Bioterrorism J Phylogeny J
DNA K Genomic K Protein K Bioterrorism K Phylogeny K
DNA L Genomic L Protein L Bioterrorism L Phylogeny L
DNA M Genomic M Protein M Bioterrorism M Phylogeny M
DNA N Genomic N Protein N Bioterrorism N Phylogeny N
DNA O Genomic O Protein O Bioterrorism O Phylogeny O
DNA P Genomic P Protein P Bioterrorism P Phylogeny P
DNA Q Genomic Q Protein Q Bioterrorism Q Phylogeny Q