“First, rinse your mouth.” That was the starting procedure earlier this month for a lab about extracting and analyzing DNA from samples—provided by the students themselves—in an interdepartmental class, Biology/Computer Science 309: Computational Biology.
The class, taught by Jeff Thompson, associate professor of biology, and Jessen Havill, professor of mathematics and computer science, held the interest of students in both disciplines. And as a bonus, a special guest was visiting that day—Dan Gusfield, a professor of computer science at the University of California and an expert in the intersection of these academic spheres, was on campus to lecture to a larger audience about the emerging fields of bioinformatics and computational biology. He was invited to teach the computer science section of the day's lab.
The principle behind this lab is pretty simple.
Each strand of DNA contains a sequence of four nucleotide bases—cytosine (C), guanine (G), adenine (A) and thymine (T). The sequence of these bases can change spontaneously over time and serves as an easy way of measuring genetic change. Substitution mutations, or the replacement of one DNA base with another, are the most common type of change observed. So a sequence of CGAT might in time spontaneously change to CCAT. Comparing DNA sequences can determine how closely or distantly individuals are related to one another. (The amount of difference between the two related individuals is known as divergence.)
And if you're wondering about the confidentiality of the DNA information used in this lab exercise—don't worry, they've got that covered. The samples were identified by number only—a number that students themselves randomly chose from a list. No unknown members of the family tree were discovered through this experiment.
The tables in room 428 of Samson Talbot Hall of Biological Science were littered with cylindrical test tubes, glass vials and other intriguing objects, as Thompson led the session on the methodology of extracting mitochondrial DNA. Swabbed from the inside of the students' cheeks, the samples were “lysed” (which means breaking open the cells) in a solution before being combined with DNA replicating factors and, in a final flourish, were placed in a thermocycler to amplify the DNA molecules.
When the lab began, the biology students were clearly in the lead, showing their computer science fellows how to perform the various procedures. But when the class exited the lab and entered a lecture hall to learn about writing a computer program to analyze the data, the computer science majors unquestionably came into their own. The talk turned to alignment methods to calculate the amount of similarity between the DNA sequences. The students were expected to create an algorithm to execute the analysis of many sequences at the same time.
The multitudes of data that DNA carries provide a valuable example of why these two separate academic fields benefit in big ways from working together. Computer scientists need to understand the biology underlying the data and the biological questions they are trying to answer in order to design effective algorithms. And understanding how computer programs can organize and analyze data allows biologists to imagine new experiments and correctly draw conclusions from computational results. Interdisciplinary collaboration is at the heart of the process, and is the centerpiece of this new course.