ASM events
This conference is managed by the American Society for Microbiology

Reading Reflections

How would you describe your “research problem(s)” to the Research Scholars group? 

The research problem I wish to address is that, while the volume and formats of data that biologists use to answer questions continue to grow, biology students are not learning the quantitative skills needed to analyze these data. For example, the cost of DNA sequencing has dropped low enough to make whole genome sequencing of individuals cheaper than some medical procedures and technologies already covered by most US insurance companies (e.g., CT scans, MRI). The resulting explosion of human DNA sequence, RNA expression, and protein structure data will in turn revolutionize how basic and applied biomedical research advances in the next 20 years. Yet analysis of most Big Data sets is performed by computer scientists and mathematicians with little or no understanding of the biological problems that motivate the collection of these data or of the hypotheses that drive the experiments. My goal is to address this problem by exposing first-year Biology undergraduates to Big Data sets and (in collaboration with two mathematicians) teaching them how to perform relatively simple modeling and analysis of these data and how to generate biologically meaningful hypotheses. Specifically, students will search for potential genetic markers for autism in gene expression data collected from neural stem cells stimulated to undergo differentiation.

What is especially troubling to me is that nearly all students who enter the Biology and Biochemistry/Biophysics programs at my school (approximately 80% of our majors) receive no computer science training at all and stop taking math after calculus and statistics. That background does not prepare them to analyze the kinds of large biological datasets we expose them to in our subject courses. Our required laboratories involve rudimentary statistical analysis, but no modeling of large datasets. By providing simple tools for linking data analysis techniques to large biological datasets, I hope to improve the students' conceptual understanding of how modern biology problems are defined and answered.

What theme(s), based on your readings, resonate with your “problem” and/or your proposed approach to address your problem?

One important theme is that our students are not equipped to answer the questions we expect them to ask. An important assumption about scientific inquiry is that it is iterative, based on a cycle of question, hypothesis, data gathering, and evaluation. At this stage our ability to generate data has outpaced our ability to analyze or understand it, which in turn slows the cycle of scientific inquiry. We cannot slow the volume of data we generate, so we must expand our ability to analyze it, and that means we must push our curriculum beyond the topics we have traditionally covered over the past thirty years. And while we cannot all be bioinformatics specialists, I feel we must understand enough quantitative analysis to make sense of large datasets.

Because this requires expanding my own technical expertise, it also demands careful attention to evidence-based learning. I simply cannot rely on intuition to determine whether my students are learning these techniques. I need direct measures of how well my students learn to complete the scientific inquiry cycle, and that demands a rigor I have not used up to this point in my teaching.

Based on Pat Hutchings' article, what taxonomy would you use to describe your research question and why?

This appears to be a "What works?" problem. I want to expand students' understanding of how scientific inquiry works, and I am asking whether teaching data analysis skills will improve their understanding of Big Data-driven biological problems.

Do you have any questions/concerns/comments that have evolved from your reading?

Two weeks after first confronting this assignment, I still have a lot of questions. For example, how do I define "learning" in this context? How many lines of evidence do I need, and what types? How do I collect this evidence? What are my controls? How do I account for differences in my teaching strategies from year to year? Can I separate out the impact of students' awareness of being studied? Where do I start?