Ticket Info: Open to Denison students, faculty and staff. Masks are required.
Open to Public: No
Questions: Computer Science Academic Administrative Assistant
The Gordon Lecture Series welcomes Duncan A. Buell, Chair Emeritus, NCR Chair in Computer Science and Engineering at the University of South Carolina, presenting “Text Analysis: Torturing the data until they confess?”
“The purpose of computing is insight, not numbers.” (Richard Hamming)
Analysis of text has a long history in computing, but interest in text analysis has grown substantially in recent years because the corpora of text available for analysis have grown. In addition to formal corpora like JSTOR, the text in web pages and social media can be analyzed to uncover latent characteristics.
As part of a project to improve instruction in First Year English at the University of South Carolina, we collected and curated just under 20,000 essays, including draft and final versions, over a period of four years. We have compared draft and final versions to see how (or whether) students revise their drafts. We have also looked at linguistic characteristics in an attempt to situate the students’ writing relative to academic papers, magazine articles, and conversation. We are in no way doing automatic grading of essays. Rather, our hope has been that student writing would improve if one could explicitly show students where their writing falls short of the goals of the first-year course.
We will present some results of our analysis of student essays. We will go further and discuss some of the larger issues in the analysis of text. A corpus needs to be created in formats that can be analyzed by standard packages. Results produced by different packages must be compared. Most importantly, researchers need to be skeptical of computational results until they are vetted and can legitimately be argued to be causal and not merely phenomena. One can easily torture the data to produce computational results. What is needed is to examine at least one level, if not more, below the naïve results in order to obtain insight.
Buell’s Ph.D. is in mathematics from the University of Illinois at Chicago (1976). He was department chair at U of SC from 2000 to 2009 and interim dean in 2005-2006. He has done research in document retrieval, computational number theory, parallel computing, and the analysis of election data and simulation of wait times in elections. He has also been working in digital humanities as one of the emerging “marketplace” applications for computing. He has been engaged with First Year English at U of SC on the analysis of freshman English essays, searching for an understanding of student writing in an effort to improve pedagogy for first-year English instruction. As part of this work, he and his collaborators have curated just under 20,000 student essays from First Year English at U of SC. He has team-taught multiple times with Dr. Heidi Rae Cooley on the presentation of unacknowledged history on mobile devices, and he and Dr. Cooley have been working on ways to go beyond text to fully enable the use of visual media in mobile applications that present humanities content, especially content that might normally remain unacknowledged by institutional authority.