Title
Enron Dataset Research: E-mail Relevance Classification
Document Type
Technical Report
Abstract
This paper discusses a probabilistic approach to the problem of searching large amounts of data for case-relevant documents. Using a collection of e-mail communications from Enron, an actual corporation, we train a Bayes-based text classifier to distinguish e-mails known to be case-relevant from those known to be case-irrelevant.
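To make the idea of a Bayes-based text classifier concrete, the following is a minimal sketch of a multinomial naive Bayes classifier with add-one (Laplace) smoothing. It is an illustration under stated assumptions, not the report's implementation: the abstract does not specify the exact Bayes variant, tokenization, or smoothing used, and the class name, function names, label strings, and the four toy training messages are all hypothetical and are not drawn from the Enron dataset.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesClassifier:
    """Multinomial naive Bayes with add-one smoothing (illustrative only)."""

    def __init__(self):
        self.word_counts = {}        # label -> Counter of token frequencies
        self.doc_counts = Counter()  # label -> number of training documents
        self.vocab = set()           # every token seen during training

    def train(self, documents):
        """documents: iterable of (text, label) pairs."""
        for text, label in documents:
            self.doc_counts[label] += 1
            counts = self.word_counts.setdefault(label, Counter())
            for token in tokenize(text):
                counts[token] += 1
                self.vocab.add(token)

    def log_scores(self, text):
        """Return log P(label) + sum of log P(token | label) for each label."""
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocab)
        tokens = tokenize(text)
        scores = {}
        for label, n_docs in self.doc_counts.items():
            counts = self.word_counts[label]
            total_tokens = sum(counts.values())
            score = math.log(n_docs / total_docs)
            for token in tokens:
                # Add-one smoothing keeps unseen tokens from zeroing the product.
                score += math.log((counts[token] + 1) / (total_tokens + vocab_size))
            scores[label] = score
        return scores

    def classify(self, text):
        """Return the label with the highest posterior log score."""
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Hypothetical toy messages, invented for illustration only.
TOY_TRAINING = [
    ("special purpose entity transaction off balance sheet", "relevant"),
    ("hedge the partnership losses before the quarter closes", "relevant"),
    ("lunch on friday at the usual place", "irrelevant"),
    ("fantasy football picks due friday", "irrelevant"),
]

classifier = NaiveBayesClassifier()
classifier.train(TOY_TRAINING)
```

With this toy model, classifier.classify("balance sheet partnership transaction") returns "relevant" and classifier.classify("lunch friday") returns "irrelevant". Scores are summed in log space so that long messages do not underflow to zero.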
Recommended Citation
VanBuren, Victoria; Villarreal, David; McMillen, Thomas A.; and Minnicks, Andrew L., "Enron Dataset Research: E-mail Relevance Classification" (2009). Technical Reports-Computer Science. Paper 9.
http://ecommons.txstate.edu/cscitrep/9
Comments
Report Number TXSTATE-CS-TR-2009-12
Research advisor: Professor Wilbon Davis