    Text Mining of Patient Demographics and Diagnoses from Psychiatric Assessments

    Date
    2014-12-01
    Author
    Klosterman, Eric James
    Department
    Health Care Informatics
    Advisor(s)
    Rashmi Prasad
    Abstract
    Automatic extraction of patient demographics and psychiatric diagnoses from clinical notes allows patient data to be collected on a large scale. These data could serve a variety of research purposes, including outcomes studies and the development of clinical trials. However, existing research has not discussed the automatic extraction of demographics and psychiatric diagnoses in detail. The aim of this study is to apply text mining to extract patient demographics (age, gender, marital status, and education level) and admission diagnoses from the psychiatric assessments at a mental health hospital, and to assign a code to each category. Gender is coded as Male or Female; marital status as Single, Married, Divorced, or Widowed; and education level on a scale from Some High School through Graduate Degree (PhD, JD, MD, or equivalent). Classifications for diagnoses are based on the DSM-IV. For each category, a rule-based approach was developed using keyword-based regular expressions as well as constituency trees and typed dependencies. We employ a two-step approach that first maximizes recall through keyword-based patterns and then, where necessary, maximizes precision by using NLP-based rules to resolve ambiguity. To develop and evaluate our method, we annotated a corpus of 200 assessments, using a portion of the corpus for developing the method and the rest as a test set. F-score was satisfactory for each category (Age: 0.997; Gender: 0.989; Primary Diagnosis: 0.983; Marital Status: 0.875; Education Level: 0.851), as was coding accuracy (Age: 1.0; Gender: 0.989; Primary Diagnosis: 0.922; Marital Status: 0.889; Education Level: 0.778). These results indicate that a rule-based approach could be considered for extracting these types of information in the psychiatric field. At the same time, performance dropped from the development set to the test set, in part because the rules developed need to be more general.
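    The two-step idea described in the abstract can be illustrated with a minimal sketch for the marital status category. This is not the thesis's rule set: the keyword list, the helper names (extract_marital_status, to_code), and the filtering heuristics are all hypothetical. In particular, step 2 below uses simple context checks as a stand-in for the constituency-tree and typed-dependency rules, which the abstract does not spell out.

```python
import re
from typing import Optional

# Step 1 (high recall): hypothetical keyword pattern for marital status.
# "never married" is listed before "married" so the longer phrase wins.
CANDIDATE = re.compile(
    r"\b(?:never married|single|married|divorced|widow(?:ed|er)?)\b",
    re.IGNORECASE,
)

# Step 2 (precision): crude stand-ins for parser-based rules.
# A relative mentioned earlier in the sentence suggests the status
# describes someone other than the patient.
RELATIVE = re.compile(
    r"\b(?:mother|father|parents?|sister|brother|son|daughter)\b",
    re.IGNORECASE,
)
# "single" followed by one of these nouns is not a marital status.
NON_MARITAL_SINGLE = re.compile(r"\s+(?:episode|dose|occasion)\b", re.IGNORECASE)


def to_code(keyword: str) -> str:
    """Map a matched keyword to one of the four codes named in the abstract."""
    kw = keyword.lower()
    if kw in ("single", "never married"):
        return "Single"
    if kw == "married":
        return "Married"
    if kw == "divorced":
        return "Divorced"
    return "Widowed"


def extract_marital_status(text: str) -> Optional[str]:
    """Return the first candidate that survives the step-2 filters, else None."""
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        for match in CANDIDATE.finditer(sentence):
            before = sentence[: match.start()]
            after = sentence[match.end():]
            if RELATIVE.search(before):
                continue  # status likely belongs to a family member
            if match.group().lower() == "single" and NON_MARITAL_SINGLE.match(after):
                continue  # e.g. "single episode" in a diagnosis
            return to_code(match.group())
    return None


if __name__ == "__main__":
    note = "Major depressive disorder, single episode. Patient is married."
    print(extract_marital_status(note))
```

    Running the keyword pattern alone would flag "single" in the diagnosis string; the context checks discard that candidate, which is the kind of ambiguity the second step is meant to handle. A parser-based rule would make the same decision from syntactic structure rather than from fixed word lists.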
    Subject
    Information Extraction
    Patient Demographics
    Patient Psychiatric Diagnoses
    Psychology
    Text Mining
    Permanent Link
    http://digital.library.wisc.edu/1793/93996
    Type
    thesis
    Part of
    • UW Milwaukee Electronic Theses and Dissertations
