dc.contributor.advisor: McRoy, Susan
dc.creator: Auh, Yong
dc.date.accessioned: 2025-10-08T18:06:46Z
dc.date.issued: 2025-08
dc.identifier.uri: http://digital.library.wisc.edu/1793/95992
dc.description.abstract: To identify withdrawal symptoms in patient-generated online texts, the contents of the texts need to be understood in terms of the symptoms and medication discontinuation events that patients experience. Since withdrawal symptoms occur after discontinuation or reduction of a drug, the temporal sequencing of drug discontinuation event descriptions and medical symptom descriptions in the texts needs to be recognized. The applicability of conventional statistical classification algorithms is analyzed, and the limitations of statistical methods for natural language texts are discussed. Misuse of statistical techniques by treating natural language tokens or n-grams as features is proved to be one of the reasons for the lack of generalizability and understanding. The method of fine-tuning BERT and its variants (SciBERT and PubMedBERT) for classification is evaluated, and two limitations are pointed out: a lack of generalizability across different types of texts, and a focus on prediction based on the distribution of sequences without structure. Paradoxically, the method's ability to process scrambled and alphabetized texts as if they were valid sequences of text reveals its fundamental problem. Although imperfect and incomplete, statistical techniques are still valuable for dependency parsing and part-of-speech assignment. Using spaCy's dependency parser combined with part-of-speech assignments, improvements are made to spaCy's output to structure input sentences more meaningfully. An application that displays sentence structures in a flexible fashion was created to support better evaluation of those structures. A complex structural pattern specification language (SPSL) was introduced, which allows the specification of hierarchical, non-contiguous patterns, and a matching algorithm for complex patterns was developed.
Databases of medical symptoms and medication-related events relevant to the domain of antidepressant drugs are set up, and identification of withdrawal symptoms is performed on the revised PsyTAR dataset. The theoretical background of the current approach is also discussed. The importance of multidialectalism and multilingualism is emphasized. Proper modeling of human linguistic competence requires modular, hierarchical modeling of linguistic knowledge together with a database of community-specific terminology and beliefs, and an outline of such a system is presented.
dc.subject: Computer science
dc.subject: Mathematics
dc.subject: hybrid structural representation
dc.subject: social media text processing
dc.subject: withdrawal symptoms
dc.title: COMPUTATIONAL ANALYSIS OF SOCIAL MEDIA TEXTS WITH A CASE STUDY OF WITHDRAWAL SYMPTOMS IDENTIFICATION
dc.type: dissertation
thesis.degree.discipline: Computer Science
thesis.degree.name: Doctor of Philosophy
thesis.degree.grantor: University of Wisconsin-Milwaukee
dc.contributor.committeemember: Zhao, Tian
dc.contributor.committeemember: Kate, Rohit
dc.contributor.committeemember: Gervini, Daniel
dc.contributor.committeemember: He, Lu
dc.description.embargo: 2027-06-23
dc.embargo.liftdate: 2027-06-23

