Detection and Exploration of Individual Semantic Trajectories Using Social Media Data
Individual travel trajectories collected from social media platforms (i.e., digital footprints) are often aggregated with methods such as density-based spatial clustering of applications with noise (DBSCAN) and varying DBSCAN (VDBSCAN) in order to identify travel activities (e.g., eating, working, entertainment). However, purely spatial clusters cannot distinguish the representative travel activities of an individual. This thesis first develops a multi-scale spatial clustering method that aggregates the digital footprints of a group of users into collective spatial hot-spots (i.e., activity zones) and identifies the activity type (e.g., dwelling, service, transportation, or office) of each collective zone by integrating Volunteered Geographic Information (VGI) data, specifically OpenStreetMap (OSM) datasets. Each digital footprint of an individual, represented as a spatiotemporal (ST) point, is then associated with a collective activity zone that either includes or overlaps the buffer zone of the ST point; the buffer is generated by using the point as the centroid and a predefined threshold as the radius. Given an individual’s ST points with semantics (i.e., activity type information) derived from the associated collective activity zones, a semantic activity clustering method is then developed to detect the individual’s representative daily activity clusters. Next, the temporal information of each daily activity cluster, indicating the time period during which the individual frequently visits the zone covered by the cluster, is detected, and the individual’s representative daily semantic travel trajectory paths (i.e., the semantic travel trajectory, defined as a chronological sequence of travel activities) are constructed between every two consecutive activity clusters. Finally, a geovisual analytical web portal is developed to display individual representative daily travel trajectories and the associated activity zone information, supporting the exploration of individual and collective semantic travel patterns.
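The buffer-based association of an ST point with a collective activity zone can be illustrated with a minimal sketch. It is not the thesis implementation: it assumes planar coordinates in meters, approximates each activity zone by an axis-aligned bounding box, and resolves ties by picking the nearest zone. The names `Zone`, `distance_to_zone`, and `attach_zone` are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass
from math import hypot
from typing import Optional, Sequence


@dataclass(frozen=True)
class Zone:
    """Hypothetical activity zone, approximated by a bounding box (meters)."""
    activity: str
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def distance_to_zone(x: float, y: float, zone: Zone) -> float:
    """Distance from (x, y) to the zone's box; 0.0 if the point lies inside."""
    # Clamp the point onto the box to find the closest point of the zone.
    nearest_x = min(max(x, zone.xmin), zone.xmax)
    nearest_y = min(max(y, zone.ymin), zone.ymax)
    return hypot(x - nearest_x, y - nearest_y)


def attach_zone(x: float, y: float, zones: Sequence[Zone],
                radius: float) -> Optional[Zone]:
    """Return the zone that includes or overlaps the circular buffer of
    the given radius centered on (x, y), or None if no zone qualifies.

    Assumption: when several zones qualify, the nearest one is chosen.
    """
    candidates = [(distance_to_zone(x, y, z), z) for z in zones]
    candidates = [(d, z) for d, z in candidates if d <= radius]
    if not candidates:
        return None
    return min(candidates, key=lambda pair: pair[0])[1]


zones = [
    Zone("dwelling", 0.0, 0.0, 100.0, 100.0),
    Zone("office", 200.0, 0.0, 300.0, 100.0),
]
inside = attach_zone(50.0, 50.0, zones, radius=25.0)      # point lies inside a zone
near = attach_zone(190.0, 50.0, zones, radius=25.0)       # buffer overlaps a zone
isolated = attach_zone(150.0, 50.0, zones, radius=25.0)   # buffer reaches no zone
```

In this sketch a point inside a zone has distance zero and is always attached, a point whose buffer reaches a zone boundary is attached to the nearest such zone, and a point whose buffer reaches no zone receives no activity type.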
Experiments with historical geo-tagged tweets collected within Madison, Wisconsin, for 49 eligible users reveal that: 1) the proposed multi-scale spatial clustering method can detect most significant activity zones and identify their zone types accurately; 2) the semantic activity clustering method based on the derived activity zones can aggregate individual travel trajectories into activity clusters more efficiently than both DBSCAN and VDBSCAN; 3) individual semantic travel patterns can be explored and compared through geovisual analytics, and collective semantic travel patterns can thus be revealed for a group of people with similar individual travel patterns.
Volunteered Geographic Information (VGI)
Semantic activity clustering
Semantic travel trajectory
Multi-scale spatial clustering
Geovisual analytics / Geovisualization