Article ID: 6576315
Journal: Travel Behaviour and Society
Published Year: 2018
Pages: 10 Pages
File Type: PDF
Abstract
Machine learning methods are widely applied to Global Positioning System (GPS) trajectory data to identify trip purpose and travel mode. The selection of data for the training and test sets is important in these methods. However, the feasibility and effects of drawing these data from different periods of the year are still unknown. This matters because GPS-based collection reduces the burden on participants to such an extent that a survey can span several seasons, each of which may have distinct features. To bridge this gap, this paper applies Aslan & Zech's test (AZ-test) and then Random Forests (RF) to investigate how selecting training and test data from different seasons influences the results. A dataset collected in Hakodate, Japan, a city with distinct seasons, is used for the empirical analysis. The AZ-test results suggest that the explanatory variables of two data sets from distinct seasons follow different distributions. They further indicate that a data set drawn from two seasons and a data set drawn from a single season also follow different distributions. However, the test yields contradictory results in some cases. RF is therefore used to examine in more detail how the accuracy varies. RF confirms the AZ-test findings in most cases. In addition, the RF results show that including GIS features as explanatory variables has a positive effect on identification accuracy, whereas including weather features has a negative effect.
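The idea of testing whether feature samples from two seasons share a distribution can be illustrated with a small sketch. This is not the AZ-test as used in the paper: it substitutes a generic energy-distance statistic (Euclidean distances) with a permutation p-value, and the "summer" and "winter" samples, their two features, and all numeric values are synthetic assumptions made only for illustration.

```python
import math
import random


def energy_distance(xs, ys):
    """Energy distance between two samples of equal-length feature vectors.

    Computed as 2*E|X-Y| - E|X-X'| - E|Y-Y'| with Euclidean distances.
    """
    def mean_dist(a, b):
        return sum(math.dist(p, q) for p in a for q in b) / (len(a) * len(b))

    return 2 * mean_dist(xs, ys) - mean_dist(xs, xs) - mean_dist(ys, ys)


def permutation_p_value(xs, ys, n_perm=200, seed=0):
    """Share of random relabelings whose statistic is at least the observed one."""
    rng = random.Random(seed)
    observed = energy_distance(xs, ys)
    pooled = list(xs) + list(ys)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if energy_distance(pooled[:len(xs)], pooled[len(xs):]) >= observed:
            hits += 1
    # Add-one correction keeps the p-value strictly positive.
    return (hits + 1) / (n_perm + 1)


# Hypothetical per-trip features (e.g. mean speed, trip duration); values are made up.
data_rng = random.Random(42)
summer = [(data_rng.gauss(5.0, 1.0), data_rng.gauss(20.0, 3.0)) for _ in range(40)]
winter = [(data_rng.gauss(3.5, 1.0), data_rng.gauss(26.0, 3.0)) for _ in range(40)]

p_value = permutation_p_value(summer, winter)
print("energy distance:", energy_distance(summer, winter))
print("permutation p-value:", p_value)
```

A small p-value here would suggest that the two seasonal samples do not follow the same distribution, which is the kind of evidence the abstract attributes to the AZ-test before the accuracy check with RF.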
Related Topics
Life Sciences Environmental Science Management, Monitoring, Policy and Law