Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
6576315 | Travel Behaviour and Society | 2018 | 10 Pages |
Abstract
Machine learning methods are widely applied to Global Positioning System (GPS) trajectory data to identify trip purpose and travel mode. The selection of data for the training and test sets is important in these methods. However, the feasibility and effects of drawing these data from different periods of the year are still unknown. This question matters because GPS-based collection reduces the burden on participants to such an extent that a survey can span several seasons, each of which may have distinct characteristics. To bridge this gap, this paper applies Aslan & Zech's test (AZ-test) and then Random Forests (RF) to investigate how selecting training and test data from different seasons influences the results. A dataset collected in Hakodate, Japan, a city with distinct seasons, is used for the empirical analysis. The AZ-test results suggest that the explanatory variables of two data sets from distinct seasons follow different distributions. They also indicate that a two-season data set and a single-season data set follow different distributions. However, the test yields contradictory results in some cases. RF is therefore used to examine in further detail how the accuracy varies. RF confirms the AZ-test findings in most cases. In addition, the RF results show that including GIS features as explanatory variables has a positive effect on identification accuracy, whereas including weather features has a negative effect.
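The idea of checking whether explanatory variables from two seasons follow the same distribution can be illustrated with a small two-sample permutation test. The sketch below is not the authors' AZ-test implementation: it swaps in the plain energy distance on a single one-dimensional variable, and the names `energy_statistic`, `permutation_p_value`, `summer_speed`, and `winter_speed`, as well as the synthetic data, are assumptions made only for illustration.

```python
import random


def energy_statistic(xs, ys):
    """Energy distance between two 1-D samples: 2*E|X-Y| - E|X-X'| - E|Y-Y'|."""
    def mean_abs_diff(a, b):
        return sum(abs(u - v) for u in a for v in b) / (len(a) * len(b))

    return 2 * mean_abs_diff(xs, ys) - mean_abs_diff(xs, xs) - mean_abs_diff(ys, ys)


def permutation_p_value(xs, ys, n_perm=200, seed=0):
    """Estimate how often a random relabelling gives a statistic at least as large."""
    rng = random.Random(seed)
    observed = energy_statistic(xs, ys)
    pooled = list(xs) + list(ys)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if energy_statistic(pooled[:len(xs)], pooled[len(xs):]) >= observed:
            count += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (count + 1) / (n_perm + 1)


# Synthetic, hypothetical "average speed" values for two seasons (not real data).
rng = random.Random(42)
summer_speed = [rng.gauss(5.0, 1.0) for _ in range(60)]
winter_speed = [rng.gauss(3.5, 1.0) for _ in range(60)]
summer_again = [rng.gauss(5.0, 1.0) for _ in range(60)]

p_diff = permutation_p_value(summer_speed, winter_speed)  # seasons differ
p_same = permutation_p_value(summer_speed, summer_again)  # same distribution
print(f"summer vs winter: p = {p_diff:.3f}")
print(f"summer vs summer: p = {p_same:.3f}")
```

A small p-value suggests that the two samples come from different distributions, which is the kind of evidence the abstract reports for data sets from distinct seasons. A multivariate version would replace the absolute difference with a distance between feature vectors.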
Related Topics
Life Sciences
Environmental Science
Management, Monitoring, Policy and Law
Authors
Lei Gong, Ryo Kanamori, Toshiyuki Yamamoto