Article ID: 525342
Journal: Transportation Research Part C: Emerging Technologies
Published Year: 2013
Pages: 16
File Type: PDF
Abstract

•Incident duration prediction models so far either ignore incident record messages or treat them only lightly.
•We apply topic modeling techniques to convert textual information from real-time incident reports into predictive attributes.
•Results show that models with text analysis consistently outperform those without it, decreasing the error by up to 28%.
•The same technique applies to other contexts where information is available in textual form (e.g. special events websites).

Because traffic incidents are heterogeneous and vary case by case, much of the relevant information is recorded in free-form text fields rather than in constrained value fields. These text components therefore contain rich information that is valuable for incident analysis, modeling and prediction. However, the difficulty of formally interpreting such data has led to minimal consideration in previous work.

In this paper, we focus on the task of incident duration prediction, more specifically on predicting clearance time, the period between incident reporting and road clearance. An accurate prediction will help traffic operators implement appropriate mitigation measures and better inform drivers about the expected road blockage time.

The key contribution is the introduction of topic modeling, a text analysis technique, as a tool for extracting information from incident reports in real time. We analyze a dataset of 2 years of accident cases and develop a machine-learning-based duration prediction model that integrates textual with non-textual features. To demonstrate the value of the approach, we compare predictions with and without text analysis using several different prediction models. Models using textual features outperform the others in nearly all circumstances, with errors up to 28% lower than models without such information.
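
As a rough illustration of how free-text reports might be turned into predictive attributes, the sketch below fits a small topic model and appends each report's topic proportions to its non-textual attributes. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract names only "topic modeling", so the choice of Latent Dirichlet Allocation with collapsed Gibbs sampling is an assumption, and the function `lda_topic_features`, the toy reports, and the structured attributes (`lanes_blocked`, `vehicles_involved`) are hypothetical.

```python
import random
from collections import Counter


def lda_topic_features(docs, num_topics=2, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling; return per-document topic proportions.

    docs is a list of token lists. All hyperparameters are illustrative defaults.
    """
    rng = random.Random(seed)
    vocab_size = len({w for d in docs for w in d})
    doc_topic = [[0] * num_topics for _ in docs]          # tokens of doc d in topic k
    topic_word = [Counter() for _ in range(num_topics)]   # count of word w in topic k
    topic_total = [0] * num_topics                        # total tokens in topic k
    assign = []                                           # topic of every token

    # Random initial topic assignment for every token.
    for d, words in enumerate(docs):
        zs = []
        for w in words:
            k = rng.randrange(num_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assign.append(zs)

    # Gibbs sweeps: resample each token's topic given all other assignments.
    for _ in range(iters):
        for d, words in enumerate(docs):
            for i, w in enumerate(words):
                k = assign[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    # Smoothed topic proportions; each row sums to 1.
    return [
        [(c + alpha) / (len(words) + num_topics * alpha) for c in counts]
        for counts, words in zip(doc_topic, docs)
    ]


# Hypothetical incident report messages (not taken from the paper's dataset).
reports = [
    "truck overturned fuel spill lane closed",
    "minor collision two cars shoulder",
    "fuel spill cleanup crew lane closed",
    "two cars minor collision no injuries",
]
docs = [r.split() for r in reports]
topics = lda_topic_features(docs)

# Hypothetical non-textual attributes: [lanes_blocked, vehicles_involved].
structured = [[2, 1], [0, 2], [1, 1], [0, 2]]

# One feature row per incident: structured attributes followed by topic proportions.
features = [s + t for s, t in zip(structured, topics)]
```

Each row of `features` could then be passed to a regression model that predicts clearance time, and the same model trained on `structured` alone would serve as the baseline without text analysis.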
