Machine Learning for Prediction of Mid- to Long-Term Habitual Transportation Mode Use
Prediction of daily transportation mode use (car, public transit, or active travel) is an important task in transportation research. Unlike statistical models, which impose a predetermined model structure, machine learning models are learned from the data, which makes them more flexible and can yield higher prediction accuracy. However, prediction of mid- to long-term habitual modes still relies largely on traditional statistical analysis of small cross-sectional samples. The low interpretability of “black-box” machine learning models limits their usefulness for generating the behavioral insights needed to design appropriate interventions. Leveraging a unique set of longitudinal life course data, this paper is the first use case to demonstrate machine learning methods applied to both predicting and interpreting regularly used travel modes. We combine sequence clustering with tree-based machine learning methods, coupled with TreeExplainer, to predict and interpret habitual travel modes using mid- to long-term predictors. Five life course clusters are derived to provide contexts for evaluation and interpretation. These clusters allow us to improve upon the recently developed TreeExplainer method so that it better distinguishes local and global predictor importance, as well as predictor interactions across subpopulations within distinctive life history contexts. Our results demonstrate a promising step toward interpretable machine learning applications for mid- to long-term prediction of travel modes in transportation planning.
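To make the idea of local versus cluster-level predictor importance concrete, the following is a minimal sketch and not the paper's implementation. It computes exact Shapley attributions for a tiny hand-written decision tree by brute-force enumeration of feature subsets, then averages their absolute values within each cluster. TreeExplainer targets the same kind of attribution for tree models but uses a much faster algorithm. The feature names, tree thresholds, toy records, and cluster labels "A" and "B" are all hypothetical; in the paper the clusters would come from sequence clustering of life course data.

```python
from itertools import combinations
from math import factorial

# Hypothetical mid- to long-term predictors (illustrative only).
FEATURES = ["has_license", "years_with_children", "urban_years"]

def tree_predict(x):
    """Hand-written toy tree standing in for a fitted tree model.
    Returns an assumed probability of habitual car use."""
    if x["has_license"] == 0:
        return 0.1
    if x["years_with_children"] > 5:
        return 0.9
    return 0.4 if x["urban_years"] > 10 else 0.7

def value(x, subset, background):
    """Mean prediction when features in `subset` are fixed to x's values
    and the remaining features are taken from each background row."""
    total = 0.0
    for b in background:
        z = {f: (x[f] if f in subset else b[f]) for f in FEATURES}
        total += tree_predict(z)
    return total / len(background)

def shapley_values(x, background):
    """Exact Shapley values by enumerating all subsets (small feature sets only)."""
    n = len(FEATURES)
    phi = {}
    for f in FEATURES:
        others = [g for g in FEATURES if g != f]
        contrib = 0.0
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for s in combinations(others, k):
                with_f = value(x, set(s) | {f}, background)
                without_f = value(x, set(s), background)
                contrib += weight * (with_f - without_f)
        phi[f] = contrib
    return phi

def mean_abs_by_cluster(rows, attributions):
    """Cluster-level importance: mean absolute local attribution per feature."""
    sums, counts = {}, {}
    for row, phi in zip(rows, attributions):
        c = row["cluster"]
        counts[c] = counts.get(c, 0) + 1
        acc = sums.setdefault(c, {f: 0.0 for f in FEATURES})
        for f in FEATURES:
            acc[f] += abs(phi[f])
    return {c: {f: sums[c][f] / counts[c] for f in FEATURES} for c in sums}

# Toy records with assumed cluster labels.
people = [
    {"cluster": "A", "has_license": 1, "years_with_children": 8, "urban_years": 2},
    {"cluster": "A", "has_license": 1, "years_with_children": 6, "urban_years": 4},
    {"cluster": "B", "has_license": 0, "years_with_children": 0, "urban_years": 15},
    {"cluster": "B", "has_license": 1, "years_with_children": 1, "urban_years": 12},
]

local = [shapley_values(p, people) for p in people]   # local (per-person) importance
cluster_importance = mean_abs_by_cluster(people, local)

for name, importance in cluster_importance.items():
    print(name, importance)
```

Averaging absolute attributions over everyone would give a single global ranking. Grouping by cluster first is one simple way to see whether a predictor matters more in one life history context than in another.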