Time series play a major role in many analysis tasks. As an example, in the stock market, they can be used to model price histories and to make predictions about future trends. Sometimes, information contained in a time series is complemented by other kinds of data, which may be encoded by static attributes, e.g., categorical or numeric ones, or by more general discrete data sequences. In this paper, we present J48SS, a novel decision tree learning algorithm capable of natively mixing static, sequential, and time series data for classification purposes. The proposed solution is based on the well-known C4.5 decision tree learner, and it relies on the concept of time series shapelets, which are generated by means of multi-objective evolutionary computation techniques and, differently from most previous approaches, are not required to be part of the training set. We evaluate the algorithm against a set of well-known UCR time series datasets, and we show that it provides better classification performances with respect to previous approaches based on decision trees, while generating highly interpretable models and effectively reducing the data preparation effort.
A Novel Decision Tree Approach for the Handling of Time Series
Andrea Brunello;Angelo Montanari;
2018-01-01
Abstract
Time series play a major role in many analysis tasks. As an example, in the stock market, they can be used to model price histories and to make predictions about future trends. Sometimes, information contained in a time series is complemented by other kinds of data, which may be encoded by static attributes, e.g., categorical or numeric ones, or by more general discrete data sequences. In this paper, we present J48SS, a novel decision tree learning algorithm capable of natively mixing static, sequential, and time series data for classification purposes. The proposed solution is based on the well-known C4.5 decision tree learner, and it relies on the concept of time series shapelets, which are generated by means of multi-objective evolutionary computation techniques and, differently from most previous approaches, are not required to be part of the training set. We evaluate the algorithm against a set of well-known UCR time series datasets, and we show that it provides better classification performances with respect to previous approaches based on decision trees, while generating highly interpretable models and effectively reducing the data preparation effort.I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.