This paper investigates the performance of Hoeffding Trees in imbalanced regression tasks within streaming environments by incorporating Kernel Density Estimation (KDE) for smoothed predictions and Hierarchical Shrinkage (HS) for post-hoc regularization. The authors extend batch learning KDE and HS methods to incremental decision tree models using a telescoping argument and incremental implementation, respectively. Their experiments on common online regression datasets reveal that KDE improves performance in the initial stages of the stream, while HS provides minimal benefits.
Kernel Density Estimation can improve the early-stage performance of Hoeffding Trees in imbalanced streaming regression, a simple way to gain accuracy while the stream has supplied only a few examples.
Many real-world applications provide a continuous stream of data that is subsequently used by machine learning models to solve regression tasks of interest. Hoeffding trees and their variants have a long-standing tradition in this setting due to their effectiveness, either alone or as base models in broader ensembles. At the same time, a recent line of work in batch learning has shown that kernel density estimation (KDE) is an effective approach for smoothed predictions in imbalanced regression tasks [Yang et al., 2021]. Another recent line of work in batch learning, hierarchical shrinkage (HS) [Agarwal et al., 2022], has introduced a post-hoc regularization method for decision trees that does not alter the structure of the learned tree. Using a telescoping argument, we cast KDE to streaming environments, and we extend the implementation of HS to incremental decision tree models. Armed with these extensions, we investigate the performance of decision trees equipped with these options on datasets commonly used for regression in online settings. We conclude that KDE is beneficial in the early parts of the stream, while HS hardly, if ever, offers performance benefits. Our code is publicly available at: https://github.com/marinaAlchirch/DSFA_2026.
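To make the hierarchical shrinkage idea concrete, the following is a minimal sketch of the batch HS prediction rule as described by Agarwal et al. (2022): the leaf prediction is written as the root mean plus the sum of parent-to-child changes in mean along the root-to-leaf path, and each change is divided by 1 + lambda / N(parent). The names `PathNode` and `hs_predict` are hypothetical and chosen for illustration only; this is not the authors' incremental implementation, which may differ in how node statistics are maintained.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PathNode:
    """Statistics of one node on a root-to-leaf path (hypothetical helper)."""
    n: int        # number of training samples that reached this node
    mean: float   # mean target value of those samples


def hs_predict(path: List[PathNode], lam: float) -> float:
    """Hierarchical-shrinkage prediction for one root-to-leaf path.

    path[0] is the root and path[-1] is the leaf. With lam = 0 the
    sum telescopes back to the plain leaf mean; larger lam pulls the
    prediction toward the means of ancestors with more samples.
    """
    pred = path[0].mean
    for parent, child in zip(path, path[1:]):
        # Shrink each parent-to-child change by 1 + lam / N(parent).
        pred += (child.mean - parent.mean) / (1.0 + lam / parent.n)
    return pred


# Illustrative path: root (100 samples), internal node (40), leaf (10).
example_path = [PathNode(100, 10.0), PathNode(40, 14.0), PathNode(10, 20.0)]
unshrunk = hs_predict(example_path, lam=0.0)    # equals the leaf mean
shrunk = hs_predict(example_path, lam=100.0)    # lies between root and leaf mean
```

Because the rule only needs the sample count and mean target at each node on the path, it leaves the tree structure untouched. In a streaming tree those two quantities would presumably be kept as running statistics, which is an assumption of this sketch rather than a detail stated in the abstract.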