This page collects papers describing the "tree alternating optimisation" (TAO) algorithm, a new way to train many types of decision trees (axisaligned, oblique, etc.) and treebased models, so that they optimise a desired objective function and regularisation term or constraint. TAO can be used by itself to learn a single tree, or as the base learner of a forest (tree ensemble). It vastly outperforms the traditional, greedy recursive partitioning algorithms such as CART or C4.5 (in isolation or as part of a forest), and allows one to learn models that strike a good balance between accuracy and interpretability.
This work has been done in collaboration with my past and present students and collaborators Magzhan Gabidolla, Kuat Gazizov, Yerlan Idelbayev, Rasul Kairgeldin and Arman Zharmagambetov.
It has been funded in part by:

Coming soon.
CarreiraPerpiñán, M. Á. and Tavallali, P. (2018): "Alternating optimization of decision trees, with application to learning sparse oblique trees". Advances in Neural Information Processing Systems 31 (NeurIPS 2018), pp. 12111221.
[external link] [paper preprint] [supplementary material] [poster]
Zharmagambetov, A. and CarreiraPerpiñán, M. Á. (2020): "Smaller, more accurate regression forests using tree alternating optimization". 37th Int. Conf. Machine Learning (ICML 2020), pp. 1139811408.
[external link] [paper preprint] [slides] [supplementary material] [video]
CarreiraPerpiñán, M. Á. and Zharmagambetov, A. (2020): "Ensembles of bagged TAO trees consistently improve over random forests, AdaBoost and gradient boosting". ACMIMS Foundations of Data Science Conf. (FODS 2020), pp. 3546.
[external link] [paper preprint] [slides]
Gabidolla, M., Zharmagambetov, A. and CarreiraPerpiñán, M. Á. (2020): "Boosted sparse oblique decision trees". Bay Area Machine Learning Symposium (BayLearn 2020).
[external link] [paper preprint] [video]
CarreiraPerpiñán, M. Á. and Hada, S. S. (2021): "Counterfactual explanations for oblique decision trees: exact, efficient algorithms". 35th AAAI Conf. Artificial Intelligence (AAAI 2021), pp. 69036911.
[external link] [paper preprint] [slides] [poster] [Python implementation (coming soon)]
Longer version: Mar. 1, 2021, arXiv:2103.01096: [external link] [paper preprint]
Zharmagambetov, A. and CarreiraPerpiñán, M. Á. (2021): "Learning a tree of neural nets". IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP 2021), pp. 31403144.
[external link] [paper preprint] [slides] [© IEEE]
Zharmagambetov, A., Gabidolla, M. and CarreiraPerpiñán, M. Á. (2021): "Improved boosted regression forests through nongreedy tree optimization". Int. Joint Conf. on Neural Networks (IJCNN 2021).
[external link] [paper preprint] [slides] [© IEEE]
Zharmagambetov, A., Hada, S. S., Gabidolla, M. and CarreiraPerpiñán, M. Á. (2021): "Nongreedy algorithms for decision tree optimization: an experimental comparison". Int. Joint Conf. on Neural Networks (IJCNN 2021).
[external link] [paper preprint] [slides] [© IEEE]
Older version: "An experimental comparison of old and new decision tree algorithms", Mar. 20, 2020, arXiv:1911.03054: [external link] [paper preprint]
Hada, S. S., CarreiraPerpiñán, M. Á. and Zharmagambetov, A. (2021): "Understanding and manipulating neural net features using sparse oblique classification trees". IEEE Int. Conf. Image Processing (ICIP 2021), pp. 37073711.
[external link] [paper preprint] [slides] [poster] [© IEEE]
Zharmagambetov, A. and CarreiraPerpiñán, M. Á. (2021): "A simple, effective way to improve neural net classification: ensembling unit activations with a sparse oblique decision tree". IEEE Int. Conf. Image Processing (ICIP 2021), pp. 369373.
[external link] [paper preprint] [slides] [poster] [© IEEE]
Zharmagambetov, A., Gabidolla, M. and CarreiraPerpiñán, M. Á. (2021): "Improved multiclass Adaboost for image classification: the role of tree optimization". IEEE Int. Conf. Image Processing (ICIP 2021), pp. 424428.
[external link] [paper preprint] [slides] [poster] [© IEEE]
Hada, S. S. and CarreiraPerpiñán, M. Á. (2021): "Exploring counterfactual explanations for classification and regression trees". Int. Workshop and Tutorial on eXplainable Knowledge Discovery in Data Mining (at ECML 2021).
[external link] [paper preprint] [slides]
Zharmagambetov, A., Gabidolla, M. and CarreiraPerpiñán, M. Á. (2021): "Softmax Tree: an accurate, fast classifier when the number of classes is large". Conf. Empirical Methods in Natural Language Processing (EMNLP 2021), pp. 1073010745.
[external link] [paper preprint] [slides] [poster]
Zharmagambetov, A. and CarreiraPerpiñán, M. Á. (2022): "Learning interpretable, treebased projection mappings for nonlinear embeddings". 25th Int. Conf. Artificial Intelligence and Statistics (AISTATS 2022), pp. 95509570.
[external link] [paper preprint] [slides] [poster] [animations] [video]
Hada, S. S. and CarreiraPerpiñán, M. Á. (2022): "Interpretable image classification using sparse oblique decision trees". IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP 2022), pp. 27592763.
[external link] [paper preprint] [slides] [poster] [© IEEE]
Gabidolla, M. and CarreiraPerpiñán, M. Á. (2022): "Pushing the envelope of gradient boosting forests via globallyoptimized oblique trees". IEEE Conf. Computer Vision and Pattern Recognition (CVPR 2022), pp. 285294.
[external link] [paper preprint] [supplementary material] [slides] [poster] [© IEEE]
Gabidolla, M., Zharmagambetov, A. and CarreiraPerpiñán, M. Á. (2022): "Improved multiclass AdaBoost using sparse oblique decision trees". Int. Joint Conf. on Neural Networks (IJCNN 2022).
[external link] [paper preprint] [slides] [© IEEE]
Hada, S. S. and CarreiraPerpiñán, M. Á. (2022): "Sparse oblique decision trees: a tool to interpret natural language processing datasets". Int. Joint Conf. on Neural Networks (IJCNN 2022).
[external link] [paper preprint] [slides] [© IEEE]
Gabidolla, M. and CarreiraPerpiñán, M. Á. (2022): "Optimal interpretable clustering using oblique decision trees". ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining (KDD 2022), pp. 400410.
[external link] [paper preprint] [slides] [poster]
Zharmagambetov, A. and CarreiraPerpiñán, M. Á. (2022): "Semisupervised learning with decision trees: graph Laplacian tree alternating optimization". Advances in Neural Information Processing Systems 35 (NeurIPS 2022), pp. 23922405.
[external link] [external link2] [paper preprint] [supplementary material] [poster] [video]
Extended abstract at the Bay Area Machine Learning Symposium (BayLearn 2022): [paper preprint] [poster]
CarreiraPerpiñán, M. Á. and Hada, S. S. (2023): "Very fast, approximate counterfactual explanations for decision forests". 37th AAAI Conf. Artificial Intelligence (AAAI 2023), pp. 69356943.
[external link] [paper preprint] [slides] [slides] [Python implementation (coming soon)]
Longer version: Mar. 5, 2023, arXiv:2303.02883: [external link] [paper preprint]
Hada, S. S., CarreiraPerpiñán, M. Á. and Zharmagambetov, A.: "Sparse oblique decision trees: a tool to understand and manipulate neural net features". Data Mining and Knowledge Discovery, (): (to appear).
[external link] [paper preprint]
Also as: Jan. 30, 2023, arXiv:2104.02922: [external link] [paper preprint]
Many of the figures in the publisher's version are badly messed up, with wrong labels. The arXiv paper has the correct figures.
CarreiraPerpiñán, M. Á., Gabidolla, M. and Zharmagambetov, A. (2023): "Towards better decision forests: Forest Alternating Optimization". IEEE Conf. Computer Vision and Pattern Recognition (CVPR 2023), pp. 75897598.
[external link] [OpenReview] [paper preprint] [supplementary material] [poster] [slides] [© IEEE]
CVPR 2023 Art Gallery videos:
This animation for a 2D dataset with 6 classes shows the result of training an oblique decision tree of depth 3 (with 8 leaves), with random initial parameters, for one TAO iteration. The animation shows the result of optimizing (bottomup, i.e., from the leaves towards the root) each node's parameters given the other nodes are fixed. The objective function over the whole tree decreases monotonically after each update. Note that some decision nodes are automatically pruned in the process. Animation done by Magzhan Gabidolla.
This animation for a 2D dataset with 6 classes shows the result of training a forest of 2 oblique decision trees of depth 2 (with 4 leaves each), with random initial parameters, for 3 FAO iterations. FAO (Forest Alternating Optimization) optimizes each tree in turn using TAO, and also optimizes all leaves of all trees jointly. The objective function over the whole forest decreases monotonically after each update. Nodes can also be automatically pruned in the process (although this does not happen in this particular case). Animation done by Magzhan Gabidolla.