Optimal Decision Trees
As powerful as black-box artificial intelligence with the interpretability of a single decision tree
On interpretability, trees rate an A+
This level of transparency permits exploring the tree through "what-if" analysis, reinforcing global understanding of the decision process. Anyone familiar with the domain and data can audit the branching logic to judge whether it agrees with intuition and would be robust to new data, permitting an extremely high level of confidence in the behavior of the model.
Example Optimal Decision Tree predicting the probability that a property is going to sell on a real estate platform
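As a rough illustration of "what-if" analysis on a single tree, the Python sketch below walks a small hand-written tree and reports both the prediction and the rules that led to it. The tree, the feature names (`price_to_estimate_ratio`, `days_on_market`), the thresholds, and the leaf probabilities are all hypothetical and are not taken from the real estate model shown above.

```python
# Hypothetical hand-written tree: feature names, thresholds and leaf
# probabilities are illustrative assumptions, not values from a trained model.
TREE = {
    "feature": "price_to_estimate_ratio",
    "threshold": 1.05,
    "left": {
        "feature": "days_on_market",
        "threshold": 30,
        "left": {"leaf": 0.80},
        "right": {"leaf": 0.55},
    },
    "right": {"leaf": 0.20},
}


def predict_with_path(node, sample):
    """Return the leaf probability and the list of rules followed to reach it."""
    path = []
    while "leaf" not in node:
        feature, threshold = node["feature"], node["threshold"]
        go_left = sample[feature] <= threshold
        path.append(f"{feature} {'<=' if go_left else '>'} {threshold}")
        node = node["left"] if go_left else node["right"]
    return node["leaf"], path


# Baseline listing, priced 10% above its estimate.
listing = {"price_to_estimate_ratio": 1.10, "days_on_market": 10}
baseline, baseline_path = predict_with_path(TREE, listing)

# What if the same listing were priced at its estimate instead?
what_if = dict(listing, price_to_estimate_ratio=1.00)
scenario, scenario_path = predict_with_path(TREE, what_if)

print(baseline, baseline_path)
print(scenario, scenario_path)
```

Because every prediction comes with the exact rules that produced it, a domain expert can check each branch against intuition one scenario at a time.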
Performance breakthrough with modern optimization
Harnessing the power of modern optimization, Optimal Decision Trees (OCT and OCT-H) deliver performance comparable to black-box methods while maintaining the interpretability of a single decision tree.
Our Optimal Decision Trees are named one of the pre-eminent examples of self-explanatory models by the National Institute of Standards and Technology (NIST) in its reference white paper on Explainable AI.
A large-scale benchmark study compares the out-of-sample performance of Optimal Decision Trees (green and blue) against other methods
Specialized for each problem
Four flavors of trees tailored to different problem types
- Optimal Classification Trees
  Predicts discrete labels - is this loan likely to default or not?
- Optimal Regression Trees
  Predicts continuous/numeric values - what is the expected revenue for next quarter?
- Optimal Survival Trees
  Predicts survival over time - what is the chance the machine breaks in the next week/month/quarter?
- Optimal Prescriptive and Policy Trees
  Prescribes personalized optimal decisions - which marketing outreach strategy is best for each client?
For more details see Causal Inference and Policy Learning
Global optimization extracts maximum value from data
Harnessing advances in modern optimization techniques, our proprietary optimization engine constructs the entire tree at once, rather than greedily one split at a time, to find the tree that best fits the data.
Compared to existing decision tree methods, Optimal Decision Trees deliver both higher predictive power and smaller trees by making efficient use of data.
A synthetic study demonstrates the improved out-of-sample performance of Optimal Classification Trees over CART, most pronounced with limited training data
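To make the difference between building a tree one split at a time and choosing all splits together more concrete, here is a minimal Python sketch on a toy dataset. It is not the proprietary optimization engine and does not use mixed-integer optimization: it simply enumerates every depth-2 tree by brute force, which is only feasible at this tiny scale. The dataset (an XOR label on `x1` and `x2`, plus a distractor `x3` that agrees with the label 75% of the time) and all function names are assumptions made for illustration.

```python
from collections import Counter
from itertools import product


def make_data():
    """Toy data: y = x1 XOR x2, and x3 agrees with y in 3 of every 4 rows."""
    rows = []
    for x1, x2 in product([0, 1], repeat=2):
        y = x1 ^ x2
        for k in range(4):
            x3 = y if k < 3 else 1 - y
            rows.append(((x1, x2, x3), y))
    return rows


def leaf_errors(rows):
    """Misclassified rows if the leaf predicts its majority label."""
    if not rows:
        return 0
    counts = Counter(y for _, y in rows)
    return len(rows) - max(counts.values())


def split(rows, f):
    """Split rows on binary feature f (0 goes left, 1 goes right)."""
    left = [r for r in rows if r[0][f] == 0]
    right = [r for r in rows if r[0][f] == 1]
    return left, right


def best_stump(rows, n_features):
    """Best single split: returns (errors, feature), ties go to lowest index."""
    return min(
        (sum(leaf_errors(part) for part in split(rows, f)), f)
        for f in range(n_features)
    )


def greedy_depth2(rows, n_features):
    """Greedy: fix the locally best root split, then split each child."""
    _, root = best_stump(rows, n_features)
    left, right = split(rows, root)
    errors = best_stump(left, n_features)[0] + best_stump(right, n_features)[0]
    return root, errors


def optimal_depth2(rows, n_features):
    """Exhaustive: evaluate every root together with its best children."""
    best = None
    for root in range(n_features):
        left, right = split(rows, root)
        errors = best_stump(left, n_features)[0] + best_stump(right, n_features)[0]
        if best is None or errors < best[1]:
            best = (root, errors)
    return best


rows = make_data()
print("greedy  (root, errors):", greedy_depth2(rows, 3))   # (2, 4)
print("optimal (root, errors):", optimal_depth2(rows, 3))  # (0, 0)
```

The greedy search commits to the distractor `x3` at the root because it looks best in isolation, and then cannot recover. Considering the root and its children together finds the tree that classifies all 16 rows correctly.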
Practical and scalable
- Supports all data types
  Support for numeric, categorical, ordinal and missing data, with tailored algorithms to extract the most value from each.
- Hyperplane splits
  Unique to Optimal Decision Trees, hyperplane splits permit use of more than one feature at a time, enabling more expressive modeling and better performance.
- Scales to real-world problems
  Our proprietary optimization engine scales to practical problem sizes with millions of observations, unlike other proposals for global tree learning.
Example hyperplane splits into three classes using two variables
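The idea behind hyperplane splits can be sketched in a few lines of Python. The example below is an illustration under stated assumptions, not how OCT-H is trained: it compares the best single axis-parallel split (one feature against a threshold) with the best single split on a linear combination of two features, where the candidate weight vectors are drawn from a small hypothetical grid of values in {-1, 0, 1}. The toy dataset and function names are likewise invented for this sketch.

```python
from itertools import product


def leaf_errors(labels):
    """Misclassified points if the leaf predicts its majority (binary) label."""
    ones = sum(labels)
    return min(ones, len(labels) - ones)


def split_errors(points, labels, weights, threshold):
    """Errors of the split 'weights . x <= threshold' with majority leaves."""
    left, right = [], []
    for x, y in zip(points, labels):
        score = sum(w * xi for w, xi in zip(weights, x))
        (left if score <= threshold else right).append(y)
    return leaf_errors(left) + leaf_errors(right)


def best_split(points, labels, candidate_weights):
    """Search candidate weight vectors and midpoint thresholds for one split."""
    best = None
    for weights in candidate_weights:
        scores = sorted({sum(w * xi for w, xi in zip(weights, x)) for x in points})
        for lo, hi in zip(scores, scores[1:]):
            threshold = (lo + hi) / 2
            errors = split_errors(points, labels, weights, threshold)
            if best is None or errors < best[0]:
                best = (errors, weights, threshold)
    return best


# Toy data on a 3x3 grid; the class boundary is the diagonal x1 + x2 = 2.5.
points = list(product(range(3), repeat=2))
labels = [1 if x1 + x2 >= 3 else 0 for x1, x2 in points]

axis_parallel = [(1, 0), (0, 1)]  # one feature at a time
hyperplanes = [w for w in product([-1, 0, 1], repeat=2) if w != (0, 0)]

axis_errors = best_split(points, labels, axis_parallel)[0]
hyper_errors = best_split(points, labels, hyperplanes)[0]
print("best axis-parallel split errors:", axis_errors)  # 2
print("best hyperplane split errors:", hyper_errors)    # 0
```

A diagonal boundary like this one would need several axis-parallel splits to approximate, while a single split on a combination of both features captures it exactly.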
Selected cases using Optimal Trees
Pricing for Real Estate Auctions
Real Estate
Optimizing pricing recommendations to increase sales rates and revenue
Surgical Risk Calculator: POTTER
Healthcare
A highly accurate and understandable risk predictor trusted by top surgeons
Understanding Machine Failures in Car Manufacturing Plants
Manufacturing
Enabling collaboration between data scientists and operators in automobile manufacturing
Related publications
Optimal Classification Trees
Dimitris Bertsimas and Jack Dunn
Machine Learning, 2017
The original publication by the co-founders pioneering Optimal Trees. The paper develops the first scalable mixed-integer optimization formulation for training optimal decision trees and presents empirical results showing that such trees outperform classical methods such as CART.
Optimal Prescriptive Trees
Dimitris Bertsimas, Jack Dunn, and Nishanth Mundru
INFORMS Journal on Optimization, 2019
This paper extends the optimal trees optimization framework to the field of prescriptive decision making. The resulting Optimal Prescriptive Trees learn how to prescribe directly from observational data, and perform competitively with the best black-box methods for the same task.
Optimal Survival Trees
Dimitris Bertsimas, Jack Dunn, Emma Gibson, and Agni Orfanoudaki
Machine Learning, 2021
The optimal trees optimization framework is extended to the task of survival analysis. Optimal Survival Trees learn factors that affect survival over a continuous time period, with direct applications to healthcare and predictive maintenance.
Optimal Policy Trees
Maxime Amram, Jack Dunn, and Ying Daisy Zhuo
Machine Learning, under review
Optimal Policy Trees combine methods from the causal inference literature with global optimality under the Optimal Trees framework. This method yields interpretable prescription policies, is highly scalable, handles both discrete and continuous treatments, and has shown superior performance in multiple experiments.