Optimal Decision Trees

As powerful as black-box artificial intelligence, with the interpretability of a single decision tree

On interpretability, trees rate an A+

The logic of a decision tree is easy to follow and transparent. It mimics the human thought process by successively asking questions that adapt based on previous answers, until enough knowledge is gained to make a final prediction.

This level of transparency permits exploring the tree through "what-if" analysis, reinforcing a global understanding of the decision process. Anyone familiar with the domain and data can audit the branching logic to judge whether it agrees with intuition and would be robust to new data, giving an extremely high level of confidence in the model's behavior.

Example Optimal Decision Tree predicting the probability that a property is going to sell on a real estate platform
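
To make the branching logic concrete, the sketch below walks through a tiny tree of this kind in plain Python. The features (listing price relative to the area median, days on the market, number of photos) and all thresholds and leaf probabilities are hypothetical, chosen only to show the question-by-question flow and how a "what-if" analysis works.

    def sell_probability(price_ratio, days_listed, num_photos):
        """Follow a small, hypothetical tree one question at a time."""
        if price_ratio < 1.1:              # Q1: priced close to the area median?
            if num_photos >= 10:           # Q2: is the listing well presented?
                return 0.82                # leaf: very likely to sell
            return 0.55                    # leaf: moderately likely to sell
        if days_listed > 60:               # Q2': has it lingered on the market?
            return 0.12                    # leaf: unlikely to sell
        return 0.35                        # leaf: somewhat unlikely to sell

    # "What-if" analysis: change one answer and watch the prediction move.
    print(sell_probability(price_ratio=1.05, days_listed=20, num_photos=12))  # 0.82
    print(sell_probability(price_ratio=1.05, days_listed=20, num_photos=4))   # 0.55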

Performance breakthrough with modern optimization

Traditional decision tree methods like CART deliver markedly lower predictive performance than black-box methods like random forests and boosting.

Harnessing the power of modern optimization, Optimal Decision Trees (OCT and OCT-H) deliver performance comparable to black-box methods while maintaining the interpretability of a single decision tree.

Our Optimal Decision Trees are named as one of the pre-eminent examples of self-explanatory models by the National Institute of Standards and Technology (NIST) in its reference white paper on Explainable AI.

A large-scale benchmark study compares the out-of-sample performance of Optimal Decision Trees (green and blue) against other methods

Specialized for each problem

Four flavors of trees tailored to different problem types

  • Optimal Classification Trees

    Predicts discrete labels - is this loan likely to default or not?

  • Optimal Regression Trees

    Predicts continuous/numeric values - what is the expected revenue for next quarter?

  • Optimal Survival Trees

    Predicts survival over time - what is the chance the machine breaks in the next week/month/quarter?

  • Optimal Prescriptive and Policy Trees

    Prescribes personalized optimal decisions - which marketing outreach strategy is best for each client?
    For more details, see Causal Inference and Policy Learning. The sketch after this list illustrates the form of training data each flavor learns from.
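
The four flavors differ mainly in the form of the training data they learn from. The following plain-Python sketch shows hypothetical toy inputs for each task; every feature name and value is illustrative only.

    # Features shared by all four flavors: one row per observation.
    X = [
        {"income": 54_000, "credit_score": 710, "region": "north"},
        {"income": 23_000, "credit_score": 580, "region": "south"},
    ]

    # Optimal Classification Trees: a discrete label per observation.
    y_class = ["no_default", "default"]

    # Optimal Regression Trees: a continuous numeric target per observation.
    y_revenue = [12_300.0, 4_150.0]

    # Optimal Survival Trees: a time-to-event plus an indicator of whether
    # the event (e.g. machine failure) was actually observed or censored.
    times = [34.0, 92.0]                  # e.g. weeks until failure or end of study
    event_observed = [True, False]

    # Optimal Prescriptive and Policy Trees: the treatment each subject
    # received and the outcome that followed, taken from observational data.
    treatments = ["email_campaign", "phone_call"]
    outcomes = [1.0, 0.0]                 # e.g. 1.0 = client converted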

Global optimization extracts maximum value from data

Traditional decision tree methods like CART construct decision trees using greedy heuristics that form the tree one split at a time.

Harnessing advances in modern optimization, our proprietary optimization engine constructs the entire tree at once, searching globally for the tree that best fits the data.

Compared to existing decision tree methods, Optimal Decision Trees deliver both higher predictive power and smaller trees, by making efficient use of data.

A synthetic study demonstrates the improved out-of-sample performance of Optimal Classification Trees over CART, most pronounced with limited training data
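
As a minimal illustration of the two approaches side by side, the sketch below fits a greedy CART tree with scikit-learn on synthetic data; the commented lines give an indicative sketch of how an Optimal Classification Tree could be trained with the interpretableai Python package. The exact class and method names should be checked against the package documentation, and a license is required.

    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier

    X, y = make_classification(n_samples=500, n_features=10, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    # Greedy baseline: CART grows the tree one split at a time.
    cart = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X_train, y_train)
    print("CART test accuracy:", cart.score(X_test, y_test))

    # Global optimization: the entire tree is formed in a single search.
    # (Indicative use of the interpretableai package; check the documentation
    #  for the exact interface.)
    # from interpretableai import iai
    # grid = iai.GridSearch(iai.OptimalTreeClassifier(random_seed=0),
    #                       max_depth=range(1, 5))
    # grid.fit(X_train, y_train)
    # print("OCT test score:",
    #       grid.score(X_test, y_test, criterion='misclassification'))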

Practical and scalable

Our optimization-based approach enables unprecedented flexibility and speed in model construction.
  • Supports all data types

    Support for numeric, categorical, ordinal and missing data, with tailored algorithms to extract the most value from each.

  • Hyperplane splits

    Unique to Optimal Decision Trees, hyperplane splits can use more than one feature at a time in a single question, enabling more expressive models and better performance (see the sketch below).

  • Scales to real-world problems

    Our proprietary optimization engine scales to practical problem sizes with millions of observations, unlike other proposals for global tree learning.

Example hyperplane splits into three classes using two variables
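
As a concrete illustration, the sketch below contrasts an axis-aligned split, which asks a question about a single feature, with a hyperplane split, which asks about a weighted combination of features. Both features, the weights, and the thresholds are hypothetical.

    # One observation described by two features.
    x1, x2 = 3.2, 7.5     # e.g. living area (hundreds of sq ft) and property age (years)

    # Axis-aligned split (standard trees): one feature per question.
    goes_left_axis = x1 < 5.0

    # Hyperplane split (OCT-H): a weighted combination of features per question.
    # "Is 0.6*x1 + 0.4*x2 below 6?" -- weights and threshold are hypothetical.
    goes_left_hyperplane = 0.6 * x1 + 0.4 * x2 < 6.0     # 0.6*3.2 + 0.4*7.5 = 4.92

    print(goes_left_axis, goes_left_hyperplane)           # True True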

Related publications

Optimal Classification Trees

Dimitris Bertsimas and Jack Dunn

Machine Learning, 2017

The original publication by the co-founders pioneering Optimal Trees. The paper develops the first scalable mixed-integer optimization formulation for training optimal decision trees and presents empirical results showing that such trees outperform classical methods such as CART.

Optimal Prescriptive Trees

Dimitris Bertsimas, Jack Dunn, and Nishanth Mundru

INFORMS Journal on Optimization, 2019

This paper extends the Optimal Trees optimization framework to the field of prescriptive decision making. The resulting Optimal Prescriptive Trees learn how to prescribe directly from observational data, and perform competitively with the best black-box methods for the same task.

Optimal Survival Trees

Dimitris Bertsimas, Jack Dunn, Emma Gibson, and Agni Orfanoudaki

Machine Learning, 2021

The Optimal Trees optimization framework is extended to the task of survival analysis. Optimal Survival Trees learn factors that affect survival over a continuous time period, with direct applications to healthcare and predictive maintenance.

Optimal Policy Trees

Maxime Amram, Jack Dunn, and Ying Daisy Zhuo

Machine Learning, under review

Optimal Policy Trees combine methods from the causal inference literature with global optimality under the Optimal Trees framework. The method yields interpretable prescription policies, is highly scalable, handles both discrete and continuous treatments, and has shown superior performance in multiple experiments.

Want to try Optimal Decision Trees?
We provide free academic licenses and evaluation licenses for commercial use.
We also offer consulting services to develop interpretable solutions to your key problems.

© 2020 Interpretable AI, LLC. All rights reserved.