a field guide, distilled from MIT OpenCourseWare · system by [email protected]
Intelligence is the part of the system that learns. A machine that runs collects data, and data used well makes the next run smarter. The question is never whether it learns; it is what it learns from. Here are the durable ideas from MIT's machine-learning and analytics courses, in plain terms, each one cited to the course it came from.
Every business decision is a prediction. Analytics makes the prediction honest.
Under every choice (which lead to call, which offer to run, which customer is about to leave) is a forecast about what will happen. Analytics replaces the gut version with one built from data you can check: regression, decision trees, and text analysis turn a hunch into a number you can be right or wrong about. The discipline is comparing the prediction to what actually happened, then updating.
A model is only as honest as the data and the question behind it. Garbage in is still garbage out, and a confident prediction on bad data is more dangerous than a hunch you knew to doubt.
The Analytics Edge 15.071-spring-2017 (Prof. Dimitris Bertsimas)
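The predict, compare, update loop can be sketched in a few lines. This is a minimal illustration, not the course's own code: the spend and lead figures are made up, and the names `fit_line` and `mean_abs_error` are hypothetical helpers written for this example.

```python
from statistics import mean

def fit_line(xs, ys):
    """One-input least-squares regression: returns (slope, intercept)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

def mean_abs_error(predicted, actual):
    """Average size of the miss, in the same units as the outcome."""
    return mean(abs(p - a) for p, a in zip(predicted, actual))

# Made-up history: ad spend (hundreds of dollars) and leads received.
spend = [1, 2, 3, 4, 5]
leads = [3, 5, 7, 9, 11]
slope, intercept = fit_line(spend, leads)

# Predict: turn the hunch into a number.
new_spend = [6, 7]
forecast = [slope * x + intercept for x in new_spend]

# Compare: check the number against what actually happened.
actual = [12, 16]
error = mean_abs_error(forecast, actual)

# Update: refit with the new outcomes included.
new_slope, new_intercept = fit_line(spend + new_spend, leads + actual)

print(f"forecast={forecast}, actual={actual}, mean miss={error}")
print(f"slope before={slope:.3f}, after update={new_slope:.3f}")
```

The error figure is the point of the exercise: it is the number the forecast can be held to.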
A model that only learns from itself overfits to its own habitat.
Overfitting is when a model memorizes the quirks of the small sample it trained on instead of the real pattern, so it looks perfect on what it has seen and fails on anything new. Generalization is the opposite: performing on data it was never shown. The cure is more data, and more varied data. One business is a sample of one market; a network of businesses is a larger, broader sample, so a model trained across the network generalizes where a walled-off one overfits.
More data only helps if it is relevant. A network of unrelated businesses adds noise, not signal. The real edge comes from many operators solving a similar problem, where one brand's hard-won result becomes a starting point the next one inherits.
Introduction to Machine Learning 6.036-fall-2020 (Profs. Kaelbling & Lozano-Pérez)
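The gap between "perfect on what it has seen" and "fails on anything new" shows up when you score a model on held-out data. The sketch below is a toy under stated assumptions: the data is invented (a true pattern of y = 2x with noise added to the training points), and `memorizer` and `fit_line` are hypothetical stand-ins for an overfit model and a simpler one.

```python
from statistics import mean

# Invented data: the true pattern is y = 2x; training points carry noise.
train_x = [1, 2, 3, 4]
train_y = [2.5, 3.5, 6.5, 7.5]
test_x = [1.4, 2.6, 3.4]          # never shown during training
test_y = [2.8, 5.2, 6.8]

def memorizer(x):
    """Overfit model: return the outcome of the closest training point."""
    nearest = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return train_y[nearest]

def fit_line(xs, ys):
    """Simple model: a least-squares straight line."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return lambda x: slope * x + intercept

def avg_miss(model, xs, ys):
    return mean(abs(model(x) - y) for x, y in zip(xs, ys))

line = fit_line(train_x, train_y)

memo_train = avg_miss(memorizer, train_x, train_y)
memo_test = avg_miss(memorizer, test_x, test_y)
line_train = avg_miss(line, train_x, train_y)
line_test = avg_miss(line, test_x, test_y)

print(f"memorizer: train miss={memo_train:.2f}, test miss={memo_test:.2f}")
print(f"line:      train miss={line_train:.2f}, test miss={line_test:.2f}")
```

The memorizer scores a perfect zero on its own data and does worse than the line on the points it never saw. Only the held-out number tells you which model learned the pattern.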
Predict, then optimize. They are two steps, not one.
Analytics has two halves. First predict what is likely to happen; then optimize, which is choosing the best action given that prediction and your constraints. A forecast that never becomes a decision is just a dashboard. Most of the value sits in the second step, the one that changes what you do on Monday.
The optimization is only as good as the objective you give it. Optimize for booked calls and you may get cheap, unqualified ones. Name the real goal, constraints included, before you let a model chase it.
The Analytics Edge 15.071-spring-2017 (Prof. Dimitris Bertsimas)
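Here is a small sketch of the two steps kept separate, and of how the objective changes the answer. Everything in it is assumed for illustration: the lead records, the `p_close` probabilities (standing in for the output of some prediction model), and the two-call capacity limit.

```python
# Step 1, predict: assumed output of a prediction model, one row per lead.
leads = [
    {"name": "A", "p_close": 0.8, "deal_value": 100},
    {"name": "B", "p_close": 0.5, "deal_value": 400},
    {"name": "C", "p_close": 0.3, "deal_value": 1000},
    {"name": "D", "p_close": 0.6, "deal_value": 200},
]

CALLS_AVAILABLE = 2  # the constraint: only two calls fit in the day

def expected_value(lead):
    return lead["p_close"] * lead["deal_value"]

def choose(leads, objective, capacity):
    """Step 2, optimize: take the top leads by the stated objective.

    Every call uses one slot, so picking the highest scores is the best
    plan under this simple constraint.
    """
    return sorted(leads, key=objective, reverse=True)[:capacity]

# Objective 1: chase the likeliest closes, ignoring what they are worth.
by_probability = choose(leads, lambda l: l["p_close"], CALLS_AVAILABLE)

# Objective 2: chase expected revenue, the goal actually named.
by_value = choose(leads, expected_value, CALLS_AVAILABLE)

for label, plan in [("by probability", by_probability), ("by value", by_value)]:
    names = [l["name"] for l in plan]
    total = sum(expected_value(l) for l in plan)
    print(f"{label}: call {names}, expected revenue {total:.0f}")
```

The forecasts are identical in both plans; only the objective differs, and it picks a completely different pair of leads.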
Most of what a customer tells you is unstructured text. A model can read it at scale.
Replies, reviews, and form notes are messy language, not tidy columns, and that is exactly where intent hides. Text analytics turns language into features a model can score, so every inbound message can be read, sorted, and ranked by how likely it is to buy. This is the qualify step done by machine instead of by hand.
Language is context-heavy and easy to misread; a sarcastic "great" is not a happy customer. Keep a human on the edge cases and treat the score as a ranking aid, not a verdict.
The Analytics Edge 15.071-spring-2017 (Prof. Dimitris Bertsimas)
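One simple way to turn language into features is to count words and weight them. The sketch below assumes a hand-set `WEIGHTS` table as a stand-in for weights a trained model would learn; the words, the numbers, and the sample messages are all invented for the example.

```python
import re
from collections import Counter

# Assumption: hand-set weights standing in for what a trained model learns.
WEIGHTS = {
    "buy": 2.0,
    "quote": 1.5,
    "price": 1.0,
    "refund": -1.5,
    "unsubscribe": -3.0,
}

def features(text):
    """Bag of words: lowercase the text and count each word."""
    return Counter(re.findall(r"[a-z']+", text.lower()))

def score(text):
    """Weighted sum of word counts; higher suggests stronger buying intent."""
    counts = features(text)
    return sum(WEIGHTS.get(word, 0.0) * n for word, n in counts.items())

messages = [
    "Please unsubscribe me",
    "Can I get a quote? Ready to buy this week.",
    "What is the price?",
]

# Rank the inbox so the likeliest buyers are read first.
ranked = sorted(messages, key=score, reverse=True)
for msg in ranked:
    print(f"{score(msg):+.1f}  {msg}")
```

Word counts ignore tone and word order, so a sarcastic message scores the same as a sincere one. That is why the score works as a sorting aid with a person reviewing the close calls.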
Every "smarter model" claim rests on statistical inference.
Inference is estimating the truth from limited, noisy data while stating how sure you can be. It is the engine under classification and regression and the reason a model can be trusted at all. Methods like support vector machines, boosting, and Bayesian models are different ways to draw that estimate and carry the uncertainty through to the answer.
A claim with no measure of uncertainty is a guess presented as a fact. Rigor means stating how confident the model is, and on how much data, before anyone bets money on its output.
Machine Learning 6.867-fall-2006 (Prof. Tommi Jaakkola)
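To make "how confident, and on how much data" concrete, here is a sketch of a confidence interval around a conversion rate. It uses the textbook normal approximation, which is not one of the methods named above and is itself rough at small sample sizes; the counts are invented and `rate_interval` is a hypothetical helper.

```python
from math import sqrt
from statistics import NormalDist

def rate_interval(successes, trials, confidence=0.95):
    """Normal-approximation interval for a rate, clipped to the range 0..1."""
    p = successes / trials
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    half_width = z * sqrt(p * (1 - p) / trials)
    return max(0.0, p - half_width), min(1.0, p + half_width)

# The same observed 20% rate, on very different amounts of data.
small = rate_interval(4, 20)
large = rate_interval(400, 2000)

print(f"4 of 20:     20% conversion, plausibly {small[0]:.1%} to {small[1]:.1%}")
print(f"400 of 2000: 20% conversion, plausibly {large[0]:.1%} to {large[1]:.1%}")
```

Both samples report the same headline rate, but the small one leaves a far wider range of plausible values. Reporting the range alongside the rate is what separates an estimate from a guess.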
Walled off, a model memorizes its own data. On the network, it learns the pattern. Overfitting versus generalization, the core lesson of 6.036-fall-2020.
A forecast is not a decision. Predict what will happen, optimize the action, then act. After 15.071-spring-2017.
Sources · MIT OpenCourseWare
The Analytics Edge 15.071-spring-2017 (Prof. Dimitris Bertsimas)