New research streamlines machine learning

To solve complex problems, data scientists must shepherd their raw data through a series of steps, each one requiring many human-driven decisions. The last step in the process, deciding on a modeling technique, is particularly crucial. A new automated machine-learning system performs as well as or better than its human counterparts, and works 100 times faster.


The tremendous recent growth of data science — both as a discipline and an application — can be attributed, in part, to its robust problem-solving power: It can predict when credit card transactions are fraudulent, help business owners figure out when to send coupons in order to maximize customer response, or facilitate educational interventions by forecasting when a student is on the cusp of dropping out.
To get to these data-driven solutions, though, data scientists must shepherd their raw data through a complex series of steps, each one requiring many human-driven decisions. The last step in the process, deciding on a modeling technique, is particularly crucial. There are hundreds of techniques to choose from — from neural networks to support vector machines — and selecting the best one can mean millions of dollars of additional revenue, or the difference between spotting a flaw in critical medical devices and missing it.
In a paper called "ATM: A distributed, collaborative, scalable system for automated machine learning," which was presented last week at the IEEE International Conference on Big Data, researchers from MIT and Michigan State University present a new system that automates the model selection step, even improving on human performance. The system, called Auto-Tuned Models (ATM), takes advantage of cloud-based computing to perform a high-throughput search over modeling options, and find the best possible modeling technique for a particular problem. It also tunes the model's hyperparameters — a way of optimizing the algorithm — which can have a substantial effect on performance. ATM is now available for enterprise as an open-source platform.
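The idea of searching jointly over modeling techniques and their hyperparameters can be illustrated with a small sketch. This is not ATM's code or its search strategy: the two toy classifiers, the `SEARCH_SPACE` grid, and the synthetic data are all assumptions made for illustration, and the search here is exhaustive and sequential rather than distributed across cloud machines.

```python
import random

def knn_predict(train, x, k):
    """Classify x by majority vote among the k nearest training points."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > len(nearest) else 0

def threshold_predict(train, x, cutoff):
    """Ignore the training data; predict 1 when x exceeds a fixed cutoff."""
    return 1 if x > cutoff else 0

# Hypothetical search space: each model family with its hyperparameter grid.
SEARCH_SPACE = {
    "knn": (knn_predict, [1, 3, 5, 7]),
    "threshold": (threshold_predict, [0.2, 0.4, 0.5, 0.6, 0.8]),
}

def accuracy(predict, param, train, valid):
    """Fraction of validation points the configured model labels correctly."""
    correct = sum(predict(train, x, param) == y for x, y in valid)
    return correct / len(valid)

def select_model(train, valid):
    """Score every (family, hyperparameter) pair and keep the best one."""
    best = None
    for name, (predict, grid) in SEARCH_SPACE.items():
        for param in grid:
            score = accuracy(predict, param, train, valid)
            if best is None or score > best[0]:
                best = (score, name, param)
    return best

def make_data(n, rng):
    """Synthetic 1-D data: label is 1 when x > 0.5, with 10% label noise."""
    data = []
    for _ in range(n):
        x = rng.random()
        y = 1 if x > 0.5 else 0
        if rng.random() < 0.1:
            y = 1 - y
        data.append((x, y))
    return data

rng = random.Random(0)
train, valid = make_data(200, rng), make_data(100, rng)
best_score, best_family, best_param = select_model(train, valid)
print(best_family, best_param, round(best_score, 3))
```

Each candidate is scored on held-out validation data, so the winner is chosen for how well it generalizes rather than how well it memorizes the training set. A production system would replace the nested loops with a smarter, parallel search.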
To compare ATM with human performers, the researchers tested the system against users of a collaborative crowdsourcing platform, open-ml. On this platform, data scientists work together to solve problems, finding the best solution by building on each other's work. ATM analyzed 47 datasets from the platform and was able to deliver a solution better than the one humans had come up with 30 percent of the time. When it couldn’t outperform humans, it came very close, and crucially, it worked much more quickly than humans could. While open-ml users take an average of 100 days to deliver a near-optimal solution, ATM can arrive at an answer in less than a day.

Faster big-data analysis

A new MIT computer system speeds computations involving “sparse tensors,” multidimensional data arrays that consist mostly of zeroes. The system for performing “tensor algebra” offers 100-fold speedups over previous software packages.


We live in the age of big data, but most of that data is “sparse.” Imagine, for instance, a massive table that mapped all of Amazon’s customers against all of its products, with a “1” for each product a given customer bought and a “0” otherwise. The table would be mostly zeroes.
With sparse data, analytic algorithms end up doing a lot of addition and multiplication by zero, which is wasted computation. Programmers get around this by writing custom code to avoid zero entries, but that code is complex, and it generally applies only to a narrow range of problems. At the Association for Computing Machinery’s Conference on Systems, Programming, Languages and Applications: Software for Humanity (SPLASH), researchers from MIT, the French Alternative Energies and Atomic Energy Commission, and Adobe Research recently presented a new system that automatically produces code optimized for sparse data.
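To make the wasted work concrete, here is a minimal sketch of the kind of hand-written code programmers use to skip zero entries: a matrix-vector multiply over the compressed sparse row (CSR) format, a standard sparse-matrix layout. It is not output from the system described here; the tiny `purchases` table and the `prices` vector are made-up values for illustration.

```python
def to_csr(dense):
    """Convert a dense list-of-lists matrix to CSR arrays.

    values  : the nonzero entries, row by row
    cols    : the column index of each nonzero entry
    row_ptr : row i's nonzeros live in values[row_ptr[i]:row_ptr[i + 1]]
    """
    values, cols, row_ptr = [], [], [0]
    for row in dense:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)
                cols.append(j)
        row_ptr.append(len(values))
    return values, cols, row_ptr

def csr_matvec(values, cols, row_ptr, x):
    """Multiply a CSR matrix by a dense vector, touching only nonzeros."""
    y = [0] * (len(row_ptr) - 1)
    for i in range(len(y)):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += values[k] * x[cols[k]]
    return y

def dense_matvec(dense, x):
    """Reference implementation that multiplies every entry, zeros included."""
    return [sum(a * b for a, b in zip(row, x)) for row in dense]

# Hypothetical data: rows are customers, columns are products, 1 = purchased.
purchases = [
    [1, 0, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
    [0, 1, 0, 1],
]
prices = [10, 20, 30, 40]  # made-up price per product

values, cols, row_ptr = to_csr(purchases)
spend = csr_matvec(values, cols, row_ptr, prices)
print(spend)  # total spend per customer: [10, 30, 0, 60]
```

The dense version performs 16 multiplications on this table, while the CSR version performs 4, one per nonzero. The cost is that the loop structure is tied to this one operation and this one storage format, which is the narrowness the article describes.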
That code offers a 100-fold speedup over existing, non-optimized software packages. And its performance is comparable to that of meticulously hand-optimized code for specific sparse-data operations, while requiring far less work on the programmer’s part. The system is called Taco, for tensor algebra compiler. In computer-science parlance, a data structure like the Amazon table is called a “matrix,” and a tensor is just a higher-dimensional analogue of a matrix. If that Amazon table also mapped customers and products against the customers’ product ratings on the Amazon site and the words used in their product reviews, the result would be a four-dimensional tensor.
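A sparse tensor can be sketched the same way a sparse matrix can: store only the nonzero entries, keyed by their coordinates. The example below is an illustration under assumptions, not Taco's representation or generated code. It uses a three-dimensional customer-by-product-by-rating tensor with invented entries, and contracts the rating dimension against a hypothetical vector of star values.

```python
from collections import defaultdict

# Coordinate format: only nonzero entries are stored.
# Key = (customer, product, rating index); value = 1 if that rating was given.
# All entries are invented for illustration.
ratings = {
    (0, 0, 4): 1,  # customer 0 rated product 0 with rating index 4
    (0, 2, 2): 1,  # customer 0 rated product 2 with rating index 2
    (3, 1, 0): 1,  # customer 3 rated product 1 with rating index 0
}

def contract_last_mode(tensor, vec):
    """Compute A[i, j] = sum over k of T[i, j, k] * vec[k], nonzeros only."""
    out = defaultdict(float)
    for (i, j, k), value in tensor.items():
        out[(i, j)] += value * vec[k]
    return dict(out)

stars = [1.0, 2.0, 3.0, 4.0, 5.0]  # star value for each rating index
score = contract_last_mode(ratings, stars)
print(score)  # {(0, 0): 5.0, (0, 2): 3.0, (3, 1): 1.0}
```

The loop visits three stored entries regardless of how many customers and products the full tensor spans, so the work scales with the number of nonzeros rather than with the tensor's nominal size.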
“Sparse representations have been there for more than 60 years,” says Saman Amarasinghe, an MIT professor of electrical engineering and computer science (EECS) and senior author on the new paper. “But nobody knew how to generate code for them automatically. People figured out a few very specific operations — sparse matrix-vector multiply, sparse matrix-vector multiply plus a vector, sparse matrix-matrix multiply, sparse matrix-matrix-matrix multiply. The biggest contribution we make is the ability to generate code for any tensor-algebra expression when the matrices are sparse.”
Joining Amarasinghe on the paper are first author Fredrik Kjolstad, an MIT graduate student in EECS; Stephen Chou, also a graduate student in EECS; David Lugato of the French Alternative Energies and Atomic Energy Commission; and Shoaib Kamil of Adobe Research.

Identifying optimal product prices

New research describes a price-optimization method to increase online retailers’ revenue, market share, and profit.


How can online businesses leverage vast historical data, computational power, and sophisticated machine-learning techniques to quickly analyze and forecast demand, and to optimize pricing and increase revenue?
A research highlight article in the Fall 2017 issue of MIT Sloan Management Review by MIT Professor David Simchi-Levi describes new insights into demand forecasting and price optimization.

Algorithm increases revenue by 10 percent in six months

Simchi-Levi developed a machine-learning algorithm, which won the INFORMS Revenue Management and Pricing Section Practice Award, and first implemented it at online retailer Rue La La.
The initial research goal was to reduce inventory, but what the company ended up with was “a cutting-edge, demand-shaping application that has a tremendous impact on the retailer’s bottom line,” Simchi-Levi says. Rue La La’s big challenge was pricing items that had never been sold before, which required a pricing algorithm that could set higher prices for some first-time items and lower prices for others.
Within six months of implementation, the algorithm increased Rue La La’s revenue by 10 percent. Simchi-Levi's process involves three steps for generating better price predictions. The first step involves matching products with similar characteristics to the products to be optimized. A relationship between demand and price is then predicted with the help of a machine-learning algorithm.
The second step requires testing a price against actual sales, and adjusting the product's pricing curve to match real-life results. In the third and final step, a new curve is applied to help optimize pricing across many products and time periods.
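The three steps can be sketched in miniature. This is a simplified illustration, not Simchi-Levi's algorithm: an ordinary least-squares line stands in for the machine-learning demand model, the search runs over a short list of candidate prices for a single product, and every number below is invented.

```python
def fit_linear_demand(history):
    """Least-squares fit of units = intercept + slope * price.

    A simple stand-in for the machine-learning demand model in the article.
    """
    n = len(history)
    mean_p = sum(p for p, _ in history) / n
    mean_u = sum(u for _, u in history) / n
    cov = sum((p - mean_p) * (u - mean_u) for p, u in history)
    var = sum((p - mean_p) ** 2 for p, _ in history)
    slope = cov / var
    return mean_u - slope * mean_p, slope

def best_price(intercept, slope, candidates):
    """Pick the candidate price with the highest predicted revenue."""
    def revenue(price):
        return price * max(0.0, intercept + slope * price)
    return max(candidates, key=revenue)

# Step 1 (assumed done already): (price, units sold) for comparable products.
comparable_sales = [(20, 80), (30, 60), (40, 40), (50, 20)]
intercept, slope = fit_linear_demand(comparable_sales)

# Step 2: fold in an observed sale of the new item and refit the curve.
observed = (30, 90)  # hypothetical: sold better than the comparables suggest
intercept2, slope2 = fit_linear_demand(comparable_sales + [observed])

# Step 3: choose the revenue-maximizing price from a candidate list.
candidates = [20, 25, 30, 35, 40]
print(best_price(intercept, slope, candidates))  # 30
```

With the comparable data alone the fitted curve is units = 120 - 2 × price, so predicted revenue peaks at a price of 30. A real system would optimize jointly across many products and time periods, as the article notes, rather than one item at a time.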