Stanford DAWN

Exploiting Building Blocks of Data to Efficiently Create Training Sets

The ability of deep learning models to achieve state-of-the-art performance is grounded in the availability of large, labeled training sets. However, gathering ground-truth labels at this scale is expensive and time-consuming. While users can write rules that check for specific words or patterns in text data, developing such heuristics for image or video data is challenging since the raw pixels are difficult to interpret. To address this issue, we present Coral, a paradigm that allows users to write heuristics...
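
To make this concrete, here is a minimal sketch in the spirit of Coral, not its actual interface: a heuristic votes on labels by reading interpretable "building block" primitives (hypothetical `Box` objects from an off-the-shelf object detector) rather than raw pixels.

```python
# A sketch of the kind of heuristic Coral targets: instead of operating on
# raw pixels, the rule reads interpretable primitives extracted upstream.
# `Box` and the label constants are illustrative, not Coral's actual API.
from dataclasses import dataclass

POSITIVE, NEGATIVE, ABSTAIN = 1, -1, 0

@dataclass
class Box:
    label: str   # object class from an off-the-shelf detector
    x: float     # center coordinates, normalized to [0, 1]
    y: float

def bike_near_person(boxes: list[Box]) -> int:
    """Vote POSITIVE if a bicycle box sits close to a person box."""
    bikes = [b for b in boxes if b.label == "bicycle"]
    people = [b for b in boxes if b.label == "person"]
    for bike in bikes:
        for person in people:
            if abs(bike.x - person.x) < 0.1 and abs(bike.y - person.y) < 0.1:
                return POSITIVE
    return ABSTAIN if bikes or people else NEGATIVE
```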

Learning to Compose Domain-Specific Transformations for Data Augmentation

Data augmentation is a popular technique for increasing the size of labeled training sets by applying class-preserving transformations to create copies of labeled data points. In the image domain, it is a crucial factor in almost every state-of-the-art result today. However, the choice of types, parameterizations, and compositions of transformations applied can have a large effect on performance, and is tricky and time-consuming to tune by hand for a new dataset or task. In this blog post we describe our...
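
As a concrete illustration (not the learned pipeline from our work), the sketch below composes a few class-preserving image transformations with randomly drawn parameters using Pillow; the hard part, and the paper's focus, is learning which sequences and parameterizations of such transformations to apply.

```python
# A minimal sketch of composing class-preserving image transformations
# with randomly drawn parameters. Requires Pillow.
import random
from PIL import Image, ImageEnhance

def random_augment(img: Image.Image) -> Image.Image:
    """Apply a random subsequence of simple, label-preserving transformations."""
    ops = [
        lambda im: im.rotate(random.uniform(-15, 15)),
        lambda im: im.transpose(Image.FLIP_LEFT_RIGHT),
        lambda im: ImageEnhance.Brightness(im).enhance(random.uniform(0.8, 1.2)),
        lambda im: ImageEnhance.Contrast(im).enhance(random.uniform(0.8, 1.2)),
    ]
    random.shuffle(ops)
    for op in ops[: random.randint(1, len(ops))]:
        img = op(img)
    return img
```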

There and Back Again: A General Approach to Learning Sparse Models

Sparse models – models where only a small fraction of parameters are non-zero – arise frequently in machine learning. Sparsity is beneficial in several ways: sparse models are more easily interpretable by humans, and sparsity can yield statistical benefits – such as reducing the number of examples that have to be observed to learn the model. In a sense, we can think of sparsity as an antidote to the oft-maligned curse of dimensionality. In a recent paper, we ask: can...
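
As a generic illustration of how sparsity arises (not the approach taken in the paper), soft thresholding, the proximal step behind L1-regularized learning, zeroes out small coefficients and leaves a model with few non-zeros.

```python
# Soft thresholding: the proximal operator of lam * ||w||_1.
# Entries smaller than lam in magnitude become exactly zero.
import numpy as np

def soft_threshold(w: np.ndarray, lam: float) -> np.ndarray:
    """Shrink each coefficient toward zero by lam, clipping at zero."""
    return np.sign(w) * np.maximum(np.abs(w) - lam, 0.0)

w = np.array([0.9, -0.05, 0.02, -1.3, 0.01])
print(soft_threshold(w, 0.1))  # small entries are zeroed out
```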

Accelerated Stochastic Power Iteration

Surprisingly, standard acceleration doesn’t always work for stochastic PCA. We provide a very simple stochastic PCA algorithm, based on adding a momentum term to the power iteration, that achieves the optimal sample complexity and an accelerated iteration complexity in terms of the eigengap. Importantly, it is embarrassingly parallel, allowing accelerated convergence in terms of wall-clock time. Our results hinge on a tight variance analysis of a stochastic two-term matrix recurrence, which implies acceleration for a wider class of non-convex problems....
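
A minimal sketch of the deterministic core of the algorithm, assuming a symmetric matrix A: power iteration with a heavy-ball momentum term, i.e. the two-term recurrence w_{t+1} = A w_t - beta * w_{t-1} with per-step rescaling. In the stochastic setting A is replaced by a mini-batch estimate, and the paper ties the optimal beta to the eigengap; the hand-set beta below is a stand-in for that choice.

```python
# Power iteration with momentum: w_{t+1} = A w_t - beta * w_{t-1}.
# Both iterates are rescaled by the same factor each step, which
# preserves the recurrence while keeping the iterates bounded.
import numpy as np

def momentum_power_iteration(A: np.ndarray, beta: float, iters: int = 100):
    """Estimate the top eigenvector of a symmetric matrix A."""
    n = A.shape[0]
    w_prev = np.zeros(n)
    w = np.random.randn(n)
    for _ in range(iters):
        w_next = A @ w - beta * w_prev
        scale = np.linalg.norm(w_next)
        w_prev, w = w / scale, w_next / scale
    return w
```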

Automatic Time Series Smoothing with ASAP

Dashboard-based visualization is critical in monitoring and diagnosing modern applications and services. However, most time-series dashboards simply plot raw data as it arrives. In a recent paper, we showed it’s possible to increase human accuracy in identifying anomalies in time series visualizations by up to 38% while reducing response time by up to 44% by adopting a simple strategy: smooth your dashboards! Moreover, our ASAP.js library will smooth your plots automatically. As a motivating example, consider the two plots of...
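
The smoothing primitive itself is just a sliding-window moving average, sketched below; ASAP's contribution is choosing the window size automatically (roughly, a window that minimizes roughness while preserving large-scale deviations), for which the fixed `window` argument here is a stand-in.

```python
# A minimal sketch of the smoothing primitive behind ASAP:
# a simple sliding-window mean over the series.
import numpy as np

def moving_average(x: np.ndarray, window: int) -> np.ndarray:
    """Smooth a time series with a sliding-window mean."""
    kernel = np.ones(window) / window
    return np.convolve(x, kernel, mode="valid")
```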

Weak Supervision: The New Programming Paradigm for Machine Learning

Getting labeled training data has become the key development bottleneck in supervised machine learning. We provide a broad, high-level overview of recent weak supervision approaches, where noisier or higher-level supervision is used as a more expedient and flexible way to get a supervision signal, in particular from subject matter experts (SMEs). We provide a simple, broad definition of weak supervision as consisting of one or more noisy conditional distributions over unlabeled data, and focus on the key technical challenge of...
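
As a minimal sketch of the setup, suppose several noisy sources each vote +1, -1, or abstain (0) on unlabeled examples. The simplest way to combine them is a majority vote, which systems like Snorkel refine by learning per-source accuracies instead.

```python
# Combining noisy label sources by majority vote. Each row of `votes`
# is one source; entries are +1 / -1 votes or 0 for abstain.
import numpy as np

def majority_vote(votes: np.ndarray) -> np.ndarray:
    """votes: (num_sources, num_examples) matrix with entries in {-1, 0, +1}."""
    totals = votes.sum(axis=0)
    return np.sign(totals)  # 0 means the sources tied or all abstained

votes = np.array([[1, 1, 0, -1],
                  [1, 0, -1, -1],
                  [0, 1, -1, -1]])
print(majority_vote(votes))  # [ 1  1 -1 -1]
```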

YellowFin: An automatic tuner for momentum SGD

Hand-tuned momentum SGD is competitive with state-of-the-art adaptive methods, like Adam. We introduce YellowFin, an automatic tuner for the hyperparameters of momentum SGD. YellowFin can train models such as large LSTMs and certain ResNets in fewer iterations than the state of the art. It performs even better in asynchronous settings via an on-the-fly momentum adaptation scheme that uses a novel momentum measurement component along with a negative-feedback loop mechanism. Comparing YellowFin to Adam on training a ResNet on CIFAR100 (left)...
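
For reference, the update YellowFin tunes is plain momentum (heavy-ball) SGD, sketched below with a hypothetical stochastic-gradient oracle `grad_fn`; YellowFin's job is to adapt `lr` and `mu` on the fly rather than keep them fixed as this sketch does.

```python
# Momentum (heavy-ball) SGD: v <- mu * v - lr * grad; w <- w + v.
# `grad_fn` is a hypothetical stochastic-gradient oracle.
import numpy as np

def momentum_sgd(w: np.ndarray, grad_fn, lr: float, mu: float, steps: int):
    """Run `steps` iterations of momentum SGD from initial point w."""
    v = np.zeros_like(w)
    for _ in range(steps):
        v = mu * v - lr * grad_fn(w)
        w = w + v
    return w
```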

DAWN featured on the ARCHITECHT Show

Listen to PIs Peter Bailis and Matei Zaharia discuss the DAWN project on the ARCHITECHT Show here (the discussion begins around the 21-minute mark).

NoScope: 1000x Faster Deep Learning Queries over Video

Video data is exploding – the UK alone has over 4 million CCTVs, and users upload over 300 hours of video to YouTube every minute. Recent advances in deep learning enable automated analysis of this growing amount of video data, allowing us to query for objects of interest, detect unusual and abnormal events, and sift through lifetimes of video that no human would ever want to watch. However, these deep learning methods are extremely computationally expensive: state-of-the-art methods for object...
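
A minimal sketch of the cascade idea behind NoScope, with hypothetical `cheap_score` and `expensive_detect` stand-ins for its specialized model and the full reference detector: a fast model answers the easy frames, and only ambiguous frames are escalated to the expensive model.

```python
# Model cascade: a cheap specialized model handles confident cases,
# and only uncertain frames fall through to the expensive detector.
def query_frame(frame, cheap_score, expensive_detect,
                low: float = 0.1, high: float = 0.9) -> bool:
    """Return True if the object of interest appears in the frame."""
    score = cheap_score(frame)      # fast, approximate confidence in [0, 1]
    if score >= high:
        return True                 # confidently present: skip the big model
    if score <= low:
        return False                # confidently absent: skip the big model
    return expensive_detect(frame)  # uncertain: fall back to the full model
```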

HoloClean - Weakly Supervised Data Repairing

Noisy and erroneous data is a major bottleneck in analytics: data cleaning and repairing account for about 60% of the work of data scientists. To address this bottleneck, we recently introduced HoloClean, a semi-automated data repairing framework that relies on statistical learning and inference to repair errors in structured data. In HoloClean, we build upon the paradigm of weak supervision and demonstrate how to leverage diverse...
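
One flavor of signal HoloClean can exploit is an integrity constraint over the table being repaired. The sketch below uses a hypothetical functional dependency zip -> city to flag conflicting cells as candidate errors for a downstream repair model; the attribute names and rule are illustrative, not HoloClean's API.

```python
# Flag rows that violate the functional dependency zip -> city:
# if one zip code maps to more than one city, the involved cells
# are candidate errors for repair.
from collections import defaultdict

def flag_fd_violations(rows: list[dict]) -> list[int]:
    """Return indices of rows whose (zip, city) pair violates zip -> city."""
    city_by_zip = defaultdict(set)
    for row in rows:
        city_by_zip[row["zip"]].add(row["city"])
    return [i for i, row in enumerate(rows)
            if len(city_by_zip[row["zip"]]) > 1]

rows = [{"zip": "94305", "city": "Stanford"},
        {"zip": "94305", "city": "Palo Alto"},   # conflicts with row 0
        {"zip": "60614", "city": "Chicago"}]
print(flag_fd_violations(rows))  # [0, 1]
```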