Ending Rolling Submissions for DAWNBench
Building on our experience with DAWNBench, we helped create MLPerf as an industry standard for measuring machine learning system performance. Now that both the MLPerf Training and Inference benchmark suites have successfully launched, we have decided to end rolling submissions to DAWNBench on March 27, 2020 in order to consolidate benchmarking efforts. Until then, we will continue to accept new submissions via pull requests to dawn-bench-entries.
Since the end of the first round of DAWNBench, we have continued to see impressive results from the community. ImageNet training time dropped by another 11x, from 30 minutes to under 3 minutes, and ImageNet inference latency dropped by 20x. Similarly, CIFAR-10 training time is down to 10 seconds. These improvements are thanks to a growing community, with new submissions from engineers at Alibaba, Baidu, Huawei, myrtle.ai, Apple, and many more!
At the same time, we have been actively involved with MLPerf, expanding the end-to-end deep learning performance benchmarking methodology we introduced with DAWNBench into a more comprehensive suite of tasks and scenarios. The MLPerf Training benchmark suite has gone through two rounds of submissions (v0.5 and v0.6), and the MLPerf Inference benchmark suite has finished its first round, which garnered 595 inference benchmark results from 14 organizations. With the demonstrated success of both suites and more than 60 supporting organizations, we are passing the torch to MLPerf to continue providing fair and useful benchmarks for measuring training and inference performance.