TensorFlow Speech Recognition Challenge
MM 0409/2018
Can you build an algorithm that understands simple speech commands?
This is a Featured Prediction Competition.
The competition is hosted by Google Brain, and the total prize pool is $25,000.
1,315 teams joined this competition.
Overview
Description
We might be on the verge of too many screens. A promising antidote to our screen addiction is the voice interface. But for independent makers and entrepreneurs, it is hard to build a speech detector using free, open data and code. Many voice recognition datasets require preprocessing before a neural network model can be built on them.
TensorFlow recently released the Speech Commands Dataset. It includes 65,000 one-second-long utterances of 30 short words, spoken by thousands of different people.
In this competition, you're challenged to use the Speech Commands Dataset to build an algorithm that understands spoken commands. Improving the recognition accuracy of open-source voice interface tools can make products more effective and more accessible.
Evaluation
Submissions are evaluated on Multiclass Accuracy, which is the fraction of observations with the correct label.
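As a rough illustration of the metric, here is a minimal sketch of multiclass accuracy in Python. The function name and the four example labels are made up for illustration; this is not the competition's own scoring code.

```python
from typing import Sequence


def multiclass_accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Return the fraction of clips whose predicted label equals the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Hypothetical labels for four clips: three of the four predictions match.
truth = ["yes", "no", "silence", "unknown"]
preds = ["yes", "no", "silence", "left"]
print(multiclass_accuracy(truth, preds))  # 0.75
```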
There are 12 possible labels for the test set: yes, no, up, down, left, right, on, off, stop, go, silence, unknown.
The unknown label should be used for a command that is neither one of the first 10 labels nor silence.
For each audio clip in the test set, you must predict the correct label.
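The labeling rule above can be sketched as a small mapping function. This is only an illustration under assumptions: the names TARGET_WORDS and to_test_label are hypothetical, and "bed" is used here simply as an example of a word outside the 10 target commands.

```python
# The 10 command words named in the evaluation rules.
TARGET_WORDS = ["yes", "no", "up", "down", "left", "right", "on", "off", "stop", "go"]
SILENCE = "silence"
UNKNOWN = "unknown"

# All 12 labels that may appear in a submission.
TEST_LABELS = TARGET_WORDS + [SILENCE, UNKNOWN]


def to_test_label(word: str) -> str:
    """Map a spoken word (or 'silence') to one of the 12 test labels."""
    word = word.strip().lower()
    if word in TARGET_WORDS or word == SILENCE:
        return word
    # Anything that is neither a target word nor silence becomes 'unknown'.
    return UNKNOWN


print(to_test_label("Yes"))      # yes
print(to_test_label("silence"))  # silence
print(to_test_label("bed"))      # unknown
```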
The submission file should contain a header and have the following format:
fname,label (fname refers to the file name)
clip_000044442.wav,silence
clip_000adecb.wav,left
clip_0000d4322.wav,unknown
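One possible way to produce a file in this format is with Python's standard csv module. The sketch below is an assumption-laden example, not an official tool: the helper names format_submission and write_submission are hypothetical, and the label check simply mirrors the 12 labels listed under Evaluation.

```python
import csv
import io
from typing import Iterable, Tuple

# The 12 labels allowed in the test set, as listed in the evaluation rules.
VALID_LABELS = {
    "yes", "no", "up", "down", "left", "right",
    "on", "off", "stop", "go", "silence", "unknown",
}


def format_submission(rows: Iterable[Tuple[str, str]]) -> str:
    """Render (fname, label) pairs as CSV text with an 'fname,label' header."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["fname", "label"])
    for fname, label in rows:
        if label not in VALID_LABELS:
            raise ValueError(f"unexpected label {label!r} for {fname}")
        writer.writerow([fname, label])
    return buf.getvalue()


def write_submission(rows: Iterable[Tuple[str, str]], path: str) -> None:
    """Write the submission CSV to the given path."""
    with open(path, "w", newline="") as f:
        f.write(format_submission(rows))


# The first sample row from the format description above.
print(format_submission([("clip_000044442.wav", "silence")]))
```

Calling write_submission(rows, "submission.csv") would write the same text to disk; the output file name here is only a placeholder.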
Prize
The leaderboard prizes:
1st place - $8,000
2nd place - $6,000
3rd place - $3,000
Special TensorFlow Prize - $8,000
The goal of the special prize is to encourage contestants to create a model that is useful in practice for recognizing commands on a Raspberry Pi 3. To qualify, a model must meet several criteria:
1. The model must be runnable as frozen TensorFlow GraphDef files with no additional dependencies beyond TensorFlow 1.4.
2. The model must be small in size (below 5 MB).
3. The model must have a standard set of inputs and outputs.
4. The model must run in less than 200ms on a stock Raspberry Pi 3 running Raspbian GNU/Linux 8 (Jessie), with no overclocking.
5. The model must come with code to train the model, which must be license-compatible with TensorFlow (Apache), and be submittable through Google's CLA to the TensorFlow project.
...
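The size and latency criteria lend themselves to a quick local sanity check before submitting. The sketch below is a rough pre-check only, under stated assumptions: the size limit is read as 5,000,000 bytes, run_once stands for any hypothetical callable that performs one inference, and timings taken on a development machine say nothing definitive about a Raspberry Pi 3.

```python
import os
import statistics
import time
from typing import Callable

MAX_MODEL_BYTES = 5_000_000  # assumption: the 5 MB limit read as 5,000,000 bytes
MAX_LATENCY_SECONDS = 0.200  # the 200 ms limit from the criteria


def is_within_size_limit(num_bytes: int, limit: int = MAX_MODEL_BYTES) -> bool:
    """True if a model of num_bytes is strictly below the size limit."""
    return num_bytes < limit


def model_file_is_small_enough(path: str) -> bool:
    """Check the size of a frozen model file on disk."""
    return is_within_size_limit(os.path.getsize(path))


def median_latency_seconds(run_once: Callable[[], object], repeats: int = 20) -> float:
    """Time repeated calls to run_once and return the median duration in seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_once()  # hypothetical: one forward pass on a one-second clip
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)
```

For example, median_latency_seconds(lambda: my_model(clip)) < MAX_LATENCY_SECONDS would compare a measured median against the limit, where my_model and clip are placeholders for your own inference function and input.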
Timeline
January 9, 2018
- Entry deadline. You must accept the competition rules before this date in order to compete.
- January 9, 2018
- Team Merger deadline. This is the last day participants may join or merge teams.
- January 16, 2018
- Final submission deadline.
Tutorial & More Info
Google Research Blog Post announcing the Speech Commands Dataset. Note that much of what is provided as part of the training set is already public. However, the test set is not.
TensorFlow Audio Recognition Tutorial
Link to purchase Raspberry Pi 3 on Amazon. This will be at your own expense.
Also review the Prizes tab for details and tools for how the special prize will be evaluated.
[0]
https://www.kaggle.com/c/tensorflow-speech-recognition-challenge
antidote: a remedy that counteracts a poison
utterances: spoken words; instances of speaking