Google Speech-to-Text API Experiment
Purpose: transcribe an audio waveform to text.
Figure 1: waveform with the transcribed text and segment times annotated above it.
File structure choices: choice 1, choice 2, choice 3, choice 4
Sentence segmentation: [pauses between words] vs [paragraph gaps]
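One way to act on the [pauses between words] idea is to request word-level timestamps (e.g. with enable_word_time_offsets in RecognitionConfig) and start a new segment whenever the silence between consecutive words exceeds a threshold. The sketch below is an assumption about how that could be wired up, not part of the original notes; the 0.5 s threshold and the (word, start, end) tuple shape are placeholders.

```python
# Minimal sketch: split recognized words into sentences by pause length.
# Assumes each word comes with (text, start_seconds, end_seconds), e.g. obtained
# by setting enable_word_time_offsets=True. The 0.5 s gap is an arbitrary placeholder.
from typing import List, Tuple

Word = Tuple[str, float, float]  # (text, start time in seconds, end time in seconds)

def split_by_pause(words: List[Word], max_gap: float = 0.5) -> List[List[Word]]:
    segments: List[List[Word]] = []
    current: List[Word] = []
    prev_end = None
    for word in words:
        _, start, end = word
        # Start a new segment when the silence since the previous word is long enough.
        if prev_end is not None and start - prev_end > max_gap:
            segments.append(current)
            current = []
        current.append(word)
        prev_end = end
    if current:
        segments.append(current)
    return segments

# Example: "四 三 二" spoken quickly, then a long pause before "一".
words = [("四", 0.0, 0.3), ("三", 0.4, 0.7), ("二", 0.8, 1.1), ("一", 2.0, 2.3)]
print(split_by_pause(words))  # -> two segments: [四, 三, 二] and [一]
```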
Selecting models
| Type | Enum constant | Description | Supported languages |
|---|---|---|---|
| Video | video | Best for transcribing audio from video clips. For best results, the audio should be recorded at a sampling rate of 16,000 Hz or higher. | en-US only |
| Phone call | phone_call | Best for transcribing audio from phone calls. Phone audio is typically recorded at an 8,000 Hz sampling rate. | en-US only |
| Command and search | command_and_search | Best for transcribing shorter audio clips. | All available languages |
| Default | default | Use this model if your audio does not fit one of the previously described models. Ideally, the audio is high-fidelity and recorded at a sampling rate of 16,000 Hz or higher. | All available languages |
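As a concrete illustration of picking one of these models, the snippet below sets the model field on the request's RecognitionConfig using the Python client library (google-cloud-speech). This is a minimal sketch: the bucket URI, sampling rate, and language code are placeholders, not values from this experiment.

```python
# Sketch: choosing a recognition model via RecognitionConfig.model.
# Assumes `pip install google-cloud-speech` and application default credentials;
# the URI, sample rate, and language code below are placeholders.
from google.cloud import speech

client = speech.SpeechClient()

config = speech.RecognitionConfig(
    encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
    sample_rate_hertz=16000,   # video/default models prefer >= 16,000 Hz
    language_code="zh-TW",     # note: video and phone_call are en-US only per the table
    model="default",           # one of: video, phone_call, command_and_search, default
)

audio = speech.RecognitionAudio(uri="gs://your-bucket/countdown.wav")
response = client.recognize(config=config, audio=audio)

for result in response.results:
    print(result.alternatives[0].transcript)
```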
Phrase hints
A speechContext can be passed in the RecognitionConfig to provide information that aids in processing the given audio. A speechContext holds a list of phrases that act as "hints" to the recognizer; these phrases can boost the probability that such words or phrases will be recognized.
- Improve the accuracy for specific words and phrases that may tend to be overrepresented in your audio data. For example, if specific "commands" are typically spoken by the user, you can provide these as phrase hints. Such additional phrases may be particularly useful if the supplied audio contains noise or the speech is not very clear.
- Add additional words to the vocabulary of the recognition task. Speech-to-Text includes a very large vocabulary. However, if proper names or domain-specific words are out-of-vocabulary, you can add them to the phrases provided to your request's speechContext.
Realization
"speechContexts": {
"phrases":["四","三","二","一"]
}
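In the JSON request body this fragment sits inside the request's config object. With the Python client library the same hint can be passed as speech_contexts; a minimal sketch, reusing placeholder audio settings rather than the experiment's actual values:

```python
# Sketch: the Python-client equivalent of the "speechContexts" JSON above.
# Only the phrase hints matter here; the other fields are placeholders.
from google.cloud import speech

config = speech.RecognitionConfig(
    encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
    sample_rate_hertz=16000,
    language_code="zh-TW",
    speech_contexts=[speech.SpeechContext(phrases=["四", "三", "二", "一"])],
)
```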
original [output]