Setup Google Cloud Speech-to-Text types.RecognitionConfig

048     samp_freq, _ = wavfile.read(speech_file)
049     if beta:
050         from google.cloud import speech_v1p1beta1 as speech
051         from google.cloud.speech_v1p1beta1 import enums
052         from google.cloud.speech_v1p1beta1 import types
053 
054         config = types.RecognitionConfig(
055             encoding=enums.RecognitionConfig.AudioEncoding.LINEAR16,
056             sample_rate_hertz=samp_freq,
057             language_code=language_code,
058             #alternative_language_codes=['en-US'],
059             alternative_language_codes=alternative_language,
060             enable_word_time_offsets=True,
061             speech_contexts=[types.SpeechContext(
062                 phrases=['四', '三', '二', '一'],
063                 )],
064             enable_word_confidence=True
065             )
066 
067     else:
068         from google.cloud import speech
069         from google.cloud.speech import enums
070         from google.cloud.speech import types
071 
072         config = types.RecognitionConfig(
073             encoding=enums.RecognitionConfig.AudioEncoding.LINEAR16,
074             sample_rate_hertz=samp_freq,
075             language_code=language_code,
076             enable_word_time_offsets=True,
077             speech_contexts=[types.SpeechContext(
078                 phrases=['四', '三', '二', '一'],
079                 )],
080             #enable_word_confidence=True
081             )

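Line 48 calls scipy.io.wavfile.read to obtain the sampling rate of the audio file. Line 49 checks the boolean variable beta to decide whether the Google Cloud Speech-to-Text API is used in its beta version (lines 50-65) or its stable version (lines 68-81).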
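As a rough illustration of the sampling-rate lookup on line 48, the sketch below reads the rate from a WAV header with Python's standard-library wave module instead of scipy.io.wavfile.read. This is a swapped-in technique, not the code from the listing, and the helper names read_sample_rate and write_silence are hypothetical, introduced only for this example.

```python
import os
import tempfile
import wave


def read_sample_rate(path):
    """Return the sampling rate (Hz) stored in a WAV file's header."""
    with wave.open(path, "rb") as wav:
        return wav.getframerate()


def write_silence(path, sample_rate=16000, seconds=0.1):
    """Write a short mono 16-bit PCM WAV file of silence (demo input only)."""
    n_frames = int(sample_rate * seconds)
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)           # mono
        wav.setsampwidth(2)           # 2 bytes per sample = 16-bit PCM
        wav.setframerate(sample_rate)
        wav.writeframes(b"\x00\x00" * n_frames)


if __name__ == "__main__":
    # Hypothetical demo file; a real script would pass its own speech_file.
    demo_path = os.path.join(tempfile.mkdtemp(), "demo.wav")
    write_silence(demo_path, sample_rate=16000)
    samp_freq = read_sample_rate(demo_path)
    print(samp_freq)
```

Reading only the header avoids loading the sample data, which is enough when the rate is the only value needed for sample_rate_hertz.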

The beta version uses the google.cloud.speech_v1p1beta1 module (lines 50-52).

The stable version uses the google.cloud.speech module (lines 68-70).

The config settings are largely the same in the beta and stable versions. The settings they share are:

encoding (enums.RecognitionConfig.AudioEncoding.LINEAR16): 16-bit PCM encoding.
sample_rate_hertz (samp_freq): the same as the sampling rate of the audio (16 kHz).
language_code (language_code): cmn-Hant-TW, Traditional Chinese.
enable_word_time_offsets (True): whether to return the time offsets of each word (yes).
speech_contexts ('四', '三', '二', '一'): key phrases whose recognition should be boosted.

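The beta version differs from the stable version by adding:

alternative_language_codes (alternative_language): supports audio in which different languages are mixed.
enable_word_confidence (True): returns a confidence score (0 to 1) for each word.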
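To make the shared and beta-only settings easier to compare, here is a minimal sketch that collects them as a plain dictionary of keyword arguments. It does not import the Google client library. The helper build_config_kwargs is hypothetical, the string "LINEAR16" stands in for the enums value used in the listing, the phrase list is kept as a plain dict where the listing uses types.SpeechContext, and 'en-US' is taken from the commented-out line 58.

```python
def build_config_kwargs(encoding, sample_rate_hertz, language_code, phrases,
                        beta=False, alternative_language_codes=None):
    """Collect RecognitionConfig-style settings as a plain dict (illustration only)."""
    # Settings shared by the beta and stable versions.
    kwargs = {
        "encoding": encoding,
        "sample_rate_hertz": sample_rate_hertz,
        "language_code": language_code,
        "enable_word_time_offsets": True,
        # Placeholder for the types.SpeechContext object used in the listing.
        "speech_contexts": [{"phrases": list(phrases)}],
    }
    if beta:
        # Settings that appear only in the beta branch of the listing.
        kwargs["alternative_language_codes"] = list(alternative_language_codes or [])
        kwargs["enable_word_confidence"] = True
    return kwargs


if __name__ == "__main__":
    phrases = ["四", "三", "二", "一"]
    stable = build_config_kwargs("LINEAR16", 16000, "cmn-Hant-TW", phrases)
    beta = build_config_kwargs("LINEAR16", 16000, "cmn-Hant-TW", phrases,
                               beta=True, alternative_language_codes=["en-US"])
    # Show which keys exist only in the beta configuration.
    print(sorted(set(beta) - set(stable)))
```

Building the shared settings once and adding the beta-only keys afterwards keeps the two branches from drifting apart, which is a design choice of this sketch rather than something the listing does.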
