I. Technical Background and Scenario Value

Huawei Core Speech Kit provides HarmonyOS developers with offline text-to-speech (TTS) capabilities, supporting Chinese, numeric, and mixed Chinese-English text synthesis. Its core scenarios focus on accessibility services, such as providing screen reading for visually impaired users in offline environments, addressing the pain point that traditional TTS relies on the internet. Key features include:
● Offline operation: No network connection required, ensuring privacy and stability
● Multi-strategy broadcasting: Supports word/number pronunciation, silent insertion, and Chinese character tone customization
● Low-latency response: Meets real-time interaction needs

II. Core Function Implementation

1. Engine Initialization

Create a TTS engine instance via createEngine and configure language, voice (e.g., "Ling Xiaoshan"), and operation mode (online/offline):

// Engine parameter configuration example  
let initParams: textToSpeech.CreateEngineParams = {  
  language: 'zh-CN',  
  person: 0,  // Default voice  
  online: 1,   // Online mode  
  extraParams: { style: 'interaction-broadcast', locate: 'CN' }  
};  
textToSpeech.createEngine(initParams, (err, engine) => {  
  // Process the engine instance  
});

2. Text Broadcasting Control

Trigger broadcasting via the speak method, supporting dynamic parameter adjustment:

let speakParams: textToSpeech.SpeakParams = {  
  requestId: 'unique_id',  // Unique identifier  
  extraParams: {  
    speed: 1.2,     // Speaking rate  
    volume: 15,     // Volume  
    pitch: 0.8      // Pitch  
  }  
};  
ttsEngine.speak("文本内容[h1][n2]", speakParams);

3. Broadcasting Strategy Configuration (Syntax Markup)

● Word broadcasting strategy: [hN]
"Hello[h1] World" // h1=letter-by-letter pronunciation, h2=word pronunciation
● Number processing strategy: [nN]
"[n2]12345" // n2=numeric reading (e.g., "一万两千三百四十五")
● Silent insertion: [pN] (unit: ms)
"开始[p500]播报" // Insert 500ms silence
● Chinese character pronunciation calibration: [=Pinyin+Tone]
"着[=zhuo2]手" // Force pronunciation as "zhuó"

III. Development Practice Key Points

1. Lifecycle Management

● Status monitoring: Handle broadcasting events via SpeakListener callback

speakListener: textToSpeech.SpeakListener = {  
  onStart: (id, res) => { /* Broadcasting starts */ },  
  onData: (id, audio) => { /* Receive audio stream */ },  
  onError: (id, code, msg) => { /* Error handling */ }  
};

● Resource release:

ttsEngine.stop();      // Stop current broadcasting  
ttsEngine.shutdown();  // Destroy the engine instance

2. Advanced Function Integration

● Audio stream processing: Obtain PCM data via onData callback for custom playback
● Multilingual support: Query available voices via listVoices

ttsEngine.listVoices({ online: 1 }, (err, voices) => {  
  console.log("Supported voices:", voices);  
});

IV. Constraints and Optimization Suggestions

● Device limitations: Only supports real devices, not emulators
● Text length: Single input ≤ 10,000 characters
● Performance optimization: Reuse engine instances to avoid frequent creation/destruction
● Error handling: Listen to onError and handle error codes (e.g., network exceptions, engine busy states)

V. Application Scenario Expansion

● Accessibility services: System-level screen reading
● Education: E-book reading, language learning pronunciation assistance
● IoT devices: Smart home voice feedback

Conclusion

Huawei Core Speech Kit provides an efficient accessibility solution for the HarmonyOS ecosystem through offline TTS capabilities and flexible broadcasting strategies. Developers can quickly integrate using the code paradigms provided in this article, customize voice interaction solutions based on business needs, and help build more inclusive smart terminal experiences.

Technical Analysis and Development Practice of HarmonyOS AI - Core Speech Kit Offline Text-to-Speech Technology