Flutter Google Speech-to-Text Plugin

A Flutter plugin for real-time speech-to-text using Google Cloud Speech-to-Text API with native gRPC streaming. This plugin supports both Android (Kotlin) and iOS (Swift) platforms with native audio recording and continuous transcription capabilities.

✨ Key Features

🚀 Native gRPC Streaming: Direct protobuf-based communication with Google Cloud Speech-to-Text
📱 Cross-platform Support: Native Android (Kotlin) & iOS (Swift) implementations
🎤 Continuous Audio Processing: Real-time bidirectional streaming
⚡ Low Latency: Direct gRPC streaming with no intermediate relay layer, so interim results arrive while the user is still speaking
🧠 Enhanced AI Models: Uses Google's latest_long model with improved accuracy
🔒 Secure Authentication: Google Cloud API key or access token authentication
📝 Automatic Punctuation: Enhanced readability with smart punctuation
🎯 Production Ready: Clean, optimized codebase for production deployment

🆕 v2.0.0 - gRPC Architecture

This version introduces a major architectural improvement with native gRPC streaming replacing the previous WebSocket implementation:

Direct Protocol Communication: Native protobuf messages for optimal performance
Bidirectional Streaming: Real-time audio streaming with immediate results
Improved Efficiency: Eliminated intermediate WebSocket layer
Better Error Handling: Enhanced connection management and error reporting
Production Optimized: Removed debug logging and unnecessary dependencies

Installation

Add this plugin to your pubspec.yaml:

dependencies:
  flutter_google_stt: ^2.0.0
  # Or use the latest version from pub.dev

Run flutter pub get to install the package.

Quick Start

Get a Google Cloud API key or access token (see the Google Cloud Setup section below)
Add the plugin to your pubspec.yaml
Initialize the plugin with your access token
Start listening for speech input

import 'package:flutter_google_stt/flutter_google_stt.dart';

// Initialize
await FlutterGoogleStt.initialize(
  accessToken: 'your-google-cloud-access-token',
  languageCode: 'en-US',
);

// Start listening
await FlutterGoogleStt.startListening((transcript, isFinal) {
  print('Transcript: $transcript (Final: $isFinal)');
});

// Stop listening
await FlutterGoogleStt.stopListening();

Platform Setup

Android

The plugin requests microphone permission at runtime, but the permissions must still be declared in the manifest. Ensure your app's android/app/src/main/AndroidManifest.xml includes:

<uses-permission android:name="android.permission.RECORD_AUDIO" />
<uses-permission android:name="android.permission.INTERNET" />

iOS

Add microphone usage description to your ios/Runner/Info.plist:

<key>NSMicrophoneUsageDescription</key>
<string>This app needs microphone access for speech recognition</string>

Google Cloud Setup

Create a Google Cloud Project: Go to Google Cloud Console
Enable the Speech-to-Text API:
- Navigate to "APIs & Services" > "Library"
- Search for "Cloud Speech-to-Text API"
- Click on it and press "Enable"
Create API Credentials:
- Go to "APIs & Services" > "Credentials"
- Click "Create Credentials" > "API Key"
- Copy the generated API key
Secure your API Key (recommended):
- Click on the API key you created
- Under "API restrictions", select "Restrict key"
- Choose "Cloud Speech-to-Text API" from the list
- Save the changes

Using API Key vs Access Token

The plugin accepts a Google Cloud API key through the accessToken parameter, which is the simpler option during development:

// Using API Key (recommended for development)
await FlutterGoogleStt.initialize(
  accessToken: 'your-google-cloud-api-key-here',
  languageCode: 'en-US',
);
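
Hard-coding the key in source makes it easy to leak through version control. As a minimal sketch, the key could instead be supplied at build time with --dart-define and read through String.fromEnvironment. The variable name GOOGLE_STT_API_KEY and the helper requireApiKey are assumptions made for this example; they are not part of the plugin.

```dart
// Hypothetical build-time variable (not defined by the plugin). Supply it with:
//   flutter run --dart-define=GOOGLE_STT_API_KEY=your-key
const String sttApiKey = String.fromEnvironment('GOOGLE_STT_API_KEY');

/// Returns [key] unchanged, or throws if it was never provided.
/// Failing early here gives a clearer message than a rejected API call.
String requireApiKey(String key) {
  if (key.trim().isEmpty) {
    throw StateError(
      'GOOGLE_STT_API_KEY is not set; pass it with --dart-define.',
    );
  }
  return key;
}
```

It would then be passed as accessToken: requireApiKey(sttApiKey). Values supplied this way are still compiled into the app, so restricting the key to the Cloud Speech-to-Text API remains worthwhile.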

For production applications, consider service account authentication with short-lived access tokens, which expire and must be refreshed:

# Generate access token using gcloud CLI
gcloud auth application-default print-access-token
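
Because access tokens expire, a long-running app needs a fresh one before it initializes the plugin again. The sketch below caches a token and refetches it shortly before expiry. Everything in it is an assumption for illustration: TimedToken, TokenCache, the 60-second margin, and fetchToken, which stands in for a call to your own backend that holds the service account credentials.

```dart
/// A token together with the moment it stops being valid.
class TimedToken {
  final String value;
  final DateTime expiresAt;
  const TimedToken(this.value, this.expiresAt);
}

/// Reuses a token until it is within [refreshMargin] of expiry,
/// then asks [fetchToken] for a new one.
class TokenCache {
  TokenCache(
    this.fetchToken, {
    this.refreshMargin = const Duration(seconds: 60),
    DateTime Function()? clock,
  }) : _clock = clock ?? DateTime.now;

  /// Hypothetical: typically an HTTPS call to your own server.
  final Future<TimedToken> Function() fetchToken;
  final Duration refreshMargin;
  final DateTime Function() _clock; // injectable so the cache can be tested
  TimedToken? _cached;

  Future<String> get() async {
    final cached = _cached;
    if (cached != null &&
        _clock().isBefore(cached.expiresAt.subtract(refreshMargin))) {
      return cached.value; // still comfortably valid
    }
    final fresh = await fetchToken();
    _cached = fresh;
    return fresh.value;
  }
}
```

One way to use it is to call initialize with accessToken: await cache.get() before each listening session, so an expired token is never sent.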

Usage

Basic Usage

import 'package:flutter/material.dart';
import 'package:flutter_google_stt/flutter_google_stt.dart';

class SpeechExample extends StatefulWidget {
  @override
  _SpeechExampleState createState() => _SpeechExampleState();
}

class _SpeechExampleState extends State<SpeechExample> {
  String _transcript = '';
  bool _isListening = false;

  @override
  void initState() {
    super.initState();
    _initializeSpeech();
  }

  Future<void> _initializeSpeech() async {
    // Replace with your actual Google Cloud API key
    const String apiKey = 'YOUR_GOOGLE_CLOUD_API_KEY_HERE';
    
    try {
      final success = await FlutterGoogleStt.initialize(
        accessToken: apiKey,  // API key is passed as accessToken parameter
        languageCode: 'en-US',
        sampleRateHertz: 16000,
      );
      
      if (success) {
        print('Speech recognition initialized successfully');
      } else {
        print('Failed to initialize speech recognition');
      }
    } catch (e) {
      print('Initialization error: $e');
    }
  }

  Future<void> _startListening() async {
    // Check permissions first
    bool hasPermission = await FlutterGoogleStt.hasMicrophonePermission();
    if (!hasPermission) {
      hasPermission = await FlutterGoogleStt.requestMicrophonePermission();
      if (!hasPermission) {
        print('Microphone permission denied');
        return;
      }
    }

    // Start listening
    try {
      final success = await FlutterGoogleStt.startListening((transcript, isFinal) {
        setState(() {
          _transcript = transcript;
        });
        
        if (isFinal) {
          print('Final transcript: $transcript');
        } else {
          print('Interim transcript: $transcript');
        }
      });
      
      if (success) {
        setState(() {
          _isListening = true;
        });
      }
    } catch (e) {
      print('Error starting listening: $e');
    }
  }

  Future<void> _stopListening() async {
    try {
      final success = await FlutterGoogleStt.stopListening();
      if (success) {
        setState(() {
          _isListening = false;
        });
      }
    } catch (e) {
      print('Error stopping listening: $e');
    }
  }

  @override
  Widget build(BuildContext context) {
    return Scaffold(
      appBar: AppBar(title: Text('Speech Recognition')),
      body: Column(
        children: [
          Expanded(
            child: Container(
              padding: EdgeInsets.all(16),
              child: Text(
                _transcript.isEmpty ? 'No speech detected' : _transcript,
                style: TextStyle(fontSize: 18),
              ),
            ),
          ),
          ElevatedButton(
            onPressed: _isListening ? _stopListening : _startListening,
            child: Text(_isListening ? 'Stop' : 'Start'),
          ),
        ],
      ),
    );
  }
}

Advanced Configuration

// Initialize with custom settings
await FlutterGoogleStt.initialize(
  accessToken: 'your-api-key-here',
  languageCode: 'es-ES',  // Spanish
  sampleRateHertz: 16000, // Standard audio quality
);

// Check if currently listening
bool isListening = await FlutterGoogleStt.isListening();

// Check microphone permission
bool hasPermission = await FlutterGoogleStt.hasMicrophonePermission();

// Request permission if needed
if (!hasPermission) {
  bool granted = await FlutterGoogleStt.requestMicrophonePermission();
}

// Handle transcription results with detailed logging
await FlutterGoogleStt.startListening((transcript, isFinal) {
  if (isFinal) {
    // This is the final result for this speech segment
    print('Final: $transcript');
    // Save or process the final transcript
  } else {
    // This is an interim result (may change as user continues speaking)
    print('Interim: $transcript');
    // Update UI with interim results for better UX
  }
});

API Reference

Methods

`initialize({required String accessToken, String languageCode, int sampleRateHertz})`

Initialize the speech recognition service.

accessToken (required): Google Cloud API key or access token
languageCode (optional): Language for recognition (default: 'en-US')
sampleRateHertz (optional): Audio sample rate (default: 16000)

Returns: Future<bool> - true if initialization successful

`startListening(TranscriptionCallback onTranscript)`

Start listening for speech input.

onTranscript: Callback function that receives transcript and isFinal status

Returns: Future<bool> - true if started successfully

`stopListening()`

Stop listening for speech input.

Returns: Future<bool> - true if stopped successfully

`isListening()`

Check if currently listening.

Returns: Future<bool> - true if listening

`hasMicrophonePermission()`

Check if microphone permission is granted.

Returns: Future<bool> - true if permission granted

`requestMicrophonePermission()`

Request microphone permission from user.

Returns: Future<bool> - true if permission granted

Types

`TranscriptionCallback`

typedef TranscriptionCallback = void Function(String transcript, bool isFinal);

Callback function for receiving transcription results:

transcript: The transcribed text
isFinal: Whether this is the final result (true) or interim (false)
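
Since interim results can change while final results are fixed, an app usually wants to keep the two apart. The following sketch shows one way to do that. TranscriptBuffer is a hypothetical helper, and it assumes each final transcript covers a single speech segment rather than the whole session; if the plugin delivers cumulative text, the join step would duplicate content.

```dart
/// Collects final segments and tracks the latest interim text separately.
class TranscriptBuffer {
  final List<String> _finalSegments = [];
  String _interim = '';

  /// Has the same shape as TranscriptionCallback, so `buffer.add`
  /// can be passed directly to startListening.
  void add(String transcript, bool isFinal) {
    final text = transcript.trim();
    if (isFinal) {
      if (text.isNotEmpty) _finalSegments.add(text);
      _interim = ''; // the interim guess is superseded by the final result
    } else {
      _interim = text;
    }
  }

  /// Committed text only; safe to save or send onward.
  String get finalText => _finalSegments.join(' ');

  /// Committed text followed by the current interim guess, for display.
  String get displayText =>
      [..._finalSegments, if (_interim.isNotEmpty) _interim].join(' ');
}
```

In a widget, the callback could call buffer.add(transcript, isFinal) inside setState and render buffer.displayText.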

Supported Languages

The plugin supports all languages supported by Google Cloud Speech-to-Text API. Common language codes:

English (US): en-US
English (UK): en-GB
Spanish: es-ES
French: fr-FR
German: de-DE
Italian: it-IT
Portuguese: pt-BR
Japanese: ja-JP
Korean: ko-KR
Chinese (Mandarin): zh-CN

For a complete list, see Google Cloud Speech-to-Text Language Support.
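
When the language should follow the device locale, the locale has to be mapped to a code the app is prepared to use. The sketch below does this against the codes listed above only; resolveLanguageCode and knownLanguageCodes are hypothetical names, and the fallback to a same-language code (for example es-MX to es-ES) is a simplification, since Google Cloud supports many more regional variants than this list.

```dart
/// Language codes listed in this README; the full supported set is larger.
const List<String> knownLanguageCodes = [
  'en-US', 'en-GB', 'es-ES', 'fr-FR', 'de-DE',
  'it-IT', 'pt-BR', 'ja-JP', 'ko-KR', 'zh-CN',
];

/// Maps a device locale such as 'en_GB' or 'es-MX' to one of
/// [knownLanguageCodes], or returns [fallback] if nothing matches.
String resolveLanguageCode(String deviceLocale, {String fallback = 'en-US'}) {
  final normalized = deviceLocale.replaceAll('_', '-').toLowerCase();

  // 1. Exact match, ignoring case and separator style.
  for (final code in knownLanguageCodes) {
    if (code.toLowerCase() == normalized) return code;
  }

  // 2. Same language with a different region.
  final language = normalized.split('-').first;
  for (final code in knownLanguageCodes) {
    if (code.split('-').first.toLowerCase() == language) return code;
  }

  return fallback;
}
```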

Error Handling

The plugin provides detailed error messages for common issues:

try {
  final success = await FlutterGoogleStt.initialize(
    accessToken: 'your-api-key-here',
  );
  if (!success) {
    print('Failed to initialize speech recognition');
  }
} catch (e) {
  print('Initialization error: $e');
  // Handle specific error cases
  if (e.toString().contains('INVALID_TOKEN')) {
    print('Invalid API key provided');
  } else if (e.toString().contains('NETWORK_ERROR')) {
    print('Network connection issue');
  }
}

try {
  final success = await FlutterGoogleStt.startListening((transcript, isFinal) {
    // Handle transcription
  });
  if (!success) {
    print('Failed to start listening');
  }
} catch (e) {
  print('Start listening error: $e');
  if (e.toString().contains('PERMISSION_DENIED')) {
    print('Microphone permission not granted');
  }
}

Common error scenarios:

INVALID_TOKEN: Invalid or expired API key/access token
PERMISSION_DENIED: Missing microphone permissions
NETWORK_ERROR: Network connectivity issues
INITIALIZATION_ERROR: Failed to initialize the plugin
START_ERROR: Failed to start audio recording
STOP_ERROR: Failed to stop audio recording
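
The codes above can be turned into user-facing messages in one place rather than in every catch block. This is a sketch only: describeSttError and the message wording are hypothetical, and it relies on the same substring matching used in the examples above, because the exact exception type thrown by the plugin is not specified here.

```dart
/// Error codes listed in this README, mapped to user-facing messages.
const Map<String, String> sttErrorMessages = {
  'INVALID_TOKEN': 'The API key or access token is invalid or has expired.',
  'PERMISSION_DENIED': 'Microphone permission has not been granted.',
  'NETWORK_ERROR': 'Could not reach the speech service. Check the connection.',
  'INITIALIZATION_ERROR': 'Speech recognition could not be initialized.',
  'START_ERROR': 'Audio recording could not be started.',
  'STOP_ERROR': 'Audio recording could not be stopped.',
};

/// Returns a message for the first known code found in [error]'s text,
/// or a generic message that includes the original error.
String describeSttError(Object error) {
  final text = error.toString();
  for (final entry in sttErrorMessages.entries) {
    if (text.contains(entry.key)) return entry.value;
  }
  return 'Unexpected speech recognition error: $text';
}
```

A catch block can then be reduced to print(describeSttError(e)) or a call that shows the message in the UI.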

Limitations

Internet Connection: Requires active internet connection for Google Cloud API
Audio Processing: Audio is streamed to Google Cloud in chunks, so continuous listening consumes bandwidth
Authentication: API keys/access tokens need proper management and renewal
Platform Differences: Implementation varies slightly between Android and iOS
Audio Quality: Best results with clear audio and minimal background noise
Language Support: Limited to languages supported by Google Cloud Speech-to-Text API
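
Because the plugin depends on a live connection, transient failures are worth retrying. The helper below is a generic sketch, not part of the plugin: retryWithBackoff, its defaults, and the doubling delay are all assumptions. It treats both a false result and a thrown exception as a failed attempt.

```dart
/// Runs [action] up to [maxAttempts] times, doubling the wait after each
/// failure. Returns true as soon as one attempt succeeds.
Future<bool> retryWithBackoff(
  Future<bool> Function() action, {
  int maxAttempts = 3,
  Duration initialDelay = const Duration(milliseconds: 500),
}) async {
  var delay = initialDelay;
  for (var attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      if (await action()) return true;
    } catch (_) {
      // A thrown error counts as a failed attempt; fall through to retry.
    }
    if (attempt < maxAttempts) {
      await Future<void>.delayed(delay);
      delay *= 2; // 500 ms, 1 s, 2 s, ...
    }
  }
  return false;
}
```

It could wrap a call such as () => FlutterGoogleStt.initialize(accessToken: key). Retrying only helps with connectivity problems; an invalid key or a denied permission will fail the same way every time, so those cases are better checked before retrying.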

Plugin Architecture

Package Name: com.guptan404.flutter_google_stt
Android Implementation: Kotlin with native gRPC streaming
iOS Implementation: Swift with native gRPC streaming
Audio Format: 16-bit PCM, configurable sample rate
API Integration: Google Cloud Speech-to-Text streaming API over gRPC
Permission Handling: Native platform permission requests
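
The audio format determines how much data is uploaded while listening. As a rough worked example, the sketch below computes the raw PCM data rate; it assumes mono audio (the channel count is not stated here) and ignores protocol overhead, and the function names are hypothetical.

```dart
/// Bytes of raw 16-bit PCM produced per second of audio.
int pcmBytesPerSecond(int sampleRateHertz, {int channels = 1}) {
  const bytesPerSample = 2; // 16 bits per sample
  return sampleRateHertz * bytesPerSample * channels;
}

/// Approximate raw audio size in megabytes (10^6 bytes) for [duration].
double pcmMegabytes(
  int sampleRateHertz,
  Duration duration, {
  int channels = 1,
}) {
  final bytes =
      pcmBytesPerSecond(sampleRateHertz, channels: channels) *
      duration.inSeconds;
  return bytes / 1e6;
}
```

At the default 16000 Hz this gives 32000 bytes per second, or about 1.92 MB of raw audio per minute of listening.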

Contributing

Contributions are welcome! Please feel free to submit a Pull Request at https://github.com/guptan404/flutter_google_stt.

Development Setup

Clone the repository
Run flutter pub get in the root directory
Run cd example && flutter pub get to set up the example app
Test on Android: flutter build apk --debug
Test on iOS: flutter build ios --debug

Testing

Run the test suite:

flutter test
flutter analyze

Changelog

See CHANGELOG.md for a detailed list of changes.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Support

For issues and feature requests, please use the GitHub Issues page.

About

Developed by @guptan404. This plugin provides a clean, production-ready interface for integrating Google Cloud Speech-to-Text API into Flutter applications.