
A Flutter package for seamless integration with OpenAI's Whisper API for speech-to-text conversion with audio recording capabilities.

Flutter Whisper API #

A comprehensive Flutter package for seamless integration with OpenAI's Whisper API, providing speech-to-text conversion capabilities with built-in audio recording functionality.

✨ Features #

  • 🎤 Built-in Audio Recording: Record audio directly from the device microphone
  • 🔊 Real-time Amplitude Monitoring: Visual feedback during recording
  • 🌐 OpenAI Whisper API Integration: High-quality speech-to-text transcription
  • 📱 Cross-Platform Support: Works on iOS, Android, web, and desktop
  • 🎛️ Configurable Audio Quality: Multiple quality presets for different use cases
  • 🔒 Permission Handling: Automatic microphone permission management
  • 📝 Multiple Output Formats: JSON, text, SRT, VTT, and verbose JSON
  • ⚡ Error Handling: Comprehensive exception handling with detailed error messages
  • 🎯 Easy Integration: Simple API with minimal setup required

🚀 Getting Started #

Installation #

Add this to your package's pubspec.yaml file:

dependencies:
  flutter_whisper_api: ^1.0.0

Then run:

flutter pub get

Setup #

  1. Get OpenAI API Key:

    • Visit the OpenAI Platform
    • Create a new API key
    • Keep your API key secure and never commit it to version control
  2. Configure Permissions:

    Android (android/app/src/main/AndroidManifest.xml):

    <uses-permission android:name="android.permission.RECORD_AUDIO" />
    <uses-permission android:name="android.permission.INTERNET" />
    

    iOS (ios/Runner/Info.plist):

    <key>NSMicrophoneUsageDescription</key>
    <string>This app needs microphone access to record audio for transcription</string>
    

📱 Usage #

Basic Usage #

import 'dart:io';

import 'package:flutter_whisper_api/flutter_whisper_api.dart';

// Initialize the client with your API key
final client = WhisperClient(apiKey: 'your-openai-api-key');

// Initialize the recorder
final recorder = WhisperRecorder();

// Start recording
String recordingPath = await recorder.startRecording();

// Stop recording and get the file
File? audioFile = await recorder.stopRecording();

if (audioFile != null) {
  // Create transcription request
  final request = WhisperRequest(
    audioFile: audioFile,
    language: 'en', // Optional: specify language
  );

  // Transcribe the audio
  final response = await client.transcribe(request);
  
  print('Transcribed text: ${response.text}');
}

// Clean up
recorder.dispose();
client.dispose();

Advanced Usage #

// Configure audio quality
await recorder.startRecording(
  fileName: 'my_recording',
  quality: WhisperAudioQuality.high,
);

// Monitor recording amplitude
while (recorder.isRecording) {
  final amplitude = await recorder.getAmplitude();
  print('Current amplitude: $amplitude');
  await Future.delayed(Duration(milliseconds: 100));
}

// Advanced transcription options
final request = WhisperRequest(
  audioFile: audioFile,
  model: 'whisper-1',
  language: 'es', // Spanish
  temperature: 0.3,
  responseFormat: WhisperResponseFormat.verboseJson,
  prompt: 'This is a medical consultation...',
);

final response = await client.transcribe(request);

// Access detailed response data
print('Language detected: ${response.language}');
print('Duration: ${response.duration} seconds');

// Access segments (if available)
if (response.segments != null) {
  for (final segment in response.segments!) {
    print('${segment.start}s - ${segment.end}s: ${segment.text}');
  }
}

Error Handling #

try {
  final response = await client.transcribe(request);
  print('Success: ${response.text}');
} on WhisperAuthenticationException catch (e) {
  print('Authentication failed: ${e.message}');
} on WhisperNetworkException catch (e) {
  print('Network error: ${e.message}');
} on WhisperAudioException catch (e) {
  print('Audio file error: ${e.message}');
} on WhisperRecordingException catch (e) {
  print('Recording error: ${e.message}');
} on WhisperException catch (e) {
  print('General Whisper error: ${e.message}');
}

πŸŽ›οΈ Configuration Options #

Audio Quality Presets #

Quality                     Bit Rate  Sample Rate  Use Case
WhisperAudioQuality.low     64 kbps   16 kHz       Voice notes, long recordings
WhisperAudioQuality.medium  128 kbps  44.1 kHz     General purpose (default)
WhisperAudioQuality.high    256 kbps  44.1 kHz     High-quality audio, music

Response Formats #

  • WhisperResponseFormat.json - Simple JSON with text only
  • WhisperResponseFormat.text - Plain text response
  • WhisperResponseFormat.srt - SRT subtitle format
  • WhisperResponseFormat.vtt - WebVTT subtitle format
  • WhisperResponseFormat.verboseJson - Detailed JSON with timestamps
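For subtitle workflows, a minimal sketch of requesting SRT output. Note the assumption here: that for non-JSON formats the raw payload is surfaced through `response.text`; verify against the API reference for your version.

```dart
import 'dart:io';

import 'package:flutter_whisper_api/flutter_whisper_api.dart';

Future<void> saveSubtitles(WhisperClient client, File audioFile) async {
  // Request SRT output instead of the default JSON.
  final request = WhisperRequest(
    audioFile: audioFile,
    responseFormat: WhisperResponseFormat.srt,
  );

  final response = await client.transcribe(request);

  // Assumption: for srt/vtt/text formats the raw subtitle string
  // comes back in `response.text`.
  await File('subtitles.srt').writeAsString(response.text);
}
```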

Supported Languages #

The Whisper API supports 99+ languages. Some common ones:

  • en - English
  • es - Spanish
  • fr - French
  • de - German
  • it - Italian
  • pt - Portuguese
  • ru - Russian
  • ja - Japanese
  • ko - Korean
  • zh - Chinese

For automatic language detection, omit the language parameter.
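A short sketch of automatic detection: leave out `language` and read the detected language back from a verbose JSON response (the `response.language` field shown in Advanced Usage above).

```dart
// No `language` parameter: Whisper detects the language automatically.
final request = WhisperRequest(
  audioFile: audioFile,
  responseFormat: WhisperResponseFormat.verboseJson,
);

final response = await client.transcribe(request);
print('Detected language: ${response.language}');
```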

📋 API Reference #

WhisperClient #

WhisperClient({
  required String apiKey,
  String baseUrl = 'https://api.openai.com/v1',
  http.Client? httpClient,
})

WhisperRecorder #

// Start recording
Future<String> startRecording({
  String? fileName,
  WhisperAudioQuality quality = WhisperAudioQuality.medium,
})

// Stop recording
Future<File?> stopRecording()

// Cancel recording
Future<void> cancelRecording()

// Pause/Resume (platform dependent)
Future<void> pauseRecording()
Future<void> resumeRecording()

// Monitor amplitude
Future<double?> getAmplitude()

// Permission handling
Future<bool> requestPermission()
Future<bool> hasPermission()
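The permission helpers above can gate recording so you never call startRecording without microphone access. A sketch using only the documented API (the wrapper function name is our own):

```dart
// Ensure microphone access before starting a recording.
Future<void> startIfAllowed(WhisperRecorder recorder) async {
  final allowed =
      await recorder.hasPermission() || await recorder.requestPermission();
  if (!allowed) {
    print('Microphone permission denied');
    return;
  }
  await recorder.startRecording();
}
```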

🔧 Example App #

Check out the complete example app in the /example folder that demonstrates:

  • API key configuration
  • Recording with visual feedback
  • Real-time amplitude monitoring
  • Transcription display
  • Error handling

To run the example:

cd example
flutter pub get
flutter run

🚨 Important Notes #

API Costs #

  • OpenAI charges for Whisper API usage
  • Current pricing: $0.006 per minute of audio
  • Monitor your usage on the OpenAI Platform

File Limitations #

  • Maximum file size: 25 MB
  • Supported formats: mp3, mp4, mpeg, mpga, m4a, wav, webm
  • For longer audio, consider chunking
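Given the 25 MB cap, it is worth rejecting oversized files before making a (billable) API call. A guard sketch; the constant and function names are our own:

```dart
import 'dart:io';

import 'package:flutter_whisper_api/flutter_whisper_api.dart';

const maxWhisperBytes = 25 * 1024 * 1024; // 25 MB API limit

Future<void> transcribeIfWithinLimit(
    WhisperClient client, File audioFile) async {
  // Fail fast locally instead of waiting for an API-side error.
  if (await audioFile.length() > maxWhisperBytes) {
    print('File exceeds the 25 MB limit; split it into chunks first.');
    return;
  }
  final response =
      await client.transcribe(WhisperRequest(audioFile: audioFile));
  print(response.text);
}
```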

Security #

  • Never hardcode API keys in your app
  • Use environment variables or secure storage
  • Consider server-side API calls for production apps
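One common way to keep the key out of source control is to inject it at build time with `--dart-define` and read it via `String.fromEnvironment`; the `OPENAI_API_KEY` name below is a convention of our choosing:

```dart
// Build or run with:
//   flutter run --dart-define=OPENAI_API_KEY=sk-...
const apiKey = String.fromEnvironment('OPENAI_API_KEY');

final client = WhisperClient(apiKey: apiKey);
```

Note that `--dart-define` values can still be extracted from a compiled binary, which is why a server-side proxy remains the safer option for production.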

🤝 Contributing #

Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.

📄 License #

This project is licensed under the MIT License - see the LICENSE file for details.

🆘 Support #

If you encounter any issues or have questions:

  1. Check the example app for implementation details
  2. Review the API documentation
  3. Open an issue on GitHub

πŸ™ Acknowledgments #

  • OpenAI for the Whisper API
  • Flutter team for the amazing framework
  • Contributors and users of this package