
flutter_ort_genai #

ONNX Runtime GenAI for Flutter - High-performance LLM inference with streaming token generation.


Features #

  • πŸš€ High-performance LLM inference using ONNX Runtime GenAI
  • πŸ“‘ Streaming token generation with cancellation support
  • 🎯 Clean API with Generation object pattern
  • πŸ”§ Dynamic C API resolution (no ORT duplication)
  • πŸ“± Cross-platform: Android, iOS, macOS
  • πŸ›‘οΈ Production-ready with thread safety and proper error handling

Installation #

Add to your pubspec.yaml:

dependencies:
  flutter_onnxruntime: ^x.y.z  # Required peer dependency; use the latest release
  flutter_ort_genai: ^0.1.0

Usage #

Basic Example #

import 'package:flutter_ort_genai/flutter_ort_genai.dart';

// Load a GenAI model
final model = await OrtGenAIModel.load(
  'path/to/genai/model',
  options: GenAIOptions(deviceType: 'cpu'),
);

// Start generation
final generation = model.start(
  'What is Flutter?',
  temperature: 0.7,
  maxTokens: 256,
);

// Stream tokens
await for (final token in generation.stream) {
  print(token);
}

// Or cancel generation early instead of consuming the full stream
await generation.cancel();

// Dispose when done
await model.dispose();

Advanced Usage #

// With all parameters
final generation = model.start(
  prompt,
  temperature: 0.8,
  maxTokens: 512,
  topP: 0.9,
  topK: 40,
  repetitionPenalty: 1.1,
  stopSequences: ['</end>', '\n\n'],
);

// Collect all tokens
final response = await generation.collectAll();

// With timeout
final timedResponse = await generation.collectWithTimeout(
  Duration(seconds: 30),
);

// Get model metadata (if available)
final metadata = await model.getMetadata();
print('Model: ${metadata?.modelName}');
print('Vocab size: ${metadata?.vocabSize}');

Model Requirements #

GenAI models require these files in the model directory:

  • genai_config.json - GenAI configuration
  • tokenizer.json or tokenizer.model - Tokenizer data
  • model.onnx - Model weights (or quantized variants)
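
For reference, an abbreviated genai_config.json looks roughly like the sketch below. This file is normally generated by the onnxruntime-genai model builder rather than written by hand; the values shown are illustrative (loosely modeled on a Phi-3-class model) and the exact fields vary by model:

```json
{
  "model": {
    "type": "phi3",
    "vocab_size": 32064,
    "context_length": 4096,
    "decoder": {
      "filename": "model.onnx",
      "num_attention_heads": 32,
      "num_hidden_layers": 32
    }
  },
  "search": {
    "max_length": 2048,
    "do_sample": false,
    "temperature": 1.0,
    "top_k": 50,
    "top_p": 1.0
  }
}
```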

See BUILD_INSTRUCTIONS.md for model conversion details.
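
As a quick sanity check, the file list above can be verified from the shell before bundling a model with the app. This is a minimal sketch; `MODEL_DIR` is a placeholder path, not a path the plugin defines:

```shell
# Sanity-check a GenAI model directory before bundling it with the app.
# MODEL_DIR is a placeholder; point it at your converted model.
MODEL_DIR="./model"
missing=0
for f in genai_config.json model.onnx; do
  [ -f "$MODEL_DIR/$f" ] || { echo "missing: $f"; missing=1; }
done
# Either tokenizer.json or tokenizer.model must be present.
if [ ! -f "$MODEL_DIR/tokenizer.json" ] && [ ! -f "$MODEL_DIR/tokenizer.model" ]; then
  echo "missing: tokenizer.json or tokenizer.model"
  missing=1
fi
if [ "$missing" -eq 0 ]; then echo "model directory looks complete"; fi
```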

Building from Source #

This plugin requires building ONNX Runtime GenAI from source. See BUILD_INSTRUCTIONS.md for detailed steps.

Quick Start #

  1. Clone and build onnxruntime-genai:

git clone https://github.com/microsoft/onnxruntime-genai.git
cd onnxruntime-genai
python build.py --android --android_abi arm64-v8a

  2. Copy the built libraries into the plugin (see BUILD_INSTRUCTIONS.md).

  3. Build and run:

flutter run

Platform Support #

| Platform | Minimum Version | Architectures     |
|----------|-----------------|-------------------|
| Android  | API 24 (7.0)    | arm64-v8a, x86_64 |
| iOS      | 11.0            | arm64             |
| macOS    | 10.14           | arm64, x86_64     |

Architecture #

This plugin is a companion to flutter_onnxruntime:

  • flutter_onnxruntime: Generic ONNX inference (Whisper, TTS, etc.)
  • flutter_ort_genai: LLM-specific with streaming generation

Both plugins link against the same ONNX Runtime 1.22.0 native library, avoiding binary duplication.

Contributing #

Contributions are welcome! Please read our Contributing Guide first.

License #

MIT License - see LICENSE file.
