onnxruntime_v2 1.23.2+1
onnxruntime_v2: ^1.23.2+1
Flutter plugin for OnnxRuntime that provides an easy, flexible, and fast Dart API to integrate ONNX models in Flutter apps across mobile and desktop platforms.

This is a fork of the original onnxruntime Flutter plugin, which appears to be no longer maintained. The fork adds support for the 16 KB memory page size and for full GPU and hardware acceleration.
OnnxRuntime Plugin #
Overview #
A Flutter plugin for OnnxRuntime, built on dart:ffi, that provides an easy, flexible, and fast Dart API to integrate ONNX models in Flutter apps across mobile and desktop platforms.
| Platform | Android | iOS | Linux | macOS | Windows |
|---|---|---|---|---|---|
| Compatibility | API level 21+ | * | * | * | * |
| Architecture | arm32/arm64 | * | * | * | * |
Key Features #
- Multi-platform support for Android, iOS, Linux, macOS, Windows, and Web (coming soon).
- Flexibility to use any ONNX model.
- Acceleration using multi-threading.
- Structure similar to the OnnxRuntime Java and C# APIs.
- Inference is no slower than native Android/iOS apps built with the Java/Objective-C APIs.
- Runs inference in separate isolates to prevent jank on the UI thread.
Getting Started #
In your Flutter project, add the dependency:
dependencies:
  ...
  onnxruntime_v2: x.y.z
Usage example #
Import #
import 'package:onnxruntime_v2/onnxruntime_v2.dart';
Initializing environment #
OrtEnv.instance.init();
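To see the call in context, here is a minimal sketch of initializing the environment once at app startup, before any session is created. MyApp is a placeholder for your root widget, not part of this package.
import 'package:flutter/material.dart';
import 'package:onnxruntime_v2/onnxruntime_v2.dart';

void main() {
  // Ensure bindings exist before touching plugins, then set up the ORT environment once.
  WidgetsFlutterBinding.ensureInitialized();
  OrtEnv.instance.init();
  runApp(const MyApp()); // MyApp is a placeholder for your root widget
}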
Creating the Session #
final sessionOptions = OrtSessionOptions();
// NEW: Automatically use GPU acceleration if available!
// This will try GPU providers first, then fall back to CPU
sessionOptions.appendDefaultProviders();
const assetFileName = 'assets/models/test.onnx';
final rawAssetFile = await rootBundle.load(assetFileName);
final bytes = rawAssetFile.buffer.asUint8List();
final session = OrtSession.fromBuffer(bytes, sessionOptions);
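The same steps can be wrapped in a small helper. This is a sketch using only the calls shown above; the function name loadSessionFromAsset is illustrative, and rootBundle comes from package:flutter/services.dart.
import 'package:flutter/services.dart' show rootBundle;
import 'package:onnxruntime_v2/onnxruntime_v2.dart';

// Illustrative helper: load an ONNX model from the asset bundle and create a
// session that prefers GPU providers and falls back to CPU.
Future<OrtSession> loadSessionFromAsset(String assetFileName) async {
  final sessionOptions = OrtSessionOptions()..appendDefaultProviders();
  final rawAssetFile = await rootBundle.load(assetFileName);
  final bytes = rawAssetFile.buffer.asUint8List();
  return OrtSession.fromBuffer(bytes, sessionOptions);
}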
Performing inference #
// Example input: a Float32List (from dart:typed_data) with 1 * 2 * 3 = 6 values.
final data = Float32List.fromList([1.0, 2.0, 3.0, 4.0, 5.0, 6.0]);
final shape = [1, 2, 3];
final inputOrt = OrtValueTensor.createTensorWithDataList(data, shape);
final inputs = {'input': inputOrt};
final runOptions = OrtRunOptions();
final outputs = await session.runAsync(runOptions, inputs);
inputOrt.release();
runOptions.release();
outputs?.forEach((element) {
  element?.release();
});
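If you need the prediction itself, read the tensor data before releasing the outputs (rather than releasing them immediately as above). A minimal sketch, assuming OrtValueTensor exposes the value getter from the upstream onnxruntime plugin; the concrete Dart type depends on your model's output.
final outputTensor = outputs?.first as OrtValueTensor?;
// Assumption: `value` returns the tensor data as a (possibly nested) Dart list.
final result = outputTensor?.value;
print('Model output: $result');
outputTensor?.release();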
Releasing environment #
OrtEnv.instance.release();
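In a typical Flutter app, the session and environment are released together when the owning State is disposed. A sketch, assuming _session is a field holding the session created earlier and that OrtSession exposes a release() method like the other ORT objects above.
@override
void dispose() {
  _session?.release();       // assumed field holding the OrtSession
  OrtEnv.instance.release(); // tear down the ORT environment last
  super.dispose();
}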
GPU Acceleration #
This fork includes full support for GPU and hardware acceleration across multiple platforms!
Supported Execution Providers #
| Provider | Platform | Hardware | Speedup |
|---|---|---|---|
| CUDA | Windows/Linux | NVIDIA GPU | 5-10x |
| TensorRT | Windows/Linux | NVIDIA GPU | 10-20x |
| DirectML | Windows | AMD/Intel/NVIDIA GPU | 3-8x |
| ROCm | Linux | AMD GPU | 5-10x |
| CoreML | iOS/macOS | Apple Neural Engine | 5-15x |
| NNAPI | Android | NPU/GPU/DSP | 3-7x |
| OpenVINO | Windows/Linux | Intel GPU/VPU | 3-6x |
| DNNL | All | Intel CPU | 2-4x |
| XNNPACK | All | CPU optimizations | 1.5-3x |
Quick Start: Automatic GPU Selection #
The easiest way to enable GPU acceleration:
final sessionOptions = OrtSessionOptions();
sessionOptions.appendDefaultProviders(); // That's it!
This automatically selects the best available provider in this order:
- GPU: CUDA → DirectML → ROCm
- NPU: CoreML → NNAPI → QNN
- Optimized CPU: DNNL → XNNPACK
- Fallback: Standard CPU
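To see what the automatic selection has to choose from on a given device or build, you can log the providers the runtime reports as available (the same availableProviders() call used in Troubleshooting below):
final sessionOptions = OrtSessionOptions()..appendDefaultProviders();
// Log what the runtime can actually offer on this device/build.
for (final provider in OrtEnv.instance.availableProviders()) {
  print('Available execution provider: $provider');
}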
Manual Provider Selection #
For fine-grained control:
// NVIDIA GPU (Windows/Linux)
sessionOptions.appendCudaProvider(CUDAFlags.useArena);
// NVIDIA with TensorRT optimizations + FP16
sessionOptions.appendTensorRTProvider({'trt_fp16_enable': '1'});
// DirectML for Windows (any GPU)
sessionOptions.appendDirectMLProvider();
// Apple Neural Engine (iOS/macOS)
sessionOptions.appendCoreMLProvider(CoreMLFlags.useNone);
// Android acceleration
sessionOptions.appendNnapiProvider(NnapiFlags.useNone);
// AMD GPU on Linux
sessionOptions.appendRocmProvider(ROCmFlags.useArena);
// Intel optimization
sessionOptions.appendDNNLProvider(DNNLFlags.useArena);
// Always add CPU as fallback
sessionOptions.appendCPUProvider(CPUFlags.useArena);
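For example, these calls can be combined with platform checks and a try/catch so each OS gets its preferred accelerator while CPU remains the fallback. A sketch using only the providers listed above; the helper name buildSessionOptions is illustrative.
import 'dart:io' show Platform;

OrtSessionOptions buildSessionOptions() {
  final options = OrtSessionOptions();
  try {
    if (Platform.isWindows || Platform.isLinux) {
      options.appendCudaProvider(CUDAFlags.useArena);    // NVIDIA GPU
    } else if (Platform.isIOS || Platform.isMacOS) {
      options.appendCoreMLProvider(CoreMLFlags.useNone); // Apple Neural Engine
    } else if (Platform.isAndroid) {
      options.appendNnapiProvider(NnapiFlags.useNone);   // Android NNAPI
    }
  } catch (e) {
    // Provider not available on this machine; the CPU fallback below still applies.
    print('Hardware provider unavailable: $e');
  }
  options.appendCPUProvider(CPUFlags.useArena); // always keep CPU as fallback
  return options;
}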
Performance Tips #
- Use appendDefaultProviders() first - it handles everything automatically
- CUDA vs TensorRT: TensorRT is faster but takes longer to initialize
- DirectML: Great for cross-vendor support on Windows
- Mobile: CoreML (iOS) and NNAPI (Android) provide massive speedups
- Thread count: Set setIntraOpNumThreads() to your CPU core count for CPU inference (see the sketch below)
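For the thread-count tip, a minimal sketch using Platform.numberOfProcessors from dart:io:
import 'dart:io' show Platform;

final sessionOptions = OrtSessionOptions()
  ..setIntraOpNumThreads(Platform.numberOfProcessors) // match the CPU core count
  ..appendCPUProvider(CPUFlags.useArena);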
GPU Setup Requirements #
Windows (NVIDIA):
- Install CUDA Toolkit
- Optional: TensorRT for extra speed
Linux (NVIDIA):
- Install the CUDA runtime: apt install nvidia-cuda-toolkit
- Optional: TensorRT
Linux (AMD):
- Install ROCm
Windows (Any GPU):
- DirectML works out-of-the-box on Windows 10+
iOS/macOS:
- CoreML works automatically (no setup needed)
Android:
- NNAPI works automatically on Android 8.1+ (no setup needed)
Troubleshooting #
If GPU acceleration isn't working:
- Check available providers:
OrtEnv.instance.availableProviders().forEach((provider) {
  print('Available: $provider');
});
- Catch provider errors gracefully:
try {
  sessionOptions.appendCudaProvider(CUDAFlags.useArena);
} catch (e) {
  print('CUDA not available, falling back to CPU');
  sessionOptions.appendCPUProvider(CPUFlags.useArena);
}
- Verify the GPU runtime is installed (CUDA, DirectML, etc.)
- Check that you're using the GPU-enabled ONNX Runtime library