Llamafu #

A Flutter package for running language models on device with support for completion, instruct mode, tool calling, streaming, constrained generation, LoRA, and multi-modal inputs (images, audio).

Features #

  • 🚀 Run language models directly on device (Android and iOS)
  • 💬 Support for text completion
  • 🤖 Instruct mode for chat-like interactions
  • 🛠️ Tool calling capabilities
  • 🌊 Streaming output
  • 🔒 Constrained generation (GBNF grammars)
  • 🧬 LoRA adapter support
  • 🖼️🎧 Multi-modal support (images, audio)

Prerequisites #

  • Flutter 3.0 or higher
  • Android SDK/NDK for Android development
  • Xcode for iOS development
  • Pre-built llama.cpp libraries

Installation #

Add llamafu as a dependency in your pubspec.yaml file:

dependencies:
  llamafu: ^0.0.1

Then run:

flutter pub get

Usage #

Text Completion #

import 'package:llamafu/llamafu.dart';

// Initialize the model
final llamafu = await Llamafu.init(
  modelPath: '/path/to/your/model.gguf',
  threads: 4,
  contextSize: 512,
);

// Generate text
final result = await llamafu.complete(
  prompt: 'The quick brown fox',
  maxTokens: 128,
  temperature: 0.8,
);

print(result);

// Clean up resources
llamafu.close();
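
Instruct mode is listed under Features but does not have a dedicated example yet. As a stopgap, a chat-style prompt can be assembled by hand and passed to complete(). The ChatML-style tags below are only an illustration, not a llamafu convention; they must match the chat template of the model you load.

// A minimal chat-style wrapper over complete(); the ChatML-style tags are an
// assumption and must match whatever template your model was trained with.
String buildChatPrompt(List<Map<String, String>> messages) {
  final buffer = StringBuffer();
  for (final message in messages) {
    buffer.writeln('<|im_start|>${message['role']}');
    buffer.writeln(message['content']);
    buffer.writeln('<|im_end|>');
  }
  buffer.write('<|im_start|>assistant\n');
  return buffer.toString();
}

// Reuses the llamafu instance from the example above.
final reply = await llamafu.complete(
  prompt: buildChatPrompt([
    {'role': 'system', 'content': 'You are a helpful assistant.'},
    {'role': 'user', 'content': 'Explain LoRA in one sentence.'},
  ]),
  maxTokens: 128,
  temperature: 0.8,
);

print(reply);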

Multi-modal Inference #

import 'package:llamafu/llamafu.dart';

// Initialize the model with multi-modal projector
final llamafu = await Llamafu.init(
  modelPath: '/path/to/your/model.gguf',
  mmprojPath: '/path/to/your/mmproj.gguf', // Multi-modal projector file
  threads: 4,
  contextSize: 512,
  useGpu: false, // Set to true to use GPU for multi-modal processing
);

// Generate text with image input
final mediaInputs = [
  MediaInput(
    type: MediaType.image,
    data: '/path/to/your/image.jpg', // Path to image file
  ),
];

final result = await llamafu.multimodalComplete(
  prompt: 'Describe this image: <image>',
  mediaInputs: mediaInputs,
  maxTokens: 128,
  temperature: 0.8,
);

print(result);

// Clean up resources
llamafu.close();
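
Audio inputs are expected to follow the same pattern as images. The sketch below assumes a MediaType.audio variant and an <audio> placeholder analogous to <image>; both are assumptions here, so check the API reference for the exact enum value and prompt marker your model expects.

// Hypothetical audio example: MediaType.audio and the <audio> placeholder are
// assumptions that mirror the image flow above; consult the API reference.
final audioInputs = [
  MediaInput(
    type: MediaType.audio,
    data: '/path/to/your/clip.wav', // Path to audio file
  ),
];

final audioResult = await llamafu.multimodalComplete(
  prompt: 'Summarize this recording: <audio>',
  mediaInputs: audioInputs,
  maxTokens: 128,
  temperature: 0.8,
);

print(audioResult);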

LoRA Adapter Support #

import 'package:llamafu/llamafu.dart';

// Initialize the model
final llamafu = await Llamafu.init(
  modelPath: '/path/to/your/model.gguf',
  threads: 4,
  contextSize: 512,
);

// Load a LoRA adapter
final loraAdapter = await llamafu.loadLoraAdapter('/path/to/your/lora.gguf');

// Apply the LoRA adapter with a scale factor
await llamafu.applyLoraAdapter(loraAdapter, scale: 0.5);

// Generate text with the LoRA adapter applied
final result = await llamafu.complete(
  prompt: 'Write a story about space exploration',
  maxTokens: 128,
  temperature: 0.8,
);

print(result);

// Remove the LoRA adapter
await llamafu.removeLoraAdapter(loraAdapter);

// Or clear all LoRA adapters
await llamafu.clearAllLoraAdapters();

// Clean up resources
llamafu.close();

Constrained Generation #

import 'package:llamafu/llamafu.dart';

// Initialize the model
final llamafu = await Llamafu.init(
  modelPath: '/path/to/your/model.gguf',
  threads: 4,
  contextSize: 512,
);

// Define a JSON grammar (a raw string, so the GBNF escapes are passed through verbatim)
final jsonGrammar = r'''
root   ::= object
value  ::= object | array | string | number | ("true" | "false" | "null") ws

object ::=
  "{" ws (
            string ":" ws value
    ("," ws string ":" ws value)*
  )? "}" ws

array  ::=
  "[" ws (
            value
    ("," ws value)*
  )? "]" ws

string ::=
  "\"" (
    [^"\\\x7F\x00-\x1F] |
    "\\" (["\\bfnrt] | "u" [0-9a-fA-F]{4}) # escapes
  )* "\"" ws

number ::= ("-"? ([0-9] | [1-9] [0-9]{0,15})) ("." [0-9]+)? ([eE] [-+]? [0-9] [1-9]{0,15})? ws

# Optional space: by convention, applied in this grammar after literal chars when allowed
ws ::= | " " | "\n" [ \t]{0,20}
''';

// Generate text constrained to JSON format
final result = await llamafu.completeWithGrammar(
  prompt: 'Generate a JSON object describing a person:',
  grammarStr: jsonGrammar,
  grammarRoot: 'root',
  maxTokens: 256,
  temperature: 0.8,
);

print(result);

// Clean up resources
llamafu.close();
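
A grammar can also back the tool-calling feature: constrain the model to a small JSON shape naming a tool and its arguments, then decode the result in Dart. The grammar and the {"tool", "arguments"} shape below are illustrative assumptions, not a llamafu convention; only the completeWithGrammar() call (reusing the llamafu instance from the example above) is part of the documented API.

import 'dart:convert';

// Illustrative tool-call sketch: the grammar and the {"tool", "arguments"} shape
// are assumptions; adapt them to the tools your app actually exposes.
const toolCallGrammar = r'''
root   ::= "{" ws "\"tool\"" ws ":" ws string ws "," ws "\"arguments\"" ws ":" ws string ws "}"
string ::= "\"" [^"\\]* "\""
ws     ::= [ \t\n]*
''';

final toolCallJson = await llamafu.completeWithGrammar(
  prompt: 'Pick a tool (weather or search) to answer: What is the weather in Lagos?\n',
  grammarStr: toolCallGrammar,
  grammarRoot: 'root',
  maxTokens: 128,
  temperature: 0.2,
);

// The grammar guarantees a parseable JSON object with exactly these two keys.
final toolCall = jsonDecode(toolCallJson) as Map<String, dynamic>;
print('Call ${toolCall['tool']} with ${toolCall['arguments']}');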

Supported Multi-modal Models #

Llamafu supports various multi-modal models through the llama.cpp MTMD library:

Vision Models #

  • Gemma 3
  • SmolVLM
  • Pixtral 12B
  • Qwen 2 VL
  • Qwen 2.5 VL
  • Mistral Small 3.1 24B
  • InternVL 2.5 and 3
  • Llama 4 Scout
  • Moondream2

Audio Models #

  • Ultravox 0.5
  • Qwen2-Audio
  • SeaLLM-Audio
  • Voxtral Mini

Mixed Modalities #

  • Qwen2.5 Omni (audio + vision)

Building #

Android #

  1. Ensure you have the Android NDK installed
  2. Build the native libraries:
    cd android/src/main/cpp
    mkdir build
    cd build
    cmake .. -DLLAMA_CPP_DIR=/path/to/llama.cpp
    make
    

iOS #

  1. Ensure you have Xcode installed
  2. Build the native libraries using Xcode or CMake

API Reference #

For detailed API documentation, see the package's API reference on pub.dev.

License #

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments #

  • This project uses the excellent llama.cpp library for running language models.
  • Multi-modal support is provided by the MTMD library in llama.cpp.
  • LoRA support is provided by the native LoRA adapter functionality in llama.cpp.
  • Constrained generation support is provided by the grammar sampler functionality in llama.cpp.