llm_toolkit 0.0.2 copy "llm_toolkit: ^0.0.2" to clipboard
llm_toolkit: ^0.0.2 copied to clipboard

A comprehensive Flutter SDK for running Large Language Models (LLMs) locally on mobile and desktop devices. Supports multiple inference engines including Gemma (TFLite) and Llama (GGUF) with integrate [...]

LLM Toolkit SDK Changelog #

0.0.2 June 29, 2025 #

πŸš€ New Features #

  • Enhanced Model Browser with tabbed interface (Search, Recommended, Downloaded)
  • File Browser Integration - Load models directly from device storage
  • Device-Specific Recommendations - Memory-aware model suggestions based on device capabilities
  • Advanced Debug Console - Real-time logging with color-coded categories and filtering
  • Improved Model Cards - Better file organization, download status, and compatibility indicators
  • Native Library Validation - Automatic detection of llama.cpp compatibility and health checks
  • Progressive Model Loading - Fallback configurations for memory-constrained devices
  • Enhanced Error Handling - Detailed error messages with troubleshooting tips and recovery suggestions

πŸ”§ Improvements #

  • Quantization Safety Checks - Automatic detection and warnings for unstable quantizations (Q2_K, IQ1_S, IQ1_M)
  • Memory Management - Dynamic context size adjustment based on available device memory
  • Download Progress - Real-time progress indicators with speed metrics and ETA
  • Model Detection - Improved engine detection for GGUF and TFLite files with better accuracy
  • UI/UX Enhancements - Material Design 3 with gradient themes, better accessibility, and responsive design
  • Performance Monitoring - Token generation speed tracking, memory usage reporting, and performance analytics
  • Search Optimization - Better model discovery with multiple search strategies and relevance ranking

πŸ› Bug Fixes #

  • Fixed SIGSEGV crashes with problematic quantizations on Android ARM64 devices
  • Resolved memory leaks during model loading/unloading cycles
  • Fixed download interruption handling and resume functionality
  • Corrected model file validation for corrupted GGUF files
  • Fixed context size calculation for low-memory devices
  • Resolved engine detection issues for certain model formats
  • Fixed UI state management during concurrent operations

πŸ“š Documentation #

  • Added comprehensive model compatibility guide with device-specific recommendations
  • Enhanced troubleshooting documentation for Android devices with common solutions
  • Updated API documentation with new configuration options and examples
  • Added performance optimization guidelines for mobile deployment

πŸ› οΈ Technical Improvements #

  • Isolate-based Model Loading - Better crash protection and memory isolation
  • Enhanced Logging System - Categorized debug logs with filtering and export capabilities
  • Improved Error Recovery - Graceful fallback mechanisms for failed operations
  • Better Resource Management - Optimized memory usage and cleanup procedures

0.0.1 June 27, 2025 #

πŸŽ‰ Initial Release #

  • Dual Engine Support - Llama (GGUF) and Gemma (TFLite) inference engines
  • HuggingFace Integration - Direct model search and download from HuggingFace Hub
  • Model Management - Automatic model detection and configuration
  • Chat Interface - Real-time streaming text generation with conversation history
  • Cross-Platform - Support for Android and iOS devices

πŸ”§ Core Features #

  • Model Search - Search and filter models by format, size, and compatibility
  • Automatic Downloads - Background model downloading with progress tracking
  • Engine Detection - Automatic selection of appropriate inference engine based on model format
  • Memory Optimization - Dynamic configuration based on device capabilities
  • Error Handling - Comprehensive exception handling and user feedback

πŸ“± Supported Formats #

  • GGUF - Quantized models for efficient inference (Q4_K_M, Q4_0, Q5_K_M, Q8_0)
  • TFLite - TensorFlow Lite models optimized for mobile devices
  • Model Types - Support for Gemma, Llama, Phi, DeepSeek, and custom models

πŸ› οΈ Technical Stack #

  • Flutter Framework - Cross-platform mobile development
  • llama_cpp_dart - Native GGUF model inference with ARM64 optimization
  • flutter_gemma - TFLite model inference with GPU acceleration
  • Dio HTTP Client - Efficient model downloading and API communication
  • Path Provider - Cross-platform file system access and management

🎯 Initial Capabilities #

  • Text Generation - Streaming text generation with customizable parameters
  • Model Discovery - Browse and search thousands of compatible models
  • Local Storage - Efficient model caching and management
  • Debug Tools - Basic logging and error reporting
  • Configuration Management - Flexible inference configuration options
5
likes
0
points
119
downloads

Publisher

unverified uploader

Weekly Downloads

A comprehensive Flutter SDK for running Large Language Models (LLMs) locally on mobile and desktop devices. Supports multiple inference engines including Gemma (TFLite) and Llama (GGUF) with integrated model discovery, download, and chat capabilities.

Repository (GitHub)
View/report issues

License

unknown (license)

Dependencies

collection, crypto, dio, flutter, flutter_gemma, http, json_annotation, llama_cpp_dart, objectbox, objectbox_flutter_libs, path, path_provider

More

Packages that depend on llm_toolkit