

JARVIS #

A Dart-based voice assistant inspired by JARVIS from Iron Man. Say "JARVIS" to wake it up, speak naturally, and get intelligent spoken responses.

Features #

  • Wake Word Detection - Always listening for "JARVIS" using sherpa_onnx
  • Speech-to-Text - Transcribes speech using whisper.cpp
  • LLM Responses - Generates contextual responses using llama.cpp
  • Text-to-Speech - Natural speech synthesis using sherpa_onnx VITS
  • Conversation Memory - Maintains context across conversation turns
  • Barge-in Support - Interrupt JARVIS by saying the wake word while it's speaking
  • Follow-up Listening - Responds to follow-up questions without needing the wake word
  • Session Recording - Record sessions for debugging and analysis
  • Audio Acknowledgments - Plays audio feedback when activated

Installation #

Option 1: Global Install (Recommended) #

Install JARVIS globally as a CLI tool:

# Install globally from pub.flutter-io.cn
dart pub global activate jarvis_dart

# Run first-time setup (downloads ~150MB models)
jarvis setup

# Edit configuration (set whisper/llama paths)
vim ~/.jarvis/config.yaml

# Run JARVIS
jarvis

Option 2: From Source #

# Clone and install
git clone https://github.com/sjhorn/jarvis.git
cd jarvis
dart pub get

# Configure (edit paths to your models)
cp config.yaml.example config.yaml
vim config.yaml

# Run
dart run bin/jarvis.dart --config config.yaml

Option 3: Compiled Binary (Fastest Startup) #

Compile JARVIS to a native binary for instant startup (~50ms vs ~500ms for JIT):

# Clone the repo
git clone https://github.com/sjhorn/jarvis.git
cd jarvis
dart pub get

# Compile to native binary
dart compile exe bin/jarvis.dart -o jarvis

# Install to PATH (optional)
sudo mv jarvis /usr/local/bin/
# Or for user-only install:
mkdir -p ~/.local/bin && mv jarvis ~/.local/bin/

# Run first-time setup (downloads models to ~/.jarvis/)
jarvis setup

# Run JARVIS (uses ~/.jarvis/config.yaml by default)
jarvis

The compiled binary automatically uses default paths:

  • Config: ~/.jarvis/config.yaml
  • Models: ~/.jarvis/models/
  • Assets: ~/.jarvis/assets/

No --config flag needed when using the standard ~/.jarvis/ directory structure.
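
The lookup order can be pictured as a tiny shell helper (illustrative only; resolve_config is a hypothetical name, not part of jarvis_dart):

```shell
# Sketch of the config lookup described above: an explicit --config path
# wins, otherwise JARVIS falls back to ~/.jarvis/config.yaml.
# resolve_config is a hypothetical helper, not jarvis_dart's actual code.
resolve_config() {
  if [ -n "$1" ]; then
    echo "$1"
  else
    echo "$HOME/.jarvis/config.yaml"
  fi
}

resolve_config                 # default location
resolve_config ./config.yaml   # explicit --config override
```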

CLI Commands #

jarvis              # Run the voice assistant
jarvis setup        # Download models and create config
jarvis version      # Show version
jarvis --help       # Show help

Requirements #

System Dependencies #

Dart SDK

Platform   Installation
macOS      brew install dart
Linux      See Dart install docs
Windows    choco install dart-sdk or winget install Dart.Dart-SDK

Sox (Audio Recording)

Platform        Installation
macOS           brew install sox
Ubuntu/Debian   sudo apt install sox
Fedora          sudo dnf install sox
Arch            sudo pacman -S sox
Windows         Download from SourceForge

whisper.cpp (Speech-to-Text)

Build from source on all platforms:

git clone https://github.com/ggerganov/whisper.cpp
cd whisper.cpp
cmake -B build
cmake --build build --config Release

# Download a model
./models/download-ggml-model.sh base.en

The executable will be at build/bin/whisper-cli (or build/bin/Release/whisper-cli.exe on Windows).
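
Once built, it can be worth smoke-testing whisper-cli on its own before pointing JARVIS at it. A guarded sketch (the paths are examples; -m selects the model and -f the input WAV, per whisper.cpp's CLI):

```shell
# Verify the whisper.cpp build by transcribing a sample file directly.
# Paths are examples; adjust to your checkout and model location.
WHISPER=./build/bin/whisper-cli
MODEL=./models/ggml-base.en.bin

if [ -x "$WHISPER" ] && [ -f "$MODEL" ]; then
  msg=$("$WHISPER" -m "$MODEL" -f ./samples/jfk.wav)
else
  msg="whisper-cli or model not found; build whisper.cpp first"
fi
echo "$msg"
```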

llama.cpp (LLM Inference)

Platform        Installation
macOS           brew install llama.cpp
Linux/Windows   Build from source (see below)

Build from source:

git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
cmake -B build
cmake --build build --config Release

The executable will be at build/bin/llama-cli (or build/bin/Release/llama-cli.exe on Windows).
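
As with whisper.cpp, a quick standalone run confirms the build works. A guarded sketch (the model filename is an example; -m, -p, and -n are llama-cli's model, prompt, and token-count flags):

```shell
# Sanity-check llama-cli with a local GGUF model before configuring JARVIS.
# Paths are examples; adjust to your checkout and model location.
LLAMA=./build/bin/llama-cli
MODEL=./models/gemma-3-1b-it-Q4_K_M.gguf

if [ -x "$LLAMA" ] && [ -f "$MODEL" ]; then
  msg=$("$LLAMA" -m "$MODEL" -p "Hello" -n 16)
else
  msg="llama-cli or model not found; build llama.cpp first"
fi
echo "$msg"
```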

Platform-Specific Notes

macOS: Uses afplay for audio playback (built-in).

Linux: Requires a command-line audio player. Install one of:

  • sudo apt install sox (uses play command)
  • sudo apt install ffmpeg (uses ffplay)
  • sudo apt install mpv (uses mpv)

Windows: Audio playback uses PowerShell's built-in capabilities.

Models Required #

  • Whisper - Speech recognition model (e.g., ggml-base.en.bin)
  • LLM - Language model (e.g., gemma-3-1b-it from Hugging Face)
  • Wake Word - sherpa_onnx keyword spotter model
  • TTS - sherpa_onnx VITS model with espeak-ng data

Model Setup #

Scripts are provided to download the required models:

# Download and setup TTS model (JARVIS voice)
cd models/tts
./get_model.sh
cd ../..

# Download wake word detection model
cd models/kws
./get_model.sh
cd ../..

The TTS script:

  • Downloads the JARVIS voice model from HuggingFace (piper format)
  • Converts it to sherpa-onnx format with metadata
  • Downloads espeak-ng phoneme data

Note: The convert script requires Python with the onnx package:

pip install onnx

Configuration #

Create config.yaml with your model paths:

# Speech-to-Text (Whisper)
whisper_model_path: /path/to/ggml-base.en.bin
whisper_executable: /path/to/whisper-cli

# LLM (Llama)
llama_model_repo: ggml-org/gemma-3-1b-it-GGUF
llama_executable: /opt/homebrew/bin/llama-cli

# Wake Word Detection
wakeword_encoder_path: ./models/kws/encoder.onnx
wakeword_decoder_path: ./models/kws/decoder.onnx
wakeword_joiner_path: ./models/kws/joiner.onnx
wakeword_tokens_path: ./models/kws/tokens.txt
wakeword_keywords_file: ./models/kws/keywords.txt

# Text-to-Speech
tts_model_path: ./models/tts/jarvis-high.onnx
tts_tokens_path: ./models/tts/tokens.txt
tts_data_dir: ./models/tts/espeak-ng-data

# Sherpa Native Library
sherpa_lib_path: ~/.pub-cache/hosted/pub.flutter-io.cn/sherpa_onnx_macos-1.12.20/macos

# Audio Feedback
acknowledgment_dir: ./assets/acknowledgments
barge_in_dir: ./assets/bargein

# Behavior Settings
system_prompt: |
  You are JARVIS, a helpful AI assistant.
  Keep responses concise for spoken delivery.

silence_threshold: 0.01
silence_duration_ms: 800
max_history_length: 10
sentence_pause_ms: 200

# Follow-up Listening
enable_follow_up: true
follow_up_timeout_ms: 4000
statement_follow_up_timeout_ms: 4000

# Barge-in
enable_barge_in: true

# Audio Playback (optional - auto-detects if not specified)
audio_player: auto           # auto, afplay, play, mpv, ffplay, aplay
audio_player_path: /usr/bin/afplay  # optional custom path
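
The silence_threshold and silence_duration_ms settings drive end-of-utterance detection: JARVIS stops listening once the signal stays below the threshold for the configured duration. An illustrative awk sketch of that gate (not the actual VAD code; the 100 ms frame size and RMS values are made up):

```shell
# Hypothetical 100 ms RMS readings: speech, then a stretch of quiet.
frames="0.20 0.15 0.09 0.004 0.003 0.002 0.001 0.002 0.003 0.001 0.002 0.001"

# Accumulate consecutive quiet frames; 800 ms of quiet ends the utterance.
result=$(echo "$frames" | tr ' ' '\n' | awk -v thr=0.01 -v frame_ms=100 -v need_ms=800 '
  $1 <  thr { quiet += frame_ms; if (quiet >= need_ms) { print "end of utterance"; exit } }
  $1 >= thr { quiet = 0 }')
echo "$result"
```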

Audio Player Options #

Player   Platforms   Notes
auto     All         Auto-detect best available (default)
afplay   macOS       Built-in CoreAudio player
play     All         Sox audio player
mpv      All         Multimedia player
ffplay   All         FFmpeg player
aplay    Linux       ALSA player
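
auto presumably walks a preference list and takes the first player found on PATH; the pattern looks like this (detect_player is a hypothetical sketch, not jarvis_dart's actual detection code):

```shell
# Return the first command from the argument list that exists on PATH.
detect_player() {
  for p in "$@"; do
    if command -v "$p" >/dev/null 2>&1; then
      echo "$p"
      return 0
    fi
  done
  return 1
}

# Preference order mirroring the table above.
player=$(detect_player afplay play mpv ffplay aplay || echo none)
echo "selected player: $player"
```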

Usage #

# Basic usage
dart run bin/jarvis.dart --config config.yaml

# With debug logging
dart run bin/jarvis.dart --config config.yaml --debug

# Record session for debugging
dart run bin/jarvis.dart --config config.yaml --record

# Record to custom directory
dart run bin/jarvis.dart --config config.yaml --record-dir ./my-sessions

CLI Options #

Option                  Description
-c, --config <path>     Path to YAML config file
-v, --verbose           Enable INFO level logging
-d, --debug             Enable DEBUG level logging
--trace                 Enable TRACE level logging
-q, --quiet             Suppress all logging
--record                Enable session recording
--record-dir <path>     Custom session directory
-h, --help              Show help message

Architecture #

┌─────────────────────────────────────────────────────────────┐
│                      VoiceAssistant                          │
│                    (Main Orchestrator)                       │
└─────────────────────────────────────────────────────────────┘
                              │
        ┌─────────────────────┼─────────────────────┐
        ▼                     ▼                     ▼
┌───────────────┐    ┌───────────────┐    ┌───────────────┐
│  AudioInput   │    │ Conversation  │    │  AudioOutput  │
│   (sox rec)   │    │    Context    │    │   (afplay)    │
└───────────────┘    └───────────────┘    └───────────────┘
        │                                          ▲
        ▼                                          │
┌───────────────┐                        ┌───────────────┐
│   WakeWord    │                        │      TTS      │
│   Detector    │                        │   (sherpa)    │
└───────────────┘                        └───────────────┘
        │                                          ▲
        ▼                                          │
┌───────────────┐    ┌───────────────┐    ┌───────────────┐
│     VAD       │───►│    Whisper    │───►│     Llama     │
│   (Silence)   │    │   (STT)       │    │    (LLM)      │
└───────────────┘    └───────────────┘    └───────────────┘

Tools #

Utility scripts in tool/:

# Generate acknowledgment audio files
dart run tool/generate_acknowledgments.dart

# Generate barge-in audio files
dart run tool/generate_bargein.dart

# Regenerate a single acknowledgment
dart run tool/regenerate_ack.dart 8 "System active."

# Replay a recorded session
dart run tool/replay_session.dart ./sessions/session_* --verbose
dart run tool/replay_session.dart ./sessions/session_* --transcribe

Project Structure #

jarvis/
├── bin/
│   └── jarvis.dart              # CLI entry point
├── lib/src/
│   ├── audio/
│   │   ├── audio_input.dart     # Microphone capture
│   │   ├── audio_output.dart    # Audio playback
│   │   └── acknowledgment_player.dart
│   ├── cli/
│   │   └── config_loader.dart   # Configuration parsing
│   ├── context/
│   │   └── conversation_context.dart
│   ├── llm/
│   │   └── llama_process.dart   # LLM integration
│   ├── process/
│   │   └── process_pipe.dart    # Process communication
│   ├── recording/
│   │   ├── session_event.dart   # Event types
│   │   ├── session_recorder.dart
│   │   └── wav_writer.dart
│   ├── stt/
│   │   └── whisper_process.dart # Speech-to-text
│   ├── tts/
│   │   ├── tts_manager.dart     # Text-to-speech
│   │   └── text_processor.dart  # Response cleaning
│   ├── vad/
│   │   └── voice_activity_detector.dart
│   ├── wakeword/
│   │   └── wake_word_detector.dart
│   ├── logging.dart
│   └── voice_assistant.dart     # Main orchestrator
├── models/
│   ├── kws/
│   │   └── get_model.sh         # Download wake word model
│   └── tts/
│       ├── get_model.sh         # Download TTS model
│       └── convert.py           # Convert to sherpa format
├── test/                        # 277 tests
├── tool/                        # Utility scripts
├── assets/
│   ├── acknowledgments/         # Wake word audio
│   └── bargein/                 # Barge-in audio
└── config.yaml                  # Configuration

Development #

# Run all tests
dart test

# Run specific test
dart test test/voice_assistant_test.dart

# Format code
dart format lib test

# Analyze code
dart analyze

Session Recording #

When running with --record, sessions are saved to ./sessions/:

sessions/
└── session_2024-01-15_10-30-45/
    ├── session.jsonl           # Event log
    └── audio/
        ├── 001_user.wav        # User utterances
        ├── 002_user.wav
        └── ...

Event types in JSONL:

  • session_start - Config and metadata
  • wake_word - Wake word detection
  • user_audio - User speech recording
  • transcription - STT result
  • response - LLM response
  • barge_in - User interruption
  • session_end - Session summary
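
Put together, a session.jsonl might look something like this (illustrative; the exact field names are assumptions, not documented output):

```
{"type": "session_start", "timestamp": "2024-01-15T10:30:45Z"}
{"type": "wake_word", "timestamp": "2024-01-15T10:30:52Z"}
{"type": "user_audio", "file": "audio/001_user.wav"}
{"type": "transcription", "text": "what is the time"}
{"type": "response", "text": "It is half past ten, sir."}
{"type": "session_end", "turns": 1}
```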

License #

MIT License - see LICENSE

Third-Party Licenses #

Component     License
whisper.cpp   MIT
llama.cpp     MIT
sherpa_onnx   Apache-2.0
yaml          MIT
logging       BSD-3-Clause