tiny_segmenter_dart 1.0.1
A compact Japanese text tokenizer for Dart. TinySegmenter is a Japanese word segmentation library based on the original JavaScript implementation by Taku Kudo.
Changelog
All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
1.0.1 - 2024-12-31
Changed
- Updated license from MIT to BSD 3-Clause to match the original TinySegmenter license
- Improved documentation and README formatting
- Updated repository URLs
1.0.0 - 2024-12-31
Added
- Initial release of TinySegmenter for Dart
- Japanese text segmentation functionality
- Support for Hiragana, Katakana, Kanji, and mixed text
- Dictionary-free statistical approach for word segmentation
- Comprehensive test suite
- Full API documentation
- README with usage examples
- Character type classification system
- Support for all Dart platforms (Flutter, Web, Server)
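The character type classification listed above can be pictured with a short sketch. The categories and character ranges below follow the original JavaScript TinySegmenter; the enum name `CharType` and the function `classify` are hypothetical names chosen for illustration, and this package's internals may well differ.

```dart
/// Character classes modeled on the original JavaScript TinySegmenter.
/// Assumption: the Dart port may use different names or ranges internally.
enum CharType { kanjiNumeral, kanji, hiragana, katakana, latin, digit, other }

// Patterns are checked in order, so kanji numerals win over general kanji.
final List<MapEntry<RegExp, CharType>> _patterns = [
  MapEntry(RegExp(r'[一二三四五六七八九十百千万億兆]'), CharType.kanjiNumeral),
  MapEntry(RegExp(r'[一-龠々〆ヵヶ]'), CharType.kanji),
  MapEntry(RegExp(r'[ぁ-ん]'), CharType.hiragana),
  MapEntry(RegExp(r'[ァ-ヴーｱ-ﾝﾞｰ]'), CharType.katakana),
  MapEntry(RegExp(r'[a-zA-Zａ-ｚＡ-Ｚ]'), CharType.latin),
  MapEntry(RegExp(r'[0-9０-９]'), CharType.digit),
];

/// Returns the class of a single-character string (hypothetical helper).
CharType classify(String ch) {
  for (final entry in _patterns) {
    if (entry.key.hasMatch(ch)) return entry.value;
  }
  return CharType.other;
}
```

Mixed text such as 東京タワー2024 would then map to a sequence of kanji, katakana, and digit classes, which a segmenter can use as features alongside the characters themselves.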
Features
- segment(String input) method for text segmentation
- Handles empty strings gracefully
- Returns List<String>
- Optimized for Japanese text processing
- No external dependencies
Technical Details
- Based on the original JavaScript TinySegmenter by Taku Kudo
- Uses pre-trained statistical models for segmentation
- Character type analysis for improved accuracy
- Efficient implementation with minimal memory footprint
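To illustrate the dictionary-free statistical approach, here is a minimal sketch of the general idea: the original JavaScript TinySegmenter scores every candidate boundary as a bias plus a sum of feature weights, and splits where the score is positive. Everything concrete below is an assumption made for illustration: the weights are invented toy values, only two feature kinds are used, and the top-level `segment` function is a stand-in rather than this package's code.

```dart
// Toy linear model: every number below is invented for illustration.
const int bias = -2;

// Weight for the pair (type before boundary, type after boundary).
const Map<String, int> typePairWeights = {'HI': 3, 'IH': 3, 'HH': -1};

// Weight for specific characters adjacent to a candidate boundary.
const Map<String, int> particleWeights = {'は': 3, 'の': 3, 'を': 3};

/// Coarse character type: H = kanji, I = hiragana, K = katakana, O = other.
String charType(String ch) {
  final code = ch.runes.first;
  if (code >= 0x4E00 && code <= 0x9FA0) return 'H';
  if (code >= 0x3041 && code <= 0x3093) return 'I';
  if ((code >= 0x30A1 && code <= 0x30F4) || code == 0x30FC) return 'K';
  return 'O';
}

/// Splits [input] wherever the toy boundary score is positive.
List<String> segment(String input) {
  if (input.isEmpty) return <String>[];
  final chars = input.runes.map((r) => String.fromCharCode(r)).toList();
  final result = <String>[];
  final word = StringBuffer(chars.first);
  for (var i = 1; i < chars.length; i++) {
    final prev = chars[i - 1];
    final next = chars[i];
    final score = bias +
        (typePairWeights[charType(prev) + charType(next)] ?? 0) +
        (particleWeights[prev] ?? 0) +
        (particleWeights[next] ?? 0);
    if (score > 0) {
      // Positive score: close the current word and start a new one.
      result.add(word.toString());
      word.clear();
    }
    word.write(next);
  }
  result.add(word.toString());
  return result;
}
```

With these toy weights, `segment('私の名前は中野です')` yields `[私, の, 名前, は, 中野, です]`, and an empty string yields an empty list. A trained model uses far more features and weights, so this sketch will mis-segment many inputs that the real segmenter is meant to handle.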