tiny_segmenter_dart 1.0.1

A compact Japanese text tokenizer for Dart. TinySegmenter is a Japanese word segmentation library based on the original JavaScript implementation by Taku Kudo.

Changelog

All notable changes to this project will be documented in this file.

The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.

1.0.1 - 2024-12-31

Changed

  • Updated license from MIT to BSD 3-Clause to match the license of the original TinySegmenter
  • Improved documentation and README formatting
  • Updated repository URLs

1.0.0 - 2024-12-31

Added

  • Initial release of TinySegmenter for Dart
  • Japanese text segmentation functionality
  • Support for Hiragana, Katakana, Kanji, and mixed text
  • Dictionary-free statistical approach for word segmentation
  • Comprehensive test suite
  • Full API documentation
  • README with usage examples
  • Character type classification system
  • Support for all Dart platforms (Flutter, Web, Server)

Features

  • segment(String input) method for text segmentation
  • Handles empty strings gracefully
  • Returns List<String>
  • Optimized for Japanese text processing
  • No external dependencies

Technical Details

  • Based on the original JavaScript TinySegmenter by Taku Kudo
  • Uses pre-trained statistical models for segmentation
  • Character type analysis for improved accuracy
  • Efficient implementation with minimal memory footprint

Publisher

iori.dev (verified publisher)

Topics

#japanese #text-processing #tokenizer #nlp #segmentation