cl-wordcut

https://github.com/veer66/cl-wordcut.git

git clone 'http://github.com/veer66/cl-wordcut.git'

(ql:quickload :cl-wordcut)

cl-wordcut is a word segmentation tool, written in Common Lisp, for ASEAN languages such as Khmer, Lao, and Thai.

Examples

Khmer (Cambodian)

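(require 'cl-wordcut)
;; Load the Khmer word list bundled with cl-wordcut.
;; defparameter, unlike defvar, rebinds the variable even if it is already bound.
(defparameter *dict* (cl-wordcut:load-dict-from-bundle "khmerwords.txt"))
;; The segmenter is a funcallable object, so call it with funcall.
(defparameter *wordcut* (cl-wordcut:create-basic-wordcut *dict*))
(funcall *wordcut* "ភាសាខ្មែរ")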

Lao

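(require 'cl-wordcut)
;; Load the Lao word list bundled with cl-wordcut.
;; defparameter, unlike defvar, replaces a *dict* left over from another example.
(defparameter *dict* (cl-wordcut:load-dict-from-bundle "laowords.txt"))
(defparameter *wordcut* (cl-wordcut:create-basic-wordcut *dict*))
(funcall *wordcut* "ພາສາລາວມີ")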

Thai

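(require 'cl-wordcut)
;; Load the Thai word list bundled with cl-wordcut.
;; defparameter, unlike defvar, replaces a *dict* left over from another example.
(defparameter *dict* (cl-wordcut:load-dict-from-bundle "tdict-std.txt"))
(defparameter *wordcut* (cl-wordcut:create-basic-wordcut *dict*))
(funcall *wordcut* "กากาม")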
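
Several languages in one session

The three examples above share the variable names *dict* and *wordcut*, so only one language is usable at a time. To keep several segmenters available in the same Lisp image, one option is a small wrapper like the sketch below. make-segmenter and the *khmer-wordcut*, *lao-wordcut* and *thai-wordcut* variables are hypothetical names chosen for illustration and are not part of cl-wordcut. The sketch uses only the two library calls shown above and assumes the bundled dictionary files have the names used in those examples.

```lisp
(require 'cl-wordcut)

;; MAKE-SEGMENTER is a hypothetical helper, not part of cl-wordcut.
;; It combines the two library calls used in the examples above.
(defun make-segmenter (dict-file)
  "Return a segmenter built from the bundled dictionary DICT-FILE."
  (cl-wordcut:create-basic-wordcut
   (cl-wordcut:load-dict-from-bundle dict-file)))

;; One segmenter per language, each built from its own dictionary.
(defparameter *khmer-wordcut* (make-segmenter "khmerwords.txt"))
(defparameter *lao-wordcut* (make-segmenter "laowords.txt"))
(defparameter *thai-wordcut* (make-segmenter "tdict-std.txt"))

;; Each segmenter is called the same way as *wordcut* above.
(funcall *thai-wordcut* "กากาม")
(funcall *lao-wordcut* "ພາສາລາວມີ")
```

Because each segmenter has its own variable, switching languages does not require reloading a dictionary.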