N-gram Parser
The built-in n-gram parser tokenizes text into contiguous sequences of n characters rather than by word boundaries. This enables full-text search for Chinese, Japanese, and Korean (CJK), which do not separate words with spaces.
-- Create a table whose full-text index uses the n-gram parser
CREATE TABLE cjk_articles (
    id INT AUTO_INCREMENT PRIMARY KEY,
    content TEXT,
    FULLTEXT (content) WITH PARSER ngram
);
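-- Set the n-gram token size (default 2, allowed range 1 to 10).
-- The variable is read-only at runtime, so set it at server startup,
-- for example in my.cnf under [mysqld]:
-- ngram_token_size = 2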
With ngram_token_size = 2, every contiguous 2-character sequence (bigram) in the text is indexed. For example, "abcd" is tokenized as "ab", "bc", and "cd".
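To see the bigram behavior in practice, the following is a minimal sketch rather than a definitive recipe. It assumes MySQL with an InnoDB table, the cjk_articles table defined earlier, and a schema named test. The schema name and the sample rows are made up for illustration. It inserts two rows, lists the tokens the parser produced, and then searches in both natural language and boolean mode.

```sql
-- Hypothetical sample rows for illustration.
INSERT INTO cjk_articles (content) VALUES
  ('数据库全文搜索'),
  ('全文检索引擎');

-- Inspect the tokens held in the full-text index cache.
-- 'test' is an assumed schema name; adjust it to your own.
-- SET GLOBAL and the INFORMATION_SCHEMA table need elevated privileges.
SET GLOBAL innodb_ft_aux_table = 'test/cjk_articles';

SELECT word, doc_id, position
FROM INFORMATION_SCHEMA.INNODB_FT_INDEX_CACHE
ORDER BY doc_id, position;

-- Natural language mode: the search string is split into bigrams
-- ('全文', '文搜', '搜索') and a row matches if it contains any of them.
SELECT id, content
FROM cjk_articles
WHERE MATCH(content) AGAINST('全文搜索' IN NATURAL LANGUAGE MODE);

-- Boolean mode: the search string is treated as an n-gram phrase,
-- so a row matches only if the bigrams appear consecutively.
SELECT id, content
FROM cjk_articles
WHERE MATCH(content) AGAINST('全文搜索' IN BOOLEAN MODE);
```

The two query modes are shown side by side because they treat the same search string differently: natural language mode favors recall, since any shared bigram can match, while boolean mode behaves like a phrase search.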