N-gram Parser for CJK Languages

N-gram Parser

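The built-in n-gram parser tokenizes text into overlapping sequences of n contiguous characters instead of splitting on word boundaries. This enables full-text search for Chinese, Japanese, and Korean text, where whitespace is not a reliable word delimiter.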

Example

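-- Create a table whose full-text index uses the n-gram parser
CREATE TABLE cjk_articles (
  id INT AUTO_INCREMENT PRIMARY KEY,
  content TEXT,
  FULLTEXT (content) WITH PARSER ngram
) ENGINE=InnoDB CHARACTER SET utf8mb4;

-- The token size is a server setting read at startup (default 2, range 1-10).
-- In my.cnf, under [mysqld]:
--   ngram_token_size = 2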
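
To show the index in use, here is a minimal sketch of inserting and searching rows in the cjk_articles table above. The sample strings are illustrative only, and the sketch assumes the table stores utf8mb4 text and that ngram_token_size is left at its default of 2.

```sql
-- Sample rows (illustrative data, not from any real dataset)
INSERT INTO cjk_articles (content) VALUES
  ('数据库管理系统'),
  ('全文检索功能');

-- Natural language mode: the search term is split into 2-character
-- tokens and rows containing any of them are candidates.
-- This is expected to match the first row.
SELECT id, content
FROM cjk_articles
WHERE MATCH(content) AGAINST('数据' IN NATURAL LANGUAGE MODE);

-- Boolean mode: a term longer than the token size is treated as a
-- phrase of consecutive n-grams ('数据' followed by '据库').
SELECT id, content
FROM cjk_articles
WHERE MATCH(content) AGAINST('数据库' IN BOOLEAN MODE);
```

The two modes treat multi-token terms differently, so it is worth running both against your own data before choosing one.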

Pro Tip

With ngram_token_size = 2, every contiguous 2-character sequence is indexed. For example, the string "abcd" is indexed as the tokens "ab", "bc", and "cd".
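
One way to see which tokens were actually produced is to inspect the InnoDB full-text index cache. This is a sketch under stated assumptions: the table is InnoDB, it lives in a database named test (a hypothetical name, so substitute your own), rows were inserted recently enough to still be in the cache, and your account is allowed to set global variables.

```sql
-- Point the full-text diagnostic tables at the table to inspect.
-- 'test' is an assumed database name.
SET GLOBAL innodb_ft_aux_table = 'test/cjk_articles';

-- List the tokens held in the index cache for recently inserted rows.
SELECT word, doc_id, position
FROM INFORMATION_SCHEMA.INNODB_FT_INDEX_CACHE
ORDER BY doc_id, position;
```

With the default token size, each value in the word column should be two characters long, which makes it easy to confirm that the setting took effect.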
