PHP regex \p support and Unicode character sets
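1. Usage
\p{xx} a character that has the property xx
\P{xx} a character that does not have the property xx
\X an extended Unicode sequence (a base character together with any combining marks that follow it)
Sets of Unicode characters are defined as belonging to particular scripts. A script name can be used to match one character from the corresponding set. For example: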
\p{Greek}
\P{Han}
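
The two patterns above can be tried out with the `preg_*` functions. The following is a minimal sketch, assuming the input is valid UTF-8 and the `u` modifier is set on each pattern; the sample string and variable names are made up for illustration.

```php
<?php
// Sample input mixing Latin, Han and Greek characters (illustrative only).
$text = 'PHP正则 αβγ abc';

// \p{Han}+ matches each run of Han characters.
preg_match_all('/\p{Han}+/u', $text, $han);

// \p{Greek}+ matches each run of Greek characters.
preg_match_all('/\p{Greek}+/u', $text, $greek);

// \P{Han} matches any character that is NOT Han, so removing those
// matches keeps only the Han characters.
$onlyHan = preg_replace('/\P{Han}/u', '', $text);

// \X matches one extended Unicode sequence: "e" followed by the combining
// acute accent U+0301 counts as a single match, "a" as another.
$clusters = preg_match_all('/\X/u', "e\u{0301}a");

var_dump($han[0], $greek[0], $onlyHan, $clusters);
```

Without the `u` modifier the subject is processed byte by byte, so multi-byte characters such as Han ideographs would not be matched as whole characters.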
2. Supported character sets
Supported scripts | | | | |
---|---|---|---|---|
Arabic | Armenian | Avestan | Balinese | Bamum |
Batak | Bengali | Bopomofo | Brahmi | Braille |
Buginese | Buhid | Canadian_Aboriginal | Carian | Chakma |
Cham | Cherokee | Common | Coptic | Cuneiform |
Cypriot | Cyrillic | Deseret | Devanagari | Egyptian_Hieroglyphs |
Ethiopic | Georgian | Glagolitic | Gothic | Greek |
Gujarati | Gurmukhi | Han | Hangul | Hanunoo |
Hebrew | Hiragana | Imperial_Aramaic | Inherited | Inscriptional_Pahlavi |
Inscriptional_Parthian | Javanese | Kaithi | Kannada | Katakana |
Kayah_Li | Kharoshthi | Khmer | Lao | Latin |
Lepcha | Limbu | Linear_B | Lisu | Lycian |
Lydian | Malayalam | Mandaic | Meetei_Mayek | Meroitic_Cursive |
Meroitic_Hieroglyphs | Miao | Mongolian | Myanmar | New_Tai_Lue |
Nko | Ogham | Old_Italic | Old_Persian | Old_South_Arabian |
Old_Turkic | Ol_Chiki | Oriya | Osmanya | Phags_Pa |
Phoenician | Rejang | Runic | Samaritan | Saurashtra |
Sharada | Shavian | Sinhala | Sora_Sompeng | Sundanese |
Syloti_Nagri | Syriac | Tagalog | Tagbanwa | Tai_Le |
Tai_Tham | Tai_Viet | Takri | Tamil | Telugu |
Thaana | Thai | Tibetan | Tifinagh | Ugaritic |
Vai | Yi | | | |
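
Which script names are accepted depends on the PCRE library that a given PHP build uses, so the table may not match every installation exactly. As a rough sketch, a name can be probed at runtime by compiling a pattern that uses it. The helper name `isScriptSupported` is hypothetical and not part of PHP.

```php
<?php
// Hypothetical helper: reports whether the PCRE build behind this PHP
// installation accepts \p{<name>}. Note that it accepts any valid property
// name (for example general categories such as "Lu"), not only scripts.
function isScriptSupported(string $name): bool
{
    // Names in the table consist of letters and underscores only; reject
    // anything else so the name cannot change the shape of the pattern.
    if (!preg_match('/^[A-Za-z_]+$/', $name)) {
        return false;
    }

    // An unknown property name makes pattern compilation fail, in which
    // case preg_match() returns false and raises a warning (suppressed here).
    return @preg_match('/\p{' . $name . '}/u', '') !== false;
}

var_dump(isScriptSupported('Han'));
var_dump(isScriptSupported('No_Such_Script'));
```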