Module props

This module defines all available properties.

A property is either an empty marker type that implements BinaryProperty, or an enumeration¹ that implements EnumeratedProperty.

Types implementing BinaryProperty are queried through a CodePointSetData, while types implementing EnumeratedProperty are queried through a CodePointMapData.
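The split between marker types and enumerations can be illustrated with a small stand-alone model. This is only a sketch of the pattern, not this crate's implementation: every name prefixed with `Toy` is hypothetical, and the hard-coded ranges are toy data standing in for real Unicode data.

```rust
/// Hypothetical stand-in for a binary property trait. Implementors are
/// empty marker types that only identify which data to load.
trait ToyBinaryProperty {
    /// Inclusive code point ranges that have the property (toy data).
    const RANGES: &'static [(u32, u32)];
}

/// Empty marker type: it has no fields and no values worth inspecting.
struct ToyAsciiHexDigit;

impl ToyBinaryProperty for ToyAsciiHexDigit {
    const RANGES: &'static [(u32, u32)] = &[(0x30, 0x39), (0x41, 0x46), (0x61, 0x66)];
}

/// Hypothetical set type: answers yes/no for a code point.
struct ToySetData {
    ranges: &'static [(u32, u32)],
}

impl ToySetData {
    /// The marker type is passed as a type parameter, never as a value.
    fn new<P: ToyBinaryProperty>() -> Self {
        ToySetData { ranges: P::RANGES }
    }

    fn contains(&self, c: char) -> bool {
        let cp = c as u32;
        self.ranges.iter().any(|r| r.0 <= cp && cp <= r.1)
    }
}

/// Hypothetical stand-in for an enumerated property trait. The
/// implementing type is itself the value assigned to each code point.
trait ToyEnumeratedProperty: Copy + PartialEq + 'static {
    /// Value returned for code points not covered by RANGES.
    const DEFAULT: Self;
    /// Inclusive code point ranges with their value (toy data).
    const RANGES: &'static [(u32, u32, Self)];
}

#[derive(Clone, Copy, Debug, PartialEq)]
enum ToyHexKind {
    NotHex,
    Digit,
    Letter,
}

impl ToyEnumeratedProperty for ToyHexKind {
    const DEFAULT: Self = ToyHexKind::NotHex;
    const RANGES: &'static [(u32, u32, Self)] = &[
        (0x30, 0x39, ToyHexKind::Digit),
        (0x41, 0x46, ToyHexKind::Letter),
        (0x61, 0x66, ToyHexKind::Letter),
    ];
}

/// Hypothetical map type: returns a property value for a code point.
struct ToyMapData<T: ToyEnumeratedProperty> {
    ranges: &'static [(u32, u32, T)],
}

impl<T: ToyEnumeratedProperty> ToyMapData<T> {
    fn new() -> Self {
        ToyMapData { ranges: T::RANGES }
    }

    fn get(&self, c: char) -> T {
        let cp = c as u32;
        self.ranges
            .iter()
            .find(|r| r.0 <= cp && cp <= r.1)
            .map(|r| r.2)
            .unwrap_or(T::DEFAULT)
    }
}
```

In this model, `ToySetData::new::<ToyAsciiHexDigit>().contains('f')` answers a membership question, while `ToyMapData::<ToyHexKind>::new().get('7')` returns a value of the property type itself.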

In addition, some types implementing EnumeratedProperty also implement ParseableEnumeratedProperty or NamedEnumeratedProperty. For these properties, PropertyParser, PropertyNamesLong, and PropertyNamesShort can be constructed.

¹ Either Rust enums or Rust structs with associated constants (open enums).
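The difference between a Rust enum and an open enum can be sketched as follows. This is a stand-alone illustration with hypothetical names (`ToyClosed`, `ToyOpen`, `ToyOutOfBounds`) and made-up values, not a copy of any type in this module. The fallible `TryFrom<u8>` conversion on the closed enum is the kind of conversion that needs an out-of-bounds error type.

```rust
use std::convert::TryFrom;

/// Closed enum: the full set of values is fixed in the type definition,
/// so converting from a raw integer can fail.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[repr(u8)]
enum ToyClosed {
    Unassigned = 0,
    Letter = 1,
    Digit = 2,
}

/// Hypothetical error returned when a raw value names no variant.
#[derive(Debug, PartialEq, Eq)]
struct ToyOutOfBounds;

impl TryFrom<u8> for ToyClosed {
    type Error = ToyOutOfBounds;

    fn try_from(raw: u8) -> Result<Self, Self::Error> {
        match raw {
            0 => Ok(Self::Unassigned),
            1 => Ok(Self::Letter),
            2 => Ok(Self::Digit),
            _ => Err(ToyOutOfBounds),
        }
    }
}

/// Open enum: a struct wrapping the raw value, with the known values
/// exposed as associated constants. Values without a named constant
/// remain representable.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct ToyOpen(u8);

#[allow(dead_code, non_upper_case_globals)]
impl ToyOpen {
    const Unknown: ToyOpen = ToyOpen(0);
    const Latin: ToyOpen = ToyOpen(1);

    /// Infallible: every raw value is a valid ToyOpen.
    const fn from_raw(raw: u8) -> Self {
        ToyOpen(raw)
    }
}
```

The associated constants read like enum variants at the call site (`ToyOpen::Latin`), but a value such as `ToyOpen::from_raw(200)` is still valid, which is useful when new values may appear in data that is newer than the code.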

Structs

Alnum: Characters with the Alphabetic or Decimal_Number property.
Alphabetic: Alphabetic characters.
AsciiHexDigit: ASCII characters commonly used for the representation of hexadecimal numbers.
BasicEmoji: Characters and character sequences intended for general-purpose, independent, direct input.
BidiClass: Enumerated property Bidi_Class.
BidiControl: Format control characters which have specific functions in the Unicode Bidirectional Algorithm.
BidiMirrored: Characters that are mirrored in bidirectional text.
BidiMirroringGlyph: This is a bitpacked combination of the Bidi_Mirroring_Glyph, Bidi_Mirrored, and Bidi_Paired_Bracket_Type properties.
Blank: Horizontal whitespace characters.
CanonicalCombiningClass: Property Canonical_Combining_Class. See UAX #15: https://www.unicode.org/reports/tr15/.
CaseIgnorable: Characters which are ignored for casing purposes.
CaseSensitive: Characters that are either the source of a case mapping or in the target of a case mapping.
Cased: Uppercase, lowercase, and titlecase characters.
ChangesWhenCasefolded: Characters whose normalized forms are not stable under case folding.
ChangesWhenCasemapped: Characters which may change when they undergo case mapping.
ChangesWhenLowercased: Characters whose normalized forms are not stable under a toLowercase mapping.
ChangesWhenNfkcCasefolded: Characters which are not identical to their NFKC_Casefold mapping.
ChangesWhenTitlecased: Characters whose normalized forms are not stable under a toTitlecase mapping.
ChangesWhenUppercased: Characters whose normalized forms are not stable under a toUppercase mapping.
Dash: Punctuation characters explicitly called out as dashes in the Unicode Standard, plus their compatibility equivalents.
DefaultIgnorableCodePoint: For programmatic determination of default ignorable code points.
Deprecated: Deprecated characters.
Diacritic: Characters that linguistically modify the meaning of another character to which they apply.
EastAsianWidth: Enumerated property East_Asian_Width.
Emoji: Characters that are emoji.
EmojiComponent: Characters used in emoji sequences that normally do not appear on emoji keyboards as separate choices, such as base characters for emoji keycaps.
EmojiModifier: Characters that are emoji modifiers.
EmojiModifierBase: Characters that can serve as a base for emoji modifiers.
EmojiPresentation: Characters that have emoji presentation by default.
ExtendedPictographic: Pictographic symbols, as well as reserved ranges in blocks largely associated with emoji characters.
Extender: Characters whose principal function is to extend the value of a preceding alphabetic character or to extend the shape of adjacent characters.
FullCompositionExclusion: Characters that are excluded from composition.
GeneralCategoryGroup: Groupings of multiple General_Category property values.
GeneralCategoryOutOfBoundsError: Error value for impl TryFrom<u8> for GeneralCategory.
Graph: Visible characters.
GraphemeBase: Property used together with the definition of Standard Korean Syllable Block to define “Grapheme base”.
GraphemeClusterBreak: Enumerated property Grapheme_Cluster_Break.
GraphemeExtend: Property used to define “Grapheme extender”.
GraphemeLink: Deprecated property.
HangulSyllableType: Enumerated property Hangul_Syllable_Type.
HexDigit: Characters commonly used for the representation of hexadecimal numbers, plus their compatibility equivalents.
Hyphen: Deprecated property.
IdCompatMathContinue: ID_Compat_Math_Continue Property
IdCompatMathStart: ID_Compat_Math_Start Property
IdContinue: Characters that can come after the first character in an identifier.
IdStart: Characters that can begin an identifier.
Ideographic: Characters considered to be CJKV (Chinese, Japanese, Korean, and Vietnamese) ideographs, or related siniform ideographs.
IdsBinaryOperator: Characters used in Ideographic Description Sequences.
IdsTrinaryOperator: Characters used in Ideographic Description Sequences.
IdsUnaryOperator: IDS_Unary_Operator Property
IndicConjunctBreak: Property Indic_Conjunct_Break. See UAX #44: https://www.unicode.org/reports/tr44/#Indic_Conjunct_Break.
IndicSyllabicCategory: Property Indic_Syllabic_Category. See UAX #44: https://www.unicode.org/reports/tr44/#Indic_Syllabic_Category.
JoinControl: Format control characters which have specific functions for control of cursive joining and ligation.
JoiningType: Enumerated property Joining_Type.
LineBreak: Enumerated property Line_Break.
LogicalOrderException: A small number of spacing vowel letters occurring in certain Southeast Asian scripts such as Thai and Lao.
Lowercase: Lowercase characters.
Math: Characters used in mathematical notation.
ModifierCombiningMark: Modifier_Combining_Mark Property
NfcInert: Characters that are inert under NFC, i.e., they do not interact with adjacent characters.
NfdInert: Characters that are inert under NFD, i.e., they do not interact with adjacent characters.
NfkcInert: Characters that are inert under NFKC, i.e., they do not interact with adjacent characters.
NfkdInert: Characters that are inert under NFKD, i.e., they do not interact with adjacent characters.
NoncharacterCodePoint: Code points permanently reserved for internal use.
PatternSyntax: Characters used as syntax in patterns (such as regular expressions).
PatternWhiteSpace: Characters used as whitespace in patterns (such as regular expressions).
PrependedConcatenationMark: A small class of visible format controls, which precede and then span a sequence of other characters, usually digits.
Print: Printable characters (visible characters and whitespace).
QuotationMark: Punctuation characters that function as quotation marks.
Radical: Characters used in the definition of Ideographic Description Sequences.
RegionalIndicator: Regional indicator characters, U+1F1E6..U+1F1FF.
Script: Enumerated property Script.
SegmentStarter: Characters that are starters in terms of Unicode normalization and combining character sequences.
SentenceBreak: Enumerated property Sentence_Break.
SentenceTerminal: Punctuation characters that generally mark the end of sentences.
SoftDotted: Characters with a “soft dot”, like i or j.
TerminalPunctuation: Punctuation characters that generally mark the end of textual units.
UnifiedIdeograph: A property which specifies the exact set of Unified CJK Ideographs in the standard.
Uppercase: Uppercase characters.
VariationSelector: Characters that are Variation Selectors.
VerticalOrientation: Property Vertical_Orientation.
WhiteSpace: Spaces, separator characters and other control characters which should be treated by programming languages as “white space” for the purpose of parsing elements.
WordBreak: Enumerated property Word_Break.
Xdigit: Hexadecimal digits.
XidContinue: Characters that can come after the first character in an identifier.
XidStart: Characters that can begin an identifier.

Enums

BidiPairedBracketType: The enum represents Bidi_Paired_Bracket_Type.
GeneralCategory: Enumerated property General_Category.

Traits

BinaryProperty: A binary Unicode character property.
EmojiSet: An Emoji set as defined by Unicode Technical Standard #51.
EnumeratedProperty: A Unicode character property that assigns a value to each code point.
NamedEnumeratedProperty: A property whose value names can be represented as strings.
ParseableEnumeratedProperty: A property whose value names can be parsed from strings.
