Companion
Types
A family of character subsets representing the character scripts defined in the Unicode Standard Annex #24: Script Names. Every Unicode character is assigned to a single Unicode script, either a specific script, such as Latin, or one of the following three special values, Inherited, Common or Unknown.
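As a brief illustration, the sketch below looks up the script of a few code points. It assumes this type corresponds to java.lang.Character.UnicodeScript and uses its of(int) lookup; the class name ScriptDemo and the helper scriptOf are hypothetical names introduced only for this example.

```java
// Illustrative sketch only: ScriptDemo and scriptOf are not part of the documented API.
class ScriptDemo {
    // Returns the Unicode script assigned to the given code point.
    static Character.UnicodeScript scriptOf(int codePoint) {
        return Character.UnicodeScript.of(codePoint);
    }
}
```

Under these assumptions, a Latin letter such as 'A' maps to LATIN, a digit such as '1' (shared across scripts) maps to COMMON, and a combining mark such as U+0301 maps to INHERITED.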
Properties
General category "Mc" in the Unicode specification.
General category "Pc" in the Unicode specification.
General category "Sc" in the Unicode specification.
General category "Pd" in the Unicode specification.
General category "Nd" in the Unicode specification.
Weak bidirectional character type "AN" in the Unicode specification.
Weak bidirectional character type "BN" in the Unicode specification.
Weak bidirectional character type "CS" in the Unicode specification.
Weak bidirectional character type "EN" in the Unicode specification.
Weak bidirectional character type "ES" in the Unicode specification.
Weak bidirectional character type "ET" in the Unicode specification.
Weak bidirectional character type "FSI" in the Unicode specification.
Strong bidirectional character type "L" in the Unicode specification.
Strong bidirectional character type "LRE" in the Unicode specification.
Weak bidirectional character type "LRI" in the Unicode specification.
Strong bidirectional character type "LRO" in the Unicode specification.
Weak bidirectional character type "NSM" in the Unicode specification.
Neutral bidirectional character type "ON" in the Unicode specification.
Neutral bidirectional character type "B" in the Unicode specification.
Weak bidirectional character type "PDF" in the Unicode specification.
Weak bidirectional character type "PDI" in the Unicode specification.
Strong bidirectional character type "R" in the Unicode specification.
Strong bidirectional character type "AL" in the Unicode specification.
Strong bidirectional character type "RLE" in the Unicode specification.
Weak bidirectional character type "RLI" in the Unicode specification.
Strong bidirectional character type "RLO" in the Unicode specification.
Neutral bidirectional character type "S" in the Unicode specification.
Undefined bidirectional character type. Undefined char values have undefined directionality in the Unicode specification.
Neutral bidirectional character type "WS" in the Unicode specification.
General category "Me" in the Unicode specification.
General category "Pe" in the Unicode specification.
General category "Pf" in the Unicode specification.
General category "Pi" in the Unicode specification.
General category "Nl" in the Unicode specification.
General category "Zl" in the Unicode specification.
General category "Ll" in the Unicode specification.
General category "Sm" in the Unicode specification.
The maximum value of a Unicode high-surrogate code unit in the UTF-16 encoding, constant '\uDBFF'. A high-surrogate is also known as a leading-surrogate.
The maximum value of a Unicode low-surrogate code unit in the UTF-16 encoding, constant '\uDFFF'. A low-surrogate is also known as a trailing-surrogate.
The maximum value of a Unicode surrogate code unit in the UTF-16 encoding, constant '\uDFFF'.
The minimum value of a Unicode high-surrogate code unit in the UTF-16 encoding, constant '\uD800'. A high-surrogate is also known as a leading-surrogate.
The minimum value of a Unicode low-surrogate code unit in the UTF-16 encoding, constant '\uDC00'. A low-surrogate is also known as a trailing-surrogate.
The minimum value of a Unicode surrogate code unit in the UTF-16 encoding, constant '\uD800'.
General category "Lm" in the Unicode specification.
General category "Sk" in the Unicode specification.
General category "Mn" in the Unicode specification.
General category "Lo" in the Unicode specification.
General category "No" in the Unicode specification.
General category "Po" in the Unicode specification.
General category "So" in the Unicode specification.
General category "Zp" in the Unicode specification.
General category "Co" in the Unicode specification.
General category "Zs" in the Unicode specification.
General category "Ps" in the Unicode specification.
General category "Lt" in the Unicode specification.
General category "Cn" in the Unicode specification.
General category "Lu" in the Unicode specification.
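The constants above are typically compared against the results of classification calls. The following is a minimal sketch, assuming the constants are those of java.lang.Character (UPPERCASE_LETTER, DIRECTIONALITY_RIGHT_TO_LEFT, DIRECTIONALITY_RIGHT_TO_LEFT_ARABIC, MIN_SURROGATE, MAX_SURROGATE) and that getType and getDirectionality are available; the class CategoryDemo and its helper names are hypothetical.

```java
// Illustrative sketch only: CategoryDemo and its helpers are not part of the documented API.
class CategoryDemo {
    // True if the code point's general category is "Lu".
    static boolean isUppercaseLetter(int codePoint) {
        return Character.getType(codePoint) == Character.UPPERCASE_LETTER;
    }

    // True if the code point has one of the strong right-to-left types "R" or "AL".
    static boolean isStrongRightToLeft(int codePoint) {
        byte d = Character.getDirectionality(codePoint);
        return d == Character.DIRECTIONALITY_RIGHT_TO_LEFT
            || d == Character.DIRECTIONALITY_RIGHT_TO_LEFT_ARABIC;
    }

    // True if the char lies anywhere in the UTF-16 surrogate range.
    static boolean inSurrogateRange(char c) {
        return c >= Character.MIN_SURROGATE && c <= Character.MAX_SURROGATE;
    }
}
```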
Functions
Returns the code point at the given index of the CharSequence. If the char value at the given index in the CharSequence is in the high-surrogate range, the following index is less than the length of the CharSequence, and the char value at the following index is in the low-surrogate range, then the supplementary code point corresponding to this surrogate pair is returned. Otherwise, the char value at the given index is returned.
Returns the code point at the given index of the char array, where only array elements with index less than limit can be used. If the char value at the given index in the char array is in the high-surrogate range, the following index is less than the limit, and the char value at the following index is in the low-surrogate range, then the supplementary code point corresponding to this surrogate pair is returned. Otherwise, the char value at the given index is returned.
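The surrogate handling described in the two entries above can be sketched as follows, assuming they correspond to Character.codePointAt(CharSequence, int) and Character.codePointAt(char[], int, int). The class CodePointAtDemo, the sample text and the wrapper names are hypothetical.

```java
// Illustrative sketch only: CodePointAtDemo is not part of the documented API.
class CodePointAtDemo {
    // "a", then U+1F600 encoded as the surrogate pair U+D83D U+DE00, then "b".
    static final String TEXT = "a\uD83D\uDE00b";

    // Code point starting at index i of a CharSequence.
    static int at(CharSequence s, int i) {
        return Character.codePointAt(s, i);
    }

    // Code point starting at index i, using only elements below limit.
    static int atLimited(char[] a, int i, int limit) {
        return Character.codePointAt(a, i, limit);
    }
}
```

With this sample text, index 1 yields the supplementary code point 0x1F600, while index 2 lands on the low surrogate and yields 0xDE00 on its own. In the array form, a limit of 2 hides the low surrogate, so index 1 yields only the high surrogate 0xD83D.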
Returns the number of Unicode code points in a subarray of the char array argument.
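A short sketch of counting code points in a subarray, assuming this entry corresponds to Character.codePointCount(char[], int, int) with an offset and a count. The class name CountDemo and its helper are hypothetical.

```java
// Illustrative sketch only: CountDemo is not part of the documented API.
class CountDemo {
    // Counts code points in a[offset .. offset + count).
    static int count(char[] a, int offset, int count) {
        return Character.codePointCount(a, offset, count);
    }
}
```

For the four chars of "a", U+D83D, U+DE00, "b", the whole array holds three code points. A subarray that cuts a pair in half still counts the unpaired surrogate as one code point.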
Returns the numeric value of the specified character (Unicode code point).
Returns the leading surrogate (a high surrogate code unit) of the surrogate pair representing the specified supplementary character (Unicode code point) in the UTF-16 encoding. If the specified character is not a supplementary character, an unspecified char is returned.
Determines whether the specified character (Unicode code point) is in the Basic Multilingual Plane (BMP). Such code points can be represented using a single char.
Determines if the specified character (Unicode code point) is an Extended Pictographic.
Determines if the given char value is a Unicode high-surrogate code unit (also known as leading-surrogate code unit).
Determines if the specified character (Unicode code point) is a lowercase character.
Determines if the given char value is a Unicode low-surrogate code unit (also known as trailing-surrogate code unit).
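The high-surrogate and low-surrogate checks are often combined to verify that a char sequence is well-formed UTF-16. The sketch below assumes the two checks correspond to Character.isHighSurrogate(char) and Character.isLowSurrogate(char); the class SurrogateScan and the method isWellFormed are hypothetical.

```java
// Illustrative sketch only: SurrogateScan is not part of the documented API.
class SurrogateScan {
    // Returns true if every high surrogate is immediately followed by a low
    // surrogate and no low surrogate appears on its own.
    static boolean isWellFormed(CharSequence s) {
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            if (Character.isHighSurrogate(c)) {
                if (i + 1 >= s.length() || !Character.isLowSurrogate(s.charAt(i + 1))) {
                    return false; // high surrogate without a partner
                }
                i += 2; // skip the complete pair
            } else if (Character.isLowSurrogate(c)) {
                return false; // stray low surrogate
            } else {
                i++;
            }
        }
        return true;
    }
}
```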
Determines whether the specified character (Unicode code point) is in the supplementary character range.
Determines whether the specified code point is a valid Unicode code point value.
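The range checks can be combined to decide how many UTF-16 code units a value needs. This is a minimal sketch, assuming the checks correspond to Character.isValidCodePoint(int) and Character.isBmpCodePoint(int); the class CodePointRanges and the method utf16Length are hypothetical, and returning 0 for invalid input is a choice made for this example.

```java
// Illustrative sketch only: CodePointRanges is not part of the documented API.
class CodePointRanges {
    // Number of UTF-16 code units needed for the value, or 0 if it is not a
    // valid Unicode code point (the 0 convention is an assumption of this sketch).
    static int utf16Length(int codePoint) {
        if (!Character.isValidCodePoint(codePoint)) {
            return 0;
        }
        // Every valid code point outside the BMP is supplementary and needs a pair.
        return Character.isBmpCodePoint(codePoint) ? 1 : 2;
    }
}
```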
Determines if the specified character (Unicode code point) is white space according to Java.
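"White space according to Java" differs from the Unicode space separators in one notable way: non-breaking spaces such as U+00A0 are excluded. The sketch below assumes this entry corresponds to Character.isWhitespace(int); the class WhitespaceDemo and the method collapse are hypothetical.

```java
// Illustrative sketch only: WhitespaceDemo is not part of the documented API.
class WhitespaceDemo {
    // Replaces each run of Java white space with a single space,
    // iterating by code point rather than by char.
    static String collapse(String s) {
        StringBuilder out = new StringBuilder(s.length());
        boolean inRun = false;
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            i += Character.charCount(cp);
            if (Character.isWhitespace(cp)) {
                if (!inRun) {
                    out.append(' ');
                }
                inRun = true;
            } else {
                out.appendCodePoint(cp);
                inRun = false;
            }
        }
        return out.toString();
    }
}
```

A tab-and-newline run between two letters collapses to one space, whereas a no-break space (U+00A0) is left untouched because it is not white space by this definition.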
Returns the trailing surrogate (a low surrogate code unit) of the surrogate pair representing the specified supplementary character (Unicode code point) in the UTF-16 encoding. If the specified character is not a supplementary character, an unspecified char is returned.
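Because the leading and trailing surrogate lookups return an unspecified char for non-supplementary input, callers may want to guard them. A minimal sketch, assuming the two entries correspond to Character.highSurrogate(int) and Character.lowSurrogate(int); the class SurrogateSplit and the method split are hypothetical.

```java
// Illustrative sketch only: SurrogateSplit is not part of the documented API.
class SurrogateSplit {
    // Splits a supplementary code point into { high surrogate, low surrogate }.
    // Rejecting other input is a choice of this sketch, made because the
    // underlying lookups return an unspecified char in that case.
    static char[] split(int codePoint) {
        if (!Character.isSupplementaryCodePoint(codePoint)) {
            throw new IllegalArgumentException("not a supplementary code point");
        }
        return new char[] {
            Character.highSurrogate(codePoint),
            Character.lowSurrogate(codePoint)
        };
    }
}
```

For U+1F600 this yields 0xD83D followed by 0xDE00.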
Converts the specified character (Unicode code point) to its UTF-16 representation. If the specified code point is a BMP (Basic Multilingual Plane or Plane 0) value, the same value is stored in dst[dstIndex], and 1 is returned. If the specified code point is a supplementary character, its surrogate values are stored in dst[dstIndex] (high-surrogate) and dst[dstIndex+1] (low-surrogate), and 2 is returned.
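The returned count (1 or 2) makes it convenient to encode several code points into one buffer. The sketch below assumes this entry corresponds to Character.toChars(int, char[], int); the class ToCharsDemo and the method encode are hypothetical.

```java
// Illustrative sketch only: ToCharsDemo is not part of the documented API.
class ToCharsDemo {
    // Encodes the code points as UTF-16 and returns exactly the chars written.
    static char[] encode(int... codePoints) {
        // Worst case: every code point is supplementary and needs two chars.
        char[] dst = new char[codePoints.length * 2];
        int written = 0;
        for (int cp : codePoints) {
            // toChars stores 1 or 2 chars at dst[written] and returns that count.
            written += Character.toChars(cp, dst, written);
        }
        return java.util.Arrays.copyOf(dst, written);
    }
}
```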
Converts the specified surrogate pair to its supplementary code point value. This method does not validate the specified surrogate pair. The caller must validate it using isSurrogatePair if necessary.
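Since the conversion itself performs no validation, a caller can pair it with the surrogate-pair check. A minimal sketch, assuming the calls are Character.isSurrogatePair(char, char) and Character.toCodePoint(char, char); the class PairDecode, the method decode and the -1 sentinel are hypothetical choices for this example.

```java
// Illustrative sketch only: PairDecode is not part of the documented API.
class PairDecode {
    // Returns the supplementary code point for a valid surrogate pair,
    // or -1 if the two chars do not form one (sentinel chosen for this sketch).
    static int decode(char high, char low) {
        return Character.isSurrogatePair(high, low)
            ? Character.toCodePoint(high, low)
            : -1;
    }
}
```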
Converts the character (Unicode code point) argument to lowercase using case mapping information from the UnicodeData file.
Converts the character (Unicode code point) argument to uppercase using case mapping information from the UnicodeData file.
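Case conversion by code point, rather than by char, matters for cased letters outside the BMP. The sketch below assumes the two entries above correspond to Character.toLowerCase(int) and Character.toUpperCase(int); the class CaseDemo and the method upperByCodePoint are hypothetical.

```java
// Illustrative sketch only: CaseDemo is not part of the documented API.
class CaseDemo {
    // Uppercases a string one code point at a time, so that supplementary
    // characters are mapped as whole code points rather than as two chars.
    static String upperByCodePoint(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            out.appendCodePoint(Character.toUpperCase(cp));
            i += Character.charCount(cp);
        }
        return out.toString();
    }
}
```

These methods map one code point to exactly one code point, so a letter with no single-code-point uppercase form, such as U+00DF, is returned unchanged.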