Companion

object Companion

Types


A family of character subsets representing the character scripts defined in Unicode Standard Annex #24: Script Names. Every Unicode character is assigned to a single Unicode script: either a specific script, such as Latin, or one of three special values, Inherited, Common, or Unknown.

Properties

const val BYTES: Int

The number of bytes used to represent a char value in unsigned binary form.


General category "Mc" in the Unicode specification.

const val CONNECTOR_PUNCTUATION: Byte = 23

General category "Pc" in the Unicode specification.

const val CONTROL: Byte = 15

General category "Cc" in the Unicode specification.

const val CURRENCY_SYMBOL: Byte = 26

General category "Sc" in the Unicode specification.

const val DASH_PUNCTUATION: Byte = 20

General category "Pd" in the Unicode specification.

const val DECIMAL_DIGIT_NUMBER: Byte = 9

General category "Nd" in the Unicode specification.


Weak bidirectional character type "AN" in the Unicode specification.


Weak bidirectional character type "BN" in the Unicode specification.


Weak bidirectional character type "CS" in the Unicode specification.


Weak bidirectional character type "EN" in the Unicode specification.


Weak bidirectional character type "ES" in the Unicode specification.

Weak bidirectional character type "ET" in the Unicode specification.


Weak bidirectional character type "FSI" in the Unicode specification.


Strong bidirectional character type "L" in the Unicode specification.


Strong bidirectional character type "LRE" in the Unicode specification.


Weak bidirectional character type "LRI" in the Unicode specification.


Strong bidirectional character type "LRO" in the Unicode specification.


Weak bidirectional character type "NSM" in the Unicode specification.


Neutral bidirectional character type "ON" in the Unicode specification.


Neutral bidirectional character type "B" in the Unicode specification.


Weak bidirectional character type "PDF" in the Unicode specification.


Weak bidirectional character type "PDI" in the Unicode specification.


Strong bidirectional character type "R" in the Unicode specification.


Strong bidirectional character type "AL" in the Unicode specification.


Strong bidirectional character type "RLE" in the Unicode specification.


Weak bidirectional character type "RLI" in the Unicode specification.


Strong bidirectional character type "RLO" in the Unicode specification.


Neutral bidirectional character type "S" in the Unicode specification.


Undefined bidirectional character type. Undefined char values have undefined directionality in the Unicode specification.


Neutral bidirectional character type "WS" in the Unicode specification.

const val ENCLOSING_MARK: Byte = 7

General category "Me" in the Unicode specification.

const val END_PUNCTUATION: Byte = 22

General category "Pe" in the Unicode specification.

const val ERROR: Int

Error flag. Use int (code point) to avoid confusion with U+FFFF.


General category "Pf" in the Unicode specification.

const val FORMAT: Byte = 16

General category "Cf" in the Unicode specification.


General category "Pi" in the Unicode specification.

const val LETTER_NUMBER: Byte = 10

General category "Nl" in the Unicode specification.

const val LINE_SEPARATOR: Byte = 13

General category "Zl" in the Unicode specification.

const val LOWERCASE_LETTER: Byte = 2

General category "Ll" in the Unicode specification.

const val MATH_SYMBOL: Byte = 25

General category "Sm" in the Unicode specification.

const val MAX_CODE_POINT: Int
const val MAX_HIGH_SURROGATE: Char = '\uDBFF'

The maximum value of a Unicode high-surrogate code unit in the UTF-16 encoding, constant '\uDBFF'. A high-surrogate is also known as a leading-surrogate.

const val MAX_LOW_SURROGATE: Char = '\uDFFF'

The maximum value of a Unicode low-surrogate code unit in the UTF-16 encoding, constant '\uDFFF'. A low-surrogate is also known as a trailing-surrogate.

const val MAX_RADIX: Int = 36

The maximum radix available for conversion to and from strings. The constant value of this field is the largest value permitted for the radix argument in radix-conversion methods such as the digit method, the forDigit method, and the toString method of class Integer.

const val MAX_SURROGATE: Char

The maximum value of a Unicode surrogate code unit in the UTF-16 encoding, constant '\uDFFF'.

const val MAX_VALUE: Char = '\uFFFF'

The constant value of this field is the largest value of type char, '\uFFFF'.

const val MIN_CODE_POINT: Int = 0
const val MIN_HIGH_SURROGATE: Char = '\uD800'

The minimum value of a Unicode high-surrogate code unit in the UTF-16 encoding, constant '\uD800'. A high-surrogate is also known as a leading-surrogate.

const val MIN_LOW_SURROGATE: Char = '\uDC00'

The minimum value of a Unicode low-surrogate code unit in the UTF-16 encoding, constant '\uDC00'. A low-surrogate is also known as a trailing-surrogate.

const val MIN_RADIX: Int = 2

The minimum radix available for conversion to and from strings. The constant value of this field is the smallest value permitted for the radix argument in radix-conversion methods such as the digit method, the forDigit method, and the toString method of class Integer.
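As a rough illustration of the 2 to 36 radix range, the sketch below uses Kotlin standard-library conversions (Int.toString(radix) and String.toInt(radix)) rather than this object's own digit and forDigit methods, whose Kotlin signatures are not shown on this page. The helper names roundTrip and toBase36 are hypothetical.

```kotlin
// Hypothetical helpers, not part of Companion. They rely only on the Kotlin
// standard library to show what the documented radix bounds mean in practice.

// Round-trips a value through its string form in the given radix.
fun roundTrip(value: Int, radix: Int): Int {
    // Mirrors the documented bounds: MIN_RADIX = 2, MAX_RADIX = 36.
    require(radix in 2..36) { "radix must be in 2..36, was $radix" }
    return value.toString(radix).toInt(radix)
}

// Renders a value in the largest supported radix, where the digits run
// 0-9 and then a-z (10 + 26 = 36 distinct digits).
fun toBase36(value: Int): String = value.toString(36)
```

Radix 36 is the natural upper bound because it is the largest base whose digits can all be written with the ten decimal digits plus the 26 Latin letters.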

const val MIN_SURROGATE: Char

The minimum value of a Unicode surrogate code unit in the UTF-16 encoding, constant '\uD800'.

const val MIN_VALUE: Char = '\u0000'

The constant value of this field is the smallest value of type char, '\u0000'.

const val MODIFIER_LETTER: Byte = 4

General category "Lm" in the Unicode specification.

const val MODIFIER_SYMBOL: Byte = 27

General category "Sk" in the Unicode specification.

const val NON_SPACING_MARK: Byte = 6

General category "Mn" in the Unicode specification.

const val OTHER_LETTER: Byte = 5

General category "Lo" in the Unicode specification.

const val OTHER_NUMBER: Byte = 11

General category "No" in the Unicode specification.

const val OTHER_PUNCTUATION: Byte = 24

General category "Po" in the Unicode specification.

const val OTHER_SYMBOL: Byte = 28

General category "So" in the Unicode specification.

const val PARAGRAPH_SEPARATOR: Byte = 14

General category "Zp" in the Unicode specification.

const val PRIVATE_USE: Byte = 18

General category "Co" in the Unicode specification.

const val SIZE: Int = 16
const val SPACE_SEPARATOR: Byte = 12

General category "Zs" in the Unicode specification.

const val START_PUNCTUATION: Byte = 21

General category "Ps" in the Unicode specification.

const val SURROGATE: Byte = 19

General category "Cs" in the Unicode specification.

const val TITLECASE_LETTER: Byte = 3

General category "Lt" in the Unicode specification.

const val UNASSIGNED: Byte = 0

General category "Cn" in the Unicode specification.

const val UPPERCASE_LETTER: Byte = 1

General category "Lu" in the Unicode specification.

Functions

fun charCount(codePoint: Int): Int
fun codePointAt(seq: CharSequence, index: Int): Int

Returns the code point at the given index of the CharSequence. If the char value at the given index in the CharSequence is in the high-surrogate range, the following index is less than the length of the CharSequence, and the char value at the following index is in the low-surrogate range, then the supplementary code point corresponding to this surrogate pair is returned. Otherwise, the char value at the given index is returned.

fun codePointAt(a: CharArray, index: Int, limit: Int): Int

Returns the code point at the given index of the char array, where only array elements with index less than limit can be used. If the char value at the given index in the char array is in the high-surrogate range, the following index is less than the limit, and the char value at the following index is in the low-surrogate range, then the supplementary code point corresponding to this surrogate pair is returned. Otherwise, the char value at the given index is returned.
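The decoding rule described above can be sketched in plain Kotlin as follows. This is an illustration of the documented behaviour under stated assumptions, not the library's source: the name codePointAtSketch is hypothetical, and the choice to reject bad indices with require (which throws IllegalArgumentException) is an assumption of this sketch.

```kotlin
// Hypothetical re-implementation of the documented rule for illustration only.
fun codePointAtSketch(a: CharArray, index: Int, limit: Int): Int {
    // Assumption of this sketch: invalid arguments are rejected up front.
    require(limit in 0..a.size && index in 0 until limit) {
        "index=$index, limit=$limit, size=${a.size}"
    }
    val high = a[index]
    // A pair is decoded only if the next index is still below the limit.
    if (high in '\uD800'..'\uDBFF' && index + 1 < limit) {
        val low = a[index + 1]
        if (low in '\uDC00'..'\uDFFF') {
            // Each surrogate carries 10 bits of the offset from U+10000.
            return ((high.code - 0xD800) shl 10) + (low.code - 0xDC00) + 0x10000
        }
    }
    // Not a complete surrogate pair: return the char value itself.
    return high.code
}
```

For the array "a\uD83D\uDE00".toCharArray(), index 1 with limit 3 yields the supplementary code point 0x1F600, while the same index with limit 2 yields the lone high surrogate 0xD83D because the low surrogate lies outside the usable range.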

fun codePointAtImpl(a: CharArray, index: Int, limit: Int): Int
fun codePointCount(a: CharArray, offset: Int, count: Int): Int

Returns the number of Unicode code points in a subarray of the char array argument.
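One way to picture the count is to walk the subarray and treat each well-formed surrogate pair as a single code point, with unpaired surrogates counting as one each. The sketch below assumes that convention (it matches java.lang.Character, which this API appears to mirror); the name codePointCountSketch and the require-based argument check are hypothetical.

```kotlin
// Hypothetical illustration, not the library's implementation.
fun codePointCountSketch(a: CharArray, offset: Int, count: Int): Int {
    // Assumption of this sketch: the subarray must lie inside the array.
    require(offset >= 0 && count >= 0 && offset <= a.size - count) {
        "offset=$offset, count=$count, size=${a.size}"
    }
    val end = offset + count
    var i = offset
    var n = 0
    while (i < end) {
        // A high surrogate followed by a low surrogate inside the subarray
        // forms one code point that occupies two chars.
        val isPair = a[i] in '\uD800'..'\uDBFF' &&
            i + 1 < end &&
            a[i + 1] in '\uDC00'..'\uDFFF'
        i += if (isPair) 2 else 1
        n++
    }
    return n
}
```

With "a\uD83D\uDE00b".toCharArray(), counting all four chars gives 3 code points, whereas counting only the first two chars gives 2, since the high surrogate is then cut off from its partner.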

fun compare(x: Char, y: Char): Int

Compares two char values numerically. The value returned is identical to what would be returned by:

fun getNumericValue(codePoint: Int): Int

Returns the numeric value of the specified character (Unicode code point).

fun getType(codePoint: Int): Int

Returns a value indicating a character's general category.

fun highSurrogate(codePoint: Int): Char

Returns the leading surrogate (a high surrogate code unit) of the surrogate pair representing the specified supplementary character (Unicode code point) in the UTF-16 encoding. If the specified character is not a supplementary character, an unspecified char is returned.

fun isBmpCodePoint(codePoint: Int): Boolean

Determines whether the specified character (Unicode code point) is in the Basic Multilingual Plane (BMP). Such code points can be represented using a single char.

fun isDigit(codePoint: Int): Boolean

Determines if the specified character (Unicode code point) is a digit.


Determines if the specified character (Unicode code point) is an Extended Pictographic.


Determines if the given char value is a Unicode high-surrogate code unit (also known as leading-surrogate code unit).

fun isLetter(codePoint: Int): Boolean

Determines if the specified character (Unicode code point) is a letter.

fun isLowerCase(codePoint: Int): Boolean

Determines if the specified character (Unicode code point) is a lowercase character.


Determines if the given char value is a Unicode low-surrogate code unit (also known as trailing-surrogate code unit).


Determines whether the specified character (Unicode code point) is in the supplementary character range.

fun isValidCodePoint(codePoint: Int): Boolean

Determines whether the specified code point is a valid Unicode code point value.
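The three range predicates on this page (valid, BMP, supplementary) partition the Int space as sketched below. The function names are hypothetical stand-ins, and the upper bound 0x10FFFF comes from the Unicode code space rather than from this page, which does not show the value of MAX_CODE_POINT.

```kotlin
// Hypothetical stand-ins for the documented predicates, for illustration only.

// Valid code points span U+0000 (MIN_CODE_POINT = 0) to U+10FFFF.
fun isValidCodePointSketch(cp: Int): Boolean = cp in 0x0000..0x10FFFF

// BMP code points fit in 16 bits, so a single char can hold them.
fun isBmpCodePointSketch(cp: Int): Boolean = cp in 0x0000..0xFFFF

// Supplementary code points need a surrogate pair in UTF-16.
fun isSupplementaryCodePointSketch(cp: Int): Boolean = cp in 0x10000..0x10FFFF
```

Every valid code point is either a BMP or a supplementary code point, never both; negative values and values above 0x10FFFF fail all three checks.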

fun isWhitespace(codePoint: Int): Boolean

Determines if the specified character (Unicode code point) is white space according to Java.

fun lowSurrogate(codePoint: Int): Char

Returns the trailing surrogate (a low surrogate code unit) of the surrogate pair representing the specified supplementary character (Unicode code point) in the UTF-16 encoding. If the specified character is not a supplementary character, an unspecified char is returned.

fun offsetByCodePoints(a: CharArray, start: Int, count: Int, index: Int, codePointOffset: Int): Int

Returns the index within the given char subarray that is offset from the given index by codePointOffset code points.

fun toChars(codePoint: Int, dst: CharArray, dstIndex: Int): Int

Converts the specified character (Unicode code point) to its UTF-16 representation. If the specified code point is a BMP (Basic Multilingual Plane or Plane 0) value, the same value is stored in dst[dstIndex], and 1 is returned. If the specified code point is a supplementary character, its surrogate values are stored in dst[dstIndex] (high-surrogate) and dst[dstIndex+1] (low-surrogate), and 2 is returned.
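The encoding step can be sketched as below. This is a minimal illustration of the documented contract, not the library's code: the name toCharsSketch is hypothetical, and rejecting invalid code points with require is an assumption of the sketch.

```kotlin
// Hypothetical illustration of the documented contract.
fun toCharsSketch(codePoint: Int, dst: CharArray, dstIndex: Int): Int {
    // Assumption of this sketch: invalid code points are rejected.
    require(codePoint in 0..0x10FFFF) { "invalid code point: $codePoint" }
    if (codePoint < 0x10000) {
        // BMP value: stored unchanged in a single char.
        dst[dstIndex] = codePoint.toChar()
        return 1
    }
    // Supplementary value: split the 20-bit offset from U+10000 into
    // a high surrogate (top 10 bits) and a low surrogate (bottom 10 bits).
    val offset = codePoint - 0x10000
    dst[dstIndex] = (0xD800 + (offset ushr 10)).toChar()
    dst[dstIndex + 1] = (0xDC00 + (offset and 0x3FF)).toChar()
    return 2
}
```

Encoding 0x1F600 into a two-element array writes '\uD83D' followed by '\uDE00' and returns 2, so the caller can advance dstIndex by the returned value when filling a buffer.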

fun toCodePoint(high: Char, low: Char): Int

Converts the specified surrogate pair to its supplementary code point value. This method does not validate the specified surrogate pair; the caller must validate it with isSurrogatePair if necessary.
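Because the conversion itself is unvalidated, a caller typically pairs it with a check. The sketch below shows that pattern with hypothetical stand-in functions (isSurrogatePairSketch, toCodePointSketch, toCodePointOrNull); the null-returning wrapper is an assumption of this example and is not part of the documented API.

```kotlin
// Hypothetical stand-ins for illustration; not the library's implementation.

// True when high is in the high-surrogate range and low is in the low-surrogate range.
fun isSurrogatePairSketch(high: Char, low: Char): Boolean =
    high in '\uD800'..'\uDBFF' && low in '\uDC00'..'\uDFFF'

// Pure arithmetic, no validation: the result is meaningless for non-pairs.
fun toCodePointSketch(high: Char, low: Char): Int =
    ((high.code - 0xD800) shl 10) + (low.code - 0xDC00) + 0x10000

// Validating wrapper (an assumption of this sketch): null for invalid input.
fun toCodePointOrNull(high: Char, low: Char): Int? =
    if (isSurrogatePairSketch(high, low)) toCodePointSketch(high, low) else null
```

Keeping validation separate lets hot loops that have already classified their input skip the redundant range checks.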

fun toLowerCase(codePoint: Int): Int

Converts the character (Unicode code point) argument to lowercase using case mapping information from the UnicodeData file.

fun toSurrogates(codePoint: Int, dst: CharArray, index: Int)
fun toUpperCase(codePoint: Int): Int

Converts the character (Unicode code point) argument to uppercase using case mapping information from the UnicodeData file.