Home » apache-harmony-6.0-src-r917296-snapshot » java » lang » [javadoc | source]
java.lang
public final class: Character
java.lang.Object
   java.lang.Character

All Implemented Interfaces:
    Comparable, Serializable

The wrapper for the primitive type {@code char}. This class also provides a number of utility methods for working with characters.

Character data is based upon the Unicode Standard, 4.0. The Unicode specification, character tables and other information are available at http://www.unicode.org/.

Unicode characters are referred to as code points. The range of valid code points is U+0000 to U+10FFFF. The Basic Multilingual Plane (BMP) is the code point range U+0000 to U+FFFF. Characters above the BMP are referred to as Supplementary Characters. On the Java platform, UTF-16 encoding and {@code char} pairs are used to represent code points in the supplementary range. A pair of {@code char} values that represents a supplementary character is made up of a high surrogate with a value range of 0xD800 to 0xDBFF and a low surrogate with a value range of 0xDC00 to 0xDFFF.

On the Java platform a {@code char} value represents either a single BMP code point or a UTF-16 unit that's part of a surrogate pair. The {@code int} type is used to represent all Unicode code points.
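
The surrogate ranges above follow from the UTF-16 arithmetic. The following is a minimal sketch of that arithmetic, compared against {@code Character.toChars}; the class name {@code SurrogatePairSketch}, the helper {@code encode} and the sample code point are chosen for illustration only and are not part of this class.

```java
class SurrogatePairSketch {
    // Splits a supplementary code point (0x10000..0x10FFFF) into a
    // high and a low surrogate by hand. No range check is performed.
    static char[] encode(int codePoint) {
        int offset = codePoint - 0x10000;               // 20-bit value
        char high = (char) (0xD800 + (offset >> 10));   // upper 10 bits
        char low = (char) (0xDC00 + (offset & 0x3FF));  // lower 10 bits
        return new char[] { high, low };
    }

    public static void main(String[] args) {
        int codePoint = 0x1D11E; // sample supplementary code point
        char[] manual = encode(codePoint);
        char[] library = Character.toChars(codePoint);
        System.out.printf("manual:  %04X %04X%n", (int) manual[0], (int) manual[1]);
        System.out.printf("library: %04X %04X%n", (int) library[0], (int) library[1]);
    }
}
```

Both lines should show the pair D834 DD1E for this code point.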

Nested Class Summary:
public static class  Character.Subset   
public static final class  Character.UnicodeBlock  Represents a block of Unicode characters, as defined by the Unicode 4.0.1 specification. 
static class  Character.valueOfCache   
Field Summary
public static final  char MIN_VALUE    The minimum {@code Character} value. 
public static final  char MAX_VALUE    The maximum {@code Character} value. 
public static final  int MIN_RADIX    The minimum radix used for conversions between characters and integers. 
public static final  int MAX_RADIX    The maximum radix used for conversions between characters and integers. 
public static final  Class<Character> TYPE    The Class object that represents the primitive type {@code char}. 
public static final  byte UNASSIGNED    Unicode category constant Cn. 
public static final  byte UPPERCASE_LETTER    Unicode category constant Lu. 
public static final  byte LOWERCASE_LETTER    Unicode category constant Ll. 
public static final  byte TITLECASE_LETTER    Unicode category constant Lt. 
public static final  byte MODIFIER_LETTER    Unicode category constant Lm. 
public static final  byte OTHER_LETTER    Unicode category constant Lo. 
public static final  byte NON_SPACING_MARK    Unicode category constant Mn. 
public static final  byte ENCLOSING_MARK    Unicode category constant Me. 
public static final  byte COMBINING_SPACING_MARK    Unicode category constant Mc. 
public static final  byte DECIMAL_DIGIT_NUMBER    Unicode category constant Nd. 
public static final  byte LETTER_NUMBER    Unicode category constant Nl. 
public static final  byte OTHER_NUMBER    Unicode category constant No. 
public static final  byte SPACE_SEPARATOR    Unicode category constant Zs. 
public static final  byte LINE_SEPARATOR    Unicode category constant Zl. 
public static final  byte PARAGRAPH_SEPARATOR    Unicode category constant Zp. 
public static final  byte CONTROL    Unicode category constant Cc. 
public static final  byte FORMAT    Unicode category constant Cf. 
public static final  byte PRIVATE_USE    Unicode category constant Co. 
public static final  byte SURROGATE    Unicode category constant Cs. 
public static final  byte DASH_PUNCTUATION    Unicode category constant Pd. 
public static final  byte START_PUNCTUATION    Unicode category constant Ps. 
public static final  byte END_PUNCTUATION    Unicode category constant Pe. 
public static final  byte CONNECTOR_PUNCTUATION    Unicode category constant Pc. 
public static final  byte OTHER_PUNCTUATION    Unicode category constant Po. 
public static final  byte MATH_SYMBOL    Unicode category constant Sm. 
public static final  byte CURRENCY_SYMBOL    Unicode category constant Sc. 
public static final  byte MODIFIER_SYMBOL    Unicode category constant Sk. 
public static final  byte OTHER_SYMBOL    Unicode category constant So. 
public static final  byte INITIAL_QUOTE_PUNCTUATION    Unicode category constant Pi.
    since: 1.4
public static final  byte FINAL_QUOTE_PUNCTUATION    Unicode category constant Pf.
    since: 1.4
public static final  byte DIRECTIONALITY_UNDEFINED    Unicode bidirectional constant.
    since: 1.4
public static final  byte DIRECTIONALITY_LEFT_TO_RIGHT    Unicode bidirectional constant L.
    since: 1.4
public static final  byte DIRECTIONALITY_RIGHT_TO_LEFT    Unicode bidirectional constant R.
    since: 1.4
public static final  byte DIRECTIONALITY_RIGHT_TO_LEFT_ARABIC    Unicode bidirectional constant AL.
    since: 1.4
public static final  byte DIRECTIONALITY_EUROPEAN_NUMBER    Unicode bidirectional constant EN.
    since: 1.4
public static final  byte DIRECTIONALITY_EUROPEAN_NUMBER_SEPARATOR    Unicode bidirectional constant ES.
    since: 1.4
public static final  byte DIRECTIONALITY_EUROPEAN_NUMBER_TERMINATOR    Unicode bidirectional constant ET.
    since: 1.4
public static final  byte DIRECTIONALITY_ARABIC_NUMBER    Unicode bidirectional constant AN.
    since: 1.4
public static final  byte DIRECTIONALITY_COMMON_NUMBER_SEPARATOR    Unicode bidirectional constant CS.
    since: 1.4
public static final  byte DIRECTIONALITY_NONSPACING_MARK    Unicode bidirectional constant NSM.
    since: 1.4
public static final  byte DIRECTIONALITY_BOUNDARY_NEUTRAL    Unicode bidirectional constant BN.
    since: 1.4
public static final  byte DIRECTIONALITY_PARAGRAPH_SEPARATOR    Unicode bidirectional constant B.
    since: 1.4
public static final  byte DIRECTIONALITY_SEGMENT_SEPARATOR    Unicode bidirectional constant S.
    since: 1.4
public static final  byte DIRECTIONALITY_WHITESPACE    Unicode bidirectional constant WS.
    since: 1.4
public static final  byte DIRECTIONALITY_OTHER_NEUTRALS    Unicode bidirectional constant ON.
    since: 1.4
public static final  byte DIRECTIONALITY_LEFT_TO_RIGHT_EMBEDDING    Unicode bidirectional constant LRE.
    since: 1.4
public static final  byte DIRECTIONALITY_LEFT_TO_RIGHT_OVERRIDE    Unicode bidirectional constant LRO.
    since: 1.4
public static final  byte DIRECTIONALITY_RIGHT_TO_LEFT_EMBEDDING    Unicode bidirectional constant RLE.
    since: 1.4
public static final  byte DIRECTIONALITY_RIGHT_TO_LEFT_OVERRIDE    Unicode bidirectional constant RLO.
    since: 1.4
public static final  byte DIRECTIONALITY_POP_DIRECTIONAL_FORMAT    Unicode bidirectional constant PDF.
    since: 1.4
public static final  char MIN_HIGH_SURROGATE    The minimum value of a high surrogate or leading surrogate unit in UTF-16 encoding, {@code '\uD800'}.
    since: 1.5
public static final  char MAX_HIGH_SURROGATE    The maximum value of a high surrogate or leading surrogate unit in UTF-16 encoding, {@code '\uDBFF'}.
    since: 1.5
public static final  char MIN_LOW_SURROGATE    The minimum value of a low surrogate or trailing surrogate unit in UTF-16 encoding, {@code '\uDC00'}.
    since: 1.5
public static final  char MAX_LOW_SURROGATE    The maximum value of a low surrogate or trailing surrogate unit in UTF-16 encoding, {@code '\uDFFF'}.
    since: 1.5
public static final  char MIN_SURROGATE    The minimum value of a surrogate unit in UTF-16 encoding, {@code '\uD800'}.
    since: 1.5
public static final  char MAX_SURROGATE    The maximum value of a surrogate unit in UTF-16 encoding, {@code '\uDFFF'}.
    since: 1.5
public static final  int MIN_SUPPLEMENTARY_CODE_POINT    The minimum value of a supplementary code point, {@code U+010000}.
    since: 1.5
public static final  int MIN_CODE_POINT    The minimum code point value, {@code U+0000}.
    since: 1.5
public static final  int MAX_CODE_POINT    The maximum code point value, {@code U+10FFFF}.
    since: 1.5
public static final  int SIZE    The number of bits required to represent a {@code Character} value in unsigned form.
    since: 1.5
Constructor:
 public Character(char value) 
Method from java.lang.Character Summary:
charCount,   charValue,   codePointAt,   codePointAt,   codePointAt,   codePointBefore,   codePointBefore,   codePointBefore,   codePointCount,   codePointCount,   compareTo,   digit,   digit,   equals,   forDigit,   getDirectionality,   getDirectionality,   getNumericValue,   getNumericValue,   getType,   getType,   hashCode,   isDefined,   isDefined,   isDigit,   isDigit,   isHighSurrogate,   isISOControl,   isISOControl,   isIdentifierIgnorable,   isIdentifierIgnorable,   isJavaIdentifierPart,   isJavaIdentifierPart,   isJavaIdentifierStart,   isJavaIdentifierStart,   isJavaLetter,   isJavaLetterOrDigit,   isLetter,   isLetter,   isLetterOrDigit,   isLetterOrDigit,   isLowSurrogate,   isLowerCase,   isLowerCase,   isMirrored,   isMirrored,   isSpace,   isSpaceChar,   isSpaceChar,   isSupplementaryCodePoint,   isSurrogatePair,   isTitleCase,   isTitleCase,   isUnicodeIdentifierPart,   isUnicodeIdentifierPart,   isUnicodeIdentifierStart,   isUnicodeIdentifierStart,   isUpperCase,   isUpperCase,   isValidCodePoint,   isWhitespace,   isWhitespace,   offsetByCodePoints,   offsetByCodePoints,   reverseBytes,   toChars,   toChars,   toCodePoint,   toLowerCase,   toLowerCase,   toString,   toString,   toTitleCase,   toTitleCase,   toUpperCase,   toUpperCase,   valueOf
Methods from java.lang.Object:
clone,   equals,   finalize,   getClass,   hashCode,   notify,   notifyAll,   toString,   wait,   wait,   wait
Method from java.lang.Character Detail:
 public static int charCount(int codePoint) 
    Calculates the number of {@code char} values required to represent the specified Unicode code point. This method checks if the {@code codePoint} is greater than or equal to {@code 0x10000}, in which case {@code 2} is returned, otherwise {@code 1}. To test if the code point is valid, use the {@code isValidCodePoint(int)} method.
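
    Because {@code charCount} does not validate its argument, a caller that sizes a buffer from code points may want to pair it with {@code isValidCodePoint}. A minimal sketch, in which the class {@code CharCountSketch} and the helper {@code utf16Length} are hypothetical names used only for this example:

```java
class CharCountSketch {
    // Returns the number of char units needed to hold all given code points.
    // Rejects invalid code points, since charCount itself does not check.
    static int utf16Length(int[] codePoints) {
        int length = 0;
        for (int cp : codePoints) {
            if (!Character.isValidCodePoint(cp)) {
                throw new IllegalArgumentException("Invalid code point: " + cp);
            }
            length += Character.charCount(cp);
        }
        return length;
    }

    public static void main(String[] args) {
        // 'A' and U+FFFF need one unit each, U+10000 needs two.
        System.out.println(utf16Length(new int[] { 'A', 0xFFFF, 0x10000 })); // prints 4
    }
}
```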
 public char charValue() 
    Gets the primitive value of this character.
 public static int codePointAt(CharSequence seq,
    int index) 
    Returns the code point at {@code index} in the specified sequence of character units. If the unit at {@code index} is a high-surrogate unit, {@code index + 1} is less than the length of the sequence and the unit at {@code index + 1} is a low-surrogate unit, then the supplementary code point represented by the pair is returned; otherwise the {@code char} value at {@code index} is returned.
 public static int codePointAt(char[] seq,
    int index) 
    Returns the code point at {@code index} in the specified array of character units. If the unit at {@code index} is a high-surrogate unit, {@code index + 1} is less than the length of the array and the unit at {@code index + 1} is a low-surrogate unit, then the supplementary code point represented by the pair is returned; otherwise the {@code char} value at {@code index} is returned.
 public static int codePointAt(char[] seq,
    int index,
    int limit) 
    Returns the code point at {@code index} in the specified array of character units, where {@code index} has to be less than {@code limit}. If the unit at {@code index} is a high-surrogate unit, {@code index + 1} is less than {@code limit} and the unit at {@code index + 1} is a low-surrogate unit, then the supplementary code point represented by the pair is returned; otherwise the {@code char} value at {@code index} is returned.
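
    A common use of {@code codePointAt} is walking a sequence one code point at a time, advancing by {@code charCount} so that a surrogate pair is consumed as a unit. A minimal sketch; the class {@code CodePointIterationSketch} and the method {@code codePoints} are hypothetical names for illustration:

```java
import java.util.ArrayList;
import java.util.List;

class CodePointIterationSketch {
    // Collects the code points of a sequence from left to right.
    // An unpaired surrogate is reported as its own char value.
    static List<Integer> codePoints(CharSequence seq) {
        List<Integer> result = new ArrayList<>();
        int i = 0;
        while (i < seq.length()) {
            int cp = Character.codePointAt(seq, i);
            result.add(cp);
            i += Character.charCount(cp); // 1 for BMP, 2 for a surrogate pair
        }
        return result;
    }

    public static void main(String[] args) {
        for (int cp : codePoints("a\uD834\uDD1Eb")) {
            System.out.printf("U+%04X%n", cp);
        }
    }
}
```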
 public static int codePointBefore(CharSequence seq,
    int index) 
    Returns the code point that precedes {@code index} in the specified sequence of character units. If the unit at {@code index - 1} is a low-surrogate unit, {@code index - 2} is not negative and the unit at {@code index - 2} is a high-surrogate unit, then the supplementary code point represented by the pair is returned; otherwise the {@code char} value at {@code index - 1} is returned.
 public static int codePointBefore(char[] seq,
    int index) 
    Returns the code point that precedes {@code index} in the specified array of character units. If the unit at {@code index - 1} is a low-surrogate unit, {@code index - 2} is not negative and the unit at {@code index - 2} is a high-surrogate unit, then the supplementary code point represented by the pair is returned; otherwise the {@code char} value at {@code index - 1} is returned.
 public static int codePointBefore(char[] seq,
    int index,
    int start) 
    Returns the code point that precedes {@code index} in the specified array of character units, without reading any unit at a position less than {@code start}. If the unit at {@code index - 1} is a low-surrogate unit, {@code index - 2} is not less than {@code start} and the unit at {@code index - 2} is a high-surrogate unit, then the supplementary code point represented by the pair is returned; otherwise the {@code char} value at {@code index - 1} is returned.
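
    {@code codePointBefore} supports walking a sequence from the end without splitting surrogate pairs. The sketch below reverses a sequence by code point; {@code ReverseCodePointSketch} and {@code reverseByCodePoint} are hypothetical names introduced for this example.

```java
class ReverseCodePointSketch {
    // Reverses a sequence code point by code point, so that a
    // surrogate pair keeps its high/low order in the result.
    static String reverseByCodePoint(CharSequence seq) {
        StringBuilder out = new StringBuilder(seq.length());
        int i = seq.length();
        while (i > 0) {
            int cp = Character.codePointBefore(seq, i);
            out.appendCodePoint(cp);
            i -= Character.charCount(cp);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String reversed = reverseByCodePoint("a\uD834\uDD1Eb");
        System.out.println(reversed.equals("b\uD834\uDD1Ea")); // prints true
    }
}
```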
 public static int codePointCount(CharSequence seq,
    int beginIndex,
    int endIndex) 
    Counts the number of Unicode code points in the subsequence of the specified character sequence, as delineated by {@code beginIndex} and {@code endIndex}. Any surrogate values with missing pair values will be counted as one code point.
 public static int codePointCount(char[] seq,
    int offset,
    int count) 
    Counts the number of Unicode code points in the subsequence of the specified char array, as delineated by {@code offset} and {@code count}. Any surrogate values with missing pair values will be counted as one code point.
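
    The difference between the number of {@code char} units and the number of code points can be seen in a short sketch. The class name {@code CodePointCountSketch} and the helper {@code count} are assumptions made for this example.

```java
class CodePointCountSketch {
    // Counts the code points in the whole string.
    static int count(String s) {
        return Character.codePointCount(s, 0, s.length());
    }

    public static void main(String[] args) {
        String s = "a\uD834\uDD1Eb";
        System.out.println(s.length()); // prints 4 (char units)
        System.out.println(count(s));   // prints 3 (code points)

        // The char[] overload takes an offset and a count of units:
        // units 1 and 2 form one surrogate pair.
        char[] units = s.toCharArray();
        System.out.println(Character.codePointCount(units, 1, 2)); // prints 1
    }
}
```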
 public int compareTo(Character c) 
    Compares this object to the specified character object to determine their relative order.
 public static int digit(char c,
    int radix) 
    Convenience method to determine the value of the specified character {@code c} in the supplied radix. The value of {@code radix} must be between MIN_RADIX and MAX_RADIX.
 public static int digit(int codePoint,
    int radix) 
    Convenience method to determine the value of the character {@code codePoint} in the supplied radix. The value of {@code radix} must be between MIN_RADIX and MAX_RADIX.
 public boolean equals(Object object) 
    Compares this object with the specified object and indicates if they are equal. In order to be equal, {@code object} must be an instance of {@code Character} and have the same char value as this object.
 public static char forDigit(int digit,
    int radix) 
    Returns the character which represents the specified digit in the specified radix. The {@code radix} must be between {@code MIN_RADIX} and {@code MAX_RADIX} inclusive; {@code digit} must be non-negative and smaller than {@code radix}. If any of these conditions does not hold, 0 is returned.
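
    {@code digit} and {@code forDigit} are inverse operations for valid inputs. The following sketch uses them to parse and format non-negative integers; {@code RadixSketch}, {@code parse} and {@code format} are hypothetical helpers, and overflow handling is deliberately left out. It relies on {@code digit} returning -1 for a character that is not a digit in the given radix.

```java
class RadixSketch {
    // Parses a non-negative number; does not guard against int overflow.
    static int parse(String text, int radix) {
        int value = 0;
        for (int i = 0; i < text.length(); i++) {
            int d = Character.digit(text.charAt(i), radix);
            if (d < 0) { // -1 signals "not a digit in this radix"
                throw new NumberFormatException("Bad digit at index " + i);
            }
            value = value * radix + d;
        }
        return value;
    }

    // Formats a non-negative number using lower case digits.
    static String format(int value, int radix) {
        if (value == 0) {
            return "0";
        }
        StringBuilder sb = new StringBuilder();
        while (value > 0) {
            sb.append(Character.forDigit(value % radix, radix));
            value /= radix;
        }
        return sb.reverse().toString();
    }

    public static void main(String[] args) {
        System.out.println(parse("ff", 16));  // prints 255
        System.out.println(format(255, 16));  // prints ff
        System.out.println((int) Character.forDigit(16, 16)); // prints 0
    }
}
```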
 public static byte getDirectionality(char c) 
    Gets the Unicode directionality of the specified character.
 public static byte getDirectionality(int codePoint) 
    Gets the Unicode directionality of the specified character.
 public static int getNumericValue(char c) 
    Gets the numeric value of the specified Unicode character.
 public static int getNumericValue(int codePoint) 
    Gets the numeric value of the specified Unicode code point. For example, the code point '\u216B' stands for the Roman numeral XII, which has the numeric value 12.
 public static int getType(char c) 
    Gets the general Unicode category of the specified character.
 public static int getType(int codePoint) 
    Gets the general Unicode category of the specified code point.
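
    The value returned by {@code getType} can be compared against the category constants listed in the field summary. A minimal sketch that maps a few categories to their two-letter abbreviations; {@code CategorySketch} and {@code describe} are hypothetical names, and only a handful of categories are covered.

```java
class CategorySketch {
    // Maps a code point to the abbreviation of its general category,
    // for the small set of categories handled here.
    static String describe(int codePoint) {
        switch (Character.getType(codePoint)) {
            case Character.UPPERCASE_LETTER:     return "Lu";
            case Character.LOWERCASE_LETTER:     return "Ll";
            case Character.DECIMAL_DIGIT_NUMBER: return "Nd";
            case Character.SPACE_SEPARATOR:      return "Zs";
            case Character.CONTROL:              return "Cc";
            case Character.SURROGATE:            return "Cs";
            default:                             return "other";
        }
    }

    public static void main(String[] args) {
        System.out.println(describe('A'));    // prints Lu
        System.out.println(describe('5'));    // prints Nd
        System.out.println(describe(0xD800)); // prints Cs
    }
}
```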
 public int hashCode() 
 public static boolean isDefined(char c) 
    Indicates whether the specified character is defined in the Unicode specification.
 public static boolean isDefined(int codePoint) 
    Indicates whether the specified code point is defined in the Unicode specification.
 public static boolean isDigit(char c) 
    Indicates whether the specified character is a digit.
 public static boolean isDigit(int codePoint) 
    Indicates whether the specified code point is a digit.
 public static boolean isHighSurrogate(char ch) 
    Indicates whether {@code ch} is a high- (or leading-) surrogate code unit that is used for representing supplementary characters in UTF-16 encoding.
 public static boolean isISOControl(char c) 
    Indicates whether the specified character is an ISO control character.
 public static boolean isISOControl(int c) 
    Indicates whether the specified code point is an ISO control character.
 public static boolean isIdentifierIgnorable(char c) 
    Indicates whether the specified character is ignorable in a Java or Unicode identifier.
 public static boolean isIdentifierIgnorable(int codePoint) 
    Indicates whether the specified code point is ignorable in a Java or Unicode identifier.
 public static boolean isJavaIdentifierPart(char c) 
    Indicates whether the specified character is a valid part of a Java identifier other than the first character.
 public static boolean isJavaIdentifierPart(int codePoint) 
    Indicates whether the specified code point is a valid part of a Java identifier other than the first character.
 public static boolean isJavaIdentifierStart(char c) 
    Indicates whether the specified character is a valid first character for a Java identifier.
 public static boolean isJavaIdentifierStart(int codePoint) 
    Indicates whether the specified code point is a valid start for a Java identifier.
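
    Taken together, {@code isJavaIdentifierStart} and {@code isJavaIdentifierPart} can check the shape of an identifier. The sketch below is an illustration under stated assumptions: {@code IdentifierSketch} and {@code looksLikeIdentifier} are hypothetical names, and reserved words such as {@code class} are not rejected.

```java
class IdentifierSketch {
    // Checks the character-level shape of a Java identifier.
    // Does not exclude keywords or literals such as "true".
    static boolean looksLikeIdentifier(String s) {
        if (s.length() == 0) {
            return false;
        }
        int first = s.codePointAt(0);
        if (!Character.isJavaIdentifierStart(first)) {
            return false;
        }
        int i = Character.charCount(first);
        while (i < s.length()) {
            int cp = s.codePointAt(i);
            if (!Character.isJavaIdentifierPart(cp)) {
                return false;
            }
            i += Character.charCount(cp);
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(looksLikeIdentifier("foo1")); // prints true
        System.out.println(looksLikeIdentifier("1foo")); // prints false
        System.out.println(looksLikeIdentifier("a-b"));  // prints false
    }
}
```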
 public static boolean isJavaLetter(char c) 
Deprecated! Use {@code isJavaIdentifierStart(char)} instead.

    Indicates whether the specified character is a Java letter.
 public static boolean isJavaLetterOrDigit(char c) 
Deprecated! Use {@code isJavaIdentifierPart(char)} instead.

    Indicates whether the specified character is a Java letter or digit character.
 public static boolean isLetter(char c) 
    Indicates whether the specified character is a letter.
 public static boolean isLetter(int codePoint) 
    Indicates whether the specified code point is a letter.
 public static boolean isLetterOrDigit(char c) 
    Indicates whether the specified character is a letter or a digit.
 public static boolean isLetterOrDigit(int codePoint) 
    Indicates whether the specified code point is a letter or a digit.
 public static boolean isLowSurrogate(char ch) 
    Indicates whether {@code ch} is a low- (or trailing-) surrogate code unit that is used for representing supplementary characters in UTF-16 encoding.
 public static boolean isLowerCase(char c) 
    Indicates whether the specified character is a lower case letter.
 public static boolean isLowerCase(int codePoint) 
    Indicates whether the specified code point is a lower case letter.
 public static boolean isMirrored(char c) 
    Indicates whether the specified character is mirrored.
 public static boolean isMirrored(int codePoint) 
    Indicates whether the specified code point is mirrored.
 public static boolean isSpace(char c) 
Deprecated! Use {@code isWhitespace(char)} instead.

    Indicates whether the specified character is a Java space.
 public static boolean isSpaceChar(char c) 
    Indicates whether the specified character is a Unicode space character. That is, if it is a member of one of the Unicode categories Space Separator, Line Separator, or Paragraph Separator.
 public static boolean isSpaceChar(int codePoint) 
    Indicates whether the specified code point is a Unicode space character. That is, if it is a member of one of the Unicode categories Space Separator, Line Separator, or Paragraph Separator.
 public static boolean isSupplementaryCodePoint(int codePoint) 
    Indicates whether {@code codePoint} is within the supplementary code point range.
 public static boolean isSurrogatePair(char high,
    char low) 
    Indicates whether the specified character pair is a valid surrogate pair.
 public static boolean isTitleCase(char c) 
    Indicates whether the specified character is a titlecase character.
 public static boolean isTitleCase(int codePoint) 
    Indicates whether the specified code point is a titlecase character.
 public static boolean isUnicodeIdentifierPart(char c) 
    Indicates whether the specified character is valid as part of a Unicode identifier other than the first character.
 public static boolean isUnicodeIdentifierPart(int codePoint) 
    Indicates whether the specified code point is valid as part of a Unicode identifier other than the first character.
 public static boolean isUnicodeIdentifierStart(char c) 
    Indicates whether the specified character is a valid initial character for a Unicode identifier.
 public static boolean isUnicodeIdentifierStart(int codePoint) 
    Indicates whether the specified code point is a valid initial character for a Unicode identifier.
 public static boolean isUpperCase(char c) 
    Indicates whether the specified character is an upper case letter.
 public static boolean isUpperCase(int codePoint) 
    Indicates whether the specified code point is an upper case letter.
 public static boolean isValidCodePoint(int codePoint) 
    Indicates whether {@code codePoint} is a valid Unicode code point.
 public static boolean isWhitespace(char c) 
    Indicates whether the specified character is a whitespace character in Java.
 public static boolean isWhitespace(int codePoint) 
    Indicates whether the specified code point is a whitespace character in Java.
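
    {@code isWhitespace} and {@code isSpaceChar} do not always agree. The sketch below compares them on three code points: a tab is Java whitespace but belongs to the Control category, while the no-break space U+00A0 is a Space Separator that Java does not treat as whitespace. {@code WhitespaceSketch} and {@code classify} are hypothetical names for this example.

```java
class WhitespaceSketch {
    // Reports how both predicates classify the given code point.
    static String classify(int codePoint) {
        boolean javaWhitespace = Character.isWhitespace(codePoint);
        boolean unicodeSpace = Character.isSpaceChar(codePoint);
        return "isWhitespace=" + javaWhitespace + ", isSpaceChar=" + unicodeSpace;
    }

    public static void main(String[] args) {
        System.out.println(classify(' '));    // isWhitespace=true, isSpaceChar=true
        System.out.println(classify('\t'));   // isWhitespace=true, isSpaceChar=false
        System.out.println(classify(0x00A0)); // isWhitespace=false, isSpaceChar=true
    }
}
```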
 public static int offsetByCodePoints(CharSequence seq,
    int index,
    int codePointOffset) 
    Determines the index in the specified character sequence that is offset {@code codePointOffset} code points from {@code index}.
 public static int offsetByCodePoints(char[] seq,
    int start,
    int count,
    int index,
    int codePointOffset) 
    Determines the index in a subsequence of the specified character array that is offset {@code codePointOffset} code points from {@code index}. The subsequence is delineated by {@code start} and {@code count}.
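
    One use of {@code offsetByCodePoints} is truncating text to a number of code points rather than a number of {@code char} units. A minimal sketch; {@code CodePointSubstringSketch} and {@code firstCodePoints} are hypothetical names, and the method is expected to throw {@code IndexOutOfBoundsException} if the string holds fewer than {@code n} code points.

```java
class CodePointSubstringSketch {
    // Returns the prefix of s that contains exactly n code points.
    static String firstCodePoints(String s, int n) {
        int end = Character.offsetByCodePoints(s, 0, n);
        return s.substring(0, end);
    }

    public static void main(String[] args) {
        String s = "\uD834\uDD1Eab";
        // Two code points span three char units here.
        System.out.println(firstCodePoints(s, 2).length()); // prints 3
    }
}
```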
 public static char reverseBytes(char c) 
    Reverses the order of the first and second byte in the specified character.
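
    The byte swap can be written out with shifts and masks. The sketch below is one possible formulation, not necessarily how this class implements it; {@code ReverseBytesSketch} and {@code swap} are hypothetical names.

```java
class ReverseBytesSketch {
    // Moves the low byte to the high position and the high byte to the low position.
    static char swap(char c) {
        return (char) (((c & 0xFF) << 8) | (c >> 8));
    }

    public static void main(String[] args) {
        char c = 0x1234;
        System.out.printf("%04X%n", (int) swap(c));                   // prints 3412
        System.out.printf("%04X%n", (int) Character.reverseBytes(c)); // prints 3412
    }
}
```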
 public static char[] toChars(int codePoint) 
    Converts the specified Unicode code point into a UTF-16 encoded sequence and returns it as a char array.
 public static int toChars(int codePoint,
    char[] dst,
    int dstIndex) 
    Converts the specified Unicode code point into a UTF-16 encoded sequence and copies the value(s) into the char array {@code dst}, starting at index {@code dstIndex}.
 public static int toCodePoint(char high,
    char low) 
    Converts a surrogate pair into a Unicode code point. This method assumes that the pair are valid surrogates. If the pair are not valid surrogates, then the result is indeterminate. The {@code isSurrogatePair(char, char)} method should be used prior to this method to validate the pair.
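
    Since {@code toCodePoint} does not validate its arguments, a guarded wrapper is one way to avoid indeterminate results. A minimal sketch; {@code SafeToCodePointSketch} and {@code decode} are hypothetical names, and throwing {@code IllegalArgumentException} is a design choice of this example only.

```java
class SafeToCodePointSketch {
    // Decodes a surrogate pair, rejecting input that is not a valid pair.
    static int decode(char high, char low) {
        if (!Character.isSurrogatePair(high, low)) {
            throw new IllegalArgumentException("Not a surrogate pair");
        }
        return Character.toCodePoint(high, low);
    }

    public static void main(String[] args) {
        // Round trip: code point -> UTF-16 units -> code point.
        char[] units = Character.toChars(0x1D11E);
        System.out.printf("U+%X%n", decode(units[0], units[1])); // prints U+1D11E
    }
}
```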
 public static char toLowerCase(char c) 
    Returns the lower case equivalent for the specified character if the character is an upper case letter. Otherwise, the specified character is returned unchanged.
 public static int toLowerCase(int codePoint) 
    Returns the lower case equivalent for the specified code point if it is an upper case letter. Otherwise, the specified code point is returned unchanged.
 public String toString() 
 public static String toString(char value) 
    Converts the specified character to its string representation.
 public static char toTitleCase(char c) 
    Returns the title case equivalent for the specified character if it exists. Otherwise, the specified character is returned unchanged.
 public static int toTitleCase(int codePoint) 
    Returns the title case equivalent for the specified code point if it exists. Otherwise, the specified code point is returned unchanged.
 public static char toUpperCase(char c) 
    Returns the upper case equivalent for the specified character if the character is a lower case letter. Otherwise, the specified character is returned unchanged.
 public static int toUpperCase(int codePoint) 
    Returns the upper case equivalent for the specified code point if the code point is a lower case letter. Otherwise, the specified code point is returned unchanged.
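
    Title case differs from upper case for a small number of characters, such as the digraph U+01C6, whose title case form is U+01C5 and whose upper case form is U+01C4. The sketch below capitalizes the first code point of a string with {@code toTitleCase}; {@code TitleCaseSketch} and {@code capitalize} are hypothetical names for this example.

```java
class TitleCaseSketch {
    // Converts the first code point to title case and leaves the rest unchanged.
    static String capitalize(String s) {
        if (s.length() == 0) {
            return s;
        }
        int first = s.codePointAt(0);
        return new StringBuilder(s.length())
                .appendCodePoint(Character.toTitleCase(first))
                .append(s, Character.charCount(first), s.length())
                .toString();
    }

    public static void main(String[] args) {
        System.out.println(capitalize("hello")); // prints Hello
        char digraph = 0x01C6;
        System.out.printf("%04X%n", (int) Character.toUpperCase(digraph)); // prints 01C4
        System.out.printf("%04X%n", (int) Character.toTitleCase(digraph)); // prints 01C5
    }
}
```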
 public static Character valueOf(char c) 
    Returns a {@code Character} instance for the {@code char} value passed. For ASCII/Latin-1 characters (and generally all characters with a Unicode value up to 512), this method should be used instead of the constructor, as it maintains a cache of corresponding {@code Character} instances.
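
    The effect of the cache can be observed through reference equality, although code should compare {@code Character} objects with {@code equals} and not rely on identity. A minimal sketch; {@code ValueOfSketch} and {@code sameInstance} are hypothetical names, and whether values outside the cached range share an instance is left unspecified here.

```java
class ValueOfSketch {
    // Reports whether two valueOf calls for the same char return one instance.
    static boolean sameInstance(char c) {
        return Character.valueOf(c) == Character.valueOf(c);
    }

    public static void main(String[] args) {
        // 'a' lies inside the cached range, so the same instance is reused.
        System.out.println(sameInstance('a')); // prints true

        // Outside the cache, identity is not guaranteed; equals still holds.
        char c = 0x4E2D;
        System.out.println(Character.valueOf(c).equals(Character.valueOf(c))); // prints true
    }
}
```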