Character Class in Java
In any programming language, characters are one of the fundamental data types. Java is no exception: basic text processing is carried out with the char data type, which is Java's primitive type for a single character.
But occasionally, while developing, you need objects in place of primitive values (char). For this, Java provides the Character wrapper class, which wraps a char value in an object. This class makes character manipulation simpler by offering a large range of useful fields and methods for dealing with characters in intricate applications.
Introduction to Character Class in Java
In the java.lang package, Java includes a wrapper class called Character. An object of type Character contains a single field whose type is char. The Character class also provides several practical class (i.e., static) methods for manipulating characters.
A wrapper class in Java converts a primitive value into an object and an object back into a primitive value.
Java offers the following 8 wrapper classes:
Datatype | Wrapper Class |
---|---|
boolean | Boolean |
byte | Byte |
char | Character |
short | Short |
int | Integer |
long | Long |
float | Float |
double | Double |
In this post, we'll talk about Java's Character class, which wraps a value of the primitive data type char in an object. Each Character object contains a single char field.
The fields and methods of the Character class are defined in terms of character information from the Unicode Standard, specifically the UnicodeData file maintained by the Unicode Consortium.
Unicode is a universal encoding scheme for written characters and text that enables the exchange of data internationally.
The Unicode Consortium, a non-profit organization, is dedicated to the advancement, upkeep, and advocacy of software internationalization standards and data. A central focus of its work is the Unicode Standard, a comprehensive specification governing text representation across contemporary software platforms and industry standards.
A Character class instance or object can contain individual character data. In addition, this wrapper class also provides some useful methods for manipulating, inspecting, or processing single-character data.
The set of characters from U+0000 to U+FFFF is sometimes called the Basic Multilingual Plane (or BMP). Characters whose code points are greater than U+FFFF are called supplementary characters.
The Java platform uses the UTF-16 encoding in char arrays and in the String and StringBuffer classes. In this representation, supplementary characters are represented as a pair of char values. The first comes from the high-surrogates range (\uD800-\uDBFF), and the second from the low-surrogates range (\uDC00-\uDFFF).
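The surrogate-pair representation can be observed directly with a few static methods of Character. The sketch below is illustrative only: the class name SurrogatePairDemo and the helper utf16Units are hypothetical, and U+1F600 is simply one convenient supplementary code point.

```java
public class SurrogatePairDemo {
    // Hypothetical helper: formats the UTF-16 code units of a code point as hex.
    static String utf16Units(int codePoint) {
        StringBuilder sb = new StringBuilder();
        for (char unit : Character.toChars(codePoint)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%04X", (int) unit));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        int bmp = 0x0041;            // 'A', inside the BMP
        int supplementary = 0x1F600; // a code point above U+FFFF

        System.out.println(utf16Units(bmp));           // 0041
        System.out.println(utf16Units(supplementary)); // D83D DE00

        char[] pair = Character.toChars(supplementary);
        System.out.println(Character.isHighSurrogate(pair[0])); // true
        System.out.println(Character.isLowSurrogate(pair[1]));  // true

        // Recombining the pair gives back the original code point.
        System.out.println(Character.toCodePoint(pair[0], pair[1]) == supplementary); // true
    }
}
```

A BMP character needs one char value, while a supplementary character needs two, which is why String.length() can be larger than the number of characters a reader sees.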
How to Create a Character Object in Java
The Character class provides several convenient static (class) methods for manipulating characters. Let's see how we can create a Character object using the Character constructor. (This constructor is deprecated since Java 9; Character.valueOf(char) is the recommended alternative.)
Character ch = new Character('a');
Now, you may wonder what happens if we pass a primitive char to a method that expects an object. In that case, the Java compiler automatically converts it to a Character.
This feature is called autoboxing; the reverse conversion is called unboxing.
Autoboxing is the conversion of a primitive value into an object of its wrapper class.
Unboxing is the conversion of an object of a wrapper type to its corresponding primitive value.
Character num = new Character('2');
The example above creates the Character object explicitly. With autoboxing, the Java compiler creates the Character object for the programmer: when a value of the primitive data type (char) is passed to a user-defined method that expects an object argument, the compiler automatically converts the char value to a Character value (that is, it becomes an object).
Another example,
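The following is a minimal sketch of autoboxing and unboxing at a method call. The class AutoboxingDemo and the helpers describe and codeOf are hypothetical names chosen for illustration; they are not part of the Java API.

```java
public class AutoboxingDemo {
    // Hypothetical helper: its parameter is an object, not a primitive.
    static String describe(Character c) {
        return "Character object holding '" + c + "'";
    }

    // Hypothetical helper: its parameter is a primitive char.
    static int codeOf(char c) {
        return c; // a char widens to its numeric code
    }

    public static void main(String[] args) {
        char primitive = 'a';
        // Autoboxing: the compiler wraps the char in a Character for the call.
        System.out.println(describe(primitive)); // Character object holding 'a'

        Character boxed = 'b';
        // Unboxing: the compiler extracts the char value from the object.
        System.out.println(codeOf(boxed)); // 98
    }
}
```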
Now, just like the String class, the Character class is immutable, i.e., once an object is created, it cannot be changed.
An immutable object is an object whose internal state remains the same after it has been fully created.
The above syntax can also be written as:
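The original snippet is not reproduced here. As an assumption about what it may have shown, the sketch below lists two common alternatives to calling the constructor: assigning a char literal (autoboxing) and calling Character.valueOf. The class name CreateCharacterDemo and the helper box are hypothetical.

```java
public class CreateCharacterDemo {
    // Hypothetical helper: wraps a char using the factory method.
    static Character box(char c) {
        return Character.valueOf(c);
    }

    public static void main(String[] args) {
        Character viaLiteral = 'a';   // autoboxed by the compiler
        Character viaFactory = box('a');

        // Both objects wrap the same char value.
        System.out.println(viaLiteral.equals(viaFactory)); // true
        System.out.println(viaLiteral.charValue());        // a
    }
}
```

Both forms avoid the constructor, which is deprecated since Java 9.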
Methods of Character Class in Java
The methods of Character class are as follows:
Methods in Character Class | Description |
---|---|
boolean isLetter(char ch) | Determines whether the given char value (ch) is a letter. It returns true for letters such as [A-Z] and [a-z], as well as letters from other Unicode scripts, and false otherwise. You can also pass an ASCII value (for example, 65 for 'A') instead of a char literal; an int argument selects the overloaded method isLetter(int codePoint). |
boolean isDigit(char ch) | Determines whether the specified char value (ch) is a digit. Here too, an ASCII value can be passed as the argument. |
boolean isWhitespace(char ch) | Determines whether the specified char value (ch) is white space. Whitespace includes space, tab, and newline. |
boolean isUpperCase(char ch) | Determines whether the specified char value (ch) is uppercase. |
boolean isLowerCase(char ch) | Determines whether the specified char value (ch) is lowercase. |
char toUpperCase(char ch) | Returns the uppercase form of the specified char value (ch). If an ASCII value is passed, the ASCII value of its uppercase form is returned as an int. |
char toLowerCase(char ch) | Returns the lowercase form of the specified char value (ch). |
String toString(char ch) | Returns a String object representing the specified char value (ch), i.e. a one-character string. Before Java 11 an ASCII value cannot be passed, because the toString(int codePoint) overload was only added in Java 11. |
1. boolean isLetter(char ch):
Syntax: public static boolean isLetter(char ch)
Here, ch is a primitive char passed as a parameter to this method to determine whether ch is a letter. If it is, the method returns true; if not, it returns false.
Example:
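A minimal sketch of isLetter() follows. The class name IsLetterDemo and the helper countLetters are hypothetical and only serve to show the method in a small loop.

```java
public class IsLetterDemo {
    // Hypothetical helper: counts the letters in a string.
    static int countLetters(String text) {
        int count = 0;
        for (char ch : text.toCharArray()) {
            if (Character.isLetter(ch)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(Character.isLetter('A')); // true
        System.out.println(Character.isLetter('0')); // false
        System.out.println(countLetters("A0b1"));    // 2
    }
}
```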
Since A is a letter, the isLetter() method returns true for it, whereas it returns false for 0, which is not a letter.
2. boolean isDigit(char ch):
Syntax: public static boolean isDigit(char ch)
Here, ch is a primitive char passed as a parameter to this method. The method returns true if ch is a digit and false if it is not.
Let's see an example in Java:
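A minimal sketch of isDigit() is shown below. The class name IsDigitDemo and the helper isAllDigits are hypothetical.

```java
public class IsDigitDemo {
    // Hypothetical helper: true only if every character is a digit.
    static boolean isAllDigits(String text) {
        if (text.isEmpty()) {
            return false;
        }
        for (char ch : text.toCharArray()) {
            if (!Character.isDigit(ch)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(Character.isDigit('A')); // false
        System.out.println(Character.isDigit('0')); // true
        System.out.println(isAllDigits("2024"));    // true
        System.out.println(isAllDigits("20x4"));    // false
    }
}
```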
Since A is not a digit, the isDigit() method returns false for it, whereas it returns true for 0, which is a digit.
3. boolean isWhitespace(char ch):
Syntax: public static boolean isWhitespace(char ch)
Here again, ch is passed as a parameter to this method, which returns true if ch is a whitespace character and false if it is not.
Let's see an example in Java:
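A minimal sketch of isWhitespace() follows, covering a space, a newline, a tab, and the numeric value 9. The class name IsWhitespaceDemo and the helper countWhitespace are hypothetical.

```java
public class IsWhitespaceDemo {
    // Hypothetical helper: counts whitespace characters in a string.
    static int countWhitespace(String text) {
        int count = 0;
        for (char ch : text.toCharArray()) {
            if (Character.isWhitespace(ch)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(Character.isWhitespace(' '));  // true
        System.out.println(Character.isWhitespace('\n')); // true
        System.out.println(Character.isWhitespace('\t')); // true
        // An int argument selects isWhitespace(int codePoint); 9 is the tab character.
        System.out.println(Character.isWhitespace(9));    // true
        System.out.println(Character.isWhitespace('a'));  // false
        System.out.println(countWhitespace("a b\tc\n"));  // 3
    }
}
```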
The isWhitespace() method returns true for the following:
- ' ': space is a whitespace character
- '\n': newline is also a whitespace character
- '\t': tab is another whitespace character
- 9: the code point 9 is the tab character, which is whitespace
Carriage return '\r', vertical tab (U+000B, for which Java has no '\v' escape sequence), and form feed '\f' are also considered whitespace characters.
4. boolean isUpperCase(char ch):
Syntax: public static boolean isUpperCase(char ch)
ch is passed as a parameter; this method returns true if it is in upper case and false otherwise.
Let's see an example in Java:
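A minimal sketch of isUpperCase() is shown below. The class name IsUpperCaseDemo and the helper isAllUpperCase are hypothetical.

```java
public class IsUpperCaseDemo {
    // Hypothetical helper: true only if every character is uppercase.
    static boolean isAllUpperCase(String word) {
        if (word.isEmpty()) {
            return false;
        }
        for (char ch : word.toCharArray()) {
            if (!Character.isUpperCase(ch)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(Character.isUpperCase('A')); // true
        // An int argument selects isUpperCase(int codePoint); 65 is 'A'.
        System.out.println(Character.isUpperCase(65));  // true
        System.out.println(Character.isUpperCase('a')); // false
        System.out.println(isAllUpperCase("JAVA"));     // true
        System.out.println(isAllUpperCase("Java"));     // false
    }
}
```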
Here, the method returns true for A and for the ASCII value of A, which is 65; however, it returns false for 'a', which is in lowercase.
5. boolean isLowerCase(char ch):
Syntax: public static boolean isLowerCase(char ch)
ch is passed as a primitive character. This method returns true if ch is lowercase and false otherwise.
Let's see an example in Java:
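A minimal sketch of isLowerCase() follows. The class name IsLowerCaseDemo and the helper startsWithLowerCase are hypothetical.

```java
public class IsLowerCaseDemo {
    // Hypothetical helper: checks whether a name begins with a lowercase letter.
    static boolean startsWithLowerCase(String name) {
        return !name.isEmpty() && Character.isLowerCase(name.charAt(0));
    }

    public static void main(String[] args) {
        System.out.println(Character.isLowerCase('a')); // true
        // An int argument selects isLowerCase(int codePoint); 97 is 'a'.
        System.out.println(Character.isLowerCase(97));  // true
        System.out.println(Character.isLowerCase('A')); // false
        System.out.println(startsWithLowerCase("camelCase"));  // true
        System.out.println(startsWithLowerCase("PascalCase")); // false
    }
}
```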
The method returns true for a and for the ASCII value of a (97), and returns false for A, which is in upper case.
6. char toUpperCase(char ch):
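Syntax: public static char toUpperCase(char ch)
This method takes ch as a parameter and returns the uppercase form of the specified char value.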
Let's see an example in Java:
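A minimal sketch of toUpperCase() is shown below. The three calls in main are an assumption chosen to match the output listed in this section. The class name ToUpperCaseDemo and the helper capitalize are hypothetical.

```java
public class ToUpperCaseDemo {
    // Hypothetical helper: uppercases the first character of a word.
    static String capitalize(String word) {
        if (word.isEmpty()) {
            return word;
        }
        return Character.toUpperCase(word.charAt(0)) + word.substring(1);
    }

    public static void main(String[] args) {
        System.out.println(Character.toUpperCase('a')); // A
        // An int argument selects toUpperCase(int codePoint), which returns an int.
        System.out.println(Character.toUpperCase(97));  // 65
        // 48 is the digit '0', which has no uppercase form, so it is unchanged.
        System.out.println(Character.toUpperCase(48));  // 48
        System.out.println(capitalize("java"));         // Java
    }
}
```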
Output
A
65
48
Here, we clearly see that the arguments have been converted to upper case; where 97 (the ASCII value of a) was passed, the result is 65, which is the ASCII value of A.
7. char toLowerCase(char ch):
Syntax: public static char toLowerCase(char ch)
This method takes ch as a parameter and returns the lowercase form of the specified char value.
Let's see an example in Java:
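A minimal sketch of toLowerCase() follows. The class name ToLowerCaseDemo and the helper sameLetterIgnoringCase are hypothetical.

```java
public class ToLowerCaseDemo {
    // Hypothetical helper: compares two characters without regard to case.
    static boolean sameLetterIgnoringCase(char a, char b) {
        return Character.toLowerCase(a) == Character.toLowerCase(b);
    }

    public static void main(String[] args) {
        System.out.println(Character.toLowerCase('A')); // a
        // An int argument selects toLowerCase(int codePoint), which returns an int.
        System.out.println(Character.toLowerCase(65));  // 97
        System.out.println(sameLetterIgnoringCase('A', 'a')); // true
        System.out.println(sameLetterIgnoringCase('A', 'b')); // false
    }
}
```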
Here, we can see that the arguments have been converted to lower case; where 65 (the ASCII value of A) was passed, the result is 97, which is the ASCII value of a.
8. toString(char ch):
Syntax: public static String toString(char ch)
This method takes ch, a primitive char, as a parameter and returns a String object.
Let's see an example in Java
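A minimal sketch of toString() is shown below, using X and Y as sample values. The class name ToStringDemo and the helper join are hypothetical.

```java
public class ToStringDemo {
    // Hypothetical helper: joins two chars into a two-character string.
    static String join(char a, char b) {
        return Character.toString(a) + Character.toString(b);
    }

    public static void main(String[] args) {
        System.out.println(Character.toString('X')); // X
        System.out.println(Character.toString('Y')); // Y
        System.out.println(join('X', 'Y'));          // XY
        // Without toString, adding two chars performs integer arithmetic.
        System.out.println('X' + 'Y');               // 177
    }
}
```

The last line shows why the conversion matters: the + operator on two char values adds their numeric codes instead of concatenating them.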
Here, X and Y are returned as strings.
Let's now see some more methods of Character class in Java
S. No. | Method | Description |
---|---|---|
1. | static byte getDirectionality(char ch) | This method returns the Unicode directionality property for the given character. |
2. | static byte getDirectionality(int codepoint) | This method returns the Unicode directionality property for the given character (Unicode code point). |
3. | static int codePointAt (char[] a, int index) | This method returns the code point at the specified index in the char array. |
4. | static int codePointAt (char[] a, int index, int limit) | This method returns the code point at the specified index of the char array using only array elements with indices less than the limit. |
5. | static int codePointAt (CharSequence seq, int index) | This method returns the code point at the specified index in the CharSequence. |
6. | static int codePointBefore (char[] a, int index) | This method returns the code point before the given index in the char array. |
7. | static int codePointBefore (char[] a, int index, int start) | This method returns the code point before the given index in the char array, using only array elements with an index greater than or equal to start. |
10. | static int codePointCount (CharSequence seq, int beginIndex, int endIndex) | This method returns the number of Unicode code points in the text range of the specified character sequence |
11. | static int codePointOf(String name) | This method returns the code point value of the Unicode character specified by the given Unicode character name. |
12. | static int compare(char x, char y) | This method compares two char values numerically. |
13. | int compareTo(Character anotherCharacter) | This method compares two Character objects numerically. |
14. | static int digit(char ch, int radix) | This method returns the numeric value of the character ch in the specified radix. |
15. | static int digit(int codePoint, int radix) | This method returns the numeric value of the specified character (Unicode code point) in the specified radix. |
16. | boolean equals(Object obj) | This method compares this object against the specified object. |
17. | static char forDigit(int digit, int radix) | This method determines the character representation for a specific digit in the specified radix. |
18. | static int charCount(int codePoint) | This method determines the number of character values required to represent a given character (Unicode code point). |
19. | char charValue () | This method returns the value of this character object. |
20. | static String getName(int codePoint) | This method returns the Unicode name of the specified character codePoint, or null if the code point is unassigned. |
21. | static int getNumericValue(char ch) | This method returns the int value that the specified Unicode character represents. |
22. | static int getNumericValue(int codePoint) | This method returns the int value that the specified character (Unicode code point) represents. |
23. | static int getType(char ch) | This method returns a value indicating a character’s general category. |
24. | static int getType(int codePoint) | This method returns a value indicating a character’s general category. |
25. | int hashCode() | This method returns a hash code for this Character; equal to the result of invoking charValue(). |
26. | static int hashCode(char value) | This method returns a hash code for a char value; compatible with Character.hashCode(). |
27. | static char highSurrogate(int codePoint) | This method returns the leading surrogate (a high surrogate code unit) of the surrogate pair representing the specified supplementary character (Unicode code point) in the UTF-16 encoding. |
28. | static boolean isAlphabetic(int codePoint) | This method determines if the specified character (Unicode code point) is alphabetic. |
29. | static boolean isBmpCodePoint(int codePoint) | This method determines whether the specified character (Unicode code point) is in the Basic Multilingual Plane (BMP). |
30. | static boolean isDefined(char ch) | Determines if a character is defined in Unicode. |
31. | static boolean isDefined(int codePoint) | Determines if a character (Unicode code point) is defined in Unicode. |
32. | static boolean isHighSurrogate(char ch) | Determines if the given char value is a Unicode high-surrogate code unit (also known as a leading-surrogate code unit). |
33. | static boolean isIdentifierIgnorable(char ch) | Determines if the specified character should be regarded as an ignorable character in a Java identifier or a Unicode identifier. |
34. | static boolean isIdentifierIgnorable(int codePoint) | Determines whether the given character (Unicode code point) should be considered an ignorable character in Java or Unicode identifiers. |
35. | static boolean isIdeographic(int codePoint) | This method determines if the specified character (Unicode code point) is a CJKV (Chinese, Japanese, Korean, and Vietnamese) ideograph, as defined by the Unicode Standard. |
36. | static boolean isISOControl(char ch) | Determines if the given character is an ISO control character. |
37. | static boolean isISOControl(int codePoint) | This method determines if the referenced character (Unicode code point) is an ISO control character. |
38. | static boolean isJavaIdentifierPart(char ch) | Determines if the specified character may be part of a Java identifier other than the first character. |
39. | static boolean isJavaIdentifierPart(int codePoint) | Determines if the character (Unicode code point) may be part of a Java identifier other than the first character. |
40. | static boolean isJavaIdentifierStart(char ch) | Determines if the specified character is permissible as the first character in a Java identifier. |
41. | static boolean isJavaIdentifierStart(int codePoint) | Determines if the character (Unicode code point) is permissible as the first character in a Java identifier. |
42. | static boolean isLowSurrogate(char ch) | This method determines if the given char value is a Unicode low-surrogate code unit (also known as trailing-surrogate code unit). |
43. | static boolean isLetterOrDigit(char ch) | Determines if the specified character is a letter or digit. |
44. | static boolean isLetterOrDigit(int codePoint) | Determines if the specified character (Unicode code point) is a letter or digit. |
45. | static boolean isMirrored(char ch) | Determines whether the character is mirrored according to the Unicode specification. |
46. | static boolean isMirrored(int codePoint) | Determines whether the specified character (Unicode code point) is mirrored according to the Unicode specification. |
47. | static boolean isSpaceChar(char ch) | Determines if the specified character is a Unicode space character. |
48. | static boolean isSpaceChar(int codePoint) | Determines if the specified character (Unicode code point) is a Unicode space character. |
49. | static boolean isSupplementaryCodePoint(int codePoint) | Determines whether the specified character (Unicode code point) is in the supplementary character range. |
50. | static boolean isSurrogate(char ch) | Determines if the given char value is a Unicode surrogate code unit. |
51. | static boolean isSurrogatePair(char high, char low) | Determines whether the specified pair of char values is a valid Unicode surrogate pair. |
52. | static boolean isTitleCase(char ch) | Determines if the specified character is a titlecase character. |
53. | static boolean isTitleCase(int codePoint) | Determines if the specified character (Unicode code point) is a titlecase character. |
54. | static boolean isUnicodeIdentifierPart(char ch) | Determines if the specified character may be part of a Unicode identifier other than the first character. |
55. | static boolean isUnicodeIdentifierPart(int codePoint) | Determines whether the specified character (Unicode code point) can be part of a Unicode identifier other than the first character. |
56. | static boolean isUnicodeIdentifierStart(char ch) | Determines if the specified character is permissible as the first character in a Unicode identifier. |
57. | static boolean isUnicodeIdentifierStart(int codePoint) | Determines whether the specified character (Unicode code point) is allowed as the first character in a Unicode identifier. |
58. | static boolean isValidCodePoint(int codePoint) | Determines whether the specified code point is a valid Unicode code point value. |
59. | static char lowSurrogate(int codePoint) | Returns the trailing surrogate (low-surrogate code unit) of a surrogate pair that represents the specified supplementary character (Unicode code point) in UTF-16 encoding. |
60. | static int offsetByCodePoints(char[] a, int start, int count, int index, int codePointOffset) | Returns the index within the given char subarray that is offset from the given index by codePointOffset code points. |
61. | static int offsetByCodePoints(CharSequence seq, int index, int codePointOffset) | Returns the index within the specified character sequence that is offset from the specified index by codePointOffset code points. |
62. | static char reverseBytes(char ch) | Returns the value obtained by reversing the order of the bytes in the specified char value. |
63. | static char[] toChars(int codePoint) | Converts the specified character (Unicode codepoint) to its UTF-16 representation stored in a char array. |
64. | static int toChars(int cp, char[] dst, int di) | It converts the specified character (Unicode code point) to its UTF-16 representation. |
65. | static int toCodePoint(char h, char l) | It converts the specified surrogate pair to its supplementary code point value. |
66. | static char toTitleCase(char ch) | It converts the character argument to titlecase using case mapping information from the UnicodeData file. |
67. | static int toTitleCase(int cp) | It converts the character argument (Unicode code point) to Titlecase using the case mapping information in the UnicodeData file. |
68. | static Character valueOf(char ch) | Returns a Character instance representing the specified char value. |
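A few of the methods listed above can be combined as in the following sketch. The class name CodePointDemo and the helpers codePointLength and parseHex are hypothetical; the Character methods they call are the ones from the table.

```java
public class CodePointDemo {
    // Hypothetical helper: counts code points rather than UTF-16 code units.
    static int codePointLength(String text) {
        return Character.codePointCount(text, 0, text.length());
    }

    // Hypothetical helper: parses a hexadecimal string with Character.digit.
    static int parseHex(String hex) {
        int value = 0;
        for (char ch : hex.toCharArray()) {
            int digit = Character.digit(ch, 16); // -1 if ch is not a hex digit
            if (digit < 0) {
                throw new IllegalArgumentException("Not a hex digit: " + ch);
            }
            value = value * 16 + digit;
        }
        return value;
    }

    public static void main(String[] args) {
        // "A" followed by one supplementary character (U+1F600).
        String text = "A" + new String(Character.toChars(0x1F600));

        System.out.println(text.length());                  // 3
        System.out.println(codePointLength(text));          // 2
        System.out.println(Character.codePointAt(text, 1)); // 128512
        System.out.println(parseHex("ff"));                 // 255
        System.out.println(Character.forDigit(11, 16));     // b
        System.out.println(Character.compare('a', 'b') < 0); // true
    }
}
```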
Applications of Character Class in Java
Let's see an example in Java
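The original example is not reproduced here. As one illustrative application, offered as an assumption and not as the author's example, the sketch below counts letter frequencies. Collections such as Map cannot hold a primitive char, so the keys are Character objects. The class name CharFrequencyDemo and the helper letterFrequencies are hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;

public class CharFrequencyDemo {
    // Hypothetical helper: counts each letter, ignoring case and non-letters.
    static Map<Character, Integer> letterFrequencies(String text) {
        Map<Character, Integer> counts = new TreeMap<>();
        for (char ch : text.toCharArray()) {
            if (Character.isLetter(ch)) {
                // The char key is autoboxed to a Character object.
                counts.merge(Character.toLowerCase(ch), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(letterFrequencies("Hello, World"));
        // {d=1, e=1, h=1, l=3, o=2, r=1, w=1}
    }
}
```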
Why Then Is There A Need For Character class?
During development, we come across situations where we need to use objects instead of primitive data types.
Why do We Have to do This?
You may need to apply methods to a character value or store it where an object is required, such as in a collection. A primitive char cannot do this, so the value is wrapped in an immutable object. The wrapper classes, including Character, belong to the java.lang package.
The Character class is a wrapper class of the kind we've talked about before.
A wrapper class is a class that wraps (converts) a primitive data type into an object.
But then, why would we want to convert a primitive data type into an object?
Often we need to apply different methods to a character value, which is not possible unless we wrap that value in an object.
The Character class in Java also provides immutability: once a Character object is created, the char value it wraps cannot be changed. The value can still be read back as a primitive through unboxing or charValue().
Examples of Character Class in Java
Let's look at some examples of using the Character class in Java.
Example 1
Let's implement various methods like charCount(), hashCode(), isDigit() etc in the Java code below
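A minimal sketch of such a program follows. Fixed values stand in for user input so that it runs without interaction; the specific values, the class name CharacterExampleOne, and the helper inspect are assumptions made for illustration.

```java
public class CharacterExampleOne {
    // Hypothetical helper: summarizes a few properties of one char value.
    static String inspect(char value) {
        return "charCount=" + Character.charCount(value)
                + ", hashCode=" + Character.hashCode(value)
                + ", isDigit=" + Character.isDigit(value)
                + ", isISOControl=" + Character.isISOControl(value);
    }

    public static void main(String[] args) {
        // Sample values standing in for user input.
        char value1 = 'a';
        char value2 = '7';
        char value3 = '\n';
        char value4 = 'Z';

        System.out.println(inspect(value1)); // charCount=1, hashCode=97, isDigit=false, isISOControl=false
        System.out.println(inspect(value2)); // charCount=1, hashCode=55, isDigit=true, isISOControl=false
        System.out.println(inspect(value3)); // charCount=1, hashCode=10, isDigit=false, isISOControl=true
        System.out.println(inspect(value4)); // charCount=1, hashCode=90, isDigit=false, isISOControl=false
    }
}
```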
Here, we let the user provide four inputs to the program, stored in value1, value2, value3, and value4 respectively.
For each value, we then obtain its character count using charCount() and its hash code using hashCode(), and check whether it is a digit using isDigit() or an ISO control character using isISOControl().
Example 2
Let's look at a Java example of implementing methods like isLetter(), isLowerCase(), isSpace(), and isDefined().
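A minimal sketch of such a program is shown below. It substitutes isWhitespace() for isSpace(), because isSpace() is deprecated; the sample values, the class name CharacterExampleTwo, and the helper check are assumptions made for illustration.

```java
public class CharacterExampleTwo {
    // Hypothetical helper: applies one test to each of the four characters.
    static boolean[] check(char ch1, char ch2, char ch3, char ch4) {
        boolean b1 = Character.isLetter(ch1);
        boolean b2 = Character.isLowerCase(ch2);
        boolean b3 = Character.isWhitespace(ch3); // replacement for deprecated isSpace()
        boolean b4 = Character.isDefined(ch4);
        return new boolean[] {b1, b2, b3, b4};
    }

    public static void main(String[] args) {
        char ch1 = 'J';
        char ch2 = 'j';
        char ch3 = ' ';
        char ch4 = 'a';

        boolean[] results = check(ch1, ch2, ch3, ch4);
        System.out.println(ch1 + " is a letter: " + results[0]);      // J is a letter: true
        System.out.println(ch2 + " is lowercase: " + results[1]);     // j is lowercase: true
        System.out.println("'" + ch3 + "' is whitespace: " + results[2]); // ' ' is whitespace: true
        System.out.println(ch4 + " is defined: " + results[3]);       // a is defined: true
    }
}
```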
Four char primitives ch1, ch2, ch3, and ch4 are created and assigned some values. The four methods isLetter(ch1), isLowerCase(ch2), isSpace(ch3), and isDefined(ch4) are applied to these values, and their results are stored in the boolean variables b1, b2, b3, and b4 and then printed. Note that isSpace() is deprecated; isWhitespace() is its replacement.
Conclusion
- char is one of Java's most basic data types. It is a primitive data type whose values can be converted into objects of its wrapper class, the Character class.
- A wrapper class in Java provides the mechanism to convert a primitive into an object and an object into a primitive.
- The fields and methods of the Character class are defined in terms of character information from the UnicodeData file, which is maintained by the Unicode Consortium.
- The Java platform uses the UTF-16 encoding in char arrays and in the String and StringBuffer classes.
- The Character class in Java is immutable: once a Character object is created, the char value it wraps cannot be changed.