Java Matcher Class
Overview
To search a text for numerous instances of a regular expression, the Java Matcher class, java.util.regex.Matcher, is used. Additionally, we can utilize a Matcher to look for the same regular expression across many texts.
Numerous useful methods can be found in the Java Matcher class. In this tutorial, we'll go through the Java Matcher class's fundamental functions.
Introduction to Java Matcher Class
We can search for multiple instances of a regular expression in a text using the Java Matcher class (java.util.regex.Matcher).
What is a Regex in Java?
A regular expression, or regex, is a string of characters that defines a pattern. We can use this pattern to identify matching strings whenever we search through data. It can be a single character or a more intricate pattern.
The Java regex package contains classes that enable the use of regular expressions for searching and manipulating text.
This package is imported into your code with the statement import java.util.regex.*;
This line imports all classes present in the regex package. If you only want the Matcher class, you can import it with import java.util.regex.Matcher; instead.
Regular expression pattern matching is offered by the Java Matcher class.
By interpreting a Pattern, the java.util.regex.Matcher class functions as an engine that executes match operations on a character sequence.
There are three classes and one interface in the Java regex package:
- MatchResult interface: Represents the result of a match operation and provides query methods for inspecting it.
- Pattern class: Defines the pattern to search for.
- Matcher class: Locates the pattern in a text by matching against it.
- PatternSyntaxException class: Signals a syntax error in a regular expression.
Its operation is fairly straightforward. To define a pattern, you must first build a Pattern object from the regular expression. The Pattern object is then used to generate a Matcher object.
The Matcher class has numerous methods. The most significant is matches(), which returns true if the regular expression matches the text and false otherwise. The Matcher class also includes helpful methods for replacing text in input strings, which can carry out more complicated tasks than the replace() methods of the String class.
The String class has only two replace() overloads, whereas the Matcher class has many more features available.
Declaration
Following is the declaration for the java.util.regex.Matcher class: public final class Matcher extends Object implements MatchResult
Methods of Java Matcher Class
For convenience, the Matcher class methods are arranged in a table below according to their functional groups.
S. No. | Method | Description |
---|---|---|
1 | Index methods | Return index values that show exactly where in the input string a match was found. |
2 | Study methods | Examine the input string and return a boolean indicating whether the pattern was found. |
3 | Replacement methods | Replace text in an input string. |
Methods Detail
1. Index Methods:
Index methods return index values that show exactly where in the input string the match was found.
S. No. | Method Name | Description |
---|---|---|
1 | public int start() | Returns the start index of the previous match. |
2 | public int start(int group) | Returns the start index of the subsequence captured by the given group during the previous match operation. |
3 | public int end() | Returns the offset after the last character matched. |
4 | public int end(int group) | Returns the offset after the last character of the subsequence captured by the given group during the previous match operation. |
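The index methods in the table above can be sketched as follows. The class name, the helper method, the regular expression, and the sample text are illustrative assumptions and are not taken from the original article.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the regex passed in must contain at least one capturing group.
class IndexMethodsDemo {

    // Returns {start(), end(), start(1), end(1)} for the first match, or null if nothing matches.
    static int[] firstMatchIndexes(String regex, String text) {
        Matcher matcher = Pattern.compile(regex).matcher(text);
        if (!matcher.find()) {
            return null;
        }
        return new int[] {
            matcher.start(),   // index of the first character of the whole match
            matcher.end(),     // offset just after the last character of the whole match
            matcher.start(1),  // index of the first character captured by group 1
            matcher.end(1)     // offset just after the last character captured by group 1
        };
    }

    public static void main(String[] args) {
        // "id=42" occupies offsets 5..10 and the group "42" occupies offsets 8..10.
        int[] indexes = firstMatchIndexes("id=(\\d+)", "user id=42;");
        System.out.println(Arrays.toString(indexes)); // prints [5, 10, 8, 10]
    }
}
```

Because end() returns the offset after the last matched character, the pair start() and end() can be passed directly to String.substring().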
2. Study Methods:
Study methods examine the input string and return a boolean indicating whether the pattern was found:
S. No. | Method Name | Description |
---|---|---|
1 | public boolean lookingAt() | Attempts to match the input sequence against the pattern, starting at the beginning of the region. |
2 | public boolean find() | Attempts to find the next subsequence of the input sequence that matches the pattern. |
3 | public boolean find(int start) | Resets this matcher and then searches the input sequence, starting at the specified index, for the next subsequence that matches the pattern. |
4 | public boolean matches() | Attempts to match the entire region against the pattern. |
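To compare the four study methods side by side, here is a small sketch. The class name, the digit pattern, and the sample input are assumptions chosen for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical demo class: each helper builds a fresh Matcher for a digits-only pattern.
class StudyMethodsDemo {

    static final Pattern DIGITS = Pattern.compile("\\d+");

    // true only if the whole text consists of digits
    static boolean matchesAll(String text) {
        return DIGITS.matcher(text).matches();
    }

    // true if the text begins with digits
    static boolean startsWithMatch(String text) {
        return DIGITS.matcher(text).lookingAt();
    }

    // true if digits occur anywhere in the text
    static boolean containsMatch(String text) {
        return DIGITS.matcher(text).find();
    }

    // true if digits occur at or after the given index (must be between 0 and text.length())
    static boolean containsMatchFrom(String text, int start) {
        return DIGITS.matcher(text).find(start);
    }

    public static void main(String[] args) {
        String text = "123abc";
        System.out.println(matchesAll(text));           // false: "abc" is not digits
        System.out.println(startsWithMatch(text));      // true: the text starts with "123"
        System.out.println(containsMatch(text));        // true
        System.out.println(containsMatchFrom(text, 3)); // false: only "abc" remains
    }
}
```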
3. Replacement Methods:
The following approaches help change the text in an input string:
S. No. | Method Name | Description |
---|---|---|
1 | public Matcher appendReplacement(StringBuffer sb, String replacement) | Implements a non-terminal append-and-replace step. |
2 | public StringBuffer appendTail(StringBuffer sb) | Implements a terminal append-and-replace step. |
3 | public String replaceAll(String replacement) | Replaces every subsequence of the input sequence that matches the pattern with the given replacement string. |
4 | public String replaceFirst(String replacement) | Replaces the first subsequence of the input sequence that matches the pattern with the given replacement string. |
5 | public static String quoteReplacement(String s) | Returns a literal replacement String for the specified String. The returned String can be passed as a literal replacement to the appendReplacement method of the Matcher class. |
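The role of quoteReplacement() is easiest to see with a replacement that contains a dollar sign, which replacement strings otherwise treat as a group reference. The sketch below uses a hypothetical placeholder token PRICE and a hypothetical helper name; neither comes from the original article.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class for literal replacements.
class QuoteReplacementDemo {

    // Replaces every occurrence of the placeholder PRICE with the given text, taken literally.
    static String fillInPrice(String template, String literalPrice) {
        Matcher matcher = Pattern.compile("PRICE").matcher(template);
        // quoteReplacement escapes '$' and '\' so they are not read as group references or escapes.
        return matcher.replaceAll(Matcher.quoteReplacement(literalPrice));
    }

    public static void main(String[] args) {
        // Passing "$5" unquoted would be read as a reference to capturing group 5,
        // which this pattern does not have.
        System.out.println(fillInPrice("Total: PRICE", "$5")); // prints Total: $5
    }
}
```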
Java Matcher Example
To help you understand how the Matcher class functions, below is a little Java Matcher example:
First a Pattern instance is created from a regular expression, and then a Matcher instance is created from that Pattern instance. The matches() method is then called on the Matcher instance. If the regular expression matches the text, matches() returns true; otherwise, it returns false.
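The steps described above can be sketched as follows. The class name, the sample text, and the regular expression are assumptions chosen for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class: Pattern -> Matcher -> matches().
class MatcherExample {

    static boolean matchesWholeText(String regex, String text) {
        Pattern pattern = Pattern.compile(regex); // step 1: compile the regular expression
        Matcher matcher = pattern.matcher(text);  // step 2: create a Matcher for the text
        return matcher.matches();                 // step 3: match the whole text
    }

    public static void main(String[] args) {
        String text = "This is the text to be searched for occurrences of the http:// pattern.";
        // The leading and trailing .* let the regex cover the entire text.
        System.out.println(matchesWholeText(".*http://.*", text)); // prints true
    }
}
```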
The Matcher class allows you to accomplish a great deal more.
More About the Java Matcher Class
A Matcher is an engine that interprets a Pattern to perform match operations on a character sequence.
A matcher is created from a pattern by calling the pattern's matcher method. Once created, a matcher can perform three different kinds of match operations:
- The matches method attempts to match the entire input sequence against the pattern.
- The lookingAt method attempts to match the input sequence against the pattern, starting at the beginning.
- The find method scans the input sequence for the next subsequence that matches the pattern.
Each of these methods returns a boolean indicating success or failure. More information about a successful match can be obtained by querying the state of the matcher.
- A matcher finds matches in a subset of its input called the region. By default, the region contains all of the matcher's input.
- The region can be modified via the region method and queried via the regionStart and regionEnd methods. The way that the region boundaries interact with some pattern constructs can be changed.
- This class also defines methods for replacing matched subsequences with new strings whose contents can, if desired, be computed from the match result. The appendReplacement and appendTail methods can be used together to collect the result into an existing string buffer or string builder. Alternatively, the more convenient replaceAll method can be used to create a string in which every matching subsequence in the input sequence is replaced.
- The explicit state of a matcher includes the start and end indices of the most recent successful match. It also includes the start and end indices of the input subsequence captured by each capturing group in the pattern, as well as a total count of such subsequences. For convenience, methods are also provided for returning these captured subsequences in string form.
- The implicit state of a matcher includes the input character sequence as well as the append position, which is initially zero and is updated by the appendReplacement method.
- A matcher can be reset explicitly by calling its reset() method or, if a new input sequence is desired, its reset(CharSequence) method. Resetting a matcher discards its explicit state information and sets the append position to zero. Instances of this class are not safe for use by multiple concurrent threads.
The explicit state of a matcher is initially undefined; attempting to query any part of it before a successful match causes an IllegalStateException to be thrown. Every match operation recomputes the matcher's explicit state.
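Two of the points above, the undefined initial state and the region, can be illustrated with a short sketch. The class name, helper names, and sample input are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class for matcher state and regions.
class MatcherStateDemo {

    // Returns true if querying start() before any match operation throws IllegalStateException.
    static boolean queryBeforeMatchThrows() {
        Matcher matcher = Pattern.compile("a").matcher("banana");
        try {
            matcher.start(); // no match has been attempted yet
            return false;
        } catch (IllegalStateException expected) {
            return true;
        }
    }

    // Counts matches of the regex inside text[from, to) by restricting the matcher's region.
    static int countInRegion(String regex, String text, int from, int to) {
        Matcher matcher = Pattern.compile(regex).matcher(text);
        matcher.region(from, to);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(queryBeforeMatchThrows());             // prints true
        System.out.println(countInRegion("a", "banana", 0, 6));   // prints 3
        System.out.println(countInRegion("a", "banana", 2, 6));   // prints 2 (region is "nana")
    }
}
```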
Creating a Matcher
The matcher() function in the Pattern class is used to create a Matcher. Here's an illustration:
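A minimal sketch of this step, assuming an illustrative regular expression and text that are not from the original article:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class showing how a Matcher is obtained from a Pattern.
class CreateMatcherDemo {

    static Matcher createMatcher(String regex, String text) {
        Pattern pattern = Pattern.compile(regex);
        // Matcher has no public constructor; instances come from Pattern.matcher().
        return pattern.matcher(text);
    }

    public static void main(String[] args) {
        String text = "This is the text to be searched for occurrences of the http:// pattern.";
        Matcher matcher = createMatcher("http://", text);
        // The matcher remembers which pattern created it.
        System.out.println(matcher.pattern().pattern()); // prints http://
    }
}
```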
After this example, the matcher variable holds a Matcher instance, which can be used to match the regular expression it was built from against various text inputs.
Java matches() Method
The matches() method of the Matcher class matches the regular expression against the entire text that was passed to the Pattern.matcher() method when the Matcher was created. Here is an illustration using Matcher.matches():
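The following sketch shows the whole-text requirement in action. The date-like regular expression and the sample inputs are assumptions chosen for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class for Matcher.matches().
class MatchesDemo {

    // An illustrative pattern: four digits, dash, two digits, dash, two digits.
    static final Pattern DATE = Pattern.compile("\\d{4}-\\d{2}-\\d{2}");

    static boolean isDate(String text) {
        Matcher matcher = DATE.matcher(text);
        return matcher.matches(); // true only if the entire text fits the pattern
    }

    public static void main(String[] args) {
        System.out.println(isDate("2024-01-31"));    // prints true
        System.out.println(isDate("on 2024-01-31")); // prints false: extra text before the date
    }
}
```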
The matches() method returns true if the regular expression matches the entire text. Otherwise, it returns false.
The matches() method cannot be used to look for numerous instances of a regular expression in a text. Use the find(), start(), and end() methods to accomplish that.
Java lookingAt() Method
With one significant exception, the Matcher lookingAt() method functions similarly to the matches() method. Unlike matches(), which matches the regular expression against the entire text, lookingAt() just matches the regular expression against the beginning of the text. In other words, lookingAt() will return true whereas matches() will return false if the regular expression matches the beginning of a text but not the entire text.
Here is an example of a Matcher lookingAt():
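A minimal sketch, assuming an illustrative sample text; the class and helper names are hypothetical:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class contrasting lookingAt() with matches().
class LookingAtDemo {

    static boolean matchesBeginning(String regex, String text) {
        Matcher matcher = Pattern.compile(regex).matcher(text);
        return matcher.lookingAt(); // the match must start at the beginning, but may end early
    }

    static boolean matchesEntireText(String regex, String text) {
        Matcher matcher = Pattern.compile(regex).matcher(text);
        return matcher.matches();   // the match must cover the whole text
    }

    public static void main(String[] args) {
        String text = "This is the text to be searched for occurrences of the http:// pattern.";
        String regex = "This is the";
        System.out.println("lookingAt = " + matchesBeginning(regex, text));  // prints lookingAt = true
        System.out.println("matches   = " + matchesEntireText(regex, text)); // prints matches   = false
    }
}
```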
In this example, the regular expression "This is the" is matched against both the beginning of the text and the entire text. Because the regular expression matches the beginning of the text, lookingAt() returns true.
The text has more characters than the regular expression covers, so matching the regular expression against the entire text with matches() returns false. For matches() to succeed, the text would have to be exactly "This is the", with no other characters before or after it.
Java reset() Method
The internal matching state of the Matcher is reset via the Matcher reset() method. The Matcher will internally keep track of how far it has searched through the input text if you have begun matching occurrences in a string using the find() method. Resetting the matching causes it to start over at the beginning of the text.
A reset(CharSequence) method is also available. By using the CharSequence given as a parameter rather than the CharSequence the Matcher was initially built with, this method resets the Matcher.
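Both reset variants can be sketched as follows. The class name, helper names, and sample inputs are hypothetical, and the first helper assumes the regular expression occurs at least twice in the text.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class for reset() and reset(CharSequence).
class ResetDemo {

    // Returns the start index of the first match, the second match, and the first match
    // found after reset(). Assumes at least two matches; otherwise start() would throw.
    static int[] startsAroundReset(String regex, String text) {
        Matcher matcher = Pattern.compile(regex).matcher(text);
        matcher.find();
        int first = matcher.start();
        matcher.find();
        int second = matcher.start();
        matcher.reset();               // forget how far the search has progressed
        matcher.find();
        int afterReset = matcher.start();
        return new int[] {first, second, afterReset};
    }

    // Points an existing matcher at a new text and returns the first match there, or null.
    static String firstMatchAfterResetTo(Matcher matcher, CharSequence newText) {
        matcher.reset(newText);
        return matcher.find() ? matcher.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(startsAroundReset("a", "banana"))); // prints [1, 3, 1]
        Matcher digits = Pattern.compile("\\d+").matcher("abc 12");
        System.out.println(firstMatchAfterResetTo(digits, "xyz 345"));         // prints 345
    }
}
```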
Java group() Method
Imagine that you are looking for URLs in a text and that you would like to extract the URLs you find. Of course, you could accomplish this with the start() and end() methods, but using the group functions makes things simpler.
Groups are marked with parentheses in the regular expression. For illustration:
The regular expression (John) matches the text John. The parentheses are not part of the text that is matched; they mark a group. When a match is found in a text, you can access the portion of the text that was matched by the group.
You access a group with the group(int groupNo) method. A regular expression can contain more than one group, each marked by its own pair of parentheses. The group(int groupNo) method returns the text that was matched by the part of the expression inside the given group.
Group number 0 is always the entire regular expression. Groups marked with parentheses are numbered starting from 1.
Here is an example of a Matcher group():
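A minimal sketch, assuming an illustrative sample text; the class and helper names are hypothetical:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class for Matcher.group(int).
class GroupDemo {

    // Collects the text captured by the given group for every match in the text.
    static List<String> collectGroup(String regex, String text, int groupNo) {
        Matcher matcher = Pattern.compile(regex).matcher(text);
        List<String> found = new ArrayList<>();
        while (matcher.find()) {
            found.add(matcher.group(groupNo)); // group(0) would return the whole match instead
        }
        return found;
    }

    public static void main(String[] args) {
        String text = "John writes about this, and John Doe writes about that,"
                + " and John Wayne writes about everything.";
        System.out.println(collectGroup("(John)", text, 1)); // prints [John, John, John]
    }
}
```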
This example looks for the word John in the text. For each match that is found, group number 1, which holds the text matched by the group in parentheses, is retrieved.
Java find() + start() + end() Methods Example
In the text that was supplied to the Pattern.matcher(text) method when the Matcher was formed, the find() method of the Matcher looks for instances of the regular expressions. If there are many matches in the text, the find() method will locate the first one before moving on to the next match with each consecutive call.
The indexes into the text where the detected match starts and ends are provided by the methods start() and end(). In actuality, end() returns the character's index that comes right after the matched section's end. Thus, inside a call to String.substring(), you can use the return values of start() and end().
Here is a Java Matcher find(), start() and end() example:
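A minimal sketch, assuming an illustrative search string in which "is" occurs four times; the class and helper names are hypothetical:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class for find(), start() and end().
class FindStartEndDemo {

    // Returns one "start - end" entry per match, in the order the matches are found.
    static List<String> findRanges(String regex, String text) {
        Matcher matcher = Pattern.compile(regex).matcher(text);
        List<String> ranges = new ArrayList<>();
        while (matcher.find()) {            // each call moves on to the next match
            ranges.add(matcher.start() + " - " + matcher.end());
        }
        return ranges;
    }

    public static void main(String[] args) {
        String text = "This is the text which is to be searched for occurrences of the word 'is'.";
        List<String> ranges = findRanges("is", text);
        for (int i = 0; i < ranges.size(); i++) {
            System.out.println("found: " + (i + 1) + " : " + ranges.get(i));
        }
        // prints:
        // found: 1 : 2 - 4
        // found: 2 : 5 - 7
        // found: 3 : 23 - 25
        // found: 4 : 70 - 72
    }
}
```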
In the search string, this example finds the pattern "is" four times.
Java replaceAll() + replaceFirst() Methods Example
You can replace parts of the string the Matcher is searching through by using the replaceAll() and replaceFirst() methods. The replaceAll() method replaces all matches of the regular expression. The replaceFirst() method replaces only the first match.
Before any matching is done, the Matcher is reset so that matching begins at the start of the input text.
Here are two illustrations:
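A minimal sketch of both methods. The sample text, the regular expression (the word John followed by one more word and a space), and the replacement are assumptions chosen for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class for replaceAll() and replaceFirst().
class ReplaceDemo {

    // "John", a space, one word, and a trailing space (illustrative pattern).
    static final Pattern JOHN_AND_WORD = Pattern.compile("John \\w+ ");

    static String replaceEvery(String text, String replacement) {
        Matcher matcher = JOHN_AND_WORD.matcher(text);
        return matcher.replaceAll(replacement);   // every match is replaced
    }

    static String replaceOnlyFirst(String text, String replacement) {
        Matcher matcher = JOHN_AND_WORD.matcher(text);
        return matcher.replaceFirst(replacement); // only the first match is replaced
    }

    public static void main(String[] args) {
        String text = "John writes about this, and John Doe writes about that,"
                + " and John Wayne writes about everything.";
        System.out.println(replaceEvery(text, "Joe Blocks "));
        // prints: Joe Blocks about this, and Joe Blocks writes about that, and Joe Blocks writes about everything.
        System.out.println(replaceOnlyFirst(text, "Joe Blocks "));
        // prints: Joe Blocks about this, and John Doe writes about that, and John Wayne writes about everything.
    }
}
```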
In the first string printed, the string Joe Blocks has been substituted for every instance of John followed by a word. In the second string, only the first instance is replaced.
Java appendReplacement() + appendTail() Methods Example
To replace string tokens in an input text and append the new string to a StringBuffer, use the Matcher appendReplacement() and appendTail() methods.
After finding a match with the find() method, you can call the appendReplacement() method. Doing so appends characters from the input text to the StringBuffer and replaces the matched text with the replacement. Only the characters from the end of the previous match up to just before the current match are copied.
You can use find() to look for matches in the input text until none are found because the appendReplacement() method keeps track of what has been copied into the StringBuffer.
A portion of the input text will still not have been transferred into the StringBuffer after the last match has been discovered. From the last match's conclusion through the end of the input text, these are the characters. You can append these final characters to the StringBuffer by using the appendTail() method.
Here is an example:
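A minimal sketch of the loop, assuming an illustrative text, regular expression, and replacement; the class and helper names are hypothetical:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class for appendReplacement() and appendTail().
class AppendReplacementDemo {

    static String replaceStepByStep(String regex, String text, String literalReplacement) {
        Matcher matcher = Pattern.compile(regex).matcher(text);
        StringBuffer buffer = new StringBuffer();
        while (matcher.find()) {
            // Copies the text between the previous match and this one, then the replacement.
            // quoteReplacement keeps '$' and '\' in the replacement literal.
            matcher.appendReplacement(buffer, Matcher.quoteReplacement(literalReplacement));
        }
        // Copies whatever follows the last match.
        matcher.appendTail(buffer);
        return buffer.toString();
    }

    public static void main(String[] args) {
        String text = "John writes about this, and John Doe writes about that,"
                + " and John Wayne writes about everything.";
        System.out.println(replaceStepByStep("John \\w+ ", text, "Joe Blocks "));
        // prints: Joe Blocks about this, and Joe Blocks writes about that, and Joe Blocks writes about everything.
    }
}
```

Inside the loop there is room to compute a different replacement for each match, which is the main reason to prefer this pair of methods over replaceAll().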
Keep in mind that appendTail() is called immediately after the while(matcher.find()) loop while appendReplacement() is called inside the loop.
The StringBuffer is built up one match at a time from the characters of the input text and the replacements.
Difference Between matcher() and Pattern.matches()
The matcher() method returns a Matcher that matches the input against the pattern.
The static method Pattern.matches(), however, compiles a regex and matches the entire input against it.
Let's develop test scenarios to demonstrate the distinction:
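One way to sketch such a comparison, assuming that find() is the method called on the Matcher and using an illustrative sample sentence; the class and helper names are hypothetical:

```java
import java.util.regex.Pattern;

// Hypothetical demo class contrasting matcher().find() with Pattern.matches().
class ContainsVersusIsDemo {

    // "Does the string contain the pattern?"
    static boolean containsPattern(String regex, String text) {
        return Pattern.compile(regex).matcher(text).find();
    }

    // "Is the whole string the pattern?"
    static boolean isPattern(String regex, String text) {
        return Pattern.matches(regex, text);
    }

    public static void main(String[] args) {
        String text = "This is a test for regex";
        System.out.println(containsPattern("for", text)); // prints true
        System.out.println(isPattern("for", text));       // prints false
        System.out.println(isPattern("for", "for"));      // prints true
    }
}
```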
Simply put, when we use matcher() together with find(), we ask whether the string contains the pattern.
With Pattern.matches(), we ask whether the string as a whole is the pattern.
Pattern.matches() returns false whenever the pattern covers only part of the string, since it tries to match the entire string.
Conclusion
- To search a text for numerous instances of a regular expression, the Java Matcher class, java.util.regex.Matcher, is used
- A string of characters that makes up a pattern is known as a regular expression, or regex. We can use this specific pattern to identify matched strings whenever we are looking for any data
- A Matcher instance is formed from a Pattern instance after the Pattern instance has been created from a regular expression
- The find(), start(), and end() methods can be used to look for numerous instances of a regular expression in a text
- The entire regular expression is always the group with the number 0. Group number 1 can be used to gain access to the first group that is enclosed in parentheses.
- The String class has only two replace() overloads, whereas the Matcher class has many more features available