java.util.regex
public final class: Pattern
java.lang.Object
java.util.regex.Pattern
All Implemented Interfaces:
Serializable
Represents a pattern used for matching, searching, or replacing strings.
{@code Pattern}s are specified as regular expressions and compiled with the
static {@code compile} methods of this class. They are then used in
conjunction with a Matcher to perform the actual search.
A typical use case looks like this:
Pattern p = Pattern.compile("Hello, A[a-z]*!");
Matcher m = p.matcher("Hello, Android!");
boolean b1 = m.matches(); // true
m.reset("Hello, Robot!");
boolean b2 = m.matches(); // false
The above code could also be written in a more compact fashion, though this
variant is less efficient, since {@code Pattern} and {@code Matcher} objects
are created on the fly instead of being reused:
boolean b1 = Pattern.matches("Hello, A[a-z]*!", "Hello, Android!"); // true
boolean b2 = Pattern.matches("Hello, A[a-z]*!", "Hello, Robot!"); // false
| Field Summary | | |
|---|---|---|
| static final boolean | _DEBUG_ | |
| public static final int | UNIX_LINES | This constant specifies that only the Unix line ending ('\n') is recognized as a line terminator by the '.', '^', and '$' meta characters. |
| public static final int | CASE_INSENSITIVE | This constant specifies that a {@code Pattern} is matched case-insensitively. That is, the patterns "a+" and "A+" would both match the string "aAaAaA". |
| public static final int | COMMENTS | This constant specifies that a {@code Pattern} may contain whitespace or comments. Otherwise comments and whitespace are taken as literal characters. |
| public static final int | MULTILINE | This constant specifies that the meta characters '^' and '$' match the beginning and end of each input line, respectively. Normally, they match only the beginning and the end of the complete input. |
| public static final int | LITERAL | This constant specifies that the whole {@code Pattern} is to be taken literally, that is, all meta characters lose their meanings. |
| public static final int | DOTALL | This constant specifies that the '.' meta character matches arbitrary characters, including line endings, which is normally not the case. |
| public static final int | UNICODE_CASE | This constant specifies that a {@code Pattern} is matched case-insensitively with regard to all Unicode characters. It is used in conjunction with the CASE_INSENSITIVE constant to extend its meaning to all Unicode characters. |
| public static final int | CANON_EQ | This constant specifies that a character in a {@code Pattern} and a character in the input string only match if they are canonically equivalent. |
| static final int | BACK_REF_NUMBER | |
| static final int | flagsBitMask | Bit mask that includes all defined match flags. |
| transient AbstractSet | start | |
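The effect of several of these flags can be seen in a small experiment. The following is only an illustrative sketch: the class name `FlagDemo` and the helper `found` are hypothetical names introduced here, and the behavior shown is that of the standard `java.util.regex` implementation.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of java.util.regex.
class FlagDemo {
    // Returns true if the regex finds at least one match in the input
    // when compiled with the given flags.
    static boolean found(String regex, int flags, String input) {
        return Pattern.compile(regex, flags).matcher(input).find();
    }

    public static void main(String[] args) {
        String text = "first line\nsecond line";

        // Without MULTILINE, '^' matches only at the start of the whole input.
        System.out.println(found("^second", 0, text));                 // false
        // With MULTILINE, '^' also matches right after a line terminator.
        System.out.println(found("^second", Pattern.MULTILINE, text)); // true

        // '.' does not match '\n' unless DOTALL is set.
        System.out.println(found("first.*second", 0, text));              // false
        System.out.println(found("first.*second", Pattern.DOTALL, text)); // true

        // CASE_INSENSITIVE ignores case differences.
        System.out.println(found("FIRST", Pattern.CASE_INSENSITIVE, text)); // true

        // LITERAL treats "a+" as the two characters 'a' and '+'.
        System.out.println(found("a+", Pattern.LITERAL, "aaa")); // false
        System.out.println(found("a+", Pattern.LITERAL, "a+"));  // true
    }
}
```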
| Method from java.util.regex.Pattern Summary: |
|---|
| compCount, compile, compile, consCount, flags, getSupplement, groupCount, matcher, matches, pattern, quote, split, split, toString |
| Methods from java.lang.Object: |
|---|
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Method from java.util.regex.Pattern Detail: |
int compCount() {
    return this.compCount + 1;
}
|
public static Pattern compile(String pattern) {
    return compile(pattern, 0);
}
Compiles a regular expression, creating a new {@code Pattern} instance in the
process. This is a convenience method that calls compile(String, int) with a
{@code flags} value of zero. |
public static Pattern compile(String pattern, int flags) throws PatternSyntaxException {
    if ((flags != 0) &&
            ((flags | flagsBitMask) != flagsBitMask)) {
        throw new IllegalArgumentException(Messages.getString("regex.1C"));
    }
    AbstractSet.counter = 1;
    return new Pattern().compileImpl(pattern, flags);
}
Compiles a regular expression, creating a new {@code Pattern} instance in
the process. The {@code flags} argument modifies the behavior of the
{@code Pattern}; as the listing shows, a value containing bits outside the
defined match flags causes an {@code IllegalArgumentException}. |
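As a rough illustration of how the {@code flags} argument is typically used, the sketch below combines two flags with a bitwise OR and guards against a malformed expression. The class `CompileDemo` and the helper `tryCompile` are hypothetical names introduced for this example; returning `null` on a syntax error is only one possible design.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical demo class, not part of java.util.regex.
class CompileDemo {
    // Compiles the regex with the given flags, or returns null
    // if the expression is syntactically invalid.
    static Pattern tryCompile(String regex, int flags) {
        try {
            return Pattern.compile(regex, flags);
        } catch (PatternSyntaxException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // Flags are bit values, so several can be combined with '|'.
        int flags = Pattern.CASE_INSENSITIVE | Pattern.DOTALL;
        Pattern p = tryCompile("hello.world", flags);

        // Case is ignored and '.' matches the newline.
        System.out.println(p.matcher("HELLO\nWORLD").matches());  // true
        // flags() reports the flags the pattern was compiled with.
        System.out.println((p.flags() & Pattern.DOTALL) != 0);    // true
        // An unclosed group is a syntax error.
        System.out.println(tryCompile("(unclosed", 0) == null);   // true
    }
}
```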
int consCount() {
    return this.consCount + 1;
}
|
public int flags() {
    return this.flags;
}
Returns the flags that have been set for this {@code Pattern}. |
static char getSupplement(char ch) {
    char res = ch;
    if (ch >= 'a' && ch <= 'z') {
        res -= 32;
    } else if (ch >= 'A' && ch <= 'Z') {
        res += 32;
    }
    return res;
}
Returns the opposite-case counterpart of the given character. Currently
only ASCII letters are handled; any other character is returned unchanged. |
int groupCount() {
    return globalGroupIndex;
}
Returns the number of groups found at compile time. |
public Matcher matcher(CharSequence input) {
    return new Matcher(this, input);
}
Returns a Matcher for the {@code Pattern} and a given input. The
{@code Matcher} can be used to match the {@code Pattern} against the
whole input, find occurrences of the {@code Pattern} in the input, or
replace parts of the input. |
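The three uses named above (matching the whole input, finding occurrences, and replacing) can be sketched as follows. `MatcherDemo` and `countMatches` are hypothetical names chosen for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of java.util.regex.
class MatcherDemo {
    // Counts the non-overlapping occurrences of the pattern in the input.
    static int countMatches(Pattern p, CharSequence input) {
        Matcher m = p.matcher(input);
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        Pattern digits = Pattern.compile("[0-9]+");
        String input = "order 12, item 345";

        // matches() requires the whole input to match the pattern.
        System.out.println(digits.matcher(input).matches());        // false
        // find() locates occurrences anywhere in the input.
        System.out.println(countMatches(digits, input));            // 2
        // replaceAll() substitutes every occurrence.
        System.out.println(digits.matcher(input).replaceAll("#"));  // order #, item #
    }
}
```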
public static boolean matches(String regex, CharSequence input) {
    return Pattern.compile(regex).matcher(input).matches();
}
Tries to match a given regular expression against a given input. This is
actually nothing but a convenience method that compiles the regular
expression into a {@code Pattern}, builds a Matcher for it, and
then does the match. If the same regular expression is used for multiple
operations, it is recommended to compile it into a {@code Pattern}
explicitly and request a reusable {@code Matcher}. |
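One way to follow this recommendation is to compile the expression once and reuse a single {@code Matcher} through `Matcher.reset(CharSequence)`. The sketch below is illustrative only; `ReuseDemo`, `GREETING`, and `countGreetings` are hypothetical names, and a reused {@code Matcher} like this one should not be shared between threads.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of java.util.regex.
class ReuseDemo {
    // Compiled once and shared; Pattern instances can be reused freely.
    private static final Pattern GREETING = Pattern.compile("Hello, A[a-z]*!");

    // Counts how many inputs match, reusing one Matcher for all of them.
    static int countGreetings(String[] inputs) {
        Matcher m = GREETING.matcher("");
        int n = 0;
        for (String s : inputs) {
            // reset(CharSequence) points the existing Matcher at new input.
            if (m.reset(s).matches()) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        String[] inputs = {"Hello, Android!", "Hello, Robot!", "Hello, Alice!"};
        System.out.println(countGreetings(inputs)); // 2
    }
}
```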
public String pattern() {
    return lexemes.toString();
}
Returns the regular expression that was compiled into this
{@code Pattern}. |
public static String quote(String s) {
    StringBuilder sb = new StringBuilder().append("\\Q"); //$NON-NLS-1$
    int apos = 0;
    int k;
    while ((k = s.indexOf("\\E", apos)) >= 0) { //$NON-NLS-1$
        sb.append(s.substring(apos, k + 2)).append("\\\\E\\Q"); //$NON-NLS-1$
        apos = k + 2;
    }
    return sb.append(s.substring(apos)).append("\\E").toString(); //$NON-NLS-1$
}
Quotes a given string using "\Q" and "\E", so that all other
meta-characters lose their special meaning. If the string is used for a
{@code Pattern} afterwards, it can only be matched literally. |
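A short sketch of why quoting matters when the text to match contains meta characters. `QuoteDemo` and `matchesLiterally` are hypothetical names used only for this illustration.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of java.util.regex.
class QuoteDemo {
    // Returns true if the input is exactly the given literal text.
    static boolean matchesLiterally(String literal, String input) {
        return Pattern.compile(Pattern.quote(literal)).matcher(input).matches();
    }

    public static void main(String[] args) {
        // Unquoted, '+' is a quantifier, so the text does not match itself.
        System.out.println(Pattern.matches("1+1=2", "1+1=2"));   // false
        // Quoted, every character is taken literally.
        System.out.println(matchesLiterally("1+1=2", "1+1=2"));  // true
        System.out.println(matchesLiterally("1+1=2", "11=2"));   // false
        // A literal that itself contains "\E" is still handled.
        System.out.println(matchesLiterally("a\\Eb", "a\\Eb"));  // true
    }
}
```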
public String[] split(CharSequence input) {
    return split(input, 0);
}
Splits a given input around occurrences of a regular expression. This is
a convenience method that is equivalent to calling
split(CharSequence, int) with a limit of 0. |
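For illustration, the sketch below splits a comma-separated string with the single-argument method. Note that interior empty strings are kept while trailing ones are dropped, which is the behavior of a limit of 0. `SplitDemo` and `splitOnCommas` are hypothetical names.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of java.util.regex.
class SplitDemo {
    private static final Pattern COMMA = Pattern.compile(",");

    // Splits the input around commas using the default limit of 0.
    static String[] splitOnCommas(String input) {
        return COMMA.split(input);
    }

    public static void main(String[] args) {
        // The empty string between "b" and "c" is kept;
        // the two trailing empty strings are discarded.
        System.out.println(Arrays.toString(splitOnCommas("a,b,,c,,"))); // [a, b, , c]
    }
}
```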
public String[] split(CharSequence inputSeq, int limit) {
    ArrayList res = new ArrayList();
    Matcher mat = matcher(inputSeq);
    int index = 0;
    int curPos = 0;
    if (inputSeq.length() == 0) {
        return new String [] {""}; //$NON-NLS-1$
    } else {
        while (mat.find() && (index + 1 < limit || limit <= 0)) {
            res.add(inputSeq.subSequence(curPos, mat.start()).toString());
            curPos = mat.end();
            index++;
        }
        res.add(inputSeq.subSequence(curPos, inputSeq.length()).toString());
        index++;
        /*
         * discard trailing empty strings
         */
        if (limit == 0) {
            while (--index >= 0 && res.get(index).toString().length() == 0) {
                res.remove(index);
            }
        }
    }
    return (String[]) res.toArray(new String[index >= 0 ? index : 0]);
}
Splits the given input sequence around occurrences of the {@code Pattern}.
The method first finds all occurrences of the {@code Pattern} in the input
sequence, then builds an array of the "remaining" strings before, between,
and after these occurrences. The {@code limit} parameter determines the
maximum number of entries in the resulting array and the handling of
trailing empty strings: a positive limit yields at most {@code limit}
entries, a limit of zero places no bound on the number of entries and
discards trailing empty strings, and a negative limit places no bound and
keeps them. |
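The effect of the limit parameter is easiest to see side by side. The following sketch uses the hypothetical names `SplitLimitDemo` and `splitOnCommas`; the outputs shown are those of the standard `java.util.regex` implementation.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of java.util.regex.
class SplitLimitDemo {
    private static final Pattern COMMA = Pattern.compile(",");

    // Splits the input around commas with an explicit limit.
    static String[] splitOnCommas(String input, int limit) {
        return COMMA.split(input, limit);
    }

    public static void main(String[] args) {
        String input = "a,b,c,,";

        // Positive limit: at most 2 entries; the rest stays in the last one.
        System.out.println(Arrays.toString(splitOnCommas(input, 2)));  // [a, b,c,,]
        // Zero: no bound on entries, trailing empty strings discarded.
        System.out.println(Arrays.toString(splitOnCommas(input, 0)));  // [a, b, c]
        // Negative: no bound on entries, trailing empty strings kept.
        System.out.println(Arrays.toString(splitOnCommas(input, -1))); // [a, b, c, , ]
    }
}
```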
public String toString() {
    return this.pattern();
}
|