java.lang.Object | +--org.apache.oro.text.regex.Perl5Compiler
Safe: The Perl5Compiler class is used to create compiled regular expressions conforming to the Perl5 regular expression syntax. It generates Perl5Pattern instances upon compilation to be used in conjunction with a Perl5Matcher instance. Please see the user's guide for more information about Perl5 regular expressions.
See Also: PatternCompiler, MalformedPatternException, Perl5Pattern, Perl5Matcher
Field Summary | |
private static char | __CASE_INSENSITIVE |
private int | __cost |
private static char | __EXTENDED |
private static char | __GLOBAL |
private static String | __HEX_DIGIT |
private CharStringPointer | __input |
private static char | __KEEP |
private static String | __META_CHARS |
private char[] | __modifierFlags |
private static char | __MULTILINE |
private static int | __NONNULL |
private int | __numParentheses |
private char[] | __program |
private int | __programSize |
private static char | __READ_ONLY |
private boolean | __sawBackreference |
private static int | __SIMPLE |
private static char | __SINGLELINE |
private static int | __SPSTART |
private static int | __TRYAGAIN |
private static int | __WORSTCASE |
static int | CASE_INSENSITIVE_MASK | Enabled: A mask passed as an option to the compile methods to indicate a compiled regular expression should be case insensitive. |
static int | DEFAULT_MASK | Enabled: The default mask for the compile methods. |
static int | EXTENDED_MASK | Enabled: A mask passed as an option to the compile methods to indicate a compiled regular expression should be treated as a Perl5 extended pattern (i.e., a pattern using the /x modifier). |
static int | MULTILINE_MASK | Enabled: A mask passed as an option to the compile methods to indicate a compiled regular expression should treat input as having multiple lines. |
static int | READ_ONLY_MASK | Enabled: A mask passed as an option to the compile methods to indicate that the resulting Perl5Pattern should be treated as a read only data structure by Perl5Matcher, making it safe to share a single Perl5Pattern instance among multiple threads without needing synchronization. |
static int | SINGLELINE_MASK | Enabled: A mask passed as an option to the compile methods to indicate a compiled regular expression should treat input as being a single line. |
Constructor Summary | |
Perl5Compiler()
Enabled: |
Method Summary | |
private int | __emitArgNode(char operator, char arg) |
private void | __emitCode(char code) |
private int | __emitNode(char operator) |
private char | __getNextChar() |
private static boolean | __isComplexRepetitionOp(char[] ch, int offset) |
private static boolean | __isSimpleRepetitionOp(char ch) |
private int | __parseAlternation(int[] retFlags) |
private int | __parseAtom(int[] retFlags) |
private int | __parseBranch(int[] retFlags) |
private int | __parseCharacterClass() |
private int | __parseExpression(boolean isParenthesized, int[] hintFlags) |
private static int | __parseHex(char[] str, int offset, int maxLength, int[] scanned) |
private static int | __parseOctal(char[] str, int offset, int maxLength, int[] scanned) |
private static boolean | __parseRepetition(char[] str, int offset) |
private void | __programAddOperatorTail(int current, int value) |
private void | __programAddTail(int current, int value) |
private void | __programInsertOperator(char operator, int operand) |
private void | __setCharacterClassBits(char[] bits, int offset, char deflt, char ch) |
private static void | __setModifierFlag(char[] flags, char ch) |
Pattern | compile(char[] pattern) | Enabled: Same as calling compile(pattern, Perl5Compiler.DEFAULT_MASK); |
Pattern | compile(char[] pattern, int options) | Enabled: Compiles a Perl5 regular expression into a Perl5Pattern instance that can be used by a Perl5Matcher object to perform pattern matching. |
Pattern | compile(String pattern) | Enabled: Same as calling compile(pattern, Perl5Compiler.DEFAULT_MASK); |
Pattern | compile(String pattern, int options) | Enabled: Compiles a Perl5 regular expression into a Perl5Pattern instance that can be used by a Perl5Matcher object to perform pattern matching. |
static String | quotemeta(char[] expression) | Enabled: Given a character string, returns a Perl5 expression that interprets each character of the original string literally. |
static String | quotemeta(String expression) | Enabled: Given a character string, returns a Perl5 expression that interprets each character of the original string literally. |
Methods inherited from class java.lang.Object |
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
private static final int __WORSTCASE
private static final int __NONNULL
private static final int __SIMPLE
private static final int __SPSTART
private static final int __TRYAGAIN
private static final char __CASE_INSENSITIVE
private static final char __GLOBAL
private static final char __KEEP
private static final char __MULTILINE
private static final char __SINGLELINE
private static final char __EXTENDED
private static final char __READ_ONLY
private static final String __META_CHARS
private static final String __HEX_DIGIT
private CharStringPointer __input
private boolean __sawBackreference
private char[] __modifierFlags
private int __numParentheses
private int __programSize
private int __cost
private char[] __program
public static final int DEFAULT_MASK
The default mask for the compile methods. It is equal to 0.
The default behavior is for a regular expression to be case sensitive
and to not specify whether it is multiline or singleline. When neither
MULTILINE_MASK nor SINGLELINE_MASK is set, the ^, $, and .
metacharacters are interpreted according to the value of isMultiline()
in Perl5Matcher. By default, Perl5Matcher treats the Perl5Pattern
as though MULTILINE_MASK were enabled. If isMultiline() returns false,
the pattern is treated as though SINGLELINE_MASK were set. However,
compiling a pattern with MULTILINE_MASK or SINGLELINE_MASK
will ALWAYS override whatever behavior is specified by setMultiline()
in Perl5Matcher.
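The precedence rule described above can be sketched in a few lines of plain Java. This is only a model of the rule, not ORO code: the class name, the method name, and the two mask values are hypothetical stand-ins chosen for illustration, and the real Perl5Compiler constants may have different values.

```java
// A model of the precedence rule only. The mask values below are
// hypothetical stand-ins, not the real Perl5Compiler constants.
class MultilinePrecedenceSketch {
    static final int MULTILINE = 1 << 0;   // hypothetical value
    static final int SINGLELINE = 1 << 1;  // hypothetical value

    /**
     * Returns true when ^, $ and . should follow multiline rules.
     * A compile-time mask always wins; the matcher setting is consulted
     * only when neither mask was given.
     */
    static boolean effectiveMultiline(int compileOptions, boolean matcherMultiline) {
        if ((compileOptions & MULTILINE) != 0) {
            return true;
        }
        if ((compileOptions & SINGLELINE) != 0) {
            return false;
        }
        return matcherMultiline;
    }

    public static void main(String[] args) {
        // No mask: the matcher setting decides.
        System.out.println(effectiveMultiline(0, true));
        System.out.println(effectiveMultiline(0, false));
        // A mask overrides the matcher setting.
        System.out.println(effectiveMultiline(MULTILINE, false));
        System.out.println(effectiveMultiline(SINGLELINE, true));
    }
}
```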
public static final int CASE_INSENSITIVE_MASK
A mask passed as an option to the compile methods
to indicate a compiled regular expression should be case insensitive.
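As a rough illustration of what case insensitive compilation means, the sketch below uses the JDK's java.util.regex package as a stand-in so that it runs without the ORO jar. Pattern.CASE_INSENSITIVE is assumed here to be the closest JDK analog of CASE_INSENSITIVE_MASK; the class and method names are hypothetical.

```java
import java.util.regex.Pattern;

// JDK analog, not ORO: Pattern.CASE_INSENSITIVE stands in for
// Perl5Compiler.CASE_INSENSITIVE_MASK.
class CaseInsensitiveAnalog {
    /** Returns true if the whole input matches the regex. */
    static boolean matches(String regex, String input, boolean ignoreCase) {
        int flags = ignoreCase ? Pattern.CASE_INSENSITIVE : 0;
        return Pattern.compile(regex, flags).matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches("hello", "HELLO", true));
        System.out.println(matches("hello", "HELLO", false));
    }
}
```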
public static final int MULTILINE_MASK
A mask passed as an option to the compile methods
to indicate a compiled regular expression should treat input as having
multiple lines. This option affects the interpretation of
the ^ and $ metacharacters. When this mask is used,
the ^ metacharacter matches at the beginning of every line,
and the $ metacharacter matches at the end of every line.
Additionally, the . metacharacter will not match newlines when
an expression is compiled with MULTILINE_MASK, which is its
default behavior.
The SINGLELINE_MASK and MULTILINE_MASK should not be
used together.
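The line-anchoring behavior described above can be observed with a small runnable sketch. It uses java.util.regex from the JDK rather than ORO, on the assumption that Pattern.MULTILINE is a close analog of MULTILINE_MASK for ^ and $; the class and helper names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// JDK analog, not ORO: Pattern.MULTILINE stands in for
// Perl5Compiler.MULTILINE_MASK.
class MultilineAnalog {
    /** Collects every match of regex in input, in order. */
    static List<String> findAll(String regex, int flags, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex, flags).matcher(input);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        String input = "alpha\nbeta\ngamma";
        // ^ and $ match at every line boundary: three matches.
        System.out.println(findAll("^\\w+$", Pattern.MULTILINE, input));
        // Without the flag, ^ and $ only anchor to the whole input: no match.
        System.out.println(findAll("^\\w+$", 0, input));
        // The . metacharacter still does not match a newline.
        System.out.println(findAll("a.b", Pattern.MULTILINE, "a\nb"));
    }
}
```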
public static final int SINGLELINE_MASK
A mask passed as an option to the compile methods
to indicate a compiled regular expression should treat input as being
a single line. This option affects the interpretation of
the ^ and $ metacharacters. When this mask is used,
the ^ metacharacter matches at the beginning of the input,
and the $ metacharacter matches at the end of the input.
The ^ and $ metacharacters will not match at the beginning
and end of lines occurring between the beginning and end of the input.
Additionally, the . metacharacter will match newlines when
an expression is compiled with SINGLELINE_MASK, unlike its
default behavior.
The SINGLELINE_MASK and MULTILINE_MASK should not be
used together.
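A minimal sketch of the single-line behavior, again using java.util.regex as a stand-in for ORO. The assumption is that Pattern.DOTALL without Pattern.MULTILINE approximates SINGLELINE_MASK: the dot matches newlines and ^ anchors only to the start of the input. The class and method names are hypothetical.

```java
import java.util.regex.Pattern;

// JDK analog, not ORO: Pattern.DOTALL (without Pattern.MULTILINE) stands in
// for Perl5Compiler.SINGLELINE_MASK.
class SinglelineAnalog {
    /** Returns true if regex matches anywhere in input. */
    static boolean found(String regex, int flags, String input) {
        return Pattern.compile(regex, flags).matcher(input).find();
    }

    public static void main(String[] args) {
        String input = "first\nsecond";
        // With DOTALL the dot can cross the newline.
        System.out.println(found("first.second", Pattern.DOTALL, input));
        // Without it the dot stops at the newline.
        System.out.println(found("first.second", 0, input));
        // ^ does not match at the start of an interior line.
        System.out.println(found("^second", Pattern.DOTALL, input));
    }
}
```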
public static final int EXTENDED_MASK
A mask passed as an option to the compile methods
to indicate a compiled regular expression should be treated as a Perl5
extended pattern (i.e., a pattern using the /x modifier). This
option tells the compiler to ignore whitespace that is not backslashed or
within a character class. It also tells the compiler to treat the
# character as a metacharacter introducing a comment as in
Perl. In other words, the # character will comment out any
text in the regular expression between it and the next newline.
The intent of this option is to allow you to divide your patterns
into more readable parts. It is provided to maintain compatibility
with Perl5 regular expressions, although it will not often
make sense to use it in Java.
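To show what an extended pattern looks like in practice, here is a sketch that uses java.util.regex, whose Pattern.COMMENTS flag is assumed to be a reasonable analog of EXTENDED_MASK (whitespace in the pattern is ignored and # starts a comment that runs to the end of the line). The class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// JDK analog, not ORO: Pattern.COMMENTS stands in for
// Perl5Compiler.EXTENDED_MASK.
class ExtendedAnalog {
    // Whitespace and #-comments inside the pattern are ignored.
    static final Pattern YEAR_MONTH = Pattern.compile(
        "(\\d{4})   # four-digit year\n" +
        "-          # literal hyphen\n" +
        "(\\d{2})   # two-digit month\n",
        Pattern.COMMENTS);

    /** Returns {year, month}, or null if the input does not match. */
    static String[] parseYearMonth(String input) {
        Matcher m = YEAR_MONTH.matcher(input);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        String[] parts = parseYearMonth("2024-05");
        System.out.println(parts[0] + " / " + parts[1]);
        // Only the pattern's whitespace is ignored, not the input's.
        System.out.println(parseYearMonth("2024 - 05") == null);
    }
}
```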
public static final int READ_ONLY_MASK
A mask passed as an option to the compile methods
to indicate that the resulting Perl5Pattern should be treated as a
read only data structure by Perl5Matcher, making it safe to share
a single Perl5Pattern instance among multiple threads without needing
synchronization. Without this option, Perl5Matcher reserves the right
to store heuristic or other information in Perl5Pattern that might
accelerate future matches. When you use this option, Perl5Matcher will
not store or modify any information in a Perl5Pattern. Use this option
when you want to share a Perl5Pattern instance among multiple threads
using different Perl5Matcher instances.
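The sharing arrangement described above (one compiled pattern, one matcher per thread) can be sketched with the JDK's own regex classes. This is an analog rather than ORO code: java.util.regex.Pattern is immutable and safe to share, while each Matcher is confined to one task, which mirrors the roles the text assigns to a read only Perl5Pattern and per-thread Perl5Matcher instances. All names below are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// JDK analog, not ORO: one shared compiled pattern, one matcher per task.
class SharedPatternAnalog {
    // Compiled once and shared by every worker thread.
    static final Pattern WORD = Pattern.compile("\\w+");

    static int countWords(String text) {
        Matcher m = WORD.matcher(text); // matcher is local to this call
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    /** Counts words in all texts, one task per text, on a small pool. */
    static int countWordsInParallel(List<String> texts) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (String t : texts) {
                Callable<Integer> task = () -> countWords(t);
                futures.add(pool.submit(task));
            }
            int total = 0;
            for (Future<Integer> f : futures) {
                total += f.get();
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> texts = new ArrayList<>();
        texts.add("a b");
        texts.add("c d e");
        System.out.println(countWordsInParallel(texts));
    }
}
```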
Constructor Detail |
public Perl5Compiler()
Method Detail |
public static final String quotemeta(char[] expression)
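Given a character string, returns a Perl5 expression that interprets each character of the original string literally. In effect, this method is the analog of the Perl5 quotemeta() builtin method.
Parameters: expression - The expression to convert.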
public static final String quotemeta(String expression)
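Given a character string, returns a Perl5 expression that interprets each character of the original string literally. In effect, this method is the analog of the Perl5 quotemeta() builtin method.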
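The kind of escaping quotemeta performs can be sketched as follows. This is not ORO's source: it assumes the convention of Perl's quotemeta builtin, which puts a backslash before every character other than ASCII letters, digits, and underscore. The exact set of characters Perl5Compiler.quotemeta escapes may differ, and the class and method names are hypothetical.

```java
// A standalone sketch of Perl-style quoting; assumes every character
// other than [A-Za-z0-9_] is backslashed. Not ORO's implementation.
class QuoteMetaSketch {
    static String quoteMeta(String expression) {
        StringBuilder sb = new StringBuilder(2 * expression.length());
        for (int i = 0; i < expression.length(); i++) {
            char c = expression.charAt(i);
            boolean wordChar = (c >= 'a' && c <= 'z')
                    || (c >= 'A' && c <= 'Z')
                    || (c >= '0' && c <= '9')
                    || c == '_';
            if (!wordChar) {
                sb.append('\\'); // escape anything that could be a metacharacter
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(quoteMeta("1.5*x")); // prints 1\.5\*x
        System.out.println(quoteMeta("abc_1")); // prints abc_1
    }
}
```

A typical use is to quote user input before embedding it in a larger pattern, so that characters such as . and * are matched literally.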
private static boolean __isSimpleRepetitionOp(char ch)
private static boolean __isComplexRepetitionOp(char[] ch, int offset)
private static boolean __parseRepetition(char[] str, int offset)
private static int __parseHex(char[] str, int offset, int maxLength, int[] scanned)
private static int __parseOctal(char[] str, int offset, int maxLength, int[] scanned)
private static void __setModifierFlag(char[] flags, char ch)
private void __emitCode(char code)
private int __emitNode(char operator)
private int __emitArgNode(char operator, char arg)
private void __programInsertOperator(char operator, int operand)
private void __programAddTail(int current, int value)
private void __programAddOperatorTail(int current, int value)
private char __getNextChar()
private int __parseAlternation(int[] retFlags) throws MalformedPatternException
Throws: MalformedPatternException
private int __parseAtom(int[] retFlags) throws MalformedPatternException
Throws: MalformedPatternException
private void __setCharacterClassBits(char[] bits, int offset, char deflt, char ch)
private int __parseCharacterClass() throws MalformedPatternException
Throws: MalformedPatternException
private int __parseBranch(int[] retFlags) throws MalformedPatternException
Throws: MalformedPatternException
private int __parseExpression(boolean isParenthesized, int[] hintFlags) throws MalformedPatternException
Throws: MalformedPatternException
public Pattern compile(char[] pattern, int options) throws MalformedPatternException
Compiles a Perl5 regular expression into a Perl5Pattern instance that can be used by a Perl5Matcher object to perform pattern matching.
Specified by: compile in interface PatternCompiler
Parameters:
pattern - A Perl5 regular expression to compile.
options - A set of flags giving the compiler instructions on
how to treat the regular expression. The flags
are a bitwise OR of any number of the five MASK
constants. For example:
    regex = compiler.compile(pattern, Perl5Compiler.CASE_INSENSITIVE_MASK | Perl5Compiler.MULTILINE_MASK);
This says to compile the pattern so that it treats input as consisting of multiple lines and to perform matches in a case insensitive manner.
Throws: MalformedPatternException
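Because the compile signature declares MalformedPatternException, callers need to handle or propagate it. The sketch below shows the general shape of that handling with java.util.regex as a stand-in, so it runs without the ORO jar. Note the swap: PatternSyntaxException is the JDK's counterpart here and is unchecked, so the try/catch is optional in this analog. The class and method names are hypothetical.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// JDK analog, not ORO: PatternSyntaxException stands in for
// MalformedPatternException.
class CompileErrorAnalog {
    /** Returns the compiled pattern, or null if the regex is malformed. */
    static Pattern tryCompile(String regex, int flags) {
        try {
            return Pattern.compile(regex, flags);
        } catch (PatternSyntaxException e) {
            // Report the problem and let the caller decide what to do.
            System.err.println("Bad pattern: " + e.getDescription());
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(tryCompile("\\w+", 0) != null);
        System.out.println(tryCompile("(unclosed", 0) == null);
    }
}
```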
public Pattern compile(char[] pattern) throws MalformedPatternException
Same as calling compile(pattern, Perl5Compiler.DEFAULT_MASK);
Specified by: compile in interface PatternCompiler
Parameters:
pattern - A regular expression to compile.
Throws: MalformedPatternException
public Pattern compile(String pattern) throws MalformedPatternException
Same as calling compile(pattern, Perl5Compiler.DEFAULT_MASK);
Specified by: compile in interface PatternCompiler
Parameters:
pattern - A regular expression to compile.
Throws: MalformedPatternException
public Pattern compile(String pattern, int options) throws MalformedPatternException
Compiles a Perl5 regular expression into a Perl5Pattern instance that can be used by a Perl5Matcher object to perform pattern matching.
Specified by: compile in interface PatternCompiler
Parameters:
pattern - A Perl5 regular expression to compile.
options - A set of flags giving the compiler instructions on
how to treat the regular expression. The flags
are a bitwise OR of any number of the five MASK
constants. For example:
    regex = compiler.compile("^\\w+\\d+$", Perl5Compiler.CASE_INSENSITIVE_MASK | Perl5Compiler.MULTILINE_MASK);
This says to compile the pattern so that it treats input as consisting of multiple lines and to perform matches in a case insensitive manner.
Throws: MalformedPatternException
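The effect of OR-ing two option flags together can be tried out with the following sketch. It substitutes java.util.regex for ORO, assuming Pattern.CASE_INSENSITIVE and Pattern.MULTILINE behave like their ORO namesakes for this pattern. The pattern "^id\\d+$" and all class and method names are made up for the illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// JDK analog, not ORO: two flags combined with bitwise OR.
class CombinedFlagsAnalog {
    /** Collects every match of regex in input, in order. */
    static List<String> findAll(String regex, int flags, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex, flags).matcher(input);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        int flags = Pattern.CASE_INSENSITIVE | Pattern.MULTILINE;
        String input = "ID7\nid42\nidx";
        // Both flags are needed: case folding for "ID7", line anchors for each line.
        System.out.println(findAll("^id\\d+$", flags, input));
        // With only one of the flags, fewer lines qualify.
        System.out.println(findAll("^id\\d+$", Pattern.MULTILINE, input));
    }
}
```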