<cfscript>
writeOutput(refind("[[:digit:]]","abc 456 ABC 789? Paraguay for $99 airfare!")) // Returns 5
</cfscript>

<cfscript>
writeOutput(refind("\p{Digit}","abc 456 ABC 789? Paraguay for $99 airfare!")) // Returns 5
</cfscript>

| Feature | Java | Perl |
|---|---|---|
| Backslash escapes one metacharacter | YES | YES |
| \Q...\E escapes a string of metacharacters | Java 6 | YES |
| \x00 through \xFF (ASCII character) | YES | YES |
| \n (LF) | YES | YES |
| \f (form feed) and \v (vtab) | YES | YES |
| \a (bell) | YES | YES |
| \e (escape) | YES | YES |
| \b (backspace) and \B (backslash) | no | no |
| \cA through \cZ (control character) | YES | YES |
| \ca through \cz (control character) | no | YES |
| [abc] character class | YES | YES |
| [^abc] negated character class | YES | YES |
| [a-z] character class range | YES | YES |
| Hyphen in [\d-z] is a literal | YES | YES |
| Hyphen in [a-\d] is a literal | no | no |
| Backslash escapes one character class metacharacter | YES | YES |
| \Q...\E escapes a string of character class metacharacters | Java 6 | YES |
| \d shorthand for digits | ascii | YES |
| \w shorthand for word characters | ascii | YES |
| \s shorthand for whitespace | ascii | YES |
| \D | YES | YES |
| [\b] backspace | YES | YES |
| . (dot; any character except line break) | YES | YES |
| ^ (start of string/line) | YES | YES |
| $ (end of string/line) | YES | YES |
| \A (start of string) | YES | YES |
| \Z (end of string, before final line break) | YES | YES |
| \z (end of string) | YES | YES |
| \` (start of string) | no | no |
| \' (end of string) | no | no |
| \b (at the beginning or end of a word) | YES | YES |
| \B (NOT at the beginning or end of a word) | YES | YES |
| \y (at the beginning or end of a word) | no | no |
| \Y (NOT at the beginning or end of a word) | no | no |
| \m (at the beginning of a word) | no | no |
| \M (at the end of a word) | no | no |
| \< (at the beginning of a word) | no | no |
| \> (at the end of a word) | no | no |
| \| (alternation) | YES | YES |
| ? (0 or 1) | YES | YES |
| * (0 or more) | YES | YES |
| + (1 or more) | YES | YES |
| {n} (exactly n) | YES | YES |
| {n,m} (between n and m) | YES | YES |
| {n,} (n or more) | YES | YES |
| ? after any of the above quantifiers to make it "lazy" | YES | YES |
| (regex) (numbered capturing group) | YES | YES |
| (?:regex) (non-capturing group) | YES | YES |
| \1 through \9 (backreferences) | YES | YES |
| \10 through \99 (backreferences) | YES | YES |
| Forward references \1 through \9 | YES | YES |
| Nested references \1 through \9 | YES | YES |
| Backreferences to non-existent groups are an error | YES | YES |
| Backreferences to failed groups also fail | YES | YES |
| (?i) (case insensitive) | YES | YES |
| (?s) (dot matches newlines) | YES | YES |
| (?m) (^ and $ match at line breaks) | YES | YES |
| (?x) (free-spacing mode) | YES | YES |
| (?n) (explicit capture) | no | no |
| (?-ismxn) (turn off mode modifiers) | YES | YES |
| (?ismxn:group) (mode modifiers local to group) | YES | YES |
| (?>regex) (atomic group) | YES | YES |
| ?+, *+, ++ and {m,n}+ (possessive quantifiers) | YES | Perl 5.10 |
| (?=regex) (positive lookahead) | YES | YES |
| (?!regex) (negative lookahead) | YES | YES |
| (?<=text) (positive lookbehind) | finite length | fixed length |
| (?<!text) (negative lookbehind) | finite length | fixed length |
| \G (start of match attempt) | YES | YES |
| (?(?=regex)then\|else) (using any lookaround) | no | YES |
| (?(regex)then\|else) | no | no |
| (?(1)then\|else) | no | YES |
| (?(group)then\|else) | no | no |
| (?#comment) | no | YES |
| Free-spacing syntax supported | YES | YES |
| Character class is a single token | no | YES |
| # starts a comment | YES | YES |
| \X (Unicode grapheme) | no | YES |
| \u0000 through \uFFFF (Unicode character) | YES | no |
| \x{0} through \x{FFFF} (Unicode character) | no | YES |
| \pL through \pC (Unicode properties) | YES | YES |
| \p{L} through \p{C} (Unicode properties) | YES | YES |
| \p{Lu} through \p{Cn} (Unicode property) | YES | YES |
| \p{L&} and \p{Letter&} (equivalent of [\p{Lu}\p{Ll}\p{Lt}] Unicode properties) | no | YES |
| \p{IsL} through \p{IsC} (Unicode properties) | YES | YES |
| \p{IsLu} through \p{IsCn} (Unicode property) | YES | YES |
| \p{Letter} through \p{Other} (Unicode properties) | no | YES |
| \p{Lowercase_Letter} through \p{Not_Assigned} (Unicode property) | no | YES |
| \p{IsLetter} through \p{IsOther} (Unicode properties) | no | YES |
| \p{IsLowercase_Letter} through \p{IsNot_Assigned} (Unicode property) | no | YES |
| \p{Arabic} through \p{Yi} (Unicode script) | no | YES |
| \p{IsArabic} through \p{IsYi} (Unicode script) | no | YES |
| \p{BasicLatin} through \p{Specials} (Unicode block) | no | YES |
| \p{InBasicLatin} through \p{InSpecials} (Unicode block) | YES | YES |
| \p{IsBasicLatin} through \p{IsSpecials} (Unicode block) | no | YES |
| Part between {} in all of the above is case insensitive | no | YES |
| \P (negated variants of all \p as listed above) | YES | YES |
| \p{^...} (negated variants of all \p{...} as listed above) | no | YES |
| (?<name>regex) (.NET-style named capturing group) | no | no |
| (?'name'regex) (.NET-style named capturing group) | no | no |
| \k<name> (.NET-style named backreference) | no | no |
| \k'name' (.NET-style named backreference) | no | no |
| (?P<name>regex) (Python-style named capturing group) | no | no |
| (?P=name) (Python-style named backreference) | no | no |
| multiple capturing groups can have the same name | n/a | n/a |
| \i | no | no |
| [abc-[abc]] character class subtraction | no | no |
| [:alpha:] POSIX character class | no | YES |
| \p{Alpha} POSIX character class | ascii | no |
| \p{IsAlpha} POSIX character class | no | YES |
| [.span-ll.] POSIX collation sequence | no | no |
| [=x=] POSIX character equivalence | no | no |
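
The backreference rows in the table above can be tried directly from CFML. The following is a minimal sketch, assuming the engine behind reFind() and reReplace() accepts \1 both inside the pattern and in the replacement string; the sample sentence is made up for illustration.

```cfml
<cfscript>
sentence = "this is is a test";

// (\w+) captures a word; \1 requires the same word to follow after one space.
// The doubled word "is is" starts at position 6.
writeOutput(reFind("\b(\w+) \1\b", sentence)); // 6

// In the replacement string, \1 inserts the captured word,
// so the doubled word collapses to a single occurrence.
writeOutput(reReplace(sentence, "\b(\w+) \1\b", "\1", "ALL")); // this is a test
</cfscript>
```

The \b anchors keep the pattern from matching the "is" at the end of "this" followed by the next "is".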
<cfoutput>
REReplace("Hello","[T]*","7","ALL") - #REReplace("Hello","[T]*","7","ALL")#<BR>
</cfoutput>

Output: REReplace("Hello","[T]*","7","ALL") - 7H7e7l7l7o7

<cfoutput>
REReplace("Hello World","[T]*W","7","ALL") - #REReplace("Hello World","[T]*W","7","ALL")#<BR>
</cfoutput>

Output: REReplace("Hello World","[T]*W","7","ALL") - Hello 7orld

<!--- The value of IndexOfOccurrence is 6 --->
<!--- The value of IndexOfOccurrence is 5 --->
<!--- The value of IndexOfOccurrence is 26 --->

The special characters are: + * ? . [ ^ $ ( ) { | \

To match a special character literally, escape it with a backslash; for example, "\+" matches a plus sign.

| Special Character | Description |
|---|---|
| \ | A backslash followed by any special character matches the literal character itself; that is, the backslash escapes the special character. For example, "\+" matches the plus sign, and "\\" matches a backslash. |
| . | A period matches any character, including newline. To match any character except a newline, use [^#chr(13)##chr(10)#], which excludes the ASCII carriage return and line feed codes. The corresponding escape codes are \r and \n. |
| [ ] | A one-character character set that matches any of the characters in that set. For example, "[akm]" matches an "a", "k", or "m". A hyphen in a character set indicates a range of characters; for example, [a-z] matches any single lowercase letter. If the first character of a character set is the caret (^), the regular expression matches any character except those in the set. It does not match the empty string. For example, [^akm] matches any character except "a", "k", or "m". The caret loses its special meaning if it is not the first character of the set. |
| ^ | If the caret is at the beginning of a regular expression, the matched string must be at the beginning of the string being searched. For example, the regular expression "^ColdFusion" matches the string "ColdFusion lets you use regular expressions" but not the string "In ColdFusion, you can use regular expressions." In a character set (that is, within square brackets), a leading caret negates the characters that follow: [^A] matches any character that is not an uppercase A. |
| $ | If the dollar sign is at the end of a regular expression, the matched string must be at the end of the string being searched. For example, the regular expression "ColdFusion$" matches the string "I like ColdFusion" but not the string "ColdFusion is fun." |
| ? | A character set or subexpression followed by a question mark matches zero or one occurrence of the character set or subexpression. For example, xy?z matches either "xyz" or "xz". |
| \| | The OR character allows a choice between two regular expressions. For example, jell(y\|ies) matches either "jelly" or "jellies". |
| + | A character set or subexpression followed by a plus sign matches one or more occurrences of the character set or subexpression. For example, [a-z]+ matches one or more lowercase characters. |
| * | A character set or subexpression followed by an asterisk matches zero or more occurrences of the character set or subexpression. For example, [a-z]* matches zero or more lowercase characters. |
| () | Parentheses group parts of a regular expression into subexpressions that you can treat as a single unit. For example, (ha)+ matches one or more instances of "ha". |
| (?x) | If at the beginning of a regular expression, it specifies to ignore whitespace in the regular expression and lets you use ## for end-of-line comments. You can match a space by escaping it with a backslash. For example, the following regular expression includes comments, preceded by ##, that are ignored by ColdFusion:<br>reFind("(?x)<br>one ##first option<br>\|two ##second option<br>\|three\ point\ five ## note escaped spaces<br>", "three point five") |
| (?m) | If at the beginning of a regular expression, it specifies the multiline mode for the special characters ^ and $. When used with ^, the matched string can be at the start of the entire search string or at the start of new lines, denoted by a linefeed character, chr(10), within the search string. For $, the matched string can be at the end of the search string or at the end of new lines. Multiline mode does not recognize a carriage return, chr(13), as a newline character. The following example searches for the string "two" across multiple lines: #reFind("(?m)^two", "one#chr(10)#two")# This example returns 5 to indicate that it matched "two" immediately after the chr(10) linefeed, which is the fourth character. Without (?m), the regular expression would not match anything, because ^ only matches the start of the string. The (?m) modifier does not affect \A or \Z, which always match the start or end of the string, respectively. For information on \A and \Z, see Using escape sequences. |
| (?i) | If at the beginning of a regular expression for REFind(), it specifies to perform a case-insensitive compare. For example, the following line would return an index of 1: #reFind("(?i)hi", "HI")# If you omit the (?i), the line would return an index of zero to signify that it did not find the regular expression. |
| (?=...) | If at the beginning of a regular expression, it specifies to use positive lookahead when searching for the regular expression. If you prefix a subexpression with this, ColdFusion uses positive lookahead for that subexpression. Positive lookahead tests for the parenthesized subexpression like regular parentheses, but does not include the contents in the match; it merely tests whether the subexpression is present next to the rest of the expression. For example, consider the expression to extract the protocol from a URL: <cfset regex = "http(?=://)"> <cfset string = "http://"> <cfset result = reFind(regex, string, 1, "yes")> mid(string, result.pos[1], result.len[1]) This example results in the string "http". The lookahead parentheses ensure that the "://" is there, but do not include it in the result. If you did not use lookahead, the result would include the extraneous "://". Lookahead parentheses do not capture text, so backreference numbering will skip over these groups. For more information on backreferencing, see Using backreferences. |
| (?!...) | If at the beginning of a regular expression, it specifies to use negative lookahead. Negative lookahead is just like positive lookahead, as specified by (?=...), except that it tests for the absence of a match. Lookahead parentheses do not capture text, so backreference numbering will skip over these groups. For more information on backreferencing, see Using backreferences. |
| (?:...) | If you prefix a subexpression with "?:", ColdFusion performs all operations on the subexpression except that it will not capture the corresponding text for use with a back reference. |
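
The modifier and lookahead rows above are easier to read as code that is not squeezed into a table cell. The sketch below reuses the lookahead and free-spacing examples from the table; the variable names and the "Cold storage" sample string are made up for illustration, and the free-spacing pattern is assembled with chr(10) so that each ## comment ends at a line break.

```cfml
<cfscript>
// Positive lookahead: match "http" only when "://" follows, without consuming it.
address = "http://";
result = reFind("http(?=://)", address, 1, true);
// pos[1] and len[1] describe the whole match.
writeOutput(mid(address, result.pos[1], result.len[1])); // http

// Negative lookahead: skip the "Cold" that is followed by "Fusion".
writeOutput(reFind("Cold(?!Fusion)", "ColdFusion and Cold storage")); // 16

// Free-spacing mode: whitespace is ignored and ## starts a comment
// that runs to the end of the line (## is a literal # in a CFML string).
pattern = "(?x) one ##first option" & chr(10)
        & "|two ##second option" & chr(10)
        & "|three\ point\ five ##note escaped spaces" & chr(10);
// Nonzero when one of the three options is found.
writeOutput(reFind(pattern, "three point five"));
</cfscript>
```

Passing true as the fourth argument to reFind() makes it return a structure with pos and len arrays instead of a single position.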

| Escape Sequence | Description |
|---|---|
| \b | Specifies a boundary defined by a transition from an alphanumeric character to a nonalphanumeric character, or from a nonalphanumeric character to an alphanumeric character. For example, the string " Big" contains a boundary defined by the space (nonalphanumeric character) and the "B" (alphanumeric character). The following example uses the \b escape sequence in a regular expression to locate the string "Big" at the end of the search string and not the fragment "big" inside the word "ambiguous": reFindNoCase("\bBig\b", "Don't be ambiguous about Big.") <!--- The value of IndexOfOccurrence is 26 ---> When used inside a character set (for example [\b]), it specifies a backspace. |
| \B | Specifies a boundary defined by no transition of character type. For example, two alphanumeric characters in a row or two nonalphanumeric characters in a row; opposite of \b. |
| \A | Specifies a beginning of string anchor, much like the ^ special character. However, unlike ^, you cannot combine \A with (?m) to specify the start of newlines in the search string. |
| \Z | Specifies an end of string anchor, much like the $ special character. However, unlike $, you cannot combine \Z with (?m) to specify the end of newlines in the search string. |
| \n | Newline character |
| \r | Carriage return |
| \t | Tab |
| \f | Form feed |
| \d | Any digit, similar to [0-9] |
| \D | Any nondigit character, similar to [^0-9] |
| \w | Any alphanumeric character, or the underscore (_), similar to [[:word:]] |
| \W | Any nonalphanumeric character, except the underscore, similar to [^[:word:]] |
| \s | Any whitespace character including tab, space, newline, carriage return, and form feed. Similar to [ \t\n\r\f]. |
| \S | Any nonwhitespace character, similar to [^ \t\n\r\f] |
| \xdd | A hexadecimal representation of a character, where d is a hexadecimal digit |
| \ddd | An octal representation of a character, where d is an octal digit, in the form \000 to \377 |
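
A few of the escape sequences above in use. This is a minimal sketch; the first sample string is borrowed from the reFind() example at the top of the page and shortened, and the last call repeats the \b example from the table.

```cfml
<cfscript>
sample = "abc 456 ABC 789";

// \d matches a digit; the first one is the "4" at position 5.
writeOutput(reFind("\d", sample)); // 5

// \w+ matches a run of word characters, so reMatch() returns the four "words".
writeOutput(arrayToList(reMatch("\w+", sample))); // abc,456,ABC,789

// \b anchors the match at word boundaries, so "big" inside "ambiguous" is skipped.
writeOutput(reFindNoCase("\bBig\b", "Don't be ambiguous about Big.")); // 26
</cfscript>
```

Backslashes need no doubling here because CFML string literals do not treat the backslash as an escape character.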

REReplace("Adobe Web Site","[[:space:]]","*","ALL")

Output: Adobe*Web*Site

"Some BIG string")>
<!--- The value of IndexOfOccurrence is 5 --->

| Character class | Matches |
|---|---|
| :alpha: | Any alphabetic character. |
| :upper: | Any uppercase alphabetic character. |
| :lower: | Any lowercase alphabetic character. |
| :digit: | Any digit. Same as \d. |
| :alnum: | Any alphabetic or numeric character. |
| :xdigit: | Any hexadecimal digit. Same as [0-9A-Fa-f]. |
| :blank: | Space or a tab. |
| :space: | Any whitespace character. Same as \s. |
| :print: | Any alphanumeric, punctuation, or space character. |
| :punct: | Any punctuation character. |
| :graph: | Any alphanumeric or punctuation character. |
| :cntrl: | Any character not part of the character classes [:upper:], [:lower:], [:alpha:], [:digit:], [:punct:], [:graph:], [:print:], or [:xdigit:]. |
| :word: | Any alphabetic or numeric character, plus the underscore (_). Same as \w. |
| :ascii: | The ASCII characters, in the hexadecimal range 0 - 7F. |
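
The character classes above are written inside a character set, which is why they appear with doubled brackets such as [[:space:]]. The following sketch repeats the REReplace example from this section and adds two made-up strings to show a class used with a quantifier and two classes combined in one negated set.

```cfml
<cfscript>
// From the example above: replace each whitespace character with an asterisk.
writeOutput(reReplace("Adobe Web Site", "[[:space:]]", "*", "ALL")); // Adobe*Web*Site

// A class followed by + matches a run; the first uppercase run starts at position 6.
writeOutput(reFind("[[:upper:]]+", "some BIG string")); // 6

// Several classes can share one set; here everything that is neither
// alphanumeric nor whitespace is removed.
writeOutput(reReplace("Order: A-42!", "[^[:alnum:][:space:]]", "", "ALL")); // Order A42
</cfscript>
```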

| Situation | Use regex? | Alternative |
|---|---|---|
| Exact substring search (e.g., "find 'error' in log") | No | find(), findNoCase(), contains() |
| Simple prefix/suffix (starts with, ends with) | No | left(), right(), startsWith(), endsWith() |
| Fixed-format parsing (e.g., comma-separated values) | No | listToArray(), listFirst(), listRest() |
| Pattern matching (email, phone, URL, custom formats) | Yes | — |
| Extract multiple matches (all URLs, all IDs from text) | Yes | — |
| Replace by pattern (normalize whitespace, strip non-digits) | Yes | — |
| Complex validation (password rules, format rules) | Yes | isValid() for built-in types (email, etc.) |
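
The rows of the table above can be sketched side by side as follows. All sample strings and variable names are made up for illustration, and only widely available built-in functions are used.

```cfml
<cfscript>
logLine = "2024-01-15 error: disk full";

// Exact substring search: findNoCase() is enough, no regex needed.
writeOutput(findNoCase("error", logLine)); // 12

// Fixed-format parsing: list functions split on the delimiter.
fields = listToArray("red,green,blue");
writeOutput(arrayLen(fields)); // 3

// Replace by pattern: normalize runs of whitespace to a single space.
writeOutput(reReplace("too    many   spaces", "\s+", " ", "ALL")); // too many spaces

// Replace by pattern: strip everything that is not a digit.
writeOutput(reReplace("(555) 010-9999", "[^0-9]", "", "ALL")); // 5550109999

// Extract multiple matches: reMatch() returns an array of every match.
ids = reMatch("[0-9]+", "ids: 12, 345 and 6");
writeOutput(arrayToList(ids)); // 12,345,6
</cfscript>
```

The plain string and list functions state the intent more directly and avoid escaping concerns, which is why the table reserves regular expressions for cases where the text is described by a pattern and not by a fixed value.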