Tinderbox regular expressions follow the norms of PCRE (Perl Compatible Regular Expressions) usage. When consulting general references on regular expressions, be sure to look for PCRE if there is a choice of 'flavours' of usage being described.
Note that an input that takes a regular expression can take any of:
- a literal string, e.g. "Hello"
- an entirely regular expression encoded string, e.g. "\d{2,3}"
- a mix of the two, e.g. "Hello.+".
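The three kinds of input above can be tried outside Tinderbox. The sketch below uses Python's `re` module purely as a stand-in: it is not the engine Tinderbox uses, and the helper `matches` and the sample text are illustrative assumptions, but these simple patterns should behave the same way in both.

```python
import re

# Three kinds of pattern string, mirroring the list above.
literal_pattern = "Hello"      # plain text, matched as-is
regex_pattern = r"\d{2,3}"     # purely regex syntax: two or three digits
mixed_pattern = "Hello.+"      # literal text followed by regex syntax


def matches(pattern: str, text: str) -> bool:
    """Return True if the pattern is found anywhere in the text."""
    return re.search(pattern, text) is not None


sample = "Hello world, room 101"
literal_hit = matches(literal_pattern, sample)
regex_hit = matches(regex_pattern, sample)
mixed_hit = matches(mixed_pattern, sample)

# ".+" requires at least one character after "Hello",
# so the bare word alone does not satisfy the mixed pattern.
bare_hit = matches(mixed_pattern, "Hello")
```

The last check shows why it matters that a 'literal' string is still read as a pattern: any regex metacharacters in it keep their special meaning.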
The operators .contains() and .containsAnyOf() use Apple's regular expression engine.
In action codes (or other operators) whose input arguments are described as regular expression patterns (i.e. 'patterns'), the \uNNNN form can be used to test for a particular character, where NNNN is the character's 4-digit hexadecimal Unicode code point. This is especially useful for characters that cannot easily be typed. Thus a non-breaking space, which looks like a normal space when querying text, is encoded as \u00A0, whilst a normal space is \u0020:
$Text = $Text.replace("\u00A0","\u0020");
…replaces every instance of a non-breaking space with a normal space character.
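The same substitution can be checked outside Tinderbox. This is a minimal sketch using Python's `re` module as a stand-in for Tinderbox's engine; the function name `normalise_spaces` and the sample text are hypothetical. Unlike the Tinderbox example, Python's `re.sub` does not accept \uNNNN in the replacement argument, so the replacement here is a plain space.

```python
import re


def normalise_spaces(text: str) -> str:
    """Replace every non-breaking space (U+00A0) with a normal space."""
    # The raw string passes \u00A0 to the regex engine as an escape,
    # so the pattern matches the non-breaking space character itself.
    return re.sub(r"\u00A0", " ", text)


# A Python string literal containing a real non-breaking space.
sample = "10\u00A0km"
cleaned = normalise_spaces(sample)
```

The two strings look identical when printed, which is exactly why the code point form is the safer way to write the pattern.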
An alternative, if the character is in the ASCII range, is to use the character's two-digit hexadecimal ASCII code in the form \xNN. Thus:
$Text = $Text.replace("XXX","\x22");
…replaces every instance of 'XXX' with a straight double quote character.
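The digits in \xNN are the value from the 'Hex' column of an ASCII table, which is easy to confuse with the decimal value. The sketch below uses Python (a stand-in, not the Tinderbox engine; all variable names are illustrative) to show the difference and to confirm what \x22 and \x27 match when used in a pattern.

```python
import re

# Hexadecimal 22 is the straight double quote...
double_quote = chr(0x22)
# ...whereas decimal 22 is an unrelated control character.
not_a_quote = chr(22)

# Used as patterns, \x22 and \x27 match the straight double
# and straight single quote respectively.
quotes_found = re.findall(r"\x22", 'say "hi"')
apostrophes_found = re.findall(r"\x27", "it's")
```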
Some reference links:
- Syntax for regular expressions using the Apple Regular Expression engine.
- Wikipedia ASCII codes. IMPORTANT: use the values in the 'Hex' column. Thus a straight double quote is 22, a straight single quote is 27.
- The ICU Regular Expression library documentation. This library is used internally in Tinderbox and replaces the previous Boost library (see below). ICU Regular Expressions conform to version 19 of the Unicode Technical Standard #18, Unicode Regular Expressions, level 1, and in addition include Default Word boundaries and Name Properties from level 2.
Historical note re pre-v6 Tinderbox
Originally, regular expressions in Tinderbox used Perl language conventions as further defined in documentation for the Boost regex code library: https://www.boost.org/doc/libs/1_34_1/libs/regex/doc/syntax_perl.html.
Tinderbox stopped using the older Boost regex library from v10.0.2.
See also—notes linking to here:
- ^inboundBasicLinks( [start, list-item-prefix, list-item-suffix, end, type] )^
- Export Code Arguments
- links[(scope)].directionStr.[linkTypeRegex].attributeNameRefStr
- Text line endings
- Macros
- ^outboundTextLinks( [start, list-item-prefix, list-item-suffix, end, type] )^
- ^outboundWebLinks( [start, list-item-prefix, list-item-suffix, end, type] )^
- String.split(regexStr)
- ^inboundTextLinks( [start, list-item-prefix, list-item-suffix, end, type] )^
- list.at(itemNum)
- String.icontains(regexStr)
- String.contains(regexStr)
- Quoting Regular Expressions
- Using regular expression back-references
- ^outboundBasicLinks( [start, list-item-prefix, list-item-suffix, end, type] )^
- Controlling Agent Update Cycle Time
- ^if( condition )^
- list.replace(regexMatchStr, replacementStr)
- String.replace(regexMatchStr, replacementStr)
- runCommand(commandStr[, inputsStr, dirStr])
- Regular Expressions in queries
- Exploding Notes
- Single and double quotes