Knowledge Base

Created 01 Aug 2002
Modified 15 Dec 2010

Article 128

Regular Expressions (regexp)

Purpose

Normally, when one searches for a sub-string in a string, there has to be an exact match. If one searches for the sub-string "abc", the string being searched has to contain these exact letters in the same sequence for a match to be found. One can extend the search to be case insensitive, so that the sub-string "abc" will also find strings like "Abc", "ABC" etc. That is, the case is ignored but the sequence of the letters has to be exactly the same. However, a case insensitive search is sometimes not enough. For example, to find any numeric digit, one ends up searching for each of the ten digits separately. This is where regular expressions help.

Regular expressions are text patterns used for string matching. They contain a mix of plain text and special characters that indicate what kind of matching to do. Here is a very brief tutorial on using regular expressions.
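The digit search mentioned above can be sketched in a few lines. This example uses Python's re module purely for illustration; it is not SmartFTP's regexp engine, and the function names are made up for this sketch.

```python
import re

def contains_digit_plain(text: str) -> bool:
    # Without regular expressions: test each of the ten digits separately.
    return any(digit in text for digit in "0123456789")

def contains_digit_regex(text: str) -> bool:
    # With a regular expression: one pattern covers all ten digits.
    return re.search(r"[0-9]", text) is not None

for sample in ["file7.txt", "readme.txt"]:
    print(sample, contains_digit_plain(sample), contains_digit_regex(sample))
```

Both functions give the same answer; the regular expression simply states the intent ("any digit") in a single pattern.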

Special Characters

Character Description

^

Beginning of the string. The expression "^A" will match an "A" only at the beginning of the string.

^

The caret (^) immediately following the left-bracket ([) has a different meaning. It negates the set: the target character must not be any of the characters listed within the brackets. The expression "[^0-9]" indicates that the target character should not be a digit.

$

The dollar sign ($) will match the end of the string. The expression "abc$" will match the sub-string "abc" only if it is at the end of the string.

|

The alternation character (|) allows the expression on either side of it to match the target string. The expression "a|b" will match "a" as well as "b".

.

The dot (.) will match any single character.

*

The asterisk (*) indicates that the character to the left of the asterisk in the expression should match 0 or more times.

+

The plus (+) is similar to the asterisk, but there must be at least one match of the character to the left of the + sign in the expression.

?

The question mark (?) matches the character to its left 0 or 1 times.

()

Parentheses affect the order of pattern evaluation and also serve as a tagged expression that can be used when replacing the matched sub-string with another expression.

[]

Brackets ([ and ]) enclosing a set of characters indicate that any of the enclosed characters may match the target character.
Example: "[0-9]". The dash (-) between 0 and 9 indicates a range from 0 to 9.

\char

To search for a special character literally, put a backslash before the special character. For example, the single character regular expression "\*" matches a single asterisk.
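The special characters above can be tried out with a short script. This is a minimal sketch using Python's re module, which is an assumption made for illustration: SmartFTP's own engine may differ in details (for instance, Python's dot does not match a newline by default, and the \1 replacement syntax is Python's). The helper name found and the sample strings are hypothetical.

```python
import re

def found(pattern: str, text: str) -> bool:
    """Return True if pattern matches somewhere in text."""
    return re.search(pattern, text) is not None

# (pattern, text, expected result), one or two cases per special character.
CASES = [
    (r"^A", "Apple", True),       # ^ : beginning of the string
    (r"^A", "BAT", False),
    (r"[^0-9]", "123", False),    # [^...] : none of the listed characters
    (r"[^0-9]", "12a", True),
    (r"abc$", "xyzabc", True),    # $ : end of the string
    (r"abc$", "abcxyz", False),
    (r"a|b", "b", True),          # | : either side may match
    (r"a|b", "c", False),
    (r"a.c", "axc", True),        # . : any single character
    (r"a.c", "ac", False),
    (r"ab*c", "ac", True),        # * : 0 or more of the character to the left
    (r"ab+c", "ac", False),       # + : at least one
    (r"ab+c", "abbc", True),
    (r"ab?c", "abc", True),       # ? : 0 or 1
    (r"ab?c", "abbc", False),
    (r"\*", "2*3", True),         # \* : a literal asterisk
    (r"\*", "23", False),
]

for pattern, text, expected in CASES:
    assert found(pattern, text) == expected, (pattern, text)

# Parentheses tag the part of the match to reuse in a replacement.
# Python refers to the first tagged expression as \1.
renamed = re.sub(r"^(.*)\.htm$", r"\1.html", "index.htm")
print(renamed)  # index.html
```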

Examples

Regexp        Simple Regexp (DOS) equivalent   Description
                                               matches any string
.*\.html      *.html*                          matches any string that contains .html
.*\.html$     *.html                           matches any string with .html at the end
.*\.html?$    *.html and *.htm                 matches any string with .htm or .html at the end
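The three patterns in the table can be checked against a few sample file names. The sketch below again uses Python's re module as a stand-in, not SmartFTP's engine; the file names and the helper matching_names are hypothetical. It uses re.match, which starts matching at the beginning of the name, on the assumption that this is the behaviour the leading ".*" in the table is written for.

```python
import re

PATTERNS = {
    "contains .html": r".*\.html",
    "ends with .html": r".*\.html$",
    "ends with .htm or .html": r".*\.html?$",
}

NAMES = ["index.html", "index.htm", "index.html.bak", "notes.txt"]

def matching_names(pattern: str, names: list[str]) -> list[str]:
    # re.match anchors at the start of each name; the leading ".*"
    # lets the rest of the pattern match further into the name.
    regex = re.compile(pattern)
    return [name for name in names if regex.match(name)]

for description, pattern in PATTERNS.items():
    print(f"{description:25} {pattern:12} {matching_names(pattern, NAMES)}")
```

Only the first pattern accepts index.html.bak, because the other two require the name to end right after the extension.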

Important Notes

In SmartFTP's regexp implementation, all characters are matched case-sensitively.
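Because matching is case sensitive, a pattern such as ".*\.html$" will not match "INDEX.HTML". One possible workaround, sketched here under the assumption that bracket sets behave as described above, is to spell out both cases of every letter. The helper case_insensitive_class is hypothetical, and Python's re module is used only to demonstrate the resulting pattern.

```python
import re

def case_insensitive_class(literal: str) -> str:
    """Expand each letter into a two-character set, e.g. 'h' -> '[hH]'."""
    parts = []
    for ch in literal:
        if ch.isalpha():
            parts.append(f"[{ch.lower()}{ch.upper()}]")
        else:
            # Escape anything else so special characters are taken literally.
            parts.append(re.escape(ch))
    return "".join(parts)

pattern = r".*" + case_insensitive_class(".html") + "$"
print(pattern)

for name in ["index.html", "INDEX.HTML", "Index.Html", "index.htm"]:
    print(name, re.match(pattern, name) is not None)
```

The generated pattern is longer but relies only on brackets and the backslash escape, both of which are listed in the table of special characters.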

Keywords
regular expressions, reg, regular, expressions