RE Methods and Properties

Now that you know how to write your very own patterns, what do you do with them? Well, we've already discovered that a RegExp object is created automatically when you assign a pattern to a variable. Now it's time to meet the methods that do our dirty work. Two are attached to the RegExp object and four to the String object, as follows.

str.search()

The String object's search() method is the simplest of all the operations we cover here.

var str="96521234";
var reg= new RegExp("965");
var reg2= /123/;
var index=str.search(reg);              // index = 0
var index2=str.search(reg2);             // index2 = 4

As you can guess from our example, search() simply looks through the string it is called on and returns the position of the first matching character sequence, which we store in index. Note that search() ignores the global switch 'g' in a RE literal, and that the index value begins at 0, as line 4 demonstrates. If a match is not found, search() returns -1.
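
The three behaviors just described (a zero-based position, the ignored 'g' switch, and -1 on failure) can be seen side by side in a short sketch. The string and variable names here are made up for illustration and do not come from the example above.

```javascript
// Hypothetical sample data, chosen only to illustrate search().
var phone = "Call 555-0199 or 555-0142";

// Position of the first match, counting from 0.
var firstNumber = phone.search(/\d{3}-\d{4}/);     // 5

// The 'g' switch makes no difference: search() still reports the first match.
var withGlobal = phone.search(/\d{3}-\d{4}/g);     // 5

// No match at all gives -1.
var missing = phone.search(/[xyz]/);               // -1
```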

str.split()

The split() method has been in the language since JavaScript 1.1, but with version 1.2 came support for it to take a regular expression argument.

var large_number = "212,0,456,0,67889";
var reg3 = /,\d,/;
var numberList = large_number.split(reg3);
                                        // numberList = ["212","456","67889"]

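The purpose of split() is to take a string and return an array of string elements. Each element is a piece of the original string, and the pieces are the parts that lay between the character sequences matching the pattern passed to split() as its argument. The matching characters themselves are discarded.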
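
A pattern is most useful as a separator when the delimiter is not always written the same way. The following is a minimal sketch, assuming a made-up list in which semicolons are surrounded by varying amounts of white space; the names csv and fruits are hypothetical.

```javascript
// Hypothetical input: items separated by ';' with irregular spacing.
var csv = "apples ; pears;plums  ;  figs";

// The separator is a semicolon plus any white space on either side,
// so the surrounding spaces are thrown away together with the ';'.
var fruits = csv.split(/\s*;\s*/);
// fruits = ["apples", "pears", "plums", "figs"]
```

A plain string separator such as ";" would have left the stray spaces attached to the elements.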

str.replace()

As you would imagine, the replace() method returns a brand new string that contains a copy of the original string with any matching part of it replaced accordingly. The original string is left unchanged. For example:

var reg4 = /happy/gi;
var str4 = "I'm happy. You're Happy";
var newstr4=str4.replace(reg4, "sad");   // newstr4 = "I'm sad. You're sad"

Notice the use of the global and case-insensitive switches. Without them newstr4 would be assigned "I'm sad. You're Happy".

replace() is actually a lot more useful than it would appear from this simple example. Recall from part two of our syntax discussion the use of \1, \2, etc. to refer to the match for a parenthesized sub-expression. The RegExp object has similar properties called $1, $2, etc. up to $9, equivalent to \1 and so on, which can be fed back into replace(). For example:

var reg5 = /(I am)(\s\w*.\s)(You are)(\s\w*)/;
var str5 = "I am working. You are asleep";
var newstr5 = str5.replace(reg5, "$3$2$1$4");
                                    // newstr5 = "You are working. I am asleep"

As you can guess, $1 and $3 correspond to "I am" and "You are" respectively and are swapped accordingly by replace().
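
The same reordering trick works for any text with a regular shape. As a sketch under made-up data (the date and the name list below are illustrative only), the first call rearranges the three parts of a date and the second uses the 'g' switch to swap every surname and first name in a list.

```javascript
// Hypothetical data: a year-month-day date string.
var isoDate = "1999-12-31";

// $1 = year, $2 = month, $3 = day; rebuild the string as month/day/year.
var usDate = isoDate.replace(/(\d{4})-(\d{2})-(\d{2})/, "$2/$3/$1");
// usDate = "12/31/1999"

// With 'g', the $1 and $2 references are filled in afresh for each match.
var names = "Smith, John; Doe, Jane";
var swapped = names.replace(/(\w+), (\w+)/g, "$2 $1");
// swapped = "John Smith; Jane Doe"
```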

str.match()

Our last string method, match(), is similar to replace() except that instead of returning a new string, it returns an array of matches. If the regular expression has the global switch set, the array holds every match found in the string.

If the regular expression does not contain the global switch, the first element of the array holds the match for the complete expression, while subsequent elements hold $1, $2, etc.

var reg6 = /(I am)(\s\w*.\s)(You are)(\s\w*)/;
var str6 = "I am working. You are asleep";
var newarr6 = str6.match(reg6);

So in this example, newarr6[0] holds "I am working. You are asleep", newarr6[1] holds "I am", newarr6[2] holds " working. " and so on.
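
The difference the global switch makes to match() is easiest to see with the same pattern used both ways. This is a small sketch with a made-up string; the variable names are hypothetical.

```javascript
// Hypothetical sample text.
var text = "cat bat rat";

// Without 'g': the whole match first, then the parenthesized sub-matches.
var first = text.match(/(\w)at/);
// first[0] = "cat", first[1] = "c", first.index = 0

// With 'g': every whole match, and no sub-matches.
var all = text.match(/(\w)at/g);
// all = ["cat", "bat", "rat"]

// When nothing matches, match() returns null rather than an empty array.
var none = text.match(/dog/);
// none = null
```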

re.test()

The first of the methods attached to the RegExp object is very similar to the search() method we looked at earlier. It simply returns a boolean value, true or false, depending on whether or not a pattern can be matched to a sequence of characters in the given string. For example:

var string="96521234";
var reg= /965/;
var isin = reg.test(string);            // isin = true

If the pattern has the global flag set, test() sets the lastIndex property of the RegExp object (see below) after a successful match and continues the search from that point in the string when called again. When a call fails to find a match, lastIndex is reset to 0. If the pattern does not have the flag set, every search starts from the beginning of the string.
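
Because a global pattern remembers where it stopped, calling test() repeatedly on the same string can give different answers. The following sketch uses a made-up pattern and string to trace lastIndex across three calls.

```javascript
// Hypothetical pattern and string.
var digits = /\d/g;
var sample = "a1b2";

var t1 = digits.test(sample);   // true: matched "1" at position 1
var i1 = digits.lastIndex;      // 2

var t2 = digits.test(sample);   // true: matched "2" at position 3
var i2 = digits.lastIndex;      // 4

var t3 = digits.test(sample);   // false: nothing left after position 4
var i3 = digits.lastIndex;      // 0, ready to start over
```

If all you want is a yes or no answer, leaving off the 'g' switch avoids this surprise.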

re.exec()

The last of our RE-utilizing methods is exec(), which acts in a similar fashion to match() when the global switch is not used. It also has useful side-effects. Let's start with an example.

var string="965212234";
var reg= /(\d{2}2)/g;                   //Look for two digits followed by a 2
var results = reg.exec(string);         // = ["652", "652"]   Call 1
var results = reg.exec(string);         // = ["122", "122"]   Call 2

exec() does a little more than you may first have guessed. In fact, it populates the static properties of the RegExp object, updates the properties of the reg object, and adds details to the returned array too. With respect to the global flag, exec() behaves the same as test(). Should it not find a match, exec() returns null instead of an array.

Given the code above, the following tables show the values that each call to reg.exec(string) assigns to the various properties.

Array Properties

Accessed as      After Call 1   After Call 2   Notes
results.index    1              4              Character position at which match occurred
results.input    965212234      965212234      The target string

Pattern Properties

Accessed as      After Call 1   After Call 2   Notes
reg.lastIndex    4              7              The index from which to begin the next search
reg.ignoreCase   false          false          Has the 'i' switch been used?
reg.global       true           true           Has the 'g' switch been used?
reg.source       (\d{2}2)       (\d{2}2)       The pattern being matched

Static Regular Expression Properties

Always referred to as properties of the generic RegExp object

Accessed as           After Call 1   After Call 2   Notes
RegExp.lastMatch      652            122            Last character sequence to match the pattern
RegExp.leftContext    9              9652           Characters to the left of the matching sequence
RegExp.rightContext   12234          34             Characters to the right of the matching sequence
RegExp.$1             652            122            See previous explanation
RegExp.lastParen      652            122            The last substring match to a parenthesized subexpression

As you can see, exec() does a lot of behind the scenes work, especially if you're running Navigator.
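
Since a global pattern picks up from lastIndex on each call, exec() is commonly called in a loop until it returns null. The sketch below reuses the string and pattern from the example above; the found array and the "match@position" format are made up for illustration.

```javascript
var string = "965212234";
var reg = /(\d{2}2)/g;          // two digits followed by a 2

var found = [];                 // hypothetical collector for the results
var result;

// Each pass resumes at reg.lastIndex; null ends the loop.
while ((result = reg.exec(string)) !== null) {
    found.push(result[0] + "@" + result.index);
}
// found = ["652@1", "122@4"]
// After the failed final call, reg.lastIndex is back at 0.
```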

Limitations of exec() in Internet Explorer

The JScript version of exec() is quite limited in comparison to that in JavaScript in two ways:

RegExp.index, RegExp.input: Equivalent to results.index and results.input as above.

RegExp.lastIndex, reg.source: Equivalent to reg.lastIndex and reg.source as above.

RegExp.$1: Equivalent to RegExp.$1 as above.

And so, with all that information behind us, we conclude our tour of regular expression support in JavaScript.

Summary

Beyond basic objects and syntax in JavaScript lurk some powerful and flexible features. Although the language easily and simply supports basic and trivial scripts, the scriptwriter can expand his horizons well past that if required. Support for Object-Oriented programming, including exceptions and inheritance, lets the JavaScript language stand tall with the best. Self-evaluation of data as code, and vice versa, means that JavaScript is even more amenable to advanced mobile code architectures than popular languages such as C++ and Java.

With the 'ECMAScript 2' standard still under development in late 1999, it is certain that these features will be standardized once the new version of the language is launched. Nevertheless, prototype and scope chains are fairly fundamental to JavaScript and can be relied upon almost everywhere.

Regardless of these advanced features, the basic language is still almost useless taken purely by itself. A host environment is needed in order for the interpreter to have an application domain to work with. Such a domain (e.g. web browsers) can be so interesting in its own right that high-end features of the language are entirely forgotten in the excitement. When that domain has been pushed beyond its obvious uses, advanced JavaScripters turn back to the more subtle language features for inspiration, innovation and solutions. In the meantime, the rest of this book considers those host applications – browsers, servers, shells, custom applications and others – and JavaScript's general role in extending their utility.