BEGINNING PERL PART 4 - WORKING WITH REGEXPS


DESCRIPTION

Now that we've matched a string, what do we do with it? Well, sometimes it's just useful to know whether a string contains a given pattern or not. However, a lot of the time we're going to be doing search-and-replace operations on text. We'll explain how to do that here. We'll also cover some of the more advanced areas of dealing with regular expressions.


This free tutorial is a sample from the book Beginning Perl.


Substitution

Now that we know all about matching text, substitution is very easy. Why? Because all of the clever things are in the 'search' part, rather than the 'replace': all the character classes, quantifiers and so on only make sense when matching. You can't substitute, say, a word with any number of digits. So, all we need to do is take the 'old' text, our match, and tell perl what we want to replace it with. This we do with the s/// operator.

The s is for 'substitute'. Between the first two slashes, we put our regular expression as before. Between the second slash and the final one, we put our replacement text. Just as with matching, we can use the =~ operator to apply it to a certain string. If this is not given, it applies to the default variable $_ :

#!/usr/bin/perl
# subst1.plx
use warnings;
use strict;
$_ = "Awake! Awake! Fear, Fire, Foes! Awake! Fire, Foes! Awake!";
# Tolkien, Lord of the Rings
s/Foes/Flee/;
print $_,"\n"; 

>perl subst1.plx
Awake! Awake! Fear, Fire, Flee! Awake! Fire, Foes! Awake!
>
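
The listing above relies on the default variable. As a minimal sketch of the other case, here is s/// bound to a named variable with =~ . The variable names and strings are made up for illustration and are not part of the book's listing:

```perl
#!/usr/bin/perl
# Illustrative sketch: binding s/// to a named variable with =~
use warnings;
use strict;

my $warning = "Fear, Fire, Foes!";   # hypothetical string to edit
my $other   = "Foes are near";       # never bound to s///, so left alone

$warning =~ s/Foes/Flee/;            # operates on $warning, not on $_

print "$warning\n";   # Fear, Fire, Flee!
print "$other\n";     # Foes are near
```

Only the string on the left of =~ is changed; nothing else is touched.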

Here we have substituted the first occurrence of 'Foes' with the word 'Flee'. Had we wanted to change every occurrence, we would have needed to use another modifier. Just as the /i modifier makes a match case-insensitive, the /g modifier makes a substitution act globally:

#!/usr/bin/perl
# subst1.plx
use warnings;
use strict;
$_ = "Awake! Awake! Fear, Fire, Foes! Awake! Fire, Foes! Awake!";
# Tolkien, Lord of the Rings
s/Foes/Flee/g;
print $_,"\n";

> perl subst1.plx
Awake! Awake! Fear, Fire, Flee! Awake! Fire, Flee! Awake!
>
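
To see the two forms side by side, the following sketch runs both on copies of the same string. It also uses the fact that s/// returns the number of substitutions it made, which the text above does not cover. The variable names are our own:

```perl
#!/usr/bin/perl
# Illustrative sketch: s/// with and without /g on the same text
use warnings;
use strict;

my $text = "Awake! Awake! Fear, Fire, Foes! Awake! Fire, Foes! Awake!";

# Work on copies so the two results can be compared.
my $once = $text;
my $all  = $text;

# s/// returns the number of substitutions it performed.
my $n_once = ($once =~ s/Foes/Flee/);    # no /g: at most one replacement
my $n_all  = ($all  =~ s/Foes/Flee/g);   # /g: every occurrence

print "without /g: $n_once\n";   # without /g: 1
print "with /g:    $n_all\n";    # with /g:    2
```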

Like the left-hand side of the substitution, the right-hand side also works like a double-quoted string and is thus subject to variable interpolation. One useful thing, though, is that we can use the backreference variables we collected during the match on the right-hand side. So, for instance, to swap the first two words in a string, we would say something like this:

#!/usr/bin/perl
# subst2.plx
use warnings;
use strict;
$_ = "there are two major products that come out of Berkeley: LSD and UNIX";
# Jeremy Anderson
s/(\w+)\s+(\w+)/$2 $1/;
print $_, "?\n"; 

>perl subst2.plx 
are there two major products that come out of Berkeley: LSD and UNIX?
>
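
Backreferences are not the only variables that interpolate on the right-hand side. As a small sketch, with variable names invented for the purpose, ordinary scalars work too and can be mixed with $1 and $2 :

```perl
#!/usr/bin/perl
# Illustrative sketch: ordinary variables interpolate in the replacement
use warnings;
use strict;

my $replacement = "Flee";            # hypothetical variable holding the new text
my $line = "Fear, Fire, Foes!";

$line =~ s/Foes/$replacement/;       # interpolated as in a double-quoted string
print "$line\n";                     # Fear, Fire, Flee!

# Backreferences and ordinary variables can be mixed.
my $sep  = " <-> ";
my $pair = "LSD UNIX";
$pair =~ s/(\w+)\s+(\w+)/$2$sep$1/;
print "$pair\n";                     # UNIX <-> LSD
```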

What would happen if we tried doing that globally? Well, let's do it and see:

#!/usr/bin/perl
# subst2.plx
use warnings;
use strict;
$_ = "there are two major products that come out of Berkeley: LSD and UNIX";
# Jeremy Anderson
s/(\w+)\s+(\w+)/$2 $1/g;
print $_, "?\n";

>perl subst2.plx
are there major two that products out come Berkeley of: and LSD UNIX?
>

Here, every word in a pair is swapped with its neighbor. When processing a global match, perl always starts where the previous match left off.
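
A shorter string makes the 'starts where the previous match left off' rule easier to see. This is only a sketch with our own test strings: with three words, the third has no partner left to pair with, and text that has already been replaced is not scanned again.

```perl
#!/usr/bin/perl
# Illustrative sketch: a global substitution resumes after the previous match
use warnings;
use strict;

# "one two" is swapped; matching resumes at " three", which has no partner.
(my $swapped = "one two three") =~ s/(\w+)\s+(\w+)/$2 $1/g;
print "$swapped\n";   # two one three

# The replacement text is not rescanned, so each original 'a' is doubled once.
(my $doubled = "aaa") =~ s/a/aa/g;
print "$doubled\n";   # aaaaaa
```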

Changing Delimiters

You may have noticed that // and s/// look like q// and qq// . Well, just like q// and qq// , we can change the delimiters when matching and substituting to increase the readability of our regular expressions. The same rules apply: any non-word character can be the delimiter, and paired delimiters such as <> , () , {} , and [] may be used - with two provisos.

First, if you change the delimiters on // , you must put an m in front of it (m for 'match'). This is so that perl can still recognize it as a regular expression, rather than a block or comment or anything else. Second, if you use paired delimiters with substitution, you must use two pairs:
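
Before looking at substitution, here is a quick sketch of the first proviso for plain matching. The path is a made-up example. All three tests check for the same pattern, but only the first needs its slashes escaped:

```perl
#!/usr/bin/perl
# Illustrative sketch: changing the delimiters on a match requires m
use warnings;
use strict;

my $path = "/usr/local/share/doc";   # hypothetical path

# Default delimiters: every slash in the pattern must be escaped.
print "slashes: matched\n" if $path =~ /\/usr\/local\//;

# Changed delimiters need the leading m, and the escapes disappear.
print "braces:  matched\n" if $path =~ m{/usr/local/};
print "hashes:  matched\n" if $path =~ m#/usr/local/#;
```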

s/old text/new text/g; 

becomes:

s{old text}{new text}g; 

You may, however, leave spaces or new lines between the pairs for the sake of clarity:

s{old text} {new text}g; 
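
As a runnable sketch of the same idea, with a sample string of our own, the two pairs can sit on separate lines:

```perl
#!/usr/bin/perl
# Illustrative sketch: paired delimiters split across two lines
use warnings;
use strict;

my $text = "old text here, old text there";   # hypothetical sample string

$text =~ s{old text}
          {new text}g;

print "$text\n";   # new text here, new text there
```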

The prime example of when you would want to do this is when you are dealing with file paths, which contain a lot of slashes. If you are, for instance, moving files on your Unix system from /usr/local/share/ to /usr/share/ , you may want to munge the file names like this:

s/\/usr\/local\/share\//\/usr\/share\//g; 

However, it's far easier and far less ugly to change the delimiters in this case:

s#/usr/local/share/#/usr/share/#g; 
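
To try this out, the sketch below applies the same substitution to a short list of file names. The names are invented for illustration, and each one is copied first so that the original list is left unchanged:

```perl
#!/usr/bin/perl
# Illustrative sketch: munging a list of hypothetical file names
use warnings;
use strict;

my @files = (
    "/usr/local/share/man/man1/perl.1",
    "/usr/local/share/doc/README",
    "/etc/passwd",                      # no match, so left as it is
);

# Copy each name, then run the substitution on the copy.
my @moved = map { (my $new = $_) =~ s#/usr/local/share/#/usr/share/#g; $new } @files;

print "$files[$_] -> $moved[$_]\n" for 0 .. $#files;
```

Names that do not contain /usr/local/share/ pass through untouched, since a substitution that fails to match leaves its string alone.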

Continued...


Friday 10th February 2012  © COPYRIGHT 2012 - website design by Website Design by Visualsoft