Perl Training Regex

PERL Regular Expressions
Regular Expressions (0)

It's a template that either matches or doesn't
match a given string.

One of the most important features of Perl is
its strong regular expression support.
A pattern is written between slashes: /PATTERN/

The Dirty Dozen Metacharacters
\ . * + ? ( ) | [ { ^ $
These characters have special meaning in
regular expressions.
A backslash in front of any metacharacter
makes it non-special.
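To illustrate the escaping rule, here is a minimal sketch. The sample strings and variable names are made up for this example; quotemeta-style \Q...\E is standard Perl, though the slides do not mention it.

```perl
use strict;
use warnings;

# Hypothetical sample text; single quotes keep $5 from being interpolated.
my $price = 'cost: $5.00 (approx.)';

# Unescaped, . matches any character, so "5x00" matches too.
my $loose  = ('5x00' =~ /5.00/)  ? 1 : 0;
# Escaped, \. matches only a literal dot.
my $strict = ('5x00' =~ /5\.00/) ? 1 : 0;
# \$ \( \) and \. match the literal characters in $price.
my $literal = ($price =~ /\$5\.00 \(approx\.\)/) ? 1 : 0;
# \Q...\E escapes every metacharacter in an interpolated string.
my $needle = '(approx.)';
my $quoted = ($price =~ /\Q$needle\E/) ? 1 : 0;

print "loose=$loose strict=$strict literal=$literal quoted=$quoted\n";
```

Using \Q...\E is usually the safer choice when the text to match comes from a variable rather than being typed into the pattern.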

. matches any character except a newline (\n).
/hello.you/ matches any string that has hello, followed by any
one (exactly one) character, followed by you.
Quantifiers decide how many times the preceding item has to be
repeated.
/to*ols/ the character before * may be repeated zero or more
times. Matches tools, tooooools, tols (but not toxols!)
/to+ols/ the character before + may be repeated one or more
times. Matches tools, tooooools (but not tols).
/to.*ols/ matches to, followed by any string (possibly empty), followed by ols.
Regular Expressions (3)
/to?ols/ the character before ? is optional. Thus only two
strings fit the whole pattern: tools and tols.
/to{2}ls/ the number in {} gives the number of repetitions
(matches tools).
{count} - Match exactly count times
{min,max} - Match at least min but not more than max times
{min,} - Match at least min times
Exercise: write the {} quantifiers equivalent to *, + and ?
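One possible answer to the exercise, as a small sketch: it pairs each shorthand quantifier with a {} form and checks that both behave the same on a few sample words. The word list and variable names are chosen only for this illustration.

```perl
use strict;
use warnings;

# Each shorthand quantifier paired with its {} form.
my %equivalent = (
    'to*ols' => 'to{0,}ols',    # *  is {0,}
    'to+ols' => 'to{1,}ols',    # +  is {1,}
    'to?ols' => 'to{0,1}ols',   # ?  is {0,1}
);

my @samples    = qw(tols tools tooools toxols);
my $mismatches = 0;

for my $short (sort keys %equivalent) {
    my $long = $equivalent{$short};
    for my $word (@samples) {
        # Anchor both patterns so the whole word has to match.
        my $m_short = ($word =~ /^$short$/) ? 1 : 0;
        my $m_long  = ($word =~ /^$long$/)  ? 1 : 0;
        $mismatches++ if $m_short != $m_long;
        print "$word: $short -> $m_short, $long -> $m_long\n";
    }
}
print "mismatches: $mismatches\n";
```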

Grouping parentheses ( ) are used for grouping one or more
characters.
/(tools)+/ matches toolstoolstoolstools.
Alternatives:
/hello (world|Perl)/ - matches hello world, hello Perl.
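A short sketch of why the parentheses matter, using made-up sample strings: without a group the quantifier applies only to the last character, and alternation inside a group is limited to that position in the pattern.

```perl
use strict;
use warnings;

# Without parentheses, + applies only to the final "s".
my $ungrouped = ('toolstools' =~ /^tools+$/)   ? 1 : 0;
# With parentheses, + repeats the whole group "tools".
my $grouped   = ('toolstools' =~ /^(tools)+$/) ? 1 : 0;

# Alternation inside a group: "hello " must be followed by world or Perl.
my @greetings = ('hello world', 'hello Perl', 'hello Python');
my @matched   = grep { /^hello (world|Perl)$/ } @greetings;

print "ungrouped=$ungrouped grouped=$grouped\n";
print "matched: ", join(', ', @matched), "\n";
```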

Character Class - A list of all possible characters
/Hello [abcde]/ matches Hello a, Hello b, ... Hello e
/Hello [a-e]/ the same as above
Negating:
[^abc] any character except a, b, c

Shortcuts
\d digit [0-9]
\w word character [A-Za-z0-9_]
\s whitespace [ \t\n\r\f]
Negation: ^ inside a class, e.g. [^\d] matches a non-digit
\S anything not \s
\D anything not \d
\W anything not \w
Exercise: write the character classes for:
1. Matching of vowels
2. Matching of consonants
3. Anything other than non-digits
What is the difference between \D and [^\d]?
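One possible set of answers, as a hedged sketch. The sample text is invented, the consonant class assumes plain ASCII letters, and the last check suggests that \D and [^\d] pick out the same characters.

```perl
use strict;
use warnings;

my $text = 'Perl 5, version 36';   # made-up sample text

# 1. Vowels; /i makes the class case insensitive.
my @vowels     = $text =~ /[aeiou]/gi;
# 2. Consonants: the letter ranges left over once vowels are removed.
my @consonants = $text =~ /[b-df-hj-np-tv-z]/gi;
# 3. "Anything other than non-digits" is simply the digits.
my @digits     = $text =~ /[^\D]/g;

# \D and [^\d] are two spellings of the same class.
my @by_shortcut = $text =~ /\D/g;
my @by_class    = $text =~ /[^\d]/g;

print "vowels: @vowels\n";
print "consonants: @consonants\n";
print "digits: @digits\n";
print "same: ", ("@by_shortcut" eq "@by_class" ? "yes" : "no"), "\n";
```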

Anchors
^ - marks the beginning of the string
$ - marks the end of the string
The three uses of ^:
/^abc/ - ^ anchors the match at the beginning of the string
/a\^bc/ - \^ matches a literal ^
/[^abc]/ - ^ negates the character class
/^Hello Perl/ - matches "Hello Perl, goodbye Perl", but not
"Perl Hello Perl"
What pattern will match blank lines?
/^\s*$/ - matches all blank lines
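A small sketch of the anchors at work on a made-up list of lines; the list and the variable names are only for illustration.

```perl
use strict;
use warnings;

my @lines = ("Hello Perl, goodbye Perl", "Perl Hello Perl", "", "  \t", "x");

my @starts_with = grep { /^Hello Perl/ } @lines;   # anchored at the start
my @ends_with   = grep { /Perl$/ }       @lines;   # anchored at the end
my @blank       = grep { /^\s*$/ }       @lines;   # empty or whitespace only

printf "start: %d, end: %d, blank: %d\n",
    scalar @starts_with, scalar @ends_with, scalar @blank;
```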

\b - matches at either end of a word (matches the start or the
end of a group of \w characters)
/\bPerl\b/ - matches in Hello Perl, Perl and Perl++ (+ is not a
word character), but not in Perlish or myPerl
Which part of "That's my house" does /^\w+\b/ match?
\B - the negation of \b (matches wherever \b does not)
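The boundary rules are easy to check directly. This sketch tries /\bPerl\b/ on a few made-up strings; note that Perl++ does match, because + is not a \w character and so a boundary follows the l.

```perl
use strict;
use warnings;

my %hit;
for my $s ('Hello Perl', 'Perl', 'Perl++', 'Perlish', 'myPerl') {
    $hit{$s} = ($s =~ /\bPerl\b/) ? 1 : 0;
}

# ^\w+\b stops at the first non-word character, here the apostrophe.
my ($first) = "That's my house" =~ /^(\w+)\b/;

# \B matches only inside a word, so it finds Perl embedded in myPerlish.
my $inside = ('myPerlish' =~ /\BPerl\B/) ? 1 : 0;

print "$_ => $hit{$_}\n" for sort keys %hit;
print "first word part: $first\n";
print "inside: $inside\n";
```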

Back references:
/(World|Perl) \1/ - matches World World, Perl Perl.
/((hello|hi) (world|Perl))/
\1 refers to (hello|hi) (world|Perl)
\2 refers to (hello|hi)
\3 refers to (world|Perl)
$1, $2, $3 store the values of \1, \2, \3 after
a regular expression has matched.

Option modifiers
/i : Case insensitive
/s : . will also match \n
/m : Let ^ and $ match next to embedded \n
/x : Ignore whitespace (and allow comments) in the pattern
/o : Compile the pattern only once
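A minimal sketch of four of these modifiers on a made-up two-line string (/o is left out, since its effect is not visible in the result of a match).

```perl
use strict;
use warnings;

my $text = "first line\nSecond LINE\n";

my $with_i    = ($text =~ /second line/i)   ? 1 : 0;   # /i: case is ignored
my $without_s = ($text =~ /first.*Second/)  ? 1 : 0;   # . stops at \n
my $with_s    = ($text =~ /first.*Second/s) ? 1 : 0;   # /s: . also matches \n
my @starts    = $text =~ /^(\w+)/mg;                   # /m: ^ matches after each \n

# /x: whitespace and comments inside the pattern are ignored.
my $with_x = ($text =~ / (\w+)   # a word
                         \s+     # whitespace
                         line    # the literal text
                       /x) ? 1 : 0;

print "i=$with_i no_s=$without_s s=$with_s x=$with_x starts=@starts\n";
```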

Bind Operator
=~
Tells Perl to match the pattern on the right
against the string on the left.
Pattern match operator m//

$str =~ /pattern/;
$str =~ m/pattern/;

When no variable is mentioned, the pattern is
matched against the default variable $_
if( $str =~ /hello/ ){
    ...
}
while( <STDIN> ){
    if( /hello/ ){    # matched against $_
        ...
    }
}
@words = split /\s+/, $str;
Examples
$date = "12 10\n10";
if( $date =~ /(\d+)/ ){
    print $1.":".$2.":".$3.":\n";
}
#output ($2 and $3 are empty):
#12:::
if( $date =~ /(\d+)(\s+\1)+/ ){
    print $1.":".$2.":".$3.":\n";
}
#output (notice $3 is empty; $2 holds the newline and the second 10):
#10:
#10::
$str = "Hello World";
if( $str =~ /((Hello|Hi) (World|Perl))/ ){
    print $1.":".$2.":".$3.":\n";
}
#output:
#Hello World:Hello:World:
$str = "Hello Perl Hi";
if( $str =~ /((Hello|Hi) (World|Perl)) \1/ ){
    print $1.":".$2.":".$3.":\n";
}
#output: none (\1 requires a second "Hello Perl")
$str = "Hi Perl Hi Perl";
if( $str =~ /((Hello|Hi) (World|Perl)) \1/ ){
    print $1.":".$2.":".$3.":\n";
}
#output:
#Hi Perl:Hi:Perl:
Examples
1. What is it?
/^0x[0-9a-fA-F]+$/
2. Date format: Month-Day-Year -> Year:Day:Month

$date = "12-31-1901";
$date =~ s/(\d+)-(\d+)-(\d+)/$3:$2:$1/;
Examples
3. Make a pattern that matches any line of input that has
the same word repeated two (or more) times in a row.
Whitespace between words may differ.
4. Which part of "That's my house" does /^\w+\b/ match?
Example
1. /\w+/ #matches a word
2. /(\w+)/ #to remember later
3. /(\w+)\1/ #two times
4. /(\w+)\s+\1/ #whitespace between words
5. "This is a test" matches by mistake (is is) -> /\b(\w+)\s+\1/
6. "This is the theory" matches by mistake (the the) -> /\b(\w+)\s+\1\b/
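Putting the steps together, here is a sketch of the repeated-word check on a few made-up lines. Wrapping \s+\1 in a group with + is an assumption about how to cover the "two or more times" part of the task; the steps above stop at two repetitions.

```perl
use strict;
use warnings;

my @lines = (
    'This is a test',
    'This is the theory',
    'Paris in the the spring',
    "it was was\twas fine",
);

# \b on both sides keeps "This is" and "the theory" from matching;
# (\s+\1)+ allows two or more repetitions with any whitespace between.
my @doubled = grep { /\b(\w+)(\s+\1)+\b/ } @lines;

print "$_\n" for @doubled;
```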
Let's try
1) Write a regular expression that identifies a 24-hour
clock time. For example: 0:01, 00:20, 15:00, 23:59
2) Write a regular expression that identifies a floating
point number. For example: 10, 10.0001, -0.1, +001.3456789
For both, write a single program that identifies these
patterns in the input lines and prints out only the
matched patterns.
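One possible approach, offered as a sketch and not as the intended solution: split each line into tokens and test every token with an anchored pattern. The helper names classify and find_patterns are hypothetical, and the tokenizing on whitespace and commas is an assumption about the input format.

```perl
use strict;
use warnings;

# Returns 'clock', 'float', or undef for a single token.
sub classify {
    my ($token) = @_;
    # Hours 0-23 with an optional leading zero, then minutes 00-59.
    return 'clock' if $token =~ /^([01]?\d|2[0-3]):[0-5]\d$/;
    # Optional sign, digits, optional fractional part.
    return 'float' if $token =~ /^[-+]?\d+(\.\d+)?$/;
    return;
}

# Returns "kind token" strings for every token in the line that matches.
sub find_patterns {
    my ($line) = @_;
    my @found;
    for my $token (split /[\s,]+/, $line) {
        my $kind = classify($token);
        push @found, "$kind $token" if defined $kind;
    }
    return @found;
}

my @found = find_patterns(
    '0:01, 00:20, 15:00, 23:59 24:00 10, 10.0001, -0.1, +001.3456789 1.2.3');
print "$_\n" for @found;
```

To process real input, the call could be wrapped in a while( <STDIN> ) loop that passes $_ to find_patterns.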
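Negated Match
Negation: !~ is true when the pattern does not match
if( $str =~ /hello/ ){ ... }    # $str contains hello
if( $str !~ /hello/ ){ ... }    # $str does not contain hello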

$& - what really was matched
$` - what was before
$' - the rest of the string after the matched pattern
$` . $& . $' - original string
Caution: never use these variables in your script unless you
really need them; in older versions of Perl they slow down every
pattern match in the program.
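If the three pieces are needed, Perl 5.10 and later offer the /p modifier with ${^PREMATCH}, ${^MATCH} and ${^POSTMATCH}, which limit the cost to the one match that asks for them. A minimal sketch, with a made-up sample string:

```perl
use strict;
use warnings;

my $str = 'Hello big World';

my ($before, $matched, $after);
if ($str =~ /big/p) {
    # With /p these hold what $`, $& and $' would hold.
    ($before, $matched, $after) = (${^PREMATCH}, ${^MATCH}, ${^POSTMATCH});
}

printf "[%s][%s][%s]\n", $before, $matched, $after;
print "rebuilt: ", $before . $matched . $after, "\n";
```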

Substitutions:
s/T/U/; #substitutes T with U (only once)
s/T/U/g; #global substitution
s/\s+/ /g; #collapses runs of whitespace into one space
s/(\w+) (\w+)/$2 $1/g; #swaps each pair of words
s/T/U/; #applied to the $_ variable
$str =~ s/T/U/; #applied to $str
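The substitutions above can be tried on small made-up strings. This sketch also uses the return value of s///g, which is the number of replacements made.

```perl
use strict;
use warnings;

my $once = 'TTAT';
$once =~ s/T/U/;                       # only the first T is replaced

my $all   = 'TTAT';
my $count = ($all =~ s/T/U/g);         # every T is replaced; returns the count

my $spaced = "a  b \t c";
$spaced =~ s/\s+/ /g;                  # runs of whitespace become one space

my $swapped = 'one two three four';
$swapped =~ s/(\w+) (\w+)/$2 $1/g;     # swap each pair of words

printf "%s %s (%d) [%s] [%s]\n", $once, $all, $count, $spaced, $swapped;
```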

File Extension Renaming:
my ($from, $to) = @ARGV;
my @files = glob("*.$from");
foreach my $file (@files){
    my $newfile = $file;
    # replace only the trailing extension; \Q...\E keeps $from literal
    $newfile =~ s/\.\Q$from\E$/.$to/;
    rename($file, $newfile);
}
Split and Join

$str = "aaa bbb\nccc\ndddd";
@words = split /\s+/, $str;

$str = join ":", @words;
#result is aaa:bbb:ccc:dddd
@words = split /\s+/, $_;   # " aaa b" -> "", "aaa", "b"
@words = split;             # " aaa b" -> "aaa", "b"
@words = split ' ', $_;     # " aaa b" -> "aaa", "b"
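The difference between the split forms shows up only when the string starts with whitespace. A small sketch with made-up input:

```perl
use strict;
use warnings;

my $line = " aaa b";

my @by_regex = split /\s+/, $line;   # leading whitespace yields an empty first field
my @by_space = split ' ', $line;     # special case: leading whitespace is skipped

my $joined = join ':', split ' ', "aaa bbb\nccc\tdddd";

printf "%d fields vs %d fields\n", scalar @by_regex, scalar @by_space;
print "$joined\n";
```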
Grep
grep EXPR, LIST;

@results = grep /^>/, @array;
@results = grep /^>/, <FILE>;
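A brief sketch of grep with a pattern, on a made-up array. It shows the EXPR form from above, the equivalent BLOCK form, and the count returned in scalar context.

```perl
use strict;
use warnings;

# Made-up sample data: header lines start with >.
my @array = ('>seq1', 'ACGT', '>seq2', 'TTGA');

my @headers = grep /^>/, @array;        # EXPR form
my @others  = grep { !/^>/ } @array;    # BLOCK form, negated
my $count   = grep /^>/, @array;        # scalar context: number of matches

print "headers: @headers\n";
print "others: @others\n";
print "count: $count\n";
```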
Thank You !!!
