Regex lookahead, lookbehind and atomic groups are written as follows:
(?!) - negative lookahead
(?=) - positive lookahead
(?<=) - positive lookbehind
(?<!) - negative lookbehind
(?>) - atomic group
EX 1:
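Given the string foobarbarfoo:
bar(?!bar) finds the second bar in the string (the bar that is not followed by another bar).
bar(?=bar) finds the first bar in the string (the bar that is followed by another bar).
(?<=foo)bar finds the first bar in the string (the bar that is preceded by foo).
(?<!foo)bar finds the second bar in the string (the bar that is not preceded by foo).
You can also combine them:
(?<=foo)bar(?=bar) finds the first bar, which is both preceded by foo and followed by bar.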
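The claims in EX 1 can be checked with a short script. This is a minimal sketch that assumes Python's re module as the engine (the patterns themselves are the ones from the text); match_starts is a helper name made up for this illustration. In foobarbarfoo the first bar starts at offset 3 and the second at offset 6.

```python
import re

text = "foobarbarfoo"

def match_starts(pattern, s):
    """Return the start offset of every match of pattern in s (hypothetical helper)."""
    return [m.start() for m in re.finditer(pattern, s)]

# Offset 3 is the first "bar", offset 6 is the second "bar".
print(match_starts(r"bar(?!bar)", text))          # [6]  negative lookahead
print(match_starts(r"bar(?=bar)", text))          # [3]  positive lookahead
print(match_starts(r"(?<=foo)bar", text))         # [3]  positive lookbehind
print(match_starts(r"(?<!foo)bar", text))         # [6]  negative lookbehind
print(match_starts(r"(?<=foo)bar(?=bar)", text))  # [3]  combined
```

In every case the match itself is only the three characters of bar; the lookaround parts are checked but never become part of the match.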
EX 2:
Check for any 5 characters, then a whitespace character, then a non-whitespace character, without consuming any of them:
(?=.{5}\s\S)
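A small sketch of how the EX 2 check behaves, assuming Python's re module and assuming the check is meant to apply at the start of the string (re.match only tries offset 0). The name passes_check is hypothetical.

```python
import re

CHECK = re.compile(r"(?=.{5}\s\S)")

def passes_check(s):
    """True if s starts with 5 characters, then whitespace, then non-whitespace."""
    # re.match anchors the attempt at offset 0 (an assumption of this sketch).
    return CHECK.match(s) is not None

print(passes_check("hello world"))   # True
print(passes_check("hello  world"))  # False: the 7th character is a space, not \S
print(passes_check("hi world"))      # False: the 6th character is "r", not \s

# The lookahead is zero-width: the match is empty and ends where it started.
m = CHECK.match("hello world")
print(repr(m.group()), m.span())     # '' (0, 0)
```

Because nothing is consumed, a pattern like this is usually placed in front of another expression that does the actual matching.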
EX 3:
^(?=.{3}$).*
^ # The caret is an anchor which denotes "STARTS WITH"
(?= # lookahead
. # wildcard match; the . matches any non-new-line character
{3} # quantifier; exactly 3 times
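$ # dollar sign; it still acts as an anchor inside the lookahead and means "THE END" of the string
) # end of lookahead
. # wildcard match; the . matches any non-new-line character
* # quantifier; any number of times, including 0 times
Taken together, the pattern matches only strings that are exactly 3 characters long (not counting a final newline, which $ tolerates).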
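To see that the $ inside the lookahead really does act as an anchor, the EX 3 pattern can be run against strings of different lengths. This sketch assumes Python's re module, where $ behaves like Perl's and also matches just before a final newline; match_exactly_three is a hypothetical helper name.

```python
import re

EXACTLY_THREE = re.compile(r"^(?=.{3}$).*")

def match_exactly_three(s):
    """Return the matched text, or None if s is not exactly 3 characters long."""
    m = EXACTLY_THREE.match(s)
    return m.group() if m else None

print(match_exactly_three("abc"))     # abc
print(match_exactly_three("ab"))      # None: too short for .{3}
print(match_exactly_three("abcd"))    # None: $ fails because "d" follows
print(match_exactly_three("abc\n"))   # abc: $ also matches before a final newline
```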
EX 4:
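# Replace everything between <no> and </no> with " 000 ", leaving the tags themselves untouched.
$a = "<no> 3232 </no> ";
$a =~ s#(?<=<no>).*?(?=</no>)# 000 #gi;
print "$a\n"; # prints "<no> 000 </no> "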
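For comparison, here is a rough Python equivalent of the EX 4 substitution. It is a sketch only: re.sub with re.IGNORECASE stands in for Perl's s###gi, and replace_between_tags is a name invented for this example.

```python
import re

def replace_between_tags(s, replacement=" 000 "):
    """Replace the text between <no> and </no>, keeping the tags themselves."""
    # The lookbehind and lookahead assert the tags without consuming them,
    # so only the text in between is replaced.
    return re.sub(r"(?<=<no>).*?(?=</no>)", replacement, s, flags=re.IGNORECASE)

a = "<no> 3232 </no> "
print(replace_between_tags(a))  # <no> 000 </no>
```

The tags survive because they sit inside lookarounds; had they been part of the match, they would have to be written out again in the replacement.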
EX 5:
perl -pe 's/(.)(?=.*?\1)//g' FILE_NAME
The regex used is:
(.)(?=.*?\1)
. : matches any character.
first () : remembers the matched single character.
(?=...) : positive lookahead.
.*? : matches anything in between.
\1 : the remembered match.
(.)(?=.*?\1) : matches and remembers any character only if it appears again later in the string.
s/// : the Perl way of doing a substitution.
g : do the substitution globally, that is, don't stop after the first substitution.
s/(.)(?=.*?\1)//g : deletes a character from the input only if that character appears again later in the same line, so only the last occurrence of each character survives.
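The effect of the EX 5 substitution is easiest to see on a concrete string. The sketch below assumes Python's re module in place of the perl -pe one-liner and works on one line at a time; keep_last_occurrences is a hypothetical name.

```python
import re

def keep_last_occurrences(line):
    """Delete every character that appears again later in the same line."""
    # (.) captures one character; (?=.*?\1) succeeds only if that same
    # character occurs somewhere later, in which case the character is removed.
    return re.sub(r"(.)(?=.*?\1)", "", line)

print(keep_last_occurrences("hello"))        # helo
print(keep_last_occurrences("abracadabra"))  # cdbra
```

Each character ends up appearing once, at the position of its last occurrence, which is why abracadabra becomes cdbra rather than abrcd.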