Tuesday, September 6, 2011

Secret Perl Operators

The Spaceship Operator <=>

<=> is the spaceship operator. Most commonly it's used to sort a list of numbers. Here is an example:
my @numbers = (-59, 99, 87, 1900, 42, 1, -999, 30000, 0);
my @sorted = sort { $a <=> $b } @numbers;
print "@sorted\n";

# output: -999 -59 0 1 42 87 99 1900 30000
If you call sort() without a comparison block, it compares the elements as strings and sorts them asciibetically:
my @numbers = (-59, 99, 87, 1900, 42, 1, -999, 30000, 0);
my @sorted = sort @numbers;
print "@sorted\n";

# output: -59 -999 0 1 1900 30000 42 87 99
In general the spaceship operator is defined as follows:
  • $a <=> $b is -1 if $a < $b.
  • $a <=> $b is 0 if $a == $b.
  • $a <=> $b is 1 if $a > $b.
  • $a <=> $b is undef if either $a or $b is NaN.
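As a small sketch of the cases listed above, the snippet below collects the results of a few comparisons. The variable names are illustrative only, and the NaN value is produced as infinity minus infinity, which assumes a platform with IEEE 754 floating point.

```perl
use strict;
use warnings;

# Ordinary comparisons: less than, equal, greater than.
my @results = (1 <=> 2, 2 <=> 2, 3 <=> 2);
print "@results\n";    # prints: -1 0 1

# Assumption: 9**9**9 overflows to infinity, and inf - inf yields NaN.
my $inf     = 9**9**9;
my $nan     = $inf - $inf;
my $nan_cmp = $nan <=> 1;

# On platforms with NaN support the comparison result is undef.
print defined $nan_cmp ? "defined\n" : "undef\n";
```

Because sort expects -1, 0 or 1 from its comparison block, it is probably wise to filter NaN values out before sorting numerically.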

The Eskimo Greeting Operator }{

The Eskimo greeting operator is most often seen in Perl one-liners.
For example, this one-liner uses the Eskimo greeting to emulate the `wc -l` command and print the number of lines in a file:
perl -lne '}{ print $.' file
Here the Eskimo greets the print function. To understand what happens here, you have to know what the -n command line option does. It causes Perl to assume the following loop around your program:
while (<>) {
  ...
}
Here `...` stands for the code given with the -e command line option. If that code is `}{ ...`, the `}` closes the while loop with an empty body and the `{` opens a new bare block, so the `...` part runs only once, after all input has been read.
Therefore the one-liner above is equivalent to:
while (<>) {
}
{
print $.
}
This just prints the special variable $., which holds the number of input lines read so far.
This can be extended further and we can have Eskimo greet code on both sides:
perl -lne 'code1 }{ code2'
Code1 gets executed within the loop and code2 after the loop is done:
while (<>) {
  code1
}
{
  code2
}
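To make the expansion concrete, here is a rough sketch of what a one-liner such as perl -lne '$chars += length }{ print "$. lines, $chars chars"' would do, written out as a script. It reads from an in-memory string instead of a file so that it can run on its own. The sample text and the variable names are made up for illustration, and chomp stands in for the line-ending handling of -l.

```perl
use strict;
use warnings;

# Hypothetical sample input, read through an in-memory filehandle.
my $text = "alpha\nbeta\ngamma\n";
open my $fh, '<', \$text or die "cannot open in-memory file: $!";

my $chars = 0;
my $summary;

while (my $line = <$fh>) {
    # code1: runs once per input line
    chomp $line;
    $chars += length $line;
}
{
    # code2: runs once, after the loop has consumed all input
    $summary = "$. lines, $chars chars";
}
print "$summary\n";    # prints: 3 lines, 14 chars
```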
If you are interested in the topic of Perl one-liners, see the first part of my article "Perl One-Liners Explained".

The Goatse Operator =()=

The Goatse operator, as nasty as it may sound, doesn't do any nasty things. Instead it does a wonderful thing and causes the expression on its right to be evaluated in list context.
Here is an example:
my $str = "5 foo 6 bar 7 baz";
my $count =()= $str =~ /\d/g;
print $count;
This program prints 3, the number of digits in $str. How does it do it? Let's deparse the second line:
(my $count = (() = ($str =~ /\d/g)));
What happens here is that the result of ($str =~ /\d/g) is assigned to the empty list (). Assigning to a list imposes list context on the right-hand side, so the match returns every digit it finds. The list assignment (() = ($str =~ /\d/g)) is itself then assigned to a scalar, which means the list assignment is evaluated in scalar context. The key thing to remember is that a list assignment in scalar context returns the number of elements on the right-hand side of the assignment. In this example the right-hand side is ($str =~ /\d/g), which matches globally (/g flag) and finds 3 digits in $str. Therefore the result is 3.
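The same idiom counts the results of any expression that returns a list. The sketch below is only illustrative. get_items() is a hypothetical helper introduced here to contrast a counted call with a plain scalar assignment.

```perl
use strict;
use warnings;

my $str = "5 foo 6 bar 7 baz";

# Count matches without keeping them.
my $digit_count = () = $str =~ /\d/g;
my $word_count  = () = $str =~ /[a-z]+/g;

# Hypothetical helper that returns a three-element list.
sub get_items { return ('a', 'b', 'c') }

my $item_count = () = get_items();    # list assignment in scalar context: 3
my $last_item  = get_items();         # plain scalar context: last element, 'c'

print "$digit_count $word_count $item_count $last_item\n";    # prints: 3 3 3 c
```

Without the empty list in the middle, the right-hand side is simply evaluated in scalar context, which for this helper yields its last element rather than a count.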

The Turtle Operator "@{[]}"

I couldn't find a name for this operator, so I decided to call it the turtle operator because it looks a bit like a turtle: @ is the head and {[]} is the shell.
This operator is useful for interpolating the result of an expression, such as a function call that returns a list, inside a string.
Compare these two examples:
print "these people @{[get_names()]} get promoted"
and
print "these people ", join(" ", get_names()), " get promoted"
Clearly, the first example wins for code clarity.
More precisely, writing
print "@{[something]}"
is exactly the same as writing
print join $", something
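A short sketch of that equivalence follows. get_names() is a hypothetical helper that returns a fixed list, and the other variable names are also invented for the example.

```perl
use strict;
use warnings;

# Hypothetical helper that returns a list of names.
sub get_names { return ('Alice', 'Bob', 'Carol') }

my $interpolated = "these people @{[ get_names() ]} get promoted";
my $joined       = "these people " . join($", get_names()) . " get promoted";
print "$interpolated\n";    # prints: these people Alice Bob Carol get promoted

# Any expression works, not only function calls.
my $sum = "2 + 3 = @{[ 2 + 3 ]}";
print "$sum\n";             # prints: 2 + 3 = 5

# Changing $" changes the separator used by the interpolation.
my $listed;
{
    local $" = ', ';
    $listed = "@{[ get_names() ]}";
}
print "$listed\n";          # prints: Alice, Bob, Carol
```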

The Inchworm Operator ~~

The inchworm operator can be used to force scalar context.
Here is an example with the localtime() function. In scalar context localtime() returns a human-readable time string, but in list context it returns a 9-element list of date and time fields.
$ perl -le 'print ~~localtime'
Mon Nov 30 09:06:13 2009
Here localtime was evaluated in scalar context, even though it was called within print, which imposes list context on its arguments. It returned a human-readable date and time.
$ perl -le 'print localtime'
579301010913330
Here localtime returned a list of 9 elements and the print function just printed them one after another. To really see that it's a list of 9 elements, let's use the turtle operator:
$ perl -le 'print "@{[localtime]}"'
5 13 9 30 10 109 1 333 0
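For comparison, the sketch below puts the inchworm next to the built-in scalar() function, which is the conventional way to ask for scalar context. The fixed timestamp 0 and the variable names are arbitrary choices for the example.

```perl
use strict;
use warnings;

my @fields       = localtime(0);           # list context: 9 elements
my $via_inchworm = ~~localtime(0);         # scalar context via the inchworm
my $via_scalar   = scalar localtime(0);    # scalar context via scalar()

my $same = ($via_inchworm eq $via_scalar) ? "same" : "different";
print scalar(@fields), " $same\n";         # prints: 9 same

# An array in scalar context yields its element count.
my @list  = (10, 20, 30);
my $count = ~~@list;
print "$count\n";                          # prints: 3
```

Since the inchworm is really two bitwise negations, scalar() is probably the safer choice whenever the value might be negative or fractional.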

The Inchworm-On-A-Stick Operator ~-

For integers greater than 0, this operator decrements them by one. Example:
my $x = 5;
print ~-$x;

# prints 4
It works because ~-$x parses as ~(-$x). On a two's-complement machine ~$y equals -$y - 1, so ~(-$x) is effectively the same as $x - 1.
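The sketch below checks this for a few small positive integers and shows a practical side effect of using unary operators: they bind more tightly than multiplication, so no parentheses are needed. The variable names are illustrative.

```perl
use strict;
use warnings;

# Decrement each of 1..5 with the inchworm on a stick.
my @decremented = map { ~-$_ } 1 .. 5;
print "@decremented\n";    # prints: 0 1 2 3 4

my $x      = 5;
my $scaled = ~-$x * 2;     # parsed as (~-$x) * 2
my $plain  = ($x - 1) * 2; # the ordinary spelling
print "$scaled $plain\n";  # prints: 8 8
```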

The Spacestation Operator -+-

The spacestation operator turns a string that starts with a positive number into a number. Here are some examples:
print -+-"4zy";   # prints 4
print -+-'3.99';  # prints 3.99
print -+-'2e5';   # prints 200000

The Venus Operator 0+

It's named the Venus operator because the astronomical symbol for the planet Venus looks similar.
It does the same thing as the spacestation operator: it numifies a string. However, it binds less tightly than the spacestation. An example:
print 0+"4zy"  # prints 4
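The difference in binding strength shows up when another operator sits to the left. The following sketch multiplies by 2 in front of each operator. The variable names are made up, and numeric warnings are switched off because "4zy" is deliberately not a clean number.

```perl
use strict;
use warnings;
no warnings 'numeric';    # "4zy" is intentionally only partly numeric

my $venus   = 2 * 0+"4zy";     # parsed as (2 * 0) + "4zy"
my $station = 2 * -+-"4zy";    # parsed as 2 * (-+-"4zy")

print "$venus $station\n";     # prints: 4 8
```

With Venus the multiplication grabs the 0 first, so parentheses around 0+"4zy" would be needed to get 8.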
