Tuesday, February 1, 2011

The usual advice

These are some of the usual things that people get wrong. Okay, some of them are advice… but perhaps advice you should follow. For the beginner who’s gotten themselves into deeper trouble, there’s more advice over there.

strict

You really really should use strict — this pragma restricts unsafe constructs.
strict helps enforce good programming habits, and can help you avoid bugs as simple as a typo or the subtle and confusing ones that liquify your brain. There are three categories of things to be strict about:
  1. refs – generates a runtime error if you use symbolic references
  2. vars – generates a compile-time error if you don’t declare your variables
  3. subs – disables “poetry optimization” – that is, generates a compile-time error if you try to use a bareword that’s not a subroutine
If you really need to relax some of these rules, do so for the smallest scope possible, and relax only the strictness(es) you need relaxed:
use strict;
{
    no strict qw(vars);
    $foo = 0; # would blow up with strict vars
}
$bar = 0; # blows up - strictness back in effect
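To see the "bugs as simple as a typo" case in action, here is a minimal sketch, assuming a hypothetical misspelled variable `$totl`. The string eval is only there so the example can report the compile-time error and keep running; in real code the program would simply refuse to compile.

```perl
use strict;
use warnings;

my $total = 10;

# $totl is a typo for $total. Under strict vars it is an undeclared
# variable, so compiling this snippet fails instead of silently using
# a brand-new, empty global.
my $result = eval q{ $totl + 1 };
print "strict caught: $@" unless defined $result;
```

Without strict, the same expression would quietly evaluate to 1 and the typo would go unnoticed.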

warnings

You really really should also use warnings: this pragma controls optional warnings. Note that while the warnings are technically “optional”, you should consider them mandatory. The warnings pragma is a replacement for the -w command line flag, but the pragma is scoped to the enclosing block, while the flag is global. See perllexwarn for details.
warnings also helps enforce good programming habits, and can help you avoid subtle and confusing bugs. There are a number of warning categories enabled by default. If you really need to disable some of these warnings, do so for the smallest scope possible, and relax only the warnings you need relaxed:
use strict;
use warnings;
my $text;
{
    no warnings qw(uninitialized);
    print "$text\n";    # normally warns "Use of uninitialized value $text in concatenation (.) or string"
}
print "$text\n"; # warnings back in effect!

indentation and code style

Format your code so it is readable. Do it for your own sanity, and certainly do it if you expect anyone else to read your code to help you out. Check out perltidy, which has a plethora of options like indent does, but works on Perl code. Stick with either tabs or spaces (spaces are better!).
Likewise, write your code in paragraphs with empty lines between them, and maybe a comment or two explaining what’s going to happen in the next paragraph. Consider breaking monolithic code up into a number of subroutines. This makes it much easier to test and work on.
While useful comments are… useful, avoid intrusive ################### and block comments. While using comments for internal documentation is fine, you should take the time to learn about Perl’s Plain Ol’ Documentation (aka POD) format, which is oh so very nifty.

lexical filehandles

You should use lexical filehandles instead of barewords (globals):
open TMP, "<$filename"; # bareword/global filehandle (bad)
open my $tmp, "<$filename"; # lexical filehandle (good)
Lexical filehandles are scalars just like any other, and they are scoped the same way.

three-argument open

The two-argument form of open shown above is insecure, because Perl parses the mode out of the same string as the filename: a name that starts or ends with characters like >, < or | changes what open does. Instead, specify the mode to open the file with separately, with the three-argument form:
open my $fh, '<', $filename;
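A minimal sketch of why the separate mode matters, assuming a hypothetical filename that came from outside the program and happens to start with `>`:

```perl
use strict;
use warnings;

# Hypothetical, hostile-looking filename: the leading '>' is part of the name.
my $filename = '>important.txt';

# Two-argument open would read the leading '>' as "open for writing"
# and clobber important.txt:
#     open my $fh, $filename;
#
# Three-argument open treats the whole string as a literal filename and
# only ever reads, so nothing gets written.
my $opened = open my $fh, '<', $filename;
print $opened ? "opened '$filename' for reading\n"
              : "could not open '$filename': $!\n";
```

Unless a file with that odd name really exists, the open just fails, which is the harmless outcome you want here.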

check the return value of open

Calls to open can fail, so check the return value:
# if open fails, $! will have an error message
open(my $fh, '<', $filename) or die "Couldn't open '$filename' - $!";
Alternatively, you can use the autodie pragma, which replaces some functions which return false on failure with ones that succeed or die (“It is better to die() than to return() in failure”).
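A minimal sketch of the autodie approach, assuming a hypothetical path that does not exist. The eval is only there so the example can print the error and carry on; normally you would just let the exception end the program.

```perl
use strict;
use warnings;
use autodie;    # open, close and friends now die on failure

# Hypothetical path that should not exist.
my $filename = '/no/such/dir/no-such-file.txt';

# No "or die" needed: autodie throws an exception for us.
my $ok = eval {
    open my $fh, '<', $filename;
    close $fh;
    1;
};
print "open failed: $@" unless $ok;
```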

don't use foreach on lines

Don't be fooled into doing
foreach my $line (<$filehandle>) {
    ...;
}
This reads in the whole file and then iterates over the lines. What you want is almost certainly a while loop, which reads one line at a time, executing the block as you go:
while (my $line = <$filehandle>) {
    ...;
}
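Here is a filled-in sketch of that while loop. It reads from an in-memory filehandle (an assumption made so the example is self-contained); a filehandle on a real file behaves the same way.

```perl
use strict;
use warnings;

# In-memory "file" so the sketch runs on its own.
my $data = "alpha\nbeta\ngamma\n";
open my $fh, '<', \$data or die "Couldn't open in-memory file - $!";

my $count = 0;
while (my $line = <$fh>) {
    chomp $line;                 # strip the trailing newline
    $count++;
    print "$count: $line\n";
}
close $fh;
```

Only one line is held in memory at a time, so this works just as well on a file that is far larger than your RAM.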

pay attention to your quotes

In Perl, there are many types of quotes, so please say what you mean and mean what you say.
"" (double-quotes) are used when you need interpolation, while '' (single-quotes) are used when you don't want interpolation:
print "line one\nline two\n...\nline $n\n";
print 'mike@email.com';
The same applies to heredocs:
my $sender = 'Buffy the Vampire Slayer';
my $recipient = 'Spike';
print <<"END";
Dear $recipient,

I wish you to leave Sunnydale and never return.

Not Quite Love,
$sender
END
versus:
print <<'END';
A $ sigil indicates that the variable is a $scalar,
while a @ indicates an @array.
END
For more flexibility, and to allow you to avoid painful backslash escapes, Perl has several quote-like operators - see perlop#Quote-Like-Operators for the gory details. Regardless of what you use, be clear what you mean.
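A quick sketch of the three quote-like operators you will meet most often, using a hypothetical `$user` value. Note how neither string needs a single backslash even though both contain an apostrophe and double quotes.

```perl
use strict;
use warnings;

my $user = 'spike';    # hypothetical value

my $literal = q{It's "$user" literally};       # like '', no interpolation
my $interp  = qq{It's "$user" interpolated};   # like "", interpolates $user
my @words   = qw(one two three);               # list of whitespace-separated words

print "$literal\n$interp\n@words\n";
```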

know how and when to use hashes and references

Do you know what a hash is? Please read perldsc (data structures cookbook) for a tutorial on creating complex data structures. Follow that with perllol (lists of lists), then perlreftut (references tutorial).
You create a reference either by using auto-vivification (see below), or with a backslash and your data structure. A reference (to anything) is a scalar. That's how you can pass arrays into subs the way you meant to:
my @stuff = qw(one two three);  # @stuff is an array
my_sub(\@stuff, $filename);     # \@stuff is a reference to @stuff
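For the receiving end, here is a minimal sketch with a hypothetical body for my_sub (the name comes from the call above; what it does is made up for illustration):

```perl
use strict;
use warnings;

# Hypothetical my_sub: the array arrives as one scalar (a reference),
# so the second argument is not swallowed into the list.
sub my_sub {
    my ($stuff, $filename) = @_;
    my $count = scalar @{$stuff};    # dereference to get at the array
    return "$count items for $filename";
}

my @stuff    = qw(one two three);
my $filename = 'out.txt';            # hypothetical filename
my $summary  = my_sub(\@stuff, $filename);
print "$summary\n";
```

Had you passed @stuff itself, the array would have been flattened into the argument list and the sub could not tell where the array ends and the filename begins.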
The section on "auto-vivification" is unnecessarily complex. Suffice to say, if you write to part of a complex data structure, Perl automagically creates the intervening levels of the structure if they don't exist yet:
my $hash;
$hash->{level1}->[0]->{level3} = q{that's deep}; # no second "my": $hash is already declared
And, you can mix-and-match to your heart's content:
my $reference; # Will it be a hashref? arrayref? who knows!
if ($input =~ /hello/i) {
    $reference->{hello} = 'yes';
    print ref($reference), "\n"; # HASH (parens needed: ref binds looser than .)
}
else {
    $reference->[0] = 'no';
    print ref($reference), "\n"; # ARRAY (parens needed: ref binds looser than .)
}
# Not that this is a good idea
Here are some signs that maybe you should be using a hash:
  • "How can I tell if an element is in my array?"
  • "I want to remove the duplicate elements from my list/array."
  • "How do I get the difference or intersection of two arrays?"
If you find complex data structures to be mind-bending, apply ref and Data::Dumper liberally to see where you are and what you're looking at.
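Those three questions all have short hash-based answers. A minimal sketch, with made-up fruit lists standing in for your data:

```perl
use strict;
use warnings;

my @list  = qw(apple pear apple fig pear);
my @other = qw(fig kiwi apple);

# Membership: build a lookup hash once, then test with exists.
my %in_list = map { $_ => 1 } @list;
my $has_fig = exists $in_list{fig} ? 1 : 0;

# Duplicates: keep an element only the first time it is seen.
my %seen;
my @unique = grep { !$seen{$_}++ } @list;

# Intersection and difference, relative to @other.
my @both       = grep {  $in_list{$_} } @other;   # in both lists
my @only_other = grep { !$in_list{$_} } @other;   # in @other but not @list

print "has fig: $has_fig\n";
print "unique: @unique\n";
print "both: @both\n";
print "only in other: @only_other\n";
```

Each lookup is a single hash access, so none of this needs a loop inside a loop.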

know your comparison operators

Perl has separate comparison operators for strings and for numbers, so use the string ones when you are comparing text:
print "same\n" if $one eq $two;         # instead of ==
print "different\n" if $one ne $two;    # instead of !=
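A small sketch of how the two families disagree, using made-up values. The same split applies to sorting, where <=> compares numbers and cmp compares strings.

```perl
use strict;
use warnings;

my ($one, $two) = ('1.0', '1');

my $numeric_same = ($one == $two) ? 1 : 0;   # 1: both are the number 1
my $string_same  = ($one eq $two) ? 1 : 0;   # 0: '1.0' and '1' differ as text

my @nums      = (10, 9, 100);
my @by_number = sort { $a <=> $b } @nums;    # 9 10 100
my @by_string = sort { $a cmp $b } @nums;    # 10 100 9

print "numeric: $numeric_same, string: $string_same\n";
print "by number: @by_number\n";
print "by string: @by_string\n";
```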

got yourself in the rough?

There's more advice over there.
