Perl/Shell Knowledge Sharing: 11/1/11

Thursday, November 17, 2011

How to pass optional parameters to a Perl function?

You can use a semicolon in the prototype to indicate the end of the required parameters:

sub someFunction($$;$) {
  my ( $oblig_param1, $oblig_param2, $option_param ) = @_;
  ...
}

The ; is optional before a @ or %, which, according to the docs, "gobbles up everything else".

How do I get a file's last modified time in Perl?

You can use the built-in module File::stat (included as of Perl 5.004).
Calling stat($fh) returns an array with the following information about the file handle passed in (from the perlfunc man page for stat):

  0 dev      device number of filesystem
  1 ino      inode number
  2 mode     file mode  (type and permissions)
  3 nlink    number of (hard) links to the file
  4 uid      numeric user ID of file's owner
  5 gid      numeric group ID of file's owner
  6 rdev     the device identifier (special files only)
  7 size     total size of file, in bytes
  8 atime    last access time since the epoch
  9 mtime    last modify time since the epoch
 10 ctime    inode change time (NOT creation time!) since the epoch
 11 blksize  preferred block size for file system I/O
 12 blocks   actual number of blocks allocated

The 9th element in this array will give you the last modified time since the epoch (00:00 January 1, 1970 GMT). From that you can determine the local time:

my $epoch_timestamp = (stat($fh))[9];
my $timestamp       = localtime($epoch_timestamp);

To avoid the magic number 9 needed in the previous example, additionally use Time::localtime, another built-in module (also included as of Perl 5.004). This requires some (arguably) more legible code:

use File::stat;
use Time::localtime;
my $timestamp = ctime(stat($fh)->mtime);

Monday, November 7, 2011

Perl & MySQL

Perl & MySQL help. Here we will give some help on some common issues with regard to MySQL database access from within your Perl programs.

Requirements: Perl 5 (preferably the latest production release), Perl DBI (downloadable from CPAN), DBD::mysql (see CPAN) and MySQL database server.

Simple script that illustrates how to connect to a MySQL database and issues a range of select, delete and update queries:

#! /usr/bin/perl -wT

use strict;
use DBI;

$ENV{PATH} = "/bin:/usr/bin";
delete @ENV{'IFS', 'CDPATH', 'ENV', 'BASH_ENV'};


# we assume that the name of your database is DATABASE, that the username 
# to connect is UN and the password is PW. Here we also assume that your 
# MySQL server is running on localhost (so your are connecting through 
# the Unix socket file).

my $dbh = DBI->connect('DBI:mysql:DATABASE:localhost', 'USERNAME', 'PASSWORD')
  or die "Cannot connect: " . $DBI::errstr;

# or:

my $dbh = DBI->connect('DBI:mysql:database=DATABASE;host=localhost','USERNAME','PASSWORD')
  or die "Cannot connect: " . $DBI::errstr;


# select a single field:

my $sql = qq`SELECT field FROM table WHERE otherfield=1`;
my $sth = $dbh->prepare($sql) or die "Cannot prepare: " . $dbh->errstr();
$sth->execute() or die "Cannot execute: " . $sth->errstr();
my $field = $sth->fetchrow_array();
$sth->finish();

print "field is $field \n";


# select multiple fields:

$sql = qq`SELECT fielda,fieldb,fieldc FROM table WHERE otherfield=1`;
$sth = $dbh->prepare($sql) or die "Cannot prepare: " . $dbh->errstr();
$sth->execute() or die "Cannot execute: " . $sth->errstr();
my ($fielda,$fieldb,$fieldc) = $sth->fetchrow_array();
$sth->finish();

print "fielda is $fielda , fieldb is $fieldb , fieldc is $fieldc \n";


# select multiple fields with multiple rows:

$sql = qq`SELECT fielda,fieldb,fieldc FROM table`;
$sth = $dbh->prepare($sql) or die "Cannot prepare: " . $dbh->errstr();
$sth->execute() or die "Cannot execute: " . $sth->errstr();
my @row;
my @fields;
while(@row = $sth->fetchrow_array()) {
  my @record = @row;
  push(@fields, \@record);
}
$sth->finish();


# now process the fields

if (@fields != 0) {
  my $i=0;
  foreach $line (@fields) {
    print "row $i - a is @$line[0] , b is @$line[1] , c is @$line[2] \n";
    $i++;
  }
}


# delete query

$sql = qq`DELETE FROM table WHERE field=1`;
$sth = $dbh->prepare($sql) or die "Cannot prepare: " . $dbh->errstr();
$sth->execute() or die "Cannot execute: " . $sth->errstr();
$sth->finish();


# update query

$sql = qq`UPDATE table SET field=1 WHERE otherfield=2`;
$sth = $dbh->prepare($sql) or die "Cannot prepare: " . $dbh->errstr();
$sth->execute() or die "Cannot execute: " . $sth->errstr();
$sth->finish();


# To increase your application security and prevent SQL injection attacks,
# it is advisable to use placeholders in queries (prepared statements):

$sql = qq`SELECT field FROM table WHERE name=? AND password=?`;
$sth = $dbh->prepare($sql) or die "Cannot prepare: " . $dbh->errstr();
$sth->execute($name, $password) or die "Cannot execute: " . $sth->errstr();
my $field = $sth->fetchrow_array();
$sth->finish();

Sorting Techniques perl

e. Moving to the meat of the matter staight away, we'll start from talking about comparison. Obviously, in order to put a list of things in order, you'll have to first define that order. Order is defined by how things compare to each other. If I give you two items from the list, can you tell me which one is bigger / better / nicer / sexier ... [insert you favourite adjective here] than the other? Or tell me thet they are both of equal order? Well, that's just about it! If you give me a list of items and promise me that you can answer this question for any pair of them, I can make a sorted list of them. All I have to do is take all possible pairs and ask you "how do these two compare?" and arrange them accordingly to finally come up with a sorted list. Actually there are even smarter ways to do it, minimising the amount of comparisons needed, but that is not an issue here, as we will see soon that perl performs that task for us, and we trust perl that it uses the least expensive method. Now, the issue in question being comparison, I assume you must be familiar with all (or at least most) of perl's comparison operators. There's a list of them:

Numbers	Strings
<	lt
>	gt
<=	le
>=	gr
==	eq
<=>	cmp

Now the first five rows should be ok, they're just like math. But what are the <=> and cmp operators? Basically, the expression $a <=> $b (or $a cmp $b for strings) returns one of the values 1, 0, -1 if $a is, respectively, larger, equal or lower than $b. (see table below)

Relation of $a and $b	Value Returned by $a <=>
$a greater than $b	1
$a equal to $b	0
$a less than $b	-1

Does that ring a bell? Coming to think of it, the <=> and cmp operators actually provide the answer to the question we were investigating earlier when we talked about how to sort by using a comparative criterion. So, if we already have an operator that answers this question ("how do two items compare?") all we need is a function that will take a list of items and perform the necessary comparisons to arrive at a sorted list. And guess what? That's exactly what perl's sort operator does. So, if you have an unsorted list @not_sorted and want to created a sorted list @sorted, you just say: @sorted = sort { $a <=> $b } @not_sorted # numerical sort or @sorted = sort { $a cmp $b } @not_sorted # ASCII-betical sort or better @sorted = sort { lc($a) cmp lc($b) } @not_sorted # alphabetical sort Looking at the sort function, we notice that it is exactly as we described it in words, earlier in this article. Perl needs just two things: the list of items to sort, and a subroutine that answers the question "how do these two compare?" for any pair of items in the list. Perl puts the two items it want to compare int the special variables $a and $b and your function is responsible to give a return value that corresponds to the existing relationship of the two, as shown in the table shown earlier. Here, in this simple example, we used perl's built-in comparison operators that work for numerical and alphabetical (not realy... to be correct it is ASCII order) sorting. Of course, you can roll your own comparison function to create sorts for any kind of ordering you wish to have. Before you start coding your own functions, take a look to the following examples:

Get a list of hash keys sorted by value.

@sorted = sort { $hash{$a} cmp $hash{$b} } keys %hash;

Get a reverse sort of a list.

@sorted = sort { $b cmp $a } @list; Which can also be done with @sorted = reverse sort { $a cmp $b } @list;

Get an alphabetical sort of words, but make 'aardvark' always come last.

(Now, why you would want to do that is another question...)

@sorted = sort {    if ($a eq 'aardvark') { return 1; }   elsif ($b eq 'aardvark') { return -1; }   else { return $a cmp $b; }  } @words;

Quick Start Perl

Perl is about saving time. In just 20 minutes, this short introduction to Perl will show you enough to start using Perl straight away. Your new Perl knowledge may save you 20 minutes of work today... so you'll be even. Tomorrow it may save you 20 minutes more.
Read from top to bottom. Leave out nothing, the tutorial is cumulative. Read the example code too, it holds many important insights.

Topics

These links are useful for reference. Read the whole tutorial the first time.

  Scalar Variables
  Logic and Truth
  Arrays and Hashes
  Command Line Arguments
  Conditions
  Files
  Pattern Matching
  Regular Expressions
  Groups
  Netlist Filtering

Scalar Variables

Scalar variables have no explicit type. Scalar variables do not have to be declared. Context is very important in Perl. Mathematical operators treat variables as numbers, integers or floating point, depending on the context. String operators treat variables as strings of characters. Conversion between strings and numbers is automatic. Everything in Perl behaves in the most obvious common-sense way. Perl is very intuitive.
Here are some scalar variables and some numeric operators:

# End of line comments begin with a #

$a = 17;      # Scalar variables begin with a dollar symbol
              # The Perl assigment operator is =
              # Statements finish with a semicolon ;

$b = 0x11;    # Hexadecimal (17 in decimal)
$c = 021;     # Octal       (17 in decimal)
$d = 0b10001; # Binary      (17 in decimal)
$f = 3.142;   # Floating point

$a = $a + 1;  # Add 1 to variable $a
$a += 1;      # Add 1 to variable $a
$a++;         # Add 1 to variable $a

$b = $b * 10; # Multiply variable $b by 10;
$b *= 10;     # Multiply variable $b by 10;

              # Other arithmetic operators include:
              #  **  Exponentiation
              #  %   Modulo division
              #  ++  Auto increment
              #  --  Auto decrement
              #  <   Numeric less than
              #  >   Numeric greater than
              #  ==  Numeric equality
              #  !=  Numeric inequality
              #  <=  Numeric less than or equal to
              #  >=  Numeric greater than or equal to
              #  <=> Numeric compare: Returns -1 0 1

Scalar variables can store strings of characters or words. Single quotes, 'like these', allow spaces to appear inside strings. Double quotes, "like these", allow variables to be automatically substituted or interpolated inside strings.

$a = 'Number of DFFs: '; # No interpolation with 'single quotes'
$b = "$a$c\n";           # Interpolation (variable substitution) with "double quotes"
                         # \n is the newline character
print $b;                # This makes "Number of DFFs: 17\n" appear on the standard output
print $a, $c, "\n";      # As does this line because print takes
                         #    a comma separated list of arguments to print
print "That's all\n";    # No commas means a list of one element

                         # String operators include:
                         #  lt  String less than
                         #  gt  String greater than
                         #  le  String less than or equal to
                         #  ge  String greater than or equal to
                         #  cmp String compare: Returns -1 0 1
print 'one' lt 'two';    # Prints 1
                         #    ASCII-betically 'o' is less than 't'
print 'buf4' lt 'buf3';  # Prints nothing (that is undef, numerically zero)
                         # Perl's undefined value is undef
                         #    ASCII-betically '4' is not less than '3'

Logic and Truth

Perl considers these to be false:

0;      # Integer zero
  0.0;    # Decimal zero
  '0';    # String containing a single zero character
  '';     # Empty string
  undef;  # Undefined

Everything else is true. Here are some logical operators:

$a = 0; $b = 45;      # More than one statement per line possible
print( $a and $b++ ); # prints 0        *
$a = 22;
print( $a and $b++ ); # prints 45       *
print $b;             # prints 46
                      # *  $b++ only evaluated when $a was true
                      # Some logic operators take shortcuts

                      # Other logical operators include
                      # or  Logical OR
                      # ||  Logical OR
                      # and Logical AND
                      # &&  Logical AND
                      # not Logical NOT
                      # !   Logical NOT
                      # |   Bitwise OR
                      # &   Bitwise AND
                      # ~   Bitwise NOT

print 6 & 5;          # prints 4, 0b0110 & 0b0101 = 0b0100
print 6 | 5;          # prints 7, 0b0110 | 0b0101 = 0b0111
print ! 0;            # prints 1
print ! 5;            # prints nothing (that is undef or false)
print ~5;             # prints 4294967290, same as:
                      # 0b11111111111111111111111111111010

Arrays and Hashes

An array is a list of scalar variables. The first element has index 0. The @ symbol is used to denote an array variable.

@components = ( 'X_LUT4', 'X_AND2', 'X_BUFGMUX', 'X_BUF_PP', 'X_FF' );

# or use qw''. Saves typing commas or quotes, gives the same result
# qw stands for Quoted Words
@components = qw'X_LUT4 X_AND2 X_BUFGMUX X_BUF_PP X_FF';

# or even put the data in columns, gives the same result again
@components = qw'
                    X_LUT4
                    X_AND2
                    X_BUFGMUX
                    X_BUF_PP
                    X_FF
                ';             # Easier to read this way

push( @components, 'X_MUX2' ); # Push another item onto the top
push( @components, 'X_ONE' );  # And one more

print $components[0];          # Prints element 0, that is, 'X_LUT4'
print $components[5];          # Prints element 5, that is, 'X_MUX2'

print "@components\n";         # Prints everything separated by spaces:
# X_LUT4 X_AND2 X_BUFGMUX X_BUF_PP X_FF X_MUX2 X_ONE

print  @components   ;         # No double quotes, no spaces:
# X_LUT4X_AND2X_BUFGMUXX_BUF_PPX_FFX_MUX2X_ONE

In a scalar context an array variable returns its size, the number of elements it contains. The test expression of a while loop is tested for true or false, a scalar question, a scalar context. The shift statement removes items from the bottom of the array, one at a time.

while( @components ) {
#     ^^^^^^^^^^^^^                       Array in scalar context
  $next_component = shift( @components );
  print "$next_component\n";
}
# Array variable @components is now empty

In this example @components begins with size 7, which is true. After 7 loops each of the 7 elements have been shifted or removed from the bottom of the array. In the while test expression @components would have returned 7, 6, 5, 4, 3, 2, 1 and finally 0. Zero is false, end of while loop.
Hash arrays differ from arrays because the elements are not ordered numerically but associated with unique strings or keys. Hash arrays are associative arrays because they associate values with keys. Maintaining many separate counters is a good hash array application as we will see later. Hashes make a big contribution to Perl's text processing power. The % symbol is used to denote a hash array.

# Initialising several hash keys
%components = qw'
                  X_LUT4     0
                  X_AND2     0
                  X_BUFGMUX  0
                  X_BUF_PP   0
                  X_FF       0
                ';
#                 ^^^^^^^^^        keys
#                            ^   values
$components{'X_LUT4'} = 1; # Set key X_LUT4 to the value 1
$components{'X_LUT4'}++;   # Increment value associated with X_LUT4
print $components{'X_FF'}; # Print value associated with X_FF
@keys = keys %components;  # Get a list of hash keys
print "@keys\n";           # Print them - order is indeterminate
%components = ();          # Emptying the components hash

Command Line Arguments

There is a special array called @ARGV. Command line arguments are automatically copied into this array variable.

# This script is called process_netlist.pl
# Perl scripts often have the file extension .pl

$netlist_filename = $ARGV[0];
$report_filename  = $ARGV[1];
print "    Processing $netlist_filename\n";
print "    Writing report to $report_filename\n";
print "    ARGV contains '@ARGV'\n";

# Use it in this way:

#    C:\perl process_netlist.pl chip_timesim.vhd report.txt
#        Processing chip_timesim.vhd
#        Writing report to report.txt
#        ARGV contains 'chip_timesim.vhd report.txt'
#    C:\

Conditions

Perl has many conditional statements. The if statement asks a true/false question. If the answer is true it executes a block of code.

if( $ff_count == 1 )
#   ^^^^^^^^^^^^^^ Is this expression true or false?
{
  # Do this action if it is true
  print "There is 1 flip flop\n";
}
else
{
  # Do this action if it is false
  print "There are $ff_count flip flops\n";
}

# More compact layout
if( $ff_count == 1 ) {
  print "There is 1 flip flop\n";
} else {
  print "There are $ff_count flip flops\n";
}

It is not necessary to have an else part. The (round brackets) are required around the expression. The {curly brackets} are required around the actions.
The while loop repeatedly executes a block of statements while a conditional expression is true.

# Counting to one hundred
while( $count < 100 ) {
  $count++;             # Perl assumes $count == 0 the first time
  print "$count\n";
}

Variables do not have to be declared or initialised. Perl will create the variable when it is first used. If the variable is first used in a numeric context then its undefined initial value will be interpreted as zero. If the variable is first used in a string context then its undefined initial value will be interpreted as an empty string. Perl's default behaviour makes good sense. Counting begins at zero. Writing begins with an blank page.
Another loop statement is foreach. It is used for looping through a list of scalars, numeric or string.

foreach $course ( 'VHDL', 'SystemVerilog', 'SystemC', 'Perl', 'Tcl/Tk', 'PSL' ) {
  print "There is a $course Doulos training course\n";
}
# $course is the loop variable.
# It takes the string value 'VHDL' for the first loop
# and 'PSL' for the last loop.

# Get a list from an array variable
foreach $component ( @components ) {
  print "Component is $component\n";
}

Files

Text files are created with the open and print statements. Perl uses file handles to refer to each open file. A file handle is just a string, conventionally written in uppercase letters without quotes. Use the print statement to write text into a file.

open( FILE1, '>file1.txt' );
#             ^                               > means open in write mode
print FILE1 "The first line to file1.txt\n";
print FILE1 "The final line to file1.txt\n";
close( FILE1 );                               # Don't have to explicitly close a file

print STDOUT "This goes to the standard output\n";
print        "So does this\n";
#     ^^^^^^    STDOUT is a file handle that always
#               refers to the standard output.
#               It is the default so doesn't have to be stated.

Text files are read using the open statement and the input record operator. Standard input, the input typed into the keyboard in a command line application, can be read from the STDIN file handle.

open( FILE2, 'file2.txt' );  # Open in read mode - the default mode
$first_line = <FILE2>;       # Reads the first line from file2.txt into $first_line.
                             # Includes the newline character, \n.
while( $line = <FILE2> ) {
  print $line;               # Read and print remaining lines from file2.txt.
}                            # When every line has been read <FILE2> returns undef.

$standard_input = <STDIN>;   # Read a line from the standard input.
                             # Can be the keyboard if run from the command line.

chomp( $standard_input );    # Remove the trailing newline character

Here is a short program. It reads every line from a file named in the first command line argument. The lines are written to a report file named in the second command line argument. Numbers are printed at the beginning of each line.

$netlist_filename = $ARGV[0];
$report_filename  = $ARGV[1];
open( FILE_IN, $netlist_filename );
open( FILE_OUT, ">$report_filename" );
while( $line = <FILE_IN> ) {
  $line_count++;
  print FILE_OUT "$line_count: $line";
}
# perl filter_netlist.pl chip_timesim.vhd report.txt

This Perl script does the same using standard input and standard output.

while( $line = <STDIN> ) {
  $line_count++;
  print "$line_count: $line";
}
# perl filter_netlist.pl < chip_timesim.vhd > report.txt

Pattern Matching

Perl's matching operator uses regular expressions to search a string for a match. Regular expressions are patterns used to find a match for a string of characters contained in a longer string. Regular expressions are built from character classes. The simplest character class is a single character. The letter A matches a capital letter A once. Character classes only match one character but the character can be any from a list of qualifying characters.

$string = "Novice to Expert in a 3 day Perl course.\n";
print $string;
if( $string =~ m/Expert/ ) {
  # A successful match returns 1 so this statement is executed
  print "This string contains the substring 'Expert'\n";
}
# m stands for match
# Forward slashes are used to /delimit/ regular expressions.
# =~ tells the m operator which string to search.
# The m is optional when // are used.

Regular Expressions

Individual letters are very limited character classes. Ranges of characters are matched using character class shortcuts. Any alphanumeric character matches \w, including underscores. Conveniently, most languages recognise identifiers containing letters, digits and underscores.
Quantifiers allow character classes to be repeated. The most common quantifiers are question mark, ?, asterisk, *, and plus, +. Zero or one repetition is ?. Zero or more repetitions is *. One or more repetitions is +.

use English;
$string = "Novice to Expert in a 3 day Perl course.\n";
if( $string =~ /\w+/ ) {
  # \w+ matches one or more alphanumeric characters in a row
  print "Matched: $MATCH\n";  # Matched: Novice
}

Readable English names for some Perl special variables are provided by the English Perl module. $MATCH gets a copy of the substring successfully matched. Without using the English Perl module $MATCH would have to be called $&.

use English;
$string = "Novice to Expert in a 3 day Perl course.\n";
if( $string =~ /Perl\s+\w+/ ) {
  #             ^^^^          matches Perl
  #                 ^^^       matches one or more white space characters
  #                                    (including space, tab and newline)
  #                    ^^^    matches one or more alphanumeric characters
  print "Matched: $MATCH\n";  # Matched: Perl course
}
#  \w?    Zero or one letter, digit or underscore
#  \w     One letter, digit or underscore
#  \w*    Zero or more letters, digits or underscores
#  \w+    One or more letters, digits or underscores
#  \W     One character but not a letter, digit or underscore

#  \s     White space character, space, tab or newline
#  \S     One character but not a space, tab or newline

Groups

Sub expressions can be grouped together and stored in back reference buffers. Round brackets are used to (group) sub expressions. The back reference buffers have the names $1, $2, $3 etc. The substring that matches the first bracketed group is copied into $1. The second into $2 etc.
There is a Xilinx Vital timing model in a file called chip_timesim.vhd. Here is one component instantiation from it:

c4_n001449 : X_LUT4
    generic map(
      INIT => X"0001"
    )
    port map (
      ADR0 => c4_count(4),
      ADR1 => c4_count(18),
      ADR2 => c4_count(3),
      ADR3 => c4_count(5),
      O => CHOICE121_pack_1
    );

The component name, X_LUT4, could be matched using a regular expression containing a group. This short script opens the file, finds every component instantiation reporting the component name.

open( VHDL_FILE, 'chip_timesim.vhd' );
while( $line = <VHDL_FILE> ) {
  if( $line =~ /\w+\s*:\s*(X_\w+)/ ) {
#               ^^^                      Instance label
#                  ^^^                   Zero or more white space characters
#                     ^                  :
#                      ^^^               Zero or more white space characters
#                          ^^^^^         Group containing a word beginning with X_
#                                                                 (copied into $1)
    print "Found instantiation of $1\n";
  }
}

Netlist Filtering

The following script takes the filename of a Xilinx netlist from the command line. It finds and counts every component instantiation. Finally, it prints a list of all the component names found and the number of appearances.

# Pulling it all together
# Everything in this script is described above

$netlist_filename = $ARGV[0];
open( VHDL_FILE, $netlist_filename );

while( $line = <VHDL_FILE> ) {
  if( $line =~ /\w+\s*:\s*(X_\w+)/ ) {
    $component_hash{$1}++;
  }
}

@name_array = keys %component_hash;
foreach $component_name ( @name_array ) {
  print "$component_name: $component_hash{$component_name}\n";
}

Extracting information from text files is easy given a little Perl knowledge. The following output was generated by the above script:

X_FF: 56
X_AND2: 29
X_ONE: 25
X_INV_PP: 23
X_BUF_PP: 395
X_ZERO: 4
X_TOC: 1
X_XOR2: 53
X_BUFGMUX: 1
X_OR2: 8
X_ROC: 1
X_MUX2: 96
X_LUT4: 123
X_TRI_PP: 20