Perl/Shell Knowledge Sharing: 5/1/11

Friday, May 27, 2011

Remove particular word in the starting of line

You just want to remove all lines that start with HPL_? That's easy!

perl -pi -e 's/^HPL_.*//s' myfile.txt

OR

#!perl

use strict;
use autodie;
use warnings FATAL => "all";

my $infile = "myfile.txt";
my $outfile = "changed.txt";

open( my $infh, '<', $infile );
open( my $outfh, '>', $outfile );
while( my $line = <$infh> ) {
next if $line =~ /^HPL_/;
print $outfh $line;
}
close( $outfh );
close( $infh );

OR

#!/usr/bin/perl
use strict;
use warnings;

open(my $in, '<', 'myfile.txt') or die "failed to open input for read: $!";
my @lines = <$in> or die 'no lines to read from input';
close($in);

# collect all lines that do not begin with HPL_ into @result
my @result = grep ! /^HPL_/, @lines;

open(my $out, '>', 'changed.txt') or die "failed to open output for write: $!";
print { $out } @result;
close($out);

How to reverse a sentence in Perl

Whether it is more efficient or not will be something you can test but
in Perl you could do something along the lines of:

my $reversed = join( " ", reverse( split( / /, $string ) ) );

OR

Perl makes this kind of text manipulation very easy, you can even test
this easily on the shell:

echo "run as fast as you can" | perl -lne 'print join $",reverse split /\W+/'

or:

echo "all your bases are belong to us" | perl -lne '@a=reverse
/\w+/g;print "@a"'

OR

$_ = "My name is Jack";
unshift @_, "$1 " while /(\w+)/g;
print @_;

perl, unix: fastest way to merge thousands of small files into one file

The cat command works nicely:

cat *someglob* > output.txt

It's name (short for concatenate) even gives away its purpose.

OR

I'm sure cat is faster, and simpler, but here's a perl version, just
because you asked about it.

perl -we '@ARGV = glob("@ARGV"); print while (<>);' *.txt > all.csv

defined and exists difference in perl

Perl has the defined and exists keywords that operate on hash elements.

$hash{'foo'} = 'bar';
print defined $hash{'foo'}; # prints 1
print exists $hash{'foo'}; # prints 1

For most purposes, they do the same thing. The one subtle difference
is when the hash value is the special "undefined" value:

$hash{'baz'} = undef;
print defined $hash{'baz'}; # doesn't print 1
print exists $hash{'baz'}; # prints 1

How do I add a directory to my include path (@INC) at runtime?

Here are the suggested ways of modifying your include path, including
environment variables, run-time switches, and in-code statements:

the "PERLLIB" environment variable
$ export PERLLIB=/path/to/my/dir
$ perl program.pl

the "PERL5LIB" environment variable
$ export PERL5LIB=/path/to/my/dir
$ perl program.pl

the "perl -Idir" command line flag
$ perl -I/path/to/my/dir program.pl

the "lib" pragma:
use lib "$ENV{HOME}/myown_perllib";

the "local::lib" module:
use local::lib;

use local::lib "~/myown_perllib";

The last is particularly useful because it knows about machine dependent
architectures. The "lib.pm" pragmatic module was first included with the
5.002 release of Perl.

Wednesday, May 11, 2011

Multiple Files Compress::Zlib example in perl

#!/usr/bin/perl
use strict;
use warnings;
use Compress::Zlib;

@ARGV > 0 or die "Usage: $0 <file1> [<file2> [<file3> ... ] ]\n";

for my $file (@ARGV) {
open my $fh, '<', $file or
warn "Could not open '$file': $!\n" and next;
my $gz = gzopen("$file.gz", "w") or die "Cannot open $file.gz: $!";
while (<$fh>) {
$gz->gzwrite($_);
}
$gz->gzclose();
}

perl xml tidy examples

Method 1:

my $tidy_doc = XML::Tidy->new("filename"=>"/Users/.../tidy.xml") ;
$tidy_doc->tidy() ;
$tidy_doc->write() ;

Method 2:

#!/usr/bin/perl
use strict;
use warnings;

my $doc;

# use an anonymous code block to limit the scope of the IRS unset
{
 # unset IRS (input record seperator)
 # this allows us to read whole file at once
 local $/=undef;
 # open file
 open my $fh,"<","./test.xml"
 or die "Could not open file: $!";
 # read entire file into scalar variable
 $doc = <$fh>;
 # close file
 close $fh;
}

# process file content with XML::Tiday
my $tidy_doc = XML::Tiday->new(xml => $doc);
$tidy_doc->tidy();
$tidy_doc->write("output.xml");

An alternative, which makes use of the LibXML module:
# create XML::LibXML::Document object
my $doc = $parser->parse_file("./test.xml");

# use the the toString fuction to extract the XML content from the object
my $tidy_doc = XML::Tiday->new(xml => $doc->toString);
$tidy_doc->tidy();
$tidy_doc->write("output.xml");

Method 3:

use warnings;
use strict;
use XML::Tidy;

my $doc = <<EOF;
<?xml version="1.0" encoding="utf-8"?>
<inode>
<perfdata collection="GigabitEthernet0/0">
<info cmd="show interface" detail="GigabitEthernet0/0">
<input_rate>show_interface_input_rate.tcl</input_rate>
<output_rate>show_interface_output_rate.tcl</output_rate>
</info>
<info cmd="show interface" detail="GigabitEthernet0/0/1">
<output_rate>show_interface_output_rate.tcl</output_rate>
</info>
</perfdata>
<perfdata collection="GigabitEthernet1/1">
<info cmd="show interface" detail="GigabitEthernet1/1">
<input_rate>show_interface_input_rate.tcl</input_rate>
<output_rate>show_interface_output_rate.tcl</output_rate>
</info>
<info cmd="show interface" detail="GigabitEthernet1/1/0">
<output_rate>show_interface_output_rate.tcl</output_rate>
</info>
</perfdata>
<perfdata collection="self">
<info cmd="show buffers summary" detail="">
<big_pool>show_buffers_summary_big_pool.tcl</big_pool>
<small_pool>show_buffers_summary_small_pool.tcl</small_pool>
</info>
</perfdata>
</inode>
EOF

my $tidy_doc = XML::Tidy->new(xml => $doc);
$tidy_doc->tidy();
$tidy_doc->write('out.xml');

perl overriding example

#Earlier in the program, you must have requested to override localtime with one that returns an object. You can access the builtin localtime using

my @timeData = CORE::localtime(time);
print join(' ', @timeData);
print "\n\n\n";

#You're using Time::localtime somewhere, and that is turning the results of localtime into an object.
#Also note it is redundant to pass time as a parameter to localtime.

use Time::localtime;
print "Object: ", join(' ', localtime), "\n";
print "Array: ", join(' ', CORE::localtime), "\n";
exit;

Extract URL fields

$url = param('url');
print "url=$url \n";
$url =~ m|(\w+)://([^/:]+)(:\d+)?/(.*)|; # use m|...| so that we do not need to use a lot of "\/"
$protocol = $1;
$domainName = $2;
$uri = "/" . $4;
print "\$3=$3 \n";
if ($3 =~ /:(\d+)/) { $portNo = $1} else { $portNo = 80}
print "protocol=$protocol domainName=$domainName 
portNo=$portNo uri=$uri \n";

How to change @INC in Strawberry Perl?

To prepend paths, set environment variable PERL5LIB to those paths.
Note: This will affect all installations of Perl you run when this is effect.

Howto: Right-click (My) Computer, Properties, Advanced, Environment Variables, (the top) New. You will probably have to restart already running consoles to get the change.

How do I download a file with WWW::Mechanize

#!/usr/bin/perl
use strict;
use WWW::Mechanize;

my $url = 'http://divxsubtitles.net/page_subtitleinformation.php?ID=111292';
my $m = WWW::Mechanize->new(autocheck => 1);
$m->get($url);
$m->form_number(2);
$m->click();
my $response = $m->res();
my $filename = $response->filename;

if (! open ( FOUT, ">$filename" ) ) {
die("Could not create file: $!" );
}
print( FOUT $m->response->content() );
close( FOUT );

How can I add a progress bar to WWW::Mechanize

$m->get($u, ":content_cb" => \&callback);

open (VID,">$i.flv") or die "$!";
$total = 0;
sub callback
{
    my( $data, $response, $proto ) = @_;
    print VID "$data"; # write data to file
    $total+= length($data);
    $size = $response->header('Content-Length');
    print floor(($total/$size)*100),"% downloaded\n"; # print percent downloaded
}

Print lines from one file that are not contained in another file

fgrep -x -f file2 -v file1

-x match whole line

-f FILE takes patterns from FILE

-v inverts results (show non-matching)

OR

use strict;
use warnings;

my %file2;
open my $file2, '<', 'file2' or die "Couldn't open file2: $!";
while ( my $line = <$file2> ) {
++$file2{$line};
}

open my $file1, '<', 'file1' or die "Couldn't open file1: $!";
while ( my $line = <$file1> ) {
print $line unless $file2{$line};
}

sever alive checking in perl using Net::Ping in Linux or Mac system

use Net::Ping;

sub pinghost {

my $host = shift;
my $type = shift || 'icmp';
my $port = shift || 7; # for syn ping

my $status;
my $p = Net::Ping->new($type);

if ($type eq 'icmp' or $type eq 'tcp') {

if ( $p->ping($host,10) ) {
$status = 1;

} else {
$status = 0;
}

} elsif ($type eq 'syn') {

$p->port_number($port);
$p->ping($host,10);

if ( $p->ack ) {
$status = 1;

} else {
$status = 0;
}
}

$p->close;
return $status;
}

__END__

call it:

pinghost($host); # do the same stuff as unix's ping command with ICMP,
requires root
pinghost($host,'syn',80); # not requires root,can ping to a special TCP
port (i.e, the http port), send a syn and wait for ack.
pinghost($host,'tcp'); # not requires root, try to connect to peer's

echo port

ENV Tips

%ENV, when not execute thru CGI, contains the current environnement :

USERNAME                            the user name used to log.
PROMPT                                      the prompt (ex : $p$g).
PROCESSOR_IDENTIFIER                  a string with information on your CPU
                                                               (ex : x86 Family 5 Model 2 Stepping 12, GenuineIntel).
PROCESSOR_ARCHITECTURE          indicate the family of your CPU (ex : x86).
                                                               OSa string with the name of the OS (ex : Windows_NT).
HOMEDRIVE                                       letter of the home drive (ex : C:).
INCLUDE                                               path of the include.
CPU                                                        kind of CPU (ex : i386).
SYSTEMROOT                                      the place of windows (ex : c:\WINNT).
COMSPEC                                             the place of command interpreter
                                                               (ex : c:\WINNT\system32\cmd32.exe)
PATH                                                       the path ...
PROCESSOR_REVISION                 is your pentium buggy ?
USERPROFILE                                 path of the profile for the current user.
COMPUTERNAME                              name of the computer (ex : UC651)
HOMEPATH                                      your place (ex : /users/bill)
USERDOMAIN                                   name of the domain you logged in.
PROCESSOR_LEVE    L                      level of CPU in its family.
OS2LIBPATH                                    path for OS2 library.
SYSTEMDRIVE                                    letter of the system drive.

Example
foreach $i (keys %ENV) {
    print "$i : $ENV{$i} \n";
}

ref and qr

>perl -E"$r=qr/a/; say ref($r) eq 'Regexp' ||0"
1

>perl -E"$r=qr/a/; say ref($$r) eq 'Regexp' ||0"
0

>perl -E"$r=qr/a/; say re::is_regexp($r) ||0"
1

>perl -E"$r=qr/a/; say re::is_regexp($$r) ||0"
1

threads lock example

threads lock is used for web page manipulation. Suppose one person see one web page, at the same time another person edition or updating. In this case you can use variable lock.

use threads;
use threads::shared;

my $sem :shared;
sub tprint {
    my $tid = threads->tid;
    lock $sem;
    print "$tid: ",@_, "\n";
}

sub twarn {
    my $tid = threads->tid;
    lock $sem;
    warn "$tid: ", @_;
}