Introduction
Perl was created by Larry Wall in 1987 as a Unix scripting language that borrows features from awk, sed, shell scripting and C. The name is commonly expanded as Practical Extraction and Report Language. Originally meant for text formatting and processing, Perl has grown over time to cover system administration, network, web and database programming, glue between systems and languages (system integration and rapid prototyping), bioinformatics, data mining, and even application development.
The main features of Perl are:
- "Perl is the Swiss Army Knife of programming languages: powerful and adaptable." Perl is a mixture of C, Unix shell script, awk, sed, and more. Perl is much more expressive than these languages ("maximum expressivity", "There is more than one way to do it"). You can write a Perl program fairly quickly in a few lines of code.
- Perl is an interpreted language and is largely platform-independent. You can run Perl scripts on any platform (Unix, Windows, Mac) where a Perl interpreter is available.
- Perl provides a powerful regular expression facility to support text processing and report generation. Perl also has a symbolic debugger, support for database access, and more.
- Perl 5 supports object-oriented programming.
- Many Perl utilities and add-ons are available at CPAN (Comprehensive Perl Archive Network @ www.cpan.org).
- Perl is free. Perl is open-source.
The Perl versions include:
- 1.0 (1987)
- 2.0 (1988)
- 3.0 (1989)
- 4.0 (1991)
- 5.0 (1994),..., 5.5 (1998),..., 5.10 (2007), 5.11 (2009)
- Perl 6 (coming soon @ perl6.org)
Popular sites for Perl are www.perl.org, www.perl.com, www.pm.org, www.perlmongers.org. Perl documentation is available at http://perldoc.perl.org.
Installing Perl
There are many ways to get the Perl interpreter:
- For Windows systems: it can be installed as part of CYGWIN (read How to install CYGWIN; select "Category" ⇒ "Interpreters" ⇒ "perl").
- For Unix systems: it is pre-installed on most Unix systems.
- Download and install ActivePerl from http://www.activestate.com/activeperl.
The directory containing the Perl interpreter (for example "perl.exe" on Windows) must be included in the PATH environment variable.
First Perl Program
Use a programming text editor (such as NotePad++, PSPad, TextPad) to enter the following source code and save it as "Hello.pl":
#!/usr/bin/env perl
use strict;      # Enforce variable declarations and other safety checks
use warnings;    # Display warning messages
print "Hello world!\n";                   # Print a message
print 'Hello world, ', 'Again!', "\n";    # Print another message
How It Works
- Line 1, called the shebang, is meant for Unix systems; it specifies the location of the Perl interpreter. This line is ignored under Windows.
- Lines 2 and 3 are directives (or pragmas). "use strict" makes Perl reject unsafe constructs, such as undeclared variables, when the program is compiled. "use warnings" instructs Perl to display warning messages.
- Lines 1 to 3 are optional, but recommended for writing robust programs.
- A comment begins with a '#' and lasts until the end of the line. Comments are used to explain the code, and are ignored by the interpreter.
- A Perl statement ends with a semi-colon (;).
- The print function prints the given strings to the console. \n denotes a new-line. A function can take zero or more arguments (separated by commas). In Perl, you can enclose the function's arguments in parentheses () or omit them. For example, the following are equivalent:
    print 'Hello world, ', 'Again', "\n";     # Function arguments without parentheses
    print('Hello world, ', 'Again', "\n");    # Function arguments enclosed in parentheses
- Strings can be enclosed in double-quotes or single-quotes. However, double-quotes interpret variables and special characters (like \n for new-line, \t for tab), but single-quotes don't. For example,
    print "\n";    # print a newline
    print '\n';    # print \n literally
- Extra white-spaces (blanks, tabs, new-lines) are ignored.
- The file extension ".pl" is not mandatory but recommended.
Running In Windows
To run the program under Windows, start a cmd shell and issue the command:
> ... change directory to the directory containing Hello.pl ...
> perl Hello.pl
Hello world!
Hello world, Again!
Note: The directory containing the Perl interpreter "perl.exe" must be included in the PATH environment variable.
To display the Interpreter's help menu, issue:
> perl -h
To find the version of the Perl interpreter, issue:
> perl -v
This is perl, v5.10.0 built for cygwin-thread-multi-64int
......
Running In Unix
For Unix systems, include "#!/usr/bin/env perl" as the first line of the program (it tells the system which interpreter to run, just as in any other Unix shell script). To run the program, first make the file executable (via the change-mode command chmod) and then execute the file:
$ ... change directory to the directory containing Hello.pl ...
$ chmod u+x Hello.pl
$ ./Hello.pl
Perl 5.10 Features
Perl 5.10 introduces many nice features. For example, the Hello-world program can be written as follows (save as "Hello510.pl"):
#!/usr/bin/env perl
use strict;      # Enforce variable declarations and other safety checks
use warnings;    # Display warning messages
use 5.010;       # Use Perl 5.10 features
say 'Hello, world!';    # Print a message
> perl Hello510.pl
Hello, world!
Notes:
- Line 4 instructs Perl to enable the new features in Perl 5.10. It could also be written as "use feature ':5.10'".
- The function say is similar to print, except that say automatically prints a newline at the end of the message, whereas print doesn't. say is available from Perl 5.10.
Basic Syntax
Comment
Comments are ignored by the Perl runtime but are very useful in explaining your code to others (and also to yourself three days later). You should use comments liberally to explain or document your code.
A comment begins with the symbol # and lasts until the end of the line. There is no multi-line comment syntax other than putting a # at the beginning of each line.
Statement & Block
A statement is a single instruction consisting of operators, variables and expressions. A Perl statement terminates with a semicolon (;).
Blanks, tabs and newlines are collectively called whitespace. In Perl, extra whitespace is ignored (that is, multiple whitespace characters are treated as one).
You can place many statements on a single line.
A block consists of zero or more statements enclosed in a pair of curly braces { ... }. No semi-colon is needed after the closing brace.
Calling Perl's Built-in Functions
Perl has many built-in functions, which take a comma-separated list of arguments. You can enclose the arguments in parentheses or omit them, depending on your programming style. For example,
print 'Hello, world', "\n";     # Function arguments are separated by commas
print('Hello, world', "\n");    # Parentheses are optional
say 'Hello, world';             # Function say (Perl 5.10) always prints a newline
say('Hello, world');
Variables, Literals & Data Types
A variable is a named storage location that holds a value of a certain data type. A literal is a fixed value, e.g., 5566, 3.14, 'Hello', that can be assigned to a variable or form part of an expression.
Perl supports the following data types. It uses different initial symbols to denote and differentiate the various data types.
- Scalar: begins with symbol $.
- Array: begins with symbol @.
- Hash or Associative Array: begins with symbol %.
An expression is a combination of variables, literals, operators, and sub-expressions that can be evaluated to produce a single value.
Scalar Variables and Contexts
A scalar is a single item. A scalar variable's name begins with the symbol $, followed by a letter or underscore, followed by more letters, digits, or underscores. For example, $size, $_min_value, and $average.
Perl is case-sensitive. A $rose is not a $ROSE, and is not a $Rose.
Unlike statically typed languages such as C/C++/C#/Java, but like JavaScript and Unix shell scripts:
- Perl's variables need not be declared before use (unless "use strict" is in effect), which often leads to poor programs. It is strongly recommended to declare a variable before use!
- The actual type of a scalar (e.g., integer, floating-point number or string) need NOT be specified. Perl's scalar is simply a single item, which can take on the context of a number (integer or floating-point number), string, or boolean automatically.
You can assign a value (called a literal) to a scalar variable using the assignment operator (=). The scalar variable takes on the context of the literal assigned. For example, a variable takes on a string context if a string literal is assigned, and a numeric context if a numeric literal is assigned.
You can declare a local variable via the keyword my.
my $num = 123;        # numeric context
my $str = "Hello";    # string context
The context of the scalar is important because many operations are confined to a certain context, e.g., arithmetic operations (+, -, *, /) apply to numbers but not strings; strings are concatenated using the "." operator; logical operations (and, or, not) are applied to booleans. Perl automatically converts between the different contexts as needed to perform an operation. In other words, the context of a scalar is determined by the operation. For example:
#!/usr/bin/env perl
# ScalarContextTest.pl
use strict;
use warnings;
my $num1 = 11;         # Numeric context
my $num2 = 22;         # Numeric context
my $str1 = 'Hello';    # String context
my $str2 = 'world';    # String context
my $str3 = '33';       # String context
my $str4 = '44';       # String context
print $num1 + $num2, "\n";          # + takes numbers
print 12 * 3.4, "\n";               # * takes numbers
print $str1 + $str2, "\n";          # + takes numbers, not strings. Invalid output
print $str1 . " " . $str2, "\n";    # . takes strings
print $str3 + $str4, "\n";          # + takes numbers - String converted to numeric context
print "5.5" - 5, "\n";              # - takes numbers - String converted to number
print $num1 . $num2, "\n";          # . takes strings - Numbers converted to string
> perl ScalarContextTest.pl
33
40.8
Argument "world" isn't numeric in addition (+) at ScalarContextTest.pl line 13.
Argument "Hello" isn't numeric in addition (+) at ScalarContextTest.pl line 13.
0
Hello world
77
0.5
1122
If you remove "use warnings", the warning messages will not be shown, and you will have no idea that something went wrong.
How does Perl know whether a variable is a number or a string? In fact, Perl does not know. Whenever a variable or string literal is used as an operand of an arithmetic operation (+, -, *, /), Perl tries to convert it to a number. If the value does not begin with a valid number, Perl simply treats it as 0; and you will not be warned unless you specify "use warnings" or turn on the -w (warning) flag!
A variable takes a special value called undef if no value is assigned to it.
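The conversion rules can be tried out with a small sketch. The helper name to_number is hypothetical; it forces a value into numeric context and locally silences the warnings that "use warnings" would otherwise print. Note that Perl uses any leading numeric part of a string and falls back to 0 only when there is none.

```perl
use strict;
use warnings;

# Hypothetical helper: force a value into numeric context,
# silencing the "isn't numeric" and "uninitialized" warnings locally.
sub to_number {
    my ($value) = @_;
    no warnings qw(numeric uninitialized);
    return $value + 0;
}

my $unset;    # declared but never assigned, so its value is undef

print to_number('42'), "\n";            # 42
print to_number('3.5 apples'), "\n";    # 3.5 (the leading numeric part is used)
print to_number('apples'), "\n";        # 0
print to_number($unset), "\n";          # 0 (undef counts as 0 in numeric context)
print defined($unset) ? "defined\n" : "undefined\n";    # undefined
```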
Numeric Context and Operations
In Perl, there is no distinct integer type at the language level: a number is simply a number. Internally, Perl stores a value as a native integer when it can and as a double-precision floating-point number otherwise, and converts between the two as needed.
Numeric literals include:
- Floating-point literals: e.g., 3.1416, -0.8e18, 1.2E-5.
- Integer literals: e.g., 5566, -128. You can delimit a long integer with underscores, e.g., 12_111_222_333.
- Octal literals: begin with a leading 0 (zero), e.g., 0127.
- Hexadecimal literals: begin with 0x, e.g., 0xABCD.
- Binary literals: begin with 0b, e.g., 0b10110011.
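As a quick check of the literal forms listed above, the following sketch assigns one literal of each kind and prints its decimal value. The variable names are chosen only for illustration.

```perl
use strict;
use warnings;

my $million = 1_000_000;     # underscores are ignored by the parser
my $octal   = 0127;          # 1*64 + 2*8 + 7 = 87
my $hex     = 0xABCD;        # 43981
my $binary  = 0b10110011;    # 179
my $float   = 1.2E-5;        # 0.000012

print "$million $octal $hex $binary\n";    # 1000000 87 43981 179
printf "%.6f\n", $float;                   # 0.000012
```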
Arithmetic Operators
Perl provides the following arithmetic operators for numbers. The following results are obtained assuming that $x=5, $y=2 before the operation.
OPERATOR | DESCRIPTION | EXAMPLE | RESULT |
---|---|---|---|
+ | Addition | $z = $x + $y; | $z is 7 |
- | Subtraction (or Unary Negation) | $z = $x - $y; | $z is 3 |
* | Multiplication | $z = $x * $y; | $z is 10 |
/ | Division | $z = $x / $y; | $z is 2.5 |
% | Modulus (Division Remainder) | $z = $x % $y; | $z is 1 |
** | Exponentiation | $z = $x ** $y; | $z is 25 |
++ | Unary Pre- or Post-Increment | $y = $x++; $z = ++$x; Same as: $y = $x; $x = $x+1; $x = $x+1; $z = $x; | $y is 5, $z is 7, $x is 7 |
-- | Unary Pre- or Post-Decrement | $y = --$x; $z = $x--; Same as: $x = $x-1; $y = $x; $z = $x; $x = $x-1; | $y is 4, $z is 4, $x is 3 |
Arithmetic in Perl does not truncate results to integers. In other words, 1/2 gives 0.5 (whereas in C/Java, 1/2 gives 0). You can truncate a floating-point number to an integer via the built-in function int().
#!/usr/bin/env perl
# NumericOpTest.pl
use strict;
use warnings;
my $num1 = 11;
my $num2 = 22;
print $num1, "\n";                  # 11
print $num2, "\n";                  # 22
print $num1+$num2, "\n";            # 33
print $num1-$num2, "\n";            # -11
print $num1*$num2, "\n";            # 242
print $num1/$num2, "\n";            # 0.5
print ++$num1, ' ', $num1, "\n";    # 12 12
print $num2--, ' ', $num2, "\n";    # 22 21
$num1 -= $num2;
print $num1, "\n";                  # -9
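The int() function mentioned above is not exercised by the test program, so here is a minimal sketch of how it behaves. Note that int() truncates toward zero, which matters for negative values.

```perl
use strict;
use warnings;

my $quotient = 7 / 2;          # 3.5, not 3
my $whole    = int(7 / 2);     # 3  (fraction discarded)
my $negative = int(-7 / 2);    # -3 (truncates toward zero, not downward)
my $rest     = 7 % 2;          # 1

print "$quotient $whole $negative $rest\n";    # 3.5 3 -3 1
```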
Arithmetic cum Assignment Operators
These are short-hand operators to combine two operations.
OPERATOR | DESCRIPTION | EXAMPLE | RESULT |
---|---|---|---|
+= | Addition cum Assignment | $x += $y; | Same as: $x = $x + $y; |
-= | Subtraction cum Assignment | $x -= $y; | Same as: $x = $x - $y; |
*= | Multiplication cum Assignment | $x *= $y; | Same as: $x = $x * $y; |
/= | Division cum Assignment | $x /= $y; | Same as: $x = $x / $y; |
%= | Modulus cum Assignment | $x %= $y; | Same as: $x = $x % $y; |
**= | Exponentiation cum Assignment | $x **= $y; | Same as: $x = $x ** $y; |
Comparison Operators
Perl provides the following operators for comparing numbers:
OPERATOR | DESCRIPTION |
---|---|
== | Equal To |
!= | Not Equal To |
> | Greater Than |
>= | Greater Than or Equal To |
< | Less Than |
<= | Less Than or Equal To |
String Context and Operations
Strings are sequences of zero or more characters. String literals can be enclosed in single quotes or double quotes. However, the type of quotes is significant: double quotes interpret (or interpolate) variables and special characters (e.g., \n for new-line, \t for tab, \\ for back-slash), whereas single quotes don't. Perl looks for the longest possible variable name in interpolation (i.e., it is greedy). For example,
my $msg = 'Hello';
print "$msg world\n";    # print Hello world followed by newline
print '$msg world\n';    # print $msg world\n literally (no interpretation)
Using single quotes is probably more efficient if the string does not need to be interpreted.
String Operators
Perl provides the following string operators:
OPERATOR | DESCRIPTION | EXAMPLE | RESULT |
---|---|---|---|
. | String Concatenation | 'Hello, ' . 'world' | 'Hello, world' |
x | Duplicate | 'ba' x 4 | 'babababa' |
.= | Concatenation cum Assignment | $str .= $str1; | Same as $str = $str . $str1; |
x= | Duplicate cum Assignment | $str x= $num; | Same as $str = $str x $num; |
String Comparison Operators
Perl provides the following operators for comparing strings:
OPERATOR | DESCRIPTION |
---|---|
eq | String Equal To |
ne | String Not Equal To |
gt | String Greater Than |
ge | String Greater Than or Equal To |
lt | String Less Than |
le | String Less Than or Equal To |
cmp | String Compare To |
cmp: returns 1 if the first string is greater than the second string; 0 if equal; -1 otherwise.
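Choosing between the numeric and the string comparison operators changes the result, because each set forces its own context on the operands. The sketch below compares the same values both ways; the variable names are illustrative only.

```perl
use strict;
use warnings;

my $x = '10';
my $y = '10.0';

my $same_number = ($x == $y) ? 1 : 0;    # 1: both are 10 in numeric context
my $same_string = ($x eq $y) ? 1 : 0;    # 0: '10' and '10.0' differ as text

# Numeric versus string ordering of the same two values
my $numeric_less = (9 < 10)      ? 1 : 0;    # 1: 9 is less than 10
my $string_less  = ('9' lt '10') ? 1 : 0;    # 0: '9' sorts after '1'

my $order = 'apple' cmp 'banana';            # -1: 'apple' sorts first

print "$same_number $same_string $numeric_less $string_less $order\n";    # 1 0 1 0 -1
```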
String Functions
Perl provides many built-in functions for manipulating strings:
- substr(var, index, length): returns the substring of string var, starting from position index, of the given length. String index begins at 0. You can also use substr to modify the original string. For example,
    my $msg = 'Perl is fun!';
    my $adj = substr($msg, 8, 3);          # Extract a portion of string
    print $adj, "\n";                      # 'fun'
    print substr($msg, 8), "\n";           # 'fun!'
    substr($msg, 8, 3) = 'quite cool';     # Modify a portion of string
    print $msg, "\n";                      # 'Perl is quite cool!'
- index(string, substring): returns the index of the first occurrence of substring in string, or -1 if not found.
- rindex(string, substring): returns the index of the last occurrence of substring, searching from the right.
- length(string): returns the number of characters in string.
- lc(string): returns a lowercase string.
- uc(string): returns an uppercase string.
- lcfirst(string): returns the string with its first letter in lowercase.
- ucfirst(string): returns the string with its first letter in uppercase.
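The remaining functions in the list can be exercised in the same way as substr. This is only a sketch with an arbitrary sample string.

```perl
use strict;
use warnings;

my $text = 'Hello, World';

my $len   = length($text);          # 12
my $first = index($text, 'o');      # 4  (the 'o' in "Hello")
my $last  = rindex($text, 'o');     # 8  (the 'o' in "World")
my $none  = index($text, 'xyz');    # -1 (not found)

print "$len $first $last $none\n";        # 12 4 8 -1
print lc($text), ' ', uc($text), "\n";    # hello, world HELLO, WORLD
print ucfirst(lc('pERL')), "\n";          # Perl
```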
q and qq
Instead of using single quotes '...' or double quotes "...", you could also use q (for single quotes) or qq (for double quotes), as follows:
say q(Perl's cool);         # Generalized single quote - may contain single quote
say q|Perl's cool|;
say qq(Perl is "cool");     # Generalized double quote - may contain double quote
say qq|Perl is "cool"|;
Boolean Context and Operations
A scalar can take a boolean context of either true or false. "False" includes:
- The number 0.
- An empty string '' or "".
- A string containing a single zero (i.e., '0' or "0").
- A variable that has not been assigned a value (i.e., undef).
- An empty array or hash (to be discussed later).
Anything else is considered true.
Functions defined and undef: defined(var) returns true if the variable var is defined. undef(var) un-defines the variable var.
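The rules above have a few surprising edge cases, which the following sketch makes visible. The helper name truth is hypothetical; it simply reports how Perl treats a value in boolean context.

```perl
use strict;
use warnings;

# Hypothetical helper: report how Perl treats a value in boolean context.
sub truth {
    my ($value) = @_;
    return $value ? 'true' : 'false';
}

print truth(0),     "\n";    # false
print truth(''),    "\n";    # false
print truth('0'),   "\n";    # false
print truth(undef), "\n";    # false
print truth('0.0'), "\n";    # true: a non-empty string other than '0'
print truth('00'),  "\n";    # true
print truth(' '),   "\n";    # true: a single space is not an empty string

my @empty = ();
print truth(scalar @empty), "\n";    # false: an empty array has 0 elements
```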
Boolean Operators
Perl provides the following boolean (or logical) operators:
OPERATOR | DESCRIPTION |
---|---|
&& | C-style Logical AND |
|| | C-style Logical OR |
! | C-style Logical NOT |
and | Perl's Logical AND |
or | Perl's Logical OR |
not | Perl's Logical NOT |
Notes:
- Perl's not, and, or carry out the same operations as the C-style !, &&, ||, but these logical operators have very low precedence (lower than the assignment operator =), which can be useful in certain situations (you can also use parentheses to change the precedence). They are also easier to read than the C-style logical operators.
- Logical operations are always short-circuited. That is, evaluation stops as soon as the result is certain, e.g., false && ... is short-circuited to give false, and true || ... gives true.
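Both notes can be observed in a short sketch. The subroutine counted and the variable names are hypothetical, introduced only to make the effects visible.

```perl
use strict;
use warnings;

my $calls = 0;
sub counted { $calls++; return 1; }    # hypothetical helper that records each call

# Short-circuit: the right-hand side runs only when it is needed.
my ($no, $yes) = (0, 1);
my $r1 = $no  && counted();    # counted() is skipped
my $r2 = $yes || counted();    # counted() is skipped
print "$calls\n";              # 0

# A common idiom: fall back to a default when a value is false.
my $input = '';
my $name  = $input || 'anonymous';
print "$name\n";               # anonymous

# Precedence: || binds tighter than =, whereas or binds looser.
my $zero = 0;
my $fallback_ran = 0;
my $p = $zero || 5;                    # parsed as $p = ($zero || 5)
my $q = $zero or $fallback_ran = 1;    # parsed as ($q = $zero) or ($fallback_ran = 1)
print "$p $q $fallback_ran\n";         # 5 0 1
```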
Input From Keyboard & Formatted Output
You can use the operator <> or <STDIN> (STDIN is a file-handle, to be discussed in detail later in File IO) to read input from the keyboard. The input, however, contains the newline character (corresponding to the enter key), which can be stripped away via the function chomp.
Functions chomp and chop: chop removes the last character of a string. chomp removes the last character only if it is a newline character. chop returns the character removed, whereas chomp returns the number of characters removed.
For example,
#!/usr/bin/env perl
# UserInputTest.pl
use strict;
use warnings;
print 'Enter your message: ';
my $msg = <>;                     # <> to read user's input
print "Your message is $msg";     # $msg includes a newline
print 'Enter your last name: ';
my $lastName = <>;
chomp $lastName;                  # Strip ending newline
print 'Enter your first name: ';
my $firstName = <>;
chomp $firstName;                 # Strip ending newline
my $fullName = $firstName . ' ' . $lastName;    # Concatenate
print "Your full name is $fullName\n";          # $fullName does not have a newline
Functions printf and sprintf: C-style printf and sprintf (string printf) are supported in Perl for formatted output. For example,
my $str = 'Hello';
my $float = 1.2;
my $num = 33;
# %s for string, %f for floating-point number, %d for integer
printf "%10s %6.2f and %3d\n", $str, $float, $num;
my $pstr = sprintf "%10s %6.2f and %3d\n", $str, $float, $num;
say $pstr;
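The width, precision and alignment parts of a format specifier are easiest to see one at a time. The following sketch uses sprintf so that each result can be inspected as a string; the variable names are illustrative.

```perl
use strict;
use warnings;

my $padded  = sprintf('%5d|',  42);         # '   42|' right-aligned, width 5
my $left    = sprintf('%-5s|', 'ab');       # 'ab   |' left-aligned, width 5
my $zeros   = sprintf('%05d',  42);         # '00042'  zero-padded
my $rounded = sprintf('%.2f',  3.14159);    # '3.14'   two decimal places
my $hex     = sprintf('%x',    255);        # 'ff'     hexadecimal

print "$padded\n$left\n$zeros\n$rounded\n$hex\n";
```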
Conditional Flow Control
Perl provides many variations of flow control constructs:
SYNTAX | EXAMPLE |
---|---|
if (condition) { trueBlock; } | if ($day eq 'sat' || $day eq 'sun') { print 'Super weekend!'; } |
trueSingleStatement if condition; | print 'Super weekend!' if ($day eq 'sat' || $day eq 'sun'); |
unless (condition) { falseBlock; } same as: if (!condition) { falseBlock; } | unless ($day eq 'sat' || $day eq 'sun') { print 'It is a weekday'; } unless ($day ne 'sat' && $day ne 'sun') { print 'Super weekend!'; } unless ($error) { print 'Yes, Hello'; } |
falseSingleStatement unless condition; | print "It's a weekday" unless ($day eq 'sat' || $day eq 'sun'); |
if (condition) { trueBlock; } else { falseBlock; } | if ($day eq 'sat' || $day eq 'sun') { print 'Super weekend!'; } else { print 'It is a weekday...'; } |
if (condition1) { trueBlock1; } elsif (condition2) { trueBlock2; } elsif (condition3) { trueBlock3; } elsif (...) { ... } else { elseBlock; } | if ($day eq 'sat' || $day eq 'sun') { print 'Super weekend!'; } elsif ($day eq 'fri') { print "Thank God, it's friday!"; } else { print 'It is a weekday...'; } |
condition ? trueExpression : falseExpression; | $max = ($a > $b) ? $a : $b; $abs = ($a >= 0) ? $a : -$a; |
# Perl 5.10 switch-case: given (variable) { when (value1) { ... } when (value2) { ... } ...... } | given ($day) { when (['sat', 'sun']) { print 'Super weekend!'; } when (['mon', 'tue', 'wed', 'thu']) { print 'It is a weekday...'; } when ('fri') { print "Thank God, it's friday"; } } |
Notes:
- The curly braces are mandatory even if there is only one statement in the block.
- A negated version of if called unless is provided. It can be hard to read and should be used only for negative logic, e.g., unless ($error) { ... } could be clearer than if (not $error) { ... }.
- A single statement can be placed before the if or unless clause (the postfix forms in the table).
- The keyword for else-if is elsif.
- A switch-case construct, given-when, is available from Perl 5.10 (enabled by "use 5.010"). Later Perl releases mark it as experimental.
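Several of these constructs are often combined in one routine. The sketch below uses a hypothetical helper, describe_day, to show the block form, the ternary operator and the postfix form together.

```perl
use strict;
use warnings;

# Hypothetical helper that classifies a three-letter day name.
sub describe_day {
    my ($day) = @_;
    if ($day eq 'sat' || $day eq 'sun') {
        return 'weekend';
    } elsif ($day eq 'fri') {
        return 'almost weekend';
    } else {
        return 'weekday';
    }
}

for my $day (qw(sat fri mon)) {
    my $kind = describe_day($day);
    my $mark = ($kind eq 'weekday') ? '-' : '*';    # ternary operator
    print "$mark $day: $kind\n";
    print "  Sleep in!\n" if $kind eq 'weekend';    # postfix if
}
```

Run as written, this prints one line for each of the three days, plus the extra "Sleep in!" line for sat only.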
Arrays
An array contains a list of zero or more scalars. An array variable begins with @, whereas a scalar variable begins with $. A @rose has nothing to do with a $rose.
An array can be assigned to and from a list of comma-separated scalars enclosed in parentheses. For example:
my @months = ('jan', 'feb', 'mar', 'apr');
my @days = qw(mon tue wed thu fri sat sun);    # qw: list of single-quoted words
my ($first, $second, $third, $fourth) = @months;
print @months, "\n";    # janfebmarapr
print $first, "\n";     # jan
print $fourth, "\n";    # apr
You can mix numbers and strings (and undef) inside an array, e.g.,
my @mixmonths = ('jan', 2, 'mar', 4);
print @mixmonths, "\n"; # jan2mar4
You can use an array index in the form of $arrayName[index] to reference an individual element of an array. The array index starts at 0. Note that the scalar symbol $ is used for referencing an individual element, instead of the array symbol @. Accessing an array past its bound gives undef.
You can also refer to a portion (or slice) of an array (i.e., a sub-array) using an index range in the form of @arrayName[beginIndex..endIndex] or @arrayName[index1,index2,...]. For example,
my @months = ('jan', 'feb', 'mar', 'apr');
print $months[2], "\n";       # Scalar 'mar'
print @months[1..3], "\n";    # Array slice ('feb', 'mar', 'apr')
print @months[3,1], "\n";     # Array slice ('apr', 'feb')
print @months[2], "\n";       # One-element slice ('mar'); $months[2] is preferred
my @emptyArray = ();          # Empty array
Some functions, such as localtime, return a list or a scalar depending on the context, e.g.,
my ($sec, $min, $hour, $day, $month, $year, $weekday,$dayOfYear, $isdst) = localtime;
my ($m, $d, $y) = (localtime)[4,3,5];
my $dateTime = localtime; # gives Tue Oct 6 19:04:44 2009
In Perl, an array is not bounded. Its size is dynamically expanded when new elements are added. For example,
my @months = ('jan', 'feb', 'mar', 'apr');
@months[4..5] = ('may', 'jun');    # @months is ('jan', 'feb', 'mar', 'apr', 'may', 'jun')
$months[7] = 'aug';                # $months[6] is undef
The scalar variable $#arrayName holds the last index of the array @arrayName. You might be tempted to use $#arrayName+1 as the length of the array. This is not necessary, as Perl returns the length of the array if @arrayName is used in a scalar context (e.g., assignment to a scalar, arithmetic and comparison operations). In other words, to get the length of an array, you can simply assign @arrayName to a scalar. For example,
my @months = ('jan', 'feb', 'mar', 'apr');
print $#months, "\n";             # Gives 3
print $months[$#months], "\n";    # Gives 'apr'
$months[$#months + 1] = 'may';    # Append an element
my $size = @months;               # Get the length of the array
print $size, "\n";
for (my $i = 0; $i < @months; $i++) {    # @months in scalar context
    print $months[$i], "\n";
}
A negative array index -n can be used to reference the nth-to-last element of the array, e.g.,
my @months = ('jan', 'feb', 'mar', 'apr');
print $months[-1], "\n";    # Gives 'apr'
print $months[-2], "\n";    # Gives 'mar'
Array Functions
Perl provides many functions to manipulate arrays:
- push(array, list): appends the list of elements to the end of the array.
- pop(array): removes and returns the last element of the array.
- shift(array): removes and returns the first element of the array.
- unshift(array, list): adds the list of elements in front of the array.
- splice(array, offset, length, list): removes and returns length elements from array, starting from offset, and optionally replaces them with list.
my @months = ('jan', 'feb', 'mar', 'apr');
push @months, 'may'; # @months = ('jan', 'feb', 'mar', 'apr', 'may')
print @months, "\n";
print pop @months, "\n"; # @months = ('jan', 'feb', 'mar', 'apr')
print pop @months, "\n"; # @months = ('jan', 'feb', 'mar')
push (@months, shift @months); # Move the first element to last
print @months, "\n"; # @months = ('feb', 'mar', 'jan')
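The example above does not cover splice, so here is a minimal sketch of its three common uses (remove, replace, insert). It prints the arrays inside double quotes, which joins the elements with spaces.

```perl
use strict;
use warnings;

my @months = qw(jan feb mar apr may);

# Remove 2 elements starting at index 1; splice returns what it removed.
my @removed = splice(@months, 1, 2);
print "@removed\n";    # feb mar
print "@months\n";     # jan apr may

# Replace 1 element at index 1 with two new ones.
splice(@months, 1, 1, 'FEB', 'MAR');
print "@months\n";     # jan FEB MAR may

# Insert without removing anything (length 0).
unshift @months, 'dec';
splice(@months, 1, 0, 'new');
print "@months\n";     # dec new jan FEB MAR may
```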
Special Array Variable: The Command-Line Argument Array @ARGV
The command-line arguments (excluding the program name) are passed into the Perl program as an array named @ARGV. The function shift, which takes @ARGV as its default argument when called outside a subroutine, is often used to process the command-line arguments.
Flow Control - Loops
Perl provides many types of loop constructs:
SYNTAX | EXAMPLE |
---|---|
while (condition) { trueBlock; } | my $i = 0; while ($i < 10) { print "$i\n"; $i++; } |
do { trueBlock; } while (condition); | my $i = 0; do { print "$i\n"; $i++; } while ($i < 10); |
until (condition) { falseBlock; } same as: while (!condition) { falseBlock; } | my $i = 0; until ($i >= 10) { print "$i\n"; $i++; } |
foreach $scalarName (@arrayName) { statementBlock; } or for $scalarName (@arrayName) { statementBlock; } | my @months = ('jan', 'feb', 'mar', 'apr'); foreach my $month (@months) { print $month, "\n"; } for my $i (5, 4, 3, 2, 1) { print "$i "; } |
for (initialization; expression; postIncrement) { statementBlock; } | my @months = ('jan', 'feb', 'mar', 'apr'); for (my $i = 0; $i < @months; $i++) { print $months[$i], "\n"; } |
Notes:
- Again, the curly braces are mandatory, even if there is only one statement in the block.
- The foreach loop is handy for reading each item of the array. The loop variable is an alias for the current element, so assigning to it changes the array; the loop should not be used to add or remove elements.
- The negated version until should be used only for negative logic, e.g., until ($done) { ... }.
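In Perl 5 the foreach loop variable is an alias for the current array element rather than a copy, so assigning to it writes through to the array. A short sketch makes the effect visible; the array and variable names are illustrative.

```perl
use strict;
use warnings;

my @prices = (10, 20, 30);

# The loop variable is an alias: assigning to it changes the array element.
foreach my $price (@prices) {
    $price *= 2;
}
print "@prices\n";    # 20 40 60

# Working on a copy leaves the array untouched.
foreach my $price (@prices) {
    my $copy = $price;
    $copy += 1;
}
print "@prices\n";    # 20 40 60
```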
Loop Control Statements
- last: exits the innermost enclosing loop (similar to the break statement in C/Java).
- next: aborts the current iteration and continues with the next iteration of the loop (similar to the continue statement in C/Java).
- redo: restarts the current iteration (from the opening brace) without re-evaluating the loop condition.
- last, next and redo can also take a label, so that they apply to a labeled loop in the form of labelName: ...
For example:
#!/usr/bin/env perl
# LoopTest.pl
use strict;
use warnings;
my $num = 1;
while (1) {                      # Always true
    $num++;
    next if ($num % 3) == 0;     # Continue to next num if num is divisible by 3
    last if $num == 17;          # Break the loop if num is 17
    if (($num % 2) == 0) {
        $num += 3;               # Add 3 for even number
    } else {
        $num -= 3;               # Subtract 3 for odd number
    }
    print "$num ";
}
> perl LoopTest.pl
5 4 2 7 11 10 8 13 17 16
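Labels become useful with nested loops, where a plain next or last would only affect the inner loop. In this sketch the label ROW and the array @found are names chosen for illustration.

```perl
use strict;
use warnings;

my @found;
ROW: for my $row (1 .. 3) {
    for my $col (1 .. 3) {
        next ROW if $col > $row;         # skip the rest of this row
        last ROW if $row * $col == 6;    # leave both loops at once
        push @found, "$row$col";
    }
}
print "@found\n";    # 11 21 22 31
```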
Special Scalar Variable: The Default Scalar Variable $_
Perl introduces a feature called the default variable, which is uncommon in other languages. The default scalar variable is named $_.
Many constructs and functions, such as the foreach loop and print, take $_ as the default argument. For example,
foreach my $month (@months) { print $month; }
can be rewritten as:
foreach (@months) { print; }
# same as:
foreach $_ (@months) { print $_; }
# or
for (@months) { print; }
Another example:
while (<>) {                   # same as while (defined($_ = <>)): read input line by line
    print;                     # print $_
    chomp;                     # chomp $_ to remove ending newline from $_
    last if ($_ eq 'done');    # break the loop if input is 'done'
}
Hash or Associative Array
We have so far covered two data types: scalar (which begins with $) for a single item, and array (which begins with @) for a list of scalars. The third data type provided by Perl is called Hash or Associative Array, which begins with a %. Take note that %rose is not a @rose, which is not a $rose.
A hash stores key-value (or name-value) pairs. A hash is similar to a regular array, except that regular arrays are indexed by numbers, whereas hashes are indexed by key-strings. A hash lets you associate one scalar with another; hence, it is also called an associative array.
To initialize a hash, you could provide a list of key-value pairs in the form of (key1 => value1, key2 => value2, ...) or (key1, value1, key2, value2, ...). Keys must be unique.
You can retrieve the value associated with a key in the scalar form $hashName{keyName}. Recall that an array uses square brackets with a numerical index, $arrayName[index], whereas a hash uses curly brackets and a key-string index.
For example,
#!/usr/bin/env perl
# HashTest.pl
use strict;
use warnings;
# Declare and initialize a hash with key-value pairs.
my %countryCodes = ('us' => 'United States', 'sg' => 'Singapore');
# Use $hashName{keyName} (scalar context) to reference the value of an item.
print $countryCodes{'us'}, "\n";    # prints 'United States'
print $countryCodes{'sg'}, "\n";    # prints 'Singapore'
# Add in more key-value pairs
$countryCodes{'fr'} = 'France';
$countryCodes{'cn'} = 'China';
print %countryCodes, "\n";          # prints all items
my %emptyHash = ();                 # an initially empty hash
You can convert a hash to an array and vice versa. The array stores the key-value pairs as sequential entries but in no particular order, e.g.,
# Assign Hash to Array
my %countryCodes = ('us' => 'United States', 'sg' => 'Singapore');    # Hash
my @countryArray = %countryCodes;    # Assign a Hash to an array
print $countryArray[0], "\n";        # Referencing array
print $countryArray[1], "\n";
# Assign an Array (a list of items) to a Hash
my %countryHash = ('us', 'United States', 'sg', 'Singapore');    # Hash
print $countryHash{'us'}, "\n";      # Referencing hash
print $countryHash{'sg'}, "\n";
Hash Functions
- keys(hashName): returns a list containing all the keys in hashName.
- values(hashName): returns a list containing all the values in hashName.
- each(hashName): returns a 2-element list (key, value) containing the next key-value pair from hashName.
- delete($hashName{keyName}): removes the key-value pair of keyName from hashName, and returns the deleted value.
- exists($hashName{keyName}): returns true if keyName exists in hashName.
- defined($hashName{keyName}): checks if the value of keyName is defined in hashName.
For example:
my %countryCodes = ('us' => 'United States', 'sg' => 'Singapore');
while (my ($key, $value) = each %countryCodes) {
    print "$key is associated with $value.\n";
}
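The difference between exists and defined is easy to miss: a key can be present while its value is undef. The following sketch, using a made-up %stock hash, shows both tests together with delete.

```perl
use strict;
use warnings;

my %stock = (apple => 3, pear => 0, plum => undef);

my $plum_exists  = exists($stock{plum})  ? 'yes' : 'no';    # yes: the key is present
my $plum_defined = defined($stock{plum}) ? 'yes' : 'no';    # no: its value is undef
my $kiwi_exists  = exists($stock{kiwi})  ? 'yes' : 'no';    # no: no such key
print "$plum_exists $plum_defined $kiwi_exists\n";          # yes no no

my $gone = delete $stock{apple};    # removes the pair and returns its value
print "$gone\n";                    # 3
print join(',', sort keys %stock), "\n";    # pear,plum

my $count = keys %stock;            # keys in scalar context gives the number of keys
print "$count\n";                   # 2
```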
Special Hash Variable: The Environment Variables Hash %ENV
A program can access its operating environment, which contains information such as the current directory, the username, and so on. Perl stores the environment variables in a special hash named %ENV.
print $ENV{'PATH'};    # print environment variable PATH
while (my ($key, $value) = each %ENV) {    # prints all environment variables
    print "$key=$value\n";
}
The %ENV hash is useful in writing server-side CGI Perl scripts.
Sorting the Hash
foreach my $key (sort keys %ENV) {    # sort keys returns a list of sorted keys
    print "$key=$ENV{$key}\n";        # get the value with the sorted keys
}
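Sorting by value instead of by key needs a comparison block. This sketch uses Perl's built-in sort with such a block and the numeric comparison operator <=>, neither of which is covered above; the %score data is invented for illustration.

```perl
use strict;
use warnings;

my %score = (alice => 82, bob => 95, carol => 82);

# Highest score first; ties are broken alphabetically by name.
# Inside the block, $a and $b are the two keys being compared.
my @ranking = sort { $score{$b} <=> $score{$a} or $a cmp $b } keys %score;

foreach my $name (@ranking) {
    print "$name=$score{$name}\n";
}
# bob=95
# alice=82
# carol=82
```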
Subroutines (or Functions)
You can define your own subroutines (or functions) by using the keyword sub with a processing block:
sub subroutineName {
    statementBlock;
    return aReturnValue;
}
A subroutine returns a scalar, a list, or nothing, via the statement return aReturnValue (or the value of the last expression evaluated if there is no return statement).
You can invoke a subroutine by writing its name followed by parentheses, e.g., hello(). Older code often puts an ampersand & before the subroutine name; in Perl 5 the & is optional. (Recall that $ identifies a scalar, @ identifies an array, and % identifies a hash.)
For example:
# Define subroutine
sub hello {
    return 'Hello, world';
}
# Invoke subroutine
print hello(), "\n";
print &hello, "\n";    # Older style, with an ampersand
Passing arguments into subroutines
You can pass argument(s) into a subroutine. Perl places the arguments into a special array variable named @_. You can access the first element using $_[0], the second with $_[1], and so on. (Recall that $_ is the default scalar variable; $_[0] is not related to it, but is an element of the array @_.)
Inside a subroutine, you can use the keyword my to declare lexical variables, which are visible only inside the enclosing block. The keyword local instead gives a global variable a temporary value that lasts until the enclosing block exits. In both cases, the global version is hidden temporarily if there is one.
For example,
# Define a subroutine add which takes zero or more arguments
sub add {
   my $sum = 0;
   foreach (@_) {
      $sum += $_;
   }
   return $sum;
}

# Invoke subroutine add with various number of arguments
print &add(1), "\n";         # 1
print &add(2, 3), "\n";      # 5
print &add(4, 5, 6), "\n";   # 15
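Instead of indexing @_ directly, many Perl programs copy the arguments into named lexical variables first. The sketch below illustrates that idiom, and also a subroutine that returns a list rather than a single scalar. The subroutine names rectangle_area and min_max are hypothetical and chosen only for this example.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical subroutine: copy the arguments in @_ into named lexical variables
sub rectangle_area {
   my ($width, $height) = @_;
   return $width * $height;
}

# Hypothetical subroutine: return a list (the minimum and maximum of the arguments)
sub min_max {
   my ($min, $max) = ($_[0], $_[0]);
   foreach my $value (@_) {
      $min = $value if $value < $min;
      $max = $value if $value > $max;
   }
   return ($min, $max);
}

print rectangle_area(3, 4), "\n";          # 12
my ($low, $high) = min_max(5, 2, 9, 7);    # list assignment receives both values
print "min=$low max=$high\n";              # min=2 max=9
```

Both calls are written with parentheses and without the leading ampersand, which is the more common style in Perl 5 code.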
Perl's Built-in Functions
Mathematical Functions
sqrt(number): returns the square root of number.
abs(number): returns the absolute value of number.
sin(number): returns the sine of number, where number is in radians.
cos(number): returns the cosine of number, where number is in radians.
atan2(y, x): returns the arc-tangent of y/x in the range of -π to π radians.
exp(number): returns e raised to the power of number.
log(number): returns the natural logarithm of number.
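The mathematical functions above can be combined to obtain values that Perl does not provide directly. The following sketch derives π from the built-in atan2 and builds two small helpers on top of log and sin. The helper names log_base and deg2rad are hypothetical and not part of Perl.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# pi can be derived from atan2, because atan2(1, 1) is pi/4
my $pi = 4 * atan2(1, 1);

# Hypothetical helper: logarithm to an arbitrary base, built from the natural log
sub log_base {
   my ($base, $number) = @_;
   return log($number) / log($base);
}

# Hypothetical helper: convert degrees to radians, as sin and cos expect radians
sub deg2rad {
   my ($degrees) = @_;
   return $degrees * $pi / 180;
}

printf "pi          = %.5f\n", $pi;                  # 3.14159
printf "sqrt(16)    = %d\n",   sqrt(16);             # 4
printf "abs(-7.5)   = %.1f\n", abs(-7.5);            # 7.5
printf "sin(30 deg) = %.3f\n", sin(deg2rad(30));     # 0.500
printf "exp(1)      = %.5f\n", exp(1);               # 2.71828
printf "log10(1000) = %.1f\n", log_base(10, 1000);   # 3.0
```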
Converting between Number Bases
ord(character): returns the ASCII value of character.
chr(number): returns the character given its ASCII number.
oct(string): returns the decimal value of the octal number in string.
hex(string): returns the decimal value of the hexadecimal number in string.
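A short sketch of these conversion functions follows. The functions above convert into decimal; for the opposite direction the sketch uses sprintf with the %x, %o and %b formats, which is one common approach rather than the only one.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

print ord('A'), "\n";        # 65
print chr(97), "\n";         # a
print oct('755'), "\n";      # 493
print hex('ff'), "\n";       # 255
print oct('0x1F'), "\n";     # 31 (oct also understands a 0x prefix)
print oct('0b101'), "\n";    # 5  (and a 0b prefix for binary)

# Going the other way: sprintf formats a number in another base
my $asHex    = sprintf("%x", 255);   # "ff"
my $asOctal  = sprintf("%o", 493);   # "755"
my $asBinary = sprintf("%b", 5);     # "101"
print "$asHex $asOctal $asBinary\n";
```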
Error Reporting Functions - exit, die, warn
exit(number): exits the program with the status number. Normal termination of a program exits with status 0.
die(string): prints string to the standard error and exits the program with a non-zero status (the current value of the special variable $!, if it is set).
warn(string): prints string to the standard error but does not terminate the program.
For example,
exit 1 unless open(HANDLE, $file);                      # exit quietly with a non-zero status
open(HANDLE, $file) or die "Cannot open $file: $!\n";   # or exit with an error message
Special Scalar Variable: Error Number $!
$! (or $ERRNO or $OS_ERROR, which are available after use English;) contains the system error. In numeric context, it contains the error number; in string context, it contains the error string.
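The two contexts of $! can be seen by forcing each one explicitly. In the sketch below, the file path is a hypothetical one that is assumed not to exist, so that open fails and sets $!. The exact number and message depend on the operating system, which is why no output is shown.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical file name, assumed not to exist
my $file = '/no/such/dir/no-such-file.txt';

my ($errorNumber, $errorString);
if (open(my $handle, '<', $file)) {
   close($handle);
} else {
   $errorNumber = $! + 0;    # numeric context: the error number
   $errorString = "$!";      # string context: the error message
   warn "Cannot open $file: $errorString (error number $errorNumber)\n";
}
```

The value of $! is only meaningful immediately after a failed system call, so it is copied into ordinary variables right away.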
Backquotes `command` and Function system
`command` executes command in a sub-shell and returns the command's output. For example,
my $today = `date`;
print $today, "\n";
my @dirlines = `dir`; # Use `ls -l` for Unix
foreach (@dirlines) { print; }
system(program, args) executes the program with arguments args and waits for it to return. system is similar to backquotes. However, backquotes capture and return the output of the program, whereas system lets the program print directly to the console and returns its exit status (0 indicates normal termination; for a failed command, the program's exit code is the returned value shifted right by 8 bits). For example,
print system('date'), "\n";
print system('dir'), "\n";    # Use 'ls -l' for Unix
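The difference between the two return values can be sketched as follows. To avoid depending on which shell utilities are installed, the external command here is the Perl interpreter itself, whose path is held in the special variable $^X. The one-line programs passed to it are made up for illustration.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# system returns the wait status; the command's exit code is in the high byte
my $status   = system($^X, '-e', 'exit 3');
my $exitCode = $status >> 8;
print "status=$status, exit code=$exitCode\n";   # status=768, exit code=3

# Backquotes capture the output; the status is then available in $?
my $output = `$^X -e "print 6*7"`;
print "output=$output, exit code=", $? >> 8, "\n";   # output=42, exit code=0
```

Passing the program and its arguments to system as a list, as done here, bypasses the shell, so the arguments need no quoting.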
Function sort
sort subroutine array sorts the array using the comparison function subroutine (given as a subroutine name or an inline block, with no comma before the array) and returns the sorted array. Inside the subroutine, the scalar variables $a and $b are automatically set to the two elements to be compared; the subroutine returns a negative number, zero, or a positive number if $a is to be ordered before, the same as, or after $b. If sort is used without the subroutine, it sorts in string order. (Caution: By default, numbers are sorted as strings; that is, the number 10 comes before 2 in string order.)
For examples,
#!/usr/bin/env perl
use strict;
use warnings;

my @color = ('black', 'white', 'blue', 'green');
my @sorted = sort @color;
foreach (@sorted) { print "$_ "; }   # black blue green white

#!/usr/bin/env perl
use strict;
use warnings;

# Define sorting subroutine: compare numbers
sub numerically {
   if ($a > $b) {1} elsif ($a < $b) {-1} else {0}
}

my @price = (77, 100, 99, 55, 1);
my @sorted = sort numerically @price;
foreach (@sorted) { print "$_ "; }   # 1 55 77 99 100

# A "spaceship" operator as the shorthand for the above because it is used very often
@sorted = sort { $a <=> $b } @price;

# Without a subroutine, the numbers are sorted as strings
@sorted = sort @price;
foreach (@sorted) { print "$_ "; }   # 1 100 55 77 99

#!/usr/bin/env perl
use strict;
use warnings;

# Define sorting subroutine: compare lowercase strings
sub alphabetically {
   lc($a) cmp lc($b);
}

my @color = ('red', 'YELLOW', 'Blue', 'green');
my @sorted = sort alphabetically @color;
foreach (@sorted) { print "$_ "; }   # Blue green red YELLOW
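Two further variations on the comparison block are sketched below, reusing the @price and @color data from the examples above. Swapping $a and $b reverses the order, and chaining two comparisons with or gives a secondary sort key. This is one possible way to write them, not the only one.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

my @price = (77, 100, 99, 55, 1);
my @color = ('black', 'white', 'blue', 'green');

# Descending numeric order: swap $a and $b
my @descending = sort { $b <=> $a } @price;
print "@descending\n";   # 100 99 77 55 1

# Sort by string length first, then alphabetically among equal lengths.
# The second comparison is used only when the first returns 0.
my @byLength = sort { length($a) <=> length($b) or $a cmp $b } @color;
print "@byLength\n";     # blue black green white
```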
Random Number Functions srand and rand
srand(seed): initializes the random number generator with the seed. Use it at most once, at the beginning of the program. If seed is omitted, a seed is chosen automatically (for example, based on the current time).
rand(number): returns a random floating-point number greater than or equal to 0 and less than number.
srand;
print rand(1), "\n";          # Generate a random number between 0.0 (inclusive) and 1.0 (exclusive)
print int(rand(100)), "\n";   # Generate a random integer between 0 and 99
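Building on int(rand(...)), the sketch below shows two small helpers for common tasks: a random integer within an inclusive range, and a random element of a list. The names random_int and random_pick are hypothetical. The output is not shown, because it depends on the random number generator.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: random integer between $low and $high, both inclusive
sub random_int {
   my ($low, $high) = @_;
   return $low + int(rand($high - $low + 1));
}

# Hypothetical helper: pick a random element from the argument list.
# rand(@_) uses the number of arguments as the upper bound.
sub random_pick {
   return $_[ int(rand(@_)) ];
}

srand(42);   # a fixed seed makes the sequence repeatable, which helps testing
print random_int(1, 6), "\n";                       # a simulated die roll
print random_pick('red', 'green', 'blue'), "\n";    # one of the three colors
```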
Time Functions time, localtime, gmtime
time: returns the number of seconds since January 1, 1970, GMT (Greenwich Mean Time).
localtime(time): converts the numeric time to time/day/date fields in the local time zone.
gmtime(time): converts the numeric time to time/day/date fields in GMT.
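Note that the GMT function in Perl is spelled gmtime. In list context, both localtime and gmtime return the fields in the order (second, minute, hour, day of month, month, year, day of week, day of year, daylight-saving flag). The sketch below formats some of these fields; the helper name format_gmt is hypothetical.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: format an epoch time (in GMT) as YYYY-MM-DD HH:MM:SS
sub format_gmt {
   my ($epoch) = @_;
   my ($sec, $min, $hour, $mday, $mon, $year) = gmtime($epoch);
   # $mon counts from 0 (January), and $year is the number of years since 1900
   return sprintf("%04d-%02d-%02d %02d:%02d:%02d",
                  $year + 1900, $mon + 1, $mday, $hour, $min, $sec);
}

print format_gmt(0), "\n";             # 1970-01-01 00:00:00
print format_gmt(time), "\n";          # the current time in GMT
print scalar(localtime(time)), "\n";   # in scalar context, a ready-made string
```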
Function sleep
sleep(number) makes the program wait for number seconds before resuming execution.
Encryption Function crypt
crypt(password, salt) hashes password with salt using a one-way function, and returns the hashed password. With the traditional DES-based implementation, crypt uses only the first 8 characters of the password, and salt is 2 characters (12 bits). The first 2 characters of the returned string are the salt, which is needed to verify a password later: hash the candidate password with the same salt and compare the results.
Miscellaneous
do(...): Executing another Perl program
For example, do(FILENAME) evaluates the Perl code in FILENAME. do(...) is similar to #include in C/C++.
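A self-contained sketch of do follows. So that it can run anywhere, it first writes a small throw-away Perl file using the core module File::Temp. The file's content and the template name sketchXXXXXX are made up for illustration.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a small throw-away Perl file in the system's temporary directory
my ($fh, $filename) = tempfile('sketchXXXXXX', SUFFIX => '.pl',
                               TMPDIR => 1, UNLINK => 1);
print $fh "my \$x = 6;\n\$x * 7;\n";
close($fh);

# do evaluates the file and returns the value of its last expression
my $result = do $filename;
die "Cannot run $filename: ", ($@ || $!), "\n" unless defined $result;
print "Result: $result\n";   # Result: 42
```

Unlike #include, do runs the file at run time and yields a value; if the file cannot be read or compiled, do returns undef and sets $! or $@.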
Bitwise Operations
OPERATOR | DESCRIPTION | EXAMPLE | RESULT |
---|---|---|---|
<< | Left bit-shift (padded with 0's) | bitPattern << number | |
>> | Right bit-shift (padded with 0's by default) | bitPattern >> number | |
& | Bitwise AND | bitPattern1 & bitPattern2 | |
\| | Bitwise OR | bitPattern1 \| bitPattern2 | |
~ | Bitwise NOT (1's complement) | ~bitPattern | |
^ | Bitwise XOR | bitPattern1 ^ bitPattern2 | |
Notes:
- You can also use the compound operators |=, &=, ^=, <<=, >>=. (There is no ~=, as ~ is a unary operator.)
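The following sketch applies each operator to two small bit patterns. The values 0b1100 and 0b1010 are arbitrary and chosen only for illustration, and the results are printed in binary with the %b format of printf.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

my $x = 0b1100;   # 12
my $y = 0b1010;   # 10

printf "%04b\n", $x & $y;        # 1000
printf "%04b\n", $x | $y;        # 1110
printf "%04b\n", $x ^ $y;        # 0110
printf "%d\n",   $x << 2;        # 48
printf "%d\n",   $x >> 2;        # 3
# ~ flips every bit of the machine word, so mask the result down to 4 bits
printf "%04b\n", ~$x & 0b1111;   # 0011

# Compound assignment: set bit 0 of $flags
my $flags = 0b0100;
$flags |= 0b0001;
printf "%04b\n", $flags;         # 0101
```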
Debugging Perl Programs
[TODO]
Perl Documentation
Perl comes with thousands of pages of documentation, available @ http://perldoc.perl.org.
- perlfaq: Perl frequently asked questions
- perldata: Perl data structures
- perlsyn: Perl syntax
- perlop: Perl operators and precedence
- perlre: Perl regular expressions
- perlrun: Perl execution and options
- perlfunc: Perl builtin functions
- perlvar: Perl predefined variables
- perlsub: Perl subroutines
- perlmod: Perl modules: how they work
Code Examples
Print Calendar
Given a month (e.g., mar) and the first day of the week of that month (e.g., wed), print the calendar of the month.
#!/usr/bin/env perl
use strict;
use warnings;
use 5.010;

# CalendarMonth.pl
# Given a month and the first day of the week of that month,
# print the calendar for the month. For example,
# > perl CalendarMonth.pl mar wed

my @weekdays = ("sun", "mon", "tue", "wed", "thu", "fri", "sat");
my %daysInMonth = ("jan" => 31, "feb" => 28, "mar" => 31, "apr" => 30,
                   "may" => 31, "jun" => 30, "jul" => 31, "aug" => 31,
                   "sep" => 30, "oct" => 31, "nov" => 30, "dec" => 31);

# Get inputs from the command-line argument @ARGV, convert to lowercase.
# A missing argument becomes an empty string, which fails the checks below.
my $theMonth = lc(shift(@ARGV) // '');
my $firstWeekDay = lc(shift(@ARGV) // '');

# Check valid input for the first week day of the month
my $weekDayNum;
for ($weekDayNum = 0; $weekDayNum < @weekdays; $weekDayNum++) {
   last if ($weekdays[$weekDayNum] eq $firstWeekDay);
}
die "Error: Incorrect first weekday '$firstWeekDay'" if ($weekDayNum >= @weekdays);

# Check valid input for the month
die "Error: Incorrect month '$theMonth'" unless (exists $daysInMonth{$theMonth});

# Print heading - Each column takes 4 places (a blank followed by 3 letters)
printf "%16s\n", uc($theMonth);   # Use C-style printf for formatted output
for my $day (@weekdays) {
   printf "%4s", ucfirst($day);
}
print "\n";

# Skip to the first day of the week
$weekDayNum = 0;
until ($firstWeekDay eq $weekdays[$weekDayNum]) {
   print "    ";   # 4 spaces, the width of one day column
   $weekDayNum++;
}

# Printing the month
for (my $dayNum = 1; $dayNum <= $daysInMonth{$theMonth}; $dayNum++) {
   printf "%4d", $dayNum;
   $weekDayNum++;
   if ($weekDayNum == 7) {
      $weekDayNum = 0;
      print "\n";
   }
}
print "\n" if ($weekDayNum != 0);   # end the last, partial week
Given a year (e.g., 2009), print the calendar of the year.
#!/usr/bin/env perl
use strict;
use warnings;
use 5.010;

# CalendarYear.pl
# Given a year (>=1961), print the calendar for the year.
# > perl CalendarYear.pl 2009

my @weekdays = ("sun", "mon", "tue", "wed", "thu", "fri", "sat");
my @months = ('jan', 'feb', 'mar', 'apr', 'may', 'jun',
              'jul', 'aug', 'sep', 'oct', 'nov', 'dec');
my %daysInMonth = ("jan" => 31, "feb" => 28, "mar" => 31, "apr" => 30,
                   "may" => 31, "jun" => 30, "jul" => 31, "aug" => 31,
                   "sep" => 30, "oct" => 31, "nov" => 30, "dec" => 31);

# Get inputs from the command-line argument @ARGV
my $theYear = shift;
my $startYear = 1961;

# Check valid inputs
die "Error: no year given" unless ($theYear);
die "Error: Incorrect year number '$theYear'" unless ($theYear >= $startYear);

# Knowing that Jan 1, 1961 is a Sunday,
# compute the first week day of the given year
my $yearsDiff = $theYear - $startYear;
my $daysDiff = $yearsDiff * 365;
# Account for leap years (this simple count holds for years up to 2100)
$daysDiff += int($yearsDiff / 4);
my $firstWeekDay = ($daysDiff + 0) % 7;  # +0 for Sunday
my $weekDayNum = 0;

# Check for leap year - divisible by 4 but not divisible by 100, or divisible by 400
if (((($theYear % 4) == 0) && (($theYear % 100) != 0)) || ($theYear % 400) == 0) {
   $daysInMonth{'feb'} = 29;
}

for my $month (@months) {
   # Print heading for month - Each column takes 4 places (a blank followed by 3 letters)
   printf "%16s\n", uc($month);
   for my $day (@weekdays) {
      printf "%4s", ucfirst($day);
   }
   print "\n";

   # Skip to the first day of the week
   $weekDayNum = 0;
   until ($firstWeekDay == $weekDayNum) {
      print "    ";   # 4 spaces, the width of one day column
      $weekDayNum++;
   }

   # Continue for the rest of the month
   for (my $dayNum = 1; $dayNum <= $daysInMonth{$month}; $dayNum++) {
      printf "%4d", $dayNum;
      $weekDayNum++;
      if ($weekDayNum == 7) {
         $weekDayNum = 0;
         print "\n";
      }
   }
   print "\n";
   print "\n" if ($weekDayNum != 0);
   $firstWeekDay = $weekDayNum;   # Continue for next month
}
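As an alternative to counting days from 1961 by hand, the first weekday of a year can be obtained from the core module Time::Local together with gmtime. The sketch below is one possible approach, not the method used in CalendarYear.pl above; the helper names first_weekday_of_year and is_leap_year are hypothetical.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Time::Local qw(timegm);

# Hypothetical helper: weekday number (0 = Sunday .. 6 = Saturday) of January 1
sub first_weekday_of_year {
   my ($year) = @_;
   # Arguments: second, minute, hour, day of month, month (0-based), 4-digit year
   my $epoch = timegm(0, 0, 0, 1, 0, $year);
   return (gmtime($epoch))[6];   # element 6 of the list is the day of the week
}

# Hypothetical helper: divisible by 4 but not by 100, or divisible by 400
sub is_leap_year {
   my ($year) = @_;
   my $leap = ($year % 4 == 0 && $year % 100 != 0) || ($year % 400 == 0);
   return $leap ? 1 : 0;
}

my @weekdays = ("sun", "mon", "tue", "wed", "thu", "fri", "sat");
print "Jan 1, 2009 is a ", $weekdays[first_weekday_of_year(2009)], "\n";   # thu
print "2009 leap year? ", is_leap_year(2009), "\n";                        # 0
print "2000 leap year? ", is_leap_year(2000), "\n";                        # 1
```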
REFERENCES & RESOURCES
- Popular Perl sites, e.g., www.perl.org, www.perl.com, www.pm.org, www.perlmongers.org.
- Perl's documentation @ http://perldoc.perl.org.
- "Perlintro - A brief introduction and overview of Perl", available @ http://perldoc.perl.org.
- CPAN (Comprehensive Perl Archive Network) @ www.cpan.org.
- (The Camel Book) Larry Wall, Tom Christiansen and Jon Orwant, "Programming Perl", 3rd ed., 2000 - covers Perl 5.6.
- (The Llama Book) Randal L. Schwartz, Tom Phoenix and Brian D Foy, "Learning Perl", 5th ed., 2008 - covers Perl 5.10.
- (The Ram Book) Tom Christiansen and Nathan Torkington, "The Perl Cookbook", 2nd ed., 2003 - recipes for common tasks.