In the syntax descriptions that follow, list operators that expect a list (and provide list context for the elements of the list) are shown with LIST as an argument. Such a list may consist of any combination of scalar arguments or list values; the list values will be included in the list as if each individual element were interpolated at that point in the list, forming a longer single-dimensional list value. Elements of the LIST should be separated by commas.
Any function in the list below may be used either with or without parentheses around its arguments. (The syntax descriptions omit the parens.) If you use the parens, the simple (but occasionally surprising) rule is this: It LOOKS like a function, therefore it IS a function, and precedence doesn't matter. Otherwise it's a list operator or unary operator, and precedence does matter. And whitespace between the function and left parenthesis doesn't count--so you need to be careful sometimes:
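A minimal sketch of the rule, using a hypothetical helper total() (an assumption for illustration, not a Perl built-in) so that the results can be captured and compared:

```perl
use strict;
use warnings;

# Hypothetical helper: sums its arguments.
sub total { my $sum = 0; $sum += $_ for @_; return $sum }

# With parentheses it is a function call: total() sees only (1, 2),
# and the multiplication is applied to its result.
my $call        = total(1, 2) * 3;     # (1 + 2) * 3
# Whitespace before the left parenthesis changes nothing.
my $spaced_call = total (1, 2) * 3;    # still (1 + 2) * 3
# Without parentheses it is a list operator and takes everything to
# its right, so it sees the list (1, 2 * 3).
my $list_op     = total 1, 2 * 3;      # 1 + 6

print "$call $spaced_call $list_op\n";
```

The same reading applies to built-in list operators such as print: once a left parenthesis follows the name, only the parenthesized part is the argument list.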
If you run Perl with the -w switch it can warn you about this. For example, the third line above produces:
For functions that can be used in either a scalar or list context, non-abortive failure is generally indicated in a scalar context by returning the undefined value, and in a list context by returning the null list.
Remember the following rule:
* - sub was a keyword in perl4, but in perl5 it is an operator which can be used in expressions.
-t, which tests STDIN. Unless otherwise documented, it returns 1 for TRUE and '' for FALSE, or the undefined value if the file doesn't exist. Despite the funny names, precedence is the same as any other named unary operator, and the argument may be parenthesized like any other unary operator. The operator may be any of:
The interpretation of the file permission operators -r, -R, -w, -W, -x and -X is based solely on the mode of the file and the uids and gids of the user. There may be other reasons you can't actually read, write or execute the file. Also note that, for the superuser, -r, -R, -w and -W always return 1, and -x and -X return 1 if any execute bit is set in the mode. Scripts run by the superuser may thus need to do a stat() in order to determine the actual mode of the file, or temporarily set the uid to something else.
Example:
Note that -s/a/b/ does not do a negated substitution. Saying -exp($foo) still works as expected, however--only single letters following a minus are interpreted as file tests.
The -T and -B switches work as follows. The first block or so of the file is examined for odd characters such as strange control codes or characters with the high bit set. If too many odd characters (>30%) are found, it's a -B file, otherwise it's a -T file. Also, any file containing null in the first block is considered a binary file. If -T or -B is used on a filehandle, the current stdio buffer is examined rather than the first block. Both -T and -B return TRUE on a null file, or a file at EOF when testing a filehandle. Because you have to read a file to do the -T test, on most occasions you want to use a -f against the file first, as in next unless -f $file && -T $file.
If any of the file tests (or either the stat() or lstat() operators) are given the special filehandle consisting of a solitary underline, then the stat structure of the previous file test (or stat operator) is used, saving a system call. (This doesn't work with -t, and you need to remember that lstat() and -l will leave values in the stat structure for the symbolic link, not the real file.) Example:
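A small sketch of reusing the saved stat structure through the _ filehandle. The scratch file comes from File::Temp, which is an assumption made here only so the example is self-contained:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a small scratch file so the tests have something to look at.
my ($fh, $path) = tempfile(UNLINK => 1);
print {$fh} "hello";
close($fh) or die "close failed: $!";

# The first test performs the stat; the tests on _ reuse its result
# instead of making another system call.
my $is_plain = (-e $path && -f _) ? 1 : 0;
my $size     = -s _;    # size in bytes, read from the saved structure

print "plain=$is_plain size=$size\n";
```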
For delays of finer granularity than one second, you may use Perl's syscall() interface to access setitimer(2) if your system supports it, or else see select below. It is not advised to intermix alarm() and sleep() calls.
With EXPR, it returns some extra information that the debugger uses to print a stack trace. The value of EXPR indicates how many call frames to go back before the current one.
Furthermore, when called from within the DB package, caller returns more detailed information: it sets the list variable @DB::args to be the arguments with which that subroutine was invoked.
$/ (also known as $INPUT_RECORD_SEPARATOR in the English module). It returns the number of characters removed. It's often used to remove the newline from the end of an input record when you're worried that the final record may be missing its newline. When in paragraph mode ($/ = ""), it removes all trailing newlines from the string. If VARIABLE is omitted, it chomps $_. Example:
You can actually chomp anything that's an lvalue, including an assignment:
If you chomp a list, each element is chomped, and the total number of characters removed is returned.
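The chomp behaviours described above can be sketched as follows; the strings are made-up sample data:

```perl
use strict;
use warnings;

my $line    = "hello\n";
my $removed = chomp($line);        # number of characters removed

# Chomping an lvalue assignment: copy first, then trim the copy.
chomp(my $copy = "world\n");

# Chomping a list trims every element and returns the total removed.
my @lines = ("a\n", "b\n", "c");
my $total = chomp(@lines);

# Paragraph mode: all trailing newlines go, not just one.
my $para = "text\n\n\n";
{
    local $/ = "";
    chomp($para);
}

print "$line $copy @lines $para ($removed, $total)\n";
```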
s/\n// because it neither scans nor copies the string. If VARIABLE is omitted, chops $_. Example:
You can actually chop anything that's an lvalue, including an assignment:
If you chop a list, each element is chopped. Only the value of the last chop is returned.
Note that chop returns the last character. To return all but the last character, use substr($string, 0, -1).
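A short sketch contrasting the return value of chop with the substr() idiom; the sample string is arbitrary:

```perl
use strict;
use warnings;

my $string = "widget";

# chop modifies its argument and returns the character it removed.
my $last = chop(my $chopped = $string);

# substr() with a negative length leaves $string itself untouched.
my $all_but_last = substr($string, 0, -1);

print "$last $chopped $all_but_last $string\n";
```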
Here's an example that looks up non-numeric uids in the passwd file:
On most systems, you are not allowed to change the ownership of the file unless you're the superuser, although you should be able to change the group to any of your secondary groups. On insecure systems, these restrictions may be relaxed, but this is not a portable assumption.
$?. Example:
FILEHANDLE may be an expression whose value gives the real filehandle name.
while or foreach), it is always executed just before the conditional is about to be evaluated again, just like the third part of a for loop in C. Thus it can be used to increment a loop variable, even when the loop has been continued via the next statement (which is similar to the C continue statement).
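As an illustrative sketch (the loop and variable names are made up), the continue block below still increments the counter on iterations that were cut short by next:

```perl
use strict;
use warnings;

my $i = 0;
my @odd;
while ($i < 6) {
    next if $i % 2 == 0;    # skip even numbers; continue block still runs
    push @odd, $i;
}
continue {
    $i++;                   # runs just before the condition is re-tested
}

print "@odd\n";
```

Without the continue block, the next on the first iteration would skip the increment and the loop would never terminate.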
Here's an example that makes sure that whoever runs this program knows their own password:
Of course, typing in your own password to whoever asks you for it is unwise.
Breaks the binding between a DBM file and an associative array.
This binds a dbm(3), ndbm(3), sdbm(3), gdbm(), or Berkeley DB file to an associative array. ASSOC is the name of the associative array. (Unlike normal open, the first argument is NOT a filehandle, even though it looks like one). DBNAME is the name of the database (without the .dir or .pag extension if any). If the database does not exist, it is created with protection specified by MODE (as modified by the umask()). If your system only supports the older DBM functions, you may perform only one dbmopen() in your program. In older versions of Perl, if your system had neither DBM nor ndbm, calling dbmopen() produced a fatal error; it now falls back to sdbm(3).
If you don't have write access to the DBM file, you can only read associative array variables, not set them. If you want to test whether you can write, either use file tests or try setting a dummy array entry inside an eval(), which will trap the error.
Note that functions such as keys() and values() may return huge array values when used on large DBM files. You may prefer to use the each() function to iterate over large DBM files. Example:
See also AnyDBM_File for a more general description of the pros and cons of the various dbm approaches, as well as DB_File for a particularly rich implementation.
When used on a hash array element, it tells you whether the value is defined, not whether the key exists in the hash. Use exists() for that.
Examples:
See also undef().
Note: many folks tend to overuse defined(), and then are surprised to discover that the number 0 and the null string are, in fact, defined concepts. For example, if you say
the pattern match succeeds, and $1 is defined, despite the fact that it matched "nothing". But it didn't really match nothing--rather, it matched something that happened to be 0 characters long. This is all very above-board and honest. When a function returns an undefined value, it's an admission that it couldn't give you an honest answer. So you should only use defined() when you're questioning the integrity of what you're trying to do. At other times, a simple comparison to 0 or "" is what you want.
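The point about zero-length matches can be checked with a sketch like this one; the string and pattern are chosen here for illustration:

```perl
use strict;
use warnings;

# The group (.*) has nothing left to match between "a" and "b",
# yet the match as a whole succeeds.
my $matched    = ("ab" =~ /a(.*)b/) ? 1 : 0;
my $capture    = $1;
my $is_defined = defined($capture) ? 1 : 0;
my $length     = length($capture);

print "matched=$matched defined=$is_defined length=$length\n";
```

Testing $capture eq "" (or its length) answers the question "did it capture any text?", which defined() does not.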
$ENV{} modifies the environment. Deleting from an array tied to a DBM file deletes the entry from the DBM file. (But deleting from a tie()d hash doesn't necessarily return anything.)
The following deletes all the values of an associative array:
(But it would be faster to use the undef() command.) Note that the EXPR can be arbitrarily complicated as long as the final operation is a hash key lookup:
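A minimal sketch, using a made-up nested structure, of a delete whose EXPR ends in a hash key lookup reached through a reference:

```perl
use strict;
use warnings;

my %config = (db => { host => "localhost", port => 5432 });
my $ref    = \%config;

# However we get there, the final operation is a hash key lookup.
my $removed = delete $ref->{db}{port};    # returns the deleted value

# Deleting a key that is not present simply returns undef.
my $missing = delete $config{no_such_key};

print "removed $removed; remaining keys: ", join(",", keys %{ $config{db} }), "\n";
```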
STDERR and exits with the current value of $! (errno). If $! is 0, exits with the value of ($? >> 8) (backtick `command` status). If ($? >> 8) is 0, exits with 255. Inside an eval(), the error message is stuffed into $@, and the eval() is terminated with the undefined value; this makes die() the way to raise an exception.
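A sketch of die() used as an exception inside eval(); the messages are invented for the example:

```perl
use strict;
use warnings;

# A message ending in a newline is stored in $@ exactly as given.
my $result = eval { die "out of cheese\n"; 1 };
my $error  = $@;

# Without the trailing newline, the script name and line number
# are appended to the message.
eval { die "no newline" };
my $error_with_location = $@;

print "caught: $error";
print "caught: $error_with_location";
```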
Equivalent examples:
If the value of EXPR does not end in a newline, the current script line number and input line number (if any) are also printed, and a newline is supplied. Hint: sometimes appending ", stopped" to your message will cause it to make better sense when the string "at foo line 123" is appended. Suppose you are running script "canasta".
produce, respectively
is just like
except that it's more efficient, more concise, keeps track of the current filename for error messages, and searches all the -I libraries if the file isn't in the current directory (see also the @INC array in Predefined Names). It's the same, however, in that it does reparse the file every time you call it, so you probably don't want to do this inside a loop.
Note that inclusion of library modules is better done with the use() and require() operators, which also do error checking and raise an exception if there's a problem.
Example:
See also keys() and values().
An eof without an argument uses the last file read as argument. Empty parentheses () may be used to indicate the pseudofile formed of the files listed on the command line, i.e. eof() is reasonable to use inside a while (<>) loop to detect the end of only the last file. Use eof(ARGV) or eof without the parentheses to test EACH file in a while (<>) loop. Examples:
Practical hint: you almost never need to use eof in Perl, because the input operators return undef when they run out of data.
If there is a syntax error or runtime error, or a die() statement is executed, an undefined value is returned by eval(), and $@ is set to the error message. If there was no error, $@ is guaranteed to be a null string. If EXPR is omitted, evaluates $_. The final semicolon, if any, may be omitted from the expression.
Note that, since eval() traps otherwise-fatal errors, it is useful for determining whether a particular feature (such as socket() or symlink()) is implemented. It is also Perl's exception trapping mechanism, where the die operator is used to raise exceptions.
If the code to be executed doesn't vary, you may use the eval-BLOCK form to trap run-time errors without incurring the penalty of recompiling each time. The error, if any, is still returned in $@.
Examples:
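A minimal sketch of the eval-BLOCK form trapping a run-time error; the variable names and values are assumptions for illustration:

```perl
use strict;
use warnings;

my ($numerator, $denominator) = (10, 0);

# The division dies at run time; eval returns undef and sets $@.
my $quotient = eval { $numerator / $denominator };
my $trapped  = $@;
print "caught: $trapped" if $trapped;

# On success the block's value is returned and $@ is the null string.
my $ok       = eval { $numerator / 2 };
my $no_error = $@;
```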
With an eval(), you should be especially careful to remember what's being looked at when:
Cases 1 and 2 above behave identically: they run the code contained in the variable $x. (Although case 2 has misleading double quotes making the reader wonder what else might be happening (nothing is).) Cases 3 and 4 likewise behave in the same way: they run the code <$x>, which does nothing at all. (Case 4 is preferred for purely visual reasons.) Case 5 is a place where normally you WOULD like to use double quotes, except that in that particular situation, you can just use symbolic references instead, as in case 6.
If there is more than one argument in LIST, or if LIST is an array with more than one value, calls execvp(3) with the arguments in LIST. If there is only one scalar argument, the argument is checked for shell metacharacters. If there are any, the entire argument is passed to /bin/sh -c for parsing. If there are none, the argument is split into words and passed directly to execvp(), which is more efficient. Note: exec() and system() do not flush your output buffer, so you may need to set $| to avoid lost output. Examples:
If you don't really want to execute the first argument, but want to lie to the program you are executing about its own name, you can specify the program you actually want to run as an "indirect object" (without a comma) in front of the LIST. (This always forces interpretation of the LIST as a multi-valued list, even if there is only a single scalar in the list.) Example:
or, more directly,
A hash element can only be TRUE if it's defined, and defined if it exists, but the reverse doesn't necessarily hold true.
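The three levels (exists, defined, TRUE) can be told apart with a sketch like the following, where the hash contents are sample data:

```perl
use strict;
use warnings;

my %h = (one => 1, zero => 0, nothing => undef);

my @report;
for my $key (qw(one zero nothing absent)) {
    push @report, join ",",
        $key,
        (exists  $h{$key} ? "exists"  : "missing"),
        (defined $h{$key} ? "defined" : "undef"),
        ($h{$key}         ? "true"    : "false");
}

print "$_\n" for @report;
```

Only the key one passes all three tests; zero exists and is defined but is not TRUE, nothing exists without being defined, and absent fails all three.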
Note that the EXPR can be arbitrarily complicated as long as the final operation is a hash key lookup:
END routines first, but the END routines may not abort the exit. Likewise any object destructors that need to be called are called before exit.) Example:
See also die(). If EXPR is omitted, exits with 0 status.
first to get the correct function definitions. Argument processing and value return works just like ioctl() below. Note that fcntl() will produce a fatal error if used on a machine that doesn't implement fcntl(2). For example:
Here's a mailbox appender for BSD systems.
    open(MBOX, ">>/usr/spool/mail/$ENV{'USER'}")
        or die "Can't open mailbox: $!";
    lock();
    print MBOX $msg, "\n\n";
    unlock();
See also DB_File for other flock() examples.
$| ($AUTOFLUSH in English) or call the autoflush() FileHandle method to avoid duplicate output.
If you fork() without ever waiting on your children, you will accumulate zombies:
There's also the double-fork trick (error checking on fork() returns omitted):
See also the perlipc manpage for more examples of forking and reaping moribund children.
    format Something =
        Test: @<<<<<<<< @||||| @>>>>>
              $str,     $%,    '$' . int($num)
    .

    $str = "widget";
    $num = $cost/$quantity;
    $~ = 'Something';
    write;
See the perlform manpage for many details and examples.
$^A (or $ACCUMULATOR in English). Eventually, when a write() is done, the contents of $^A are written to some filehandle, but you could also read $^A yourself and then set $^A back to "". Note that a format typically does one formline() per line of form, but the formline() function itself doesn't care how many newlines are embedded in the PICTURE. This means that the ~ and ~~ tokens will treat the entire PICTURE as a single line. You may therefore need to use multiple formlines to implement a single record format, just like the format compiler.
Be careful if you put double quotes around the picture, since an "@" character may be taken to mean the beginning of an array name. formline() always returns TRUE. See the perlform manpage for other examples.
Determination of whether $BSD_STYLE should be set is left as an exercise to the reader.
See also the Term::ReadKey module from your nearest CPAN site.
Do not consider getlogin() for authentication: it is not as secure as getpwuid().
(If the entry doesn't exist you get a null list.)
Within a scalar context, you get the name, unless the function was a lookup by name, in which case you get the other thing, whatever it is. (If the entry doesn't exist you get the undefined value.) For example:
The $members value returned by getgr*() is a space separated list of the login names of the members of the group.
For the gethost*() functions, if the h_errno variable is supported in C, it will be returned to you via $? if the function call fails. The @addrs value returned by a successful call is a list of the raw addresses returned by the corresponding system library call. In the Internet domain, each address is four bytes long and you can unpack it by saying something like:
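A sketch of unpacking one such four-byte address. No lookup is performed here; $raw is a stand-in built with pack(), assumed to have the same layout as an element of @addrs:

```perl
use strict;
use warnings;

# Stand-in for one raw address as returned in @addrs (assumption:
# four unsigned bytes, as described for the Internet domain).
my $raw = pack('C4', 127, 0, 0, 1);

# 'C4' reads four unsigned char values out of the string.
my @octets = unpack('C4', $raw);
my $dotted = join '.', @octets;

print "$dotted\n";
```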
All array elements are numeric, and come straight out of a struct tm. In particular this means that $mon has the range 0..11 and $wday has the range 0..6. If EXPR is omitted, does gmtime(time()).
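Because the values are raw struct tm fields, they usually need adjusting before display. The sketch below assumes the usual nine-element return list and the struct tm convention that the year is counted from 1900; the month-name table is made up for the example:

```perl
use strict;
use warnings;

# Time 0 is the epoch, so the fields are easy to predict.
my ($sec, $min, $hour, $mday, $mon, $year, $wday, $yday, $isdst) = gmtime(0);

# $mon is 0-based, so it can index a name table directly.
my @month_names = qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);
my $stamp = sprintf "%s %d, %d", $month_names[$mon], $mday, $year + 1900;

print "$stamp (weekday $wday)\n";
```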
The goto-EXPR form expects a label name, whose scope will be resolved dynamically. This allows for computed gotos per FORTRAN, but isn't necessarily recommended if you're optimizing for maintainability:
The goto-&NAME form is highly magical, and substitutes a call to the named subroutine for the currently running subroutine. This is used by AUTOLOAD subroutines that wish to load another subroutine and then pretend that the other subroutine had been called in the first place (except that any modifications to @_ in the current subroutine are propagated to the other subroutine.) After the goto, not even caller() will be able to tell that this routine was called first.
or equivalently,
Note that, since $_ is a reference into the list value, it can be used to modify the elements of the array. While this is useful and supported, it can cause bizarre results if the LIST is not a named array.
$[ variable to--but don't do that). If the substring is not found, returns one less than the base, ordinarily -1.
first to get the correct function definitions. If ioctl.ph doesn't exist or doesn't have the correct definitions you'll have to roll your own, based on your C header files such as <sys/ioctl.h>. (There is a Perl script called h2ph that comes with the Perl kit which may help you in this, but it's non-trivial.) SCALAR will be read and/or written depending on the FUNCTION--a pointer to the string value of SCALAR will be passed as the third argument of the actual ioctl call. (If SCALAR has no string value but does have a numeric value, that value will be passed rather than a pointer to the string value. To guarantee this to be TRUE, add a 0 to the scalar before using it.) The pack() and unpack() functions are useful for manipulating the values of structures used by ioctl(). The following example sets the erase character to DEL.
The return value of ioctl (and fcntl) is as follows:
Thus Perl returns TRUE on success and FALSE on failure, yet you can still easily determine the actual value returned by the operating system:
See split .
or how about sorted by key:
To sort an array by value, you'll need to use a sort{} function. Here's a descending numeric sort of a hash by its values:
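A minimal sketch of such a sort, with a made-up hash of counts:

```perl
use strict;
use warnings;

my %sales = (apples => 30, pears => 5, plums => 12);

# Compare the values, not the keys; putting $b first gives
# descending order.
my @ranked = sort { $sales{$b} <=> $sales{$a} } keys %sales;

printf "%-8s %d\n", $_, $sales{$_} for @ranked;
```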
Unlike in the shell, in Perl if the SIGNAL is negative, it kills process groups instead of processes. (On System V, a negative PROCESS number will also kill process groups, but that's not portable.) That means you usually want to use positive not negative signals. You may also use a signal name in quotes. See "Signals" for details.
break statement in C (as used in loops); it immediately exits the loop in question. If the LABEL is omitted, the command refers to the innermost enclosing loop. The continue block, if any, is not executed:
But you really probably want to be using my() instead, because local() isn't what most people think of as "local").
All array elements are numeric, and come straight out of a struct tm.
In particular this means that $mon has the range 0..11 and $wday has
the range 0..6. If EXPR is omitted, does localtime(time).
In a scalar context, prints out the ctime(3) value:
Also see the timelocal.pl library, and the strftime(3) function available
via the POSIX module.
translates a list of numbers to the corresponding characters. And
is just a funny way to write
Note that if there were a continue block on the above, it would get executed even on discarded lines. If the LABEL is omitted, the command refers to the innermost enclosing loop.
If EXPR is omitted, uses $_.
If the filename begins with "|", the filename is interpreted as a command to which output is to be piped, and if the filename ends with a "|", the filename is interpreted as a command which pipes input to us. See "Using open() for IPC" for more examples of this. (You may not have a raw open() to a command that pipes both in and out, but see open2, open3, and "Bidirectional Communication" for alternatives.)
Opening '-' opens STDIN and opening '>-' opens STDOUT. Open returns non-zero upon success, the undefined value otherwise. If the open involved a pipe, the return value happens to be the pid of the subprocess.
If you're unfortunate enough to be running Perl on a system that distinguishes between text files and binary files (modern operating systems don't care), then you should check out binmode for tips for dealing with this. The key distinction between systems that need binmode and those that don't is their text file formats. Systems like Unix and Plan9 that delimit lines with a single character, and that encode that character in C as '\n', do not need binmode. The rest need it.
Examples:
You may also, in the Bourne shell tradition, specify an EXPR beginning with ">&", in which case the rest of the string is interpreted as the name of a filehandle (or file descriptor, if numeric) which is to be duped and opened. You may use & after >, >>, <, +>, +>> and +<. The mode you specify should match the mode of the original filehandle. (Duping a filehandle does not take into account any existing contents of stdio buffers.)
Here is a script that saves, redirects, and restores STDOUT and
STDERR:
If you specify "<&=N", where N is a number, then Perl will do an equivalent of C's fdopen() of that file descriptor; this is more parsimonious of file descriptors. For example:
If you open a pipe on the command "-", i.e. either "|-" or "-|", then there is an implicit fork done, and the return value of open is the pid of the child within the parent process, and 0 within the child process. (Use defined($pid) to determine whether the open was successful.)
The filehandle behaves normally for the parent, but i/o to that
filehandle is piped from/to the STDOUT/STDIN of the child process.
In the child process the filehandle isn't opened--i/o happens from/to
the new STDOUT or STDIN. Typically this is used like the normal
piped open when you want to exercise more control over just how the
pipe command gets executed, such as when you are running setuid, and
don't want to have to scan shell commands for metacharacters.
The following pairs are more or less equivalent:
See "Safe Pipe Opens" for more examples of this.
Explicitly closing any piped filehandle causes the parent process to wait for the child to finish, and returns the status value in $?.
Using the FileHandle constructor from the FileHandle package, you can generate anonymous filehandles which have the scope of whatever variables hold references to them, and automatically close whenever and however you leave that scope:
The filename that is passed to open will have leading and trailing
whitespace deleted. In order to open a file with arbitrary weird
characters in it, it's necessary to protect any leading and trailing
whitespace thusly:
If you want a "real" C open() (see open(2) on your system), then you should use the sysopen() function. This is another way to protect your filenames from interpretation. For example:
See seek for some details about mixing reading and writing.
Each letter may optionally be followed by a number which gives a repeat count. With all types except "a", "A", "b", "B", "h", "H", and "P", the pack function will gobble up that many values from the LIST. A * for the repeat count means to use however many items are left. The "a" and "A" types gobble just one value, but pack it as a string of length count, padding with nulls or spaces as necessary. (When unpacking, "A" strips trailing spaces and nulls, but "a" does not.) Likewise, the "b" and "B" fields pack a string that many bits long. The "h" and "H" fields pack a string that many nybbles long. The "P" packs a pointer to a structure of the size indicated by the length. Real numbers (floats and doubles) are in the native machine format only; due to the multiplicity of floating formats around, and the lack of a standard "network" representation, no facility for interchange has been made. This means that packed floating point data written on one machine may not be readable on another - even if both use IEEE floating point arithmetic (as the endian-ness of the memory representation is not part of the IEEE spec). Note that Perl uses doubles internally for all numeric calculation, and converting from double into float and thence back to double again will lose precision (i.e. unpack("f", pack("f", $foo)) will not in general equal $foo).
Examples:
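A few sketches of the repeat-count rules described above. The "C" (unsigned char) and "n" (16-bit big-endian) template letters are assumed from the standard pack template table, and all input values are arbitrary:

```perl
use strict;
use warnings;

# "A" and "a" take one value and pad it out to the count.
my $padded_spaces = pack("A5", "hi");              # space padded
my $padded_nulls  = pack("a5", "hi");              # null padded
my $stripped      = unpack("A5", $padded_spaces);  # padding stripped again

# "B" counts bits and "H" counts nybbles of the input string.
my $from_bits    = pack("B8", "01000001");
my $from_nybbles = pack("H4", "4142");

# Other types take one LIST value per count; * means "all the rest".
my $from_chars = pack("C*", 72, 105);
my $network    = pack("n", 258);

# A float round trip generally loses precision on typical builds.
my $round_trip = unpack("f", pack("f", 0.1));
print "0.1 became $round_trip\n";
```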
The same template may generally also be used in the unpack function.
See "Packages" for more information about packages, modules, and classes. See the perlsub manpage for other scoping issues.
See open2, open3, and "Bidirectional Communication" for examples of such things.
If there are no elements in the array, returns the undefined value.
If ARRAY is omitted, pops the @ARGV array in the main program, and the @_ array in subroutines, just like shift().
Note that if you're storing FILEHANDLES in an array or other expression,
you will have to use a block returning its value instead:
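A sketch of the block form. To stay self-contained it writes to in-memory filehandles and uses the three-argument open, both of which are assumptions of this example (they need a reasonably modern Perl) rather than anything the text requires:

```perl
use strict;
use warnings;

# Two in-memory "files", with their handles kept in an array.
my @buffers = ("", "");
my @files;
for my $n (0 .. 1) {
    open($files[$n], '>', \$buffers[$n]) or die "open failed: $!";
}

# The handle comes from an expression, so wrap it in a block.
my $i = 1;
print { $files[$i] } "stuff\n";

close($_) for @files;
```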
but is more efficient. Returns the new number of elements in the array.
(Note: if your rand function consistently returns numbers that are too
large or too small, then your version of Perl was probably compiled
with the wrong number of RANDBITS. As a workaround, you can usually
multiply EXPR by the correct power of 2 to get the range you want.
This will make your script unportable, however. It's better to recompile
if you can.)
If you're planning to filetest the return values out of a readdir(), you'd better prepend the directory in question. Otherwise, since we didn't chdir() there, it would have been testing the wrong file.
If the referenced object has been blessed into a package, then that package name is returned instead. You can think of ref() as a typeof() operator. See also the perlref manpage.
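A short sketch of ref() acting as a typeof() operator; Widget is a hypothetical package name used only for the blessed case:

```perl
use strict;
use warnings;

# Built-in reference types.
my @kinds = map { ref($_) } (\1, [1, 2], { a => 1 }, sub { 1 }, \\1);

# A blessed reference reports its package instead.
my $object = bless {}, 'Widget';
my $class  = ref($object);

# Something that is not a reference gives the null string.
my $plain = ref("just a string");

print "@kinds $class\n";
```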
Otherwise, demands that a library file be included if it hasn't already been included. The file is included via the do-FILE mechanism, which is essentially just a variety of eval(). Has semantics similar to the following subroutine:
Note that the file will not be included twice under the same specified name. The file must return TRUE as the last statement to indicate successful execution of any initialization code, so it's customary to end such a file with "1;" unless you're sure it'll return TRUE otherwise. But it's better just to put the "1;".
If EXPR is a bare word, the require assumes a ".pm" extension and replaces "::" with "/" in the filename for you, to make it easy to load standard modules. This form of loading of modules does not risk altering your namespace.
For a yet-more-powerful import facility, see use and the perlmod manpage.
Resetting "A-Z" is not recommended since you'll wipe out your ARGV and ENV arrays. Only resets package variables--lexical variables are unaffected, but they clean themselves up on scope exit anyway, so you'll probably want to use them instead. See my.
There is no equivalent operator to force an expression to
be interpolated in a list context because it's in practice never
needed. If you really wanted to do so, however, you could use
the construction
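One construction commonly used for this purpose is @{[ ... ]}: an anonymous array built in list context and dereferenced on the spot. It is assumed here to be the kind of construction the text has in mind; the data is made up:

```perl
use strict;
use warnings;

my @items = (1, 2, 3);

# Inside a string, @{[ ... ]} evaluates the expression in list
# context and interpolates the resulting list.
my $text = "count: @{[ scalar @items ]}, doubled: @{[ map { $_ * 2 } @items ]}";

print "$text\n";
```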
On some systems you have to do a seek whenever you switch between reading and writing. Amongst other things, this may have the effect of calling stdio's clearerr(3). A "whence" of 1 (SEEK_CUR) is useful for not moving the file pointer:
This is also useful for applications emulating
If that doesn't work (some stdios are particularly cantankerous), then you may need something more like this:
FILEHANDLE may be an expression whose value gives the name of the
actual filehandle. Thus:
Some programmers may prefer to think of filehandles as objects with
methods, preferring to write the last example as:
If you want to select on many filehandles you might wish to write a
subroutine:
The usual idiom is:
or to block until something becomes ready just do this
Most systems do not bother to return anything useful in $timeleft, so calling select() in a scalar context just returns $nfound.
Any of the bitmasks can also be undef. The timeout, if specified, is
in seconds, which may be fractional. Note: not all implementations are
capable of returning the $timeleft. If not, they always return
$timeleft equal to the supplied $timeout.
You can effect a 250-millisecond sleep this way:
WARNING: Do not attempt to mix buffered I/O (like read() or <FH>) with select(). You have to use sysread() instead.
To signal the semaphore, replace "-1" with "1".
On some older systems, it may sleep up to a full second less than what
you requested, depending on how it counts seconds. Most modern systems
always sleep the full amount.
For delays of finer granularity than one second, you may use Perl's syscall() interface to access setitimer(2) if your system supports it, or else see select below.
In the interests of efficiency the normal calling code for subroutines is
bypassed, with the following effects: the subroutine may not be a
recursive subroutine, and the two elements to be compared are passed into
the subroutine not via @_ but as the package global variables $a and
$b (see example below). They are passed by reference, so don't
modify $a and $b. And don't try to declare them as lexicals either.
Examples:
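A minimal sketch of comparison code that reads the package globals $a and $b; the data and the by_length name are made up for illustration:

```perl
use strict;
use warnings;

my @numbers = (10, 9, 100, 1);

# Inline blocks: numeric ascending, then numeric descending.
my @ascending  = sort { $a <=> $b } @numbers;
my @descending = sort { $b <=> $a } @numbers;

# A named comparison subroutine also reads $a and $b, not @_.
sub by_length { length($a) <=> length($b) or $a cmp $b }
my @words = sort by_length qw(pear fig banana kiwi);

print "@ascending | @descending | @words\n";
```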
If you're using strict, you MUST NOT declare $a
and $b as lexicals. They are package globals. That means
if you're in the
or just
but if you're in the
Example, assuming array lengths are passed before arrays:
If not in a list context, returns the number of fields found and splits into the @_ array. (In a list context, you can force the split into @_ by using
If EXPR is omitted, splits the $_ string. If PATTERN is also omitted, splits on whitespace (after skipping any leading whitespace). Anything matching PATTERN is taken to be a delimiter separating the fields. (Note that the delimiter may be longer than one character.) If LIMIT is specified and is not negative, splits into no more than that many fields (though it may split into fewer). If LIMIT is unspecified, trailing null fields are stripped (which potential users of pop() would do well to remember). If LIMIT is negative, it is treated as if an arbitrarily large LIMIT had been specified.
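The effect of LIMIT and of the defaults can be sketched as follows, using made-up input strings:

```perl
use strict;
use warnings;

# A positive LIMIT caps the number of fields; the rest stays unsplit.
my @limited = split(/:/, "a:b:c:d", 2);

# With no LIMIT, trailing null fields are stripped ...
my @stripped = split(/,/, "a,b,,,");

# ... while a negative LIMIT keeps them.
my @kept = split(/,/, "a,b,,,", -1);

# With no arguments at all, $_ is split on whitespace and leading
# whitespace is skipped.
my @words;
for ("  two words ") { @words = split }

print scalar(@limited), " ", scalar(@stripped), " ", scalar(@kept), " @words\n";
```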
A pattern matching the null string (not to be confused with
a null pattern
produces the output 'h:i:t:h:e:r:e'.
The LIMIT parameter can be used to partially split a line
When assigning to a list, if LIMIT is omitted, Perl supplies a LIMIT
one larger than the number of variables in the list, to avoid
unnecessary work. For the list above LIMIT would have been 4 by
default. In time critical applications it behooves you not to split
into more fields than you really need.
If the PATTERN contains parentheses, additional array elements are
created from each matching substring in the delimiter.
produces the list value
If you had the entire header of a normal Unix email message in $header,
you could split it up into fields and their values this way:
The pattern
As a special case, specifying a PATTERN of space (
Example:
(Note that $shell above will still have a newline on it. See chop, chomp, and join.)
Not all fields are supported on all filesystem types. Here are the meanings of the fields:
(The epoch was at 00:00 January 1, 1970 GMT.)
If stat is passed the special filehandle consisting of an underline, no
stat is done, but the current contents of the stat structure from the
last stat or filetest are returned. Example:
(This only works on machines for which the device number is negative under NFS.)
For example, here is a loop which inserts index-producing entries before any line containing a certain pattern:
In searching for /\bfoo\b/, only those locations in $_ that contain "f" will be looked at, because "f" is rarer than "o". In general, this is a big win except in pathological cases. The only question is whether it saves you more time than it took to build the linked list in the first place.
Note that if you have to look for strings that you don't know till
runtime, you can build an entire loop as a string and eval that to
avoid recompiling all your patterns all the time. Together with
undefining $/ to input entire files as one record, this can be very
fast, often faster than specialized programs like fgrep(1). The following
scans a list of files (@files) for a list of words (@words), and prints
out the names of those files that contain a match:
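A minimal sketch of the build-a-loop-and-eval technique follows. The file names, file contents, and word list are hypothetical and are created in a temporary directory so that the example is self-contained; the call to study is left out.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical input files, written to a scratch directory.
my $dir = tempdir(CLEANUP => 1);
my %content = (
    'a.txt' => "the quick brown fox\n",
    'b.txt' => "lazy dogs sleep\n",
    'c.txt' => "foxes\n",
);
my @files;
for my $name (sort keys %content) {
    my $path = "$dir/$name";
    open(my $out, '>', $path) or die "cannot write $path: $!";
    print {$out} $content{$name};
    close($out) or die "close failed: $!";
    push @files, $path;
}
my @words = qw(fox dog);

# Build the loop body as a string: one constant pattern per word, so
# each pattern is compiled once when the string is evaled.
my %seen;
my $search = 'while (<>) {';
for my $word (@words) {
    my $quoted = quotemeta $word;
    $search .= "\$seen{\$ARGV}++ if /\\b$quoted\\b/;\n";
}
$search .= '}';

{
    local @ARGV = @files;  # files for <> to read
    local $/;              # undefined: each file is read as one record
    eval $search;
    die $@ if $@;
}

my @matched = sort keys %seen;
print "$_\n" for @matched;
```

Only the first file contains one of the words as a whole word, so only its name is printed.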
You can use the substr() function
as an lvalue, in which case EXPR must be an lvalue. If you assign
something shorter than LEN, the string will shrink, and if you assign
something longer than LEN, the string will grow to accommodate it. To
keep the string the same length you may need to pad or chop your value
using sprintf().
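The three cases can be sketched like this; the strings are illustrative only.

```perl
use strict;
use warnings;

# Shorter replacement: the string shrinks.
my $shrunk = "Hello, world";
substr($shrunk, 7, 5) = "Perl";

# Longer replacement: the string grows.
my $grown = "Hello, world";
substr($grown, 7, 5) = "everybody";

# Pad (or truncate) to exactly 5 characters to keep the length fixed.
my $fixed = "Hello, world";
substr($fixed, 7, 5) = sprintf("%-5.5s", "Perl");

printf "%-18s length %d\n", "'$_'", length $_ for $shrunk, $grown, $fixed;
```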
Note that Perl only supports passing of up to 14 arguments to your system call,
which in practice should usually suffice.
The possible values and flag bits of the MODE parameter are
system-dependent; they are available via the standard module Fcntl.
However, for historical reasons, some values are universal: zero means
read-only, one means write-only, and two means read/write.
If the file named by FILENAME does not exist and the open call
creates it (typically because MODE includes the O_CREAT flag), then
the value of PERMS specifies the permissions of the newly created
file. If PERMS is omitted, the default value is 0666, which allows
read and write for all. This default is reasonable: see umask.
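A small sketch of sysopen with flags taken from Fcntl follows. The file name is hypothetical, and the permissions actually printed depend on the current umask.

```perl
use strict;
use warnings;
use Fcntl qw(O_WRONLY O_CREAT O_EXCL);
use File::Temp qw(tempdir);

# Hypothetical file in a scratch directory.
my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/example.log";

# Create the file for writing, failing if it already exists.
# PERMS of 0666 is modified by the process umask.
sysopen(my $fh, $path, O_WRONLY | O_CREAT | O_EXCL, 0666)
    or die "sysopen failed: $!";
print {$fh} "created\n";
close($fh) or die "close failed: $!";

my $perms = (stat($path))[2] & 07777;
printf "permissions after umask: %04o\n", $perms;
```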
Note that functions such as keys() and values() may return huge array
values when used on large objects, like DBM files. You may prefer to
use the each() function to iterate over such. Example:
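As a sketch of the each() idiom, the loop below walks a hash one pair at a time. An ordinary in-memory hash stands in for a tied DBM file here, which is an assumption made to keep the example self-contained.

```perl
use strict;
use warnings;

# Stand-in for a large tied hash such as a DBM file.
my %count = (apple => 3, pear => 1, plum => 7);

my $total = 0;
# each returns one key/value pair per call, so no full list of keys
# or values is ever built in memory.
while (my ($key, $value) = each %count) {
    print "$key = $value\n";
    $total += $value;
}
print "total = $total\n";
```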
A class implementing an associative array should have the following
methods:
A class implementing an ordinary array should have the following methods:
A class implementing a scalar should have the following methods:
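As a minimal sketch of the scalar case, the hypothetical class UpperCase below implements TIESCALAR, FETCH, STORE and DESTROY and upper-cases every value stored through the tied variable.

```perl
use strict;
use warnings;

package UpperCase;   # hypothetical class name, for illustration only

# Constructor called by tie; extra tie arguments arrive after the class.
sub TIESCALAR {
    my ($class, $initial) = @_;
    my $value = uc($initial // '');
    return bless \$value, $class;
}

# Called whenever the tied variable is read.
sub FETCH {
    my ($self) = @_;
    return $$self;
}

# Called whenever the tied variable is assigned to.
sub STORE {
    my ($self, $new) = @_;
    $$self = uc $new;
    return;
}

sub DESTROY { }      # nothing to clean up in this sketch

package main;

tie my $shout, 'UpperCase', 'hello';
print "$shout\n";    # HELLO
$shout = 'quiet words';
print "$shout\n";    # QUIET WORDS
```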
Unlike dbmopen(), the tie() function will not use or require a module
for you--you need to do that explicitly yourself. See DB_File
or the Config module for interesting tie() implementations.
Note: unlink will not delete directories unless you are superuser and
the -U flag is supplied to Perl. Even if these conditions are
met, be warned that unlinking a directory can inflict damage on your
filesystem. Use rmdir instead.
and then there's
In addition, you may prefix a field with a %<number> to indicate that
you want a <number>-bit checksum of the items instead of the items
themselves. Default is a 16-bit checksum. For example, the following
computes the same number as the System V sum program:
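A sketch of the 16-bit checksum idiom follows. It sums fixed sample lines instead of reading input, which is an assumption made so that the example runs on its own.

```perl
use strict;
use warnings;

# Hypothetical input lines; a real program would read them from <>.
my @lines = ("abc\n", "hello\n");

my $checksum = 0;
for my $line (@lines) {
    # %16 asks for a 16-bit checksum of the unsigned-char values
    # instead of the values themselves.
    $checksum += unpack("%16C*", $line);
}
$checksum %= 65536;

print "checksum: $checksum\n";
```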
The following efficiently counts the number of set bits in a bit vector:
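A sketch of the bit-counting idiom: the vector $mask below is hypothetical and has three bits set with vec.

```perl
use strict;
use warnings;

# Build a small bit vector with bits 0, 3 and 9 set.
my $mask = '';
vec($mask, $_, 1) = 1 for (0, 3, 9);

# %32b* sums the individual bits, which counts the set bits.
my $setbits = unpack("%32b*", $mask);

print "set bits: $setbits\n";   # set bits: 3
```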
Note the LIST is prepended whole, not one element at a time, so the
prepended elements stay in the same order. Use reverse to do the
reverse.
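The ordering can be checked with a short sketch; the array contents are illustrative only.

```perl
use strict;
use warnings;

# Prepended as a whole: the new elements keep their order.
my @queue = (4, 5);
unshift(@queue, 1, 2, 3);

# Reversing the list first gives the one-at-a-time ordering.
my @stacked = (4, 5);
unshift(@stacked, reverse(1, 2, 3));

print "@queue\n";     # 1 2 3 4 5
print "@stacked\n";   # 3 2 1 4 5
```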
except that Module must be a bare word.
If the first argument to use is a number, it is treated as a version
number instead of a module name. If the version of the Perl interpreter
is less than VERSION, then an error message is printed and Perl exits
immediately. This is often useful if you need to check the current
Perl version before use-ing library modules which have changed in
incompatible ways from older versions of Perl. (We try not to do
this more than we have to.)
The BEGIN forces the require and import to happen at compile time. The
require makes sure the module is loaded into memory if it hasn't been
yet. The import is not a builtin--it's just an ordinary static method
call into the ``Module'' package to tell the module to import the list of
features back into the current package. The module can implement its
import method any way it likes, though most modules just choose to
derive their import method via inheritance from the Exporter class that
is defined in the Exporter module. See Exporter.
If you don't want your namespace altered, explicitly supply an empty list,
as in use Module (). That is exactly equivalent to
BEGIN { require Module; }.
If the VERSION argument is present between Module and LIST, then the
use will fail if the $VERSION variable in package Module is
less than VERSION.
Because this is a wide-open interface, pragmas (compiler directives)
are also implemented this way. Currently implemented pragmas are:
These pseudomodules import semantics into the current block scope, unlike
ordinary modules, which import symbols into the current package (which are
effective through the end of the file).
There's a corresponding ``no'' command that unimports meanings imported
by use, i.e. it calls unimport Module LIST instead of import.
See the perlmod manpage for a list of standard modules and pragmas.
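The block scoping of pragmas, and the effect of ``no'', can be sketched with the integer pragma. The variable names are illustrative only.

```perl
use strict;
use warnings;

my $outside = 10 / 3;         # ordinary floating-point division

my ($inside, $restored);
{
    use integer;              # in effect only to the end of this block
    $inside = 10 / 3;         # integer division

    {
        no integer;           # switched off again for the inner block
        $restored = 10 / 3;
    }
}

print "outside=$outside inside=$inside restored=$restored\n";
```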
Vectors created with vec() can also be manipulated with the logical
operators |, & and ^, which will assume a bit vector operation is
desired when both operands are strings.
To transform a bit vector into a string or array of 0's and 1's, use these:
If you know the exact length in bits, it can be used in place of the *.
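The points above about vec() can be sketched together: building vectors, combining them with |, and unpacking them with the b template. The vectors are hypothetical.

```perl
use strict;
use warnings;

# Vector with bits 0 and 3 set.
my $vector = '';
vec($vector, 0, 1) = 1;
vec($vector, 3, 1) = 1;

# Whole vector as a string and as an array of 0's and 1's.
my $bits = unpack("b*", $vector);
my @bits = split(//, unpack("b*", $vector));

# Exact length in bits in place of the *.
my $first4 = unpack("b4", $vector);

# Both operands are strings, so | works bit by bit on the vectors.
my $other = '';
vec($other, 1, 1) = 1;
my $union = unpack("b*", $vector | $other);

print "$bits $first4 $union\n";   # 10010000 1001 11010000
```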
then you can do a non-blocking wait for any process. Non-blocking wait
is only available on machines supporting either the waitpid(2) or
wait4(2) system calls. However, waiting for a particular pid with
FLAGS of 0 is implemented everywhere. (Perl emulates the system call
by remembering the status values of processes that have exited but have
not been harvested by the Perl script yet.)
Top of form processing is handled automatically: if there is
insufficient room on the current page for the formatted record, the
page is advanced by writing a form feed, a special top-of-page format
is used to format the new page header, and then the record is written.
By default the top-of-page format is the name of the filehandle with
``_TOP'' appended, but it may be dynamically set to the format of your
choice by assigning the name to the $^ variable while the filehandle is
selected. The number of lines remaining on the current page is in
variable $-, which can be set to 0 to force a new page.
If FILEHANDLE is unspecified, output goes to the current default output
channel, which starts out as STDOUT but may be changed by the
select operator. If the FILEHANDLE is an EXPR, then the expression
is evaluated and the resulting string is used to look up the name of
the FILEHANDLE at run time. For more on formats, see the perlform manpage.
Note that write is NOT the opposite of read. Unfortunately.
$! (errno).
open(LOG, '>>/usr/spool/news/twitlog'); # (log is reserved)
open(DBASE, '+
$?.
Note: on any operation which may do a fork, unflushed buffers remain
unflushed in both processes, which means you may need to set $| to
avoid duplicate output.
$Package::Variable. If the package name is null, the main
package is assumed. That is, $::sail is equivalent to $main::sail.
$| to flush your WRITEHANDLE
after each command, depending on the application.
m//g search left off for the variable
in question. May be modified to change that offset.
$! (errno). If EXPR is
omitted, uses $_.
$] or $PERL_VERSION) be equal or greater than EXPR.
``1;'', in case you add more
statements.
$! (errno). If
FILENAME is omitted, uses $_.
@{[ (some expression) ]}, but usually a simple
(some expression) suffices.
tail -f. Once you hit
EOF on your read, and then sleep for a while, you might have to stick in a
seek() to reset things. First the simple trick listed above to clear the
filepointer. The seek() doesn't change the current position, but it
does clear the end-of-file condition on the handle, so that the next
<FILE> makes Perl try again to read something. Hopefully.
main package, it's
FooPack package, it's
$[ == 0):
?? as the pattern delimiters, but it still returns the array
value.) The use of implicit split to @_ is deprecated, however.
//, which is just one member of the set of patterns
matching a null string) will split the value of EXPR into separate
characters at each point it matches that way. For example:
/PATTERN/ may be replaced with an expression to specify
patterns that vary at runtime. (To do runtime compilation only once,
use /$variable/o.)
' ') will split on
white space just as split with no arguments does. Thus, split(' ') can
be used to emulate awk's default behavior, whereas split(/ /)
will give you as many null initial fields as there are leading spaces.
A split on /\s+/ is like a split(' ') except that any leading
whitespace produces a null first field. A split with no arguments
really does a split(' ', $_) internally.
$_ if unspecified) in anticipation of
doing many pattern matches on the string before it is next modified.
This may or may not save time, depending on the nature and number of
patterns you are searching on, and on the distribution of character
frequencies in the string to be searched--you probably want to compare
runtimes with and without it to see which runs faster. Those loops
which scan for many short constant strings (including the constant
parts of more complex patterns) will benefit most. You may have only
one study active at a time--if you study a different scalar the first
is ``unstudied''. (The way study works is this: a linked list of every
character in the string to be searched is made, so we know, for
example, where all the 'k' characters are. From each search string,
the rarest character is selected, based on some static frequency tables
constructed from some C programs and English text. Only those places
that contain this ``rarest'' character are examined.)
@files) for a list of words (@words), and prints
out the names of those files that contain a match:
Fcntl.
However, for historical reasons, some values are universal: zero means
read-only, one means write-only, and two means read/write.
$VERSION variable in package Module is
less than VERSION.
unimport Module LIST instead of import.
$?.
$?. If you say
$~ variable.
$^ variable while the filehandle is
selected. The number of lines remaining on the current page is in
variable $-, which can be set to 0 to force a new page.