Perl: use(), require(), do(), %INC and @INC

Posted: September 12, 2011

Reposted from: http://perl.apache.org/docs/general/perl_reference/perl_reference.html

use(), require(), do(), %INC and @INC Explained

The @INC array

@INC is a special Perl variable which is the equivalent of the shell's PATH variable. Whereas PATH contains a list of directories to search for executables, @INC contains a list of directories from which Perl modules and libraries can be loaded.

When you use(), require() or do() a filename or a module, Perl gets a list of directories from the @INC variable and searches them for the file it was requested to load. If the file that you want to load is not located in one of the listed directories, you have to tell Perl where to find the file. You can either provide a path relative to one of the directories in @INC, or you can provide the full path to the file.
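
The directory search described above can be approximated in a few lines. This is only a sketch: find_in_inc is a hypothetical helper, not something Perl provides, and the real lookup also handles cases such as hooks stored in @INC, which are simply skipped here.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper (not part of Perl): return the first path under @INC
# at which $relative_file exists, roughly mirroring Perl's search order.
sub find_in_inc {
    my ($relative_file) = @_;
    for my $dir (@INC) {
        next if ref $dir;    # @INC may also hold hooks (references); skip them
        my $candidate = File::Spec->catfile($dir, $relative_file);
        return $candidate if -f $candidate;
    }
    return;                  # not found in any listed directory
}

my $path = find_in_inc('File/Spec.pm');
print defined $path ? "found: $path\n" : "not found\n";
```

Because the loop returns at the first match, a file that exists in several @INC directories is always taken from the earliest one.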

The %INC hash

%INC is another special Perl variable that is used to cache the names of the files and the modules that were successfully loaded and compiled by use(), require() or do() statements. Before attempting to load a file or a module with use() or require(), Perl checks whether it's already in the %INC hash. If it's there, the loading and therefore the compilation are not performed at all. Otherwise the file is loaded into memory and an attempt is made to compile it. do() does unconditional loading--no lookup in the %INC hash is made.

If the file is successfully loaded and compiled, a new key-value pair is added to %INC. The key is the name of the file as it was passed to one of the three functions we have just mentioned (a module name is first converted to its file name, so strict becomes strict.pm). If the file was found in any of the @INC directories except ".", the value is the full path to it in the file system.

The following examples will make it easier to understand the logic.

First, let's see the contents of @INC on my system:

  % perl -e 'print join "\n", @INC'
  /usr/lib/perl5/5.00503/i386-linux
  /usr/lib/perl5/5.00503
  /usr/lib/perl5/site_perl/5.005/i386-linux
  /usr/lib/perl5/site_perl/5.005
  .

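Notice that . (the current directory) is the last directory in the list. Since Perl 5.26, . is no longer part of the default @INC, so on a modern Perl the examples below that load test.pm from the current directory need it to be added explicitly.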

Now let's load the module strict.pm and see the contents of %INC:

  % perl -e 'use strict; print map {"$_ => $INC{$_}\n"} keys %INC'
  
  strict.pm => /usr/lib/perl5/5.00503/strict.pm

Since strict.pm was found in the /usr/lib/perl5/5.00503/ directory and /usr/lib/perl5/5.00503/ is a part of @INC, %INC includes the full path as the value for the key strict.pm.

Now let's create the simplest module in /tmp/test.pm:

  test.pm
  -------
  1;

It does nothing, but returns a true value when loaded. Now let's load it in different ways:

  % cd /tmp
  % perl -e 'use test; print map {"$_ => $INC{$_}\n"} keys %INC'
  
  test.pm => test.pm

Since the file was found relative to . (the current directory), the relative path is inserted as the value. If we alter @INC, by adding /tmp to the end:

  % cd /tmp
  % perl -e 'BEGIN{push @INC, "/tmp"} use test; \
  print map {"$_ => $INC{$_}\n"} keys %INC'
  
  test.pm => test.pm

Here we still get the relative path, since the module was found first relative to ".". The directory /tmp was placed after . in the list. If we execute the same code from a different directory, the "." directory won't match,

  % cd /
  % perl -e 'BEGIN{push @INC, "/tmp"} use test; \
  print map {"$_ => $INC{$_}\n"} keys %INC'
  
  test.pm => /tmp/test.pm

so we get the full path. We can also prepend the path with unshift(), so it will be used for matching before "." and therefore we will get the full path as well:

  % cd /tmp
  % perl -e 'BEGIN{unshift @INC, "/tmp"} use test; \
  print map {"$_ => $INC{$_}\n"} keys %INC'
  
  test.pm => /tmp/test.pm

The code:

  BEGIN{unshift @INC, "/tmp"}

can be replaced with the more elegant:

  use lib "/tmp";

This is almost equivalent to our BEGIN block and is the recommended approach.

These approaches to modifying @INC can be labor intensive, since if you want to move the script around in the file-system you have to modify the path. This can be painful, for example, when you move your scripts from development to a production server.

There is a module called FindBin which solves this problem in the plain Perl world, but unfortunately up until perl 5.9.1 it won't work under mod_perl, since it's a module and, like any module, it's loaded only once. So the first script using it will have all the settings correct, but the rest of the scripts will not if they are located in a different directory from the first. Perl 5.9.1 provides a new function, FindBin::again, which will do the right thing. The CPAN module FindBin::Real also provides an alternative that works under mod_perl.

For the sake of completeness, I'll present the FindBin module anyway.

If you use this module, you don't need to write a hard coded path. The following snippet does all the work for you (the file is /tmp/load.pl):

  load.pl
  -------
  #!/usr/bin/perl
  
  use FindBin ();
  use lib "$FindBin::Bin";
  use test;
  print "test.pm => $INC{'test.pm'}\n";

In the above example $FindBin::Bin is equal to /tmp. If we move the script somewhere else, e.g. to /tmp/new_dir, $FindBin::Bin becomes /tmp/new_dir.

  % /tmp/load.pl
  
  test.pm => /tmp/test.pm

This is just like use lib except that no hard coded path is required.
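
A common variation, shown here only as a sketch, keeps the modules in a subdirectory next to the script. The lib/ directory name is an assumed project layout, not something FindBin requires.

```perl
use strict;
use warnings;

use FindBin qw($Bin);    # $Bin is set at compile time to the script's directory
use lib "$Bin/lib";      # assumed layout: modules live in lib/ beside the script

print "script directory: $Bin\n";
print "search path now includes: $Bin/lib\n"
    if grep { $_ eq "$Bin/lib" } @INC;
```

Since both statements run at compile time, any use() that follows them can already find modules stored under that directory.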

You can use this workaround to make it work under mod_perl.

  do 'FindBin.pm';
  unshift @INC, "$FindBin::Bin";
  require test;
  #maybe test::import( ... ) here if need to import stuff

This has a slight overhead because it will load from disk and recompile the FindBin module on each request. So it may not be worth it.

Modules, Libraries and Program Files

Before we proceed, let's define what we mean by module, library and program file.

  • Libraries

    These are files which contain Perl subroutines and other code.

    When these are used to break up a large program into manageable chunks they don't generally include a package declaration; when they are used as subroutine libraries they often do have a package declaration.

    Their last statement must return a true value. A simple 1; statement ensures that.

    They can be named in any way desired, but generally their extension is .pl.

    Examples:

      config.pl
      ----------
      # No package so defaults to main::
      $dir = "/home/httpd/cgi-bin";
      $cgi = "/cgi-bin";
      1;
    
      mysubs.pl
      ----------
      # No package so defaults to main::
      sub print_header{
        print "Content-type: text/plain\r\n\r\n";
      }
      1;
    
      web.pl
      ------------
      package web ;
      # Call like this: web::print_with_class('loud',"Don't shout!");
      sub print_with_class{
        my ( $class, $text ) = @_ ;
        print qq{<span class="$class">$text</span>};
      }
      1;

  • Modules

    A module is a file which contains Perl subroutines and other code.

    It generally declares a package name at the beginning of it.

    Modules are generally used either as function libraries (a role for which .pl files are still used, though less commonly), or as object libraries where a module is used to define a class and its methods.

    Its last statement returns true.

    The naming convention requires it to have a .pm extension.

    Example:

      My/Module.pm
      ------------
      package My::Module;
      $My::Module::VERSION = 0.01;
      
      sub new{ return bless {}, shift;}
      END { print "Quitting\n"}
      1;

  • Program Files

    Many Perl programs exist as a single file. Under Linux and other Unix-like operating systems the file often has no suffix, since the operating system can determine that it is a Perl script from the first line (the shebang line). If it's Apache that executes the code, there are a variety of ways to tell it how and when the file should be executed. Under Windows a suffix is normally used, for example .pl or .plx.

    The program file will normally require() any libraries and use() any modules it requires for execution.

    It will contain Perl code but won't usually have any package names.

    Its last statement may return anything or nothing.

require()

require() reads a file containing Perl code and compiles it. Before attempting to load the file it looks up the argument in %INC to see whether it has already been loaded. If it has, require() just returns without doing a thing. Otherwise an attempt will be made to load and compile the file.

require() has to find the file it has to load. If the argument is a full path to the file, it just tries to read it. For example:

  require "/home/httpd/perl/mylibs.pl";

If the path is relative, require() will attempt to search for the file in all the directories listed in @INC. For example:

  require "mylibs.pl";

If there is more than one occurrence of the file with the same name in the directories listed in @INC the first occurrence will be used.

The file must return TRUE as the last statement to indicate successful execution of any initialization code. Since you never know what changes the file will go through in the future, you cannot be sure that the last statement will always return TRUE. That's why the suggestion is to put "1;" at the end of the file.

Although you should use the real filename for most files, if the file is a module, you may use the following convention instead:

  require My::Module;

This is equal to:

  require "My/Module.pm";

If require() fails to load the file, either because it couldn't find the file in question, or the code failed to compile, or it didn't return TRUE, then the program will die(). To prevent this, the require() statement can be enclosed in an eval() exception-handling block, as in this example:

  require.pl
  ----------
  #!/usr/bin/perl -w
  
  eval { require "/file/that/does/not/exists"};
  if ($@) {
    print "Failed to load, because : $@"
  }
  print "\nHello\n";

When we execute the program:

  % ./require.pl
  
  Failed to load, because : Can't locate /file/that/does/not/exists in
  @INC (@INC contains: /usr/lib/perl5/5.00503/i386-linux
  /usr/lib/perl5/5.00503 /usr/lib/perl5/site_perl/5.005/i386-linux
  /usr/lib/perl5/site_perl/5.005 .) at require.pl line 3.
  
  Hello

We see that the program didn't die(), because Hello was printed. This trick is useful when you want to check whether a user has some module installed but its absence is not critical: perhaps the program can run without this module with reduced functionality.
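
The same trick can be wrapped in a small helper when the module name is only known at run time. This is a sketch: has_module is a hypothetical name, and No::Such::Module::Here is a made-up module used to show the failure path.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if $module can be loaded, 0 otherwise.
sub has_module {
    my ($module) = @_;
    # require() with a string expects a file name, so convert
    # My::Module into My/Module.pm first.
    (my $file = "$module.pm") =~ s{::}{/}g;
    return eval { require $file; 1 } ? 1 : 0;
}

# File::Spec ships with Perl; the second name is deliberately bogus.
for my $module (qw(File::Spec No::Such::Module::Here)) {
    printf "%-24s %s\n", $module,
        has_module($module) ? "available" : "missing";
}
```

Testing the return value of eval, rather than inspecting $@ afterwards, avoids being misled if $@ is altered before it is checked.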

If we remove the eval() part and try again:

  require1.pl
  -----------
  #!/usr/bin/perl -w
  
  require "/file/that/does/not/exists";
  print "\nHello\n";

  % ./require1.pl
  
  Can't locate /file/that/does/not/exists in @INC (@INC contains:
  /usr/lib/perl5/5.00503/i386-linux /usr/lib/perl5/5.00503
  /usr/lib/perl5/site_perl/5.005/i386-linux
  /usr/lib/perl5/site_perl/5.005 .) at require1.pl line 3.

The program just die()s in the last example, which is what you want in most cases.

For more information refer to the perlfunc manpage.

use()

use(), just like require(), loads and compiles files containing Perl code, but it works with modules only and is executed at compile time.

The only way to pass a module to load is by its module name and not its filename. If the module is located in MyCode.pm, the correct way to use() it is:

  use MyCode

and not:

  use "MyCode.pm"

use() translates the passed argument into a file name replacing :: with the operating system's path separator (normally /) and appending .pm at the end. So My::Module becomes My/Module.pm.

use() is exactly equivalent to:

  BEGIN { require Module; Module->import(LIST); }

Internally it calls require() to do the loading and compilation chores. When require() finishes its job, import() is called unless () is the second argument. The following pairs are equivalent:

  use MyModule;
  BEGIN {require MyModule; MyModule->import; }
  
  use MyModule qw(foo bar);
  BEGIN {require MyModule; MyModule->import("foo","bar"); }
  
  use MyModule ();
  BEGIN {require MyModule; }

The first pair imports the default symbols. This happens if the module sets @EXPORT to a list of symbols to be exported by default. The module's manpage normally describes which symbols are exported by default.

The second pair imports only the symbols passed as arguments.

The third pair describes the case where the caller does not want any symbols to be imported.

import() is not a builtin function; it's just an ordinary static method call into the "MyModule" package to tell the module to import the list of features back into the current package. See the Exporter manpage for more information.

When you write your own modules, always remember that it's better to use @EXPORT_OK instead of @EXPORT, since the former doesn't export symbols unless it was asked to. Exports pollute the namespace of the module user. Also avoid short or common symbol names to reduce the risk of name clashes.

When functions and variables aren't exported you can still access them using their full names, like $My::Module::bar or My::Module::foo(). By convention you can use a leading underscore on names to informally indicate that they are internal and not for public use.

There's a corresponding "no" command that un-imports symbols imported by use, i.e., it calls Module->unimport(LIST) instead of import().
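
To watch these calls happen, the following sketch defines a throwaway package inline and records every import() and unimport() call. MyDemo is a hypothetical module name, and setting its %INC entry by hand is only a device to keep the example in one file so that require() treats the module as already loaded.

```perl
use strict;
use warnings;

our @calls;    # filled in at compile time by the use/no statements below

BEGIN {
    package MyDemo;                  # hypothetical inline module
    $INC{'MyDemo.pm'} = __FILE__;    # pretend the file was already loaded
    sub import   { shift; push @main::calls, "import(@_)" }
    sub unimport { shift; push @main::calls, "unimport(@_)" }
}

use MyDemo;                # calls MyDemo->import()
use MyDemo qw(foo bar);    # calls MyDemo->import("foo", "bar")
use MyDemo ();             # calls nothing at all
no  MyDemo qw(foo);        # calls MyDemo->unimport("foo")

print "$_\n" for @calls;
```

Running it should list three calls: import(), import(foo bar) and unimport(foo). The use MyDemo (); line leaves no trace because import() is skipped entirely.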

do()

While do() behaves almost identically to require(), it reloads the file unconditionally. It doesn't check %INC to see whether the file was already loaded.

If do() cannot read the file, it returns undef and sets $! to report the error. If do() can read the file but cannot compile it, it returns undef and puts an error message in $@. If the file is successfully compiled, do() returns the value of the last expression evaluated.
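
The difference from require() can be seen with a library file that counts how often it is executed. This is a minimal sketch: the file name counter.pl and the $loads counter are made up for the illustration, and the file is written to a temporary directory so the example is self-contained.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

our $loads = 0;    # incremented each time the library file is executed

# Write a tiny library into a temporary directory (hypothetical file name).
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/counter.pl";
open my $fh, '>', $file or die "Cannot write $file: $!";
print {$fh} "\$main::loads++;\n1;\n";
close $fh or die "Cannot close $file: $!";

require $file;     # loads the file and records it in %INC
require $file;     # already in %INC, so nothing happens
my $after_require = $loads;

do $file;          # ignores %INC and runs the file again
do $file;          # ...and again
my $after_do = $loads;

# A file that cannot be read: do() returns undef and sets $!.
my $result = do "$dir/missing.pl";
my $reason = defined $result ? '' : "$!";

print "after require: $after_require\n";
print "after do:      $after_do\n";
print "missing file:  $reason\n";
```

The counter stands at 1 after the two require() calls and at 3 after the two do() calls, which is the unconditional reloading described above.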

 
