doc/src/sgml/plperl.sgml

   1 <!-- $PostgreSQL: pgsql/doc/src/sgml/plperl.sgml,v 2.79 2010/02/05 18:11:46 momjian Exp $ -->
   2
   3  <chapter id="plperl">
   4   <title>PL/Perl - Perl Procedural Language</title>
   5
   6   <indexterm zone="plperl">
   7    <primary>PL/Perl</primary>
   8   </indexterm>
   9
  10   <indexterm zone="plperl">
  11    <primary>Perl</primary>
  12   </indexterm>
  13
  14   <para>
  15    PL/Perl is a loadable procedural language that enables you to write
  16    <productname>PostgreSQL</productname> functions in the
  17    <ulink url="http://www.perl.org">Perl programming language</ulink>.
  18   </para>
  19
  20   <para>
  21    The main advantage to using PL/Perl is that this allows use,
  22    within stored functions, of the manyfold <quote>string
  23    munging</quote> operators and functions available for Perl.  Parsing
  24    complex strings might be easier using Perl than it is with the
  25    string functions and control structures provided in PL/pgSQL.
  26   </para>
  27
  28   <para>
  29    To install PL/Perl in a particular database, use
  30    <literal>createlang plperl <replaceable>dbname</></literal>.
  31   </para>
  32
  33   <tip>
  34    <para>
  35     If a language is installed into <literal>template1</>, all subsequently
  36     created databases will have the language installed automatically.
  37    </para>
  38   </tip>
  39
  40   <note>
  41    <para>
  42     Users of source packages must specially enable the build of
  43     PL/Perl during the installation process.  (Refer to <xref
  44     linkend="install-short"> for more information.)  Users of
  45     binary packages might find PL/Perl in a separate subpackage.
  46    </para>
  47   </note>
  48
  49  <sect1 id="plperl-funcs">
  50   <title>PL/Perl Functions and Arguments</title>
  51
  52   <para>
  53    To create a function in the PL/Perl language, use the standard
  54    <xref linkend="sql-createfunction" endterm="sql-createfunction-title">
  55    syntax:
  56
  57 <programlisting>
  58 CREATE FUNCTION <replaceable>funcname</replaceable> (<replaceable>argument-types</replaceable>) RETURNS <replaceable>return-type</replaceable> AS $$
  59     # PL/Perl function body
  60 $$ LANGUAGE plperl;
  61 </programlisting>
  62
  63    The body of the function is ordinary Perl code. In fact, the PL/Perl
  64    glue code wraps it inside a Perl subroutine.  A PL/Perl function is
  65    called in a scalar context, so it can't return a list.  You can return
  66    non-scalar values (arrays, records, and sets) by returning a reference,
  67    as discussed below.
  68   </para>
  69
  70   <para>
  71    PL/Perl also supports anonymous code blocks called with the
  72    <xref linkend="sql-do" endterm="sql-do-title"> statement:
  73
  74 <programlisting>
  75 DO $$
  76     # PL/Perl code
  77 $$ LANGUAGE plperl;
  78 </programlisting>
  79
  80    An anonymous code block receives no arguments, and whatever value it
  81    might return is discarded.  Otherwise it behaves just like a function.
  82   </para>
  83
  84   <note>
  85    <para>
  86     The use of named nested subroutines is dangerous in Perl, especially if
  87     they refer to lexical variables in the enclosing scope. Because a PL/Perl
  88     function is wrapped in a subroutine, any named subroutine you create will
  89     be nested. In general, it is far safer to create anonymous subroutines
  90     which you call via a coderef. See <literal>Variable "%s" will not stay shared</literal>
  91     and <literal>Variable "%s" is not available</literal> in the
  92     <citerefentry><refentrytitle>perldiag</></citerefentry> man page for more
  93     details.
  94    </para>
  95   </note>
  96
  97   <para>
  98    The syntax of the <command>CREATE FUNCTION</command> command requires
  99    the function body to be written as a string constant.  It is usually
 100    most convenient to use dollar quoting (see <xref
 101    linkend="sql-syntax-dollar-quoting">) for the string constant.
 102    If you choose to use escape string syntax <literal>E''</>,
 103    you must double the single quote marks (<literal>'</>) and backslashes
 104    (<literal>\</>) used in the body of the function
 105    (see <xref linkend="sql-syntax-strings">).
 106   </para>
 107
 108   <para>
 109    Arguments and results are handled as in any other Perl subroutine:
 110    arguments are passed in <varname>@_</varname>, and a result value
 111    is returned with <literal>return</> or as the last expression
 112    evaluated in the function.
 113   </para>
 114
 115   <para>
 116    For example, a function returning the greater of two integer values
 117    could be defined as:
 118
 119 <programlisting>
 120 CREATE FUNCTION perl_max (integer, integer) RETURNS integer AS $$
 121     if ($_[0] &gt; $_[1]) { return $_[0]; }
 122     return $_[1];
 123 $$ LANGUAGE plperl;
 124 </programlisting>
 125   </para>
 126
 127   <para>
 128    If an SQL null value<indexterm><primary>null value</><secondary
 129    sortas="PL/Perl">in PL/Perl</></indexterm> is passed to a function,
 130    the argument value will appear as <quote>undefined</> in Perl.  The
 131    above function definition will not behave very nicely with null
 132    inputs (in fact, it will act as though they are zeroes).  We could
 133    add <literal>STRICT</> to the function definition to make
 134    <productname>PostgreSQL</productname> do something more reasonable:
 135    if a null value is passed, the function will not be called at all,
 136    but will just return a null result automatically.  Alternatively,
 137    we could check for undefined inputs in the function body.  For
 138    example, suppose that we wanted <function>perl_max</function> with
 139    one null and one nonnull argument to return the nonnull argument,
 140    rather than a null value:
 141
 142 <programlisting>
 143 CREATE FUNCTION perl_max (integer, integer) RETURNS integer AS $$
 144     my ($x, $y) = @_;
 145     if (not defined $x) {
 146         return undef if not defined $y;
 147         return $y;
 148     }
 149     return $x if not defined $y;
 150     return $x if $x &gt; $y;
 151     return $y;
 152 $$ LANGUAGE plperl;
 153 </programlisting>
 154    As shown above, to return an SQL null value from a PL/Perl
 155    function, return an undefined value.  This can be done whether the
 156    function is strict or not.
 157   </para>
 158
 159   <para>
 160    Anything in a function argument that is not a reference is
 161    a string, which is in the standard <productname>PostgreSQL</productname>
 162    external text representation for the relevant data type. In the case of
 163    ordinary numeric or text types, Perl will just do the right thing and
 164    the programmer will normally not have to worry about it. However, in
 165    other cases the argument will need to be converted into a form that is
 166    more usable in Perl. For example, the <function>decode_bytea</function>
 167    function can be used to convert an argument of
 168    type <type>bytea</> into unescaped binary.
 169   </para>
 170
 171   <para>
 172    Similarly, values passed back to <productname>PostgreSQL</productname>
 173    must be in the external text representation format. For example, the
 174    <function>encode_bytea</function> function can be used to
 175    to escape binary data for a return value of type <type>bytea</>.
 176   </para>
 177
 178   <para>
 179    Perl can return <productname>PostgreSQL</productname> arrays as
 180    references to Perl arrays.  Here is an example:
 181
 182 <programlisting>
 183 CREATE OR REPLACE function returns_array()
 184 RETURNS text[][] AS $$
 185     return [['a&quot;b','c,d'],['e\\f','g']];
 186 $$ LANGUAGE plperl;
 187
 188 select returns_array();
 189 </programlisting>
 190   </para>
 191
 192   <para>
 193    Composite-type arguments are passed to the function as references
 194    to hashes.  The keys of the hash are the attribute names of the
 195    composite type.  Here is an example:
 196
 197 <programlisting>
 198 CREATE TABLE employee (
 199     name text,
 200     basesalary integer,
 201     bonus integer
 202 );
 203
 204 CREATE FUNCTION empcomp(employee) RETURNS integer AS $$
 205     my ($emp) = @_;
 206     return $emp-&gt;{basesalary} + $emp-&gt;{bonus};
 207 $$ LANGUAGE plperl;
 208
 209 SELECT name, empcomp(employee.*) FROM employee;
 210 </programlisting>
 211   </para>
 212
 213   <para>
 214    A PL/Perl function can return a composite-type result using the same
 215    approach: return a reference to a hash that has the required attributes.
 216    For example:
 217
 218 <programlisting>
 219 CREATE TYPE testrowperl AS (f1 integer, f2 text, f3 text);
 220
 221 CREATE OR REPLACE FUNCTION perl_row() RETURNS testrowperl AS $$
 222     return {f2 =&gt; 'hello', f1 =&gt; 1, f3 =&gt; 'world'};
 223 $$ LANGUAGE plperl;
 224
 225 SELECT * FROM perl_row();
 226 </programlisting>
 227
 228    Any columns in the declared result data type that are not present in the
 229    hash will be returned as null values.
 230   </para>
 231
 232   <para>
 233     PL/Perl functions can also return sets of either scalar or
 234     composite types.  Usually you'll want to return rows one at a
 235     time, both to speed up startup time and to keep from queueing up
 236     the entire result set in memory.  You can do this with
 237     <function>return_next</function> as illustrated below.  Note that
 238     after the last <function>return_next</function>, you must put
 239     either <literal>return</literal> or (better) <literal>return
 240     undef</literal>.
 241
 242 <programlisting>
 243 CREATE OR REPLACE FUNCTION perl_set_int(int)
 244 RETURNS SETOF INTEGER AS $$
 245     foreach (0..$_[0]) {
 246         return_next($_);
 247     }
 248     return undef;
 249 $$ LANGUAGE plperl;
 250
 251 SELECT * FROM perl_set_int(5);
 252
 253 CREATE OR REPLACE FUNCTION perl_set()
 254 RETURNS SETOF testrowperl AS $$
 255     return_next({ f1 =&gt; 1, f2 =&gt; 'Hello', f3 =&gt; 'World' });
 256     return_next({ f1 =&gt; 2, f2 =&gt; 'Hello', f3 =&gt; 'PostgreSQL' });
 257     return_next({ f1 =&gt; 3, f2 =&gt; 'Hello', f3 =&gt; 'PL/Perl' });
 258     return undef;
 259 $$ LANGUAGE plperl;
 260 </programlisting>
 261
 262     For small result sets, you can return a reference to an array that
 263     contains either scalars, references to arrays, or references to
 264     hashes for simple types, array types, and composite types,
 265     respectively.  Here are some simple examples of returning the entire
 266     result set as an array reference:
 267
 268 <programlisting>
 269 CREATE OR REPLACE FUNCTION perl_set_int(int) RETURNS SETOF INTEGER AS $$
 270     return [0..$_[0]];
 271 $$ LANGUAGE plperl;
 272
 273 SELECT * FROM perl_set_int(5);
 274
 275 CREATE OR REPLACE FUNCTION perl_set() RETURNS SETOF testrowperl AS $$
 276     return [
 277         { f1 =&gt; 1, f2 =&gt; 'Hello', f3 =&gt; 'World' },
 278         { f1 =&gt; 2, f2 =&gt; 'Hello', f3 =&gt; 'PostgreSQL' },
 279         { f1 =&gt; 3, f2 =&gt; 'Hello', f3 =&gt; 'PL/Perl' }
 280     ];
 281 $$ LANGUAGE plperl;
 282
 283 SELECT * FROM perl_set();
 284 </programlisting>
 285   </para>
 286
 287   <para>
 288    If you wish to use the <literal>strict</> pragma with your code you
 289    have a few options. For temporary global use you can <command>SET</>
 290    <literal>plperl.use_strict</literal> to true.
 291    This will affect subsequent compilations of <application>PL/Perl</>
 292    functions, but not functions already compiled in the current session.
 293    For permanent global use you can set <literal>plperl.use_strict</literal>
 294    to true in the <filename>postgresql.conf</filename> file.
 295   </para>
 296
 297   <para>
 298    For permanent use in specific functions you can simply put:
 299 <programlisting>
 300 use strict;
 301 </programlisting>
 302    at the top of the function body.
 303   </para>
 304
 305   <para>
 306   The <literal>feature</> pragma is also available to <function>use</> if your Perl is version 5.10.0 or higher.
 307   </para>
 308
 309  </sect1>
 310
 311  <sect1 id="plperl-data">
 312   <title>Data Values in PL/Perl</title>
 313
 314   <para>
 315    The argument values supplied to a PL/Perl function's code are
 316    simply the input arguments converted to text form (just as if they
 317    had been displayed by a <command>SELECT</command> statement).
 318    Conversely, the <function>return</function> and <function>return_next</function>
 319    commands will accept any string that is acceptable input format
 320    for the function's declared return type.
 321   </para>
 322  </sect1>
 323
 324  <sect1 id="plperl-builtins">
 325   <title>Built-in Functions</title>
 326
 327  <sect2 id="plperl-database">
 328   <title>Database Access from PL/Perl</title>
 329
 330   <para>
 331    Access to the database itself from your Perl function can be done
 332    via the following functions:
 333   </para>
 334
 335    <variablelist>
 336     <varlistentry>
 337      <indexterm>
 338       <primary>spi_exec_query</primary>
 339       <secondary>in PL/Perl</secondary>
 340      </indexterm>
 341      <indexterm>
 342       <primary>spi_query</primary>
 343       <secondary>in PL/Perl</secondary>
 344      </indexterm>
 345      <indexterm>
 346       <primary>spi_fetchrow</primary>
 347       <secondary>in PL/Perl</secondary>
 348      </indexterm>
 349      <indexterm>
 350       <primary>spi_prepare</primary>
 351       <secondary>in PL/Perl</secondary>
 352      </indexterm>
 353      <indexterm>
 354       <primary>spi_exec_prepared</primary>
 355       <secondary>in PL/Perl</secondary>
 356      </indexterm>
 357      <indexterm>
 358       <primary>spi_query_prepared</primary>
 359       <secondary>in PL/Perl</secondary>
 360      </indexterm>
 361      <indexterm>
 362       <primary>spi_cursor_close</primary>
 363       <secondary>in PL/Perl</secondary>
 364      </indexterm>
 365      <indexterm>
 366       <primary>spi_freeplan</primary>
 367       <secondary>in PL/Perl</secondary>
 368      </indexterm>
 369
 370      <term><literal><function>spi_exec_query</>(<replaceable>query</replaceable> [, <replaceable>max-rows</replaceable>])</literal></term>
 371      <listitem>
 372       <para>
 373        <literal>spi_exec_query</literal> executes an SQL command and
 374 returns the entire row set as a reference to an array of hash
 375 references.  <emphasis>You should only use this command when you know
 376 that the result set will be relatively small.</emphasis>  Here is an
 377 example of a query (<command>SELECT</command> command) with the
 378 optional maximum number of rows:
 379
 380 <programlisting>
 381 $rv = spi_exec_query('SELECT * FROM my_table', 5);
 382 </programlisting>
 383         This returns up to 5 rows from the table
 384         <literal>my_table</literal>.  If <literal>my_table</literal>
 385         has a column <literal>my_column</literal>, you can get that
 386         value from row <literal>$i</literal> of the result like this:
 387 <programlisting>
 388 $foo = $rv-&gt;{rows}[$i]-&gt;{my_column};
 389 </programlisting>
 390        The total number of rows returned from a <command>SELECT</command>
 391        query can be accessed like this:
 392 <programlisting>
 393 $nrows = $rv-&gt;{processed}
 394 </programlisting>
 395       </para>
 396
 397       <para>
 398        Here is an example using a different command type:
 399 <programlisting>
 400 $query = "INSERT INTO my_table VALUES (1, 'test')";
 401 $rv = spi_exec_query($query);
 402 </programlisting>
 403        You can then access the command status (e.g.,
 404        <literal>SPI_OK_INSERT</literal>) like this:
 405 <programlisting>
 406 $res = $rv-&gt;{status};
 407 </programlisting>
 408        To get the number of rows affected, do:
 409 <programlisting>
 410 $nrows = $rv-&gt;{processed};
 411 </programlisting>
 412       </para>
 413
 414       <para>
 415        Here is a complete example:
 416 <programlisting>
 417 CREATE TABLE test (
 418     i int,
 419     v varchar
 420 );
 421
 422 INSERT INTO test (i, v) VALUES (1, 'first line');
 423 INSERT INTO test (i, v) VALUES (2, 'second line');
 424 INSERT INTO test (i, v) VALUES (3, 'third line');
 425 INSERT INTO test (i, v) VALUES (4, 'immortal');
 426
 427 CREATE OR REPLACE FUNCTION test_munge() RETURNS SETOF test AS $$
 428     my $rv = spi_exec_query('select i, v from test;');
 429     my $status = $rv-&gt;{status};
 430     my $nrows = $rv-&gt;{processed};
 431     foreach my $rn (0 .. $nrows - 1) {
 432         my $row = $rv-&gt;{rows}[$rn];
 433         $row-&gt;{i} += 200 if defined($row-&gt;{i});
 434         $row-&gt;{v} =~ tr/A-Za-z/a-zA-Z/ if (defined($row-&gt;{v}));
 435         return_next($row);
 436     }
 437     return undef;
 438 $$ LANGUAGE plperl;
 439
 440 SELECT * FROM test_munge();
 441 </programlisting>
 442     </para>
 443     </listitem>
 444     </varlistentry>
 445
 446     <varlistentry>
 447      <term><literal><function>spi_query</>(<replaceable>command</replaceable>)</literal></term>
 448      <term><literal><function>spi_fetchrow</>(<replaceable>cursor</replaceable>)</literal></term>
 449      <term><literal><function>spi_cursor_close</>(<replaceable>cursor</replaceable>)</literal></term>
 450
 451     <listitem>
 452     <para>
 453     <literal>spi_query</literal> and <literal>spi_fetchrow</literal>
 454     work together as a pair for row sets which might be large, or for cases
 455     where you wish to return rows as they arrive.
 456     <literal>spi_fetchrow</literal> works <emphasis>only</emphasis> with
 457     <literal>spi_query</literal>. The following example illustrates how
 458     you use them together:
 459
 460 <programlisting>
 461 CREATE TYPE foo_type AS (the_num INTEGER, the_text TEXT);
 462
 463 CREATE OR REPLACE FUNCTION lotsa_md5 (INTEGER) RETURNS SETOF foo_type AS $$
 464     use Digest::MD5 qw(md5_hex);
 465     my $file = '/usr/share/dict/words';
 466     my $t = localtime;
 467     elog(NOTICE, "opening file $file at $t" );
 468     open my $fh, '&lt;', $file # ooh, it's a file access!
 469         or elog(ERROR, "cannot open $file for reading: $!");
 470     my @words = &lt;$fh&gt;;
 471     close $fh;
 472     $t = localtime;
 473     elog(NOTICE, "closed file $file at $t");
 474     chomp(@words);
 475     my $row;
 476     my $sth = spi_query("SELECT * FROM generate_series(1,$_[0]) AS b(a)");
 477     while (defined ($row = spi_fetchrow($sth))) {
 478         return_next({
 479             the_num =&gt; $row-&gt;{a},
 480             the_text =&gt; md5_hex($words[rand @words])
 481         });
 482     }
 483     return;
 484 $$ LANGUAGE plperlu;
 485
 486 SELECT * from lotsa_md5(500);
 487 </programlisting>
 488     </para>
 489
 490     <para>
 491      Normally, <function>spi_fetchrow</> should be repeated until it
 492      returns <literal>undef</literal>, indicating that there are no more
 493      rows to read.  The cursor returned by <literal>spi_query</literal>
 494      is automatically freed when
 495      <function>spi_fetchrow</> returns <literal>undef</literal>.
 496      If you do not wish to read all the rows, instead call
 497      <function>spi_cursor_close</> to free the cursor.
 498      Failure to do so will result in memory leaks.
 499     </para>
 500
 501     </listitem>
 502     </varlistentry>
 503
 504     <varlistentry>
 505      <term><literal><function>spi_prepare</>(<replaceable>command</replaceable>, <replaceable>argument types</replaceable>)</literal></term>
 506      <term><literal><function>spi_query_prepared</>(<replaceable>plan</replaceable>, <replaceable>arguments</replaceable>)</literal></term>
 507      <term><literal><function>spi_exec_prepared</>(<replaceable>plan</replaceable> [, <replaceable>attributes</replaceable>], <replaceable>arguments</replaceable>)</literal></term>
 508      <term><literal><function>spi_freeplan</>(<replaceable>plan</replaceable>)</literal></term>
 509
 510     <listitem>
 511     <para>
 512     <literal>spi_prepare</literal>, <literal>spi_query_prepared</literal>, <literal>spi_exec_prepared</literal>,
 513     and <literal>spi_freeplan</literal> implement the same functionality but for prepared queries.
 514     <literal>spi_prepare</literal> accepts a query string with numbered argument placeholders ($1, $2, etc)
 515     and a string list of argument types:
 516 <programlisting>
 517 $plan = spi_prepare('SELECT * FROM test WHERE id &gt; $1 AND name = $2', 'INTEGER', 'TEXT');
 518 </programlisting>
 519     Once a query plan is prepared by a call to <literal>spi_prepare</literal>, the plan can be used instead
 520     of the string query, either in <literal>spi_exec_prepared</literal>, where the result is the same as returned
 521     by <literal>spi_exec_query</literal>, or in <literal>spi_query_prepared</literal> which returns a cursor
 522     exactly as <literal>spi_query</literal> does, which can be later passed to <literal>spi_fetchrow</literal>.
 523     The optional second parameter to <literal>spi_exec_prepared</literal> is a hash reference of attributes;
 524     the only attribute currently supported is <literal>limit</literal>, which sets the maximum number of rows returned by a query.
 525     </para>
 526
 527     <para>
 528     The advantage of prepared queries is that is it possible to use one prepared plan for more
 529     than one query execution. After the plan is not needed anymore, it can be freed with
 530     <literal>spi_freeplan</literal>:
 531     </para>
 532
 533     <para>
 534     <programlisting>
 535 CREATE OR REPLACE FUNCTION init() RETURNS VOID AS $$
 536         $_SHARED{my_plan} = spi_prepare( 'SELECT (now() + $1)::date AS now', 'INTERVAL');
 537 $$ LANGUAGE plperl;
 538
 539 CREATE OR REPLACE FUNCTION add_time( INTERVAL ) RETURNS TEXT AS $$
 540         return spi_exec_prepared(
 541                 $_SHARED{my_plan},
 542                 $_[0]
 543         )->{rows}->[0]->{now};
 544 $$ LANGUAGE plperl;
 545
 546 CREATE OR REPLACE FUNCTION done() RETURNS VOID AS $$
 547         spi_freeplan( $_SHARED{my_plan});
 548         undef $_SHARED{my_plan};
 549 $$ LANGUAGE plperl;
 550
 551 SELECT init();
 552 SELECT add_time('1 day'), add_time('2 days'), add_time('3 days');
 553 SELECT done();
 554
 555   add_time  |  add_time  |  add_time
 556 ------------+------------+------------
 557  2005-12-10 | 2005-12-11 | 2005-12-12
 558     </programlisting>
 559     </para>
 560
 561     <para>
 562     Note that the parameter subscript in <literal>spi_prepare</literal> is defined via
 563     $1, $2, $3, etc, so avoid declaring query strings in double quotes that might easily
 564     lead to hard-to-catch bugs.
 565     </para>
 566
 567     <para>
 568     Another example illustrates usage of an optional parameter in <literal>spi_exec_prepared</literal>:
 569     </para>
 570
 571     <para>
 572     <programlisting>
 573 CREATE TABLE hosts AS SELECT id, ('192.168.1.'||id)::inet AS address FROM generate_series(1,3) AS id;
 574
 575 CREATE OR REPLACE FUNCTION init_hosts_query() RETURNS VOID AS $$
 576         $_SHARED{plan} = spi_prepare('SELECT * FROM hosts WHERE address &lt;&lt; $1', 'inet');
 577 $$ LANGUAGE plperl;
 578
 579 CREATE OR REPLACE FUNCTION query_hosts(inet) RETURNS SETOF hosts AS $$
 580         return spi_exec_prepared(
 581                 $_SHARED{plan},
 582                 {limit =&gt; 2},
 583                 $_[0]
 584         )->{rows};
 585 $$ LANGUAGE plperl;
 586
 587 CREATE OR REPLACE FUNCTION release_hosts_query() RETURNS VOID AS $$
 588         spi_freeplan($_SHARED{plan});
 589         undef $_SHARED{plan};
 590 $$ LANGUAGE plperl;
 591
 592 SELECT init_hosts_query();
 593 SELECT query_hosts('192.168.1.0/30');
 594 SELECT release_hosts_query();
 595
 596     query_hosts
 597 -----------------
 598  (1,192.168.1.1)
 599  (2,192.168.1.2)
 600 (2 rows)
 601     </programlisting>
 602     </para>
 603     </listitem>
 604     </varlistentry>
 605    </variablelist>
 606  </sect2>
 607
 608  <sect2 id="plperl-utility-functions">
 609   <title>Utility functions in PL/Perl</title>
 610
 611    <variablelist>
 612     <varlistentry>
 613      <indexterm>
 614       <primary>elog</primary>
 615       <secondary>in PL/Perl</secondary>
 616      </indexterm>
 617
 618      <term><literal><function>elog</>(<replaceable>level</replaceable>, <replaceable>msg</replaceable>)</literal></term>
 619      <listitem>
 620       <para>
 621        Emit a log or error message. Possible levels are
 622        <literal>DEBUG</>, <literal>LOG</>, <literal>INFO</>,
 623        <literal>NOTICE</>, <literal>WARNING</>, and <literal>ERROR</>.
 624        <literal>ERROR</>
 625         raises an error condition; if this is not trapped by the surrounding
 626         Perl code, the error propagates out to the calling query, causing
 627         the current transaction or subtransaction to be aborted.  This
 628         is effectively the same as the Perl <literal>die</> command.
 629         The other levels only generate messages of different
 630         priority levels.
 631         Whether messages of a particular priority are reported to the client,
 632         written to the server log, or both is controlled by the
 633         <xref linkend="guc-log-min-messages"> and
 634         <xref linkend="guc-client-min-messages"> configuration
 635         variables. See <xref linkend="runtime-config"> for more
 636         information.
 637       </para>
 638      </listitem>
 639     </varlistentry>
 640
 641     <varlistentry>
 642      <indexterm>
 643       <primary>quote_literal</primary>
 644       <secondary>in PL/Perl</secondary>
 645      </indexterm>
 646
 647      <term><literal><function>quote_literal</>(<replaceable>string</replaceable>)</literal></term>
 648      <listitem>
 649       <para>
 650         Return the given string suitably quoted to be used as a string literal in an SQL
 651         statement string. Embedded single-quotes and backslashes are properly doubled.
 652         Note that <function>quote_literal</> returns undef on undef input; if the argument
 653         might be undef, <function>quote_nullable</> is often more suitable.
 654       </para>
 655      </listitem>
 656     </varlistentry>
 657
 658     <varlistentry>
 659      <indexterm>
 660       <primary>quote_nullable</primary>
 661       <secondary>in PL/Perl</secondary>
 662      </indexterm>
 663
 664      <term><literal><function>quote_nullable</>(<replaceable>string</replaceable>)</literal></term>
 665      <listitem>
 666       <para>
 667         Return the given string suitably quoted to be used as a string literal in an SQL
 668         statement string; or, if the argument is undef, return the unquoted string "NULL".
 669         Embedded single-quotes and backslashes are properly doubled.
 670       </para>
 671      </listitem>
 672     </varlistentry>
 673
 674     <varlistentry>
 675      <indexterm>
 676       <primary>quote_ident</primary>
 677       <secondary>in PL/Perl</secondary>
 678      </indexterm>
 679
 680      <term><literal><function>quote_ident</>(<replaceable>string</replaceable>)</literal></term>
 681      <listitem>
 682       <para>
 683         Return the given string suitably quoted to be used as an identifier in
 684         an SQL statement string. Quotes are added only if necessary (i.e., if
 685         the string contains non-identifier characters or would be case-folded).
 686         Embedded quotes are properly doubled.
 687       </para>
 688      </listitem>
 689     </varlistentry>
 690
 691     <varlistentry>
 692      <indexterm>
 693       <primary>decode_bytea</primary>
 694       <secondary>in PL/Perl</secondary>
 695      </indexterm>
 696
 697      <term><literal><function>decode_bytea</>(<replaceable>string</replaceable>)</literal></term>
 698      <listitem>
 699       <para>
 700         Return the unescaped binary data represented by the contents of the given string,
 701         which should be bytea encoded.
 702         </para>
 703      </listitem>
 704     </varlistentry>
 705
 706     <varlistentry>
 707      <indexterm>
 708       <primary>encode_bytea</primary>
 709       <secondary>in PL/Perl</secondary>
 710      </indexterm>
 711
 712      <term><literal><function>encode_bytea</>(<replaceable>string</replaceable>)</literal></term>
 713      <listitem>
 714       <para>
 715         Return the bytea encoded form of the binary data contents of the given string.
 716         </para>
 717      </listitem>
 718     </varlistentry>
 719
 720     <varlistentry>
 721      <indexterm>
 722       <primary>encode_array_literal</primary>
 723       <secondary>in PL/Perl</secondary>
 724      </indexterm>
 725
 726      <term><literal><function>encode_array_literal</>(<replaceable>array</replaceable>)</literal></term>
 727      <term><literal><function>encode_array_literal</>(<replaceable>array</replaceable>, <replaceable>delimiter</replaceable>)</literal></term>
 728      <listitem>
 729       <para>
 730         Returns the contents of the referenced array as a string in array literal format
 731         (see <xref linkend="arrays-input">).
 732         Returns the argument value unaltered if it's not a reference to an array.
 733         The delimiter used between elements of the array literal defaults to "<literal>, </literal>"
 734         if a delimiter is not specified or is undef.
 735         </para>
 736      </listitem>
 737     </varlistentry>
 738
 739     <varlistentry>
 740      <indexterm>
 741       <primary>encode_array_constructor</primary>
 742       <secondary>in PL/Perl</secondary>
 743      </indexterm>
 744
 745      <term><literal><function>encode_array_constructor</>(<replaceable>array</replaceable>)</literal></term>
 746      <listitem>
 747       <para>
 748         Returns the contents of the referenced array as a string in array constructor format
 749         (see <xref linkend="sql-syntax-array-constructors">).
 750         Individual values are quoted using <function>quote_nullable</function>.
 751         Returns the argument value, quoted using <function>quote_nullable</function>,
 752         if it's not a reference to an array.
 753         </para>
 754      </listitem>
 755     </varlistentry>
 756
 757     <varlistentry>
 758      <indexterm>
 759       <primary>looks_like_number</primary>
 760       <secondary>in PL/Perl</secondary>
 761      </indexterm>
 762
 763      <term><literal><function>looks_like_number</>(<replaceable>string</replaceable>)</literal></term>
 764      <listitem>
 765       <para>
 766         Returns a true value if the content of the given string looks like a
 767         number, according to Perl, returns false otherwise.
 768         Returns undef if the argument is undef.  Leading and trailing space is
 769         ignored. <literal>Inf</> and <literal>Infinity</> are regarded as numbers.
 770         </para>
 771      </listitem>
 772     </varlistentry>
 773
 774    </variablelist>
 775  </sect2>
 776  </sect1>
 777
 778  <sect1 id="plperl-global">
 779   <title>Global Values in PL/Perl</title>
 780
 781   <para>
 782     You can use the global hash <varname>%_SHARED</varname> to store
 783     data, including code references, between function calls for the
 784     lifetime of the current session.
 785   </para>
 786
 787   <para>
 788     Here is a simple example for shared data:
 789 <programlisting>
 790 CREATE OR REPLACE FUNCTION set_var(name text, val text) RETURNS text AS $$
 791     if ($_SHARED{$_[0]} = $_[1]) {
 792         return 'ok';
 793     } else {
 794         return "cannot set shared variable $_[0] to $_[1]";
 795     }
 796 $$ LANGUAGE plperl;
 797
 798 CREATE OR REPLACE FUNCTION get_var(name text) RETURNS text AS $$
 799     return $_SHARED{$_[0]};
 800 $$ LANGUAGE plperl;
 801
 802 SELECT set_var('sample', 'Hello, PL/Perl!  How''s tricks?');
 803 SELECT get_var('sample');
 804 </programlisting>
 805   </para>
 806
 807   <para>
 808    Here is a slightly more complicated example using a code reference:
 809
 810 <programlisting>
 811 CREATE OR REPLACE FUNCTION myfuncs() RETURNS void AS $$
 812     $_SHARED{myquote} = sub {
 813         my $arg = shift;
 814         $arg =~ s/(['\\])/\\$1/g;
 815         return "'$arg'";
 816     };
 817 $$ LANGUAGE plperl;
 818
 819 SELECT myfuncs(); /* initializes the function */
 820
 821 /* Set up a function that uses the quote function */
 822
 823 CREATE OR REPLACE FUNCTION use_quote(TEXT) RETURNS text AS $$
 824     my $text_to_quote = shift;
 825     my $qfunc = $_SHARED{myquote};
 826     return &amp;$qfunc($text_to_quote);
 827 $$ LANGUAGE plperl;
 828 </programlisting>
 829
 830    (You could have replaced the above with the one-liner
 831    <literal>return $_SHARED{myquote}-&gt;($_[0]);</literal>
 832    at the expense of readability.)
 833   </para>
 834  </sect1>
 835
 836  <sect1 id="plperl-trusted">
 837   <title>Trusted and Untrusted PL/Perl</title>
 838
 839   <indexterm zone="plperl-trusted">
 840    <primary>trusted</primary>
 841    <secondary>PL/Perl</secondary>
 842   </indexterm>
 843
 844   <para>
 845    Normally, PL/Perl is installed as a <quote>trusted</> programming
 846    language named <literal>plperl</>.  In this setup, certain Perl
 847    operations are disabled to preserve security.  In general, the
 848    operations that are restricted are those that interact with the
 849    environment. This includes file handle operations,
 850    <literal>require</literal>, and <literal>use</literal> (for
 851    external modules).  There is no way to access internals of the
 852    database server process or to gain OS-level access with the
 853    permissions of the server process,
 854    as a C function can do.  Thus, any unprivileged database user can
 855    be permitted to use this language.
 856   </para>
 857
 858   <para>
 859    Here is an example of a function that will not work because file
 860    system operations are not allowed for security reasons:
 861 <programlisting>
 862 CREATE FUNCTION badfunc() RETURNS integer AS $$
 863     my $tmpfile = "/tmp/badfile";
 864     open my $fh, '&gt;', $tmpfile
 865         or elog(ERROR, qq{could not open the file "$tmpfile": $!});
 866     print $fh "Testing writing to a file\n";
 867     close $fh or elog(ERROR, qq{could not close the file "$tmpfile": $!});
 868     return 1;
 869 $$ LANGUAGE plperl;
 870 </programlisting>
 871     The creation of this function will fail as its use of a forbidden
 872     operation will be caught by the validator.
 873   </para>
 874
 875   <para>
 876    Sometimes it is desirable to write Perl functions that are not
 877    restricted.  For example, one might want a Perl function that sends
 878    mail.  To handle these cases, PL/Perl can also be installed as an
 879    <quote>untrusted</> language (usually called
 880    <application>PL/PerlU</application><indexterm><primary>PL/PerlU</></indexterm>).
 881    In this case the full Perl language is available.  If the
 882    <command>createlang</command> program is used to install the
 883    language, the language name <literal>plperlu</literal> will select
 884    the untrusted PL/Perl variant.
 885   </para>
 886
 887   <para>
 888    The writer of a <application>PL/PerlU</> function must take care that the function
 889    cannot be used to do anything unwanted, since it will be able to do
 890    anything that could be done by a user logged in as the database
 891    administrator.  Note that the database system allows only database
 892    superusers to create functions in untrusted languages.
 893   </para>
 894
 895   <para>
 896    If the above function was created by a superuser using the language
 897    <literal>plperlu</>, execution would succeed.
 898   </para>
 899
 900   <para>
 901    In the same way, anonymous code blocks written in Perl can use
 902    restricted operations if the language is specified as
 903    <literal>plperlu</> rather than <literal>plperl</>, but the caller
 904    must be a superuser.
 905   </para>
 906
 907   <note>
 908     <para>
 909       For security reasons, to stop a leak of privileged operations from
 910       <application>PL/PerlU</> to <application>PL/Perl</>, these two languages
 911       have to run in separate instances of the Perl interpreter. If your
 912       Perl installation has been appropriately compiled, this is not a problem.
 913       However, not all installations are compiled with the requisite flags.
 914       If <productname>PostgreSQL</> detects that this is the case then it will
 915       not start a second interpreter, but instead create an error. In
 916       consequence, in such an installation, you cannot use both
 917       <application>PL/PerlU</> and <application>PL/Perl</> in the same backend
 918       process. The remedy for this is to obtain a Perl installation configured
 919       with the appropriate flags, namely either <literal>usemultiplicity</>
 920       or <literal>useithreads</>. <literal>usemultiplicity</> is preferred
 921       unless you actually need to use threads. For more details, see the
 922       <citerefentry><refentrytitle>perlembed</></citerefentry> man page.
 923     </para>
 924   </note>
 925
 926  </sect1>
 927
 928  <sect1 id="plperl-triggers">
 929   <title>PL/Perl Triggers</title>
 930
 931   <para>
 932    PL/Perl can be used to write trigger functions.  In a trigger function,
 933    the hash reference <varname>$_TD</varname> contains information about the
 934    current trigger event. <varname>$_TD</> is a global variable,
 935    which gets a separate local value for each invocation of the trigger.
 936    The fields of the <varname>$_TD</varname> hash reference are:
 937
 938    <variablelist>
 939     <varlistentry>
 940      <term><literal>$_TD-&gt;{new}{foo}</literal></term>
 941      <listitem>
 942       <para>
 943        <literal>NEW</literal> value of column <literal>foo</literal>
 944       </para>
 945      </listitem>
 946     </varlistentry>
 947
 948     <varlistentry>
 949      <term><literal>$_TD-&gt;{old}{foo}</literal></term>
 950      <listitem>
 951       <para>
 952        <literal>OLD</literal> value of column <literal>foo</literal>
 953       </para>
 954      </listitem>
 955     </varlistentry>
 956
 957     <varlistentry>
 958      <term><literal>$_TD-&gt;{name}</literal></term>
 959      <listitem>
 960       <para>
 961        Name of the trigger being called
 962       </para>
 963      </listitem>
 964     </varlistentry>
 965
 966     <varlistentry>
 967      <term><literal>$_TD-&gt;{event}</literal></term>
 968      <listitem>
 969       <para>
 970        Trigger event: <literal>INSERT</>, <literal>UPDATE</>,
 971        <literal>DELETE</>, <literal>TRUNCATE</>, or <literal>UNKNOWN</>
 972       </para>
 973      </listitem>
 974     </varlistentry>
 975
 976     <varlistentry>
 977      <term><literal>$_TD-&gt;{when}</literal></term>
 978      <listitem>
 979       <para>
 980        When the trigger was called: <literal>BEFORE</literal>, <literal>AFTER</literal>, or <literal>UNKNOWN</literal>
 981       </para>
 982      </listitem>
 983     </varlistentry>
 984
 985     <varlistentry>
 986      <term><literal>$_TD-&gt;{level}</literal></term>
 987      <listitem>
 988       <para>
 989        The trigger level: <literal>ROW</literal>, <literal>STATEMENT</literal>, or <literal>UNKNOWN</literal>
 990       </para>
 991      </listitem>
 992     </varlistentry>
 993
 994     <varlistentry>
 995      <term><literal>$_TD-&gt;{relid}</literal></term>
 996      <listitem>
 997       <para>
 998        OID of the table on which the trigger fired
 999       </para>
1000      </listitem>
1001     </varlistentry>
1002
1003     <varlistentry>
1004      <term><literal>$_TD-&gt;{table_name}</literal></term>
1005      <listitem>
1006       <para>
1007        Name of the table on which the trigger fired
1008       </para>
1009      </listitem>
1010     </varlistentry>
1011
1012     <varlistentry>
1013      <term><literal>$_TD-&gt;{relname}</literal></term>
1014      <listitem>
1015       <para>
1016        Name of the table on which the trigger fired. This has been deprecated,
1017        and could be removed in a future release.
1018        Please use $_TD-&gt;{table_name} instead.
1019       </para>
1020      </listitem>
1021     </varlistentry>
1022
1023     <varlistentry>
1024      <term><literal>$_TD-&gt;{table_schema}</literal></term>
1025      <listitem>
1026       <para>
1027        Name of the schema in which the table on which the trigger fired, is
1028       </para>
1029      </listitem>
1030     </varlistentry>
1031
1032     <varlistentry>
1033      <term><literal>$_TD-&gt;{argc}</literal></term>
1034      <listitem>
1035       <para>
1036        Number of arguments of the trigger function
1037       </para>
1038      </listitem>
1039     </varlistentry>
1040
1041     <varlistentry>
1042      <term><literal>@{$_TD-&gt;{args}}</literal></term>
1043      <listitem>
1044       <para>
1045        Arguments of the trigger function.  Does not exist if <literal>$_TD-&gt;{argc}</literal> is 0.
1046       </para>
1047      </listitem>
1048     </varlistentry>
1049
1050    </variablelist>
1051   </para>
1052
1053   <para>
1054    Row-level triggers can return one of the following:
1055
1056    <variablelist>
1057     <varlistentry>
1058      <term><literal>return;</literal></term>
1059      <listitem>
1060       <para>
1061        Execute the operation
1062       </para>
1063      </listitem>
1064     </varlistentry>
1065
1066     <varlistentry>
1067      <term><literal>"SKIP"</literal></term>
1068      <listitem>
1069       <para>
1070        Don't execute the operation
1071       </para>
1072      </listitem>
1073     </varlistentry>
1074
1075     <varlistentry>
1076      <term><literal>"MODIFY"</literal></term>
1077      <listitem>
1078       <para>
1079        Indicates that the <literal>NEW</literal> row was modified by
1080        the trigger function
1081       </para>
1082      </listitem>
1083     </varlistentry>
1084    </variablelist>
1085   </para>
1086
1087   <para>
1088    Here is an example of a trigger function, illustrating some of the
1089    above:
1090 <programlisting>
1091 CREATE TABLE test (
1092     i int,
1093     v varchar
1094 );
1095
1096 CREATE OR REPLACE FUNCTION valid_id() RETURNS trigger AS $$
1097     if (($_TD-&gt;{new}{i} &gt;= 100) || ($_TD-&gt;{new}{i} &lt;= 0)) {
1098         return "SKIP";    # skip INSERT/UPDATE command
1099     } elsif ($_TD-&gt;{new}{v} ne "immortal") {
1100         $_TD-&gt;{new}{v} .= "(modified by trigger)";
1101         return "MODIFY";  # modify row and execute INSERT/UPDATE command
1102     } else {
1103         return;           # execute INSERT/UPDATE command
1104     }
1105 $$ LANGUAGE plperl;
1106
1107 CREATE TRIGGER test_valid_id_trig
1108     BEFORE INSERT OR UPDATE ON test
1109     FOR EACH ROW EXECUTE PROCEDURE valid_id();
1110 </programlisting>
1111   </para>
1112  </sect1>
1113
1114  <sect1 id="plperl-under-the-hood">
1115   <title>PL/Perl Under the Hood</title>
1116
1117  <sect2 id="plperl-config">
1118   <title>Configuration</title>
1119
1120   <para>
1121   This section lists configuration parameters that affect <application>PL/Perl</>.
1122   To set any of these parameters before <application>PL/Perl</> has been loaded,
1123   it is necessary to have added <quote><literal>plperl</></> to the
1124   <xref linkend="guc-custom-variable-classes"> list in
1125   <filename>postgresql.conf</filename>.
1126   </para>
1127
1128   <variablelist>
1129
1130      <varlistentry id="guc-plperl-on-perl-init" xreflabel="plperl.on_perl_init">
1131       <term><varname>plperl.on_perl_init</varname> (<type>string</type>)</term>
1132       <indexterm>
1133        <primary><varname>plperl.on_perl_init</> configuration parameter</primary>
1134       </indexterm>
1135       <listitem>
1136        <para>
1137        Specifies perl code to be executed when a perl interpreter is first initialized.
1138        The SPI functions are not available when this code is executed.
1139        If the code fails with an error it will abort the initialization of the interpreter
1140        and propagate out to the calling query, causing the current transaction
1141        or subtransaction to be aborted.
1142        </para>
1143        <para>
1144        The perl code is limited to a single string. Longer code can be placed
1145        into a module and loaded by the <literal>on_perl_init</> string.
1146        Examples:
1147 <programlisting>
1148 plplerl.on_perl_init = '$ENV{NYTPROF}="start=no"; require Devel::NYTProf::PgPLPerl'
1149 plplerl.on_perl_init = 'use lib "/my/app"; use MyApp::PgInit;'
1150 </programlisting>
1151        </para>
1152        <para>
1153        Initialization will happen in the postmaster if the plperl library is included
1154        in <literal>shared_preload_libraries</> (see <xref linkend="guc-shared-preload-libraries">),
1155        in which case extra consideration should be given to the risk of destabilizing the postmaster.
1156        </para>
1157        <para>
1158        This parameter can only be set in the postgresql.conf file or on the server command line.
1159        </para>
1160       </listitem>
1161      </varlistentry>
1162
1163      <varlistentry id="guc-plperl-use-strict" xreflabel="plperl.use_strict">
1164       <term><varname>plperl.use_strict</varname> (<type>boolean</type>)</term>
1165       <indexterm>
1166        <primary><varname>plperl.use_strict</> configuration parameter</primary>
1167       </indexterm>
1168       <listitem>
1169        <para>
1170        When set true subsequent compilations of PL/Perl functions have the <literal>strict</> pragma enabled.
1171        This parameter does not affect functions already compiled in the current session.
1172        </para>
1173       </listitem>
1174      </varlistentry>
1175
1176   </variablelist>
1177 </sect2>
1178
1179  <sect2 id="plperl-missing">
1180   <title>Limitations and Missing Features</title>
1181
1182   <para>
1183    The following features are currently missing from PL/Perl, but they
1184    would make welcome contributions.
1185
1186    <itemizedlist>
1187     <listitem>
1188      <para>
1189       PL/Perl functions cannot call each other directly.
1190      </para>
1191     </listitem>
1192
1193     <listitem>
1194      <para>
1195       SPI is not yet fully implemented.
1196      </para>
1197     </listitem>
1198
1199     <listitem>
1200      <para>
1201       If you are fetching very large data sets using
1202       <literal>spi_exec_query</literal>, you should be aware that
1203       these will all go into memory.  You can avoid this by using
1204       <literal>spi_query</literal>/<literal>spi_fetchrow</literal> as
1205       illustrated earlier.
1206      </para>
1207      <para>
1208         A similar problem occurs if a set-returning function passes a
1209         large set of rows back to PostgreSQL via <literal>return</literal>. You
1210         can avoid this problem too by instead using
1211         <literal>return_next</literal> for each row returned, as shown
1212         previously.
1213      </para>
1214     </listitem>
1215
1216      <listitem>
1217       <para>
1218         When a session ends normally, not due to a fatal error, any
1219         <literal>END</> blocks that have been defined are executed.
1220         Currently no other actions are performed. Specifically,
1221         file handles are not automatically flushed and objects are
1222         not automatically destroyed.
1223       </para>
1224      </listitem>
1225    </itemizedlist>
1226   </para>
1227  </sect2>
1228
1229  </sect1>
1230
1231 </chapter>