2 $PostgreSQL: pgsql/doc/src/sgml/xfunc.sgml,v 1.93 2005/01/07 22:40:46 tgl Exp $
6 <title>User-Defined Functions</title>
8 <indexterm zone="xfunc">
9 <primary>function</primary>
10 <secondary>user-defined</secondary>
14 <productname>PostgreSQL</productname> provides four kinds of
20 query language functions (functions written in
21 <acronym>SQL</acronym>) (<xref linkend="xfunc-sql">)
26 procedural language functions (functions written in, for
27 example, <application>PL/pgSQL</> or <application>PL/Tcl</>)
28 (<xref linkend="xfunc-pl">)
33 internal functions (<xref linkend="xfunc-internal">)
38 C-language functions (<xref linkend="xfunc-c">)
46 of function can take base types, composite types, or
47 combinations of these as arguments (parameters). In addition,
48 every kind of function can return a base type or
49 a composite type. Functions may also be defined to return
50 sets of base or composite values.
54 Many kinds of functions can take or return certain pseudo-types
55 (such as polymorphic types), but the available facilities vary.
56 Consult the description of each kind of function for more details.
60 It's easiest to define <acronym>SQL</acronym>
61 functions, so we'll start by discussing those.
62 Most of the concepts presented for <acronym>SQL</acronym> functions
63 will carry over to the other types of functions.
67 Throughout this chapter, it can be useful to look at the reference
68 page of the <xref linkend="sql-createfunction"
69 endterm="sql-createfunction-title"> command to
70 understand the examples better. Some examples from this chapter
71 can be found in <filename>funcs.sql</filename> and
72 <filename>funcs.c</filename> in the <filename>src/tutorial</>
73 directory in the <productname>PostgreSQL</productname> source
78 <sect1 id="xfunc-sql">
79 <title>Query Language (<acronym>SQL</acronym>) Functions</title>
81 <indexterm zone="xfunc-sql">
82 <primary>function</primary>
83 <secondary>user-defined</secondary>
84 <tertiary>in SQL</tertiary>
88 SQL functions execute an arbitrary list of SQL statements, returning
89 the result of the last query in the list.
90 In the simple (non-set)
91 case, the first row of the last query's result will be returned.
92 (Bear in mind that <quote>the first row</quote> of a multirow
93 result is not well-defined unless you use <literal>ORDER BY</>.)
94 If the last query happens
95 to return no rows at all, the null value will be returned.
99 <indexterm><primary>SETOF</><seealso>function</></> Alternatively,
100 an SQL function may be declared to return a set, by specifying the
101 function's return type as <literal>SETOF
102 <replaceable>sometype</></literal>.<indexterm><primary>SETOF</></>
103 In this case all rows of the last query's result are returned.
104 Further details appear below.
108 The body of an SQL function must be a list of SQL
109 statements separated by semicolons. A semicolon after the last
110 statement is optional. Unless the function is declared to return
111 <type>void</>, the last statement must be a <command>SELECT</>.
115 Any collection of commands in the <acronym>SQL</acronym>
116 language can be packaged together and defined as a function.
117 Besides <command>SELECT</command> queries, the commands can include data
118 modification queries (<command>INSERT</command>,
119 <command>UPDATE</command>, and <command>DELETE</command>), as well as
120 other SQL commands. (The only exception is that you can't put
121 <command>BEGIN</>, <command>COMMIT</>, <command>ROLLBACK</>, or
122 <command>SAVEPOINT</> commands into a <acronym>SQL</acronym> function.)
123 However, the final command
124 must be a <command>SELECT</command> that returns whatever is
125 specified as the function's return type. Alternatively, if you
126 want to define a SQL function that performs actions but has no
127 useful value to return, you can define it as returning <type>void</>.
128 In that case, the function body must not end with a <command>SELECT</command>.
129 For example, this function removes rows with negative salaries from
130 the <literal>emp</> table:
133 CREATE FUNCTION clean_emp() RETURNS void AS '
148 The syntax of the <command>CREATE FUNCTION</command> command requires
149 the function body to be written as a string constant. It is usually
150 most convenient to use dollar quoting (see <xref
151 linkend="sql-syntax-dollar-quoting">) for the string constant.
152 If you choose to use regular single-quoted string constant syntax,
153 you must escape single quote marks (<literal>'</>) and backslashes
154 (<literal>\</>) used in the body of the function, typically by
155 doubling them (see <xref linkend="sql-syntax-strings">).
159 Arguments to the SQL function are referenced in the function
160 body using the syntax <literal>$<replaceable>n</></>: <literal>$1</>
161 refers to the first argument, <literal>$2</> to the second, and so on.
162 If an argument is of a composite type, then the dot notation,
163 e.g., <literal>$1.name</literal>, may be used to access attributes
168 <title><acronym>SQL</acronym> Functions on Base Types</title>
171 The simplest possible <acronym>SQL</acronym> function has no arguments and
172 simply returns a base type, such as <type>integer</type>:
175 CREATE FUNCTION one() RETURNS integer AS $$
179 -- Alternative syntax for string literal:
180 CREATE FUNCTION one() RETURNS integer AS '
193 Notice that we defined a column alias within the function body for the result of the function
194 (with the name <literal>result</>), but this column alias is not visible
195 outside the function. Hence, the result is labeled <literal>one</>
196 instead of <literal>result</>.
200 It is almost as easy to define <acronym>SQL</acronym> functions
201 that take base types as arguments. In the example below, notice
202 how we refer to the arguments within the function as <literal>$1</>
206 CREATE FUNCTION add_em(integer, integer) RETURNS integer AS $$
210 SELECT add_em(1, 2) AS answer;
219 Here is a more useful function, which might be used to debit a
223 CREATE FUNCTION tf1 (integer, numeric) RETURNS integer AS $$
225 SET balance = balance - $2
226 WHERE accountno = $1;
231 A user could execute this function to debit account 17 by $100.00 as
235 SELECT tf1(17, 100.0);
240 In practice one would probably like a more useful result from the
241 function than a constant 1, so a more likely definition
245 CREATE FUNCTION tf1 (integer, numeric) RETURNS numeric AS $$
247 SET balance = balance - $2
248 WHERE accountno = $1;
249 SELECT balance FROM bank WHERE accountno = $1;
253 which adjusts the balance and returns the new balance.
258 <title><acronym>SQL</acronym> Functions on Composite Types</title>
261 When writing functions with arguments of composite
262 types, we must not only specify which
263 argument we want (as we did above with <literal>$1</> and <literal>$2</literal>) but
264 also the desired attribute (field) of that argument. For example,
266 <type>emp</type> is a table containing employee data, and therefore
267 also the name of the composite type of each row of the table. Here
268 is a function <function>double_salary</function> that computes what someone's
269 salary would be if it were doubled:
279 CREATE FUNCTION double_salary(emp) RETURNS numeric AS $$
280 SELECT $1.salary * 2 AS salary;
283 SELECT name, double_salary(emp.*) AS dream
285 WHERE emp.cubicle ~= point '(2,1)';
294 Notice the use of the syntax <literal>$1.salary</literal>
295 to select one field of the argument row value. Also notice
296 how the calling <command>SELECT</> command uses <literal>*</>
298 the entire current row of a table as a composite value. The table
299 row can alternatively be referenced using just the table name,
302 SELECT name, double_salary(emp) AS dream
304 WHERE emp.cubicle ~= point '(2,1)';
306 but this usage is deprecated since it's easy to get confused.
310 Sometimes it is handy to construct a composite argument value
311 on-the-fly. This can be done with the <literal>ROW</> construct.
312 For example, we could adjust the data being passed to the function:
314 SELECT name, double_salary(ROW(name, salary*1.1, age, cubicle)) AS dream
320 It is also possible to build a function that returns a composite type.
321 This is an example of a function
322 that returns a single <type>emp</type> row:
325 CREATE FUNCTION new_emp() RETURNS emp AS $$
326 SELECT text 'None' AS name,
329 point '(2,2)' AS cubicle;
333 In this example we have specified each of the attributes
334 with a constant value, but any computation
335 could have been substituted for these constants.
339 Note two important things about defining the function:
344 The select list order in the query must be exactly the same as
345 that in which the columns appear in the table associated
346 with the composite type. (Naming the columns, as we did above,
347 is irrelevant to the system.)
352 You must typecast the expressions to match the
353 definition of the composite type, or you will get errors like this:
356 ERROR: function declared to return emp returns varchar instead of text at column 1
365 A different way to define the same function is:
368 CREATE FUNCTION new_emp() RETURNS emp AS $$
369 SELECT ROW('None', 1000.0, 25, '(2,2)')::emp;
373 Here we wrote a <command>SELECT</> that returns just a single
374 column of the correct composite type. This isn't really better
375 in this situation, but it is a handy alternative in some cases
376 — for example, if we need to compute the result by calling
377 another function that returns the desired composite value.
381 We could call this function directly in either of two ways:
387 --------------------------
388 (None,1000.0,25,"(2,2)")
390 SELECT * FROM new_emp();
392 name | salary | age | cubicle
393 ------+--------+-----+---------
394 None | 1000.0 | 25 | (2,2)
397 The second way is described more fully in <xref
398 linkend="xfunc-sql-table-functions">.
402 When you use a function that returns a composite type,
403 you might want only one field (attribute) from its result.
404 You can do that with syntax like this:
407 SELECT (new_emp()).name;
414 The extra parentheses are needed to keep the parser from getting
415 confused. If you try to do it without them, you get something like this:
418 SELECT new_emp().name;
419 ERROR: syntax error at or near "." at character 17
420 LINE 1: SELECT new_emp().name;
426 Another option is to use
427 functional notation for extracting an attribute. The simple way
428 to explain this is that we can use the
429 notations <literal>attribute(table)</> and <literal>table.attribute</>
433 SELECT name(new_emp());
441 -- This is the same as:
442 -- SELECT emp.name AS youngster FROM emp WHERE emp.age < 30;
444 SELECT name(emp) AS youngster FROM emp WHERE age(emp) < 30;
454 Another way to use a function returning a row result is to pass the
455 result to another function that accepts the correct row type as input:
458 CREATE FUNCTION getname(emp) RETURNS text AS $$
462 SELECT getname(new_emp());
471 Another way to use a function that returns a composite type is to
472 call it as a table function, as described below.
476 <sect2 id="xfunc-sql-table-functions">
477 <title><acronym>SQL</acronym> Functions as Table Sources</title>
480 All SQL functions may be used in the <literal>FROM</> clause of a query,
481 but it is particularly useful for functions returning composite types.
482 If the function is defined to return a base type, the table function
483 produces a one-column table. If the function is defined to return
484 a composite type, the table function produces a column for each attribute
485 of the composite type.
492 CREATE TABLE foo (fooid int, foosubid int, fooname text);
493 INSERT INTO foo VALUES (1, 1, 'Joe');
494 INSERT INTO foo VALUES (1, 2, 'Ed');
495 INSERT INTO foo VALUES (2, 1, 'Mary');
497 CREATE FUNCTION getfoo(int) RETURNS foo AS $$
498 SELECT * FROM foo WHERE fooid = $1;
501 SELECT *, upper(fooname) FROM getfoo(1) AS t1;
503 fooid | foosubid | fooname | upper
504 -------+----------+---------+-------
509 As the example shows, we can work with the columns of the function's
510 result just the same as if they were columns of a regular table.
514 Note that we only got one row out of the function. This is because
515 we did not use <literal>SETOF</>. That is described in the next section.
520 <title><acronym>SQL</acronym> Functions Returning Sets</title>
523 When an SQL function is declared as returning <literal>SETOF
524 <replaceable>sometype</></literal>, the function's final
525 <command>SELECT</> query is executed to completion, and each row it
526 outputs is returned as an element of the result set.
530 This feature is normally used when calling the function in the <literal>FROM</>
531 clause. In this case each row returned by the function becomes
532 a row of the table seen by the query. For example, assume that
533 table <literal>foo</> has the same contents as above, and we say:
536 CREATE FUNCTION getfoo(int) RETURNS SETOF foo AS $$
537 SELECT * FROM foo WHERE fooid = $1;
540 SELECT * FROM getfoo(1) AS t1;
545 fooid | foosubid | fooname
546 -------+----------+---------
554 Currently, functions returning sets may also be called in the select list
555 of a query. For each row that the query
556 generates by itself, the function returning set is invoked, and an output
557 row is generated for each element of the function's result set. Note,
558 however, that this capability is deprecated and may be removed in future
559 releases. The following is an example function returning a set from the
563 CREATE FUNCTION listchildren(text) RETURNS SETOF text AS $$
564 SELECT name FROM nodes WHERE parent = $1
578 SELECT listchildren('Top');
586 SELECT name, listchildren(name) FROM nodes;
588 --------+--------------
597 In the last <command>SELECT</command>,
598 notice that no output row appears for <literal>Child2</>, <literal>Child3</>, etc.
599 This happens because <function>listchildren</function> returns an empty set
600 for those arguments, so no result rows are generated.
605 <title>Polymorphic <acronym>SQL</acronym> Functions</title>
608 <acronym>SQL</acronym> functions may be declared to accept and
609 return the polymorphic types <type>anyelement</type> and
610 <type>anyarray</type>. See <xref
611 linkend="extend-types-polymorphic"> for a more detailed
612 explanation of polymorphic functions. Here is a polymorphic
613 function <function>make_array</function> that builds up an array
614 from two arbitrary data type elements:
616 CREATE FUNCTION make_array(anyelement, anyelement) RETURNS anyarray AS $$
617 SELECT ARRAY[$1, $2];
620 SELECT make_array(1, 2) AS intarray, make_array('a'::text, 'b') AS textarray;
622 ----------+-----------
629 Notice the use of the typecast <literal>'a'::text</literal>
630 to specify that the argument is of type <type>text</type>. This is
631 required if the argument is just a string literal, since otherwise
632 it would be treated as type
633 <type>unknown</type>, and array of <type>unknown</type> is not a valid
635 Without the typecast, you will get errors like this:
638 ERROR: could not determine "anyarray"/"anyelement" type because input has type "unknown"
644 It is permitted to have polymorphic arguments with a fixed
645 return type, but the converse is not. For example:
647 CREATE FUNCTION is_greater(anyelement, anyelement) RETURNS boolean AS $$
651 SELECT is_greater(1, 2);
657 CREATE FUNCTION invalid_func() RETURNS anyelement AS $$
660 ERROR: cannot determine result data type
661 DETAIL: A function returning "anyarray" or "anyelement" must have at least one argument of either type.
667 <sect1 id="xfunc-overload">
668 <title>Function Overloading</title>
670 <indexterm zone="xfunc-overload">
671 <primary>overloading</primary>
672 <secondary>functions</secondary>
676 More than one function may be defined with the same SQL name, so long
677 as the arguments they take are different. In other words,
678 function names can be <firstterm>overloaded</firstterm>. When a
679 query is executed, the server will determine which function to
680 call from the data types and the number of the provided arguments.
681 Overloading can also be used to simulate functions with a variable
682 number of arguments, up to a finite maximum number.
686 When creating a family of overloaded functions, one should be
687 careful not to create ambiguities. For instance, given the
690 CREATE FUNCTION test(int, real) RETURNS ...
691 CREATE FUNCTION test(smallint, double precision) RETURNS ...
693 it is not immediately clear which function would be called with
694 some trivial input like <literal>test(1, 1.5)</literal>. The
695 currently implemented resolution rules are described in
696 <xref linkend="typeconv">, but it is unwise to design a system that subtly
697 relies on this behavior.
701 A function that takes a single argument of a composite type should
702 generally not have the same name as any attribute (field) of that type.
703 Recall that <literal>attribute(table)</literal> is considered equivalent
704 to <literal>table.attribute</literal>. In the case that there is an
705 ambiguity between a function on a composite type and an attribute of
706 the composite type, the attribute will always be used. It is possible
707 to override that choice by schema-qualifying the function name
708 (that is, <literal>schema.func(table)</literal>) but it's better to
709 avoid the problem by not choosing conflicting names.
713 When overloading C-language functions, there is an additional
714 constraint: The C name of each function in the family of
715 overloaded functions must be different from the C names of all
716 other functions, either internal or dynamically loaded. If this
717 rule is violated, the behavior is not portable. You might get a
718 run-time linker error, or one of the functions will get called
719 (usually the internal one). The alternative form of the
720 <literal>AS</> clause for the SQL <command>CREATE
721 FUNCTION</command> command decouples the SQL function name from
722 the function name in the C source code. For instance,
724 CREATE FUNCTION test(int) RETURNS int
725 AS '<replaceable>filename</>', 'test_1arg'
727 CREATE FUNCTION test(int, int) RETURNS int
728 AS '<replaceable>filename</>', 'test_2arg'
731 The names of the C functions here reflect one of many possible conventions.
735 <sect1 id="xfunc-volatility">
736 <title>Function Volatility Categories</title>
738 <indexterm zone="xfunc-volatility">
739 <primary>volatility</primary>
740 <secondary>functions</secondary>
744 Every function has a <firstterm>volatility</> classification, with
745 the possibilities being <literal>VOLATILE</>, <literal>STABLE</>, or
746 <literal>IMMUTABLE</>. <literal>VOLATILE</> is the default if the
747 <command>CREATE FUNCTION</command> command does not specify a category.
748 The volatility category is a promise to the optimizer about the behavior
754 A <literal>VOLATILE</> function can do anything, including modifying
755 the database. It can return different results on successive calls with
756 the same arguments. The optimizer makes no assumptions about the
757 behavior of such functions. A query using a volatile function will
758 re-evaluate the function at every row where its value is needed.
763 A <literal>STABLE</> function cannot modify the database and is
764 guaranteed to return the same results given the same arguments
765 for all calls within a single surrounding query. This category
766 allows the optimizer to optimize away multiple calls of the function
767 within a single query. In particular, it is safe to use an expression
768 containing such a function in an index scan condition. (Since an
769 index scan will evaluate the comparison value only once, not once at
770 each row, it is not valid to use a <literal>VOLATILE</> function in
771 an index scan condition.)
776 An <literal>IMMUTABLE</> function cannot modify the database and is
777 guaranteed to return the same results given the same arguments forever.
778 This category allows the optimizer to pre-evaluate the function when
779 a query calls it with constant arguments. For example, a query like
780 <literal>SELECT ... WHERE x = 2 + 2</> can be simplified on sight to
781 <literal>SELECT ... WHERE x = 4</>, because the function underlying
782 the integer addition operator is marked <literal>IMMUTABLE</>.
789 For best optimization results, you should label your functions with the
790 strictest volatility category that is valid for them.
794 Any function with side-effects <emphasis>must</> be labeled
795 <literal>VOLATILE</>, so that calls to it cannot be optimized away.
796 Even a function with no side-effects needs to be labeled
797 <literal>VOLATILE</> if its value can change within a single query;
798 some examples are <literal>random()</>, <literal>currval()</>,
799 <literal>timeofday()</>.
803 There is relatively little difference between <literal>STABLE</> and
804 <literal>IMMUTABLE</> categories when considering simple interactive
805 queries that are planned and immediately executed: it doesn't matter
806 a lot whether a function is executed once during planning or once during
807 query execution startup. But there is a big difference if the plan is
808 saved and reused later. Labeling a function <literal>IMMUTABLE</> when
809 it really isn't may allow it to be prematurely folded to a constant during
810 planning, resulting in a stale value being re-used during subsequent uses
811 of the plan. This is a hazard when using prepared statements or when
812 using function languages that cache plans (such as
813 <application>PL/pgSQL</>).
817 Because of the snapshotting behavior of MVCC (see <xref linkend="mvcc">)
818 a function containing only <command>SELECT</> commands can safely be
819 marked <literal>STABLE</>, even if it selects from tables that might be
820 undergoing modifications by concurrent queries.
821 <productname>PostgreSQL</productname> will execute a <literal>STABLE</>
822 function using the snapshot established for the calling query, and so it
823 will see a fixed view of the database throughout that query.
825 that the <function>current_timestamp</> family of functions qualify
826 as stable, since their values do not change within a transaction.
830 The same snapshotting behavior is used for <command>SELECT</> commands
831 within <literal>IMMUTABLE</> functions. It is generally unwise to select
832 from database tables within an <literal>IMMUTABLE</> function at all,
833 since the immutability will be broken if the table contents ever change.
834 However, <productname>PostgreSQL</productname> does not enforce that you
839 A common error is to label a function <literal>IMMUTABLE</> when its
840 results depend on a configuration parameter. For example, a function
841 that manipulates timestamps might well have results that depend on the
842 <xref linkend="guc-timezone"> setting. For safety, such functions should
843 be labeled <literal>STABLE</> instead.
848 Before <productname>PostgreSQL</productname> release 8.0, the requirement
849 that <literal>STABLE</> and <literal>IMMUTABLE</> functions cannot modify
850 the database was not enforced by the system. Release 8.0 enforces it
851 by requiring SQL functions and procedural language functions of these
852 categories to contain no SQL commands other than <command>SELECT</>.
853 (This is not a completely bulletproof test, since such functions could
854 still call <literal>VOLATILE</> functions that modify the database.
855 If you do that, you will find that the <literal>STABLE</> or
856 <literal>IMMUTABLE</> function does not notice the database changes
857 applied by the called function.)
862 <sect1 id="xfunc-pl">
863 <title>Procedural Language Functions</title>
866 <productname>PostgreSQL</productname> allows user-defined functions
867 to be written in other languages besides SQL and C. These other
868 languages are generically called <firstterm>procedural
869 languages</firstterm> (<acronym>PL</>s).
870 Procedural languages aren't built into the
871 <productname>PostgreSQL</productname> server; they are offered
873 See <xref linkend="xplang"> and following chapters for more
878 <sect1 id="xfunc-internal">
879 <title>Internal Functions</title>
881 <indexterm zone="xfunc-internal"><primary>function</><secondary>internal</></>
884 Internal functions are functions written in C that have been statically
885 linked into the <productname>PostgreSQL</productname> server.
886 The <quote>body</quote> of the function definition
887 specifies the C-language name of the function, which need not be the
888 same as the name being declared for SQL use.
889 (For reasons of backwards compatibility, an empty body
890 is accepted as meaning that the C-language function name is the
891 same as the SQL name.)
895 Normally, all internal functions present in the
896 server are declared during the initialization of the database cluster (<command>initdb</command>),
897 but a user could use <command>CREATE FUNCTION</command>
898 to create additional alias names for an internal function.
899 Internal functions are declared in <command>CREATE FUNCTION</command>
900 with language name <literal>internal</literal>. For instance, to
901 create an alias for the <function>sqrt</function> function:
903 CREATE FUNCTION square_root(double precision) RETURNS double precision
908 (Most internal functions expect to be declared <quote>strict</quote>.)
913 Not all <quote>predefined</quote> functions are
914 <quote>internal</quote> in the above sense. Some predefined
915 functions are written in SQL.
921 <title>C-Language Functions</title>
923 <indexterm zone="xfunc-sql">
924 <primary>function</primary>
925 <secondary>user-defined</secondary>
926 <tertiary>in C</tertiary>
930 User-defined functions can be written in C (or a language that can
931 be made compatible with C, such as C++). Such functions are
932 compiled into dynamically loadable objects (also called shared
933 libraries) and are loaded by the server on demand. The dynamic
934 loading feature is what distinguishes <quote>C language</> functions
935 from <quote>internal</> functions — the actual coding conventions
936 are essentially the same for both. (Hence, the standard internal
937 function library is a rich source of coding examples for user-defined
942 Two different calling conventions are currently used for C functions.
943 The newer <quote>version 1</quote> calling convention is indicated by writing
944 a <literal>PG_FUNCTION_INFO_V1()</literal> macro call for the function,
945 as illustrated below. Lack of such a macro indicates an old-style
946 (<quote>version 0</quote>) function. The language name specified in <command>CREATE FUNCTION</command>
947 is <literal>C</literal> in either case. Old-style functions are now deprecated
948 because of portability problems and lack of functionality, but they
949 are still supported for compatibility reasons.
952 <sect2 id="xfunc-c-dynload">
953 <title>Dynamic Loading</title>
955 <indexterm zone="xfunc-c-dynload">
956 <primary>dynamic loading</primary>
960 The first time a user-defined function in a particular
961 loadable object file is called in a session,
962 the dynamic loader loads that object file into memory so that the
963 function can be called. The <command>CREATE FUNCTION</command>
964 for a user-defined C function must therefore specify two pieces of
965 information for the function: the name of the loadable
966 object file, and the C name (link symbol) of the specific function to call
967 within that object file. If the C name is not explicitly specified then
968 it is assumed to be the same as the SQL function name.
972 The following algorithm is used to locate the shared object file
973 based on the name given in the <command>CREATE FUNCTION</command>
979 If the name is an absolute path, the given file is loaded.
985 If the name starts with the string <literal>$libdir</literal>,
986 that part is replaced by the <productname>PostgreSQL</> package
988 name, which is determined at build time.<indexterm><primary>$libdir</></>
994 If the name does not contain a directory part, the file is
995 searched for in the path specified by the configuration variable
996 <xref linkend="guc-dynamic-library-path">.<indexterm><primary>dynamic_library_path</></>
1002 Otherwise (the file was not found in the path, or it contains a
1003 non-absolute directory part), the dynamic loader will try to
1004 take the name as given, which will most likely fail. (It is
1005 unreliable to depend on the current working directory.)
1010 If this sequence does not work, the platform-specific shared
1011 library file name extension (often <filename>.so</filename>) is
1012 appended to the given name and this sequence is tried again. If
1013 that fails as well, the load will fail.
1017 The user ID the <productname>PostgreSQL</productname> server runs
1018 as must be able to traverse the path to the file you intend to
1019 load. Making the file or a higher-level directory not readable
1020 and/or not executable by the <systemitem>postgres</systemitem>
1021 user is a common mistake.
1025 In any case, the file name that is given in the
1026 <command>CREATE FUNCTION</command> command is recorded literally
1027 in the system catalogs, so if the file needs to be loaded again
1028 the same procedure is applied.
1033 <productname>PostgreSQL</productname> will not compile a C function
1034 automatically. The object file must be compiled before it is referenced
1035 in a <command>CREATE
1036 FUNCTION</> command. See <xref linkend="dfunc"> for additional
1042 After it is used for the first time, a dynamically loaded object
1043 file is retained in memory. Future calls in the same session to
1044 the function(s) in that file will only incur the small overhead of
1045 a symbol table lookup. If you need to force a reload of an object
1046 file, for example after recompiling it, use the <command>LOAD</>
1047 command or begin a fresh session.
1051 It is recommended to locate shared libraries either relative to
1052 <literal>$libdir</literal> or through the dynamic library path.
1053 This simplifies version upgrades if the new installation is at a
1054 different location. The actual directory that
1055 <literal>$libdir</literal> stands for can be found out with the
1056 command <literal>pg_config --pkglibdir</literal>.
1060 Before <productname>PostgreSQL</productname> release 7.2, only
1061 exact absolute paths to object files could be specified in
1062 <command>CREATE FUNCTION</>. This approach is now deprecated
1063 since it makes the function definition unnecessarily unportable.
1064 It's best to specify just the shared library name with no path nor
1065 extension, and let the search mechanism provide that information
1070 <sect2 id="xfunc-c-basetype">
1071 <title>Base Types in C-Language Functions</title>
1073 <indexterm zone="xfunc-c-basetype">
1074 <primary>data type</primary>
1075 <secondary>internal organisation</secondary>
1079 To know how to write C-language functions, you need to know how
1080 <productname>PostgreSQL</productname> internally represents base
1081 data types and how they can be passed to and from functions.
1082 Internally, <productname>PostgreSQL</productname> regards a base
1083 type as a <quote>blob of memory</quote>. The user-defined
1084 functions that you define over a type in turn define the way that
1085 <productname>PostgreSQL</productname> can operate on it. That
1086 is, <productname>PostgreSQL</productname> will only store and
1087 retrieve the data from disk and use your user-defined functions
1088 to input, process, and output the data.
1092 Base types can have one of three internal formats:
1097 pass by value, fixed-length
1102 pass by reference, fixed-length
1107 pass by reference, variable-length
1114 By-value types can only be 1, 2, or 4 bytes in length
1115 (also 8 bytes, if <literal>sizeof(Datum)</literal> is 8 on your machine).
1116 You should be careful
1117 to define your types such that they will be the same
1118 size (in bytes) on all architectures. For example, the
1119 <literal>long</literal> type is dangerous because it
1120 is 4 bytes on some machines and 8 bytes on others, whereas
1121 <type>int</type> type is 4 bytes on most
1122 Unix machines. A reasonable implementation of
1123 the <type>int4</type> type on Unix
1127 /* 4-byte integer, passed by value */
1133 On the other hand, fixed-length types of any size may
1134 be passed by-reference. For example, here is a sample
1135 implementation of a <productname>PostgreSQL</productname> type:
1138 /* 16-byte structure, passed by reference */
1145 Only pointers to such types can be used when passing
1146 them in and out of <productname>PostgreSQL</productname> functions.
1147 To return a value of such a type, allocate the right amount of
1148 memory with <literal>palloc</literal>, fill in the allocated memory,
1149 and return a pointer to it. (You can also return an input value
1150 that has the same type as the return value directly by returning
1151 the pointer to the input value. <emphasis>Never</> modify the
1152 contents of a pass-by-reference input value, however.)
1156 Finally, all variable-length types must also be passed
1157 by reference. All variable-length types must begin
1158 with a length field of exactly 4 bytes, and all data to
1159 be stored within that type must be located in the memory
1160 immediately following that length field. The
1161 length field contains the total length of the structure,
1162 that is, it includes the size of the length field
1167 As an example, we can define the type <type>text</type> as
1177 Obviously, the data field declared here is not long enough to hold
1178 all possible strings. Since it's impossible to declare a variable-size
1179 structure in <acronym>C</acronym>, we rely on the knowledge that the
1180 <acronym>C</acronym> compiler won't range-check array subscripts. We
1181 just allocate the necessary amount of space and then access the array as
1182 if it were declared the right length. (This is a common trick, which
1183 you can read about in many textbooks about C.)
1188 variable-length types, we must be careful to allocate
1189 the correct amount of memory and set the length field correctly.
1190 For example, if we wanted to store 40 bytes in a <structname>text</>
1191 structure, we might use a code fragment like this:
1194 #include "postgres.h"
1196 char buffer[40]; /* our source data */
1198 text *destination = (text *) palloc(VARHDRSZ + 40);
1199 destination->length = VARHDRSZ + 40;
1200 memcpy(destination->data, buffer, 40);
1204 <literal>VARHDRSZ</> is the same as <literal>sizeof(int4)</>, but
1205 it's considered good style to use the macro <literal>VARHDRSZ</>
1206 to refer to the size of the overhead for a variable-length type.
1210 <xref linkend="xfunc-c-type-table"> specifies which C type
1211 corresponds to which SQL type when writing a C-language function
1212 that uses a built-in type of <productname>PostgreSQL</>.
1213 The <quote>Defined In</quote> column gives the header file that
1214 needs to be included to get the type definition. (The actual
1215 definition may be in a different file that is included by the
1216 listed file. It is recommended that users stick to the defined
1217 interface.) Note that you should always include
1218 <filename>postgres.h</filename> first in any source file, because
1219 it declares a number of things that you will need anyway.
1222 <table tocentry="1" id="xfunc-c-type-table">
1223 <title>Equivalent C Types for Built-In SQL Types</title>
1240 <entry><type>abstime</type></entry>
1241 <entry><type>AbsoluteTime</type></entry>
1242 <entry><filename>utils/nabstime.h</filename></entry>
1245 <entry><type>boolean</type></entry>
1246 <entry><type>bool</type></entry>
1247 <entry><filename>postgres.h</filename> (maybe compiler built-in)</entry>
1250 <entry><type>box</type></entry>
1251 <entry><type>BOX*</type></entry>
1252 <entry><filename>utils/geo_decls.h</filename></entry>
1255 <entry><type>bytea</type></entry>
1256 <entry><type>bytea*</type></entry>
1257 <entry><filename>postgres.h</filename></entry>
1260 <entry><type>"char"</type></entry>
1261 <entry><type>char</type></entry>
1262 <entry>(compiler built-in)</entry>
1265 <entry><type>character</type></entry>
1266 <entry><type>BpChar*</type></entry>
1267 <entry><filename>postgres.h</filename></entry>
1270 <entry><type>cid</type></entry>
1271 <entry><type>CommandId</type></entry>
1272 <entry><filename>postgres.h</filename></entry>
1275 <entry><type>date</type></entry>
1276 <entry><type>DateADT</type></entry>
1277 <entry><filename>utils/date.h</filename></entry>
1280 <entry><type>smallint</type> (<type>int2</type>)</entry>
1281 <entry><type>int2</type> or <type>int16</type></entry>
1282 <entry><filename>postgres.h</filename></entry>
1285 <entry><type>int2vector</type></entry>
1286 <entry><type>int2vector*</type></entry>
1287 <entry><filename>postgres.h</filename></entry>
1290 <entry><type>integer</type> (<type>int4</type>)</entry>
1291 <entry><type>int4</type> or <type>int32</type></entry>
1292 <entry><filename>postgres.h</filename></entry>
1295 <entry><type>real</type> (<type>float4</type>)</entry>
1296 <entry><type>float4*</type></entry>
1297 <entry><filename>postgres.h</filename></entry>
1300 <entry><type>double precision</type> (<type>float8</type>)</entry>
1301 <entry><type>float8*</type></entry>
1302 <entry><filename>postgres.h</filename></entry>
1305 <entry><type>interval</type></entry>
1306 <entry><type>Interval*</type></entry>
1307 <entry><filename>utils/timestamp.h</filename></entry>
1310 <entry><type>lseg</type></entry>
1311 <entry><type>LSEG*</type></entry>
1312 <entry><filename>utils/geo_decls.h</filename></entry>
1315 <entry><type>name</type></entry>
1316 <entry><type>Name</type></entry>
1317 <entry><filename>postgres.h</filename></entry>
1320 <entry><type>oid</type></entry>
1321 <entry><type>Oid</type></entry>
1322 <entry><filename>postgres.h</filename></entry>
1325 <entry><type>oidvector</type></entry>
1326 <entry><type>oidvector*</type></entry>
1327 <entry><filename>postgres.h</filename></entry>
1330 <entry><type>path</type></entry>
1331 <entry><type>PATH*</type></entry>
1332 <entry><filename>utils/geo_decls.h</filename></entry>
1335 <entry><type>point</type></entry>
1336 <entry><type>POINT*</type></entry>
1337 <entry><filename>utils/geo_decls.h</filename></entry>
1340 <entry><type>regproc</type></entry>
1341 <entry><type>regproc</type></entry>
1342 <entry><filename>postgres.h</filename></entry>
1345 <entry><type>reltime</type></entry>
1346 <entry><type>RelativeTime</type></entry>
1347 <entry><filename>utils/nabstime.h</filename></entry>
1350 <entry><type>text</type></entry>
1351 <entry><type>text*</type></entry>
1352 <entry><filename>postgres.h</filename></entry>
1355 <entry><type>tid</type></entry>
1356 <entry><type>ItemPointer</type></entry>
1357 <entry><filename>storage/itemptr.h</filename></entry>
1360 <entry><type>time</type></entry>
1361 <entry><type>TimeADT</type></entry>
1362 <entry><filename>utils/date.h</filename></entry>
1365 <entry><type>time with time zone</type></entry>
1366 <entry><type>TimeTzADT</type></entry>
1367 <entry><filename>utils/date.h</filename></entry>
1370 <entry><type>timestamp</type></entry>
1371 <entry><type>Timestamp*</type></entry>
1372 <entry><filename>utils/timestamp.h</filename></entry>
1375 <entry><type>tinterval</type></entry>
1376 <entry><type>TimeInterval</type></entry>
1377 <entry><filename>utils/nabstime.h</filename></entry>
1380 <entry><type>varchar</type></entry>
1381 <entry><type>VarChar*</type></entry>
1382 <entry><filename>postgres.h</filename></entry>
1385 <entry><type>xid</type></entry>
1386 <entry><type>TransactionId</type></entry>
1387 <entry><filename>postgres.h</filename></entry>
1394 Now that we've gone over all of the possible structures
1395 for base types, we can show some examples of real functions.
1400 <title>Calling Conventions Version 0 for C-Language Functions</title>
1403 We present the <quote>old style</quote> calling convention first — although
1404 this approach is now deprecated, it's easier to get a handle on
1405 initially. In the version-0 method, the arguments and result
1406 of the C function are just declared in normal C style, but being
1407 careful to use the C representation of each SQL data type as shown
1412 Here are some examples:
1415 #include "postgres.h"
1416 #include <string.h>
1426 /* by reference, fixed length */
1429 add_one_float8(float8 *arg)
1431 float8 *result = (float8 *) palloc(sizeof(float8));
1433 *result = *arg + 1.0;
1439 makepoint(Point *pointx, Point *pointy)
1441 Point *new_point = (Point *) palloc(sizeof(Point));
1443 new_point->x = pointx->x;
1444 new_point->y = pointy->y;
1449 /* by reference, variable length */
1455 * VARSIZE is the total size of the struct in bytes.
1457 text *new_t = (text *) palloc(VARSIZE(t));
1458 VARATT_SIZEP(new_t) = VARSIZE(t);
1460 * VARDATA is a pointer to the data region of the struct.
1462 memcpy((void *) VARDATA(new_t), /* destination */
1463 (void *) VARDATA(t), /* source */
1464 VARSIZE(t)-VARHDRSZ); /* how many bytes */
1469 concat_text(text *arg1, text *arg2)
1471 int32 new_text_size = VARSIZE(arg1) + VARSIZE(arg2) - VARHDRSZ;
1472 text *new_text = (text *) palloc(new_text_size);
1474 VARATT_SIZEP(new_text) = new_text_size;
1475 memcpy(VARDATA(new_text), VARDATA(arg1), VARSIZE(arg1)-VARHDRSZ);
1476 memcpy(VARDATA(new_text) + (VARSIZE(arg1)-VARHDRSZ),
1477 VARDATA(arg2), VARSIZE(arg2)-VARHDRSZ);
1484 Supposing that the above code has been prepared in file
1485 <filename>funcs.c</filename> and compiled into a shared object,
1486 we could define the functions to <productname>PostgreSQL</productname>
1487 with commands like this:
1490 CREATE FUNCTION add_one(integer) RETURNS integer
1491 AS '<replaceable>DIRECTORY</replaceable>/funcs', 'add_one'
1494 -- note overloading of SQL function name "add_one"
1495 CREATE FUNCTION add_one(double precision) RETURNS double precision
1496 AS '<replaceable>DIRECTORY</replaceable>/funcs', 'add_one_float8'
1499 CREATE FUNCTION makepoint(point, point) RETURNS point
1500 AS '<replaceable>DIRECTORY</replaceable>/funcs', 'makepoint'
1503 CREATE FUNCTION copytext(text) RETURNS text
1504 AS '<replaceable>DIRECTORY</replaceable>/funcs', 'copytext'
1507 CREATE FUNCTION concat_text(text, text) RETURNS text
1508 AS '<replaceable>DIRECTORY</replaceable>/funcs', 'concat_text',
1514 Here, <replaceable>DIRECTORY</replaceable> stands for the
1515 directory of the shared library file (for instance the
1516 <productname>PostgreSQL</productname> tutorial directory, which
1517 contains the code for the examples used in this section).
1518 (Better style would be to use just <literal>'funcs'</> in the
1519 <literal>AS</> clause, after having added
1520 <replaceable>DIRECTORY</replaceable> to the search path. In any
1521 case, we may omit the system-specific extension for a shared
1522 library, commonly <literal>.so</literal> or
1523 <literal>.sl</literal>.)
1527 Notice that we have specified the functions as <quote>strict</quote>,
1529 the system should automatically assume a null result if any input
1530 value is null. By doing this, we avoid having to check for null inputs
1531 in the function code. Without this, we'd have to check for null values
1532 explicitly, by checking for a null pointer for each
1533 pass-by-reference argument. (For pass-by-value arguments, we don't
1534 even have a way to check!)
1538 Although this calling convention is simple to use,
1539 it is not very portable; on some architectures there are problems
1540 with passing data types that are smaller than <type>int</type> this way. Also, there is
1541 no simple way to return a null result, nor to cope with null arguments
1542 in any way other than making the function strict. The version-1
1543 convention, presented next, overcomes these objections.
1548 <title>Calling Conventions Version 1 for C-Language Functions</title>
1551 The version-1 calling convention relies on macros to suppress most
1552 of the complexity of passing arguments and results. The C declaration
1553 of a version-1 function is always
1555 Datum funcname(PG_FUNCTION_ARGS)
1557 In addition, the macro call
1559 PG_FUNCTION_INFO_V1(funcname);
1561 must appear in the same source file. (Conventionally. it's
1562 written just before the function itself.) This macro call is not
1563 needed for <literal>internal</>-language functions, since
1564 <productname>PostgreSQL</> assumes that all internal functions
1565 use the version-1 convention. It is, however, required for
1566 dynamically-loaded functions.
1570 In a version-1 function, each actual argument is fetched using a
1571 <function>PG_GETARG_<replaceable>xxx</replaceable>()</function>
1572 macro that corresponds to the argument's data type, and the
1573 result is returned using a
1574 <function>PG_RETURN_<replaceable>xxx</replaceable>()</function>
1575 macro for the return type.
1576 <function>PG_GETARG_<replaceable>xxx</replaceable>()</function>
1577 takes as its argument the number of the function argument to
1578 fetch, where the count starts at 0.
1579 <function>PG_RETURN_<replaceable>xxx</replaceable>()</function>
1580 takes as its argument the actual value to return.
1584 Here we show the same functions as above, coded in version-1 style:
1587 #include "postgres.h"
1588 #include <string.h>
1593 PG_FUNCTION_INFO_V1(add_one);
1596 add_one(PG_FUNCTION_ARGS)
1598 int32 arg = PG_GETARG_INT32(0);
1600 PG_RETURN_INT32(arg + 1);
1603 /* b reference, fixed length */
1605 PG_FUNCTION_INFO_V1(add_one_float8);
1608 add_one_float8(PG_FUNCTION_ARGS)
1610 /* The macros for FLOAT8 hide its pass-by-reference nature. */
1611 float8 arg = PG_GETARG_FLOAT8(0);
1613 PG_RETURN_FLOAT8(arg + 1.0);
1616 PG_FUNCTION_INFO_V1(makepoint);
1619 makepoint(PG_FUNCTION_ARGS)
1621 /* Here, the pass-by-reference nature of Point is not hidden. */
1622 Point *pointx = PG_GETARG_POINT_P(0);
1623 Point *pointy = PG_GETARG_POINT_P(1);
1624 Point *new_point = (Point *) palloc(sizeof(Point));
1626 new_point->x = pointx->x;
1627 new_point->y = pointy->y;
1629 PG_RETURN_POINT_P(new_point);
1632 /* by reference, variable length */
1634 PG_FUNCTION_INFO_V1(copytext);
1637 copytext(PG_FUNCTION_ARGS)
1639 text *t = PG_GETARG_TEXT_P(0);
1641 * VARSIZE is the total size of the struct in bytes.
1643 text *new_t = (text *) palloc(VARSIZE(t));
1644 VARATT_SIZEP(new_t) = VARSIZE(t);
1646 * VARDATA is a pointer to the data region of the struct.
1648 memcpy((void *) VARDATA(new_t), /* destination */
1649 (void *) VARDATA(t), /* source */
1650 VARSIZE(t)-VARHDRSZ); /* how many bytes */
1651 PG_RETURN_TEXT_P(new_t);
1654 PG_FUNCTION_INFO_V1(concat_text);
1657 concat_text(PG_FUNCTION_ARGS)
1659 text *arg1 = PG_GETARG_TEXT_P(0);
1660 text *arg2 = PG_GETARG_TEXT_P(1);
1661 int32 new_text_size = VARSIZE(arg1) + VARSIZE(arg2) - VARHDRSZ;
1662 text *new_text = (text *) palloc(new_text_size);
1664 VARATT_SIZEP(new_text) = new_text_size;
1665 memcpy(VARDATA(new_text), VARDATA(arg1), VARSIZE(arg1)-VARHDRSZ);
1666 memcpy(VARDATA(new_text) + (VARSIZE(arg1)-VARHDRSZ),
1667 VARDATA(arg2), VARSIZE(arg2)-VARHDRSZ);
1668 PG_RETURN_TEXT_P(new_text);
1674 The <command>CREATE FUNCTION</command> commands are the same as
1675 for the version-0 equivalents.
1679 At first glance, the version-1 coding conventions may appear to
1680 be just pointless obscurantism. They do, however, offer a number
1681 of improvements, because the macros can hide unnecessary detail.
1682 An example is that in coding <function>add_one_float8</>, we no longer need to
1683 be aware that <type>float8</type> is a pass-by-reference type. Another
1684 example is that the <literal>GETARG</> macros for variable-length types allow
1685 for more efficient fetching of <quote>toasted</quote> (compressed or
1686 out-of-line) values.
1690 One big improvement in version-1 functions is better handling of null
1691 inputs and results. The macro <function>PG_ARGISNULL(<replaceable>n</>)</function>
1692 allows a function to test whether each input is null. (Of course, doing
1693 this is only necessary in functions not declared <quote>strict</>.)
1695 <function>PG_GETARG_<replaceable>xxx</replaceable>()</function> macros,
1696 the input arguments are counted beginning at zero. Note that one
1697 should refrain from executing
1698 <function>PG_GETARG_<replaceable>xxx</replaceable>()</function> until
1699 one has verified that the argument isn't null.
1700 To return a null result, execute <function>PG_RETURN_NULL()</function>;
1701 this works in both strict and nonstrict functions.
1705 Other options provided in the new-style interface are two
1707 <function>PG_GETARG_<replaceable>xxx</replaceable>()</function>
1708 macros. The first of these,
1709 <function>PG_GETARG_<replaceable>xxx</replaceable>_COPY()</function>,
1710 guarantees to return a copy of the specified argument that is
1711 safe for writing into. (The normal macros will sometimes return a
1712 pointer to a value that is physically stored in a table, which
1713 must not be written to. Using the
1714 <function>PG_GETARG_<replaceable>xxx</replaceable>_COPY()</function>
1715 macros guarantees a writable result.)
1716 The second variant consists of the
1717 <function>PG_GETARG_<replaceable>xxx</replaceable>_SLICE()</function>
1718 macros which take three arguments. The first is the number of the
1719 function argument (as above). The second and third are the offset and
1720 length of the segment to be returned. Offsets are counted from
1721 zero, and a negative length requests that the remainder of the
1722 value be returned. These macros provide more efficient access to
1723 parts of large values in the case where they have storage type
1724 <quote>external</quote>. (The storage type of a column can be specified using
1725 <literal>ALTER TABLE <replaceable>tablename</replaceable> ALTER
1726 COLUMN <replaceable>colname</replaceable> SET STORAGE
1727 <replaceable>storagetype</replaceable></literal>. <replaceable>storagetype</replaceable> is one of
1728 <literal>plain</>, <literal>external</>, <literal>extended</literal>,
1729 or <literal>main</>.)
1733 Finally, the version-1 function call conventions make it possible
1734 to return set results (<xref linkend="xfunc-c-return-set">) and
1735 implement trigger functions (<xref linkend="triggers">) and
1736 procedural-language call handlers (<xref
1737 linkend="plhandler">). Version-1 code is also more
1738 portable than version-0, because it does not break restrictions
1739 on function call protocol in the C standard. For more details
1740 see <filename>src/backend/utils/fmgr/README</filename> in the
1741 source distribution.
1746 <title>Writing Code</title>
1749 Before we turn to the more advanced topics, we should discuss
1750 some coding rules for <productname>PostgreSQL</productname>
1751 C-language functions. While it may be possible to load functions
1752 written in languages other than C into
1753 <productname>PostgreSQL</productname>, this is usually difficult
1754 (when it is possible at all) because other languages, such as
1755 C++, FORTRAN, or Pascal often do not follow the same calling
1756 convention as C. That is, other languages do not pass argument
1757 and return values between functions in the same way. For this
1758 reason, we will assume that your C-language functions are
1759 actually written in C.
1763 The basic rules for writing and building C functions are as follows:
1768 Use <literal>pg_config
1769 --includedir-server</literal><indexterm><primary>pg_config</><secondary>with user-defined C functions</></>
1770 to find out where the <productname>PostgreSQL</> server header
1771 files are installed on your system (or the system that your
1772 users will be running on). This option is new with
1773 <productname>PostgreSQL</> 7.2. For
1774 <productname>PostgreSQL</> 7.1 you should use the option
1775 <option>--includedir</option>. (<command>pg_config</command>
1776 will exit with a non-zero status if it encounters an unknown
1777 option.) For releases prior to 7.1 you will have to guess,
1778 but since that was before the current calling conventions were
1779 introduced, it is unlikely that you want to support those
1786 When allocating memory, use the
1787 <productname>PostgreSQL</productname> functions
1788 <function>palloc</function><indexterm><primary>palloc</></> and <function>pfree</function><indexterm><primary>pfree</></>
1789 instead of the corresponding C library functions
1790 <function>malloc</function> and <function>free</function>.
1791 The memory allocated by <function>palloc</function> will be
1792 freed automatically at the end of each transaction, preventing
1799 Always zero the bytes of your structures using
1800 <function>memset</function>. Without this, it's difficult to
1801 support hash indexes or hash joins, as you must pick out only
1802 the significant bits of your data structure to compute a hash.
1803 Even if you initialize all fields of your structure, there may be
1804 alignment padding (holes in the structure) that may contain
1811 Most of the internal <productname>PostgreSQL</productname>
1812 types are declared in <filename>postgres.h</filename>, while
1813 the function manager interfaces
1814 (<symbol>PG_FUNCTION_ARGS</symbol>, etc.) are in
1815 <filename>fmgr.h</filename>, so you will need to include at
1816 least these two files. For portability reasons it's best to
1817 include <filename>postgres.h</filename> <emphasis>first</>,
1818 before any other system or user header files. Including
1819 <filename>postgres.h</filename> will also include
1820 <filename>elog.h</filename> and <filename>palloc.h</filename>
1827 Symbol names defined within object files must not conflict
1828 with each other or with symbols defined in the
1829 <productname>PostgreSQL</productname> server executable. You
1830 will have to rename your functions or variables if you get
1831 error messages to this effect.
1837 Compiling and linking your code so that it can be dynamically
1838 loaded into <productname>PostgreSQL</productname> always
1839 requires special flags. See <xref linkend="dfunc"> for a
1840 detailed explanation of how to do it for your particular
1850 <sect2 id="xfunc-c-pgxs">
1851 <title>Extension Building Infrastructure</title>
1853 <indexterm zone="xfunc-c-pgxs">
1854 <primary>pgxs</primary>
1858 If you are thinking about distributing your
1859 <productname>PostgreSQL</> extension modules, setting up a
1860 portable build system for them can be fairly difficult. Therefore
1861 the <productname>PostgreSQL</> installation provides a build
1862 infrastructure for extensions, called <acronym>PGXS</acronym>, so
1863 that simple extension modules can be built simply against an
1864 already installed server. Note that this infrastructure is not
1865 intended to be a universal build system framework that can be used
1866 to build all software interfacing to <productname>PostgreSQL</>;
1867 it simply automates common build rules for simple server extension
1868 modules. For more complicated packages, you need to write your
1873 To use the infrastructure for your extension, you must write a
1874 simple makefile. In that makefile, you need to set some variables
1875 and finally include the global <acronym>PGXS</acronym> makefile.
1876 Here is an example that builds an extension module named
1877 <literal>isbn_issn</literal> consisting of a shared library, an
1878 SQL script, and a documentation text file:
1881 DATA_built = isbn_issn.sql
1882 DOCS = README.isbn_issn
1884 PGXS := $(shell pg_config --pgxs)
1887 The last two lines should always be the same. Earlier in the
1888 file, you assign variables or add custom
1889 <application>make</application> rules.
1893 The following variables can be set:
1897 <term><varname>MODULES</varname></term>
1900 list of shared objects to be build from source file with same
1901 stem (do not include suffix in this list)
1907 <term><varname>DATA</varname></term>
1910 random files to install into <literal><replaceable>prefix</replaceable>/share/contrib</literal>
1916 <term><varname>DATA_built</varname></term>
1919 random files to install into
1920 <literal><replaceable>prefix</replaceable>/share/contrib</literal>,
1921 which need to be built first
1927 <term><varname>DOCS</varname></term>
1930 random files to install under
1931 <literal><replaceable>prefix</replaceable>/doc/contrib</literal>
1937 <term><varname>SCRIPTS</varname></term>
1940 script files (not binaries) to install into
1941 <literal><replaceable>prefix</replaceable>/bin</literal>
1947 <term><varname>SCRIPTS_built</varname></term>
1950 script files (not binaries) to install into
1951 <literal><replaceable>prefix</replaceable>/bin</literal>,
1952 which need to be built first
1958 <term><varname>REGRESS</varname></term>
1961 list of regression test cases (without suffix)
1967 or at most one of these two:
1971 <term><varname>PROGRAM</varname></term>
1974 a binary program to build (list objects files in <varname>OBJS</varname>)
1980 <term><varname>MODULE_big</varname></term>
1983 a shared object to build (list object files in <varname>OBJS</varname>)
1989 The following can also be set:
1994 <term><varname>EXTRA_CLEAN</varname></term>
1997 extra files to remove in <literal>make clean</literal>
2003 <term><varname>PG_CPPFLAGS</varname></term>
2006 will be added to <varname>CPPFLAGS</varname>
2012 <term><varname>PG_LIBS</varname></term>
2015 will be added to <varname>PROGRAM</varname> link line
2021 <term><varname>SHLIB_LINK</varname></term>
2024 will be added to <varname>MODULE_big</varname> link line
2032 Put this makefile as <literal>Makefile</literal> in the directory
2033 which holds your extension. Then you can do
2034 <literal>make</literal> to compile, and later <literal>make
2035 install</literal> to install your module. The extension is
2036 compiled and installed for the
2037 <productname>PostgreSQL</productname> installation that
2038 corresponds to the first <command>pg_config</command> command
2045 <title>Composite-Type Arguments in C-Language Functions</title>
2048 Composite types do not have a fixed layout like C structures.
2049 Instances of a composite type may contain null fields. In
2050 addition, composite types that are part of an inheritance
2051 hierarchy may have different fields than other members of the
2052 same inheritance hierarchy. Therefore,
2053 <productname>PostgreSQL</productname> provides a function
2054 interface for accessing fields of composite types from C.
2058 Suppose we want to write a function to answer the query
2061 SELECT name, c_overpaid(emp, 1500) AS overpaid
2063 WHERE name = 'Bill' OR name = 'Sam';
2066 Using call conventions version 0, we can define
2067 <function>c_overpaid</> as:
2070 #include "postgres.h"
2071 #include "executor/executor.h" /* for GetAttributeByName() */
2074 c_overpaid(HeapTupleHeader t, /* the current row of emp */
2080 salary = DatumGetInt32(GetAttributeByName(t, "salary", &isnull));
2083 return salary > limit;
2087 In version-1 coding, the above would look like this:
2090 #include "postgres.h"
2091 #include "executor/executor.h" /* for GetAttributeByName() */
2093 PG_FUNCTION_INFO_V1(c_overpaid);
2096 c_overpaid(PG_FUNCTION_ARGS)
2098 HeapTupleHeader t = PG_GETARG_HEAPTUPLEHEADER(0);
2099 int32 limit = PG_GETARG_INT32(1);
2103 salary = GetAttributeByName(t, "salary", &isnull);
2105 PG_RETURN_BOOL(false);
2106 /* Alternatively, we might prefer to do PG_RETURN_NULL() for null salary. */
2108 PG_RETURN_BOOL(DatumGetInt32(salary) > limit);
2114 <function>GetAttributeByName</function> is the
2115 <productname>PostgreSQL</productname> system function that
2116 returns attributes out of the specified row. It has
2117 three arguments: the argument of type <type>HeapTupleHeader</type> passed
2119 the function, the name of the desired attribute, and a
2120 return parameter that tells whether the attribute
2121 is null. <function>GetAttributeByName</function> returns a <type>Datum</type>
2122 value that you can convert to the proper data type by using the
2123 appropriate <function>DatumGet<replaceable>XXX</replaceable>()</function>
2124 macro. Note that the return value is meaningless if the null flag is
2125 set; always check the null flag before trying to do anything with the
2130 There is also <function>GetAttributeByNum</function>, which selects
2131 the target attribute by column number instead of name.
2135 The following command declares the function
2136 <function>c_overpaid</function> in SQL:
2139 CREATE FUNCTION c_overpaid(emp, integer) RETURNS boolean
2140 AS '<replaceable>DIRECTORY</replaceable>/funcs', 'c_overpaid'
2144 Notice we have used <literal>STRICT</> so that we did not have to
2145 check whether the input arguments were NULL.
2150 <title>Returning Rows (Composite Types) from C-Language Functions</title>
2153 To return a row or composite-type value from a C-language
2154 function, you can use a special API that provides macros and
2155 functions to hide most of the complexity of building composite
2156 data types. To use this API, the source file must include:
2158 #include "funcapi.h"
2163 There are two ways you can build a composite data value (henceforth
2164 a <quote>tuple</>): you can build it from an array of Datum values,
2165 or from an array of C strings that can be passed to the input
2166 conversion functions of the tuple's column data types. In either
2167 case, you first need to obtain or construct a <structname>TupleDesc</>
2168 descriptor for the tuple structure. When working with Datums, you
2169 pass the <structname>TupleDesc</> to <function>BlessTupleDesc</>,
2170 and then call <function>heap_formtuple</> for each row. When working
2171 with C strings, you pass the <structname>TupleDesc</> to
2172 <function>TupleDescGetAttInMetadata</>, and then call
2173 <function>BuildTupleFromCStrings</> for each row. In the case of a
2174 function returning a set of tuples, the setup steps can all be done
2175 once during the first call of the function.
2179 Several helper functions are available for setting up the initial
2180 <structname>TupleDesc</>. If you want to use a named composite type,
2181 you can fetch the information from the system catalogs. Use
2183 TupleDesc RelationNameGetTupleDesc(const char *relname)
2185 to get a <structname>TupleDesc</> for a named relation, or
2187 TupleDesc TypeGetTupleDesc(Oid typeoid, List *colaliases)
2189 to get a <structname>TupleDesc</> based on a type OID. This can
2190 be used to get a <structname>TupleDesc</> for a base or
2191 composite type. When writing a function that returns
2192 <structname>record</>, the expected <structname>TupleDesc</>
2193 must be passed in by the caller.
2197 Once you have a <structname>TupleDesc</>, call
2199 TupleDesc BlessTupleDesc(TupleDesc tupdesc)
2201 if you plan to work with Datums, or
2203 AttInMetadata *TupleDescGetAttInMetadata(TupleDesc tupdesc)
2205 if you plan to work with C strings. If you are writing a function
2206 returning set, you can save the results of these functions in the
2207 <structname>FuncCallContext</> structure — use the
2208 <structfield>tuple_desc</> or <structfield>attinmeta</> field
2213 When working with Datums, use
2215 HeapTuple heap_formtuple(TupleDesc tupdesc, Datum *values, char *nulls)
2217 to build a <structname>HeapTuple</> given user data in Datum form.
2221 When working with C strings, use
2223 HeapTuple BuildTupleFromCStrings(AttInMetadata *attinmeta, char **values)
2225 to build a <structname>HeapTuple</> given user data
2226 in C string form. <literal>values</literal> is an array of C strings,
2227 one for each attribute of the return row. Each C string should be in
2228 the form expected by the input function of the attribute data
2229 type. In order to return a null value for one of the attributes,
2230 the corresponding pointer in the <parameter>values</> array
2231 should be set to <symbol>NULL</>. This function will need to
2232 be called again for each row you return.
2236 Once you have built a tuple to return from your function, it
2237 must be converted into a <type>Datum</>. Use
2239 HeapTupleGetDatum(HeapTuple tuple)
2241 to convert a <structname>HeapTuple</> into a valid Datum. This
2242 <type>Datum</> can be returned directly if you intend to return
2243 just a single row, or it can be used as the current return value
2244 in a set-returning function.
2248 An example appears in the next section.
2253 <sect2 id="xfunc-c-return-set">
2254 <title>Returning Sets from C-Language Functions</title>
2257 There is also a special API that provides support for returning
2258 sets (multiple rows) from a C-language function. A set-returning
2259 function must follow the version-1 calling conventions. Also,
2260 source files must include <filename>funcapi.h</filename>, as
2265 A set-returning function (<acronym>SRF</>) is called
2266 once for each item it returns. The <acronym>SRF</> must
2267 therefore save enough state to remember what it was doing and
2268 return the next item on each call.
2269 The structure <structname>FuncCallContext</> is provided to help
2270 control this process. Within a function, <literal>fcinfo->flinfo->fn_extra</>
2271 is used to hold a pointer to <structname>FuncCallContext</>
2277 * Number of times we've been called before
2279 * call_cntr is initialized to 0 for you by SRF_FIRSTCALL_INIT(), and
2280 * incremented for you every time SRF_RETURN_NEXT() is called.
2285 * OPTIONAL maximum number of calls
2287 * max_calls is here for convenience only and setting it is optional.
2288 * If not set, you must provide alternative means to know when the
2294 * OPTIONAL pointer to result slot
2296 * This is obsolete and only present for backwards compatibility, viz,
2297 * user-defined SRFs that use the deprecated TupleDescGetSlot().
2299 TupleTableSlot *slot;
2302 * OPTIONAL pointer to miscellaneous user-provided context information
2304 * user_fctx is for use as a pointer to your own data to retain
2305 * arbitrary context information between calls of your function.
2310 * OPTIONAL pointer to struct containing attribute type input metadata
2312 * attinmeta is for use when returning tuples (i.e., composite data types)
2313 * and is not used when returning base data types. It is only needed
2314 * if you intend to use BuildTupleFromCStrings() to create the return
2317 AttInMetadata *attinmeta;
2320 * memory context used for structures that must live for multiple calls
2322 * multi_call_memory_ctx is set by SRF_FIRSTCALL_INIT() for you, and used
2323 * by SRF_RETURN_DONE() for cleanup. It is the most appropriate memory
2324 * context for any memory that is to be reused across multiple calls
2327 MemoryContext multi_call_memory_ctx;
2330 * OPTIONAL pointer to struct containing tuple description
2332 * tuple_desc is for use when returning tuples (i.e. composite data types)
2333 * and is only needed if you are going to build the tuples with
2334 * heap_formtuple() rather than with BuildTupleFromCStrings(). Note that
2335 * the TupleDesc pointer stored here should usually have been run through
2336 * BlessTupleDesc() first.
2338 TupleDesc tuple_desc;
2345 An <acronym>SRF</> uses several functions and macros that
2346 automatically manipulate the <structname>FuncCallContext</>
2347 structure (and expect to find it via <literal>fn_extra</>). Use
2351 to determine if your function is being called for the first or a
2352 subsequent time. On the first call (only) use
2354 SRF_FIRSTCALL_INIT()
2356 to initialize the <structname>FuncCallContext</>. On every function call,
2357 including the first, use
2361 to properly set up for using the <structname>FuncCallContext</>
2362 and clearing any previously returned data left over from the
2367 If your function has data to return, use
2369 SRF_RETURN_NEXT(funcctx, result)
2371 to return it to the caller. (<literal>result</> must be of type
2372 <type>Datum</>, either a single value or a tuple prepared as
2373 described above.) Finally, when your function is finished
2376 SRF_RETURN_DONE(funcctx)
2378 to clean up and end the <acronym>SRF</>.
2382 The memory context that is current when the <acronym>SRF</> is called is
2383 a transient context that will be cleared between calls. This means
2384 that you do not need to call <function>pfree</> on everything
2385 you allocated using <function>palloc</>; it will go away anyway. However, if you want to allocate
2386 any data structures to live across calls, you need to put them somewhere
2387 else. The memory context referenced by
2388 <structfield>multi_call_memory_ctx</> is a suitable location for any
2389 data that needs to survive until the <acronym>SRF</> is finished running. In most
2390 cases, this means that you should switch into
2391 <structfield>multi_call_memory_ctx</> while doing the first-call setup.
2395 A complete pseudo-code example looks like the following:
2398 my_set_returning_function(PG_FUNCTION_ARGS)
2400 FuncCallContext *funcctx;
2402 MemoryContext oldcontext;
2403 <replaceable>further declarations as needed</replaceable>
2405 if (SRF_IS_FIRSTCALL())
2407 funcctx = SRF_FIRSTCALL_INIT();
2408 oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
2409 /* One-time setup code appears here: */
2410 <replaceable>user code</replaceable>
2411 <replaceable>if returning composite</replaceable>
2412 <replaceable>build TupleDesc, and perhaps AttInMetadata</replaceable>
2413 <replaceable>endif returning composite</replaceable>
2414 <replaceable>user code</replaceable>
2415 MemoryContextSwitchTo(oldcontext);
2418 /* Each-time setup code appears here: */
2419 <replaceable>user code</replaceable>
2420 funcctx = SRF_PERCALL_SETUP();
2421 <replaceable>user code</replaceable>
2423 /* this is just one way we might test whether we are done: */
2424 if (funcctx->call_cntr < funcctx->max_calls)
2426 /* Here we want to return another item: */
2427 <replaceable>user code</replaceable>
2428 <replaceable>obtain result Datum</replaceable>
2429 SRF_RETURN_NEXT(funcctx, result);
2433 /* Here we are done returning items and just need to clean up: */
2434 <replaceable>user code</replaceable>
2435 SRF_RETURN_DONE(funcctx);
2442 A complete example of a simple <acronym>SRF</> returning a composite type looks like:
2444 PG_FUNCTION_INFO_V1(testpassbyval);
2447 testpassbyval(PG_FUNCTION_ARGS)
2449 FuncCallContext *funcctx;
2453 AttInMetadata *attinmeta;
2455 /* stuff done only on the first call of the function */
2456 if (SRF_IS_FIRSTCALL())
2458 MemoryContext oldcontext;
2460 /* create a function context for cross-call persistence */
2461 funcctx = SRF_FIRSTCALL_INIT();
2463 /* switch to memory context appropriate for multiple function calls */
2464 oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
2466 /* total number of tuples to be returned */
2467 funcctx->max_calls = PG_GETARG_UINT32(0);
2469 /* Build a tuple description for a __testpassbyval tuple */
2470 tupdesc = RelationNameGetTupleDesc("__testpassbyval");
2473 * generate attribute metadata needed later to produce tuples from raw
2476 attinmeta = TupleDescGetAttInMetadata(tupdesc);
2477 funcctx->attinmeta = attinmeta;
2479 MemoryContextSwitchTo(oldcontext);
2482 /* stuff done on every call of the function */
2483 funcctx = SRF_PERCALL_SETUP();
2485 call_cntr = funcctx->call_cntr;
2486 max_calls = funcctx->max_calls;
2487 attinmeta = funcctx->attinmeta;
2489 if (call_cntr < max_calls) /* do when there is more left to send */
2496 * Prepare a values array for building the returned tuple.
2497 * This should be an array of C strings which will
2498 * be processed later by the type input functions.
2500 values = (char **) palloc(3 * sizeof(char *));
2501 values[0] = (char *) palloc(16 * sizeof(char));
2502 values[1] = (char *) palloc(16 * sizeof(char));
2503 values[2] = (char *) palloc(16 * sizeof(char));
2505 snprintf(values[0], 16, "%d", 1 * PG_GETARG_INT32(1));
2506 snprintf(values[1], 16, "%d", 2 * PG_GETARG_INT32(1));
2507 snprintf(values[2], 16, "%d", 3 * PG_GETARG_INT32(1));
2510 tuple = BuildTupleFromCStrings(attinmeta, values);
2512 /* make the tuple into a datum */
2513 result = HeapTupleGetDatum(tuple);
2515 /* clean up (this is not really necessary) */
2521 SRF_RETURN_NEXT(funcctx, result);
2523 else /* do when there is no more left */
2525 SRF_RETURN_DONE(funcctx);
2530 The SQL code to declare this function is:
2532 CREATE TYPE __testpassbyval AS (f1 integer, f2 integer, f3 integer);
2534 CREATE OR REPLACE FUNCTION testpassbyval(integer, integer) RETURNS SETOF __testpassbyval
2535 AS '<replaceable>filename</>', 'testpassbyval'
2536 LANGUAGE C IMMUTABLE STRICT;
2541 The directory <filename>contrib/tablefunc</> in the source
2542 distribution contains more examples of set-returning functions.
2547 <title>Polymorphic Arguments and Return Types</title>
2550 C-language functions may be declared to accept and
2551 return the polymorphic types
2552 <type>anyelement</type> and <type>anyarray</type>.
2553 See <xref linkend="extend-types-polymorphic"> for a more detailed explanation
2554 of polymorphic functions. When function arguments or return types
2555 are defined as polymorphic types, the function author cannot know
2556 in advance what data type it will be called with, or
2557 need to return. There are two routines provided in <filename>fmgr.h</>
2558 to allow a version-1 C function to discover the actual data types
2559 of its arguments and the type it is expected to return. The routines are
2560 called <literal>get_fn_expr_rettype(FmgrInfo *flinfo)</> and
2561 <literal>get_fn_expr_argtype(FmgrInfo *flinfo, int argnum)</>.
2562 They return the result or argument type OID, or <symbol>InvalidOid</symbol> if the
2563 information is not available.
2564 The structure <literal>flinfo</> is normally accessed as
2565 <literal>fcinfo->flinfo</>. The parameter <literal>argnum</>
2570 For example, suppose we want to write a function to accept a single
2571 element of any type, and return a one-dimensional array of that type:
2574 PG_FUNCTION_INFO_V1(make_array);
2576 make_array(PG_FUNCTION_ARGS)
2579 Oid element_type = get_fn_expr_argtype(fcinfo->flinfo, 0);
2588 if (!OidIsValid(element_type))
2589 elog(ERROR, "could not determine data type of input");
2591 /* get the provided element */
2592 element = PG_GETARG_DATUM(0);
2594 /* we have one dimension */
2596 /* and one element */
2598 /* and lower bound is 1 */
2601 /* get required info about the element type */
2602 get_typlenbyvalalign(element_type, &typlen, &typbyval, &typalign);
2604 /* now build the array */
2605 result = construct_md_array(&element, ndims, dims, lbs,
2606 element_type, typlen, typbyval, typalign);
2608 PG_RETURN_ARRAYTYPE_P(result);
2614 The following command declares the function
2615 <function>make_array</function> in SQL:
2618 CREATE FUNCTION make_array(anyelement) RETURNS anyarray
2619 AS '<replaceable>DIRECTORY</replaceable>/funcs', 'make_array'
2623 Note the use of <literal>STRICT</literal>; this is essential
2624 since the code is not bothering to test for a null input.
2629 <!-- Keep this comment at the end of the file
2634 sgml-minimize-attributes:nil
2635 sgml-always-quote-attributes:t
2638 sgml-parent-document:nil
2639 sgml-default-dtd-file:"./reference.ced"
2640 sgml-exposed-tags:nil
2641 sgml-local-catalogs:("/usr/lib/sgml/catalog")
2642 sgml-local-ecat-files:nil