   1 <!--
   2 $Header: /cvsroot/pgsql/doc/src/sgml/syntax.sgml,v 1.39 2001/02/10 07:08:44 tgl Exp $
   3 -->
   4
   5 <chapter id="sql-syntax">
   6  <title>SQL Syntax</title>
   7
   8   <abstract>
   9    <para>
  10     A description of the general syntax of SQL.
  11    </para>
  12   </abstract>
  13
  14  <sect1 id="sql-syntax-lexical">
  15   <title>Lexical Structure</title>
  16
  17   <para>
  18    SQL input consists of a sequence of
  19    <firstterm>commands</firstterm>.  A command is composed of a
  20    sequence of <firstterm>tokens</firstterm>, terminated by a
  21    semicolon (<quote>;</quote>).  The end of the input stream also
  22    terminates a command.  Which tokens are valid depends on the syntax
  23    of the particular command.
  24   </para>
  25
  26   <para>
  27    A token can be a <firstterm>key word</firstterm>, an
  28    <firstterm>identifier</firstterm>, a <firstterm>quoted
  29    identifier</firstterm>, a <firstterm>literal</firstterm> (or
  30    constant), or a special character symbol.  Tokens are normally
  31    separated by whitespace (space, tab, newline), but need not be if
  32    there is no ambiguity (which is generally only the case if a
  33    special character is adjacent to some other token type).
  34   </para>
  35
  36   <para>
   Additionally, <firstterm>comments</firstterm> can occur in SQL
   input.  They are not tokens; they are effectively equivalent to
   whitespace.
  40   </para>
  41
  42   <informalexample id="sql-syntax-ex-commands">
  43    <para>
  44     For example, the following is (syntactically) valid SQL input:
  45 <programlisting>
  46 SELECT * FROM MY_TABLE;
  47 UPDATE MY_TABLE SET A = 5;
  48 INSERT INTO MY_TABLE VALUES (3, 'hi there');
  49 </programlisting>
  50     This is a sequence of three commands, one per line (although this
  51     is not required; more than one command can be on a line, and
  52     commands can usefully be split across lines).
  53    </para>
  54   </informalexample>
  55
  56   <para>
   The SQL syntax is not very consistent regarding which tokens
   identify commands and which are operands or parameters.  The first
  59    few tokens are generally the command name, so in the above example
  60    we would usually speak of a <quote>SELECT</quote>, an
  61    <quote>UPDATE</quote>, and an <quote>INSERT</quote> command.  But
  62    for instance the <command>UPDATE</command> command always requires
  63    a <token>SET</token> token to appear in a certain position, and
  64    this particular variation of <command>INSERT</command> also
  65    requires a <token>VALUES</token> in order to be complete.  The
  66    precise syntax rules for each command are described in the
  67    <citetitle>Reference Manual</citetitle>.
  68   </para>
  69
  70   <sect2 id="sql-syntax-identifiers">
  71    <title>Identifiers and Key Words</title>
  72
  73    <para>
  74     Tokens such as <token>SELECT</token>, <token>UPDATE</token>, or
  75     <token>VALUES</token> in the example above are examples of
  76     <firstterm>key words</firstterm>, that is, words that have a fixed
  77     meaning in the SQL language.  The tokens <token>MY_TABLE</token>
  78     and <token>A</token> are examples of
  79     <firstterm>identifiers</firstterm>.  They identify names of
  80     tables, columns, or other database objects, depending on the
  81     command they are used in.  Therefore they are sometimes simply
  82     called <quote>names</quote>.  Key words and identifiers have the
  83     same lexical structure, meaning that one cannot know whether a
  84     token is an identifier or a key word without knowing the language.
  85     A complete list of key words can be found in <xref
  86     linkend="sql-keywords-appendix">.
  87    </para>
  88
  89    <para>
  90     SQL identifiers and key words must begin with a letter
  91     (<literal>a</literal>-<literal>z</literal>) or underscore
  92     (<literal>_</literal>).  Subsequent characters in an identifier or
  93     key word can be letters, digits
  94     (<literal>0</literal>-<literal>9</literal>), or underscores,
  95     although the SQL standard will not define a key word that contains
  96     digits or starts or ends with an underscore.
  97    </para>
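   <para>
    As a brief illustration of these rules, the following sketch uses
    hypothetical table and column names chosen only for this example.
    Each name starts with a letter or an underscore and continues with
    letters, digits, or underscores:
<programlisting>
CREATE TABLE _staging2 (col_1 integer, x9 integer);
SELECT col_1, x9 FROM _staging2;
</programlisting>
    By contrast, a name such as <literal>2fast</literal> begins with a
    digit and so would not be read as a single unquoted identifier.
   </para>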
  98
  99    <para>
 100     The system uses no more than <symbol>NAMEDATALEN</symbol>-1
 101     characters of an identifier; longer names can be written in
 102     commands, but they will be truncated.  By default,
 103     <symbol>NAMEDATALEN</symbol> is 32 so the maximum identifier length
 104     is 31 (but at the time the system is built,
 105     <symbol>NAMEDATALEN</symbol> can be changed in
 106     <filename>src/include/postgres_ext.h</filename>).
 107    </para>
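   <para>
    To make the truncation concrete, here is a sketch that assumes the
    default <symbol>NAMEDATALEN</symbol> of 32; the table name is
    hypothetical.  The name written in the first command is 45
    characters long, so only its first 31 characters would be kept:
<programlisting>
CREATE TABLE a_very_long_table_name_that_exceeds_the_limit (id integer);

-- Under the default limit, the table would be known by this shorter name:
SELECT * FROM a_very_long_table_name_that_exc;
</programlisting>
   </para>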
 108
 109    <para>
 110     Identifier and key word names are case insensitive.  Therefore
 111 <programlisting>
 112 UPDATE MY_TABLE SET A = 5;
 113 </programlisting>
 114     can equivalently be written as
 115 <programlisting>
 116 uPDaTE my_TabLE SeT a = 5;
 117 </programlisting>
 118     A convention often used is to write key words in upper
 119     case and names in lower case, e.g.,
 120 <programlisting>
 121 UPDATE my_table SET a = 5;
 122 </programlisting>
 123    </para>
 124
 125    <para>
 126     There is a second kind of identifier:  the <firstterm>delimited
 127     identifier</firstterm> or <firstterm>quoted
 128     identifier</firstterm>.  It is formed by enclosing an arbitrary
 129     sequence of characters in double-quotes
 130     (<literal>"</literal>). <!-- " font-lock mania --> A delimited
 131     identifier is always an identifier, never a key word.  So
 132     <literal>"select"</literal> could be used to refer to a column or
 133     table named <quote>select</quote>, whereas an unquoted
 134     <literal>select</literal> would be taken as a key word and
 135     would therefore provoke a parse error when used where a table or
 136     column name is expected.  The example can be written with quoted
 137     identifiers like this:
 138 <programlisting>
 139 UPDATE "my_table" SET "a" = 5;
 140 </programlisting>
 141    </para>
 142
 143    <para>
 144     Quoted identifiers can contain any character other than a double
 145     quote itself.  This allows constructing table or column names that
 146     would otherwise not be possible, such as ones containing spaces or
 147     ampersands.  The length limitation still applies.
 148    </para>
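   <para>
    For instance, the following sketch (with made-up names) creates a
    table whose name contains a space and a column whose name contains
    an ampersand; neither name could be written without the quotes:
<programlisting>
CREATE TABLE "order items" ("R&amp;D budget" integer);
SELECT "R&amp;D budget" FROM "order items";
</programlisting>
   </para>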
 149
 150    <para>
 151     Quoting an identifier also makes it case-sensitive, whereas
 152     unquoted names are always folded to lower case.  For example, the
 153     identifiers <literal>FOO</literal>, <literal>foo</literal> and
 154     <literal>"foo"</literal> are considered the same by
 155     <productname>Postgres</productname>, but <literal>"Foo"</literal>
 156     and <literal>"FOO"</literal> are different from these three and
 157     each other.
 158     <footnote>
 159      <para>
 160       <productname>Postgres</productname>' folding of unquoted names to lower
 161       case is incompatible with the SQL standard, which says that unquoted
 162       names should be folded to upper case.  Thus, <literal>foo</literal>
 163       should be equivalent to <literal>"FOO"</literal> not
 164       <literal>"foo"</literal> according to the standard.  If you want to
 165       write portable applications you are advised to always quote a particular
 166       name or never quote it.
 167      </para>
 168     </footnote>
 169    </para>
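   <para>
    The folding rules can be seen in a short sketch.  The table names
    are hypothetical, and the comments restate the behavior described
    above rather than showing actual output:
<programlisting>
CREATE TABLE "Foo" (id integer);   -- quoted: the name keeps its case, Foo
CREATE TABLE FOO (id integer);     -- unquoted: folded to foo, a second, distinct table

SELECT * FROM "Foo";   -- refers to the first table
SELECT * FROM foo;     -- refers to the second table
SELECT * FROM Foo;     -- also folded to foo: the second table
</programlisting>
   </para>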
 170   </sect2>
 171
 172
 173   <sect2 id="sql-syntax-constants">
 174    <title>Constants</title>
 175
 176    <para>
 177     There are four kinds of <firstterm>implicitly typed
 178     constants</firstterm> in <productname>Postgres</productname>:
 179     strings, bit strings, integers, and floating point numbers.
 180     Constants can also be specified with explicit types, which can
 181     enable more accurate representation and more efficient handling by
 182     the system. The implicit constants are described below; explicit
 183     constants are discussed afterwards.
 184    </para>
 185
 186    <sect3 id="sql-syntax-strings">
 187     <title>String Constants</title>
 188
 189     <para>
 190      A string constant in SQL is an arbitrary sequence of characters
 191      bounded by single quotes (<quote>'</quote>), e.g., <literal>'This
 192      is a string'</literal>.  SQL allows single quotes to be embedded
 193      in strings by typing two adjacent single quotes (e.g.,
 194      <literal>'Dianne''s horse'</literal>).  In
 195      <productname>Postgres</productname> single quotes may
 196      alternatively be escaped with a backslash (<quote>\</quote>,
 197      e.g., <literal>'Dianne\'s horse'</literal>).
 198     </para>
 199
 200     <para>
 201      C-style backslash escapes are also available:
 202      <literal>\b</literal> is a backspace, <literal>\f</literal> is a
 203      form feed, <literal>\n</literal> is a newline,
 204      <literal>\r</literal> is a carriage return, <literal>\t</literal>
 205      is a tab, and <literal>\<replaceable>xxx</replaceable></literal>,
 206      where <replaceable>xxx</replaceable> is an octal number, is the
 207      character with the corresponding ASCII code.  Any other character
 208      following a backslash is taken literally.  Thus, to include a
 209      backslash in a string constant, type two backslashes.
 210     </para>
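    <para>
     A few of these escapes in use, as a sketch of the rules just
     described.  The comments state the resulting string values, not
     literal client output:
<programlisting>
SELECT 'first line\nsecond line';   -- a newline separates the two parts
SELECT 'col1\tcol2';                -- a tab separates the two parts
SELECT '\101BC';                    -- octal 101 is the ASCII code for A, giving ABC
SELECT 'C:\\temp';                  -- two backslashes yield one: C:\temp
</programlisting>
    </para>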
 211
 212     <para>
 213      The character with the code zero cannot be in a string constant.
 214     </para>
 215
 216     <para>
 217      Two string constants that are only separated by whitespace
 218      <emphasis>with at least one newline</emphasis> are concatenated
 219      and effectively treated as if the string had been written in one
 220      constant.  For example:
 221 <programlisting>
 222 SELECT 'foo'
 223 'bar';
 224 </programlisting>
 225      is equivalent to
 226 <programlisting>
 227 SELECT 'foobar';
 228 </programlisting>
 229      but
 230 <programlisting>
 231 SELECT 'foo'      'bar';
 232 </programlisting>
 233      is not valid syntax.
 234     </para>
 235    </sect3>
 236
 237    <sect3 id="sql-syntax-bit-strings">
 238     <title>Bit String Constants</title>
 239
 240     <para>
 241      Bit string constants look like string constants with a
 242      <literal>B</literal> (upper or lower case) immediately before the
 243      opening quote (no intervening whitespace), e.g.,
 244      <literal>B'1001'</literal>.  The only characters allowed within
 245      bit string constants are <literal>0</literal> and
 246      <literal>1</literal>.  Bit string constants can be continued
 247      across lines in the same way as regular string constants.
 248     </para>
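    <para>
     As a small sketch of both points, the first command below writes
     a bit string constant on one line, and the second continues one
     across a line break, following the continuation rule given for
     string constants:
<programlisting>
SELECT B'1001';

SELECT B'1001'
'0110';          -- would be read as B'10010110'
</programlisting>
    </para>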
 249    </sect3>
 250
 251    <sect3>
 252     <title>Integer Constants</title>
 253
 254     <para>
     Integer constants in SQL are sequences of decimal digits (0
     through 9) with no decimal point.  The range of legal values
 257      depends on which integer data type is used, but the plain
 258      <type>integer</type> type accepts values ranging from -2147483648
 259      to +2147483647.  (The optional plus or minus sign is actually a
 260      separate unary operator and not part of the integer constant.)
 261     </para>
 262    </sect3>
 263
 264    <sect3>
 265     <title>Floating Point Constants</title>
 266
 267     <para>
 268      Floating point constants are accepted in these general forms:
 269 <synopsis>
 270 <replaceable>digits</replaceable>.<optional><replaceable>digits</replaceable></optional><optional>e<optional>+-</optional><replaceable>digits</replaceable></optional>
 271 <optional><replaceable>digits</replaceable></optional>.<replaceable>digits</replaceable><optional>e<optional>+-</optional><replaceable>digits</replaceable></optional>
 272 <replaceable>digits</replaceable>e<optional>+-</optional><replaceable>digits</replaceable>
 273 </synopsis>
 274      where <replaceable>digits</replaceable> is one or more decimal
 275      digits.  At least one digit must be before or after the decimal
 276      point, and after the <literal>e</literal> if you use that option.
 277      Thus, a floating point constant is distinguished from an integer
 278      constant by the presence of either the decimal point or the
 279      exponent clause (or both).  There must not be a space or other
 280      characters embedded in the constant.
 281     </para>
 282
 283     <informalexample>
 284      <para>
 285       These are some examples of valid floating point constants:
 286 <literallayout>
 287 3.5
 288 4.
 289 .001
 290 5e2
 291 1.925e-3
 292 </literallayout>
 293      </para>
 294     </informalexample>
 295
 296     <para>
 297      Floating point constants are of type <type>DOUBLE
 298      PRECISION</type>. <type>REAL</type> can be specified explicitly
 299      by using <acronym>SQL</acronym> string notation or
 300      <productname>Postgres</productname> type notation:
 301
<programlisting>
REAL '1.23'  -- string style
'1.23'::REAL -- Postgres (historical) style
</programlisting>
 306     </para>
 307    </sect3>
 308
 309    <sect3 id="sql-syntax-constants-generic">
 310     <title>Constants of Other Types</title>
 311
 312     <para>
 313      A constant of an <emphasis>arbitrary</emphasis> type can be
 314      entered using any one of the following notations:
 315 <synopsis>
 316 <replaceable>type</replaceable> '<replaceable>string</replaceable>'
 317 '<replaceable>string</replaceable>'::<replaceable>type</replaceable>
 318 CAST ( '<replaceable>string</replaceable>' AS <replaceable>type</replaceable> )
 319 </synopsis>
 320      The value inside the string is passed to the input conversion
 321      routine for the type called <replaceable>type</replaceable>. The
 322      result is a constant of the indicated type.  The explicit type
 323      cast may be omitted if there is no ambiguity as to the type the
 324      constant must be (for example, when it is passed as an argument
 325      to a non-overloaded function), in which case it is automatically
 326      coerced.
 327     </para>
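    <para>
     For example, assuming the built-in <type>date</type> type, each of
     the following is a sketch of the same date constant written in one
     of the three notations:
<programlisting>
SELECT date '2001-02-10';
SELECT '2001-02-10'::date;
SELECT CAST('2001-02-10' AS date);
</programlisting>
    </para>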
 328
 329     <para>
 330      It is also possible to specify a type coercion using a function-like
 331      syntax:
 332 <synopsis>
 333 <replaceable>typename</replaceable> ( <replaceable>value</replaceable> )
 334 </synopsis>
 335      although this only works for types whose names are also valid as
 336      function names.  (For example, <literal>double precision</literal>
 337      can't be used this way --- but the equivalent <literal>float8</literal>
 338      can.)
 339     </para>
 340
 341     <para>
 342      The <literal>::</literal>, <literal>CAST()</literal>, and
 343      function-call syntaxes can also be used to specify the type of
 344      arbitrary expressions, but the form
 345      <replaceable>type</replaceable>
 346      '<replaceable>string</replaceable>' can only be used to specify
 347      the type of a literal constant.
 348     </para>
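    <para>
     To illustrate the difference, the following sketch casts an
     arbitrary expression with each of the three general syntaxes.  It
     assumes that the table <literal>my_table</literal> from the
     earlier examples has an integer column <literal>a</literal>:
<programlisting>
SELECT (a + 1)::float8 FROM my_table;
SELECT CAST(a + 1 AS float8) FROM my_table;
SELECT float8(a + 1) FROM my_table;
</programlisting>
     The form <literal>float8 '1.5'</literal>, on the other hand, is
     accepted only because <literal>'1.5'</literal> is a literal
     constant.
    </para>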
 349    </sect3>
 350
 351    <sect3>
    <title>Array Constants</title>
 353
 354     <para>
 355      The general format of an array constant is the following:
 356 <synopsis>
 357 '{ <replaceable>val1</replaceable> <replaceable>delim</replaceable> <replaceable>val2</replaceable> <replaceable>delim</replaceable> ... }'
 358 </synopsis>
 359      where <replaceable>delim</replaceable> is the delimiter character
 360      for the type, as recorded in its <literal>pg_type</literal>
 361      entry.  (For all built-in types, this is the comma character
 362      ",".)  Each <replaceable>val</replaceable> is either a constant
 363      of the array element type, or a sub-array.  An example of an
 364      array constant is
 365 <programlisting>
 366 '{{1,2,3},{4,5,6},{7,8,9}}'
 367 </programlisting>
 368      This constant is a two-dimensional, 3 by 3 array consisting of three
 369      sub-arrays of integers.
 370     </para>
 371
 372     <para>
 373      Individual array elements can be placed between double-quote
 374      marks (<literal>"</literal>) <!-- " --> to avoid ambiguity
 375      problems with respect to white space.  Without quote marks, the
 376      array-value parser will skip leading white space.
 377     </para>
 378
 379     <para>
 380      (Array constants are actually only a special case of the generic
 381      type constants discussed in the previous section.  The constant
 382      is initially treated as a string and passed to the array input
 383      conversion routine.  An explicit type specification might be
 384      necessary.)
 385     </para>
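    <para>
     As a sketch of how such constants might be used, the following
     commands store the array shown above together with an array of
     strings.  The table and column names are hypothetical; the double
     quotes keep the embedded space in the first string element:
<programlisting>
CREATE TABLE tictactoe (squares integer[3][3], labels text[]);

INSERT INTO tictactoe
    VALUES ('{{1,2,3},{4,5,6},{7,8,9}}', '{"top row", middle, bottom}');
</programlisting>
    </para>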
 386    </sect3>
 387   </sect2>
 388
 389
 390   <sect2 id="sql-syntax-operators">
 391    <title>Operators</title>
 392
 393    <para>
 394     An operator is a sequence of up to <symbol>NAMEDATALEN</symbol>-1
 395     (31 by default) characters from the following list:
 396 <literallayout>
 397 + - * / &lt; &gt; = ~ ! @ # % ^ &amp; | ` ? $
 398 </literallayout>
 399
 400     There are a few restrictions on operator names, however:
 401     <itemizedlist>
 402      <listitem>
 403       <para>
       <literal>$</literal> (dollar) cannot be a single-character operator, although it
       can be part of a multi-character operator name.
 406       </para>
 407      </listitem>
 408
 409      <listitem>
 410       <para>
 411        <literal>--</literal> and <literal>/*</literal> cannot appear
 412        anywhere in an operator name, since they will be taken as the
 413        start of a comment.
 414       </para>
 415      </listitem>
 416
 417      <listitem>
 418       <para>
       A multi-character operator name cannot end in <literal>+</literal>
       or <literal>-</literal>,
       unless the name also contains at least one of these characters:
 421 <literallayout>
 422 ~ ! @ # % ^ &amp; | ` ? $
 423 </literallayout>
 424        For example, <literal>@-</literal> is an allowed operator name,
 425        but <literal>*-</literal> is not.  This restriction allows
 426        <productname>Postgres</productname> to parse SQL-compliant
 427        queries without requiring spaces between tokens.
 428       </para>
 429      </listitem>
 430     </itemizedlist>
 431    </para>
 432
 433    <para>
    When working with non-SQL-standard operator names, you will usually
    need to separate adjacent operators with spaces to avoid ambiguity.
    For example, if you have defined a left-unary operator named
    <literal>@</literal>, you cannot write <literal>X*@Y</literal>; you
    must write <literal>X* @Y</literal> to ensure that
    <productname>Postgres</productname> reads it as two operator names
    rather than one.
 441    </para>
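   <para>
    The comment restriction matters for the built-in operators as well.
    In the following sketch, the space between the two minus signs is
    what keeps them from being read as the start of a comment:
<programlisting>
SELECT 5 - -3;   -- binary minus followed by unary minus
SELECT 5--3;     -- everything after the 5, including the semicolon, is a comment
</programlisting>
   </para>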
 442   </sect2>
 443
 444   <sect2>
 445    <title>Special Characters</title>
 446
 447   <para>
   Some characters that are not alphanumeric have a special meaning
   that is different from being an operator.  Details on the usage can
   be found at the location where the respective syntax element is
   described.  This section exists only to note the existence and
   summarize the purposes of these characters.
 453
 454    <itemizedlist>
 455     <listitem>
 456      <para>
 457       A dollar sign (<literal>$</literal>) followed by digits is used
 458       to represent the positional parameters in the body of a function
 459       definition.  In other contexts the dollar sign may be part of an
 460       operator name.
 461      </para>
 462     </listitem>
 463
 464     <listitem>
 465      <para>
 466       Parentheses (<literal>()</literal>) have their usual meaning to
 467       group expressions and enforce precedence.  In some cases
 468       parentheses are required as part of the fixed syntax of a
 469       particular SQL command.
 470      </para>
 471     </listitem>
 472
 473     <listitem>
 474      <para>
 475       Brackets (<literal>[]</literal>) are used to select the elements
 476       of an array.  See <xref linkend="arrays"> for more information
 477       on arrays.
 478      </para>
 479     </listitem>
 480
 481     <listitem>
 482      <para>
 483       Commas (<literal>,</literal>) are used in some syntactical
 484       constructs to separate the elements of a list.
 485      </para>
 486     </listitem>
 487
 488     <listitem>
 489      <para>
 490       The semicolon (<literal>;</literal>) terminates an SQL command.
 491       It cannot appear anywhere within a command, except within a
 492       string constant or quoted identifier.
 493      </para>
 494     </listitem>
 495
 496     <listitem>
 497      <para>
 498       The colon (<literal>:</literal>) is used to select
 499       <quote>slices</quote> from arrays. (See <xref
 500       linkend="arrays">.)  In certain SQL dialects (such as Embedded
 501       SQL), the colon is used to prefix variable names.
 502      </para>
 503     </listitem>
 504
 505     <listitem>
 506      <para>
 507       The asterisk (<literal>*</literal>) has a special meaning when
 508       used in the <command>SELECT</command> command or with the
 509       <function>COUNT</function> aggregate function.
 510      </para>
 511     </listitem>
 512
 513     <listitem>
 514      <para>
 515       The period (<literal>.</literal>) is used in floating point
 516       constants, and to separate table and column names.
 517      </para>
 518     </listitem>
 519    </itemizedlist>
 520
 521    </para>
 522   </sect2>
 523
 524   <sect2 id="sql-syntax-comments">
 525    <title>Comments</title>
 526
 527    <para>
 528     A comment is an arbitrary sequence of characters beginning with
 529     double dashes and extending to the end of the line, e.g.:
 530 <programlisting>
 531 -- This is a standard SQL92 comment
 532 </programlisting>
 533    </para>
 534
 535    <para>
 536     Alternatively, C-style block comments can be used:
 537 <programlisting>
 538 /* multi-line comment
 539  * with nesting: /* nested block comment */
 540  */
 541 </programlisting>
 542     where the comment begins with <literal>/*</literal> and extends to
 543     the matching occurrence of <literal>*/</literal>. These block
 544     comments nest, as specified in SQL99 but unlike C, so that one can
 545     comment out larger blocks of code that may contain existing block
 546     comments.
 547    </para>
 548
 549    <para>
 550     A comment is removed from the input stream before further syntax
 551     analysis and is effectively replaced by whitespace.
 552    </para>
 553   </sect2>
 554  </sect1>
 555
 556
 557   <sect1 id="sql-syntax-columns">
 558    <title>Columns</title>
 559
 560     <para>
 561      A <firstterm>column</firstterm>
 562      is either a user-defined column of a given table or one of the
 563      following system-defined columns:
 564
 565      <variablelist>
 566       <varlistentry>
 567        <term>oid</term>
 568        <listitem>
 569         <para>
         The unique identifier (object ID) of a row.  This is a serial number
         that <productname>Postgres</productname> adds to all rows
         automatically.  OIDs are not reused and are 32-bit quantities.
 573         </para>
 574        </listitem>
 575       </varlistentry>
 576
 577       <varlistentry>
 578       <term>tableoid</term>
 579        <listitem>
 580         <para>
 581          The OID of the table containing this row.  This attribute is
 582          particularly handy for queries that select from inheritance
 583          hierarchies, since without it, it's difficult to tell which
 584          individual table a row came from.  The tableoid can be joined
 585          against the OID attribute of pg_class to obtain the table name.
 586         </para>
 587        </listitem>
 588       </varlistentry>
 589
 590       <varlistentry>
 591        <term>xmin</term>
 592        <listitem>
 593         <para>
 594          The identity (transaction ID) of the inserting transaction for
 595          this tuple.  (Note: a tuple is an individual state of a row;
 596          each UPDATE of a row creates a new tuple for the same logical row.)
 597         </para>
 598        </listitem>
 599       </varlistentry>
 600
 601       <varlistentry>
 602       <term>cmin</term>
 603        <listitem>
 604         <para>
 605          The command identifier (starting at zero) within the inserting
 606          transaction.
 607         </para>
 608        </listitem>
 609       </varlistentry>
 610
 611       <varlistentry>
 612       <term>xmax</term>
 613        <listitem>
 614         <para>
 615          The identity (transaction ID) of the deleting transaction,
 616          or zero for an undeleted tuple.  In practice, this is never nonzero
 617          for a visible tuple.
 618         </para>
 619        </listitem>
 620       </varlistentry>
 621
 622       <varlistentry>
 623       <term>cmax</term>
 624        <listitem>
 625         <para>
 626          The command identifier within the deleting transaction, or zero.
 627          Again, this is never nonzero for a visible tuple.
 628         </para>
 629        </listitem>
 630       </varlistentry>
 631
 632       <varlistentry>
 633       <term>ctid</term>
 634        <listitem>
 635         <para>
 636          The tuple ID of the tuple within its table.  This is a pair
 637          (block number, tuple index within block) that identifies the
 638          physical location of the tuple.  Note that although the ctid
 639          can be used to locate the tuple very quickly, a row's ctid
 640          will change each time it is updated or moved by VACUUM.
 641          Therefore ctid is useless as a long-term row identifier.
 642          The OID, or even better a user-defined serial number, should
 643          be used to identify logical rows.
 644         </para>
 645        </listitem>
 646       </varlistentry>
 647      </variablelist>
 648     </para>
 649
 650     <para>
 651      For further information on the system attributes consult
 652      <xref linkend="STON87a" endterm="STON87a">.
 653      Transaction and command identifiers are 32-bit quantities.
 654     </para>
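    <para>
     As a sketch, system columns can be selected by name like ordinary
     columns.  The examples assume the table
     <literal>my_table</literal> used earlier in this chapter; the
     second query joins <literal>tableoid</literal> against
     <literal>pg_class</literal>, as described above, to show the name
     of the table each row came from:
<programlisting>
SELECT oid, xmin, cmin, xmax, cmax, ctid FROM my_table;

SELECT c.relname, t.*
    FROM my_table t, pg_class c
    WHERE t.tableoid = c.oid;
</programlisting>
    </para>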
 655
 656   </sect1>
 657
 658
 659  <sect1 id="sql-expressions">
 660   <title>Value Expressions</title>
 661
 662   <para>
 663    Value expressions are used in a variety of contexts, such
 664    as in the target list of the <command>SELECT</command> command, as
 665    new column values in <command>INSERT</command> or
 666    <command>UPDATE</command>, or in search conditions in a number of
 667    commands.  The result of a value expression is sometimes called a
 668    <firstterm>scalar</firstterm>, to distinguish it from the result of
 669    a table expression (which is a table).  Value expressions are
 670    therefore also called <firstterm>scalar expressions</firstterm> (or
 671    even simply <firstterm>expressions</firstterm>).  The expression
 672    syntax allows the calculation of values from primitive parts using
 673    arithmetic, logical, set, and other operations.
 674   </para>
 675
 676   <para>
 677    A value expression is one of the following:
 678
 679    <itemizedlist>
 680     <listitem>
 681      <para>
 682       A constant or literal value; see <xref linkend="sql-syntax-constants">.
 683      </para>
 684     </listitem>
 685
 686     <listitem>
 687      <para>
 688       A column reference
 689      </para>
 690     </listitem>
 691
 692     <listitem>
 693      <para>
 694       An operator invocation:
 695       <simplelist>
 696        <member><replaceable>expression</replaceable> <replaceable>operator</replaceable> <replaceable>expression</replaceable> (binary infix operator)</member>
 697        <member><replaceable>operator</replaceable> <replaceable>expression</replaceable> (unary prefix operator)</member>
 698        <member><replaceable>expression</replaceable> <replaceable>operator</replaceable> (unary postfix operator)</member>
 699       </simplelist>
 700       where <replaceable>operator</replaceable> follows the syntax
 701       rules of <xref linkend="sql-syntax-operators"> or is one of the
 702       tokens <token>AND</token>, <token>OR</token>, and
 703       <token>NOT</token>.  Which particular operators exist and whether
 704       they are unary or binary depends on what operators have been
 705       defined by the system or the user.  <xref linkend="functions">
 706       describes the built-in operators.
 707      </para>
 708     </listitem>
 709
 710     <listitem>
 711      <para>
 712 <synopsis>( <replaceable>expression</replaceable> )</synopsis>
 713       Parentheses are used to group subexpressions and override precedence.
 714      </para>
 715     </listitem>
 716
 717     <listitem>
 718      <para>
 719       A positional parameter reference, in the body of a function declaration.
 720      </para>
 721     </listitem>
 722
 723     <listitem>
 724      <para>
 725       A function call
 726      </para>
 727     </listitem>
 728
 729     <listitem>
 730      <para>
 731       An aggregate expression
 732      </para>
 733     </listitem>
 734
 735     <listitem>
 736      <para>
 737       A scalar subquery.  This is an ordinary
 738       <command>SELECT</command> in parentheses that returns exactly one
 739       row with one column.  It is an error to use a subquery that
 740       returns more than one row or more than one column in the context
 741       of a value expression.
 742      </para>
 743     </listitem>
 744    </itemizedlist>
 745   </para>
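  <para>
   The following sketch shows one instance of several of these forms.
   It assumes the table <literal>my_table</literal> with an integer
   column <literal>a</literal> from the earlier examples:
<programlisting>
SELECT 42;                              -- a constant
SELECT a FROM my_table;                 -- a column reference
SELECT a + 1 FROM my_table;             -- a binary infix operator invocation
SELECT -a FROM my_table;                -- a unary prefix operator invocation
SELECT (a + 1) * 2 FROM my_table;       -- parentheses overriding precedence
SELECT sqrt(2);                         -- a function call
SELECT count(*) FROM my_table;          -- an aggregate expression
SELECT (SELECT max(a) FROM my_table);   -- a scalar subquery
</programlisting>
  </para>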
 746
 747   <para>
 748    In addition to this list, there are a number of constructs that can
 749    be classified as an expression but do not follow any general syntax
 750    rules.  These generally have the semantics of a function or
 751    operator and are explained in the appropriate location in <xref
 752    linkend="functions">.  An example is the <literal>IS NULL</literal>
 753    clause.
 754   </para>
 755
 756   <para>
 757    We have already discussed constants in <xref
 758    linkend="sql-syntax-constants">.  The following sections discuss
 759    the remaining options.
 760   </para>
 761
 762   <sect2>
 763    <title>Column References</title>
 764
 765    <para>
 766     A column can be referenced in the form:
 767 <synopsis>
 768 <replaceable>correlation</replaceable>.<replaceable>columnname</replaceable> `['<replaceable>subscript</replaceable>`]'
 769 </synopsis>
 770
    <replaceable>correlation</replaceable> is either the name of a
    table, an alias for a table defined by means of a
    <literal>FROM</literal> clause, or the key word
    <literal>NEW</literal> or <literal>OLD</literal>.
    (<literal>NEW</literal> and <literal>OLD</literal> can only appear
    in the action portion of a rule, while other correlation names can
    be used in any SQL statement.)  The correlation name can be omitted
    if the column name is unique across all the tables being used in
    the current query.  If <replaceable>columnname</replaceable> is of
    an array type, then the optional
    <replaceable>subscript</replaceable> selects a specific element in
    the array.  If no subscript is provided, then the whole array is
    selected.  Refer to the description of the particular commands in
    the <citetitle>PostgreSQL Reference Manual</citetitle> for the
    allowed syntax in each case.
 784    </para>
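   <para>
    For example, the following sketch qualifies a column with a table
    name and with an alias, and then selects from an array column.  It
    assumes <literal>my_table</literal> from the earlier examples and a
    hypothetical table <literal>tictactoe</literal> with a
    two-dimensional integer array column <literal>squares</literal>:
<programlisting>
SELECT my_table.a FROM my_table;                 -- table name as the correlation
SELECT t.a FROM my_table t;                      -- alias t as the correlation
SELECT tictactoe.squares[1][2] FROM tictactoe;   -- one element of the array
SELECT tictactoe.squares FROM tictactoe;         -- no subscript: the whole array
</programlisting>
   </para>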
 785   </sect2>
 786
 787   <sect2>
 788    <title>Positional Parameters</title>
 789
 790    <para>
    A positional parameter reference is used to indicate a parameter
    of an SQL function.  It typically appears in the body of an SQL
    function definition.  The form of a parameter is:
 794 <synopsis>
 795 $<replaceable>number</replaceable>
 796 </synopsis>
 797    </para>
 798
 799    <para>
 800     For example, consider the definition of a function,
 801     <function>dept</function>, as
 802
 803 <programlisting>
 804 CREATE FUNCTION dept (text) RETURNS dept
 805   AS 'select * from dept where name = $1'
 806   LANGUAGE 'sql';
 807 </programlisting>
 808
 809     Here the <literal>$1</literal> will be replaced by the first
 810     function argument when the function is invoked.
 811    </para>
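
   <para>
    A function with more than one argument refers to each argument by
    its position.  The following sketch assumes a hypothetical
    function <function>add_em</function>, introduced here only for
    illustration:
<programlisting>
-- Hypothetical function: $1 is the first argument, $2 the second.
CREATE FUNCTION add_em (int4, int4) RETURNS int4
  AS 'select $1 + $2'
  LANGUAGE 'sql';

SELECT add_em(1, 2);
</programlisting>
    With the arguments shown, <literal>$1</literal> is replaced by 1
    and <literal>$2</literal> by 2, so the call yields 3.
   </para>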
 812   </sect2>
 813
 814   <sect2>
 815    <title>Function Calls</title>
 816
 817    <para>
 818     The syntax for a function call is the name of a function
 819     (which is subject to the syntax rules for identifiers of <xref
 820     linkend="sql-syntax-identifiers">), followed by its argument list
 821     enclosed in parentheses:
 822
 823 <synopsis>
 824 <replaceable>function</replaceable> (<optional><replaceable>expression</replaceable> <optional>, <replaceable>expression</replaceable> ... </optional></optional> )
 825 </synopsis>
 826    </para>
 827
 828    <para>
 829     For example, the following computes the square root of 2:
 830 <programlisting>
 831 sqrt(2)
 832 </programlisting>
 833    </para>
 834
 835    <para>
 836     The list of built-in functions is in <xref linkend="functions">.
 837     Other functions may be added by the user.
 838    </para>
 839   </sect2>
 840
 841   <sect2 id="syntax-aggregates">
 842    <title>Aggregate Expressions</title>
 843
 844    <para>
 845     An <firstterm>aggregate expression</firstterm> represents the
 846     application of an aggregate function across the rows selected by a
 847     query.  An aggregate function reduces multiple inputs to a single
 848     output value, such as the sum or average of the inputs.  The
 849     syntax of an aggregate expression is one of the following:
 850
 851     <simplelist>
 852      <member><replaceable>aggregate_name</replaceable> (<replaceable>expression</replaceable>)</member>
 853      <member><replaceable>aggregate_name</replaceable> (ALL <replaceable>expression</replaceable>)</member>
 854      <member><replaceable>aggregate_name</replaceable> (DISTINCT <replaceable>expression</replaceable>)</member>
 855      <member><replaceable>aggregate_name</replaceable> ( * )</member>
 856     </simplelist>
 857
 858     where <replaceable>aggregate_name</replaceable> is a previously
 859     defined aggregate, and <replaceable>expression</replaceable> is
 860     any expression that does not itself contain an aggregate
 861     expression.
 862    </para>
 863
 864    <para>
 865     The first form of aggregate expression invokes the aggregate
 866     across all input rows for which the given expression yields a
 867     non-NULL value.  (Actually, it is up to the aggregate function
 868     whether to ignore NULLs or not --- but all the standard ones do.)
 869     The second form is the same as the first, since
 870     <literal>ALL</literal> is the default.  The third form invokes the
 871     aggregate for all distinct non-NULL values of the expression found
 872     in the input rows.  The last form invokes the aggregate once for
 873     each input row regardless of NULL or non-NULL values; since no
 874     particular input value is specified, it is generally only useful
 875     for the <function>count()</function> aggregate function.
 876    </para>
 877
 878    <para>
 879     For example, <literal>count(*)</literal> yields the total number
 880     of input rows; <literal>count(f1)</literal> yields the number of
 881     input rows in which <literal>f1</literal> is non-NULL;
 882     <literal>count(distinct f1)</literal> yields the number of
 883     distinct non-NULL values of <literal>f1</literal>.
 884    </para>
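
   <para>
    To make the difference between the forms concrete, assume a
    hypothetical table <literal>t1</literal> whose single column
    <literal>f1</literal> holds the four values 1, 1, NULL, and 2.
    The table and its contents are assumptions made only for this
    sketch:
<programlisting>
-- Assumed data: t1.f1 contains 1, 1, NULL, 2
SELECT count(*),            -- 4: every input row
       count(f1),           -- 3: the NULL is ignored
       count(ALL f1),       -- 3: same as count(f1)
       count(DISTINCT f1)   -- 2: the distinct values 1 and 2
  FROM t1;
</programlisting>
   </para>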
 885
 886    <para>
 887     The predefined aggregate functions are described in <xref
 888     linkend="functions-aggregate">.  Other aggregate functions may be added
 889     by the user.
 890    </para>
 891   </sect2>
 892
 893  </sect1>
 894
 895
 896   <sect1 id="sql-precedence">
 897    <title>Lexical Precedence</title>
 898
 899    <para>
    The precedence and associativity of the operators are hard-wired
    into the parser.  Most operators have the same precedence and are
    left-associative.  This may lead to non-intuitive behavior; for
    example, the comparison operators <literal>&lt;</literal> and
    <literal>&gt;</literal> have a different precedence than the
    comparison operators <literal>&lt;=</literal> and
    <literal>&gt;=</literal>.  Also, you will sometimes need to add
    parentheses when using combinations of binary and unary
    operators.  For instance
 907 <programlisting>
 908 SELECT 5 ! ~ 6;
 909 </programlisting>
 910    will be parsed as
 911 <programlisting>
 912 SELECT 5 ! (~ 6);
 913 </programlisting>
    because the parser has no idea --- until it's too late --- that
    <token>!</token> is defined as a postfix operator, not an infix one.
 916     To get the desired behavior in this case, you must write
 917 <programlisting>
 918 SELECT (5 !) ~ 6;
 919 </programlisting>
 920     This is the price one pays for extensibility.
 921    </para>
 922
 923    <table tocentry="1">
 924     <title>Operator Precedence (decreasing)</title>
 925
    <tgroup cols="3">
 927      <thead>
 928       <row>
 929        <entry>Operator/Element</entry>
 930        <entry>Associativity</entry>
 931        <entry>Description</entry>
 932       </row>
 933      </thead>
 934
 935      <tbody>
 936       <row>
 937        <entry><token>::</token></entry>
 938        <entry>left</entry>
 939        <entry><productname>Postgres</productname>-style typecast</entry>
 940       </row>
 941
 942       <row>
 943        <entry><token>[</token> <token>]</token></entry>
 944        <entry>left</entry>
 945        <entry>array element selection</entry>
 946       </row>
 947
 948       <row>
 949        <entry><token>.</token></entry>
 950        <entry>left</entry>
 951        <entry>table/column name separator</entry>
 952       </row>
 953
 954       <row>
 955        <entry><token>-</token></entry>
 956        <entry>right</entry>
 957        <entry>unary minus</entry>
 958       </row>
 959
 960       <row>
 961        <entry><token>^</token></entry>
 962        <entry>left</entry>
 963        <entry>exponentiation</entry>
 964       </row>
 965
 966       <row>
 967        <entry><token>*</token> <token>/</token> <token>%</token></entry>
 968        <entry>left</entry>
 969        <entry>multiplication, division, modulo</entry>
 970       </row>
 971
 972       <row>
 973        <entry><token>+</token> <token>-</token></entry>
 974        <entry>left</entry>
 975        <entry>addition, subtraction</entry>
 976       </row>
 977
 978       <row>
 979        <entry><token>IS</token></entry>
 980        <entry></entry>
 981        <entry>test for TRUE, FALSE, NULL</entry>
 982       </row>
 983
 984       <row>
 985        <entry><token>ISNULL</token></entry>
 986        <entry></entry>
 987        <entry>test for NULL</entry>
 988       </row>
 989
 990       <row>
 991        <entry><token>NOTNULL</token></entry>
 992        <entry></entry>
       <entry>test for non-NULL</entry>
 994       </row>
 995
 996       <row>
 997        <entry>(any other)</entry>
 998        <entry>left</entry>
 999        <entry>all other native and user-defined operators</entry>
1000       </row>
1001
1002       <row>
1003        <entry><token>IN</token></entry>
1004        <entry></entry>
1005        <entry>set membership</entry>
1006       </row>
1007
1008       <row>
1009        <entry><token>BETWEEN</token></entry>
1010        <entry></entry>
1011        <entry>containment</entry>
1012       </row>
1013
1014       <row>
1015        <entry><token>OVERLAPS</token></entry>
1016        <entry></entry>
1017        <entry>time interval overlap</entry>
1018       </row>
1019
1020       <row>
1021        <entry><token>LIKE</token> <token>ILIKE</token></entry>
1022        <entry></entry>
1023        <entry>string pattern matching</entry>
1024       </row>
1025
1026       <row>
1027        <entry><token>&lt;</token> <token>&gt;</token></entry>
1028        <entry></entry>
1029        <entry>less than, greater than</entry>
1030       </row>
1031
1032       <row>
1033        <entry><token>=</token></entry>
1034        <entry>right</entry>
1035        <entry>equality, assignment</entry>
1036       </row>
1037
1038       <row>
1039        <entry><token>NOT</token></entry>
1040        <entry>right</entry>
1041        <entry>logical negation</entry>
1042       </row>
1043
1044       <row>
1045        <entry><token>AND</token></entry>
1046        <entry>left</entry>
1047        <entry>logical conjunction</entry>
1048       </row>
1049
1050       <row>
1051        <entry><token>OR</token></entry>
1052        <entry>left</entry>
1053        <entry>logical disjunction</entry>
1054       </row>
1055      </tbody>
1056     </tgroup>
1057    </table>
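
   <para>
    As a brief sketch of how the table is read: multiplication appears
    above addition and therefore binds more tightly, and parentheses
    can be used to override that grouping:
<programlisting>
SELECT 2 + 3 * 4;     -- parsed as 2 + (3 * 4), giving 14
SELECT (2 + 3) * 4;   -- parentheses force the addition first, giving 20
</programlisting>
   </para>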
1058
1059    <para>
    Note that the operator precedence rules also apply to user-defined
    operators that have the same names as the built-in operators
    mentioned above.  For example, if you define a
    <quote>+</quote> operator for some custom data type, it will have
    the same precedence as the built-in <quote>+</quote> operator, no
    matter what yours does.
1066    </para>
1067   </sect1>
1068
1069 </chapter>
1070
1071 <!-- Keep this comment at the end of the file
1072 Local variables:
1073 mode:sgml
1074 sgml-omittag:nil
1075 sgml-shorttag:t
1076 sgml-minimize-attributes:nil
1077 sgml-always-quote-attributes:t
1078 sgml-indent-step:1
1079 sgml-indent-data:t
1080 sgml-parent-document:nil
1081 sgml-default-dtd-file:"./reference.ced"
1082 sgml-exposed-tags:nil
1083 sgml-local-catalogs:("/usr/lib/sgml/catalog")
1084 sgml-local-ecat-files:nil
1085 End:
1086 -->