doc/src/sgml/syntax.sgml

   1 <!--
   2 $Header: /cvsroot/pgsql/doc/src/sgml/syntax.sgml,v 1.38 2001/01/26 22:04:22 petere Exp $
   3 -->
   4
   5 <chapter id="sql-syntax">
   6  <title>SQL Syntax</title>
   7
   8   <abstract>
   9    <para>
  10     A description of the general syntax of SQL.
  11    </para>
  12   </abstract>
  13
  14  <sect1 id="sql-syntax-lexical">
  15   <title>Lexical Structure</title>
  16
  17   <para>
  18    SQL input consists of a sequence of
  19    <firstterm>commands</firstterm>.  A command is composed of a
  20    sequence of <firstterm>tokens</firstterm>, terminated by a
  21    semicolon (<quote>;</quote>).  The end of the input stream also
  22    terminates a command.  Which tokens are valid depends on the syntax
  23    of the particular command.
  24   </para>
  25
  26   <para>
  27    A token can be a <firstterm>key word</firstterm>, an
  28    <firstterm>identifier</firstterm>, a <firstterm>quoted
  29    identifier</firstterm>, a <firstterm>literal</firstterm> (or
  30    constant), or a special character symbol.  Tokens are normally
  31    separated by whitespace (space, tab, newline), but need not be if
  32    there is no ambiguity (which is generally only the case if a
  33    special character is adjacent to some other token type).
  34   </para>
  35
  36   <para>
  37    Additionally, <firstterm>comments</firstterm> can occur in SQL
  38    input.  They are not tokens; they are effectively equivalent to
  39    whitespace.
  40   </para>
  41
  42   <informalexample id="sql-syntax-ex-commands">
  43    <para>
  44     For example, the following is (syntactically) valid SQL input:
  45 <programlisting>
  46 SELECT * FROM MY_TABLE;
  47 UPDATE MY_TABLE SET A = 5;
  48 INSERT INTO MY_TABLE VALUES (3, 'hi there');
  49 </programlisting>
  50     This is a sequence of three commands, one per line (although this
  51     is not required; more than one command can be on a line, and
  52     commands can be usefully split across lines).
  53    </para>
  54   </informalexample>
  55
  56   <para>
  57    The SQL syntax is not very consistent regarding which tokens
  58    identify commands and which are operands or parameters.  The first
  59    few tokens are generally the command name, so in the above example
  60    we would usually speak of a <quote>SELECT</quote>, an
  61    <quote>UPDATE</quote>, and an <quote>INSERT</quote> command.  But
  62    for instance the <command>UPDATE</command> command always requires
  63    a <token>SET</token> token to appear in a certain position, and
  64    this particular variation of <command>INSERT</command> also
  65    requires a <token>VALUES</token> in order to be complete.  The
  66    precise syntax rules for each command are described in the
  67    <citetitle>Reference Manual</citetitle>.
  68   </para>
  69
  70   <sect2 id="sql-syntax-identifiers">
  71    <title>Identifiers and Key Words</title>
  72
  73    <para>
  74     Tokens such as <token>SELECT</token>, <token>UPDATE</token>, or
  75     <token>VALUES</token> in the example above are examples of
  76     <firstterm>key words</firstterm>, that is, words that have a fixed
  77     meaning in the SQL language.  The tokens <token>MY_TABLE</token>
  78     and <token>A</token> are examples of
  79     <firstterm>identifiers</firstterm>.  They identify names of
  80     tables, columns, or other database objects, depending on the
  81     command they are used in.  Therefore they are sometimes simply
  82     called <quote>names</quote>.  Key words and identifiers have the
  83     same lexical structure, meaning that one cannot know whether a
  84     token is an identifier or a key word without knowing the language.
  85     A complete list of key words can be found in <xref
  86     linkend="sql-keywords-appendix">.
  87    </para>
  88
  89    <para>
  90     SQL identifiers and key words must begin with a letter
  91     (<literal>a</literal>-<literal>z</literal>) or underscore
  92     (<literal>_</literal>).  Subsequent characters in an identifier or
  93     key word can be letters, digits
  94     (<literal>0</literal>-<literal>9</literal>), or underscores,
  95     although the SQL standard will not define a key word that contains
  96     digits or starts or ends with an underscore.
  97    </para>
  98
  99    <para>
 100     The system uses no more than <symbol>NAMEDATALEN</symbol>-1
 101     characters of an identifier; longer names can be written in
 102     commands, but they will be truncated.  By default,
 103     <symbol>NAMEDATALEN</symbol> is 32 so the maximum identifier length
 104     is 31 (but at the time the system is built,
 105     <symbol>NAMEDATALEN</symbol> can be changed in
 106     <filename>src/include/postgres_ext.h</filename>).
 107    </para>
 108
 109    <para>
 110     Identifier and key word names are case insensitive.  Therefore
 111 <programlisting>
 112 UPDATE MY_TABLE SET A = 5;
 113 </programlisting>
 114     can equivalently be written as
 115 <programlisting>
 116 uPDaTE my_TabLE SeT a = 5;
 117 </programlisting>
 118     A good convention to adopt is perhaps to write key words in upper
 119     case and names in lower case, e.g.,
 120 <programlisting>
 121 UPDATE my_table SET a = 5;
 122 </programlisting>
 123    </para>
 124
 125    <para>
 126     There is a second kind of identifier:  the <firstterm>delimited
 127     identifier</firstterm> or <firstterm>quoted
 128     identifier</firstterm>.  It is formed by enclosing an arbitrary
 129     sequence of characters in double-quotes
 130     (<literal>"</literal>). <!-- " font-lock mania --> A delimited
 131     identifier is always an identifier, never a key word.  So
 132     <literal>"select"</literal> could be used to refer to a column or
 133     table named <quote>select</quote>, whereas an unquoted
 134     <literal>select</literal> would be taken as part of a command and
 135     would therefore provoke a parse error when used where a table or
 136     column name is expected.  The example can be written with quoted
 137     identifiers like so:
 138 <programlisting>
 139 UPDATE "my_table" SET "a" = 5;
 140 </programlisting>
 141    </para>
 142
 143    <para>
 144     Quoted identifiers can contain any character other than a double
 145     quote itself.  This allows constructing table or column names that
 146     would otherwise not be possible, such as ones containing spaces or
 147     ampersands.  The length limitation still applies.
 148    </para>
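
   <para>
    As a brief illustration of the paragraph above, the following
    sketch creates and queries a table whose names contain spaces and
    an ampersand.  The table and column names are invented for this
    example only:
<programlisting>
-- hypothetical names; each one needs double quotes to be accepted
CREATE TABLE "order items" ("R&amp;D code" int4, "unit price" float8);
SELECT "R&amp;D code", "unit price" FROM "order items";
</programlisting>
    Without the double quotes, none of these names could be written
    as a single identifier.
   </para>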
 149
 150    <para>
 151     Quoting an identifier also makes it case-sensitive, whereas
 152     unquoted names are always folded to lower case.  For example, the
 153     identifiers <literal>FOO</literal>, <literal>foo</literal> and
 154     <literal>"foo"</literal> are considered the same by
 155     <productname>Postgres</productname>, but <literal>"Foo"</literal>
 156     and <literal>"FOO"</literal> are different from these three and
 157     each other.
 158     <footnote>
 159      <para>
 160       This is incompatible with SQL, where unquoted names are folded to
 161       upper case.  Thus, under the standard, <literal>foo</literal> is
 162       equivalent to <literal>"FOO"</literal>, not <literal>"foo"</literal>.  If you want to write portable
 163       applications you are advised to always quote a particular name or
 164       never quote it.
 165      </para>
 166     </footnote>
 167    </para>
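
   <para>
    The folding rule can be seen in a short sketch.  The table name
    is hypothetical, and the comments describe the behavior expected
    from the rules stated above:
<programlisting>
CREATE TABLE Foo (a int4);   -- unquoted, so the name is stored as foo
SELECT * FROM FOO;           -- folded to foo, refers to the same table
SELECT * FROM "foo";         -- exact match, refers to the same table
SELECT * FROM "Foo";         -- looks for a different name, Foo
</programlisting>
    The last command should fail unless a table was separately
    created with the quoted name <literal>"Foo"</literal>.
   </para>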
 168   </sect2>
 169
 170
 171   <sect2 id="sql-syntax-constants">
 172    <title>Constants</title>
 173
 174    <para>
 175     There are four kinds of <firstterm>implicitly typed
 176     constants</firstterm> in <productname>Postgres</productname>:
 177     strings, bit strings, integers, and floating point numbers.
 178     Constants can also be specified with explicit types, which can
 179     enable more accurate representation and more efficient handling by
 180     the system. The implicit constants are described below; explicit
 181     constants are discussed afterwards.
 182    </para>
 183
 184    <sect3 id="sql-syntax-strings">
 185     <title>String Constants</title>
 186
 187     <para>
 188      A string constant in SQL is an arbitrary sequence of characters
 189      bounded by single quotes (<quote>'</quote>), e.g., <literal>'This
 190      is a string'</literal>.  SQL allows single quotes to be embedded
 191      in strings by typing two adjacent single quotes (e.g.,
 192      <literal>'Dianne''s horse'</literal>).  In
 193      <productname>Postgres</productname> single quotes may
 194      alternatively be escaped with a backslash (<quote>\</quote>,
 195      e.g., <literal>'Dianne\'s horse'</literal>).
 196     </para>
 197
 198     <para>
 199      C-style backslash escapes are also available:
 200      <literal>\b</literal> is a backspace, <literal>\f</literal> is a
 201      form feed, <literal>\n</literal> is a newline,
 202      <literal>\r</literal> is a carriage return, <literal>\t</literal>
 203      is a tab, and <literal>\<replaceable>xxx</replaceable></literal>,
 204      where <replaceable>xxx</replaceable> is an octal number, is the
 205      character with the corresponding ASCII code.  Any other character
 206      following a backslash is taken literally.  Thus, to include a
 207      backslash in a string constant, type two backslashes.
 208     </para>
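
    <para>
     A few constants written with these escapes are sketched below,
     assuming the backslash handling described in this section:
<programlisting>
SELECT 'first line\nsecond line';   -- \n stands for a newline
SELECT 'C:\\temp';                  -- two backslashes yield one backslash
SELECT '\101';                      -- octal 101 is ASCII code 65, the letter A
</programlisting>
    </para>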
 209
 210     <para>
 211      The character with the code zero cannot be in a string constant.
 212     </para>
 213
 214     <para>
 215      Two string constants that are only separated by whitespace
 216      <emphasis>with at least one newline</emphasis> are concatenated
 217      and effectively treated as if the string had been written in one
 218      constant.  For example:
 219 <programlisting>
 220 SELECT 'foo'
 221 'bar';
 222 </programlisting>
 223      is equivalent to
 224 <programlisting>
 225 SELECT 'foobar';
 226 </programlisting>
 227      but
 228 <programlisting>
 229 SELECT 'foo'      'bar';
 230 </programlisting>
 231      is not valid syntax.
 232     </para>
 233    </sect3>
 234
 235    <sect3 id="sql-syntax-bit-strings">
 236     <title>Bit String Constants</title>
 237
 238     <para>
 239      Bit string constants look like string constants with a
 240      <literal>B</literal> (upper or lower case) immediately before the
 241      opening quote (no intervening whitespace), e.g.,
 242      <literal>B'1001'</literal>.  The only characters allowed within
 243      bit string constants are <literal>0</literal> and
 244      <literal>1</literal>.  Bit string constants can be continued
 245      across lines in the same way as regular string constants.
 246     </para>
 247    </sect3>
 248
 249    <sect3>
 250     <title>Integer Constants</title>
 251
 252     <para>
 253      Integer constants in SQL are sequences of decimal digits (0
 254      through 9) with no decimal point.  The range of legal values
 255      depends on which integer data type is used, but the plain
 256      <type>integer</type> type accepts values ranging from -2147483648
 257      to +2147483647.  (The optional plus or minus sign is actually a
 258      separate unary operator and not part of the integer constant.)
 259     </para>
 260    </sect3>
 261
 262    <sect3>
 263     <title>Floating Point Constants</title>
 264
 265     <para>
 266      Floating point constants are accepted in these general forms:
 267 <synopsis>
 268 <replaceable>digits</replaceable>.<optional><replaceable>digits</replaceable></optional><optional>e<optional>+-</optional><replaceable>digits</replaceable></optional>
 269 <optional><replaceable>digits</replaceable></optional>.<replaceable>digits</replaceable><optional>e<optional>+-</optional><replaceable>digits</replaceable></optional>
 270 <replaceable>digits</replaceable>e<optional>+-</optional><replaceable>digits</replaceable>
 271 </synopsis>
 272      where <replaceable>digits</replaceable> is one or more decimal
 273      digits.  At least one digit must appear before or after the
 274      decimal point, and at least one must follow the <literal>e</literal> if an exponent is given.
 275      Thus, a floating point constant is distinguished from an integer
 276      constant by the presence of either the decimal point or the
 277      exponent clause (or both).  There must not be a space or other
 278      characters embedded in the constant.
 279     </para>
 280
 281     <informalexample>
 282      <para>
 283       These are some examples of valid floating point constants:
 284 <literallayout>
 285 3.5
 286 4.
 287 .001
 288 5e2
 289 1.925e-3
 290 </literallayout>
 291      </para>
 292     </informalexample>
 293
 294     <para>
 295      Floating point constants are of type <type>DOUBLE
 296      PRECISION</type>. <type>REAL</type> can be specified explicitly
 297      by using <acronym>SQL</acronym> string notation or
 298      <productname>Postgres</productname> type notation:
 299
 300 <programlisting>
 301 REAL '1.23'  -- string style
 302 '1.23'::REAL -- Postgres (historical) style
 303 </programlisting>
 304     </para>
 305    </sect3>
 306
 307    <sect3 id="sql-syntax-constants-generic">
 308     <title>Constants of Other Types</title>
 309
 310     <para>
 311      A constant of an <emphasis>arbitrary</emphasis> type can be
 312      entered using any one of the following notations:
 313 <synopsis>
 314 <replaceable>type</replaceable> '<replaceable>string</replaceable>'
 315 '<replaceable>string</replaceable>'::<replaceable>type</replaceable>
 316 CAST ( '<replaceable>string</replaceable>' AS <replaceable>type</replaceable> )
 317 </synopsis>
 318      The value inside the string is passed to the input conversion
 319      routine for the type called <replaceable>type</replaceable>. The
 320      result is a constant of the indicated type.  The explicit type
 321      cast may be omitted if there is no ambiguity as to the type the
 322      constant must be (for example, when it is passed as an argument
 323      to a non-overloaded function), in which case it is automatically
 324      coerced.
 325     </para>
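
    <para>
     For illustration, the three notations can be applied to the same
     string.  The type <type>float8</type> is chosen here only as a
     convenient example; any type with an input conversion routine
     should work the same way:
<programlisting>
SELECT float8 '3.5';             -- type 'string'
SELECT '3.5'::float8;            -- 'string'::type
SELECT CAST ('3.5' AS float8);   -- CAST ( 'string' AS type )
</programlisting>
     Each command is expected to produce the same
     <type>float8</type> value.
    </para>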
 326
 327     <para>
 328      It is also possible to specify a type coercion using a function-like
 329      syntax:
 330 <synopsis>
 331 <replaceable>typename</replaceable> ( <replaceable>value</replaceable> )
 332 </synopsis>
 333      although this only works for types whose names are also valid as
 334      function names.  (For example, <literal>double precision</literal>
 335      can't be used this way --- but the equivalent <literal>float8</literal>
 336      can.)
 337     </para>
 338
 339     <para>
 340      The <literal>::</literal>, <literal>CAST()</literal>, and
 341      function-call syntaxes can also be used to specify the type of
 342      arbitrary expressions, but the form
 343      <replaceable>type</replaceable>
 344      '<replaceable>string</replaceable>' can only be used to specify
 345      the type of a literal constant.
 346     </para>
 347    </sect3>
 348
 349    <sect3>
 350     <title>Array Constants</title>
 351
 352     <para>
 353      The general format of an array constant is the following:
 354 <synopsis>
 355 '{ <replaceable>val1</replaceable> <replaceable>delim</replaceable> <replaceable>val2</replaceable> <replaceable>delim</replaceable> ... }'
 356 </synopsis>
 357      where <replaceable>delim</replaceable> is the delimiter character
 358      for the type, as recorded in its <literal>pg_type</literal>
 359      entry.  (For all built-in types, this is the comma character
 360      ",".)  Each <replaceable>val</replaceable> is either a constant
 361      of the array element type, or a sub-array.  An example of an
 362      array constant is
 363 <programlisting>
 364 '{{1,2,3},{4,5,6},{7,8,9}}'
 365 </programlisting>
 366      This constant is a two-dimensional, 3 by 3 array consisting of three
 367      sub-arrays of integers.
 368     </para>
 369
 370     <para>
 371      Individual array elements can be placed between double-quote
 372      marks (<literal>"</literal>) <!-- " --> to avoid ambiguity
 373      problems with respect to white space.  Without quote marks, the
 374      array-value parser will skip leading white space.
 375     </para>
 376
 377     <para>
 378      (Array constants are actually only a special case of the generic
 379      type constants discussed in the previous section.  The constant
 380      is initially treated as a string and passed to the array input
 381      conversion routine.  An explicit type specification might be
 382      necessary.)
 383     </para>
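
    <para>
     A minimal sketch of an array constant in use follows.  The table
     <literal>tictactoe</literal> and its column are hypothetical:
<programlisting>
CREATE TABLE tictactoe (squares int4[3][3]);

-- the string is handed to the array input routine for int4[]
INSERT INTO tictactoe VALUES ('{{1,2,3},{4,5,6},{7,8,9}}');

-- subscripts count from one: second sub-array, third element
SELECT squares[2][3] FROM tictactoe;
</programlisting>
     Because the column type is known, no explicit type specification
     is needed on the constant in the <command>INSERT</command>.
    </para>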
 384    </sect3>
 385   </sect2>
 386
 387
 388   <sect2 id="sql-syntax-operators">
 389    <title>Operators</title>
 390
 391    <para>
 392     An operator is a sequence of up to <symbol>NAMEDATALEN</symbol>-1
 393     (31 by default) characters from the following list:
 394 <literallayout>
 395 + - * / &lt; &gt; = ~ ! @ # % ^ &amp; | ` ? $
 396 </literallayout>
 397
 398     There are a few restrictions on operator names, however:
 399     <itemizedlist>
 400      <listitem>
 401       <para>
 402        "$" (dollar) cannot be a single-character operator, although it
 403        can be part of a multi-character operator name.
 404       </para>
 405      </listitem>
 406
 407      <listitem>
 408       <para>
 409        <literal>--</literal> and <literal>/*</literal> cannot appear
 410        anywhere in an operator name, since they will be taken as the
 411        start of a comment.
 412       </para>
 413      </listitem>
 414
 415      <listitem>
 416       <para>
 417        A multi-character operator name cannot end in "+" or "-",
 418        unless the name also contains at least one of these characters:
 419 <literallayout>
 420 ~ ! @ # % ^ &amp; | ` ? $
 421 </literallayout>
 422        For example, <literal>@-</literal> is an allowed operator name,
 423        but <literal>*-</literal> is not.  This restriction allows
 424        <productname>Postgres</productname> to parse SQL-compliant
 425        queries without requiring spaces between tokens.
 426       </para>
 427      </listitem>
 428     </itemizedlist>
 429    </para>
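
   <para>
    The effect of the last restriction can be sketched with ordinary
    arithmetic.  Because <literal>*-</literal> is not an allowed
    operator name, the two characters are read as separate tokens:
<programlisting>
SELECT 3*-2;     -- read as 3 * (-2), a multiplication and a unary minus
SELECT 3 * -2;   -- the same expression written with spaces
</programlisting>
   </para>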
 430
 431    <para>
 432     When working with non-SQL-standard operator names, you will usually
 433     need to separate adjacent operators with spaces to avoid ambiguity.
 434     For example, if you have defined a left-unary operator named "@",
 435     you cannot write <literal>X*@Y</literal>; you must write
 436     <literal>X* @Y</literal> to ensure that
 437     <productname>Postgres</productname> reads it as two operator names
 438     not one.
 439    </para>
 440   </sect2>
 441
 442   <sect2>
 443    <title>Special Characters</title>
 444
 445   <para>
 446    Some characters that are not alphanumeric have a special meaning
 447    that is different from being an operator.  Details on the usage can
 448    be found at the location where the respective syntax element is
 449    described.  This section exists only to point out the existence and
 450    summarize the purposes of these characters.
 451
 452    <itemizedlist>
 453     <listitem>
 454      <para>
 455       A dollar sign (<literal>$</literal>) followed by digits is used
 456       to represent the positional parameters in the body of a function
 457       definition.  In other contexts the dollar sign may be part of an
 458       operator name.
 459      </para>
 460     </listitem>
 461
 462     <listitem>
 463      <para>
 464       Parentheses (<literal>()</literal>) have their usual meaning to
 465       group expressions and enforce precedence.  In some cases
 466       parentheses are required as part of the fixed syntax of a
 467       particular SQL command.
 468      </para>
 469     </listitem>
 470
 471     <listitem>
 472      <para>
 473       Brackets (<literal>[]</literal>) are used to select the elements
 474       of an array.  See <xref linkend="arrays"> for more information
 475       on arrays.
 476      </para>
 477     </listitem>
 478
 479     <listitem>
 480      <para>
 481       Commas (<literal>,</literal>) are used in some syntactical
 482       constructs to separate the elements of a list.
 483      </para>
 484     </listitem>
 485
 486     <listitem>
 487      <para>
 488       The semicolon (<literal>;</literal>) terminates an SQL command.
 489       It cannot appear anywhere within a command, except when quoted
 490       as a string constant or identifier.
 491      </para>
 492     </listitem>
 493
 494     <listitem>
 495      <para>
 496       The colon (<literal>:</literal>) is used to select
 497       <quote>slices</quote> from arrays. (See <xref
 498       linkend="arrays">.)  In certain SQL dialects (such as Embedded
 499       SQL), the colon is used to prefix variable names.
 500      </para>
 501     </listitem>
 502
 503     <listitem>
 504      <para>
 505       The asterisk (<literal>*</literal>) has a special meaning when
 506       used in the <command>SELECT</command> command or with the
 507       <function>COUNT</function> aggregate function.
 508      </para>
 509     </listitem>
 510
 511     <listitem>
 512      <para>
 513       The period (<literal>.</literal>) is used in floating point
 514       constants, and to separate table and column names.
 515      </para>
 516     </listitem>
 517    </itemizedlist>
 518
 519    </para>
 520   </sect2>
 521
 522   <sect2 id="sql-syntax-comments">
 523    <title>Comments</title>
 524
 525    <para>
 526     A comment is an arbitrary sequence of characters beginning with
 527     double dashes and extending to the end of the line, e.g.:
 528 <programlisting>
 529 -- This is a standard SQL92 comment
 530 </programlisting>
 531    </para>
 532
 533    <para>
 534     Alternatively, C-style block comments can be used:
 535 <programlisting>
 536 /* multi-line comment
 537  * with nesting: /* nested block comment */
 538  */
 539 </programlisting>
 540     where the comment begins with <literal>/*</literal> and extends to
 541     the matching occurrence of <literal>*/</literal>. These block
 542     comments nest, as specified in SQL99 but unlike C, so that one can
 543     comment out larger blocks of code that may contain existing block
 544     comments.
 545    </para>
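
   <para>
    As a sketch of why nesting is useful, the block below disables two
    commands, one of which already carries a block comment of its own
    (the table is the one from the earlier examples):
<programlisting>
/* Disabled while testing:
UPDATE my_table SET a = 5;   /* old default */
INSERT INTO my_table VALUES (3, 'hi there');
*/
SELECT * FROM my_table;
</programlisting>
    Only the final <command>SELECT</command> remains as a command.
    If block comments did not nest, the outer comment would end at the
    first <literal>*/</literal> and the <command>INSERT</command>
    would no longer be commented out.
   </para>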
 546
 547    <para>
 548     A comment is removed from the input stream before further syntax
 549     analysis and is effectively replaced by whitespace.
 550    </para>
 551   </sect2>
 552  </sect1>
 553
 554
 555   <sect1 id="sql-syntax-columns">
 556    <title>Columns</title>
 557
 558     <para>
 559      A <firstterm>column</firstterm>
 560      is either a user-defined column of a given table or one of the
 561      following system-defined columns:
 562
 563      <variablelist>
 564       <varlistentry>
 565        <term>oid</term>
 566        <listitem>
 567         <para>
 568          The unique identifier (object ID) of a row.  This is a serial number
 569          that is added by Postgres to all rows automatically. OIDs are not
 570          reused and are 32-bit quantities.
 571         </para>
 572        </listitem>
 573       </varlistentry>
 574
 575       <varlistentry>
 576       <term>tableoid</term>
 577        <listitem>
 578         <para>
 579          The OID of the table containing this row.  This attribute is
 580          particularly handy for queries that select from inheritance
 581          hierarchies, since without it, it's difficult to tell which
 582          individual table a row came from.  The tableoid can be joined
 583          against the OID attribute of pg_class to obtain the table name.
 584         </para>
 585        </listitem>
 586       </varlistentry>
 587
 588       <varlistentry>
 589        <term>xmin</term>
 590        <listitem>
 591         <para>
 592          The identity (transaction ID) of the inserting transaction for
 593          this tuple.  (Note: a tuple is an individual state of a row;
 594          each UPDATE of a row creates a new tuple for the same logical row.)
 595         </para>
 596        </listitem>
 597       </varlistentry>
 598
 599       <varlistentry>
 600       <term>cmin</term>
 601        <listitem>
 602         <para>
 603          The command identifier (starting at zero) within the inserting
 604          transaction.
 605         </para>
 606        </listitem>
 607       </varlistentry>
 608
 609       <varlistentry>
 610       <term>xmax</term>
 611        <listitem>
 612         <para>
 613          The identity (transaction ID) of the deleting transaction,
 614          or zero for an undeleted tuple.  In practice, this is never nonzero
 615          for a visible tuple.
 616         </para>
 617        </listitem>
 618       </varlistentry>
 619
 620       <varlistentry>
 621       <term>cmax</term>
 622        <listitem>
 623         <para>
 624          The command identifier within the deleting transaction, or zero.
 625          Again, this is never nonzero for a visible tuple.
 626         </para>
 627        </listitem>
 628       </varlistentry>
 629
 630       <varlistentry>
 631       <term>ctid</term>
 632        <listitem>
 633         <para>
 634          The tuple ID of the tuple within its table.  This is a pair
 635          (block number, tuple index within block) that identifies the
 636          physical location of the tuple.  Note that although the ctid
 637          can be used to locate the tuple very quickly, a row's ctid
 638          will change each time it is updated or moved by VACUUM.
 639          Therefore ctid is useless as a long-term row identifier.
 640          The OID, or even better a user-defined serial number, should
 641          be used to identify logical rows.
 642         </para>
 643        </listitem>
 644       </varlistentry>
 645      </variablelist>
 646     </para>
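
    <para>
     The system columns are not included in <literal>*</literal> and
     have to be named explicitly.  A minimal sketch, assuming a
     hypothetical table <literal>my_table</literal>:
<programlisting>
-- show the system columns next to the user-defined ones
SELECT oid, xmin, cmin, xmax, cmax, ctid, * FROM my_table;

-- join tableoid against pg_class to obtain the table name
SELECT c.relname, t.*
    FROM my_table t, pg_class c
    WHERE t.tableoid = c.oid;
</programlisting>
    </para>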
 647
 648     <para>
 649      For further information on the system attributes consult
 650      <xref linkend="STON87a" endterm="STON87a">.
 651      Transaction and command identifiers are 32-bit quantities.
 652     </para>
 653
 654   </sect1>
 655
 656
 657  <sect1 id="sql-expressions">
 658   <title>Value Expressions</title>
 659
 660   <para>
 661    Value expressions are used in a variety of syntactic contexts, such
 662    as in the target list of the <command>SELECT</command> command, as
 663    new column values in <command>INSERT</command> or
 664    <command>UPDATE</command>, or in search conditions in a number of
 665    commands.  The result of a value expression is sometimes called a
 666    <firstterm>scalar</firstterm>, to distinguish it from the result of
 667    a table expression (which is a table).  Value expressions are
 668    therefore also called <firstterm>scalar expressions</firstterm> (or
 669    even simply <firstterm>expressions</firstterm>).  The expression
 670    syntax allows the calculation of values from primitive parts using
 671    arithmetic, logical, set, and other operations.
 672   </para>
 673
 674   <para>
 675    A value expression is one of the following:
 676
 677    <itemizedlist>
 678     <listitem>
 679      <para>
 680       A constant or literal value; see <xref linkend="sql-syntax-constants">.
 681      </para>
 682     </listitem>
 683
 684     <listitem>
 685      <para>
 686       A column reference
 687      </para>
 688     </listitem>
 689
 690     <listitem>
 691      <para>
 692       An operator invocation:
 693       <simplelist>
 694        <member><replaceable>expression</replaceable> <replaceable>operator</replaceable> <replaceable>expression</replaceable> (binary infix operator)</member>
 695        <member><replaceable>expression</replaceable> <replaceable>operator</replaceable> (unary postfix operator)</member>
 696        <member><replaceable>operator</replaceable> <replaceable>expression</replaceable> (unary prefix operator)</member>
 697       </simplelist>
 698       where <replaceable>operator</replaceable> follows the syntax
 699       rules of <xref linkend="sql-syntax-operators"> or is one of the
 700       tokens <token>AND</token>, <token>OR</token>, and
 701       <token>NOT</token>.  What particular operators exist and whether
 702       they are unary or binary depends on what operators have been
 703       defined by the system or the user.  <xref linkend="functions">
 704       describes the built-in operators.
 705      </para>
 706     </listitem>
 707
 708     <listitem>
 709      <para>
 710 <synopsis>( <replaceable>expression</replaceable> )</synopsis>
 711       Parentheses are used to group subexpressions and override precedence.
 712      </para>
 713     </listitem>
 714
 715     <listitem>
 716      <para>
 717       A positional parameter reference, in the body of a function declaration.
 718      </para>
 719     </listitem>
 720
 721     <listitem>
 722      <para>
 723       A function call
 724      </para>
 725     </listitem>
 726
 727     <listitem>
 728      <para>
 729       An aggregate expression
 730      </para>
 731     </listitem>
 732
 733     <listitem>
 734      <para>
 735       A scalar subquery.  This is an ordinary
 736       <command>SELECT</command> in parentheses that returns exactly one
 737       row with one column.  It is an error to use a subquery that
 738       returns more than one row or more than one column in the context
 739       of a value expression.
 740      </para>
 741     </listitem>
 742    </itemizedlist>
 743   </para>
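
  <para>
   As an illustration of the last item, the following sketch uses a
   scalar subquery in the target list.  The tables
   <literal>states</literal> and <literal>cities</literal> and their
   columns are hypothetical:
<programlisting>
-- for each state, the largest population among its cities
SELECT name,
       (SELECT max(pop) FROM cities WHERE cities.state = states.name)
    FROM states;
</programlisting>
   The subquery is acceptable here because an aggregate without
   grouping returns a single row with a single column.
  </para>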
 744
 745   <para>
 746    In addition to this list, there are a number of constructs that can
 747    be classified as an expression but do not follow any general syntax
 748    rules.  These generally have the semantics of a function or
 749    operator and are explained in the appropriate location in <xref
 750    linkend="functions">.  An example is the <literal>IS NULL</literal>
 751    clause.
 752   </para>
 753
 754   <para>
 755    We have already discussed constants in <xref
 756    linkend="sql-syntax-constants">.  The following sections discuss
 757    the remaining options.
 758   </para>
 759
 760   <sect2>
 761    <title>Column References</title>
 762
 763    <para>
 764     A column can be referenced in the form:
 765 <synopsis>
 766 <replaceable>correlation</replaceable>.<replaceable>columnname</replaceable> `['<replaceable>subscript</replaceable>`]'
 767 </synopsis>
 768
 769     <replaceable>correlation</replaceable> is either the name of a
 770     table, an alias for a table defined by means of a FROM clause, or
 771     the keyword <literal>NEW</literal> or <literal>OLD</literal>.
 772     (NEW and OLD can only appear in the action portion of a rule,
 773     while other correlation names can be used in any SQL statement.)
 774     The correlation name can be omitted if the column name is unique
 775     across all the tables being used in the current query.  If
 776     <replaceable>columnname</replaceable> is of an array type, then the
 777     optional <replaceable>subscript</replaceable> selects a specific
 778     element in the array.  If no subscript is provided, then the whole
 779     array is selected.  Refer to the description of the particular
 780     commands in the <citetitle>PostgreSQL Reference Manual</citetitle>
 781     for the allowed syntax in each case.
 782    </para>
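
   <para>
    The forms described above might look as follows.  The tables
    <literal>my_table</literal> and <literal>results</literal>, and
    the array column <literal>scores</literal>, are assumptions made
    for this sketch:
<programlisting>
SELECT my_table.a FROM my_table;     -- table name as correlation
SELECT t.a FROM my_table t;          -- alias defined in the FROM clause
SELECT a FROM my_table;              -- correlation omitted, name is unique
SELECT r.scores[1] FROM results r;   -- one element of an array column
SELECT r.scores FROM results r;      -- no subscript, the whole array
</programlisting>
   </para>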
 783   </sect2>
 784
 785   <sect2>
 786    <title>Positional Parameters</title>
 787
 788    <para>
 789     A positional parameter reference stands for an argument of an
 790     SQL function.  It is typically used in the body of an SQL function
 791     definition.  The form of a parameter is:
 792 <synopsis>
 793 $<replaceable>number</replaceable>
 794 </synopsis>
 795    </para>
 796
 797    <para>
 798     For example, consider the definition of a function,
 799     <function>dept</function>, as
 800
 801 <programlisting>
 802 CREATE FUNCTION dept (text) RETURNS dept
 803   AS 'select * from dept where name = $1'
 804   LANGUAGE 'sql';
 805 </programlisting>
 806
 807     Here the <literal>$1</literal> will be replaced by the first
 808     function argument when the function is invoked.
 809    </para>
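
   <para>
    Parameters are numbered in the order in which the argument types
    are declared.  As a minimal sketch, the hypothetical function
    <function>add_em</function> below (a name chosen only for
    illustration) takes two arguments and refers to both:
<programlisting>
-- $1 is the first int4 argument, $2 the second
CREATE FUNCTION add_em (int4, int4) RETURNS int4
  AS 'select $1 + $2'
  LANGUAGE 'sql';

-- Invoke the function; 1 is substituted for $1 and 2 for $2
SELECT add_em(1, 2);
</programlisting>
   </para>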
 810   </sect2>
 811
 812   <sect2>
 813    <title>Function Calls</title>
 814
 815    <para>
    The syntax for a function call is the name of a legal function
    (subject to the syntax rules for identifiers in <xref
    linkend="sql-syntax-identifiers">), followed by its argument list
    enclosed in parentheses:
 820
 821 <synopsis>
 822 <replaceable>function</replaceable> (<optional><replaceable>expression</replaceable> <optional>, <replaceable>expression</replaceable> ... </optional></optional> )
 823 </synopsis>
 824    </para>
 825
 826    <para>
 827     For example, the following computes the square root of 2:
 828 <programlisting>
 829 sqrt(2)
 830 </programlisting>
 831    </para>
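
   <para>
    Because each argument is itself an expression, an argument can
    contain operators or further function calls.  A small sketch
    using built-in functions:
<programlisting>
SELECT sqrt(2 + 2);       -- the argument is an operator expression
SELECT round(sqrt(2));    -- the argument is another function call
</programlisting>
   </para>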
 832
 833    <para>
 834     The list of built-in functions is in <xref linkend="functions">.
 835     Other functions may be added by the user.
 836    </para>
 837   </sect2>
 838
 839   <sect2 id="syntax-aggregates">
 840    <title>Aggregate Expressions</title>
 841
 842    <para>
 843     An <firstterm>aggregate expression</firstterm> represents the
 844     application of an aggregate function across the rows selected by a
 845     query.  An aggregate function reduces multiple inputs to a single
 846     output value, such as the sum or average of the inputs.  The
 847     syntax of an aggregate expression is one of the following:
 848
 849     <simplelist>
 850      <member><replaceable>aggregate_name</replaceable> (<replaceable>expression</replaceable>)</member>
 851      <member><replaceable>aggregate_name</replaceable> (ALL <replaceable>expression</replaceable>)</member>
 852      <member><replaceable>aggregate_name</replaceable> (DISTINCT <replaceable>expression</replaceable>)</member>
 853      <member><replaceable>aggregate_name</replaceable> ( * )</member>
 854     </simplelist>
 855
 856     where <replaceable>aggregate_name</replaceable> is a previously
 857     defined aggregate, and <replaceable>expression</replaceable> is
 858     any expression that does not itself contain an aggregate
 859     expression.
 860    </para>
 861
 862    <para>
 863     The first form of aggregate expression invokes the aggregate
 864     across all input rows for which the given expression yields a
 865     non-NULL value.  The second form is the same as the first, since
 866     <literal>ALL</literal> is the default.  The third form invokes the
 867     aggregate for all distinct non-NULL values of the expression found
 868     in the input rows.  The last form invokes the aggregate once for
 869     each input row regardless of NULL or non-NULL values; since no
 870     particular input value is specified, it is generally only useful
 871     for the <function>count()</function> aggregate function.
 872    </para>
 873
 874    <para>
 875     For example, <literal>count(*)</literal> yields the total number
 876     of input rows; <literal>count(f1)</literal> yields the number of
 877     input rows in which <literal>f1</literal> is non-NULL;
 878     <literal>count(distinct f1)</literal> yields the number of
 879     distinct non-NULL values of <literal>f1</literal>.
 880    </para>
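
   <para>
    To make the distinction concrete, suppose that a hypothetical
    table <literal>t1</literal> has a single column
    <literal>f1</literal> and contains four rows with the values
    1, 1, 2, and NULL.  Under that assumption:
<programlisting>
SELECT count(*),            -- 4: every input row is counted
       count(f1),           -- 3: the row with NULL is skipped
       count(ALL f1),       -- 3: same as count(f1)
       count(DISTINCT f1)   -- 2: the distinct non-NULL values 1 and 2
  FROM t1;
</programlisting>
   </para>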
 881
 882    <para>
 883     The predefined aggregate functions are described in <xref
 884     linkend="functions-aggregate">.
 885    </para>
 886   </sect2>
 887
 888  </sect1>
 889
 890
 891   <sect1 id="sql-precedence">
 892    <title>Lexical Precedence</title>
 893
 894    <para>
    The precedence and associativity of the operators are hard-wired
    into the parser.  Most operators have the same precedence and are
    left-associative.  This may lead to non-intuitive behavior; for
    example, the comparison operators <token>&lt;</token> and
    <token>&gt;</token> have a different precedence than
    <token>&lt;=</token> and <token>&gt;=</token>, because the latter
    two fall into the <quote>any other</quote> category of the table
    below.  Also, you will sometimes need to add parentheses when
    using combinations of binary and unary operators.  For instance
<programlisting>
SELECT 5 &amp; ~ 6;
</programlisting>
    will be parsed as
<programlisting>
SELECT (5 &amp;) ~ 6;
</programlisting>
    because the parser has no idea that <token>&amp;</token> is
    defined as a binary operator.  This is the price one pays for
    extensibility.
 912    </para>
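
   <para>
    Assuming the intended meaning is the bitwise AND of 5 with the
    bitwise negation of 6, the unwanted parse can be avoided by
    parenthesizing the unary operator explicitly:
<programlisting>
-- The parentheses force ~ to be applied to 6 before &amp; is considered
SELECT 5 &amp; (~ 6);
</programlisting>
   </para>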
 913
 914    <table tocentry="1">
 915     <title>Operator Precedence (decreasing)</title>
 916
    <tgroup cols="3">
 918      <thead>
 919       <row>
 920        <entry>Operator/Element</entry>
 921        <entry>Associativity</entry>
 922        <entry>Description</entry>
 923       </row>
 924      </thead>
 925
 926      <tbody>
 927       <row>
 928        <entry><token>::</token></entry>
 929        <entry>left</entry>
       <entry><productname>PostgreSQL</productname>-style typecast</entry>
 931       </row>
 932
 933       <row>
 934        <entry><token>[</token> <token>]</token></entry>
 935        <entry>left</entry>
 936        <entry>array element selection</entry>
 937       </row>
 938
 939       <row>
 940        <entry><token>.</token></entry>
 941        <entry>left</entry>
 942        <entry>table/column name separator</entry>
 943       </row>
 944
 945       <row>
 946        <entry><token>-</token></entry>
 947        <entry>right</entry>
 948        <entry>unary minus</entry>
 949       </row>
 950
 951       <row>
 952        <entry><token>^</token></entry>
 953        <entry>left</entry>
 954        <entry>exponentiation</entry>
 955       </row>
 956
 957       <row>
 958        <entry><token>*</token> <token>/</token> <token>%</token></entry>
 959        <entry>left</entry>
 960        <entry>multiplication, division, modulo</entry>
 961       </row>
 962
 963       <row>
 964        <entry><token>+</token> <token>-</token></entry>
 965        <entry>left</entry>
 966        <entry>addition, subtraction</entry>
 967       </row>
 968
 969       <row>
 970        <entry><token>IS</token></entry>
 971        <entry></entry>
 972        <entry>test for TRUE, FALSE, NULL</entry>
 973       </row>
 974
 975       <row>
 976        <entry><token>ISNULL</token></entry>
 977        <entry></entry>
 978        <entry>test for NULL</entry>
 979       </row>
 980
 981       <row>
 982        <entry><token>NOTNULL</token></entry>
 983        <entry></entry>
       <entry>test for non-NULL</entry>
 985       </row>
 986
 987       <row>
 988        <entry>(any other)</entry>
 989        <entry>left</entry>
 990        <entry>all other native and user-defined operators</entry>
 991       </row>
 992
 993       <row>
 994        <entry><token>IN</token></entry>
 995        <entry></entry>
 996        <entry>set membership</entry>
 997       </row>
 998
 999       <row>
1000        <entry><token>BETWEEN</token></entry>
1001        <entry></entry>
1002        <entry>containment</entry>
1003       </row>
1004
1005       <row>
1006        <entry><token>OVERLAPS</token></entry>
1007        <entry></entry>
1008        <entry>time interval overlap</entry>
1009       </row>
1010
1011       <row>
1012        <entry><token>LIKE</token> <token>ILIKE</token></entry>
1013        <entry></entry>
1014        <entry>string pattern matching</entry>
1015       </row>
1016
1017       <row>
1018        <entry><token>&lt;</token> <token>&gt;</token></entry>
1019        <entry></entry>
1020        <entry>less than, greater than</entry>
1021       </row>
1022
1023       <row>
1024        <entry><token>=</token></entry>
1025        <entry>right</entry>
1026        <entry>equality, assignment</entry>
1027       </row>
1028
1029       <row>
1030        <entry><token>NOT</token></entry>
1031        <entry>right</entry>
1032        <entry>logical negation</entry>
1033       </row>
1034
1035       <row>
1036        <entry><token>AND</token></entry>
1037        <entry>left</entry>
1038        <entry>logical conjunction</entry>
1039       </row>
1040
1041       <row>
1042        <entry><token>OR</token></entry>
1043        <entry>left</entry>
1044        <entry>logical disjunction</entry>
1045       </row>
1046      </tbody>
1047     </tgroup>
1048    </table>
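
   <para>
    As a sketch of how the table is read, the following queries use
    only literals, so no table is assumed.  The comments show the
    grouping that the precedence rules above imply:
<programlisting>
SELECT 2 + 3 * 4;                -- parsed as 2 + (3 * 4), giving 14
SELECT (2 + 3) * 4;              -- parentheses override the default, giving 20
SELECT true OR false AND false;  -- parsed as true OR (false AND false)
</programlisting>
    When in doubt, explicit parentheses make the intended grouping
    clear to both the parser and the reader.
   </para>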
1049
1050    <para>
    Note that the operator precedence rules also apply to user-defined
    operators that have the same names as the built-in operators
    mentioned above.  For example, if you define a
    <quote>+</quote> operator for some custom data type, it will have
    the same precedence as the built-in <quote>+</quote> operator, no
    matter what yours does.
1057    </para>
1058   </sect1>
1059
1060 </chapter>
1061
1062 <!-- Keep this comment at the end of the file
1063 Local variables:
1064 mode:sgml
1065 sgml-omittag:nil
1066 sgml-shorttag:t
1067 sgml-minimize-attributes:nil
1068 sgml-always-quote-attributes:t
1069 sgml-indent-step:1
1070 sgml-indent-data:t
1071 sgml-parent-document:nil
1072 sgml-default-dtd-file:"./reference.ced"
1073 sgml-exposed-tags:nil
1074 sgml-local-catalogs:("/usr/lib/sgml/catalog")
1075 sgml-local-ecat-files:nil
1076 End:
1077 -->