doc/src/sgml/syntax.sgml

   1 <!-- $PostgreSQL: pgsql/doc/src/sgml/syntax.sgml,v 1.106 2006/03/10 19:10:49 momjian Exp $ -->
   2
   3 <chapter id="sql-syntax">
   4  <title>SQL Syntax</title>
   5
   6  <indexterm zone="sql-syntax">
   7   <primary>syntax</primary>
   8   <secondary>SQL</secondary>
   9  </indexterm>
  10
  11  <para>
  12   This chapter describes the syntax of SQL.  It forms the foundation
  13   for understanding the following chapters, which go into detail
  14   about how the SQL commands are applied to define and modify data.
  15  </para>
  16
  17  <para>
  18   We also advise users who are already familiar with SQL to read this
  19   chapter carefully because there are several rules and concepts that
  20   are implemented inconsistently among SQL databases or that are
  21   specific to <productname>PostgreSQL</productname>.
  22  </para>
  23
  24  <sect1 id="sql-syntax-lexical">
  25   <title>Lexical Structure</title>
  26
  27   <indexterm>
  28    <primary>token</primary>
  29   </indexterm>
  30
  31   <para>
  32    SQL input consists of a sequence of
  33    <firstterm>commands</firstterm>.  A command is composed of a
  34    sequence of <firstterm>tokens</firstterm>, terminated by a
  35    semicolon (<quote>;</quote>).  The end of the input stream also
  36    terminates a command.  Which tokens are valid depends on the syntax
  37    of the particular command.
  38   </para>
  39
  40   <para>
  41    A token can be a <firstterm>key word</firstterm>, an
  42    <firstterm>identifier</firstterm>, a <firstterm>quoted
  43    identifier</firstterm>, a <firstterm>literal</firstterm> (or
  44    constant), or a special character symbol.  Tokens are normally
  45    separated by whitespace (space, tab, newline), but need not be if
  46    there is no ambiguity (which is generally only the case if a
  47    special character is adjacent to some other token type).
  48   </para>
  49
  50   <para>
  51    Additionally, <firstterm>comments</firstterm> can occur in SQL
  52    input.  They are not tokens; they are effectively equivalent to
  53    whitespace.
  54   </para>
  55
  56    <para>
  57     For example, the following is (syntactically) valid SQL input:
  58 <programlisting>
  59 SELECT * FROM MY_TABLE;
  60 UPDATE MY_TABLE SET A = 5;
  61 INSERT INTO MY_TABLE VALUES (3, 'hi there');
  62 </programlisting>
  63     This is a sequence of three commands, one per line (although this
  64     is not required; more than one command can be on a line, and
  65     commands can usefully be split across lines).
  66    </para>
  67
  68   <para>
  69    The SQL syntax is not very consistent regarding which tokens
  70    identify commands and which are operands or parameters.  The first
  71    few tokens are generally the command name, so in the above example
  72    we would usually speak of a <quote>SELECT</quote>, an
  73    <quote>UPDATE</quote>, and an <quote>INSERT</quote> command.  But
  74    for instance the <command>UPDATE</command> command always requires
  75    a <token>SET</token> token to appear in a certain position, and
  76    this particular variation of <command>INSERT</command> also
  77    requires a <token>VALUES</token> in order to be complete.  The
  78    precise syntax rules for each command are described in <xref linkend="reference">.
  79   </para>
  80
  81   <sect2 id="sql-syntax-identifiers">
  82    <title>Identifiers and Key Words</title>
  83
  84    <indexterm zone="sql-syntax-identifiers">
  85     <primary>identifier</primary>
  86     <secondary>syntax of</secondary>
  87    </indexterm>
  88
  89    <indexterm zone="sql-syntax-identifiers">
  90     <primary>name</primary>
  91     <secondary>syntax of</secondary>
  92    </indexterm>
  93
  94    <indexterm zone="sql-syntax-identifiers">
  95     <primary>key word</primary>
  96     <secondary>syntax of</secondary>
  97    </indexterm>
  98
  99    <para>
 100     Tokens such as <token>SELECT</token>, <token>UPDATE</token>, or
 101     <token>VALUES</token> in the example above are examples of
 102     <firstterm>key words</firstterm>, that is, words that have a fixed
 103     meaning in the SQL language.  The tokens <token>MY_TABLE</token>
 104     and <token>A</token> are examples of
 105     <firstterm>identifiers</firstterm>.  They identify names of
 106     tables, columns, or other database objects, depending on the
 107     command they are used in.  Therefore they are sometimes simply
 108     called <quote>names</quote>.  Key words and identifiers have the
 109     same lexical structure, meaning that one cannot know whether a
 110     token is an identifier or a key word without knowing the language.
 111     A complete list of key words can be found in <xref
 112     linkend="sql-keywords-appendix">.
 113    </para>
 114
 115    <para>
 116     SQL identifiers and key words must begin with a letter
 117     (<literal>a</literal>-<literal>z</literal>, but also letters with
 118     diacritical marks and non-Latin letters) or an underscore
 119     (<literal>_</literal>).  Subsequent characters in an identifier or
 120     key word can be letters, underscores, digits
 121     (<literal>0</literal>-<literal>9</literal>), or dollar signs
 122     (<literal>$</>).  Note that dollar signs are not allowed in identifiers
 123     according to the letter of the SQL standard, so their use may render
 124     applications less portable.
 125     The SQL standard will not define a key word that contains
 126     digits or starts or ends with an underscore, so identifiers of this
 127     form are safe against possible conflict with future extensions of the
 128     standard.
 129    </para>
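
   <para>
    As a brief sketch of these rules (the table and column names are
    hypothetical), the first command below uses only valid unquoted
    identifiers, while the second is rejected because an unquoted
    identifier cannot begin with a digit:
<programlisting>
CREATE TABLE _audit_2006 (item$id integer, col_2 text);
CREATE TABLE 2006_audit (id integer);    -- syntax error
</programlisting>
    The dollar sign in <literal>item$id</literal> is accepted by
    <productname>PostgreSQL</productname> but, as noted above, may not be
    portable to other systems.
   </para>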
 130
 131    <para>
 132     <indexterm><primary>identifier</primary><secondary>length</secondary></indexterm>
 133     The system uses no more than <symbol>NAMEDATALEN</symbol>-1
 134     characters of an identifier; longer names can be written in
 135     commands, but they will be truncated.  By default,
 136     <symbol>NAMEDATALEN</symbol> is 64 so the maximum identifier
 137     length is 63. If this limit is problematic, it can be raised by
 138     changing the <symbol>NAMEDATALEN</symbol> constant in
 139     <filename>src/include/postgres_ext.h</filename>.
 140    </para>
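
   <para>
    For illustration, assuming the default <symbol>NAMEDATALEN</symbol>
    of 64, the hypothetical table name below is 74 characters long, so
    only its first 63 characters would be used:
<programlisting>
CREATE TABLE a_very_long_table_name_that_runs_well_past_the_sixty_three_character_limit (id integer);
-- the table is created as
-- a_very_long_table_name_that_runs_well_past_the_sixty_three_char
</programlisting>
    Two long names that differ only after the 63rd character would
    therefore refer to the same object.
   </para>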
 141
 142    <para>
 143     <indexterm>
 144      <primary>case sensitivity</primary>
 145      <secondary>of SQL commands</secondary>
 146     </indexterm>
 147     Key words and unquoted identifiers are case insensitive.  Therefore
 148 <programlisting>
 149 UPDATE MY_TABLE SET A = 5;
 150 </programlisting>
 151     can equivalently be written as
 152 <programlisting>
 153 uPDaTE my_TabLE SeT a = 5;
 154 </programlisting>
 155     A convention often used is to write key words in upper
 156     case and names in lower case, e.g.,
 157 <programlisting>
 158 UPDATE my_table SET a = 5;
 159 </programlisting>
 160    </para>
 161
 162    <para>
 163     <indexterm>
 164      <primary>quotation marks</primary>
 165      <secondary>and identifiers</secondary>
 166     </indexterm>
 167     There is a second kind of identifier:  the <firstterm>delimited
 168     identifier</firstterm> or <firstterm>quoted
 169     identifier</firstterm>.  It is formed by enclosing an arbitrary
 170     sequence of characters in double-quotes
 171     (<literal>"</literal>). <!-- " font-lock mania --> A delimited
 172     identifier is always an identifier, never a key word.  So
 173     <literal>"select"</literal> could be used to refer to a column or
 174     table named <quote>select</quote>, whereas an unquoted
 175     <literal>select</literal> would be taken as a key word and
 176     would therefore provoke a parse error when used where a table or
 177     column name is expected.  The example can be written with quoted
 178     identifiers like this:
 179 <programlisting>
 180 UPDATE "my_table" SET "a" = 5;
 181 </programlisting>
 182    </para>
 183
 184    <para>
 185     Quoted identifiers can contain any character except the character
 186     with code zero.  (To include a double quote, write two double quotes.)
 187     This allows constructing table or column names that would
 188     otherwise not be possible, such as ones containing spaces or
 189     ampersands.  The length limitation still applies.
 190    </para>
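
   <para>
    A short sketch, using hypothetical names, of identifiers that are
    only possible when quoted:
<programlisting>
CREATE TABLE "R&amp;D budget" ("amount (USD)" numeric, "the ""big"" one" text);
SELECT "amount (USD)", "the ""big"" one" FROM "R&amp;D budget";
</programlisting>
    The last column is named <literal>the "big" one</literal>; each
    doubled double quote stands for a single double-quote character.
   </para>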
 191
 192    <para>
 193     Quoting an identifier also makes it case-sensitive, whereas
 194     unquoted names are always folded to lower case.  For example, the
 195     identifiers <literal>FOO</literal>, <literal>foo</literal>, and
 196     <literal>"foo"</literal> are considered the same by
 197     <productname>PostgreSQL</productname>, but
 198     <literal>"Foo"</literal> and <literal>"FOO"</literal> are
 199     different from these three and each other.  (The folding of
 200     unquoted names to lower case in <productname>PostgreSQL</> is
 201     incompatible with the SQL standard, which says that unquoted names
 202     should be folded to upper case.  Thus, <literal>foo</literal>
 203     should be equivalent to <literal>"FOO"</literal> not
 204     <literal>"foo"</literal> according to the standard.  If you want
 205     to write portable applications you are advised to always quote a
 206     particular name or never quote it.)
 207    </para>
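
   <para>
    The following sketch (with a hypothetical table) shows the
    practical effect of case folding:
<programlisting>
CREATE TABLE Foo (Bar integer);    -- creates table foo with column bar
SELECT bar FROM FOO;               -- works: both names fold to lower case
SELECT "Bar" FROM "Foo";           -- fails: the quoted names are not folded
</programlisting>
   </para>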
 208   </sect2>
 209
 210
 211   <sect2 id="sql-syntax-constants">
 212    <title>Constants</title>
 213
 214    <indexterm zone="sql-syntax-constants">
 215     <primary>constant</primary>
 216    </indexterm>
 217
 218    <para>
 219     There are three kinds of <firstterm>implicitly-typed
 220     constants</firstterm> in <productname>PostgreSQL</productname>:
 221     strings, bit strings, and numbers.
 222     Constants can also be specified with explicit types, which can
 223     enable more accurate representation and more efficient handling by
 224     the system. These alternatives are discussed in the following
 225     subsections.
 226    </para>
 227
 228    <sect3 id="sql-syntax-strings">
 229     <title>String Constants</title>
 230
 231     <indexterm zone="sql-syntax-strings">
 232      <primary>character string</primary>
 233      <secondary>constant</secondary>
 234     </indexterm>
 235
 236     <para>
 237      <indexterm>
 238       <primary>quotation marks</primary>
 239       <secondary>escaping</secondary>
 240      </indexterm>
 241      A string constant in SQL is an arbitrary sequence of characters
 242      bounded by single quotes (<literal>'</literal>), for example
 243      <literal>'This is a string'</literal>.  The standard-compliant way of
 244      writing a single-quote character within a string constant is to
 245      write two adjacent single quotes, e.g.
 246      <literal>'Dianne''s horse'</literal>.
 247      <productname>PostgreSQL</productname> also allows single quotes
 248      to be escaped with a backslash (<literal>\'</literal>).  However,
 249      future versions of <productname>PostgreSQL</productname> will not
 250      allow this, so applications using backslashes should convert to the
 251      standard-compliant method outlined above.
 252     </para>
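
    <para>
     For example, both of the following commands produce the string
     <literal>Dianne's horse</literal> under the behavior described
     above, but only the first is standard-compliant and safe to rely on:
<programlisting>
SELECT 'Dianne''s horse';
SELECT 'Dianne\'s horse';    -- backslash form; avoid in new code
</programlisting>
    </para>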
 253
 254     <para>
 255      Another <productname>PostgreSQL</productname> extension is that
 256      C-style backslash escapes are available: <literal>\b</literal> is a
 257      backspace, <literal>\f</literal> is a form feed,
 258      <literal>\n</literal> is a newline, <literal>\r</literal> is a
 259      carriage return, <literal>\t</literal> is a tab. Also supported is
 260      <literal>\<replaceable>digits</replaceable></literal>, where
 261      <replaceable>digits</replaceable> represents an octal byte value, and
 262      <literal>\x<replaceable>hexdigits</replaceable></literal>, where
 263      <replaceable>hexdigits</replaceable> represents a hexadecimal byte value.
 264      (It is your responsibility to ensure that the byte sequences you create are
 265      valid characters in the server character set encoding.) Any other
 266      character following a backslash is taken literally. Thus, to
 267      include a backslash in a string constant, write two backslashes.
 268     </para>
 269
 270     <note>
 271     <para>
 272      While ordinary strings now support C-style backslash escapes,
 273      future versions will generate warnings for such usage and
 274      eventually treat backslashes as literal characters to be
 275      standard-conforming. The proper way to specify escape processing is
 276      to use the escape string syntax to indicate that escape
 277      processing is desired. Escape string syntax is specified by writing
 278      the letter <literal>E</literal> (upper or lower case) just before
 279      the string, e.g. <literal>E'\041'</>. This method will work in all
 280      future versions of <productname>PostgreSQL</productname>.
 281     </para>
 282     </note>
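
    <para>
     As a short sketch, these commands request escape processing
     explicitly:
<programlisting>
SELECT E'first line\nsecond line';    -- \n becomes a newline
SELECT E'\041';                       -- octal 041 is the character !
SELECT E'C:\\temp';                   -- two backslashes yield one
</programlisting>
    </para>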
 283
 284     <para>
 285      The character with the code zero cannot be in a string constant.
 286     </para>
 287
 288     <para>
 289      Two string constants that are only separated by whitespace
 290      <emphasis>with at least one newline</emphasis> are concatenated
 291      and effectively treated as if the string had been written in one
 292      constant.  For example:
 293 <programlisting>
 294 SELECT 'foo'
 295 'bar';
 296 </programlisting>
 297      is equivalent to
 298 <programlisting>
 299 SELECT 'foobar';
 300 </programlisting>
 301      but
 302 <programlisting>
 303 SELECT 'foo'      'bar';
 304 </programlisting>
 305      is not valid syntax.  (This slightly bizarre behavior is specified
 306      by <acronym>SQL</acronym>; <productname>PostgreSQL</productname> is
 307      following the standard.)
 308     </para>
 309    </sect3>
 310
 311    <sect3 id="sql-syntax-dollar-quoting">
 312     <title>Dollar-Quoted String Constants</title>
 313
 314      <indexterm>
 315       <primary>dollar quoting</primary>
 316      </indexterm>
 317
 318     <para>
 319      While the standard syntax for specifying string constants is usually
 320      convenient, it can be difficult to understand when the desired string
 321      contains many single quotes or backslashes, since each of those must
 322      be doubled. To allow more readable queries in such situations,
 323      <productname>PostgreSQL</productname> provides another way, called
 324      <quote>dollar quoting</quote>, to write string constants.
 325      A dollar-quoted string constant
 326      consists of a dollar sign (<literal>$</literal>), an optional
 327      <quote>tag</quote> of zero or more characters, another dollar
 328      sign, an arbitrary sequence of characters that makes up the
 329      string content, a dollar sign, the same tag that began this
 330      dollar quote, and a dollar sign. For example, here are two
 331      different ways to specify the string <quote>Dianne's horse</>
 332      using dollar quoting:
 333 <programlisting>
 334 $$Dianne's horse$$
 335 $SomeTag$Dianne's horse$SomeTag$
 336 </programlisting>
 337      Notice that inside the dollar-quoted string, single quotes can be
 338      used without needing to be escaped.  Indeed, no characters inside
 339      a dollar-quoted string are ever escaped: the string content is always
 340      written literally.  Backslashes are not special, and neither are
 341      dollar signs, unless they are part of a sequence matching the opening
 342      tag.
 343     </para>
 344
 345     <para>
 346      It is possible to nest dollar-quoted string constants by choosing
 347      different tags at each nesting level.  This is most commonly used in
 348      writing function definitions.  For example:
 349 <programlisting>
 350 $function$
 351 BEGIN
 352     RETURN ($1 ~ $q$[\t\r\n\v\\]$q$);
 353 END;
 354 $function$
 355 </programlisting>
 356      Here, the sequence <literal>$q$[\t\r\n\v\\]$q$</> represents a
 357      dollar-quoted literal string <literal>[\t\r\n\v\\]</>, which will
 358      be recognized when the function body is executed by
 359      <productname>PostgreSQL</>.  But since the sequence does not match
 360      the outer dollar quoting delimiter <literal>$function$</>, it is
 361      just some more characters within the constant so far as the outer
 362      string is concerned.
 363     </para>
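
    <para>
     To put the fragment in context, a complete function definition
     might look like the following sketch.  The function name
     <function>has_special_char</function> is hypothetical, and the
     example assumes that the <application>PL/pgSQL</application>
     language is installed in the database:
<programlisting>
CREATE FUNCTION has_special_char(text) RETURNS boolean AS $function$
BEGIN
    RETURN ($1 ~ $q$[\t\r\n\v\\]$q$);
END;
$function$ LANGUAGE plpgsql;
</programlisting>
    </para>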
 364
 365     <para>
 366      The tag, if any, of a dollar-quoted string follows the same rules
 367      as an unquoted identifier, except that it cannot contain a dollar sign.
 368      Tags are case sensitive, so <literal>$tag$String content$tag$</literal>
 369      is correct, but <literal>$TAG$String content$tag$</literal> is not.
 370     </para>
 371
 372     <para>
 373      A dollar-quoted string that follows a key word or identifier must
 374      be separated from it by whitespace; otherwise the dollar quoting
 375      delimiter would be taken as part of the preceding identifier.
 376     </para>
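
    <para>
     For instance, in the following sketch the first command is valid,
     while in the second the text <literal>SELECT$$abc$$</literal> is
     read as a single identifier, so the command fails:
<programlisting>
SELECT $$abc$$;
SELECT$$abc$$;     -- syntax error
</programlisting>
    </para>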
 377
 378     <para>
 379      Dollar quoting is not part of the SQL standard, but it is often a more
 380      convenient way to write complicated string literals than the
 381      standard-compliant single quote syntax.  It is particularly useful when
 382      representing string constants inside other constants, as is often needed
 383      in procedural function definitions.  With single-quote syntax, each
 384      backslash in the above example would have to be written as four
 385      backslashes, which would be reduced to two backslashes in parsing the
 386      original string constant, and then to one when the inner string constant
 387      is re-parsed during function execution.
 388     </para>
 389    </sect3>
 390
 391    <sect3 id="sql-syntax-bit-strings">
 392     <title>Bit-String Constants</title>
 393
 394     <indexterm zone="sql-syntax-bit-strings">
 395      <primary>bit string</primary>
 396      <secondary>constant</secondary>
 397     </indexterm>
 398
 399     <para>
 400      Bit-string constants look like regular string constants with a
 401      <literal>B</literal> (upper or lower case) immediately before the
 402      opening quote (no intervening whitespace), e.g.,
 403      <literal>B'1001'</literal>.  The only characters allowed within
 404      bit-string constants are <literal>0</literal> and
 405      <literal>1</literal>.
 406     </para>
 407
 408     <para>
 409      Alternatively, bit-string constants can be specified in hexadecimal
 410      notation, using a leading <literal>X</literal> (upper or lower case),
 411      e.g., <literal>X'1FF'</literal>.  This notation is equivalent to
 412      a bit-string constant with four binary digits for each hexadecimal digit.
 413     </para>
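
    <para>
     As an illustration of this equivalence, the comparison below is
     expected to yield true, since each of the three hexadecimal digits
     expands to four bits:
<programlisting>
SELECT X'1FF' = B'000111111111';
</programlisting>
    </para>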
 414
 415     <para>
 416      Both forms of bit-string constant can be continued
 417      across lines in the same way as regular string constants.
 418      Dollar quoting cannot be used in a bit-string constant.
 419     </para>
 420    </sect3>
 421
 422    <sect3>
 423     <title>Numeric Constants</title>
 424
 425     <indexterm>
 426      <primary>number</primary>
 427      <secondary>constant</secondary>
 428     </indexterm>
 429
 430     <para>
 431      Numeric constants are accepted in these general forms:
 432 <synopsis>
 433 <replaceable>digits</replaceable>
 434 <replaceable>digits</replaceable>.<optional><replaceable>digits</replaceable></optional><optional>e<optional>+-</optional><replaceable>digits</replaceable></optional>
 435 <optional><replaceable>digits</replaceable></optional>.<replaceable>digits</replaceable><optional>e<optional>+-</optional><replaceable>digits</replaceable></optional>
 436 <replaceable>digits</replaceable>e<optional>+-</optional><replaceable>digits</replaceable>
 437 </synopsis>
 438      where <replaceable>digits</replaceable> is one or more decimal
 439      digits (0 through 9).  At least one digit must be before or after the
 440      decimal point, if one is used.  At least one digit must follow the
 441      exponent marker (<literal>e</literal>), if one is present.
 442      There may not be any spaces or other characters embedded in the
 443      constant.  Note that any leading plus or minus sign is not actually
 444      considered part of the constant; it is an operator applied to the
 445      constant.
 446     </para>
 447
 448     <para>
 449      These are some examples of valid numeric constants:
 450 <literallayout>
 451 42
 452 3.5
 453 4.
 454 .001
 455 5e2
 456 1.925e-3
 457 </literallayout>
 458     </para>
 459
 460     <para>
 461      <indexterm><primary>integer</primary></indexterm>
 462      <indexterm><primary>bigint</primary></indexterm>
 463      <indexterm><primary>numeric</primary></indexterm>
 464      A numeric constant that contains neither a decimal point nor an
 465      exponent is initially presumed to be type <type>integer</> if its
 466      value fits in type <type>integer</> (32 bits); otherwise it is
 467      presumed to be type <type>bigint</> if its
 468      value fits in type <type>bigint</> (64 bits); otherwise it is
 469      taken to be type <type>numeric</>.  Constants that contain decimal
 470      points and/or exponents are always initially presumed to be type
 471      <type>numeric</>.
 472     </para>
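
    <para>
     These rules can be observed with a sketch like the following,
     which assumes a server release that provides the
     <function>pg_typeof</function> function (it is not available in
     all releases):
<programlisting>
SELECT pg_typeof(42);                      -- integer
SELECT pg_typeof(3000000000);              -- bigint
SELECT pg_typeof(10000000000000000000);    -- numeric
SELECT pg_typeof(4.), pg_typeof(5e2);      -- numeric, numeric
</programlisting>
    </para>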
 473
 474     <para>
 475      The initially assigned data type of a numeric constant is just a
 476      starting point for the type resolution algorithms.  In most cases
 477      the constant will be automatically coerced to the most
 478      appropriate type depending on context.  When necessary, you can
 479      force a numeric value to be interpreted as a specific data type
 480      by casting it.<indexterm><primary>type cast</primary></indexterm>
 481      For example, you can force a numeric value to be treated as type
 482      <type>real</> (<type>float4</>) by writing
 483
 484 <programlisting>
 485 REAL '1.23'  -- string style
 486 1.23::REAL   -- PostgreSQL (historical) style
 487 </programlisting>
 488
 489      These are actually just special cases of the general casting
 490      notations discussed next.
 491     </para>
 492    </sect3>
 493
 494    <sect3 id="sql-syntax-constants-generic">
 495     <title>Constants of Other Types</title>
 496
 497     <indexterm>
 498      <primary>data type</primary>
 499      <secondary>constant</secondary>
 500     </indexterm>
 501
 502     <para>
 503      A constant of an <emphasis>arbitrary</emphasis> type can be
 504      entered using any one of the following notations:
 505 <synopsis>
 506 <replaceable>type</replaceable> '<replaceable>string</replaceable>'
 507 '<replaceable>string</replaceable>'::<replaceable>type</replaceable>
 508 CAST ( '<replaceable>string</replaceable>' AS <replaceable>type</replaceable> )
 509 </synopsis>
 510      The string constant's text is passed to the input conversion
 511      routine for the type called <replaceable>type</replaceable>. The
 512      result is a constant of the indicated type.  The explicit type
 513      cast may be omitted if there is no ambiguity as to the type the
 514      constant must be (for example, when it is assigned directly to a
 515      table column), in which case it is automatically coerced.
 516     </para>
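
    <para>
     For example, each of the following is a sketch of a constant of
     type <type>date</type> written in one of the three notations:
<programlisting>
SELECT DATE '2006-03-10';
SELECT '2006-03-10'::date;
SELECT CAST('2006-03-10' AS date);
</programlisting>
    </para>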
 517
 518     <para>
 519      The string constant can be written using either regular SQL
 520      notation or dollar-quoting.
 521     </para>
 522
 523     <para>
 524      It is also possible to specify a type coercion using a function-like
 525      syntax:
 526 <synopsis>
 527 <replaceable>typename</replaceable> ( '<replaceable>string</replaceable>' )
 528 </synopsis>
 529      but not all type names may be used in this way; see <xref
 530      linkend="sql-syntax-type-casts"> for details.
 531     </para>
 532
 533     <para>
 534      The <literal>::</literal>, <literal>CAST()</literal>, and
 535      function-call syntaxes can also be used to specify run-time type
 536      conversions of arbitrary expressions, as discussed in <xref
 537      linkend="sql-syntax-type-casts">.  But the form
 538      <literal><replaceable>type</replaceable> '<replaceable>string</replaceable>'</literal>
 539      can only be used to specify the type of a literal constant.
 540      Another restriction on
 541      <literal><replaceable>type</replaceable> '<replaceable>string</replaceable>'</literal>
 542      is that it does not work for array types; use <literal>::</literal>
 543      or <literal>CAST()</literal> to specify the type of an array constant.
 544     </para>
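
    <para>
     As a sketch, an array constant can be written in either of these
     ways:
<programlisting>
SELECT '{1,2,3}'::integer[];
SELECT CAST('{1,2,3}' AS integer[]);
</programlisting>
     whereas <literal>integer[] '{1,2,3}'</literal> is not accepted.
    </para>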
 545
 546     <para>
 547      The <literal>CAST()</> syntax conforms to SQL.  The
 548      <literal><replaceable>type</replaceable> '<replaceable>string</replaceable>'</literal>
 549      syntax is a generalization of the standard: SQL specifies this syntax only
 550      for a few data types, but <productname>PostgreSQL</productname> allows it
 551      for all types.  The syntax with
 552      <literal>::</literal> is historical <productname>PostgreSQL</productname>
 553      usage, as is the function-call syntax.
 554     </para>
 555    </sect3>
 556   </sect2>
 557
 558   <sect2 id="sql-syntax-operators">
 559    <title>Operators</title>
 560
 561    <indexterm zone="sql-syntax-operators">
 562     <primary>operator</primary>
 563     <secondary>syntax</secondary>
 564    </indexterm>
 565
 566    <para>
 567     An operator name is a sequence of up to <symbol>NAMEDATALEN</symbol>-1
 568     (63 by default) characters from the following list:
 569 <literallayout>
 570 + - * / &lt; &gt; = ~ ! @ # % ^ &amp; | ` ?
 571 </literallayout>
 572
 573     There are a few restrictions on operator names, however:
 574     <itemizedlist>
 575      <listitem>
 576       <para>
 577        <literal>--</literal> and <literal>/*</literal> cannot appear
 578        anywhere in an operator name, since they will be taken as the
 579        start of a comment.
 580       </para>
 581      </listitem>
 582
 583      <listitem>
 584       <para>
 585        A multiple-character operator name cannot end in <literal>+</> or <literal>-</>,
 586        unless the name also contains at least one of these characters:
 587 <literallayout>
 588 ~ ! @ # % ^ &amp; | ` ?
 589 </literallayout>
 590        For example, <literal>@-</literal> is an allowed operator name,
 591        but <literal>*-</literal> is not.  This restriction allows
 592        <productname>PostgreSQL</productname> to parse SQL-compliant
 593        queries without requiring spaces between tokens.
 594       </para>
 595      </listitem>
 596     </itemizedlist>
 597    </para>
 598
 599    <para>
 600     When working with non-SQL-standard operator names, you will usually
 601     need to separate adjacent operators with spaces to avoid ambiguity.
 602     For example, if you have defined a left unary operator named <literal>@</literal>,
 603     you cannot write <literal>X*@Y</literal>; you must write
 604     <literal>X* @Y</literal> to ensure that
 605     <productname>PostgreSQL</productname> reads it as two operator names
 606     not one.
 607    </para>
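
   <para>
    The built-in prefix operator <literal>@</literal> (absolute value)
    can serve as a sketch of this situation:
<programlisting>
SELECT 2 * @ (-3);    -- reads * and @ as separate operators
SELECT 2*@(-3);       -- fails: *@ is read as one operator name
</programlisting>
   </para>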
 608   </sect2>
 609
 610   <sect2>
 611    <title>Special Characters</title>
 612
 613   <para>
 614    Some characters that are not alphanumeric have a special meaning
 615    that is different from being an operator.  Details on the usage can
 616    be found at the location where the respective syntax element is
 617    described.  This section exists only to point out the existence and
 618    summarize the purposes of these characters.
 619
 620    <itemizedlist>
 621     <listitem>
 622      <para>
 623       A dollar sign (<literal>$</literal>) followed by digits is used
 624       to represent a positional parameter in the body of a function
 625       definition or a prepared statement.  In other contexts the
 626       dollar sign may be part of an identifier or a dollar-quoted string
 627       constant.
 628      </para>
 629     </listitem>
 630
 631     <listitem>
 632      <para>
 633       Parentheses (<literal>()</literal>) have their usual meaning to
 634       group expressions and enforce precedence.  In some cases
 635       parentheses are required as part of the fixed syntax of a
 636       particular SQL command.
 637      </para>
 638     </listitem>
 639
 640     <listitem>
 641      <para>
 642       Brackets (<literal>[]</literal>) are used to select the elements
 643       of an array.  See <xref linkend="arrays"> for more information
 644       on arrays.
 645      </para>
 646     </listitem>
 647
 648     <listitem>
 649      <para>
 650       Commas (<literal>,</literal>) are used in some syntactical
 651       constructs to separate the elements of a list.
 652      </para>
 653     </listitem>
 654
 655     <listitem>
 656      <para>
 657       The semicolon (<literal>;</literal>) terminates an SQL command.
 658       It cannot appear anywhere within a command, except within a
 659       string constant or quoted identifier.
 660      </para>
 661     </listitem>
 662
 663     <listitem>
 664      <para>
 665       The colon (<literal>:</literal>) is used to select
 666       <quote>slices</quote> from arrays. (See <xref
 667       linkend="arrays">.)  In certain SQL dialects (such as Embedded
 668       SQL), the colon is used to prefix variable names.
 669      </para>
 670     </listitem>
 671
 672     <listitem>
 673      <para>
 674       The asterisk (<literal>*</literal>) is used in some contexts to denote
 675       all the fields of a table row or composite value.  It also
 676       has a special meaning when used as the argument of the
 677       <function>COUNT</function> aggregate function.
 678      </para>
 679     </listitem>
 680
 681     <listitem>
 682      <para>
 683       The period (<literal>.</literal>) is used in numeric
 684       constants, and to separate schema, table, and column names.
 685      </para>
 686     </listitem>
 687    </itemizedlist>
 688
 689    </para>
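
   <para>
    Several of these characters appear together in the following
    sketch.  The schema <literal>inventory</literal>, the table
    <literal>items</literal>, and its array column
    <literal>tags</literal> are hypothetical:
<programlisting>
PREPARE get_item (integer) AS
    SELECT i.*, i.tags[1], i.tags[2:3]
    FROM inventory.items i
    WHERE i.id = $1;
EXECUTE get_item(42);
</programlisting>
    Here <literal>$1</literal> is a positional parameter, the period
    separates schema, table, and column names, the brackets and colon
    select an array element and a slice, and the asterisk stands for
    all columns of <literal>items</literal>.
   </para>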
 690   </sect2>
 691
 692   <sect2 id="sql-syntax-comments">
 693    <title>Comments</title>
 694
 695    <indexterm zone="sql-syntax-comments">
 696     <primary>comment</primary>
 697     <secondary sortas="SQL">in SQL</secondary>
 698    </indexterm>
 699
 700    <para>
 701     A comment is an arbitrary sequence of characters beginning with
 702     double dashes and extending to the end of the line, e.g.:
 703 <programlisting>
 704 -- This is a standard SQL comment
 705 </programlisting>
 706    </para>
 707
 708    <para>
 709     Alternatively, C-style block comments can be used:
 710 <programlisting>
 711 /* multiline comment
 712  * with nesting: /* nested block comment */
 713  */
 714 </programlisting>
 715     where the comment begins with <literal>/*</literal> and extends to
 716     the matching occurrence of <literal>*/</literal>. These block
 717     comments nest, as specified in the SQL standard but unlike C, so that one can
 718     comment out larger blocks of code that may contain existing block
 719     comments.
 720    </para>
 721
 722    <para>
 723     A comment is removed from the input stream before further syntax
 724     analysis and is effectively replaced by whitespace.
 725    </para>
 726   </sect2>
 727
 728   <sect2 id="sql-precedence">
 729    <title>Lexical Precedence</title>
 730
 731    <indexterm zone="sql-precedence">
 732     <primary>operator</primary>
 733     <secondary>precedence</secondary>
 734    </indexterm>
 735
 736    <para>
 737     <xref linkend="sql-precedence-table"> shows the precedence and
 738     associativity of the operators in <productname>PostgreSQL</>.
 739     Most operators have the same precedence and are left-associative.
 740     The precedence and associativity of the operators is hard-wired
 741     into the parser.  This may lead to non-intuitive behavior; for
 742     example the Boolean operators <literal>&lt;</> and
 743     <literal>&gt;</> have a different precedence than the Boolean
 744     operators <literal>&lt;=</> and <literal>&gt;=</>.  Also, you will
 745     sometimes need to add parentheses when using combinations of
 746     binary and unary operators.  For instance
 747 <programlisting>
 748 SELECT 5 ! - 6;
 749 </programlisting>
 750    will be parsed as
 751 <programlisting>
 752 SELECT 5 ! (- 6);
 753 </programlisting>
 754     because the parser has no idea &mdash; until it is too late
 755     &mdash; that <token>!</token> is defined as a postfix operator,
 756     not an infix one.  To get the desired behavior in this case, you
 757     must write
 758 <programlisting>
 759 SELECT (5 !) - 6;
 760 </programlisting>
 761     This is the price one pays for extensibility.
 762    </para>
 763
 764    <table id="sql-precedence-table">
 765     <title>Operator Precedence (decreasing)</title>
 766
 767     <tgroup cols="3">
 768      <thead>
 769       <row>
 770        <entry>Operator/Element</entry>
 771        <entry>Associativity</entry>
 772        <entry>Description</entry>
 773       </row>
 774      </thead>
 775
 776      <tbody>
 777       <row>
 778        <entry><token>.</token></entry>
 779        <entry>left</entry>
 780        <entry>table/column name separator</entry>
 781       </row>
 782
 783       <row>
 784        <entry><token>::</token></entry>
 785        <entry>left</entry>
 786        <entry><productname>PostgreSQL</productname>-style typecast</entry>
 787       </row>
 788
 789       <row>
 790        <entry><token>[</token> <token>]</token></entry>
 791        <entry>left</entry>
 792        <entry>array element selection</entry>
 793       </row>
 794
 795       <row>
 796        <entry><token>-</token></entry>
 797        <entry>right</entry>
 798        <entry>unary minus</entry>
 799       </row>
 800
 801       <row>
 802        <entry><token>^</token></entry>
 803        <entry>left</entry>
 804        <entry>exponentiation</entry>
 805       </row>
 806
 807       <row>
 808        <entry><token>*</token> <token>/</token> <token>%</token></entry>
 809        <entry>left</entry>
 810        <entry>multiplication, division, modulo</entry>
 811       </row>
 812
 813       <row>
 814        <entry><token>+</token> <token>-</token></entry>
 815        <entry>left</entry>
 816        <entry>addition, subtraction</entry>
 817       </row>
 818
 819       <row>
 820        <entry><token>IS</token></entry>
 821        <entry></entry>
 822        <entry><literal>IS TRUE</>, <literal>IS FALSE</>, <literal>IS UNKNOWN</>, <literal>IS NULL</></entry>
 823       </row>
 824
 825       <row>
 826        <entry><token>ISNULL</token></entry>
 827        <entry></entry>
 828        <entry>test for null</entry>
 829       </row>
 830
 831       <row>
 832        <entry><token>NOTNULL</token></entry>
 833        <entry></entry>
 834        <entry>test for not null</entry>
 835       </row>
 836
 837       <row>
 838        <entry>(any other)</entry>
 839        <entry>left</entry>
 840        <entry>all other native and user-defined operators</entry>
 841       </row>
 842
 843       <row>
 844        <entry><token>IN</token></entry>
 845        <entry></entry>
 846        <entry>set membership</entry>
 847       </row>
 848
 849       <row>
 850        <entry><token>BETWEEN</token></entry>
 851        <entry></entry>
 852        <entry>range containment</entry>
 853       </row>
 854
 855       <row>
 856        <entry><token>OVERLAPS</token></entry>
 857        <entry></entry>
 858        <entry>time interval overlap</entry>
 859       </row>
 860
 861       <row>
 862        <entry><token>LIKE</token> <token>ILIKE</token> <token>SIMILAR</token></entry>
 863        <entry></entry>
 864        <entry>string pattern matching</entry>
 865       </row>
 866
 867       <row>
 868        <entry><token>&lt;</token> <token>&gt;</token></entry>
 869        <entry></entry>
 870        <entry>less than, greater than</entry>
 871       </row>
 872
 873       <row>
 874        <entry><token>=</token></entry>
 875        <entry>right</entry>
 876        <entry>equality, assignment</entry>
 877       </row>
 878
 879       <row>
 880        <entry><token>NOT</token></entry>
 881        <entry>right</entry>
 882        <entry>logical negation</entry>
 883       </row>
 884
 885       <row>
 886        <entry><token>AND</token></entry>
 887        <entry>left</entry>
 888        <entry>logical conjunction</entry>
 889       </row>
 890
 891       <row>
 892        <entry><token>OR</token></entry>
 893        <entry>left</entry>
 894        <entry>logical disjunction</entry>
 895       </row>
 896      </tbody>
 897     </tgroup>
 898    </table>
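
   <para>
    The following sketch shows how a few entries of the table combine.
    The comments give the grouping that the table implies, written out
    with explicit parentheses:
<programlisting>
SELECT 2 + 3 * 4;                 -- same as 2 + (3 * 4), yielding 14
SELECT (2 + 3) * 4;               -- parentheses override precedence, yielding 20
SELECT 2 ^ 3 ^ 2;                 -- left-associative: (2 ^ 3) ^ 2
SELECT true OR false AND false;   -- same as true OR (false AND false)
SELECT NOT 1 = 2;                 -- same as NOT (1 = 2)
</programlisting>
   </para>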
 899
 900    <para>
 901     Note that the operator precedence rules also apply to user-defined
 902     operators that have the same names as the built-in operators
 903     mentioned above.  For example, if you define a
 904     <quote>+</quote> operator for some custom data type it will have
 905     the same precedence as the built-in <quote>+</quote> operator, no
 906     matter what yours does.
 907    </para>
 908
 909    <para>
 910     When a schema-qualified operator name is used in the
 911     <literal>OPERATOR</> syntax, as for example in
 912 <programlisting>
 913 SELECT 3 OPERATOR(pg_catalog.+) 4;
 914 </programlisting>
 915     the <literal>OPERATOR</> construct is taken to have the default precedence
 916     shown in <xref linkend="sql-precedence-table"> for <quote>any other</> operator.  This is true no matter
 917     which specific operator name appears inside <literal>OPERATOR()</>.
 918    </para>
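
   <para>
    As a sketch of the consequence: in the first statement below,
    <literal>*</literal> binds more tightly than <literal>+</literal>,
    while in the second the <literal>OPERATOR</literal> construct has
    the lower <quote>any other</quote> precedence, so the addition
    should be grouped first:
<programlisting>
SELECT 2 + 3 * 4;                       -- grouped as 2 + (3 * 4)
SELECT 2 + 3 OPERATOR(pg_catalog.*) 4;  -- grouped as (2 + 3) * 4
</programlisting>
    Writing the parentheses explicitly avoids any doubt.
   </para>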
 919   </sect2>
 920  </sect1>
 921
 922  <sect1 id="sql-expressions">
 923   <title>Value Expressions</title>
 924
 925   <indexterm zone="sql-expressions">
 926    <primary>expression</primary>
 927    <secondary>syntax</secondary>
 928   </indexterm>
 929
 930   <indexterm zone="sql-expressions">
 931    <primary>value expression</primary>
 932   </indexterm>
 933
 934   <indexterm>
 935    <primary>scalar</primary>
 936    <see>expression</see>
 937   </indexterm>
 938
 939   <para>
 940    Value expressions are used in a variety of contexts, such
 941    as in the target list of the <command>SELECT</command> command, as
 942    new column values in <command>INSERT</command> or
 943    <command>UPDATE</command>, or in search conditions in a number of
 944    commands.  The result of a value expression is sometimes called a
 945    <firstterm>scalar</firstterm>, to distinguish it from the result of
 946    a table expression (which is a table).  Value expressions are
 947    therefore also called <firstterm>scalar expressions</firstterm> (or
 948    even simply <firstterm>expressions</firstterm>).  The expression
 949    syntax allows the calculation of values from primitive parts using
 950    arithmetic, logical, set, and other operations.
 951   </para>
 952
 953   <para>
 954    A value expression is one of the following:
 955
 956    <itemizedlist>
 957     <listitem>
 958      <para>
 959       A constant or literal value.
 960      </para>
 961     </listitem>
 962
 963     <listitem>
 964      <para>
 965       A column reference.
 966      </para>
 967     </listitem>
 968
 969     <listitem>
 970      <para>
 971       A positional parameter reference, in the body of a function definition
 972       or prepared statement.
 973      </para>
 974     </listitem>
 975
 976     <listitem>
 977      <para>
 978       A subscripted expression.
 979      </para>
 980     </listitem>
 981
 982     <listitem>
 983      <para>
 984       A field selection expression.
 985      </para>
 986     </listitem>
 987
 988     <listitem>
 989      <para>
 990       An operator invocation.
 991      </para>
 992     </listitem>
 993
 994     <listitem>
 995      <para>
 996       A function call.
 997      </para>
 998     </listitem>
 999
1000     <listitem>
1001      <para>
1002       An aggregate expression.
1003      </para>
1004     </listitem>
1005
1006     <listitem>
1007      <para>
1008       A type cast.
1009      </para>
1010     </listitem>
1011
1012     <listitem>
1013      <para>
1014       A scalar subquery.
1015      </para>
1016     </listitem>
1017
1018     <listitem>
1019      <para>
1020       An array constructor.
1021      </para>
1022     </listitem>
1023
1024     <listitem>
1025      <para>
1026       A row constructor.
1027      </para>
1028     </listitem>
1029
1030     <listitem>
1031      <para>
1032       Another value expression in parentheses, useful to group
1033       subexpressions and override
1034       precedence.<indexterm><primary>parenthesis</></>
1035      </para>
1036     </listitem>
1037    </itemizedlist>
1038   </para>
1039
1040   <para>
1041    In addition to this list, there are a number of constructs that can
1042    be classified as an expression but do not follow any general syntax
1043    rules.  These generally have the semantics of a function or
1044    operator and are explained in the appropriate location in <xref
1045    linkend="functions">.  An example is the <literal>IS NULL</literal>
1046    clause.
1047   </para>
1048
1049   <para>
1050    We have already discussed constants in <xref
1051    linkend="sql-syntax-constants">.  The following sections discuss
1052    the remaining options.
1053   </para>
1054
1055   <sect2>
1056    <title>Column References</title>
1057
1058    <indexterm>
1059     <primary>column reference</primary>
1060    </indexterm>
1061
1062    <para>
1063     A column can be referenced in the form
1064 <synopsis>
1065 <replaceable>correlation</replaceable>.<replaceable>columnname</replaceable>
1066 </synopsis>
1067    </para>
1068
1069    <para>
1070     <replaceable>correlation</replaceable> is the name of a
1071     table (possibly qualified with a schema name), or an alias for a table
1072     defined by means of a <literal>FROM</literal> clause, or one of
1073     the key words <literal>NEW</literal> or <literal>OLD</literal>.
1074     (<literal>NEW</literal> and <literal>OLD</literal> can only appear in rewrite rules,
1075     while other correlation names can be used in any SQL statement.)
1076     The correlation name and separating dot may be omitted if the column name
1077     is unique across all the tables being used in the current query.  (See also <xref linkend="queries">.)
1078    </para>
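
   <para>
    For illustration, assume two hypothetical tables
    <literal>emp(name, dept_id)</literal> and
    <literal>dept(id, name)</literal>.  Both have a
    <literal>name</literal> column, so that column must be qualified,
    whereas <literal>dept_id</literal> is unique and may be written
    either way:
<programlisting>
-- qualified by table name
SELECT emp.name, dept.name FROM emp, dept WHERE emp.dept_id = dept.id;

-- qualified by aliases defined in the FROM clause; dept_id left unqualified
SELECT e.name, d.name FROM emp e, dept d WHERE dept_id = d.id;
</programlisting>
   </para>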
1079   </sect2>
1080
1081   <sect2>
1082    <title>Positional Parameters</title>
1083
1084    <indexterm>
1085     <primary>parameter</primary>
1086     <secondary>syntax</secondary>
1087    </indexterm>
1088
1089    <indexterm>
1090     <primary>$</primary>
1091    </indexterm>
1092
1093    <para>
1094     A positional parameter reference is used to indicate a value
1095     that is supplied externally to an SQL statement.  Parameters are
    used in SQL function definitions and in prepared statements.  Some
1097     client libraries also support specifying data values separately
1098     from the SQL command string, in which case parameters are used to
1099     refer to the out-of-line data values.
1100     The form of a parameter reference is:
1101 <synopsis>
1102 $<replaceable>number</replaceable>
1103 </synopsis>
1104    </para>
1105
1106    <para>
1107     For example, consider the definition of a function,
1108     <function>dept</function>, as
1109
1110 <programlisting>
1111 CREATE FUNCTION dept(text) RETURNS dept
1112     AS $$ SELECT * FROM dept WHERE name = $1 $$
1113     LANGUAGE SQL;
1114 </programlisting>
1115
1116     Here the <literal>$1</literal> references the value of the first
1117     function argument whenever the function is invoked.
1118    </para>
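
   <para>
    A prepared statement uses the same notation.  The following sketch
    assumes the <literal>dept</literal> table from the example above;
    the statement name <literal>find_dept</literal> is illustrative:
<programlisting>
PREPARE find_dept (text) AS
    SELECT * FROM dept WHERE name = $1;

EXECUTE find_dept('sales');   -- $1 takes the value 'sales'
</programlisting>
   </para>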
1119   </sect2>
1120
1121   <sect2>
1122    <title>Subscripts</title>
1123
1124    <indexterm>
1125     <primary>subscript</primary>
1126    </indexterm>
1127
1128    <para>
1129     If an expression yields a value of an array type, then a specific
1130     element of the array value can be extracted by writing
1131 <synopsis>
1132 <replaceable>expression</replaceable>[<replaceable>subscript</replaceable>]
1133 </synopsis>
1134     or multiple adjacent elements (an <quote>array slice</>) can be extracted
1135     by writing
1136 <synopsis>
1137 <replaceable>expression</replaceable>[<replaceable>lower_subscript</replaceable>:<replaceable>upper_subscript</replaceable>]
1138 </synopsis>
1139     (Here, the brackets <literal>[ ]</literal> are meant to appear literally.)
1140     Each <replaceable>subscript</replaceable> is itself an expression,
1141     which must yield an integer value.
1142    </para>
1143
1144    <para>
1145     In general the array <replaceable>expression</replaceable> must be
1146     parenthesized, but the parentheses may be omitted when the expression
1147     to be subscripted is just a column reference or positional parameter.
1148     Also, multiple subscripts can be concatenated when the original array
1149     is multidimensional.
1150     For example,
1151
1152 <programlisting>
1153 mytable.arraycolumn[4]
1154 mytable.two_d_column[17][34]
1155 $1[10:42]
1156 (arrayfunction(a,b))[42]
1157 </programlisting>
1158
1159     The parentheses in the last example are required.
1160     See <xref linkend="arrays"> for more about arrays.
1161    </para>
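
   <para>
    A self-contained sketch using an array constructor, which is
    neither a column reference nor a positional parameter and is
    therefore parenthesized:
<programlisting>
SELECT (ARRAY[10,20,30,40])[2];     -- the second element, 20
SELECT (ARRAY[10,20,30,40])[2:3];   -- the slice {20,30}
</programlisting>
   </para>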
1162   </sect2>
1163
1164   <sect2>
1165    <title>Field Selection</title>
1166
1167    <indexterm>
1168     <primary>field selection</primary>
1169    </indexterm>
1170
1171    <para>
1172     If an expression yields a value of a composite type (row type), then a
1173     specific field of the row can be extracted by writing
1174 <synopsis>
1175 <replaceable>expression</replaceable>.<replaceable>fieldname</replaceable>
1176 </synopsis>
1177    </para>
1178
1179    <para>
1180     In general the row <replaceable>expression</replaceable> must be
1181     parenthesized, but the parentheses may be omitted when the expression
1182     to be selected from is just a table reference or positional parameter.
1183     For example,
1184
1185 <programlisting>
1186 mytable.mycolumn
1187 $1.somecolumn
1188 (rowfunction(a,b)).col3
1189 </programlisting>
1190
1191     (Thus, a qualified column reference is actually just a special case
1192     of the field selection syntax.)
1193    </para>
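
   <para>
    Parentheses also matter for a table column of composite type.  In
    this sketch the type <literal>inventory_item</literal> and the table
    <literal>on_hand</literal> are illustrative names.  Writing
    <literal>item.name</literal> would be read as column
    <literal>name</literal> of a table called <literal>item</literal>,
    so the column reference is parenthesized:
<programlisting>
CREATE TYPE inventory_item AS (name text, price numeric);
CREATE TABLE on_hand (item inventory_item, count integer);

SELECT (item).name FROM on_hand WHERE (item).price &gt; 9.99;
SELECT (on_hand.item).name FROM on_hand;
</programlisting>
   </para>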
1194   </sect2>
1195
1196   <sect2>
1197    <title>Operator Invocations</title>
1198
1199    <indexterm>
1200     <primary>operator</primary>
1201     <secondary>invocation</secondary>
1202    </indexterm>
1203
1204    <para>
1205     There are three possible syntaxes for an operator invocation:
1206     <simplelist>
1207      <member><replaceable>expression</replaceable> <replaceable>operator</replaceable> <replaceable>expression</replaceable> (binary infix operator)</member>
1208      <member><replaceable>operator</replaceable> <replaceable>expression</replaceable> (unary prefix operator)</member>
1209      <member><replaceable>expression</replaceable> <replaceable>operator</replaceable> (unary postfix operator)</member>
1210     </simplelist>
1211     where the <replaceable>operator</replaceable> token follows the syntax
1212     rules of <xref linkend="sql-syntax-operators">, or is one of the
1213     key words <token>AND</token>, <token>OR</token>, and
1214     <token>NOT</token>, or is a qualified operator name in the form
1215 <synopsis>
1216 <literal>OPERATOR(</><replaceable>schema</><literal>.</><replaceable>operatorname</><literal>)</>
1217 </synopsis>
1218     Which particular operators exist and whether
1219     they are unary or binary depends on what operators have been
1220     defined by the system or the user.  <xref linkend="functions">
1221     describes the built-in operators.
1222    </para>
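
   <para>
    One built-in example of each form, followed by the qualified
    spelling:
<programlisting>
SELECT 3 + 4;                        -- binary infix operator
SELECT - 3;                          -- unary prefix operator
SELECT 5 !;                          -- unary postfix operator (factorial)
SELECT 3 OPERATOR(pg_catalog.+) 4;   -- schema-qualified operator name
</programlisting>
   </para>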
1223   </sect2>
1224
1225   <sect2>
1226    <title>Function Calls</title>
1227
1228    <indexterm>
1229     <primary>function</primary>
1230     <secondary>invocation</secondary>
1231    </indexterm>
1232
1233    <para>
1234     The syntax for a function call is the name of a function
1235     (possibly qualified with a schema name), followed by its argument list
1236     enclosed in parentheses:
1237
1238 <synopsis>
1239 <replaceable>function</replaceable> (<optional><replaceable>expression</replaceable> <optional>, <replaceable>expression</replaceable> ... </optional></optional> )
1240 </synopsis>
1241    </para>
1242
1243    <para>
1244     For example, the following computes the square root of 2:
1245 <programlisting>
1246 sqrt(2)
1247 </programlisting>
1248    </para>
1249
1250    <para>
1251     The list of built-in functions is in <xref linkend="functions">.
1252     Other functions may be added by the user.
1253    </para>
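
   <para>
    A few further calls to built-in functions, sketching the variations
    the synopsis allows:
<programlisting>
SELECT power(2, 10);          -- two arguments
SELECT now();                 -- empty argument list
SELECT pg_catalog.sqrt(2);    -- schema-qualified function name
</programlisting>
   </para>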
1254   </sect2>
1255
1256   <sect2 id="syntax-aggregates">
1257    <title>Aggregate Expressions</title>
1258
1259    <indexterm zone="syntax-aggregates">
1260     <primary>aggregate function</primary>
1261     <secondary>invocation</secondary>
1262    </indexterm>
1263
1264    <para>
1265     An <firstterm>aggregate expression</firstterm> represents the
1266     application of an aggregate function across the rows selected by a
1267     query.  An aggregate function reduces multiple inputs to a single
1268     output value, such as the sum or average of the inputs.  The
1269     syntax of an aggregate expression is one of the following:
1270
1271 <synopsis>
1272 <replaceable>aggregate_name</replaceable> (<replaceable>expression</replaceable>)
1273 <replaceable>aggregate_name</replaceable> (ALL <replaceable>expression</replaceable>)
1274 <replaceable>aggregate_name</replaceable> (DISTINCT <replaceable>expression</replaceable>)
1275 <replaceable>aggregate_name</replaceable> ( * )
1276 </synopsis>
1277
1278     where <replaceable>aggregate_name</replaceable> is a previously
1279     defined aggregate (possibly qualified with a schema name), and
1280     <replaceable>expression</replaceable> is
1281     any value expression that does not itself contain an aggregate
1282     expression.
1283    </para>
1284
1285    <para>
1286     The first form of aggregate expression invokes the aggregate
1287     across all input rows for which the given expression yields a
1288     non-null value.  (Actually, it is up to the aggregate function
1289     whether to ignore null values or not &mdash; but all the standard ones do.)
1290     The second form is the same as the first, since
1291     <literal>ALL</literal> is the default.  The third form invokes the
1292     aggregate for all distinct non-null values of the expression found
1293     in the input rows.  The last form invokes the aggregate once for
1294     each input row regardless of null or non-null values; since no
1295     particular input value is specified, it is generally only useful
1296     for the <function>count()</function> aggregate function.
1297    </para>
1298
1299    <para>
1300     For example, <literal>count(*)</literal> yields the total number
1301     of input rows; <literal>count(f1)</literal> yields the number of
1302     input rows in which <literal>f1</literal> is non-null;
1303     <literal>count(distinct f1)</literal> yields the number of
1304     distinct non-null values of <literal>f1</literal>.
1305    </para>
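
   <para>
    A sketch with a throwaway table (the name
    <literal>agg_demo</literal> is illustrative) that holds the values
    1, 1, 2 and one null:
<programlisting>
CREATE TABLE agg_demo (f1 int);
INSERT INTO agg_demo VALUES (1);
INSERT INTO agg_demo VALUES (1);
INSERT INTO agg_demo VALUES (2);
INSERT INTO agg_demo VALUES (NULL);

SELECT count(*)           AS all_rows,        -- 4
       count(f1)          AS non_null,        -- 3
       count(DISTINCT f1) AS distinct_values  -- 2
    FROM agg_demo;
</programlisting>
   </para>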
1306
1307    <para>
1308     The predefined aggregate functions are described in <xref
1309     linkend="functions-aggregate">.  Other aggregate functions may be added
1310     by the user.
1311    </para>
1312
1313    <para>
1314     An aggregate expression may only appear in the result list or
1315     <literal>HAVING</> clause of a <command>SELECT</> command.
1316     It is forbidden in other clauses, such as <literal>WHERE</>,
1317     because those clauses are logically evaluated before the results
1318     of aggregates are formed.
1319    </para>
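
   <para>
    For instance, assuming a hypothetical table
    <literal>weather(city, temp_lo)</literal>, the first query below is
    rejected, while the other two express common intents legally:
<programlisting>
-- not allowed: aggregate in WHERE
SELECT city FROM weather WHERE temp_lo = max(temp_lo);

-- compute the aggregate in a subquery instead
SELECT city FROM weather
    WHERE temp_lo = (SELECT max(temp_lo) FROM weather);

-- or filter groups after aggregation with HAVING
SELECT city, max(temp_lo) FROM weather
    GROUP BY city
    HAVING max(temp_lo) &lt; 40;
</programlisting>
   </para>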
1320
1321    <para>
1322     When an aggregate expression appears in a subquery (see
1323     <xref linkend="sql-syntax-scalar-subqueries"> and
1324     <xref linkend="functions-subquery">), the aggregate is normally
1325     evaluated over the rows of the subquery.  But an exception occurs
1326     if the aggregate's argument contains only outer-level variables:
1327     the aggregate then belongs to the nearest such outer level, and is
1328     evaluated over the rows of that query.  The aggregate expression
1329     as a whole is then an outer reference for the subquery it appears in,
1330     and acts as a constant over any one evaluation of that subquery.
1331     The restriction about
1332     appearing only in the result list or <literal>HAVING</> clause
1333     applies with respect to the query level that the aggregate belongs to.
1334    </para>
1335   </sect2>
1336
1337   <sect2 id="sql-syntax-type-casts">
1338    <title>Type Casts</title>
1339
1340    <indexterm>
1341     <primary>data type</primary>
1342     <secondary>type cast</secondary>
1343    </indexterm>
1344
1345    <indexterm>
1346     <primary>type cast</primary>
1347    </indexterm>
1348
1349    <para>
1350     A type cast specifies a conversion from one data type to another.
1351     <productname>PostgreSQL</productname> accepts two equivalent syntaxes
1352     for type casts:
1353 <synopsis>
1354 CAST ( <replaceable>expression</replaceable> AS <replaceable>type</replaceable> )
1355 <replaceable>expression</replaceable>::<replaceable>type</replaceable>
1356 </synopsis>
1357     The <literal>CAST</> syntax conforms to SQL; the syntax with
1358     <literal>::</literal> is historical <productname>PostgreSQL</productname>
1359     usage.
1360    </para>
1361
1362    <para>
1363     When a cast is applied to a value expression of a known type, it
1364     represents a run-time type conversion.  The cast will succeed only
1365     if a suitable type conversion operation has been defined.  Notice that this
1366     is subtly different from the use of casts with constants, as shown in
1367     <xref linkend="sql-syntax-constants-generic">.  A cast applied to an
1368     unadorned string literal represents the initial assignment of a type
1369     to a literal constant value, and so it will succeed for any type
1370     (if the contents of the string literal are acceptable input syntax for the
1371     data type).
1372    </para>
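
   <para>
    The difference can be sketched as follows.  In the first two
    statements the operand already has a type
    (<literal>numeric</literal>), so a conversion is performed at run
    time; in the last two the cast merely assigns a type to a string
    literal:
<programlisting>
SELECT CAST(3.7 AS integer);     -- run-time conversion from numeric
SELECT 3.7::integer;             -- same conversion, PostgreSQL-style syntax

SELECT CAST('3.7' AS numeric);   -- literal is read directly as numeric
SELECT '2006-03-10'::date;       -- literal is read directly as date
</programlisting>
   </para>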
1373
1374    <para>
1375     An explicit type cast may usually be omitted if there is no ambiguity as
1376     to the type that a value expression must produce (for example, when it is
1377     assigned to a table column); the system will automatically apply a
1378     type cast in such cases.  However, automatic casting is only done for
1379     casts that are marked <quote>OK to apply implicitly</>
1380     in the system catalogs.  Other casts must be invoked with
1381     explicit casting syntax.  This restriction is intended to prevent
1382     surprising conversions from being applied silently.
1383    </para>
1384
1385    <para>
1386     It is also possible to specify a type cast using a function-like
1387     syntax:
1388 <synopsis>
1389 <replaceable>typename</replaceable> ( <replaceable>expression</replaceable> )
1390 </synopsis>
1391     However, this only works for types whose names are also valid as
1392     function names.  For example, <literal>double precision</literal>
1393     can't be used this way, but the equivalent <literal>float8</literal>
1394     can.  Also, the names <literal>interval</>, <literal>time</>, and
1395     <literal>timestamp</> can only be used in this fashion if they are
1396     double-quoted, because of syntactic conflicts.  Therefore, the use of
1397     the function-like cast syntax leads to inconsistencies and should
1398     probably be avoided in new applications.
1399
1400     (The function-like syntax is in fact just a function call.  When
1401     one of the two standard cast syntaxes is used to do a run-time
1402     conversion, it will internally invoke a registered function to
1403     perform the conversion.  By convention, these conversion functions
1404     have the same name as their output type, and thus the <quote>function-like
1405     syntax</> is nothing more than a direct invocation of the underlying
1406     conversion function.  Obviously, this is not something that a portable
1407     application should rely on.)
1408    </para>
1409   </sect2>
1410
1411   <sect2 id="sql-syntax-scalar-subqueries">
1412    <title>Scalar Subqueries</title>
1413
1414    <indexterm>
1415     <primary>subquery</primary>
1416    </indexterm>
1417
1418    <para>
1419     A scalar subquery is an ordinary
1420     <command>SELECT</command> query in parentheses that returns exactly one
1421     row with one column.  (See <xref linkend="queries"> for information about writing queries.)
1422     The <command>SELECT</command> query is executed
1423     and the single returned value is used in the surrounding value expression.
1424     It is an error to use a query that
1425     returns more than one row or more than one column as a scalar subquery.
1426     (But if, during a particular execution, the subquery returns no rows,
1427     there is no error; the scalar result is taken to be null.)
1428     The subquery can refer to variables from the surrounding query,
1429     which will act as constants during any one evaluation of the subquery.
1430     See also <xref linkend="functions-subquery"> for other expressions involving subqueries.
1431    </para>
1432
1433    <para>
1434     For example, the following finds the largest city population in each
1435     state:
1436 <programlisting>
1437 SELECT name, (SELECT max(pop) FROM cities WHERE cities.state = states.name)
1438     FROM states;
1439 </programlisting>
1440    </para>
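
   <para>
    With the same illustrative tables, a state that has no matching
    rows in <literal>cities</literal> produces a null in the second
    column rather than an error, since <function>max</function> over an
    empty input is null.  If a different placeholder is wanted, the
    subquery can be wrapped in <function>COALESCE</function>, as in this
    sketch (the alias <literal>largest_pop</literal> is an assumption):
<programlisting>
SELECT name,
       COALESCE((SELECT max(pop) FROM cities
                 WHERE cities.state = states.name), 0) AS largest_pop
    FROM states;
</programlisting>
   </para>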
1441   </sect2>
1442
1443   <sect2 id="sql-syntax-array-constructors">
1444    <title>Array Constructors</title>
1445
1446    <indexterm>
1447     <primary>array</primary>
1448     <secondary>constructor</secondary>
1449    </indexterm>
1450
1451    <indexterm>
1452     <primary>ARRAY</primary>
1453    </indexterm>
1454
1455    <para>
1456     An array constructor is an expression that builds an
1457     array value from values for its member elements.  A simple array
1458     constructor
1459     consists of the key word <literal>ARRAY</literal>, a left square bracket
1460     <literal>[</>, one or more expressions (separated by commas) for the
1461     array element values, and finally a right square bracket <literal>]</>.
1462     For example,
1463 <programlisting>
1464 SELECT ARRAY[1,2,3+4];
1465   array
1466 ---------
1467  {1,2,7}
1468 (1 row)
1469 </programlisting>
1470     The array element type is the common type of the member expressions,
1471     determined using the same rules as for <literal>UNION</> or
1472     <literal>CASE</> constructs (see <xref linkend="typeconv-union-case">).
1473    </para>
1474
1475    <para>
1476     Multidimensional array values can be built by nesting array
1477     constructors.
1478     In the inner constructors, the key word <literal>ARRAY</literal> may
1479     be omitted.  For example, these produce the same result:
1480
1481 <programlisting>
1482 SELECT ARRAY[ARRAY[1,2], ARRAY[3,4]];
1483      array
1484 ---------------
1485  {{1,2},{3,4}}
1486 (1 row)
1487
1488 SELECT ARRAY[[1,2],[3,4]];
1489      array
1490 ---------------
1491  {{1,2},{3,4}}
1492 (1 row)
1493 </programlisting>
1494
1495     Since multidimensional arrays must be rectangular, inner constructors
1496     at the same level must produce sub-arrays of identical dimensions.
1497   </para>
1498
1499   <para>
1500     Multidimensional array constructor elements can be anything yielding
1501     an array of the proper kind, not only a sub-<literal>ARRAY</> construct.
1502     For example:
1503 <programlisting>
1504 CREATE TABLE arr(f1 int[], f2 int[]);
1505
1506 INSERT INTO arr VALUES (ARRAY[[1,2],[3,4]], ARRAY[[5,6],[7,8]]);
1507
1508 SELECT ARRAY[f1, f2, '{{9,10},{11,12}}'::int[]] FROM arr;
1509                      array
1510 ------------------------------------------------
1511  {{{1,2},{3,4}},{{5,6},{7,8}},{{9,10},{11,12}}}
1512 (1 row)
1513 </programlisting>
1514   </para>
1515
1516   <para>
1517    It is also possible to construct an array from the results of a
1518    subquery.  In this form, the array constructor is written with the
1519    key word <literal>ARRAY</literal> followed by a parenthesized (not
1520    bracketed) subquery. For example:
1521 <programlisting>
1522 SELECT ARRAY(SELECT oid FROM pg_proc WHERE proname LIKE 'bytea%');
1523                           ?column?
1524 -------------------------------------------------------------
1525  {2011,1954,1948,1952,1951,1244,1950,2005,1949,1953,2006,31}
1526 (1 row)
1527 </programlisting>
1528    The subquery must return a single column. The resulting
1529    one-dimensional array will have an element for each row in the
1530    subquery result, with an element type matching that of the
1531    subquery's output column.
1532   </para>
1533
1534   <para>
1535    The subscripts of an array value built with <literal>ARRAY</literal>
1536    always begin with one.  For more information about arrays, see
1537    <xref linkend="arrays">.
1538   </para>
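
  <para>
   The bounds of a constructed array can be checked with the built-in
   functions <function>array_lower</function> and
   <function>array_upper</function>, as in this short sketch:
<programlisting>
SELECT array_lower(ARRAY[5,6,7], 1);   -- 1
SELECT array_upper(ARRAY[5,6,7], 1);   -- 3
SELECT (ARRAY[5,6,7])[1];              -- 5, the first element
</programlisting>
  </para>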
1539
1540   </sect2>
1541
1542   <sect2 id="sql-syntax-row-constructors">
1543    <title>Row Constructors</title>
1544
1545    <indexterm>
1546     <primary>composite type</primary>
1547     <secondary>constructor</secondary>
1548    </indexterm>
1549
1550    <indexterm>
1551     <primary>row type</primary>
1552     <secondary>constructor</secondary>
1553    </indexterm>
1554
1555    <indexterm>
1556     <primary>ROW</primary>
1557    </indexterm>
1558
1559    <para>
1560     A row constructor is an expression that builds a row value (also
1561     called a composite value) from values
1562     for its member fields.  A row constructor consists of the key word
1563     <literal>ROW</literal>, a left parenthesis, zero or more
1564     expressions (separated by commas) for the row field values, and finally
1565     a right parenthesis.  For example,
1566 <programlisting>
1567 SELECT ROW(1,2.5,'this is a test');
1568 </programlisting>
1569     The key word <literal>ROW</> is optional when there is more than one
1570     expression in the list.
1571    </para>
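
   <para>
    The following sketch contrasts the cases in which the key word may
    and may not be omitted:
<programlisting>
SELECT ROW(1, 2.5, 'this is a test');   -- explicit key word
SELECT (1, 2.5, 'this is a test');      -- same row value, key word omitted
SELECT ROW(1);                          -- one field: ROW is required
SELECT ROW();                           -- zero fields: ROW is required
</programlisting>
   </para>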
1572
1573    <para>
1574     By default, the value created by a <literal>ROW</> expression is of
1575     an anonymous record type.  If necessary, it can be cast to a named
1576     composite type &mdash; either the row type of a table, or a composite type
1577     created with <command>CREATE TYPE AS</>.  An explicit cast may be needed
1578     to avoid ambiguity.  For example:
1579 <programlisting>
1580 CREATE TABLE mytable(f1 int, f2 float, f3 text);
1581
1582 CREATE FUNCTION getf1(mytable) RETURNS int AS 'SELECT $1.f1' LANGUAGE SQL;
1583
1584 -- No cast needed since only one getf1() exists
1585 SELECT getf1(ROW(1,2.5,'this is a test'));
1586  getf1
1587 -------
1588      1
1589 (1 row)
1590
1591 CREATE TYPE myrowtype AS (f1 int, f2 text, f3 numeric);
1592
1593 CREATE FUNCTION getf1(myrowtype) RETURNS int AS 'SELECT $1.f1' LANGUAGE SQL;
1594
1595 -- Now we need a cast to indicate which function to call:
1596 SELECT getf1(ROW(1,2.5,'this is a test'));
1597 ERROR:  function getf1(record) is not unique
1598
1599 SELECT getf1(ROW(1,2.5,'this is a test')::mytable);
1600  getf1
1601 -------
1602      1
1603 (1 row)
1604
1605 SELECT getf1(CAST(ROW(11,'this is a test',2.5) AS myrowtype));
1606  getf1
1607 -------
1608     11
1609 (1 row)
1610 </programlisting>
1611   </para>
1612
1613   <para>
1614    Row constructors can be used to build composite values to be stored
1615    in a composite-type table column, or to be passed to a function that
1616    accepts a composite parameter.  Also,
1617    it is possible to compare two row values or test a row with
1618    <literal>IS NULL</> or <literal>IS NOT NULL</>, for example
1619 <programlisting>
1620 SELECT ROW(1,2.5,'this is a test') = ROW(1, 3, 'not the same');
1621
SELECT ROW(f1, f2, f3) IS NOT NULL FROM mytable;
1623 </programlisting>
1624    For more detail see <xref linkend="functions-comparisons">.
1625    Row constructors can also be used in connection with subqueries,
1626    as discussed in <xref linkend="functions-subquery">.
1627   </para>
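       <para>
        As a minimal sketch of the first use mentioned above, a row constructor
        can supply the value for a composite-type table column.  The table
        <literal>holder</> is hypothetical and exists only for illustration;
        <literal>myrowtype</> is the composite type created in the earlier
        example:
     <programlisting>
     CREATE TABLE holder (id int, item myrowtype);

     -- the ROW value is converted to myrowtype when it is stored
     INSERT INTO holder VALUES (1, ROW(11, 'this is a test', 2.5));

     -- parentheses are needed around the column name when selecting a field
     SELECT (item).f1 FROM holder;
     </programlisting>
       </para>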
1628
1629   </sect2>
1630
1631   <sect2 id="syntax-express-eval">
1632    <title>Expression Evaluation Rules</title>
1633
1634    <indexterm>
1635     <primary>expression</primary>
1636     <secondary>order of evaluation</secondary>
1637    </indexterm>
1638
1639    <para>
1640     The order of evaluation of subexpressions is not defined.  In
1641     particular, the inputs of an operator or function are not necessarily
1642     evaluated left-to-right or in any other fixed order.
1643    </para>
1644
1645    <para>
1646     Furthermore, if the result of an expression can be determined by
1647     evaluating only some parts of it, then other subexpressions
1648     might not be evaluated at all.  For instance, if one wrote
1649 <programlisting>
1650 SELECT true OR somefunc();
1651 </programlisting>
1652     then <literal>somefunc()</literal> would (probably) not be called
1653     at all.  The same would be the case if one wrote
1654 <programlisting>
1655 SELECT somefunc() OR true;
1656 </programlisting>
1657     Note that this is not the same as the left-to-right
1658     <quote>short-circuiting</quote> of Boolean operators that is found
1659     in some programming languages.
1660    </para>
1661
1662    <para>
1663     As a consequence, it is unwise to use functions with side effects
1664     as part of complex expressions.  It is particularly dangerous to
1665     rely on side effects or evaluation order in <literal>WHERE</> and <literal>HAVING</> clauses,
1666     since those clauses are extensively reprocessed as part of
1667     developing an execution plan.  Boolean
1668     expressions (<literal>AND</>/<literal>OR</>/<literal>NOT</> combinations) in those clauses may be reorganized
1669     in any manner allowed by the laws of Boolean algebra.
1670    </para>
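        <para>
         As a hedged illustration of the hazard, consider a query that calls
         <literal>nextval()</> in a <literal>WHERE</> clause.  The sequence
         <literal>counter</> is hypothetical, and <literal>mytable</> is the
         table created in an earlier example:
     <programlisting>
     CREATE SEQUENCE counter;

     -- Unwise: how many times nextval() runs, and for which rows, depends on
     -- how the conditions are ordered in the chosen plan
     SELECT * FROM mytable WHERE f1 &gt; 0 AND nextval('counter') &lt; 100;
     </programlisting>
         The number of sequence values consumed by such a query should not be
         relied upon.
        </para>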
1671
1672    <para>
1673     When it is essential to force evaluation order, a <literal>CASE</>
1674     construct (see <xref linkend="functions-conditional">) may be
1675     used.  For example, this is an untrustworthy way of trying to
1676     avoid division by zero in a <literal>WHERE</> clause:
1677 <programlisting>
1678 SELECT ... WHERE x &lt;&gt; 0 AND y/x &gt; 1.5;
1679 </programlisting>
1680     But this is safe:
1681 <programlisting>
1682 SELECT ... WHERE CASE WHEN x &lt;&gt; 0 THEN y/x &gt; 1.5 ELSE false END;
1683 </programlisting>
1684     A <literal>CASE</> construct used in this fashion will defeat optimization
1685     attempts, so it should be used only when necessary.  (In this particular
1686     example, if <literal>x</> is known to be positive, it would be better to
1687     sidestep the problem by writing <literal>y &gt; 1.5*x</> instead.)
1688    </para>
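        <para>
         Another way to avoid the division-by-zero error, offered here only as
         a sketch, is <literal>NULLIF</> (also described in
         <xref linkend="functions-conditional">).  It turns a zero divisor into
         null, so the division yields null rather than an error, whatever order
         the conditions are evaluated in.  The table <literal>t</> is
         hypothetical:
     <programlisting>
     CREATE TABLE t (x float, y float);

     -- when x = 0 the comparison yields null, so the row is not selected
     SELECT * FROM t WHERE y / NULLIF(x, 0) &gt; 1.5;
     </programlisting>
        </para>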
1689   </sect2>
1690  </sect1>
1691
1692 </chapter>
1693
1694 <!-- Keep this comment at the end of the file
1695 Local variables:
1696 mode:sgml
1697 sgml-omittag:nil
1698 sgml-shorttag:t
1699 sgml-minimize-attributes:nil
1700 sgml-always-quote-attributes:t
1701 sgml-indent-step:1
1702 sgml-indent-data:t
1703 sgml-parent-document:nil
1704 sgml-default-dtd-file:"./reference.ced"
1705 sgml-exposed-tags:nil
1706 sgml-local-catalogs:("/usr/lib/sgml/catalog")
1707 sgml-local-ecat-files:nil
1708 End:
1709 -->