doc/src/sgml/syntax.sgml

   1 <!-- $PostgreSQL: pgsql/doc/src/sgml/syntax.sgml,v 1.135 2009/09/21 22:22:07 petere Exp $ -->
   2
   3 <chapter id="sql-syntax">
   4  <title>SQL Syntax</title>
   5
   6  <indexterm zone="sql-syntax">
   7   <primary>syntax</primary>
   8   <secondary>SQL</secondary>
   9  </indexterm>
  10
  11  <para>
  12   This chapter describes the syntax of SQL.  It forms the foundation
  13   for understanding the following chapters, which go into detail
  14   about how SQL commands are applied to define and modify data.
  15  </para>
  16
  17  <para>
  18   We also advise users who are already familiar with SQL to read this
  19   chapter carefully because it contains several rules and concepts that
  20   are implemented inconsistently among SQL databases or that are
  21   specific to <productname>PostgreSQL</productname>.
  22  </para>
  23
  24  <sect1 id="sql-syntax-lexical">
  25   <title>Lexical Structure</title>
  26
  27   <indexterm>
  28    <primary>token</primary>
  29   </indexterm>
  30
  31   <para>
  32    SQL input consists of a sequence of
  33    <firstterm>commands</firstterm>.  A command is composed of a
  34    sequence of <firstterm>tokens</firstterm>, terminated by a
  35    semicolon (<quote>;</quote>).  The end of the input stream also
  36    terminates a command.  Which tokens are valid depends on the syntax
  37    of the particular command.
  38   </para>
  39
  40   <para>
  41    A token can be a <firstterm>key word</firstterm>, an
  42    <firstterm>identifier</firstterm>, a <firstterm>quoted
  43    identifier</firstterm>, a <firstterm>literal</firstterm> (or
  44    constant), or a special character symbol.  Tokens are normally
  45    separated by whitespace (space, tab, newline), but need not be if
  46    there is no ambiguity (which is generally only the case if a
  47    special character is adjacent to some other token type).
  48   </para>
  49
  50   <para>
  51    Additionally, <firstterm>comments</firstterm> can occur in SQL
  52    input.  They are not tokens; they are effectively equivalent to
  53    whitespace.
  54   </para>
  55
  56    <para>
  57     For example, the following is (syntactically) valid SQL input:
  58 <programlisting>
  59 SELECT * FROM MY_TABLE;
  60 UPDATE MY_TABLE SET A = 5;
  61 INSERT INTO MY_TABLE VALUES (3, 'hi there');
  62 </programlisting>
  63     This is a sequence of three commands, one per line (although this
  64     is not required; more than one command can be on a line, and
  65     commands can usefully be split across lines).
  66    </para>
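
       <para>
        As a brief illustration of that flexibility, the following input
        contains the same three commands (still using the example table
        <literal>MY_TABLE</literal>), with two of them sharing a line and
        the third split across two lines:
     <programlisting>
     SELECT * FROM MY_TABLE; UPDATE MY_TABLE SET A = 5;
     INSERT INTO MY_TABLE
         VALUES (3, 'hi there');
     </programlisting>
       </para>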
  67
  68   <para>
  69    The SQL syntax is not very consistent regarding which tokens
  70    identify commands and which are operands or parameters.  The first
  71    few tokens are generally the command name, so in the above example
  72    we would usually speak of a <quote>SELECT</quote>, an
  73    <quote>UPDATE</quote>, and an <quote>INSERT</quote> command.  But
  74    for instance the <command>UPDATE</command> command always requires
  75    a <token>SET</token> token to appear in a certain position, and
  76    this particular variation of <command>INSERT</command> also
  77    requires a <token>VALUES</token> in order to be complete.  The
  78    precise syntax rules for each command are described in <xref linkend="reference">.
  79   </para>
  80
  81   <sect2 id="sql-syntax-identifiers">
  82    <title>Identifiers and Key Words</title>
  83
  84    <indexterm zone="sql-syntax-identifiers">
  85     <primary>identifier</primary>
  86     <secondary>syntax of</secondary>
  87    </indexterm>
  88
  89    <indexterm zone="sql-syntax-identifiers">
  90     <primary>name</primary>
  91     <secondary>syntax of</secondary>
  92    </indexterm>
  93
  94    <indexterm zone="sql-syntax-identifiers">
  95     <primary>key word</primary>
  96     <secondary>syntax of</secondary>
  97    </indexterm>
  98
  99    <para>
 100     Tokens such as <token>SELECT</token>, <token>UPDATE</token>, or
 101     <token>VALUES</token> in the example above are examples of
 102     <firstterm>key words</firstterm>, that is, words that have a fixed
 103     meaning in the SQL language.  The tokens <token>MY_TABLE</token>
 104     and <token>A</token> are examples of
 105     <firstterm>identifiers</firstterm>.  They identify names of
 106     tables, columns, or other database objects, depending on the
 107     command they are used in.  Therefore they are sometimes simply
 108     called <quote>names</quote>.  Key words and identifiers have the
 109     same lexical structure, meaning that one cannot know whether a
 110     token is an identifier or a key word without knowing the language.
 111     A complete list of key words can be found in <xref
 112     linkend="sql-keywords-appendix">.
 113    </para>
 114
 115    <para>
 116     SQL identifiers and key words must begin with a letter
 117     (<literal>a</literal>-<literal>z</literal>, but also letters with
 118     diacritical marks and non-Latin letters) or an underscore
 119     (<literal>_</literal>).  Subsequent characters in an identifier or
 120     key word can be letters, underscores, digits
 121     (<literal>0</literal>-<literal>9</literal>), or dollar signs
 122     (<literal>$</>).  Note that dollar signs are not allowed in identifiers
 123     according to the letter of the SQL standard, so their use might render
 124     applications less portable.
 125     The SQL standard will not define a key word that contains
 126     digits or starts or ends with an underscore, so identifiers of this
 127     form are safe against possible conflict with future extensions of the
 128     standard.
 129    </para>
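
        <para>
         For illustration, every name in the following sketch is a valid
         unquoted identifier under these rules.  The table and column names
         are invented for this example only:
     <programlisting>
     CREATE TABLE _staging1 (
         item_2009   integer,   -- digits are allowed after the first character
         _note       text,      -- a leading underscore is allowed
         price$usd   numeric    -- dollar sign: accepted here, but not standard SQL
     );
     </programlisting>
        </para>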
 130
 131    <para>
 132     <indexterm><primary>identifier</primary><secondary>length</secondary></indexterm>
 133     The system uses no more than <symbol>NAMEDATALEN</symbol>-1
 134     bytes of an identifier; longer names can be written in
 135     commands, but they will be truncated.  By default,
 136     <symbol>NAMEDATALEN</symbol> is 64 so the maximum identifier
 137     length is 63 bytes. If this limit is problematic, it can be raised by
 138     changing the <symbol>NAMEDATALEN</symbol> constant in
 139     <filename>src/include/pg_config_manual.h</filename>.
 140    </para>
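
        <para>
         A sketch of the effect of truncation, assuming the default
         <symbol>NAMEDATALEN</symbol> of 64.  The two made-up names below
         are 70 bytes long and differ only in their last byte, so they
         agree in the first 63 bytes and refer to the same table:
     <programlisting>
     -- 70 bytes long; only the first 63 are used
     CREATE TABLE a234567890b234567890c234567890d234567890e234567890f234567890g234567890 (x integer);

     -- differs only in byte 70, so this names the same table
     SELECT * FROM a234567890b234567890c234567890d234567890e234567890f234567890g23456789X;
     </programlisting>
        </para>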
 141
 142    <para>
 143     <indexterm>
 144      <primary>case sensitivity</primary>
 145      <secondary>of SQL commands</secondary>
 146     </indexterm>
 147     Key words and unquoted identifiers are case insensitive.  Therefore:
 148 <programlisting>
 149 UPDATE MY_TABLE SET A = 5;
 150 </programlisting>
 151     can equivalently be written as:
 152 <programlisting>
 153 uPDaTE my_TabLE SeT a = 5;
 154 </programlisting>
 155     A convention often used is to write key words in upper
 156     case and names in lower case, e.g.:
 157 <programlisting>
 158 UPDATE my_table SET a = 5;
 159 </programlisting>
 160    </para>
 161
 162    <para>
 163     <indexterm>
 164      <primary>quotation marks</primary>
 165      <secondary>and identifiers</secondary>
 166     </indexterm>
 167     There is a second kind of identifier:  the <firstterm>delimited
 168     identifier</firstterm> or <firstterm>quoted
 169     identifier</firstterm>.  It is formed by enclosing an arbitrary
 170     sequence of characters in double-quotes
 171     (<literal>"</literal>). <!-- " font-lock mania --> A delimited
 172     identifier is always an identifier, never a key word.  So
 173     <literal>"select"</literal> could be used to refer to a column or
 174     table named <quote>select</quote>, whereas an unquoted
 175     <literal>select</literal> would be taken as a key word and
 176     would therefore provoke a parse error when used where a table or
 177     column name is expected.  The example can be written with quoted
 178     identifiers like this:
 179 <programlisting>
 180 UPDATE "my_table" SET "a" = 5;
 181 </programlisting>
 182    </para>
 183
 184    <para>
 185     Quoted identifiers can contain any character, except the character
 186     with code zero.  (To include a double quote, write two double quotes.)
 187     This allows constructing table or column names that would
 188     otherwise not be possible, such as ones containing spaces or
 189     ampersands.  The length limitation still applies.
 190    </para>
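
        <para>
         As an illustration, the following sketch uses quoted identifiers
         containing a space, an ampersand, and embedded double quotes.  The
         names are hypothetical:
     <programlisting>
     CREATE TABLE "order details" (
         "R&amp;D budget"   numeric,
         "say ""hi"""   text      -- the column name is: say "hi"
     );
     </programlisting>
        </para>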
 191
 192    <para>
 193     <indexterm><primary>Unicode escape</primary><secondary>in
 194     identifiers</secondary></indexterm> A variant of quoted
 195     identifiers allows including escaped Unicode characters identified
 196     by their code points.  This variant starts
 197     with <literal>U&amp;</literal> (upper or lower case U followed by
 198     ampersand) immediately before the opening double quote, without
 199     any spaces in between, for example <literal>U&amp;"foo"</literal>.
 200     (Note that this creates an ambiguity with the
 201     operator <literal>&amp;</literal>.  Use spaces around the operator to
 202     avoid this problem.)  Inside the quotes, Unicode characters can be
 203     specified in escaped form by writing a backslash followed by the
 204     four-digit hexadecimal code point number or alternatively a
 205     backslash followed by a plus sign followed by a six-digit
 206     hexadecimal code point number.  For example, the
 207     identifier <literal>"data"</literal> could be written as
 208 <programlisting>
 209 U&amp;"d\0061t\+000061"
 210 </programlisting>
 211     The following less trivial example writes the Russian
 212     word <quote>slon</quote> (elephant) in Cyrillic letters:
 213 <programlisting>
 214 U&amp;"\0441\043B\043E\043D"
 215 </programlisting>
 216    </para>
 217
 218    <para>
 219     If a different escape character than backslash is desired, it can
 220     be specified using
 221     the <literal>UESCAPE</literal><indexterm><primary>UESCAPE</primary></indexterm>
 222     clause after the string, for example:
 223 <programlisting>
 224 U&amp;"d!0061t!+000061" UESCAPE '!'
 225 </programlisting>
 226     The escape character can be any single character other than a
 227     hexadecimal digit, the plus sign, a single quote, a double quote,
 228     or a whitespace character.  Note that the escape character is
 229     written in single quotes, not double quotes.
 230    </para>
 231
 232    <para>
 233     To include the escape character in the identifier literally, write
 234     it twice.
 235    </para>
 236
 237    <para>
 238     The Unicode escape syntax works only when the server encoding is
 239     UTF8.  When other server encodings are used, only code points in
 240     the ASCII range (up to <literal>\007F</literal>) can be specified.
 241     Both the 4-digit and the 6-digit form can be used to specify
 242     UTF-16 surrogate pairs to compose characters with code points
 243     larger than <literal>\FFFF</literal> (although the availability of
 244     the 6-digit form technically makes this unnecessary).
 245    </para>
 246
 247    <para>
 248     Quoting an identifier also makes it case-sensitive, whereas
 249     unquoted names are always folded to lower case.  For example, the
 250     identifiers <literal>FOO</literal>, <literal>foo</literal>, and
 251     <literal>"foo"</literal> are considered the same by
 252     <productname>PostgreSQL</productname>, but
 253     <literal>"Foo"</literal> and <literal>"FOO"</literal> are
 254     different from these three and each other.  (The folding of
 255     unquoted names to lower case in <productname>PostgreSQL</> is
 256     incompatible with the SQL standard, which says that unquoted names
 257     should be folded to upper case.  Thus, <literal>foo</literal>
 258     should be equivalent to <literal>"FOO"</literal> not
 259     <literal>"foo"</literal> according to the standard.  If you want
 260     to write portable applications you are advised to always quote a
 261     particular name or never quote it.)
 262    </para>
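
        <para>
         The folding rules can be seen in a short sketch such as the
         following, in which the table name is chosen only for
         illustration:
     <programlisting>
     CREATE TABLE Foo (a integer);   -- creates a table named foo
     SELECT * FROM FOO;              -- same table: FOO is folded to foo
     SELECT * FROM "foo";            -- same table
     SELECT * FROM "Foo";            -- a different name; fails unless such a table exists
     </programlisting>
        </para>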
 263   </sect2>
 264
 265
 266   <sect2 id="sql-syntax-constants">
 267    <title>Constants</title>
 268
 269    <indexterm zone="sql-syntax-constants">
 270     <primary>constant</primary>
 271    </indexterm>
 272
 273    <para>
 274     There are three kinds of <firstterm>implicitly-typed
 275     constants</firstterm> in <productname>PostgreSQL</productname>:
 276     strings, bit strings, and numbers.
 277     Constants can also be specified with explicit types, which can
 278     enable more accurate representation and more efficient handling by
 279     the system. These alternatives are discussed in the following
 280     subsections.
 281    </para>
 282
 283    <sect3 id="sql-syntax-strings">
 284     <title>String Constants</title>
 285
 286     <indexterm zone="sql-syntax-strings">
 287      <primary>character string</primary>
 288      <secondary>constant</secondary>
 289     </indexterm>
 290
 291     <para>
 292      <indexterm>
 293       <primary>quotation marks</primary>
 294       <secondary>escaping</secondary>
 295      </indexterm>
 296      A string constant in SQL is an arbitrary sequence of characters
 297      bounded by single quotes (<literal>'</literal>), for example
 298      <literal>'This is a string'</literal>.  To include
 299      a single-quote character within a string constant,
 300      write two adjacent single quotes, e.g.,
 301      <literal>'Dianne''s horse'</literal>.
 302      Note that this is <emphasis>not</> the same as a double-quote
 303      character (<literal>"</>). <!-- font-lock sanity: " -->
 304     </para>
 305
 306     <para>
 307      Two string constants that are only separated by whitespace
 308      <emphasis>with at least one newline</emphasis> are concatenated
 309      and effectively treated as if the string had been written as one
 310      constant.  For example:
 311 <programlisting>
 312 SELECT 'foo'
 313 'bar';
 314 </programlisting>
 315      is equivalent to:
 316 <programlisting>
 317 SELECT 'foobar';
 318 </programlisting>
 319      but:
 320 <programlisting>
 321 SELECT 'foo'      'bar';
 322 </programlisting>
 323      is not valid syntax.  (This slightly bizarre behavior is specified
 324      by <acronym>SQL</acronym>; <productname>PostgreSQL</productname> is
 325      following the standard.)
 326     </para>
 327    </sect3>
 328
 329    <sect3 id="sql-syntax-strings-escape">
 330     <title>String Constants with C-Style Escapes</title>
 331
 332      <indexterm zone="sql-syntax-strings-escape">
 333       <primary>escape string syntax</primary>
 334      </indexterm>
 335      <indexterm zone="sql-syntax-strings-escape">
 336       <primary>backslash escapes</primary>
 337      </indexterm>
 338
 339     <para>
 340      <productname>PostgreSQL</productname> also accepts <quote>escape</>
 341      string constants, which are an extension to the SQL standard.
 342      An escape string constant is specified by writing the letter
 343      <literal>E</literal> (upper or lower case) just before the opening single
 344      quote, e.g., <literal>E'foo'</>.  (When continuing an escape string
 345      constant across lines, write <literal>E</> only before the first opening
 346      quote.)
 347      Within an escape string, a backslash character (<literal>\</>) begins a
 348      C-like <firstterm>backslash escape</> sequence, in which the combination
 349      of backslash and following character(s) represents a special byte
 350      value, as shown in <xref linkend="sql-backslash-table">.
 351     </para>
 352
 353      <table id="sql-backslash-table">
 354       <title>Backslash Escape Sequences</title>
 355       <tgroup cols="2">
 356       <thead>
 357        <row>
 358         <entry>Backslash Escape Sequence</>
 359         <entry>Interpretation</entry>
 360        </row>
 361       </thead>
 362
 363       <tbody>
 364        <row>
 365         <entry><literal>\b</literal></entry>
 366         <entry>backspace</entry>
 367        </row>
 368        <row>
 369         <entry><literal>\f</literal></entry>
 370         <entry>form feed</entry>
 371        </row>
 372        <row>
 373         <entry><literal>\n</literal></entry>
 374         <entry>newline</entry>
 375        </row>
 376        <row>
 377         <entry><literal>\r</literal></entry>
 378         <entry>carriage return</entry>
 379        </row>
 380        <row>
 381         <entry><literal>\t</literal></entry>
 382         <entry>tab</entry>
 383        </row>
 384        <row>
 385         <entry>
 386          <literal>\<replaceable>o</replaceable></literal>,
 387          <literal>\<replaceable>oo</replaceable></literal>,
 388          <literal>\<replaceable>ooo</replaceable></literal>
 389          (<replaceable>o</replaceable> = 0 - 7)
 390         </entry>
 391         <entry>octal byte value</entry>
 392        </row>
 393        <row>
 394         <entry>
 395          <literal>\x<replaceable>h</replaceable></literal>,
 396          <literal>\x<replaceable>hh</replaceable></literal>
 397          (<replaceable>h</replaceable> = 0 - 9, A - F)
 398         </entry>
 399         <entry>hexadecimal byte value</entry>
 400        </row>
 401       </tbody>
 402       </tgroup>
 403      </table>
 404
 405     <para>
 406      Any other
 407      character following a backslash is taken literally. Thus, to
 408      include a backslash character, write two backslashes (<literal>\\</>).
 409      Also, a single quote can be included in an escape string by writing
 410      <literal>\'</literal>, in addition to the normal way of <literal>''</>.
 411     </para>
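
         <para>
          A few escape string constants, as a sketch of the rules above.
          The fourth line assumes that the
          <varname>backslash_quote</varname> setting in effect permits
          <literal>\'</literal>:
     <programlisting>
     SELECT E'first line\nsecond line';   -- \n is a newline
     SELECT E'tab\tseparated';            -- \t is a tab
     SELECT E'C:\\temp';                  -- \\ is a single backslash
     SELECT E'Dianne\'s horse';           -- same string as 'Dianne''s horse'
     SELECT E'\x41\102';                  -- hexadecimal 41 and octal 102: AB
     </programlisting>
         </para>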
 412
 413     <para>
 414      It is your responsibility to ensure that the byte sequences you create are
 415      valid characters in the server character set encoding.  When the
 416      server encoding is UTF-8, then the alternative Unicode escape
 417      syntax, explained in <xref linkend="sql-syntax-strings-uescape">,
 418      should be used instead.  (The alternative would be doing the
 419      UTF-8 encoding by hand and writing out the bytes, which would be
 420      very cumbersome.)
 421     </para>
 422
 423     <caution>
 424     <para>
 425      If the configuration parameter
 426      <xref linkend="guc-standard-conforming-strings"> is <literal>off</>,
 427      then <productname>PostgreSQL</productname> recognizes backslash escapes
 428      in both regular and escape string constants.  This is for backward
 429      compatibility with the historical behavior, where backslash escapes
 430      were always recognized.
 431      Although <varname>standard_conforming_strings</> currently defaults to
 432      <literal>off</>, the default will change to <literal>on</> in a future
 433      release for improved standards compliance.  Applications are therefore
 434      encouraged to migrate away from using backslash escapes.  If you need
 435      to use a backslash escape to represent a special character, write the
 436      string constant with an <literal>E</> to be sure it will be handled the same
 437      way in future releases.
 438     </para>
 439
 440     <para>
 441      In addition to <varname>standard_conforming_strings</>, the configuration
 442      parameters <xref linkend="guc-escape-string-warning"> and
 443      <xref linkend="guc-backslash-quote"> govern treatment of backslashes
 444      in string constants.
 445     </para>
 446     </caution>
 447
 448     <para>
 449      The character with the code zero cannot be in a string constant.
 450     </para>
 451    </sect3>
 452
 453    <sect3 id="sql-syntax-strings-uescape">
 454     <title>String Constants with Unicode Escapes</title>
 455
 456     <indexterm  zone="sql-syntax-strings-uescape">
 457      <primary>Unicode escape</primary>
 458      <secondary>in string constants</secondary>
 459     </indexterm>
 460
 461     <para>
 462      <productname>PostgreSQL</productname> also supports another type
 463      of escape syntax for strings that allows specifying arbitrary
 464      Unicode characters by code point.  A Unicode escape string
 465      constant starts with <literal>U&amp;</literal> (upper or lower case
 466      letter U followed by ampersand) immediately before the opening
 467      quote, without any spaces in between, for
 468      example <literal>U&amp;'foo'</literal>.  (Note that this creates an
 469      ambiguity with the operator <literal>&amp;</literal>.  Use spaces
 470      around the operator to avoid this problem.)  Inside the quotes,
 471      Unicode characters can be specified in escaped form by writing a
 472      backslash followed by the four-digit hexadecimal code point
 473      number or alternatively a backslash followed by a plus sign
 474      followed by a six-digit hexadecimal code point number.  For
 475      example, the string <literal>'data'</literal> could be written as
 476 <programlisting>
 477 U&amp;'d\0061t\+000061'
 478 </programlisting>
 479      The following less trivial example writes the Russian
 480      word <quote>slon</quote> (elephant) in Cyrillic letters:
 481 <programlisting>
 482 U&amp;'\0441\043B\043E\043D'
 483 </programlisting>
 484     </para>
 485
 486     <para>
 487      If a different escape character than backslash is desired, it can
 488      be specified using
 489      the <literal>UESCAPE</literal><indexterm><primary>UESCAPE</primary></indexterm>
 490      clause after the string, for example:
 491 <programlisting>
 492 U&amp;'d!0061t!+000061' UESCAPE '!'
 493 </programlisting>
 494      The escape character can be any single character other than a
 495      hexadecimal digit, the plus sign, a single quote, a double quote,
 496      or a whitespace character.
 497     </para>
 498
 499     <para>
 500      The Unicode escape syntax works only when the server encoding is
 501      UTF8.  When other server encodings are used, only code points in
 502      the ASCII range (up to <literal>\007F</literal>) can be
 503      specified.
 504      Both the 4-digit and the 6-digit form can be used to specify
 505      UTF-16 surrogate pairs to compose characters with code points
 506      larger than <literal>\FFFF</literal> (although the availability
 507      of the 6-digit form technically makes this unnecessary).
 508     </para>
 509
 510     <para>
 511      Also, the Unicode escape syntax for string constants only works
 512      when the configuration
 513      parameter <xref linkend="guc-standard-conforming-strings"> is
 514      turned on.  This is because otherwise this syntax could confuse
 515      clients that parse the SQL statements to the point that it could
 516      lead to SQL injections and similar security issues.  If the
 517      parameter is set to off, this syntax will be rejected with an
 518      error message.
 519     </para>
 520
 521     <para>
 522      To include the escape character in the string literally, write it
 523      twice.
 524     </para>
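
         <para>
          For instance, the following sketch doubles the escape character
          to include it literally, first with the default backslash and
          then with an escape character chosen through
          <literal>UESCAPE</literal>.  It assumes that
          <varname>standard_conforming_strings</varname> may be turned on
          for the session:
     <programlisting>
     SET standard_conforming_strings = on;

     SELECT U&amp;'A\\B';                -- the three characters A \ B
     SELECT U&amp;'A!!B' UESCAPE '!';    -- the three characters A ! B
     </programlisting>
         </para>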
 525    </sect3>
 526
 527    <sect3 id="sql-syntax-dollar-quoting">
 528     <title>Dollar-Quoted String Constants</title>
 529
 530      <indexterm>
 531       <primary>dollar quoting</primary>
 532      </indexterm>
 533
 534     <para>
 535      While the standard syntax for specifying string constants is usually
 536      convenient, it can be difficult to understand when the desired string
 537      contains many single quotes or backslashes, since each of those must
 538      be doubled. To allow more readable queries in such situations,
 539      <productname>PostgreSQL</productname> provides another way, called
 540      <quote>dollar quoting</quote>, to write string constants.
 541      A dollar-quoted string constant
 542      consists of a dollar sign (<literal>$</literal>), an optional
 543      <quote>tag</quote> of zero or more characters, another dollar
 544      sign, an arbitrary sequence of characters that makes up the
 545      string content, a dollar sign, the same tag that began this
 546      dollar quote, and a dollar sign. For example, here are two
 547      different ways to specify the string <quote>Dianne's horse</>
 548      using dollar quoting:
 549 <programlisting>
 550 $$Dianne's horse$$
 551 $SomeTag$Dianne's horse$SomeTag$
 552 </programlisting>
 553      Notice that inside the dollar-quoted string, single quotes can be
 554      used without needing to be escaped.  Indeed, no characters inside
 555      a dollar-quoted string are ever escaped: the string content is always
 556      written literally.  Backslashes are not special, and neither are
 557      dollar signs, unless they are part of a sequence matching the opening
 558      tag.
 559     </para>
 560
 561     <para>
 562      It is possible to nest dollar-quoted string constants by choosing
 563      different tags at each nesting level.  This is most commonly used in
 564      writing function definitions.  For example:
 565 <programlisting>
 566 $function$
 567 BEGIN
 568     RETURN ($1 ~ $q$[\t\r\n\v\\]$q$);
 569 END;
 570 $function$
 571 </programlisting>
 572      Here, the sequence <literal>$q$[\t\r\n\v\\]$q$</> represents a
 573      dollar-quoted literal string <literal>[\t\r\n\v\\]</>, which will
 574      be recognized when the function body is executed by
 575      <productname>PostgreSQL</>.  But since the sequence does not match
 576      the outer dollar quoting delimiter <literal>$function$</>, it is
 577      just some more characters within the constant so far as the outer
 578      string is concerned.
 579     </para>
 580
 581     <para>
 582      The tag, if any, of a dollar-quoted string follows the same rules
 583      as an unquoted identifier, except that it cannot contain a dollar sign.
 584      Tags are case sensitive, so <literal>$tag$String content$tag$</literal>
 585      is correct, but <literal>$TAG$String content$tag$</literal> is not.
 586     </para>
 587
 588     <para>
 589      A dollar-quoted string that follows a key word or identifier must
 590      be separated from it by whitespace; otherwise the dollar quoting
 591      delimiter would be taken as part of the preceding identifier.
 592     </para>
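
         <para>
          A sketch of the difference.  The second command is shown only
          as an example of the mistake and is not expected to run:
     <programlisting>
     SELECT $$abc$$;    -- a dollar-quoted string following the key word SELECT
     SELECT$$abc$$;     -- read as the single identifier SELECT$$abc$$, not a string
     </programlisting>
         </para>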
 593
 594     <para>
 595      Dollar quoting is not part of the SQL standard, but it is often a more
 596      convenient way to write complicated string literals than the
 597      standard-compliant single quote syntax.  It is particularly useful when
 598      representing string constants inside other constants, as is often needed
 599      in procedural function definitions.  With single-quote syntax, each
 600      backslash in the above example would have to be written as four
 601      backslashes, which would be reduced to two backslashes in parsing the
 602      original string constant, and then to one when the inner string constant
 603      is re-parsed during function execution.
 604     </para>
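
         <para>
          To show the function body from the earlier example in context,
          here is a sketch of a complete command.  The function name
          <literal>has_control_char</literal> is hypothetical, and the
          example assumes that PL/pgSQL is installed in the database:
     <programlisting>
     CREATE FUNCTION has_control_char(text) RETURNS boolean AS $function$
     BEGIN
         RETURN ($1 ~ $q$[\t\r\n\v\\]$q$);
     END;
     $function$ LANGUAGE plpgsql;
     </programlisting>
         </para>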
 605    </sect3>
 606
 607    <sect3 id="sql-syntax-bit-strings">
 608     <title>Bit-String Constants</title>
 609
 610     <indexterm zone="sql-syntax-bit-strings">
 611      <primary>bit string</primary>
 612      <secondary>constant</secondary>
 613     </indexterm>
 614
 615     <para>
 616      Bit-string constants look like regular string constants with a
 617      <literal>B</literal> (upper or lower case) immediately before the
 618      opening quote (no intervening whitespace), e.g.,
 619      <literal>B'1001'</literal>.  The only characters allowed within
 620      bit-string constants are <literal>0</literal> and
 621      <literal>1</literal>.
 622     </para>
 623
 624     <para>
 625      Alternatively, bit-string constants can be specified in hexadecimal
 626      notation, using a leading <literal>X</literal> (upper or lower case),
 627      e.g., <literal>X'1FF'</literal>.  This notation is equivalent to
 628      a bit-string constant with four binary digits for each hexadecimal digit.
 629     </para>
 630
 631     <para>
 632      Both forms of bit-string constant can be continued
 633      across lines in the same way as regular string constants.
 634      Dollar quoting cannot be used in a bit-string constant.
 635     </para>
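
         <para>
          A short sketch of both notations and of continuation across
          lines:
     <programlisting>
     SELECT B'1001';            -- four bits
     SELECT X'1FF';             -- twelve bits, equivalent to the next constant
     SELECT B'000111111111';
     SELECT B'1001'
     '0110';                    -- continued across lines: B'10010110'
     </programlisting>
         </para>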
 636    </sect3>
 637
 638    <sect3>
 639     <title>Numeric Constants</title>
 640
 641     <indexterm>
 642      <primary>number</primary>
 643      <secondary>constant</secondary>
 644     </indexterm>
 645
 646     <para>
 647      Numeric constants are accepted in these general forms:
 648 <synopsis>
 649 <replaceable>digits</replaceable>
 650 <replaceable>digits</replaceable>.<optional><replaceable>digits</replaceable></optional><optional>e<optional>+-</optional><replaceable>digits</replaceable></optional>
 651 <optional><replaceable>digits</replaceable></optional>.<replaceable>digits</replaceable><optional>e<optional>+-</optional><replaceable>digits</replaceable></optional>
 652 <replaceable>digits</replaceable>e<optional>+-</optional><replaceable>digits</replaceable>
 653 </synopsis>
 654      where <replaceable>digits</replaceable> is one or more decimal
 655      digits (0 through 9).  At least one digit must be before or after the
 656      decimal point, if one is used.  At least one digit must follow the
 657      exponent marker (<literal>e</literal>), if one is present.
 658      There cannot be any spaces or other characters embedded in the
 659      constant.  Note that any leading plus or minus sign is not actually
 660      considered part of the constant; it is an operator applied to the
 661      constant.
 662     </para>
 663
 664     <para>
 665      These are some examples of valid numeric constants:
 666 <literallayout>
 667 42
 668 3.5
 669 4.
 670 .001
 671 5e2
 672 1.925e-3
 673 </literallayout>
 674     </para>
 675
 676     <para>
 677      <indexterm><primary>integer</primary></indexterm>
 678      <indexterm><primary>bigint</primary></indexterm>
 679      <indexterm><primary>numeric</primary></indexterm>
 680      A numeric constant that contains neither a decimal point nor an
 681      exponent is initially presumed to be type <type>integer</> if its
 682      value fits in type <type>integer</> (32 bits); otherwise it is
 683      presumed to be type <type>bigint</> if its
 684      value fits in type <type>bigint</> (64 bits); otherwise it is
 685      taken to be type <type>numeric</>.  Constants that contain decimal
 686      points and/or exponents are always initially presumed to be type
 687      <type>numeric</>.
 688     </para>
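
         <para>
          One way to observe the initially assigned types is sketched
          below.  It assumes that the function
          <function>pg_typeof</function> is available in the server
          version being used.  The comments give the type that the rules
          above predict:
     <programlisting>
     SELECT pg_typeof(42);                     -- integer
     SELECT pg_typeof(3000000000);             -- bigint: too large for 32 bits
     SELECT pg_typeof(10000000000000000000);   -- numeric: too large for 64 bits
     SELECT pg_typeof(4.);                     -- numeric
     SELECT pg_typeof(5e2);                    -- numeric
     </programlisting>
         </para>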
 689
 690     <para>
 691      The initially assigned data type of a numeric constant is just a
 692      starting point for the type resolution algorithms.  In most cases
 693      the constant will be automatically coerced to the most
 694      appropriate type depending on context.  When necessary, you can
 695      force a numeric value to be interpreted as a specific data type
 696      by casting it.<indexterm><primary>type cast</primary></indexterm>
 697      For example, you can force a numeric value to be treated as type
 698      <type>real</> (<type>float4</>) by writing:
 699
 700 <programlisting>
 701 REAL '1.23'  -- string style
 702 1.23::REAL   -- PostgreSQL (historical) style
 703 </programlisting>
 704
 705      These are actually just special cases of the general casting
 706      notations discussed next.
 707     </para>
 708    </sect3>
 709
 710    <sect3 id="sql-syntax-constants-generic">
 711     <title>Constants of Other Types</title>
 712
 713     <indexterm>
 714      <primary>data type</primary>
 715      <secondary>constant</secondary>
 716     </indexterm>
 717
 718     <para>
 719      A constant of an <emphasis>arbitrary</emphasis> type can be
 720      entered using any one of the following notations:
 721 <synopsis>
 722 <replaceable>type</replaceable> '<replaceable>string</replaceable>'
 723 '<replaceable>string</replaceable>'::<replaceable>type</replaceable>
 724 CAST ( '<replaceable>string</replaceable>' AS <replaceable>type</replaceable> )
 725 </synopsis>
 726      The string constant's text is passed to the input conversion
 727      routine for the type called <replaceable>type</replaceable>. The
 728      result is a constant of the indicated type.  The explicit type
 729      cast can be omitted if there is no ambiguity as to the type the
 730      constant must be (for example, when it is assigned directly to a
 731      table column), in which case it is automatically coerced.
 732     </para>
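
         <para>
          For illustration, the three notations can be applied to the same
          string; the date value is arbitrary, and each statement should
          yield the same constant of type <type>date</>:
     <programlisting>
     SELECT DATE '2009-09-21';
     SELECT '2009-09-21'::date;
     SELECT CAST('2009-09-21' AS date);
     </programlisting>
         </para>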
 733
 734     <para>
 735      The string constant can be written using either regular SQL
 736      notation or dollar-quoting.
 737     </para>
 738
 739     <para>
 740      It is also possible to specify a type coercion using a function-like
 741      syntax:
 742 <synopsis>
 743 <replaceable>typename</replaceable> ( '<replaceable>string</replaceable>' )
 744 </synopsis>
 745      but not all type names can be used in this way; see <xref
 746      linkend="sql-syntax-type-casts"> for details.
 747     </para>
 748
 749     <para>
 750      The <literal>::</literal>, <literal>CAST()</literal>, and
 751      function-call syntaxes can also be used to specify run-time type
 752      conversions of arbitrary expressions, as discussed in <xref
 753      linkend="sql-syntax-type-casts">.  To avoid syntactic ambiguity, the
 754      <literal><replaceable>type</> '<replaceable>string</>'</literal>
 755      syntax can only be used to specify the type of a simple literal constant.
 756      Another restriction on the
 757      <literal><replaceable>type</> '<replaceable>string</>'</literal>
 758      syntax is that it does not work for array types; use <literal>::</literal>
 759      or <literal>CAST()</literal> to specify the type of an array constant.
 760     </para>
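
         <para>
          As a brief sketch of the array restriction, using an arbitrary
          three-element array, the following forms are accepted, whereas
          writing the type name in front of the string
          (<literal>integer[] '{1,2,3}'</literal>) is not:
     <programlisting>
     SELECT '{1,2,3}'::integer[];
     SELECT CAST('{1,2,3}' AS integer[]);
     </programlisting>
         </para>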
 761
 762     <para>
 763      The <literal>CAST()</> syntax conforms to SQL.  The
 764      <literal><replaceable>type</> '<replaceable>string</>'</literal>
 765      syntax is a generalization of the standard: SQL specifies this syntax only
 766      for a few data types, but <productname>PostgreSQL</productname> allows it
 767      for all types.  The syntax with
 768      <literal>::</literal> is historical <productname>PostgreSQL</productname>
 769      usage, as is the function-call syntax.
 770     </para>
 771    </sect3>
 772   </sect2>
 773
 774   <sect2 id="sql-syntax-operators">
 775    <title>Operators</title>
 776
 777    <indexterm zone="sql-syntax-operators">
 778     <primary>operator</primary>
 779     <secondary>syntax</secondary>
 780    </indexterm>
 781
 782    <para>
 783     An operator name is a sequence of up to <symbol>NAMEDATALEN</symbol>-1
 784     (63 by default) characters from the following list:
 785 <literallayout>
 786 + - * / &lt; &gt; = ~ ! @ # % ^ &amp; | ` ?
 787 </literallayout>
 788
 789     There are a few restrictions on operator names, however:
 790     <itemizedlist>
 791      <listitem>
 792       <para>
 793        <literal>--</literal> and <literal>/*</literal> cannot appear
 794        anywhere in an operator name, since they will be taken as the
 795        start of a comment.
 796       </para>
 797      </listitem>
 798
 799      <listitem>
 800       <para>
 801        A multiple-character operator name cannot end in <literal>+</> or <literal>-</>,
 802        unless the name also contains at least one of these characters:
 803 <literallayout>
 804 ~ ! @ # % ^ &amp; | ` ?
 805 </literallayout>
 806        For example, <literal>@-</literal> is an allowed operator name,
 807        but <literal>*-</literal> is not.  This restriction allows
 808        <productname>PostgreSQL</productname> to parse SQL-compliant
 809        queries without requiring spaces between tokens.
 810       </para>
 811      </listitem>
 812     </itemizedlist>
 813    </para>
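
        <para>
         A small illustration of the second restriction: because
         <literal>*-</literal> is not a valid operator name, the two
         statements below are read the same way, as a multiplication
         followed by a unary minus, and both yield -6:
     <programlisting>
     SELECT 2*-3;
     SELECT 2 * -3;
     </programlisting>
        </para>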
 814
 815    <para>
 816     When working with non-SQL-standard operator names, you will usually
 817     need to separate adjacent operators with spaces to avoid ambiguity.
 818     For example, if you have defined a left unary operator named <literal>@</literal>,
 819     you cannot write <literal>X*@Y</literal>; you must write
 820     <literal>X* @Y</literal> to ensure that
 821     <productname>PostgreSQL</productname> reads it as two operator names
 822     rather than one.
 823    </para>
 824   </sect2>
 825
 826   <sect2>
 827    <title>Special Characters</title>
 828
 829   <para>
 830    Some characters that are not alphanumeric have a special meaning
 831    that is different from being an operator.  Details on the usage can
 832    be found at the location where the respective syntax element is
 833    described.  This section exists only to point out these characters
 834    and summarize their purposes.
 835
 836    <itemizedlist>
 837     <listitem>
 838      <para>
 839       A dollar sign (<literal>$</literal>) followed by digits is used
 840       to represent a positional parameter in the body of a function
 841       definition or a prepared statement.  In other contexts the
 842       dollar sign can be part of an identifier or a dollar-quoted string
 843       constant.
 844      </para>
 845     </listitem>
 846
 847     <listitem>
 848      <para>
 849       Parentheses (<literal>()</literal>) have their usual meaning to
 850       group expressions and enforce precedence.  In some cases
 851       parentheses are required as part of the fixed syntax of a
 852       particular SQL command.
 853      </para>
 854     </listitem>
 855
 856     <listitem>
 857      <para>
 858       Brackets (<literal>[]</literal>) are used to select the elements
 859       of an array.  See <xref linkend="arrays"> for more information
 860       on arrays.
 861      </para>
 862     </listitem>
 863
 864     <listitem>
 865      <para>
 866       Commas (<literal>,</literal>) are used in some syntactical
 867       constructs to separate the elements of a list.
 868      </para>
 869     </listitem>
 870
 871     <listitem>
 872      <para>
 873       The semicolon (<literal>;</literal>) terminates an SQL command.
 874       It cannot appear anywhere within a command, except within a
 875       string constant or quoted identifier.
 876      </para>
 877     </listitem>
 878
 879     <listitem>
 880      <para>
 881       The colon (<literal>:</literal>) is used to select
 882       <quote>slices</quote> from arrays. (See <xref
 883       linkend="arrays">.)  In certain SQL dialects (such as Embedded
 884       SQL), the colon is used to prefix variable names.
 885      </para>
 886     </listitem>
 887
 888     <listitem>
 889      <para>
 890       The asterisk (<literal>*</literal>) is used in some contexts to denote
 891       all the fields of a table row or composite value.  It also
 892       has a special meaning when used as the argument of an
 893       aggregate function, namely that the aggregate does not require
 894       any explicit parameter.
 895      </para>
 896     </listitem>
 897
 898     <listitem>
 899      <para>
 900       The period (<literal>.</literal>) is used in numeric
 901       constants, and to separate schema, table, and column names.
 902      </para>
 903     </listitem>
 904    </itemizedlist>
 905
 906    </para>
 907   </sect2>
 908
 909   <sect2 id="sql-syntax-comments">
 910    <title>Comments</title>
 911
 912    <indexterm zone="sql-syntax-comments">
 913     <primary>comment</primary>
 914     <secondary sortas="SQL">in SQL</secondary>
 915    </indexterm>
 916
 917    <para>
 918     A comment is a sequence of characters beginning with
 919     double dashes and extending to the end of the line, e.g.:
 920 <programlisting>
 921 -- This is a standard SQL comment
 922 </programlisting>
 923    </para>
 924
 925    <para>
 926     Alternatively, C-style block comments can be used:
 927 <programlisting>
 928 /* multiline comment
 929  * with nesting: /* nested block comment */
 930  */
 931 </programlisting>
 932     where the comment begins with <literal>/*</literal> and extends to
 933     the matching occurrence of <literal>*/</literal>. These block
 934     comments nest, as specified in the SQL standard but unlike C, so that one can
 935     comment out larger blocks of code that might contain existing block
 936     comments.
 937    </para>
 938
 939    <para>
 940     A comment is removed from the input stream before further syntax
 941     analysis and is effectively replaced by whitespace.
 942    </para>
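
        <para>
         As a short sketch, a nested block comment placed in the middle of
         an expression behaves like whitespace, so the statement below is
         equivalent to <literal>SELECT 1 + 2</literal>:
     <programlisting>
     SELECT 1 /* outer /* nested */ still a comment */ + 2;
     </programlisting>
        </para>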
 943   </sect2>
 944
 945   <sect2 id="sql-precedence">
 946    <title>Lexical Precedence</title>
 947
 948    <indexterm zone="sql-precedence">
 949     <primary>operator</primary>
 950     <secondary>precedence</secondary>
 951    </indexterm>
 952
 953    <para>
 954     <xref linkend="sql-precedence-table"> shows the precedence and
 955     associativity of the operators in <productname>PostgreSQL</>.
 956     Most operators have the same precedence and are left-associative.
 957     The precedence and associativity of the operators is hard-wired
 958     into the parser.  This can lead to non-intuitive behavior; for
 959     example, the comparison operators <literal>&lt;</> and
 960     <literal>&gt;</> have a different precedence than the comparison
 961     operators <literal>&lt;=</> and <literal>&gt;=</>.  Also, you will
 962     sometimes need to add parentheses when using combinations of
 963     binary and unary operators.  For instance:
 964 <programlisting>
 965 SELECT 5 ! - 6;
 966 </programlisting>
 967    will be parsed as:
 968 <programlisting>
 969 SELECT 5 ! (- 6);
 970 </programlisting>
 971     because the parser has no idea &mdash; until it is too late
 972     &mdash; that <token>!</token> is defined as a postfix operator,
 973     not an infix one.  To get the desired behavior in this case, you
 974     must write:
 975 <programlisting>
 976 SELECT (5 !) - 6;
 977 </programlisting>
 978     This is the price one pays for extensibility.
 979    </para>
 980
 981    <table id="sql-precedence-table">
 982     <title>Operator Precedence (decreasing)</title>
 983
 984     <tgroup cols="3">
 985      <thead>
 986       <row>
 987        <entry>Operator/Element</entry>
 988        <entry>Associativity</entry>
 989        <entry>Description</entry>
 990       </row>
 991      </thead>
 992
 993      <tbody>
 994       <row>
 995        <entry><token>.</token></entry>
 996        <entry>left</entry>
 997        <entry>table/column name separator</entry>
 998       </row>
 999
1000       <row>
1001        <entry><token>::</token></entry>
1002        <entry>left</entry>
1003        <entry><productname>PostgreSQL</productname>-style typecast</entry>
1004       </row>
1005
1006       <row>
1007        <entry><token>[</token> <token>]</token></entry>
1008        <entry>left</entry>
1009        <entry>array element selection</entry>
1010       </row>
1011
1012       <row>
1013        <entry><token>-</token></entry>
1014        <entry>right</entry>
1015        <entry>unary minus</entry>
1016       </row>
1017
1018       <row>
1019        <entry><token>^</token></entry>
1020        <entry>left</entry>
1021        <entry>exponentiation</entry>
1022       </row>
1023
1024       <row>
1025        <entry><token>*</token> <token>/</token> <token>%</token></entry>
1026        <entry>left</entry>
1027        <entry>multiplication, division, modulo</entry>
1028       </row>
1029
1030       <row>
1031        <entry><token>+</token> <token>-</token></entry>
1032        <entry>left</entry>
1033        <entry>addition, subtraction</entry>
1034       </row>
1035
1036       <row>
1037        <entry><token>IS</token></entry>
1038        <entry></entry>
1039        <entry><literal>IS TRUE</>, <literal>IS FALSE</>, <literal>IS UNKNOWN</>, <literal>IS NULL</></entry>
1040       </row>
1041
1042       <row>
1043        <entry><token>ISNULL</token></entry>
1044        <entry></entry>
1045        <entry>test for null</entry>
1046       </row>
1047
1048       <row>
1049        <entry><token>NOTNULL</token></entry>
1050        <entry></entry>
1051        <entry>test for not null</entry>
1052       </row>
1053
1054       <row>
1055        <entry>(any other)</entry>
1056        <entry>left</entry>
1057        <entry>all other native and user-defined operators</entry>
1058       </row>
1059
1060       <row>
1061        <entry><token>IN</token></entry>
1062        <entry></entry>
1063        <entry>set membership</entry>
1064       </row>
1065
1066       <row>
1067        <entry><token>BETWEEN</token></entry>
1068        <entry></entry>
1069        <entry>range containment</entry>
1070       </row>
1071
1072       <row>
1073        <entry><token>OVERLAPS</token></entry>
1074        <entry></entry>
1075        <entry>time interval overlap</entry>
1076       </row>
1077
1078       <row>
1079        <entry><token>LIKE</token> <token>ILIKE</token> <token>SIMILAR</token></entry>
1080        <entry></entry>
1081        <entry>string pattern matching</entry>
1082       </row>
1083
1084       <row>
1085        <entry><token>&lt;</token> <token>&gt;</token></entry>
1086        <entry></entry>
1087        <entry>less than, greater than</entry>
1088       </row>
1089
1090       <row>
1091        <entry><token>=</token></entry>
1092        <entry>right</entry>
1093        <entry>equality, assignment</entry>
1094       </row>
1095
1096       <row>
1097        <entry><token>NOT</token></entry>
1098        <entry>right</entry>
1099        <entry>logical negation</entry>
1100       </row>
1101
1102       <row>
1103        <entry><token>AND</token></entry>
1104        <entry>left</entry>
1105        <entry>logical conjunction</entry>
1106       </row>
1107
1108       <row>
1109        <entry><token>OR</token></entry>
1110        <entry>left</entry>
1111        <entry>logical disjunction</entry>
1112       </row>
1113      </tbody>
1114     </tgroup>
1115    </table>
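
        <para>
         The following sketch shows how a few entries of the table play out
         in practice.  The results in the comments assume the precedence and
         associativity listed above:
     <programlisting>
     SELECT 2 + 3 * 4;    -- 14: * binds tighter than +
     SELECT (2 + 3) * 4;  -- 20: parentheses override precedence
     SELECT - 2 ^ 2;      -- 4: unary minus binds tighter than ^
     SELECT 2 ^ 3 ^ 2;    -- 64: ^ is left-associative, so (2 ^ 3) ^ 2
     </programlisting>
        </para>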
1116
1117    <para>
1118     Note that the operator precedence rules also apply to user-defined
1119     operators that have the same names as the built-in operators
1120     mentioned above.  For example, if you define a
1121     <quote>+</quote> operator for some custom data type it will have
1122     the same precedence as the built-in <quote>+</quote> operator, no
1123     matter what yours does.
1124    </para>
1125
1126    <para>
1127     When a schema-qualified operator name is used in the
1128     <literal>OPERATOR</> syntax, as for example in:
1129 <programlisting>
1130 SELECT 3 OPERATOR(pg_catalog.+) 4;
1131 </programlisting>
1132     the <literal>OPERATOR</> construct is taken to have the default precedence
1133     shown in <xref linkend="sql-precedence-table"> for <quote>any other</> operator.  This is true no matter
1134     which specific operator appears inside <literal>OPERATOR()</>.
1135    </para>
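
        <para>
         One consequence, sketched here under the precedence rules given in
         the table, is that an expression can group differently once an
         operator is written in the <literal>OPERATOR</> form:
     <programlisting>
     SELECT 2 + 3 * 4;                       -- grouped as 2 + (3 * 4)
     SELECT 2 + 3 OPERATOR(pg_catalog.*) 4;  -- grouped as (2 + 3) * 4
     </programlisting>
         Adding explicit parentheses avoids any surprise.
        </para>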
1136   </sect2>
1137  </sect1>
1138
1139  <sect1 id="sql-expressions">
1140   <title>Value Expressions</title>
1141
1142   <indexterm zone="sql-expressions">
1143    <primary>expression</primary>
1144    <secondary>syntax</secondary>
1145   </indexterm>
1146
1147   <indexterm zone="sql-expressions">
1148    <primary>value expression</primary>
1149   </indexterm>
1150
1151   <indexterm>
1152    <primary>scalar</primary>
1153    <see>expression</see>
1154   </indexterm>
1155
1156   <para>
1157    Value expressions are used in a variety of contexts, such
1158    as in the target list of the <command>SELECT</command> command, as
1159    new column values in <command>INSERT</command> or
1160    <command>UPDATE</command>, or in search conditions in a number of
1161    commands.  The result of a value expression is sometimes called a
1162    <firstterm>scalar</firstterm>, to distinguish it from the result of
1163    a table expression (which is a table).  Value expressions are
1164    therefore also called <firstterm>scalar expressions</firstterm> (or
1165    even simply <firstterm>expressions</firstterm>).  The expression
1166    syntax allows the calculation of values from primitive parts using
1167    arithmetic, logical, set, and other operations.
1168   </para>
1169
1170   <para>
1171    A value expression is one of the following:
1172
1173    <itemizedlist>
1174     <listitem>
1175      <para>
1176       A constant or literal value
1177      </para>
1178     </listitem>
1179
1180     <listitem>
1181      <para>
1182       A column reference
1183      </para>
1184     </listitem>
1185
1186     <listitem>
1187      <para>
1188       A positional parameter reference, in the body of a function definition
1189       or prepared statement
1190      </para>
1191     </listitem>
1192
1193     <listitem>
1194      <para>
1195       A subscripted expression
1196      </para>
1197     </listitem>
1198
1199     <listitem>
1200      <para>
1201       A field selection expression
1202      </para>
1203     </listitem>
1204
1205     <listitem>
1206      <para>
1207       An operator invocation
1208      </para>
1209     </listitem>
1210
1211     <listitem>
1212      <para>
1213       A function call
1214      </para>
1215     </listitem>
1216
1217     <listitem>
1218      <para>
1219       An aggregate expression
1220      </para>
1221     </listitem>
1222
1223     <listitem>
1224      <para>
1225       A window function call
1226      </para>
1227     </listitem>
1228
1229     <listitem>
1230      <para>
1231       A type cast
1232      </para>
1233     </listitem>
1234
1235     <listitem>
1236      <para>
1237       A scalar subquery
1238      </para>
1239     </listitem>
1240
1241     <listitem>
1242      <para>
1243       An array constructor
1244      </para>
1245     </listitem>
1246
1247     <listitem>
1248      <para>
1249       A row constructor
1250      </para>
1251     </listitem>
1252
1253     <listitem>
1254      <para>
1255       Another value expression in parentheses (used to group
1256       subexpressions and override
1257       precedence<indexterm><primary>parenthesis</></>)
1258      </para>
1259     </listitem>
1260    </itemizedlist>
1261   </para>
1262
1263   <para>
1264    In addition to this list, there are a number of constructs that can
1265    be classified as an expression but do not follow any general syntax
1266    rules.  These generally have the semantics of a function or
1267    operator and are explained in the appropriate location in <xref
1268    linkend="functions">.  An example is the <literal>IS NULL</literal>
1269    clause.
1270   </para>
1271
1272   <para>
1273    We have already discussed constants in <xref
1274    linkend="sql-syntax-constants">.  The following sections discuss
1275    the remaining options.
1276   </para>
1277
1278   <sect2>
1279    <title>Column References</title>
1280
1281    <indexterm>
1282     <primary>column reference</primary>
1283    </indexterm>
1284
1285    <para>
1286     A column can be referenced in the form:
1287 <synopsis>
1288 <replaceable>correlation</replaceable>.<replaceable>columnname</replaceable>
1289 </synopsis>
1290    </para>
1291
1292    <para>
1293     <replaceable>correlation</replaceable> is the name of a
1294     table (possibly qualified with a schema name), or an alias for a table
1295     defined by means of a <literal>FROM</literal> clause, or one of
1296     the key words <literal>NEW</literal> or <literal>OLD</literal>.
1297     (<literal>NEW</literal> and <literal>OLD</literal> can only appear in rewrite rules,
1298     while other correlation names can be used in any SQL statement.)
1299     The correlation name and separating dot can be omitted if the column name
1300     is unique across all the tables being used in the current query.  (See also <xref linkend="queries">.)
1301    </para>
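
        <para>
         As an illustration, assume two hypothetical tables,
         <structname>emp</> and <structname>dept</>, that both have a
         <structfield>name</> column.  The references to
         <structfield>name</> must then be qualified, here once with a table
         name and once with an alias:
     <programlisting>
     SELECT emp.name, d.name
         FROM emp, dept AS d
         WHERE emp.dept_id = d.id;
     </programlisting>
        </para>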
1302   </sect2>
1303
1304   <sect2>
1305    <title>Positional Parameters</title>
1306
1307    <indexterm>
1308     <primary>parameter</primary>
1309     <secondary>syntax</secondary>
1310    </indexterm>
1311
1312    <indexterm>
1313     <primary>$</primary>
1314    </indexterm>
1315
1316    <para>
1317     A positional parameter reference is used to indicate a value
1318     that is supplied externally to an SQL statement.  Parameters are
1319     used in SQL function definitions and in prepared queries.  Some
1320     client libraries also support specifying data values separately
1321     from the SQL command string, in which case parameters are used to
1322     refer to the out-of-line data values.
1323     The form of a parameter reference is:
1324 <synopsis>
1325 $<replaceable>number</replaceable>
1326 </synopsis>
1327    </para>
1328
1329    <para>
1330     For example, consider the definition of a function,
1331     <function>dept</function>, as:
1332
1333 <programlisting>
1334 CREATE FUNCTION dept(text) RETURNS dept
1335     AS $$ SELECT * FROM dept WHERE name = $1 $$
1336     LANGUAGE SQL;
1337 </programlisting>
1338
1339     Here the <literal>$1</literal> references the value of the first
1340     function argument whenever the function is invoked.
1341    </para>
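
        <para>
         Parameters in a prepared query work the same way.  In this sketch,
         the statement name <literal>dept_by_name</literal> and the value
         <literal>'sales'</literal> are made up for illustration;
         <literal>$1</literal> is replaced by the value supplied to
         <command>EXECUTE</command>:
     <programlisting>
     PREPARE dept_by_name (text) AS
         SELECT * FROM dept WHERE name = $1;
     EXECUTE dept_by_name('sales');
     </programlisting>
        </para>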
1342   </sect2>
1343
1344   <sect2>
1345    <title>Subscripts</title>
1346
1347    <indexterm>
1348     <primary>subscript</primary>
1349    </indexterm>
1350
1351    <para>
1352     If an expression yields a value of an array type, then a specific
1353     element of the array value can be extracted by writing
1354 <synopsis>
1355 <replaceable>expression</replaceable>[<replaceable>subscript</replaceable>]
1356 </synopsis>
1357     or multiple adjacent elements (an <quote>array slice</>) can be extracted
1358     by writing
1359 <synopsis>
1360 <replaceable>expression</replaceable>[<replaceable>lower_subscript</replaceable>:<replaceable>upper_subscript</replaceable>]
1361 </synopsis>
1362     (Here, the brackets <literal>[ ]</literal> are meant to appear literally.)
1363     Each <replaceable>subscript</replaceable> is itself an expression,
1364     which must yield an integer value.
1365    </para>
1366
1367    <para>
1368     In general, the array <replaceable>expression</replaceable> must be
1369     parenthesized, but the parentheses can be omitted when the expression
1370     to be subscripted is just a column reference or positional parameter.
1371     Also, multiple subscripts can be concatenated when the original array
1372     is multidimensional.
1373     For example:
1374
1375 <programlisting>
1376 mytable.arraycolumn[4]
1377 mytable.two_d_column[17][34]
1378 $1[10:42]
1379 (arrayfunction(a,b))[42]
1380 </programlisting>
1381
1382     The parentheses in the last example are required.
1383     See <xref linkend="arrays"> for more about arrays.
1384    </para>
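
        <para>
         A self-contained sketch using an array constructor, which must be
         parenthesized because it is neither a column reference nor a
         positional parameter (array subscripts start at 1 by default):
     <programlisting>
     SELECT (ARRAY[10,20,30,40])[2];    -- element: 20
     SELECT (ARRAY[10,20,30,40])[2:3];  -- slice: {20,30}
     </programlisting>
        </para>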
1385   </sect2>
1386
1387   <sect2>
1388    <title>Field Selection</title>
1389
1390    <indexterm>
1391     <primary>field selection</primary>
1392    </indexterm>
1393
1394    <para>
1395     If an expression yields a value of a composite type (row type), then a
1396     specific field of the row can be extracted by writing
1397 <synopsis>
1398 <replaceable>expression</replaceable>.<replaceable>fieldname</replaceable>
1399 </synopsis>
1400    </para>
1401
1402    <para>
1403     In general, the row <replaceable>expression</replaceable> must be
1404     parenthesized, but the parentheses can be omitted when the expression
1405     to be selected from is just a table reference or positional parameter.
1406     For example:
1407
1408 <programlisting>
1409 mytable.mycolumn
1410 $1.somecolumn
1411 (rowfunction(a,b)).col3
1412 </programlisting>
1413
1414     (Thus, a qualified column reference is actually just a special case
1415     of the field selection syntax.)  An important special case is
1416     extracting a field from a table column that is of a composite type:
1417
1418 <programlisting>
1419 (compositecol).somefield
1420 (mytable.compositecol).somefield
1421 </programlisting>
1422
1423     The parentheses are required here to show that
1424     <structfield>compositecol</> is a column name, not a table name,
1425     or, in the second case, that <structname>mytable</> is a table name,
1426     not a schema name.
1427    </para>
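
        <para>
         A minimal sketch of field selection from a general row expression,
         assuming a hypothetical composite type <type>complex</> created
         only for this example:
     <programlisting>
     CREATE TYPE complex AS (r double precision, i double precision);
     SELECT (CAST(ROW(1.5, 2.5) AS complex)).r;  -- selects field r
     </programlisting>
         The parentheses around the cast are required, since the expression
         is neither a table reference nor a positional parameter.
        </para>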
1428   </sect2>
1429
1430   <sect2>
1431    <title>Operator Invocations</title>
1432
1433    <indexterm>
1434     <primary>operator</primary>
1435     <secondary>invocation</secondary>
1436    </indexterm>
1437
1438    <para>
1439     There are three possible syntaxes for an operator invocation:
1440     <simplelist>
1441      <member><replaceable>expression</replaceable> <replaceable>operator</replaceable> <replaceable>expression</replaceable> (binary infix operator)</member>
1442      <member><replaceable>operator</replaceable> <replaceable>expression</replaceable> (unary prefix operator)</member>
1443      <member><replaceable>expression</replaceable> <replaceable>operator</replaceable> (unary postfix operator)</member>
1444     </simplelist>
1445     where the <replaceable>operator</replaceable> token follows the syntax
1446     rules of <xref linkend="sql-syntax-operators">, or is one of the
1447     key words <token>AND</token>, <token>OR</token>, and
1448     <token>NOT</token>, or is a qualified operator name in the form:
1449 <synopsis>
1450 <literal>OPERATOR(</><replaceable>schema</><literal>.</><replaceable>operatorname</><literal>)</>
1451 </synopsis>
1452     Which particular operators exist and whether
1453     they are unary or binary depends on what operators have been
1454     defined by the system or the user.  <xref linkend="functions">
1455     describes the built-in operators.
1456    </para>
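
        <para>
         The three forms can be sketched with built-in operators.  The
         postfix example assumes a release that provides the postfix
         factorial operator <literal>!</literal>, which also appears earlier
         in this chapter:
     <programlisting>
     SELECT 3 + 4;   -- binary infix operator
     SELECT @ -5;    -- unary prefix operator (absolute value)
     SELECT 5 !;     -- unary postfix operator (factorial)
     </programlisting>
        </para>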
1457   </sect2>
1458
1459   <sect2>
1460    <title>Function Calls</title>
1461
1462    <indexterm>
1463     <primary>function</primary>
1464     <secondary>invocation</secondary>
1465    </indexterm>
1466
1467    <para>
1468     The syntax for a function call is the name of a function
1469     (possibly qualified with a schema name), followed by its argument list
1470     enclosed in parentheses:
1471
1472 <synopsis>
1473 <replaceable>function_name</replaceable> (<optional><replaceable>expression</replaceable> <optional>, <replaceable>expression</replaceable> ... </optional></optional> )
1474 </synopsis>
1475    </para>
1476
1477    <para>
1478     For example, the following computes the square root of 2:
1479 <programlisting>
1480 sqrt(2)
1481 </programlisting>
1482    </para>
1483
1484    <para>
1485     The list of built-in functions is in <xref linkend="functions">.
1486     Other functions can be added by the user.
1487    </para>
1488   </sect2>
1489
1490   <sect2 id="syntax-aggregates">
1491    <title>Aggregate Expressions</title>
1492
1493    <indexterm zone="syntax-aggregates">
1494     <primary>aggregate function</primary>
1495     <secondary>invocation</secondary>
1496    </indexterm>
1497
1498    <para>
1499     An <firstterm>aggregate expression</firstterm> represents the
1500     application of an aggregate function across the rows selected by a
1501     query.  An aggregate function reduces multiple inputs to a single
1502     output value, such as the sum or average of the inputs.  The
1503     syntax of an aggregate expression is one of the following:
1504
1505 <synopsis>
1506 <replaceable>aggregate_name</replaceable> (<replaceable>expression</replaceable> [ , ... ] )
1507 <replaceable>aggregate_name</replaceable> (ALL <replaceable>expression</replaceable> [ , ... ] )
1508 <replaceable>aggregate_name</replaceable> (DISTINCT <replaceable>expression</replaceable>)
1509 <replaceable>aggregate_name</replaceable> ( * )
1510 </synopsis>
1511
1512     where <replaceable>aggregate_name</replaceable> is a previously
1513     defined aggregate (possibly qualified with a schema name), and
1514     <replaceable>expression</replaceable> is
1515     any value expression that does not itself contain an aggregate
1516     expression or a window function call.
1517    </para>
1518
1519    <para>
1520     The first form of aggregate expression invokes the aggregate
1521     across all input rows for which the given expression(s) yield
1522     non-null values.  (Actually, it is up to the aggregate function
1523     whether to ignore null values or not &mdash; but all the standard ones do.)
1524     The second form is the same as the first, since
1525     <literal>ALL</literal> is the default.  The third form invokes the
1526     aggregate for all distinct non-null values of the expressions found
1527     in the input rows.  The last form invokes the aggregate once for
1528     each input row regardless of null or non-null values; since no
1529     particular input value is specified, it is generally only useful
1530     for the <function>count(*)</function> aggregate function.
1531    </para>
1532
1533    <para>
1534     For example, <literal>count(*)</literal> yields the total number
1535     of input rows; <literal>count(f1)</literal> yields the number of
1536     input rows in which <literal>f1</literal> is non-null;
1537     <literal>count(distinct f1)</literal> yields the number of
1538     distinct non-null values of <literal>f1</literal>.
1539    </para>
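
        <para>
         The difference can be seen on a small inline data set; the alias
         <literal>t</literal> and the values are chosen only for
         illustration:
     <programlisting>
     SELECT count(*), count(f1), count(DISTINCT f1)
         FROM (VALUES (1), (1), (2), (NULL)) AS t(f1);
     -- count(*) = 4, count(f1) = 3, count(DISTINCT f1) = 2
     </programlisting>
        </para>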
1540
1541    <para>
1542     The predefined aggregate functions are described in <xref
1543     linkend="functions-aggregate">.  Other aggregate functions can be added
1544     by the user.
1545    </para>
1546
1547    <para>
1548     An aggregate expression can only appear in the result list or
1549     <literal>HAVING</> clause of a <command>SELECT</> command.
1550     It is forbidden in other clauses, such as <literal>WHERE</>,
1551     because those clauses are logically evaluated before the results
1552     of aggregates are formed.
1553    </para>
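
        <para>
         As a sketch, assume a hypothetical table <structname>emp</> with
         columns <structfield>dept_id</> and <structfield>salary</>.  A
         condition on an aggregate result belongs in <literal>HAVING</>,
         which is applied after the groups have been formed:
     <programlisting>
     -- Rejected, because WHERE is evaluated before aggregation:
     --   SELECT dept_id FROM emp WHERE avg(salary) &gt; 50000 GROUP BY dept_id;

     SELECT dept_id, avg(salary)
         FROM emp
         GROUP BY dept_id
         HAVING avg(salary) &gt; 50000;
     </programlisting>
        </para>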
1554
1555    <para>
1556     When an aggregate expression appears in a subquery (see
1557     <xref linkend="sql-syntax-scalar-subqueries"> and
1558     <xref linkend="functions-subquery">), the aggregate is normally
1559     evaluated over the rows of the subquery.  But an exception occurs
1560     if the aggregate's arguments contain only outer-level variables:
1561     the aggregate then belongs to the nearest such outer level, and is
1562     evaluated over the rows of that query.  The aggregate expression
1563     as a whole is then an outer reference for the subquery it appears in,
1564     and acts as a constant over any one evaluation of that subquery.
1565     The restriction about
1566     appearing only in the result list or <literal>HAVING</> clause
1567     applies with respect to the query level that the aggregate belongs to.
1568    </para>
1569
1570    <note>
1571     <para>
1572      <productname>PostgreSQL</productname> currently does not support
1573      <literal>DISTINCT</> with more than one input expression.
1574     </para>
1575    </note>
1576   </sect2>
1577
1578   <sect2 id="syntax-window-functions">
1579    <title>Window Function Calls</title>
1580
1581    <indexterm zone="syntax-window-functions">
1582     <primary>window function</primary>
1583     <secondary>invocation</secondary>
1584    </indexterm>
1585
1586    <indexterm zone="syntax-window-functions">
1587     <primary>OVER clause</primary>
1588    </indexterm>
1589
1590    <para>
1591     A <firstterm>window function call</firstterm> represents the application
1592     of an aggregate-like function over some portion of the rows selected
1593     by a query.  Unlike regular aggregate function calls, this is not tied
1594     to grouping of the selected rows into a single output row &mdash; each
1595     row remains separate in the query output.  However, the window function
1596     is able to scan all the rows that would be part of the current row's
1597     group according to the grouping specification (<literal>PARTITION BY</>
1598     list) of the window function call.
1599     The syntax of a window function call is one of the following:
1600
1601 <synopsis>
1602 <replaceable>function_name</replaceable> (<optional><replaceable>expression</replaceable> <optional>, <replaceable>expression</replaceable> ... </optional></optional>) OVER ( <replaceable class="parameter">window_definition</replaceable> )
1603 <replaceable>function_name</replaceable> (<optional><replaceable>expression</replaceable> <optional>, <replaceable>expression</replaceable> ... </optional></optional>) OVER <replaceable>window_name</replaceable>
1604 <replaceable>function_name</replaceable> ( * ) OVER ( <replaceable class="parameter">window_definition</replaceable> )
1605 <replaceable>function_name</replaceable> ( * ) OVER <replaceable>window_name</replaceable>
1606 </synopsis>
1607     where <replaceable class="parameter">window_definition</replaceable>
1608     has the syntax
1609 <synopsis>
1610 [ <replaceable class="parameter">existing_window_name</replaceable> ]
1611 [ PARTITION BY <replaceable class="parameter">expression</replaceable> [, ...] ]
1612 [ ORDER BY <replaceable class="parameter">expression</replaceable> [ ASC | DESC | USING <replaceable class="parameter">operator</replaceable> ] [ NULLS { FIRST | LAST } ] [, ...] ]
1613 [ <replaceable class="parameter">frame_clause</replaceable> ]
1614 </synopsis>
1615     and the optional <replaceable class="parameter">frame_clause</replaceable>
1616     can be one of
1617 <synopsis>
1618 RANGE UNBOUNDED PRECEDING
1619 RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
1620 RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
1621 ROWS UNBOUNDED PRECEDING
1622 ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
1623 ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
1624 </synopsis>
1625
1626     Here, <replaceable>expression</replaceable> represents any value
1627     expression that does not itself contain window function calls.
1628     The <literal>PARTITION BY</> and <literal>ORDER BY</> lists have
1629     essentially the same syntax and semantics as <literal>GROUP BY</>
1630     and <literal>ORDER BY</> clauses of the whole query, except that their
1631     expressions are always just expressions and cannot be output-column
1632     names or numbers.
1633     <replaceable>window_name</replaceable> is a reference to a named window
1634     specification defined in the query's <literal>WINDOW</literal> clause.
1635     Named window specifications are usually referenced with just
1636     <literal>OVER</> <replaceable>window_name</replaceable>, but it is
1637     also possible to write a window name inside the parentheses and then
1638     optionally supply an ordering clause and/or frame clause (the referenced
1639     window must lack these clauses, if they are supplied here).
1640     This latter syntax follows the same rules as modifying an existing
1641     window name within the <literal>WINDOW</literal> clause; see the
1642     <xref linkend="sql-select" endterm="sql-select-title"> reference
1643     page for details.
1644    </para>
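
        <para>
         As an illustrative sketch of these forms, assume a hypothetical table
         <literal>empsalary(depname, empno, salary)</literal> that is not defined
         in this chapter; the window name <literal>w</literal> is likewise
         arbitrary.  The first two queries should be equivalent, one writing the
         window definition inline and the other referencing a named window.  The
         third writes the window name inside the parentheses and adds an ordering
         clause that the named window lacks:
     <programlisting>
     SELECT depname, empno, salary,
            avg(salary) OVER (PARTITION BY depname ORDER BY salary)
         FROM empsalary;

     SELECT depname, empno, salary,
            avg(salary) OVER w
         FROM empsalary
         WINDOW w AS (PARTITION BY depname ORDER BY salary);

     SELECT depname, empno, salary,
            rank() OVER (w ORDER BY salary DESC)
         FROM empsalary
         WINDOW w AS (PARTITION BY depname);
     </programlisting>
        </para>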
1645
1646    <para>
1647     The <replaceable class="parameter">frame_clause</replaceable> specifies
1648     the set of rows constituting the <firstterm>window frame</>, for those
1649     window functions that act on the frame instead of the whole partition.
1650     The default framing option is <literal>RANGE UNBOUNDED PRECEDING</>,
1651     which is the same as <literal>RANGE BETWEEN UNBOUNDED PRECEDING AND
1652     CURRENT ROW</>; it selects rows up through the current row's last
1653     peer in the <literal>ORDER BY</> ordering (which means all rows if
1654     there is no <literal>ORDER BY</>).  The options
1655     <literal>RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING</> and
1656     <literal>ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING</>
1657     are also equivalent: they always select all rows in the partition.
1658     Lastly, <literal>ROWS UNBOUNDED PRECEDING</> or its verbose equivalent
1659     <literal>ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW</> selects
1660     all rows up through the current row (regardless of duplicates).
1661     Beware that this option can produce implementation-dependent results
1662     if the <literal>ORDER BY</> ordering does not order the rows uniquely.
1663    </para>
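
        <para>
         The difference between the framing options is easiest to see with
         duplicate ordering values.  The following sketch again assumes a
         hypothetical table <literal>empsalary</literal> with a
         <literal>salary</literal> column; the output column names are
         illustrative only.  For rows with equal <literal>salary</literal>,
         <literal>range_sum</literal> should be identical across all the peers,
         while <literal>rows_sum</literal> grows row by row in whatever order
         the peers happen to be processed, and <literal>total</literal> is the
         same for every row of the partition:
     <programlisting>
     SELECT salary,
            -- default frame: RANGE UNBOUNDED PRECEDING
            sum(salary) OVER (ORDER BY salary) AS range_sum,
            -- frame ends at the current row itself, ignoring peers
            sum(salary) OVER (ORDER BY salary
                              ROWS UNBOUNDED PRECEDING) AS rows_sum,
            -- frame is always the whole partition
            sum(salary) OVER (ORDER BY salary
                              RANGE BETWEEN UNBOUNDED PRECEDING
                                        AND UNBOUNDED FOLLOWING) AS total
         FROM empsalary;
     </programlisting>
        </para>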
1664
1665    <para>
1666     The built-in window functions are described in <xref
1667     linkend="functions-window-table">.  Other window functions can be added by
1668     the user.  Also, any built-in or user-defined aggregate function can be
1669     used as a window function.
1670    </para>
1671
1672    <para>
1673     The syntaxes using <literal>*</> are for calling parameter-less
1674     aggregate functions as window functions, for example
1675     <literal>count(*) OVER (PARTITION BY x ORDER BY y)</>.
1676     <literal>*</> is customarily not used for non-aggregate window functions.
1677     Aggregate window functions, unlike normal aggregate functions, do not
1678     allow <literal>DISTINCT</> to be used within the function argument list.
1679    </para>
1680
1681    <para>
1682     Window function calls are permitted only in the <literal>SELECT</literal>
1683     list and the <literal>ORDER BY</> clause of the query.
1684    </para>
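
        <para>
         One consequence is that a window function result cannot be tested
         directly in <literal>WHERE</literal>.  A common workaround, sketched
         here against the hypothetical <literal>empsalary</literal> table (the
         aliases <literal>pos</literal> and <literal>ss</literal> are arbitrary),
         is to compute the window function in a sub-select and filter in the
         outer query:
     <programlisting>
     -- Not permitted: a window function call inside WHERE
     -- SELECT depname, empno FROM empsalary
     --     WHERE rank() OVER (PARTITION BY depname ORDER BY salary DESC) &lt; 3;

     -- Permitted: compute in the SELECT list of a sub-select, filter outside
     SELECT depname, empno, salary
         FROM (SELECT depname, empno, salary,
                      rank() OVER (PARTITION BY depname ORDER BY salary DESC) AS pos
                   FROM empsalary) AS ss
         WHERE pos &lt; 3;
     </programlisting>
        </para>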
1685
1686    <para>
1687     More information about window functions can be found in
1688     <xref linkend="tutorial-window"> and
1689     <xref linkend="queries-window">.
1690    </para>
1691   </sect2>
1692
1693   <sect2 id="sql-syntax-type-casts">
1694    <title>Type Casts</title>
1695
1696    <indexterm>
1697     <primary>data type</primary>
1698     <secondary>type cast</secondary>
1699    </indexterm>
1700
1701    <indexterm>
1702     <primary>type cast</primary>
1703    </indexterm>
1704
1705    <para>
1706     A type cast specifies a conversion from one data type to another.
1707     <productname>PostgreSQL</productname> accepts two equivalent syntaxes
1708     for type casts:
1709 <synopsis>
1710 CAST ( <replaceable>expression</replaceable> AS <replaceable>type</replaceable> )
1711 <replaceable>expression</replaceable>::<replaceable>type</replaceable>
1712 </synopsis>
1713     The <literal>CAST</> syntax conforms to SQL; the syntax with
1714     <literal>::</literal> is historical <productname>PostgreSQL</productname>
1715     usage.
1716    </para>
1717
1718    <para>
1719     When a cast is applied to a value expression of a known type, it
1720     represents a run-time type conversion.  The cast will succeed only
1721     if a suitable type conversion operation has been defined.  Notice that this
1722     is subtly different from the use of casts with constants, as shown in
1723     <xref linkend="sql-syntax-constants-generic">.  A cast applied to an
1724     unadorned string literal represents the initial assignment of a type
1725     to a literal constant value, and so it will succeed for any type
1726     (if the contents of the string literal are acceptable input syntax for the
1727     data type).
1728    </para>
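
        <para>
         A brief sketch of the two situations follows; the values are
         arbitrary.  The first pair converts an expression whose type is already
         known, the second pair assigns a type to an unadorned string literal,
         and the last statement is expected to be rejected because the string is
         not acceptable input for the target type:
     <programlisting>
     -- Run-time conversion of an integer expression
     SELECT CAST(1 + 2 AS numeric);
     SELECT (1 + 2)::numeric;

     -- Initial type assignment for a string literal
     SELECT CAST('2009-09-21' AS date);
     SELECT '2009-09-21'::date;

     -- Expected to fail: not valid input syntax for type date
     SELECT 'not a date'::date;
     </programlisting>
        </para>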
1729
1730    <para>
1731     An explicit type cast can usually be omitted if there is no ambiguity as
1732     to the type that a value expression must produce (for example, when it is
1733     assigned to a table column); the system will automatically apply a
1734     type cast in such cases.  However, automatic casting is only done for
1735     casts that are marked <quote>OK to apply implicitly</>
1736     in the system catalogs.  Other casts must be invoked with
1737     explicit casting syntax.  This restriction is intended to prevent
1738     surprising conversions from being applied silently.
1739    </para>
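
        <para>
         The following sketch illustrates the distinction.  The table
         <literal>casttest</literal> is hypothetical, and the behavior of
         <literal>length(42)</literal> assumes a release in which the
         integer-to-text cast is not marked as implicit (believed to be the case
         since 8.3):
     <programlisting>
     CREATE TABLE casttest (f float8);

     -- No explicit cast needed: the integer is converted on assignment
     INSERT INTO casttest VALUES (1);

     -- Expected to fail: integer is not converted to text silently
     SELECT length(42);

     -- Works: the conversion is requested explicitly
     SELECT length(42::text);
     SELECT length(CAST(42 AS text));
     </programlisting>
        </para>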
1740
1741    <para>
1742     It is also possible to specify a type cast using a function-like
1743     syntax:
1744 <synopsis>
1745 <replaceable>typename</replaceable> ( <replaceable>expression</replaceable> )
1746 </synopsis>
1747     However, this only works for types whose names are also valid as
1748     function names.  For example, <literal>double precision</literal>
1749     cannot be used this way, but the equivalent <literal>float8</literal>
1750     can.  Also, the names <literal>interval</>, <literal>time</>, and
1751     <literal>timestamp</> can only be used in this fashion if they are
1752     double-quoted, because of syntactic conflicts.  Therefore, the use of
1753     the function-like cast syntax leads to inconsistencies and should
1754     probably be avoided.
1755    </para>
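
        <para>
         For illustration only, and assuming the usual built-in conversion
         functions are present, the function-like form might be written as
         follows.  Note that the second call works only because the type name
         is double-quoted:
     <programlisting>
     SELECT float8(42);                  -- like CAST(42 AS float8)
     SELECT "timestamp"(current_date);   -- like CAST(current_date AS timestamp)
     </programlisting>
        </para>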
1756
1757    <note>
1758     <para>
1759      The function-like syntax is in fact just a function call.  When
1760      one of the two standard cast syntaxes is used to do a run-time
1761      conversion, it will internally invoke a registered function to
1762      perform the conversion.  By convention, these conversion functions
1763      have the same name as their output type, and thus the <quote>function-like
1764      syntax</> is nothing more than a direct invocation of the underlying
1765      conversion function.  Obviously, this is not something that a portable
1766      application should rely on.  For further details see
1767      <xref linkend="sql-createcast" endterm="sql-createcast-title">.
1768     </para>
1769    </note>
1770   </sect2>
1771
1772   <sect2 id="sql-syntax-scalar-subqueries">
1773    <title>Scalar Subqueries</title>
1774
1775    <indexterm>
1776     <primary>subquery</primary>
1777    </indexterm>
1778
1779    <para>
1780     A scalar subquery is an ordinary
1781     <command>SELECT</command> query in parentheses that returns exactly one
1782     row with one column.  (See <xref linkend="queries"> for information about writing queries.)
1783     The <command>SELECT</command> query is executed
1784     and the single returned value is used in the surrounding value expression.
1785     It is an error to use a query that
1786     returns more than one row or more than one column as a scalar subquery.
1787     (But if, during a particular execution, the subquery returns no rows,
1788     there is no error; the scalar result is taken to be null.)
1789     The subquery can refer to variables from the surrounding query,
1790     which will act as constants during any one evaluation of the subquery.
1791     See also <xref linkend="functions-subquery"> for other expressions involving subqueries.
1792    </para>
1793
1794    <para>
1795     For example, the following finds the largest city population in each
1796     state:
1797 <programlisting>
1798 SELECT name, (SELECT max(pop) FROM cities WHERE cities.state = states.name)
1799     FROM states;
1800 </programlisting>
1801    </para>
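
        <para>
         By contrast, a variant of this query that omits the aggregate can fail
         at run time whenever some state has more than one city, since the
         subquery would then return several rows.  One possible way to guarantee
         at most one row, sketched here with the same <literal>cities</literal>
         and <literal>states</literal> tables, is to add
         <literal>LIMIT 1</literal>; states without any cities then yield null:
     <programlisting>
     -- May raise an error: more than one row for some states
     SELECT name, (SELECT pop FROM cities WHERE cities.state = states.name)
         FROM states;

     -- At most one row per evaluation of the subquery
     SELECT name, (SELECT pop FROM cities WHERE cities.state = states.name
                   ORDER BY pop DESC LIMIT 1)
         FROM states;
     </programlisting>
        </para>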
1802   </sect2>
1803
1804   <sect2 id="sql-syntax-array-constructors">
1805    <title>Array Constructors</title>
1806
1807    <indexterm>
1808     <primary>array</primary>
1809     <secondary>constructor</secondary>
1810    </indexterm>
1811
1812    <indexterm>
1813     <primary>ARRAY</primary>
1814    </indexterm>
1815
1816    <para>
1817     An array constructor is an expression that builds an
1818     array value using values for its member elements.  A simple array
1819     constructor
1820     consists of the key word <literal>ARRAY</literal>, a left square bracket
1821     <literal>[</>, a list of expressions (separated by commas) for the
1822     array element values, and finally a right square bracket <literal>]</>.
1823     For example:
1824 <programlisting>
1825 SELECT ARRAY[1,2,3+4];
1826   array
1827 ---------
1828  {1,2,7}
1829 (1 row)
1830 </programlisting>
1831     By default,
1832     the array element type is the common type of the member expressions,
1833     determined using the same rules as for <literal>UNION</> or
1834     <literal>CASE</> constructs (see <xref linkend="typeconv-union-case">).
1835     You can override this by explicitly casting the array constructor to the
1836     desired type, for example:
1837 <programlisting>
1838 SELECT ARRAY[1,2,22.7]::integer[];
1839   array
1840 ----------
1841  {1,2,23}
1842 (1 row)
1843 </programlisting>
1844     This has the same effect as casting each expression to the array
1845     element type individually.
1846     For more on casting, see <xref linkend="sql-syntax-type-casts">.
1847    </para>
1848
1849    <para>
1850     Multidimensional array values can be built by nesting array
1851     constructors.
1852     In the inner constructors, the key word <literal>ARRAY</literal> can
1853     be omitted.  For example, these produce the same result:
1854
1855 <programlisting>
1856 SELECT ARRAY[ARRAY[1,2], ARRAY[3,4]];
1857      array
1858 ---------------
1859  {{1,2},{3,4}}
1860 (1 row)
1861
1862 SELECT ARRAY[[1,2],[3,4]];
1863      array
1864 ---------------
1865  {{1,2},{3,4}}
1866 (1 row)
1867 </programlisting>
1868
1869     Since multidimensional arrays must be rectangular, inner constructors
1870     at the same level must produce sub-arrays of identical dimensions.
1871     Any cast applied to the outer <literal>ARRAY</> constructor propagates
1872     automatically to all the inner constructors.
1873   </para>
1874
1875   <para>
1876     Multidimensional array constructor elements can be anything yielding
1877     an array of the proper kind, not only a sub-<literal>ARRAY</> construct.
1878     For example:
1879 <programlisting>
1880 CREATE TABLE arr(f1 int[], f2 int[]);
1881
1882 INSERT INTO arr VALUES (ARRAY[[1,2],[3,4]], ARRAY[[5,6],[7,8]]);
1883
1884 SELECT ARRAY[f1, f2, '{{9,10},{11,12}}'::int[]] FROM arr;
1885                      array
1886 ------------------------------------------------
1887  {{{1,2},{3,4}},{{5,6},{7,8}},{{9,10},{11,12}}}
1888 (1 row)
1889 </programlisting>
1890   </para>
1891
1892   <para>
1893    You can construct an empty array, but since it's impossible to have an
1894    array with no type, you must explicitly cast your empty array to the
1895    desired type.  For example:
1896 <programlisting>
1897 SELECT ARRAY[]::integer[];
1898  array
1899 -------
1900  {}
1901 (1 row)
1902 </programlisting>
1903   </para>
1904
1905   <para>
1906    It is also possible to construct an array from the results of a
1907    subquery.  In this form, the array constructor is written with the
1908    key word <literal>ARRAY</literal> followed by a parenthesized (not
1909    bracketed) subquery. For example:
1910 <programlisting>
1911 SELECT ARRAY(SELECT oid FROM pg_proc WHERE proname LIKE 'bytea%');
1912                           ?column?
1913 -------------------------------------------------------------
1914  {2011,1954,1948,1952,1951,1244,1950,2005,1949,1953,2006,31}
1915 (1 row)
1916 </programlisting>
1917    The subquery must return a single column. The resulting
1918    one-dimensional array will have an element for each row in the
1919    subquery result, with an element type matching that of the
1920    subquery's output column.
1921   </para>
1922
1923   <para>
1924    The subscripts of an array value built with <literal>ARRAY</literal>
1925    always begin with one.  For more information about arrays, see
1926    <xref linkend="arrays">.
1927   </para>
1928
1929   </sect2>
1930
1931   <sect2 id="sql-syntax-row-constructors">
1932    <title>Row Constructors</title>
1933
1934    <indexterm>
1935     <primary>composite type</primary>
1936     <secondary>constructor</secondary>
1937    </indexterm>
1938
1939    <indexterm>
1940     <primary>row type</primary>
1941     <secondary>constructor</secondary>
1942    </indexterm>
1943
1944    <indexterm>
1945     <primary>ROW</primary>
1946    </indexterm>
1947
1948    <para>
1949     A row constructor is an expression that builds a row value (also
1950     called a composite value) using values
1951     for its member fields.  A row constructor consists of the key word
1952     <literal>ROW</literal>, a left parenthesis, zero or more
1953     expressions (separated by commas) for the row field values, and finally
1954     a right parenthesis.  For example:
1955 <programlisting>
1956 SELECT ROW(1,2.5,'this is a test');
1957 </programlisting>
1958     The key word <literal>ROW</> is optional when there is more than one
1959     expression in the list.
1960    </para>
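
        <para>
         As a short sketch of this rule, the first two statements below build
         the same three-field row, whereas the one-field and zero-field cases
         need the key word because a parenthesized single expression is just
         that expression:
     <programlisting>
     SELECT ROW(1, 2.5, 'this is a test');
     SELECT (1, 2.5, 'this is a test');    -- ROW omitted

     SELECT ROW(1);                        -- ROW required for a one-field row
     SELECT ROW();                         -- ROW required for a zero-field row
     </programlisting>
        </para>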
1961
1962    <para>
1963     A row constructor can include the syntax
1964     <replaceable>rowvalue</replaceable><literal>.*</literal>,
1965     which will be expanded to a list of the elements of the row value,
1966     just as occurs when the <literal>.*</> syntax is used at the top level
1967     of a <command>SELECT</> list.  For example, if table <literal>t</> has
1968     columns <literal>f1</> and <literal>f2</>, these are the same:
1969 <programlisting>
1970 SELECT ROW(t.*, 42) FROM t;
1971 SELECT ROW(t.f1, t.f2, 42) FROM t;
1972 </programlisting>
1973    </para>
1974
1975    <note>
1976     <para>
1977      Before <productname>PostgreSQL</productname> 8.2, the
1978      <literal>.*</literal> syntax was not expanded, so that writing
1979      <literal>ROW(t.*, 42)</> created a two-field row whose first field
1980      was another row value.  The new behavior is usually more useful.
1981      If you need the old behavior of nested row values, write the inner
1982      row value without <literal>.*</literal>, for instance
1983      <literal>ROW(t, 42)</>.
1984     </para>
1985    </note>
1986
1987    <para>
1988     By default, the value created by a <literal>ROW</> expression is of
1989     an anonymous record type.  If necessary, it can be cast to a named
1990     composite type &mdash; either the row type of a table, or a composite type
1991     created with <command>CREATE TYPE AS</>.  An explicit cast might be needed
1992     to avoid ambiguity.  For example:
1993 <programlisting>
1994 CREATE TABLE mytable(f1 int, f2 float, f3 text);
1995
1996 CREATE FUNCTION getf1(mytable) RETURNS int AS 'SELECT $1.f1' LANGUAGE SQL;
1997
1998 -- No cast needed since only one getf1() exists
1999 SELECT getf1(ROW(1,2.5,'this is a test'));
2000  getf1
2001 -------
2002      1
2003 (1 row)
2004
2005 CREATE TYPE myrowtype AS (f1 int, f2 text, f3 numeric);
2006
2007 CREATE FUNCTION getf1(myrowtype) RETURNS int AS 'SELECT $1.f1' LANGUAGE SQL;
2008
2009 -- Now we need a cast to indicate which function to call:
2010 SELECT getf1(ROW(1,2.5,'this is a test'));
2011 ERROR:  function getf1(record) is not unique
2012
2013 SELECT getf1(ROW(1,2.5,'this is a test')::mytable);
2014  getf1
2015 -------
2016      1
2017 (1 row)
2018
2019 SELECT getf1(CAST(ROW(11,'this is a test',2.5) AS myrowtype));
2020  getf1
2021 -------
2022     11
2023 (1 row)
2024 </programlisting>
2025   </para>
2026
2027   <para>
2028    Row constructors can be used to build composite values to be stored
2029    in a composite-type table column, or to be passed to a function that
2030    accepts a composite parameter.  Also,
2031    it is possible to compare two row values or test a row with
2032    <literal>IS NULL</> or <literal>IS NOT NULL</>, for example:
2033 <programlisting>
2034 SELECT ROW(1,2.5,'this is a test') = ROW(1, 3, 'not the same');
2035
2036 SELECT ROW(mytable.*) IS NULL FROM mytable;  -- detect all-null rows
2037 </programlisting>
2038    For more detail see <xref linkend="functions-comparisons">.
2039    Row constructors can also be used in connection with subqueries,
2040    as discussed in <xref linkend="functions-subquery">.
2041   </para>
2042
2043   </sect2>
2044
2045   <sect2 id="syntax-express-eval">
2046    <title>Expression Evaluation Rules</title>
2047
2048    <indexterm>
2049     <primary>expression</primary>
2050     <secondary>order of evaluation</secondary>
2051    </indexterm>
2052
2053    <para>
2054     The order of evaluation of subexpressions is not defined.  In
2055     particular, the inputs of an operator or function are not necessarily
2056     evaluated left-to-right or in any other fixed order.
2057    </para>
2058
2059    <para>
2060     Furthermore, if the result of an expression can be determined by
2061     evaluating only some parts of it, then other subexpressions
2062     might not be evaluated at all.  For instance, if one wrote:
2063 <programlisting>
2064 SELECT true OR somefunc();
2065 </programlisting>
2066     then <literal>somefunc()</literal> would (probably) not be called
2067     at all. The same would be the case if one wrote:
2068 <programlisting>
2069 SELECT somefunc() OR true;
2070 </programlisting>
2071     Note that this is not the same as the left-to-right
2072     <quote>short-circuiting</quote> of Boolean operators that is found
2073     in some programming languages.
2074    </para>
2075
2076    <para>
2077     As a consequence, it is unwise to use functions with side effects
2078     as part of complex expressions.  It is particularly dangerous to
2079     rely on side effects or evaluation order in <literal>WHERE</> and <literal>HAVING</> clauses,
2080     since those clauses are extensively reprocessed as part of
2081     developing an execution plan.  Boolean
2082     expressions (<literal>AND</>/<literal>OR</>/<literal>NOT</> combinations) in those clauses can be reorganized
2083     in any manner allowed by the laws of Boolean algebra.
2084    </para>
2085
2086    <para>
2087     When it is essential to force evaluation order, a <literal>CASE</>
2088     construct (see <xref linkend="functions-conditional">) can be
2089     used.  For example, this is an untrustworthy way of trying to
2090     avoid division by zero in a <literal>WHERE</> clause:
2091 <programlisting>
2092 SELECT ... WHERE x &gt; 0 AND y/x &gt; 1.5;
2093 </programlisting>
2094     But this is safe:
2095 <programlisting>
2096 SELECT ... WHERE CASE WHEN x &gt; 0 THEN y/x &gt; 1.5 ELSE false END;
2097 </programlisting>
2098     A <literal>CASE</> construct used in this fashion will defeat optimization
2099     attempts, so it should be used only when necessary.  (In this particular
2100     example, it would be better to sidestep the problem by writing
2101     <literal>y &gt; 1.5*x</> instead.)
2102    </para>
2103   </sect2>
2104  </sect1>
2105
2106 </chapter>