doc/src/sgml/syntax.sgml

   1 <!--
   2 $PostgreSQL: pgsql/doc/src/sgml/syntax.sgml,v 1.105 2005/11/04 23:14:02 petere Exp $
   3 -->
   4
   5 <chapter id="sql-syntax">
   6  <title>SQL Syntax</title>
   7
   8  <indexterm zone="sql-syntax">
   9   <primary>syntax</primary>
  10   <secondary>SQL</secondary>
  11  </indexterm>
  12
  13  <para>
  This chapter describes the syntax of SQL.  It forms the foundation
  for understanding the following chapters, which go into detail
  about how the SQL commands are applied to define and modify data.
  17  </para>
  18
  19  <para>
  20   We also advise users who are already familiar with SQL to read this
  21   chapter carefully because there are several rules and concepts that
  22   are implemented inconsistently among SQL databases or that are
  23   specific to <productname>PostgreSQL</productname>.
  24  </para>
  25
  26  <sect1 id="sql-syntax-lexical">
  27   <title>Lexical Structure</title>
  28
  29   <indexterm>
  30    <primary>token</primary>
  31   </indexterm>
  32
  33   <para>
  34    SQL input consists of a sequence of
  35    <firstterm>commands</firstterm>.  A command is composed of a
  36    sequence of <firstterm>tokens</firstterm>, terminated by a
  37    semicolon (<quote>;</quote>).  The end of the input stream also
  38    terminates a command.  Which tokens are valid depends on the syntax
  39    of the particular command.
  40   </para>
  41
  42   <para>
  43    A token can be a <firstterm>key word</firstterm>, an
  44    <firstterm>identifier</firstterm>, a <firstterm>quoted
  45    identifier</firstterm>, a <firstterm>literal</firstterm> (or
  46    constant), or a special character symbol.  Tokens are normally
  47    separated by whitespace (space, tab, newline), but need not be if
  48    there is no ambiguity (which is generally only the case if a
  49    special character is adjacent to some other token type).
  50   </para>
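  <para>
   As a brief illustration (a sketch added for clarity, using only
   arithmetic on literal values), the following two commands consist
   of the same tokens.  The first omits whitespace wherever a special
   character or operator separates the neighboring tokens:
<programlisting>
SELECT(1+2)*3;
SELECT ( 1 + 2 ) * 3;
</programlisting>
   By contrast, the whitespace in <literal>SELECT a</literal> is
   required, because <literal>SELECTa</literal> would be read as a
   single identifier.
  </para>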
  51
  52   <para>
   Additionally, <firstterm>comments</firstterm> can occur in SQL
   input.  They are not tokens; they are effectively equivalent to
   whitespace.
  56   </para>
  57
  58    <para>
  59     For example, the following is (syntactically) valid SQL input:
  60 <programlisting>
  61 SELECT * FROM MY_TABLE;
  62 UPDATE MY_TABLE SET A = 5;
  63 INSERT INTO MY_TABLE VALUES (3, 'hi there');
  64 </programlisting>
  65     This is a sequence of three commands, one per line (although this
  66     is not required; more than one command can be on a line, and
  67     commands can usefully be split across lines).
  68    </para>
  69
  70   <para>
   The SQL syntax is not very consistent regarding which tokens
   identify commands and which are operands or parameters.  The first
  73    few tokens are generally the command name, so in the above example
  74    we would usually speak of a <quote>SELECT</quote>, an
  75    <quote>UPDATE</quote>, and an <quote>INSERT</quote> command.  But
  76    for instance the <command>UPDATE</command> command always requires
  77    a <token>SET</token> token to appear in a certain position, and
  78    this particular variation of <command>INSERT</command> also
  79    requires a <token>VALUES</token> in order to be complete.  The
  80    precise syntax rules for each command are described in <xref linkend="reference">.
  81   </para>
  82
  83   <sect2 id="sql-syntax-identifiers">
  84    <title>Identifiers and Key Words</title>
  85
  86    <indexterm zone="sql-syntax-identifiers">
  87     <primary>identifier</primary>
  88     <secondary>syntax of</secondary>
  89    </indexterm>
  90
  91    <indexterm zone="sql-syntax-identifiers">
  92     <primary>name</primary>
  93     <secondary>syntax of</secondary>
  94    </indexterm>
  95
  96    <indexterm zone="sql-syntax-identifiers">
  97     <primary>key word</primary>
  98     <secondary>syntax of</secondary>
  99    </indexterm>
 100
 101    <para>
 102     Tokens such as <token>SELECT</token>, <token>UPDATE</token>, or
 103     <token>VALUES</token> in the example above are examples of
 104     <firstterm>key words</firstterm>, that is, words that have a fixed
 105     meaning in the SQL language.  The tokens <token>MY_TABLE</token>
 106     and <token>A</token> are examples of
 107     <firstterm>identifiers</firstterm>.  They identify names of
 108     tables, columns, or other database objects, depending on the
 109     command they are used in.  Therefore they are sometimes simply
 110     called <quote>names</quote>.  Key words and identifiers have the
 111     same lexical structure, meaning that one cannot know whether a
 112     token is an identifier or a key word without knowing the language.
 113     A complete list of key words can be found in <xref
 114     linkend="sql-keywords-appendix">.
 115    </para>
 116
 117    <para>
 118     SQL identifiers and key words must begin with a letter
 119     (<literal>a</literal>-<literal>z</literal>, but also letters with
 120     diacritical marks and non-Latin letters) or an underscore
 121     (<literal>_</literal>).  Subsequent characters in an identifier or
 122     key word can be letters, underscores, digits
 123     (<literal>0</literal>-<literal>9</literal>), or dollar signs
 124     (<literal>$</>).  Note that dollar signs are not allowed in identifiers
 125     according to the letter of the SQL standard, so their use may render
 126     applications less portable.
 127     The SQL standard will not define a key word that contains
 128     digits or starts or ends with an underscore, so identifiers of this
 129     form are safe against possible conflict with future extensions of the
 130     standard.
 131    </para>
 132
 133    <para>
 134     <indexterm><primary>identifier</primary><secondary>length</secondary></indexterm>
 135     The system uses no more than <symbol>NAMEDATALEN</symbol>-1
 136     characters of an identifier; longer names can be written in
 137     commands, but they will be truncated.  By default,
 138     <symbol>NAMEDATALEN</symbol> is 64 so the maximum identifier
 139     length is 63. If this limit is problematic, it can be raised by
 140     changing the <symbol>NAMEDATALEN</symbol> constant in
 141     <filename>src/include/postgres_ext.h</filename>.
 142    </para>
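   <para>
    A minimal sketch of the truncation behavior, assuming the default
    <symbol>NAMEDATALEN</symbol> of 64.  The table name is hypothetical
    and is 70 characters long:
<programlisting>
CREATE TABLE a_table_name_that_is_deliberately_longer_than_sixty_three_characters_x (id integer);
</programlisting>
    The server is expected to accept the command, keep only the first
    63 characters of the name, and report the truncation in a notice.
   </para>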
 143
 144    <para>
 145     <indexterm>
 146      <primary>case sensitivity</primary>
 147      <secondary>of SQL commands</secondary>
 148     </indexterm>
 149     Identifier and key word names are case insensitive.  Therefore
 150 <programlisting>
 151 UPDATE MY_TABLE SET A = 5;
 152 </programlisting>
 153     can equivalently be written as
 154 <programlisting>
 155 uPDaTE my_TabLE SeT a = 5;
 156 </programlisting>
 157     A convention often used is to write key words in upper
 158     case and names in lower case, e.g.,
 159 <programlisting>
 160 UPDATE my_table SET a = 5;
 161 </programlisting>
 162    </para>
 163
 164    <para>
 165     <indexterm>
 166      <primary>quotation marks</primary>
 167      <secondary>and identifiers</secondary>
 168     </indexterm>
 169     There is a second kind of identifier:  the <firstterm>delimited
 170     identifier</firstterm> or <firstterm>quoted
 171     identifier</firstterm>.  It is formed by enclosing an arbitrary
 172     sequence of characters in double-quotes
 173     (<literal>"</literal>). <!-- " font-lock mania --> A delimited
 174     identifier is always an identifier, never a key word.  So
 175     <literal>"select"</literal> could be used to refer to a column or
 176     table named <quote>select</quote>, whereas an unquoted
 177     <literal>select</literal> would be taken as a key word and
 178     would therefore provoke a parse error when used where a table or
 179     column name is expected.  The example can be written with quoted
 180     identifiers like this:
 181 <programlisting>
 182 UPDATE "my_table" SET "a" = 5;
 183 </programlisting>
 184    </para>
 185
 186    <para>
 187     Quoted identifiers can contain any character other than a double
 188     quote itself.  (To include a double quote, write two double quotes.)
 189     This allows constructing table or column names that would
 190     otherwise not be possible, such as ones containing spaces or
 191     ampersands.  The length limitation still applies.
 192    </para>
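   <para>
    For instance, the following sketch (with hypothetical table and
    column names) uses a space, an ampersand, and embedded double
    quotes in identifiers:
<programlisting>
CREATE TABLE "my table" ("R&amp;D budget" integer, "say ""hi""" text);
SELECT "R&amp;D budget", "say ""hi""" FROM "my table";
</programlisting>
    The second column is named <literal>say "hi"</literal>: each pair
    of double quotes inside the quoted identifier stands for one
    double-quote character.
   </para>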
 193
 194    <para>
 195     Quoting an identifier also makes it case-sensitive, whereas
 196     unquoted names are always folded to lower case.  For example, the
 197     identifiers <literal>FOO</literal>, <literal>foo</literal>, and
 198     <literal>"foo"</literal> are considered the same by
 199     <productname>PostgreSQL</productname>, but
 200     <literal>"Foo"</literal> and <literal>"FOO"</literal> are
 201     different from these three and each other.  (The folding of
 202     unquoted names to lower case in <productname>PostgreSQL</> is
 203     incompatible with the SQL standard, which says that unquoted names
 204     should be folded to upper case.  Thus, <literal>foo</literal>
 205     should be equivalent to <literal>"FOO"</literal> not
 206     <literal>"foo"</literal> according to the standard.  If you want
 207     to write portable applications you are advised to always quote a
 208     particular name or never quote it.)
 209    </para>
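   <para>
    The folding rules can be tried out with a sketch like the
    following, which assumes that no table named
    <literal>foo</literal> exists beforehand:
<programlisting>
CREATE TABLE Foo (a integer);   -- the table is actually named foo
SELECT * FROM FOO;              -- works: FOO is folded to foo
SELECT * FROM "foo";            -- works: matches the folded name
SELECT * FROM "Foo";            -- fails: no relation named Foo exists
</programlisting>
   </para>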
 210   </sect2>
 211
 212
 213   <sect2 id="sql-syntax-constants">
 214    <title>Constants</title>
 215
 216    <indexterm zone="sql-syntax-constants">
 217     <primary>constant</primary>
 218    </indexterm>
 219
 220    <para>
 221     There are three kinds of <firstterm>implicitly-typed
 222     constants</firstterm> in <productname>PostgreSQL</productname>:
 223     strings, bit strings, and numbers.
 224     Constants can also be specified with explicit types, which can
 225     enable more accurate representation and more efficient handling by
 226     the system. These alternatives are discussed in the following
 227     subsections.
 228    </para>
 229
 230    <sect3 id="sql-syntax-strings">
 231     <title>String Constants</title>
 232
 233     <indexterm zone="sql-syntax-strings">
 234      <primary>character string</primary>
 235      <secondary>constant</secondary>
 236     </indexterm>
 237
 238     <para>
 239      <indexterm>
 240       <primary>quotation marks</primary>
 241       <secondary>escaping</secondary>
 242      </indexterm>
 243      A string constant in SQL is an arbitrary sequence of characters
 244      bounded by single quotes (<literal>'</literal>), for example
 245      <literal>'This is a string'</literal>.  The standard-compliant way of
 246      writing a single-quote character within a string constant is to
 247      write two adjacent single quotes, e.g.
 248      <literal>'Dianne''s horse'</literal>.
 249      <productname>PostgreSQL</productname> also allows single quotes
 250      to be escaped with a backslash (<literal>\'</literal>).  However,
 251      future versions of <productname>PostgreSQL</productname> will not
 252      allow this, so applications using backslashes should convert to the
 253      standard-compliant method outlined above.
 254     </para>
 255
 256     <para>
 257      Another <productname>PostgreSQL</productname> extension is that
 258      C-style backslash escapes are available: <literal>\b</literal> is a
 259      backspace, <literal>\f</literal> is a form feed,
 260      <literal>\n</literal> is a newline, <literal>\r</literal> is a
 261      carriage return, <literal>\t</literal> is a tab. Also supported is
 262      <literal>\<replaceable>digits</replaceable></literal>, where
 263      <replaceable>digits</replaceable> represents an octal byte value, and
 264      <literal>\x<replaceable>hexdigits</replaceable></literal>, where
 265      <replaceable>hexdigits</replaceable> represents a hexadecimal byte value.
     (It is your responsibility to ensure that the byte sequences you
     create are valid characters in the server character set encoding.)
     Any other character following a backslash is taken literally. Thus,
     to include a backslash in a string constant, write two backslashes.
 270     </para>
 271
 272     <note>
 273     <para>
 274      While ordinary strings now support C-style backslash escapes,
 275      future versions will generate warnings for such usage and
 276      eventually treat backslashes as literal characters to be
 277      standard-conforming. The proper way to specify escape processing is
 278      to use the escape string syntax to indicate that escape
 279      processing is desired. Escape string syntax is specified by writing
 280      the letter <literal>E</literal> (upper or lower case) just before
 281      the string, e.g. <literal>E'\041'</>. This method will work in all
 282      future versions of <productname>PostgreSQL</productname>.
 283     </para>
 284     </note>
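    <para>
     The following sketch contrasts the standard-compliant quote
     doubling with the escape string syntax.  It assumes a server
     release that accepts the <literal>E</literal> prefix and an
     ASCII-compatible server encoding:
<programlisting>
SELECT 'Dianne''s horse';            -- doubled quote, standard-compliant
SELECT E'first line\nsecond line';   -- \n is a newline
SELECT E'\041';                      -- octal 041 is the character !
SELECT E'C:\\temp';                  -- two backslashes yield one
</programlisting>
    </para>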
 285
 286     <para>
 287      The character with the code zero cannot be in a string constant.
 288     </para>
 289
 290     <para>
 291      Two string constants that are only separated by whitespace
 292      <emphasis>with at least one newline</emphasis> are concatenated
 293      and effectively treated as if the string had been written in one
 294      constant.  For example:
 295 <programlisting>
 296 SELECT 'foo'
 297 'bar';
 298 </programlisting>
 299      is equivalent to
 300 <programlisting>
 301 SELECT 'foobar';
 302 </programlisting>
 303      but
 304 <programlisting>
 305 SELECT 'foo'      'bar';
 306 </programlisting>
 307      is not valid syntax.  (This slightly bizarre behavior is specified
 308      by <acronym>SQL</acronym>; <productname>PostgreSQL</productname> is
 309      following the standard.)
 310     </para>
 311    </sect3>
 312
 313    <sect3 id="sql-syntax-dollar-quoting">
 314     <title>Dollar-Quoted String Constants</title>
 315
 316      <indexterm>
 317       <primary>dollar quoting</primary>
 318      </indexterm>
 319
 320     <para>
 321      While the standard syntax for specifying string constants is usually
 322      convenient, it can be difficult to understand when the desired string
 323      contains many single quotes or backslashes, since each of those must
 324      be doubled. To allow more readable queries in such situations,
 325      <productname>PostgreSQL</productname> provides another way, called
 326      <quote>dollar quoting</quote>, to write string constants.
 327      A dollar-quoted string constant
 328      consists of a dollar sign (<literal>$</literal>), an optional
 329      <quote>tag</quote> of zero or more characters, another dollar
 330      sign, an arbitrary sequence of characters that makes up the
 331      string content, a dollar sign, the same tag that began this
 332      dollar quote, and a dollar sign. For example, here are two
 333      different ways to specify the string <quote>Dianne's horse</>
 334      using dollar quoting:
 335 <programlisting>
 336 $$Dianne's horse$$
 337 $SomeTag$Dianne's horse$SomeTag$
 338 </programlisting>
 339      Notice that inside the dollar-quoted string, single quotes can be
 340      used without needing to be escaped.  Indeed, no characters inside
 341      a dollar-quoted string are ever escaped: the string content is always
 342      written literally.  Backslashes are not special, and neither are
 343      dollar signs, unless they are part of a sequence matching the opening
 344      tag.
 345     </para>
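    <para>
     As a small sketch of these rules, the first command below compares
     a dollar-quoted constant with its single-quoted equivalent and
     should yield true.  The second uses a hypothetical tag
     <literal>tag</literal> so that the content can itself contain
     dollar signs:
<programlisting>
SELECT $$Dianne's horse$$ = 'Dianne''s horse';
SELECT $tag$costs $5, or $$ in short$tag$;
</programlisting>
    </para>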
 346
 347     <para>
 348      It is possible to nest dollar-quoted string constants by choosing
 349      different tags at each nesting level.  This is most commonly used in
 350      writing function definitions.  For example:
 351 <programlisting>
 352 $function$
 353 BEGIN
 354     RETURN ($1 ~ $q$[\t\r\n\v\\]$q$);
 355 END;
 356 $function$
 357 </programlisting>
 358      Here, the sequence <literal>$q$[\t\r\n\v\\]$q$</> represents a
 359      dollar-quoted literal string <literal>[\t\r\n\v\\]</>, which will
 360      be recognized when the function body is executed by
 361      <productname>PostgreSQL</>.  But since the sequence does not match
 362      the outer dollar quoting delimiter <literal>$function$</>, it is
 363      just some more characters within the constant so far as the outer
 364      string is concerned.
 365     </para>
 366
 367     <para>
 368      The tag, if any, of a dollar-quoted string follows the same rules
 369      as an unquoted identifier, except that it cannot contain a dollar sign.
 370      Tags are case sensitive, so <literal>$tag$String content$tag$</literal>
 371      is correct, but <literal>$TAG$String content$tag$</literal> is not.
 372     </para>
 373
 374     <para>
 375      A dollar-quoted string that follows a keyword or identifier must
 376      be separated from it by whitespace; otherwise the dollar quoting
 377      delimiter would be taken as part of the preceding identifier.
 378     </para>
 379
 380     <para>
 381      Dollar quoting is not part of the SQL standard, but it is often a more
 382      convenient way to write complicated string literals than the
 383      standard-compliant single quote syntax.  It is particularly useful when
 384      representing string constants inside other constants, as is often needed
 385      in procedural function definitions.  With single-quote syntax, each
 386      backslash in the above example would have to be written as four
 387      backslashes, which would be reduced to two backslashes in parsing the
 388      original string constant, and then to one when the inner string constant
 389      is re-parsed during function execution.
 390     </para>
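    <para>
     The nested example above is only a fragment.  One way to place it
     in a complete command is sketched here, assuming an SQL-language
     function with the hypothetical name
     <literal>has_special_chars</literal>:
<programlisting>
CREATE FUNCTION has_special_chars(text) RETURNS boolean AS $function$
    SELECT $1 ~ $q$[\t\r\n\v\\]$q$;
$function$ LANGUAGE SQL;

SELECT has_special_chars('plain text');
</programlisting>
     The outer <literal>$function$</literal> delimiters enclose the
     function body, and the inner <literal>$q$</literal> delimiters
     enclose the pattern, so no backslash needs to be doubled for
     quoting purposes.
    </para>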
 391    </sect3>
 392
 393    <sect3 id="sql-syntax-bit-strings">
 394     <title>Bit-String Constants</title>
 395
 396     <indexterm zone="sql-syntax-bit-strings">
 397      <primary>bit string</primary>
 398      <secondary>constant</secondary>
 399     </indexterm>
 400
 401     <para>
 402      Bit-string constants look like regular string constants with a
 403      <literal>B</literal> (upper or lower case) immediately before the
 404      opening quote (no intervening whitespace), e.g.,
 405      <literal>B'1001'</literal>.  The only characters allowed within
 406      bit-string constants are <literal>0</literal> and
 407      <literal>1</literal>.
 408     </para>
 409
 410     <para>
 411      Alternatively, bit-string constants can be specified in hexadecimal
 412      notation, using a leading <literal>X</literal> (upper or lower case),
 413      e.g., <literal>X'1FF'</literal>.  This notation is equivalent to
 414      a bit-string constant with four binary digits for each hexadecimal digit.
 415     </para>
 416
 417     <para>
 418      Both forms of bit-string constant can be continued
 419      across lines in the same way as regular string constants.
 420      Dollar quoting cannot be used in a bit-string constant.
 421     </para>
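    <para>
     A short sketch of both notations:
<programlisting>
SELECT B'1001';
SELECT X'1FF' = B'000111111111';
</programlisting>
     The comparison should yield true, because each of the three
     hexadecimal digits expands to four binary digits
     (<literal>0001</literal>, <literal>1111</literal>,
     <literal>1111</literal>).
    </para>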
 422    </sect3>
 423
 424    <sect3>
 425     <title>Numeric Constants</title>
 426
 427     <indexterm>
 428      <primary>number</primary>
 429      <secondary>constant</secondary>
 430     </indexterm>
 431
 432     <para>
 433      Numeric constants are accepted in these general forms:
 434 <synopsis>
 435 <replaceable>digits</replaceable>
 436 <replaceable>digits</replaceable>.<optional><replaceable>digits</replaceable></optional><optional>e<optional>+-</optional><replaceable>digits</replaceable></optional>
 437 <optional><replaceable>digits</replaceable></optional>.<replaceable>digits</replaceable><optional>e<optional>+-</optional><replaceable>digits</replaceable></optional>
 438 <replaceable>digits</replaceable>e<optional>+-</optional><replaceable>digits</replaceable>
 439 </synopsis>
 440      where <replaceable>digits</replaceable> is one or more decimal
 441      digits (0 through 9).  At least one digit must be before or after the
 442      decimal point, if one is used.  At least one digit must follow the
 443      exponent marker (<literal>e</literal>), if one is present.
 444      There may not be any spaces or other characters embedded in the
 445      constant.  Note that any leading plus or minus sign is not actually
 446      considered part of the constant; it is an operator applied to the
 447      constant.
 448     </para>
 449
 450     <para>
 451      These are some examples of valid numeric constants:
 452 <literallayout>
 453 42
 454 3.5
 455 4.
 456 .001
 457 5e2
 458 1.925e-3
 459 </literallayout>
 460     </para>
 461
 462     <para>
 463      <indexterm><primary>integer</primary></indexterm>
 464      <indexterm><primary>bigint</primary></indexterm>
 465      <indexterm><primary>numeric</primary></indexterm>
 466      A numeric constant that contains neither a decimal point nor an
 467      exponent is initially presumed to be type <type>integer</> if its
 468      value fits in type <type>integer</> (32 bits); otherwise it is
 469      presumed to be type <type>bigint</> if its
 470      value fits in type <type>bigint</> (64 bits); otherwise it is
 471      taken to be type <type>numeric</>.  Constants that contain decimal
 472      points and/or exponents are always initially presumed to be type
 473      <type>numeric</>.
 474     </para>
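    <para>
     The initial type assignment can be inspected with a sketch like
     the following.  It assumes the <function>pg_typeof</function>
     function, which is available only in server releases newer than
     the one this text was written for:
<programlisting>
SELECT pg_typeof(42);                     -- integer
SELECT pg_typeof(3000000000);             -- bigint (too large for 32 bits)
SELECT pg_typeof(10000000000000000000);   -- numeric (too large for 64 bits)
SELECT pg_typeof(4.);                     -- numeric (decimal point)
SELECT pg_typeof(5e2);                    -- numeric (exponent)
</programlisting>
    </para>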
 475
 476     <para>
 477      The initially assigned data type of a numeric constant is just a
 478      starting point for the type resolution algorithms.  In most cases
 479      the constant will be automatically coerced to the most
 480      appropriate type depending on context.  When necessary, you can
 481      force a numeric value to be interpreted as a specific data type
 482      by casting it.<indexterm><primary>type cast</primary></indexterm>
 483      For example, you can force a numeric value to be treated as type
 484      <type>real</> (<type>float4</>) by writing
 485
 486 <programlisting>
 487 REAL '1.23'  -- string style
 488 1.23::REAL   -- PostgreSQL (historical) style
 489 </programlisting>
 490
 491      These are actually just special cases of the general casting
 492      notations discussed next.
 493     </para>
 494    </sect3>
 495
 496    <sect3 id="sql-syntax-constants-generic">
 497     <title>Constants of Other Types</title>
 498
 499     <indexterm>
 500      <primary>data type</primary>
 501      <secondary>constant</secondary>
 502     </indexterm>
 503
 504     <para>
 505      A constant of an <emphasis>arbitrary</emphasis> type can be
 506      entered using any one of the following notations:
 507 <synopsis>
 508 <replaceable>type</replaceable> '<replaceable>string</replaceable>'
 509 '<replaceable>string</replaceable>'::<replaceable>type</replaceable>
 510 CAST ( '<replaceable>string</replaceable>' AS <replaceable>type</replaceable> )
 511 </synopsis>
 512      The string constant's text is passed to the input conversion
 513      routine for the type called <replaceable>type</replaceable>. The
 514      result is a constant of the indicated type.  The explicit type
 515      cast may be omitted if there is no ambiguity as to the type the
 516      constant must be (for example, when it is assigned directly to a
 517      table column), in which case it is automatically coerced.
 518     </para>
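    <para>
     As an illustration, the three notations can be used to write the
     same constant of type <type>date</type> (the date value is
     arbitrary):
<programlisting>
SELECT DATE '2005-11-04';
SELECT '2005-11-04'::date;
SELECT CAST('2005-11-04' AS date);
</programlisting>
    </para>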
 519
 520     <para>
 521      The string constant can be written using either regular SQL
 522      notation or dollar-quoting.
 523     </para>
 524
 525     <para>
 526      It is also possible to specify a type coercion using a function-like
 527      syntax:
 528 <synopsis>
 529 <replaceable>typename</replaceable> ( '<replaceable>string</replaceable>' )
 530 </synopsis>
 531      but not all type names may be used in this way; see <xref
 532      linkend="sql-syntax-type-casts"> for details.
 533     </para>
 534
 535     <para>
 536      The <literal>::</literal>, <literal>CAST()</literal>, and
 537      function-call syntaxes can also be used to specify run-time type
 538      conversions of arbitrary expressions, as discussed in <xref
 539      linkend="sql-syntax-type-casts">.  But the form
 540      <literal><replaceable>type</replaceable> '<replaceable>string</replaceable>'</literal>
 541      can only be used to specify the type of a literal constant.
 542      Another restriction on
 543      <literal><replaceable>type</replaceable> '<replaceable>string</replaceable>'</literal>
 544      is that it does not work for array types; use <literal>::</literal>
 545      or <literal>CAST()</literal> to specify the type of an array constant.
 546     </para>
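    <para>
     For example, an array constant of type <type>integer[]</type>
     can be sketched in either of the two permitted notations:
<programlisting>
SELECT '{1,2,3}'::integer[];
SELECT CAST('{1,2,3}' AS integer[]);
</programlisting>
    </para>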
 547
 548     <para>
 549      The <literal>CAST()</> syntax conforms to SQL.  The
 550      <literal><replaceable>type</replaceable> '<replaceable>string</replaceable>'</literal>
 551      syntax is a generalization of the standard: SQL specifies this syntax only
 552      for a few data types, but <productname>PostgreSQL</productname> allows it
 553      for all types.  The syntax with
 554      <literal>::</literal> is historical <productname>PostgreSQL</productname>
 555      usage, as is the function-call syntax.
 556     </para>
 557    </sect3>
 558   </sect2>
 559
 560   <sect2 id="sql-syntax-operators">
 561    <title>Operators</title>
 562
 563    <indexterm zone="sql-syntax-operators">
 564     <primary>operator</primary>
 565     <secondary>syntax</secondary>
 566    </indexterm>
 567
 568    <para>
 569     An operator name is a sequence of up to <symbol>NAMEDATALEN</symbol>-1
 570     (63 by default) characters from the following list:
 571 <literallayout>
 572 + - * / &lt; &gt; = ~ ! @ # % ^ &amp; | ` ?
 573 </literallayout>
 574
 575     There are a few restrictions on operator names, however:
 576     <itemizedlist>
 577      <listitem>
 578       <para>
 579        <literal>--</literal> and <literal>/*</literal> cannot appear
 580        anywhere in an operator name, since they will be taken as the
 581        start of a comment.
 582       </para>
 583      </listitem>
 584
 585      <listitem>
 586       <para>
 587        A multiple-character operator name cannot end in <literal>+</> or <literal>-</>,
 588        unless the name also contains at least one of these characters:
 589 <literallayout>
 590 ~ ! @ # % ^ &amp; | ` ?
 591 </literallayout>
 592        For example, <literal>@-</literal> is an allowed operator name,
 593        but <literal>*-</literal> is not.  This restriction allows
 594        <productname>PostgreSQL</productname> to parse SQL-compliant
 595        queries without requiring spaces between tokens.
 596       </para>
 597      </listitem>
 598     </itemizedlist>
 599    </para>
 600
 601    <para>
 602     When working with non-SQL-standard operator names, you will usually
 603     need to separate adjacent operators with spaces to avoid ambiguity.
 604     For example, if you have defined a left unary operator named <literal>@</literal>,
 605     you cannot write <literal>X*@Y</literal>; you must write
 606     <literal>X* @Y</literal> to ensure that
 607     <productname>PostgreSQL</productname> reads it as two operator names
 608     not one.
 609    </para>
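   <para>
    The following sketch shows how these rules play out with built-in
    operators, assuming the standard absolute-value prefix operator
    <literal>@</literal>:
<programlisting>
SELECT 2*-3;       -- read as 2 * (-3), because *- is not a valid operator name
SELECT 4 * @ -3;   -- multiplication applied to the absolute value of -3
SELECT 4*@-3;      -- *@- is read as a single operator name
</programlisting>
    The last command is expected to fail unless an operator named
    <literal>*@-</literal> has been defined.
   </para>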
 610   </sect2>
 611
 612   <sect2>
 613    <title>Special Characters</title>
 614
 615   <para>
 616    Some characters that are not alphanumeric have a special meaning
 617    that is different from being an operator.  Details on the usage can
 618    be found at the location where the respective syntax element is
   described.  This section exists only to point out these characters
   and to summarize their purposes.
 621
 622    <itemizedlist>
 623     <listitem>
 624      <para>
 625       A dollar sign (<literal>$</literal>) followed by digits is used
 626       to represent a positional parameter in the body of a function
 627       definition or a prepared statement.  In other contexts the
 628       dollar sign may be part of an identifier or a dollar-quoted string
 629       constant.
 630      </para>
 631     </listitem>
 632
 633     <listitem>
 634      <para>
 635       Parentheses (<literal>()</literal>) have their usual meaning to
 636       group expressions and enforce precedence.  In some cases
 637       parentheses are required as part of the fixed syntax of a
 638       particular SQL command.
 639      </para>
 640     </listitem>
 641
 642     <listitem>
 643      <para>
 644       Brackets (<literal>[]</literal>) are used to select the elements
 645       of an array.  See <xref linkend="arrays"> for more information
 646       on arrays.
 647      </para>
 648     </listitem>
 649
 650     <listitem>
 651      <para>
 652       Commas (<literal>,</literal>) are used in some syntactical
 653       constructs to separate the elements of a list.
 654      </para>
 655     </listitem>
 656
 657     <listitem>
 658      <para>
 659       The semicolon (<literal>;</literal>) terminates an SQL command.
 660       It cannot appear anywhere within a command, except within a
 661       string constant or quoted identifier.
 662      </para>
 663     </listitem>
 664
 665     <listitem>
 666      <para>
 667       The colon (<literal>:</literal>) is used to select
 668       <quote>slices</quote> from arrays. (See <xref
 669       linkend="arrays">.)  In certain SQL dialects (such as Embedded
 670       SQL), the colon is used to prefix variable names.
 671      </para>
 672     </listitem>
 673
 674     <listitem>
 675      <para>
 676       The asterisk (<literal>*</literal>) is used in some contexts to denote
 677       all the fields of a table row or composite value.  It also
 678       has a special meaning when used as the argument of the
 679       <function>COUNT</function> aggregate function.
 680      </para>
 681     </listitem>
 682
 683     <listitem>
 684      <para>
 685       The period (<literal>.</literal>) is used in numeric
 686       constants, and to separate schema, table, and column names.
 687      </para>
 688     </listitem>
 689    </itemizedlist>
 690
 691    </para>
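  <para>
   Several of these characters appear in the following sketch.  The
   prepared statement name <literal>add_one</literal> is hypothetical;
   <literal>pg_catalog.pg_class</literal> is a system catalog that is
   assumed to be readable:
<programlisting>
PREPARE add_one (integer) AS SELECT $1 + 1;   -- $1 is a positional parameter
EXECUTE add_one(41);
SELECT (ARRAY[10,20,30,40])[2:3];             -- brackets and colon select a slice
SELECT count(*) FROM pg_catalog.pg_class;     -- asterisk as the COUNT argument
SELECT pg_catalog.pg_class.relname FROM pg_catalog.pg_class;   -- periods separate schema, table, column
</programlisting>
  </para>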
 692   </sect2>
 693
 694   <sect2 id="sql-syntax-comments">
 695    <title>Comments</title>
 696
 697    <indexterm zone="sql-syntax-comments">
 698     <primary>comment</primary>
 699     <secondary sortas="SQL">in SQL</secondary>
 700    </indexterm>
 701
 702    <para>
 703     A comment is an arbitrary sequence of characters beginning with
 704     double dashes and extending to the end of the line, e.g.:
 705 <programlisting>
 706 -- This is a standard SQL comment
 707 </programlisting>
 708    </para>
 709
 710    <para>
 711     Alternatively, C-style block comments can be used:
 712 <programlisting>
 713 /* multiline comment
 714  * with nesting: /* nested block comment */
 715  */
 716 </programlisting>
 717     where the comment begins with <literal>/*</literal> and extends to
 718     the matching occurrence of <literal>*/</literal>. These block
 719     comments nest, as specified in the SQL standard but unlike C, so that one can
 720     comment out larger blocks of code that may contain existing block
 721     comments.
 722    </para>
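   <para>
    As a sketch of why nesting is useful, the block comment below
    disables two commands even though one of them is followed by a
    block comment of its own:
<programlisting>
/* temporarily disabled:
SELECT 1; /* an existing block comment */
SELECT 2;
*/
SELECT 3;   -- only this command is executed
</programlisting>
   </para>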
 723
 724    <para>
 725     A comment is removed from the input stream before further syntax
 726     analysis and is effectively replaced by whitespace.
 727    </para>
 728   </sect2>
 729
 730   <sect2 id="sql-precedence">
 731    <title>Lexical Precedence</title>
 732
 733    <indexterm zone="sql-precedence">
 734     <primary>operator</primary>
 735     <secondary>precedence</secondary>
 736    </indexterm>
 737
 738    <para>
 739     <xref linkend="sql-precedence-table"> shows the precedence and
 740     associativity of the operators in <productname>PostgreSQL</>.
 741     Most operators have the same precedence and are left-associative.
 742     The precedence and associativity of the operators is hard-wired
 743     into the parser.  This may lead to non-intuitive behavior; for
 744     example the Boolean operators <literal>&lt;</> and
 745     <literal>&gt;</> have a different precedence than the Boolean
 746     operators <literal>&lt;=</> and <literal>&gt;=</>.  Also, you will
 747     sometimes need to add parentheses when using combinations of
 748     binary and unary operators.  For instance
 749 <programlisting>
 750 SELECT 5 ! - 6;
 751 </programlisting>
 752    will be parsed as
 753 <programlisting>
 754 SELECT 5 ! (- 6);
 755 </programlisting>
 756     because the parser has no idea &mdash; until it is too late
 757     &mdash; that <token>!</token> is defined as a postfix operator,
 758     not an infix one.  To get the desired behavior in this case, you
 759     must write
 760 <programlisting>
 761 SELECT (5 !) - 6;
 762 </programlisting>
 763     This is the price one pays for extensibility.
 764    </para>
 765
 766    <table id="sql-precedence-table">
 767     <title>Operator Precedence (decreasing)</title>
 768
 769     <tgroup cols="3">
 770      <thead>
 771       <row>
 772        <entry>Operator/Element</entry>
 773        <entry>Associativity</entry>
 774        <entry>Description</entry>
 775       </row>
 776      </thead>
 777
 778      <tbody>
 779       <row>
 780        <entry><token>.</token></entry>
 781        <entry>left</entry>
 782        <entry>table/column name separator</entry>
 783       </row>
 784
 785       <row>
 786        <entry><token>::</token></entry>
 787        <entry>left</entry>
 788        <entry><productname>PostgreSQL</productname>-style typecast</entry>
 789       </row>
 790
 791       <row>
 792        <entry><token>[</token> <token>]</token></entry>
 793        <entry>left</entry>
 794        <entry>array element selection</entry>
 795       </row>
 796
 797       <row>
 798        <entry><token>-</token></entry>
 799        <entry>right</entry>
 800        <entry>unary minus</entry>
 801       </row>
 802
 803       <row>
 804        <entry><token>^</token></entry>
 805        <entry>left</entry>
 806        <entry>exponentiation</entry>
 807       </row>
 808
 809       <row>
 810        <entry><token>*</token> <token>/</token> <token>%</token></entry>
 811        <entry>left</entry>
 812        <entry>multiplication, division, modulo</entry>
 813       </row>
 814
 815       <row>
 816        <entry><token>+</token> <token>-</token></entry>
 817        <entry>left</entry>
 818        <entry>addition, subtraction</entry>
 819       </row>
 820
 821       <row>
 822        <entry><token>IS</token></entry>
 823        <entry></entry>
 824        <entry><literal>IS TRUE</>, <literal>IS FALSE</>, <literal>IS UNKNOWN</>, <literal>IS NULL</></entry>
 825       </row>
 826
 827       <row>
 828        <entry><token>ISNULL</token></entry>
 829        <entry></entry>
 830        <entry>test for null</entry>
 831       </row>
 832
 833       <row>
 834        <entry><token>NOTNULL</token></entry>
 835        <entry></entry>
 836        <entry>test for not null</entry>
 837       </row>
 838
 839       <row>
 840        <entry>(any other)</entry>
 841        <entry>left</entry>
 842        <entry>all other native and user-defined operators</entry>
 843       </row>
 844
 845       <row>
 846        <entry><token>IN</token></entry>
 847        <entry></entry>
 848        <entry>set membership</entry>
 849       </row>
 850
 851       <row>
 852        <entry><token>BETWEEN</token></entry>
 853        <entry></entry>
 854        <entry>range containment</entry>
 855       </row>
 856
 857       <row>
 858        <entry><token>OVERLAPS</token></entry>
 859        <entry></entry>
 860        <entry>time interval overlap</entry>
 861       </row>
 862
 863       <row>
 864        <entry><token>LIKE</token> <token>ILIKE</token> <token>SIMILAR</token></entry>
 865        <entry></entry>
 866        <entry>string pattern matching</entry>
 867       </row>
 868
 869       <row>
 870        <entry><token>&lt;</token> <token>&gt;</token></entry>
 871        <entry></entry>
 872        <entry>less than, greater than</entry>
 873       </row>
 874
 875       <row>
 876        <entry><token>=</token></entry>
 877        <entry>right</entry>
 878        <entry>equality, assignment</entry>
 879       </row>
 880
 881       <row>
 882        <entry><token>NOT</token></entry>
 883        <entry>right</entry>
 884        <entry>logical negation</entry>
 885       </row>
 886
 887       <row>
 888        <entry><token>AND</token></entry>
 889        <entry>left</entry>
 890        <entry>logical conjunction</entry>
 891       </row>
 892
 893       <row>
 894        <entry><token>OR</token></entry>
 895        <entry>left</entry>
 896        <entry>logical disjunction</entry>
 897       </row>
 898      </tbody>
 899     </tgroup>
 900    </table>
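
   <para>
    As an illustrative sketch of how the table is applied, the groupings
    noted in the comments below follow from the precedence and
    associativity entries listed above:
<programlisting>
SELECT 2 + 3 * 4;          -- parsed as 2 + (3 * 4)
SELECT 2 ^ 3 ^ 2;          -- parsed as (2 ^ 3) ^ 2, since ^ is left-associative
SELECT - 2 ^ 2;            -- parsed as (- 2) ^ 2, since unary minus binds tighter than ^
SELECT NOT true OR true;   -- parsed as (NOT true) OR true
</programlisting>
    When in doubt, explicit parentheses make the intended grouping
    unambiguous.
   </para>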
 901
 902    <para>
 903     Note that the operator precedence rules also apply to user-defined
 904     operators that have the same names as the built-in operators
 905     mentioned above.  For example, if you define a
 906     <quote>+</quote> operator for some custom data type it will have
 907     the same precedence as the built-in <quote>+</quote> operator, no
 908     matter what yours does.
 909    </para>
 910
 911    <para>
 912     When a schema-qualified operator name is used in the
 913     <literal>OPERATOR</> syntax, as for example in
 914 <programlisting>
 915 SELECT 3 OPERATOR(pg_catalog.+) 4;
 916 </programlisting>
 917     the <literal>OPERATOR</> construct is taken to have the default precedence
 918     shown in <xref linkend="sql-precedence-table"> for <quote>any other</> operator.  This is true no matter
 919     which specific operator name appears inside <literal>OPERATOR()</>.
 920    </para>
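
   <para>
    A minimal sketch of the consequence, assuming the built-in
    <literal>pg_catalog</> arithmetic operators: because the
    <literal>OPERATOR</> construct takes the <quote>any other</>
    precedence, which is lower than that of <literal>+</>, the two
    statements below group differently.
<programlisting>
SELECT 2 + 3 * 4;                        -- parsed as 2 + (3 * 4)
SELECT 2 + 3 OPERATOR(pg_catalog.*) 4;   -- parsed as (2 + 3) OPERATOR(pg_catalog.*) 4
</programlisting>
   </para>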
 921   </sect2>
 922  </sect1>
 923
 924  <sect1 id="sql-expressions">
 925   <title>Value Expressions</title>
 926
 927   <indexterm zone="sql-expressions">
 928    <primary>expression</primary>
 929    <secondary>syntax</secondary>
 930   </indexterm>
 931
 932   <indexterm zone="sql-expressions">
 933    <primary>value expression</primary>
 934   </indexterm>
 935
 936   <indexterm>
 937    <primary>scalar</primary>
 938    <see>expression</see>
 939   </indexterm>
 940
 941   <para>
 942    Value expressions are used in a variety of contexts, such
 943    as in the target list of the <command>SELECT</command> command, as
 944    new column values in <command>INSERT</command> or
 945    <command>UPDATE</command>, or in search conditions in a number of
 946    commands.  The result of a value expression is sometimes called a
 947    <firstterm>scalar</firstterm>, to distinguish it from the result of
 948    a table expression (which is a table).  Value expressions are
 949    therefore also called <firstterm>scalar expressions</firstterm> (or
 950    even simply <firstterm>expressions</firstterm>).  The expression
 951    syntax allows the calculation of values from primitive parts using
 952    arithmetic, logical, set, and other operations.
 953   </para>
 954
 955   <para>
 956    A value expression is one of the following:
 957
 958    <itemizedlist>
 959     <listitem>
 960      <para>
 961       A constant or literal value.
 962      </para>
 963     </listitem>
 964
 965     <listitem>
 966      <para>
 967       A column reference.
 968      </para>
 969     </listitem>
 970
 971     <listitem>
 972      <para>
 973       A positional parameter reference, in the body of a function definition
 974       or prepared statement.
 975      </para>
 976     </listitem>
 977
 978     <listitem>
 979      <para>
 980       A subscripted expression.
 981      </para>
 982     </listitem>
 983
 984     <listitem>
 985      <para>
 986       A field selection expression.
 987      </para>
 988     </listitem>
 989
 990     <listitem>
 991      <para>
 992       An operator invocation.
 993      </para>
 994     </listitem>
 995
 996     <listitem>
 997      <para>
 998       A function call.
 999      </para>
1000     </listitem>
1001
1002     <listitem>
1003      <para>
1004       An aggregate expression.
1005      </para>
1006     </listitem>
1007
1008     <listitem>
1009      <para>
1010       A type cast.
1011      </para>
1012     </listitem>
1013
1014     <listitem>
1015      <para>
1016       A scalar subquery.
1017      </para>
1018     </listitem>
1019
1020     <listitem>
1021      <para>
1022       An array constructor.
1023      </para>
1024     </listitem>
1025
1026     <listitem>
1027      <para>
1028       A row constructor.
1029      </para>
1030     </listitem>
1031
1032     <listitem>
1033      <para>
1034       Another value expression in parentheses, useful to group
1035       subexpressions and override
1036       precedence.<indexterm><primary>parenthesis</></>
1037      </para>
1038     </listitem>
1039    </itemizedlist>
1040   </para>
1041
1042   <para>
1043    In addition to this list, there are a number of constructs that can
1044    be classified as an expression but do not follow any general syntax
1045    rules.  These generally have the semantics of a function or
1046    operator and are explained in the appropriate location in <xref
1047    linkend="functions">.  An example is the <literal>IS NULL</literal>
1048    clause.
1049   </para>
1050
1051   <para>
1052    We have already discussed constants in <xref
1053    linkend="sql-syntax-constants">.  The following sections discuss
1054    the remaining options.
1055   </para>
1056
1057   <sect2>
1058    <title>Column References</title>
1059
1060    <indexterm>
1061     <primary>column reference</primary>
1062    </indexterm>
1063
1064    <para>
1065     A column can be referenced in the form
1066 <synopsis>
1067 <replaceable>correlation</replaceable>.<replaceable>columnname</replaceable>
1068 </synopsis>
1069    </para>
1070
1071    <para>
1072     <replaceable>correlation</replaceable> is the name of a
1073     table (possibly qualified with a schema name), or an alias for a table
1074     defined by means of a <literal>FROM</literal> clause, or one of
1075     the key words <literal>NEW</literal> or <literal>OLD</literal>.
1076     (<literal>NEW</literal> and <literal>OLD</literal> can only appear in rewrite rules,
1077     while other correlation names can be used in any SQL statement.)
1078     The correlation name and separating dot may be omitted if the column name
1079     is unique across all the tables being used in the current query.  (See also <xref linkend="queries">.)
1080    </para>
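
   <para>
    The following sketch shows the possible spellings.  The table
    <literal>employees</>, its column <literal>salary</>, and the
    schema <literal>public</> are assumptions made only for
    illustration.
<programlisting>
SELECT employees.salary FROM employees;                  -- table name as correlation
SELECT public.employees.salary FROM public.employees;    -- schema-qualified table name
SELECT e.salary FROM employees AS e;                     -- alias defined in FROM
SELECT salary FROM employees;                            -- correlation omitted; column name is unique
</programlisting>
   </para>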
1081   </sect2>
1082
1083   <sect2>
1084    <title>Positional Parameters</title>
1085
1086    <indexterm>
1087     <primary>parameter</primary>
1088     <secondary>syntax</secondary>
1089    </indexterm>
1090
1091    <indexterm>
1092     <primary>$</primary>
1093    </indexterm>
1094
1095    <para>
1096     A positional parameter reference is used to indicate a value
1097     that is supplied externally to an SQL statement.  Parameters are
1098     used in SQL function definitions and in prepared queries.  Some
1099     client libraries also support specifying data values separately
1100     from the SQL command string, in which case parameters are used to
1101     refer to the out-of-line data values.
1102     The form of a parameter reference is:
1103 <synopsis>
1104 $<replaceable>number</replaceable>
1105 </synopsis>
1106    </para>
1107
1108    <para>
1109     For example, consider the definition of a function,
1110     <function>dept</function>, as
1111
1112 <programlisting>
1113 CREATE FUNCTION dept(text) RETURNS dept
1114     AS $$ SELECT * FROM dept WHERE name = $1 $$
1115     LANGUAGE SQL;
1116 </programlisting>
1117
1118     Here the <literal>$1</literal> references the value of the first
1119     function argument whenever the function is invoked.
1120    </para>
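
   <para>
    Parameter references work the same way in a prepared statement.  As
    a sketch, reusing the <literal>dept</> table from the function
    example (the statement name <literal>find_dept</> and the value
    <literal>'sales'</> are made up for illustration):
<programlisting>
PREPARE find_dept(text) AS
    SELECT * FROM dept WHERE name = $1;

EXECUTE find_dept('sales');   -- 'sales' is supplied as the value of $1
</programlisting>
   </para>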
1121   </sect2>
1122
1123   <sect2>
1124    <title>Subscripts</title>
1125
1126    <indexterm>
1127     <primary>subscript</primary>
1128    </indexterm>
1129
1130    <para>
1131     If an expression yields a value of an array type, then a specific
1132     element of the array value can be extracted by writing
1133 <synopsis>
1134 <replaceable>expression</replaceable>[<replaceable>subscript</replaceable>]
1135 </synopsis>
1136     or multiple adjacent elements (an <quote>array slice</>) can be extracted
1137     by writing
1138 <synopsis>
1139 <replaceable>expression</replaceable>[<replaceable>lower_subscript</replaceable>:<replaceable>upper_subscript</replaceable>]
1140 </synopsis>
1141     (Here, the brackets <literal>[ ]</literal> are meant to appear literally.)
1142     Each <replaceable>subscript</replaceable> is itself an expression,
1143     which must yield an integer value.
1144    </para>
1145
1146    <para>
1147     In general the array <replaceable>expression</replaceable> must be
1148     parenthesized, but the parentheses may be omitted when the expression
1149     to be subscripted is just a column reference or positional parameter.
1150     Also, multiple subscripts can be concatenated when the original array
1151     is multidimensional.
1152     For example,
1153
1154 <programlisting>
1155 mytable.arraycolumn[4]
1156 mytable.two_d_column[17][34]
1157 $1[10:42]
1158 (arrayfunction(a,b))[42]
1159 </programlisting>
1160
1161     The parentheses in the last example are required.
1162     See <xref linkend="arrays"> for more about arrays.
1163    </para>
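
   <para>
    As a self-contained sketch that needs no table, an array constructor
    can be subscripted directly.  Since it is neither a column reference
    nor a positional parameter, it is parenthesized here:
<programlisting>
SELECT (ARRAY[10,20,30,40])[2];       -- element selection: the second element
SELECT (ARRAY[10,20,30,40])[2:3];     -- array slice: second through third elements
SELECT (ARRAY[[1,2],[3,4]])[2][1];    -- concatenated subscripts on a two-dimensional array
</programlisting>
   </para>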
1164   </sect2>
1165
1166   <sect2>
1167    <title>Field Selection</title>
1168
1169    <indexterm>
1170     <primary>field selection</primary>
1171    </indexterm>
1172
1173    <para>
1174     If an expression yields a value of a composite type (row type), then a
1175     specific field of the row can be extracted by writing
1176 <synopsis>
1177 <replaceable>expression</replaceable>.<replaceable>fieldname</replaceable>
1178 </synopsis>
1179    </para>
1180
1181    <para>
1182     In general the row <replaceable>expression</replaceable> must be
1183     parenthesized, but the parentheses may be omitted when the expression
1184     to be selected from is just a table reference or positional parameter.
1185     For example,
1186
1187 <programlisting>
1188 mytable.mycolumn
1189 $1.somecolumn
1190 (rowfunction(a,b)).col3
1191 </programlisting>
1192
1193     (Thus, a qualified column reference is actually just a special case
1194     of the field selection syntax.)
1195    </para>
1196   </sect2>
1197
1198   <sect2>
1199    <title>Operator Invocations</title>
1200
1201    <indexterm>
1202     <primary>operator</primary>
1203     <secondary>invocation</secondary>
1204    </indexterm>
1205
1206    <para>
1207     There are three possible syntaxes for an operator invocation:
1208     <simplelist>
1209      <member><replaceable>expression</replaceable> <replaceable>operator</replaceable> <replaceable>expression</replaceable> (binary infix operator)</member>
1210      <member><replaceable>operator</replaceable> <replaceable>expression</replaceable> (unary prefix operator)</member>
1211      <member><replaceable>expression</replaceable> <replaceable>operator</replaceable> (unary postfix operator)</member>
1212     </simplelist>
1213     where the <replaceable>operator</replaceable> token follows the syntax
1214     rules of <xref linkend="sql-syntax-operators">, or is one of the
1215     key words <token>AND</token>, <token>OR</token>, and
1216     <token>NOT</token>, or is a qualified operator name in the form
1217 <synopsis>
1218 <literal>OPERATOR(</><replaceable>schema</><literal>.</><replaceable>operatorname</><literal>)</>
1219 </synopsis>
1220     Which particular operators exist and whether
1221     they are unary or binary depends on what operators have been
1222     defined by the system or the user.  <xref linkend="functions">
1223     describes the built-in operators.
1224    </para>
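
   <para>
    One built-in example of each form, as a sketch (the factorial
    operator <token>!</token> is the postfix operator also used in
    <xref linkend="sql-precedence">):
<programlisting>
SELECT 3 + 4;                        -- binary infix operator
SELECT - 3;                          -- unary prefix operator
SELECT 5 !;                          -- unary postfix operator
SELECT 3 OPERATOR(pg_catalog.+) 4;   -- qualified operator name, used infix
</programlisting>
   </para>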
1225   </sect2>
1226
1227   <sect2>
1228    <title>Function Calls</title>
1229
1230    <indexterm>
1231     <primary>function</primary>
1232     <secondary>invocation</secondary>
1233    </indexterm>
1234
1235    <para>
1236     The syntax for a function call is the name of a function
1237     (possibly qualified with a schema name), followed by its argument list
1238     enclosed in parentheses:
1239
1240 <synopsis>
1241 <replaceable>function</replaceable> (<optional><replaceable>expression</replaceable> <optional>, <replaceable>expression</replaceable> ... </optional></optional> )
1242 </synopsis>
1243    </para>
1244
1245    <para>
1246     For example, the following computes the square root of 2:
1247 <programlisting>
1248 sqrt(2)
1249 </programlisting>
1250    </para>
1251
1252    <para>
1253     The list of built-in functions is in <xref linkend="functions">.
1254     Other functions may be added by the user.
1255    </para>
1256   </sect2>
1257
1258   <sect2 id="syntax-aggregates">
1259    <title>Aggregate Expressions</title>
1260
1261    <indexterm zone="syntax-aggregates">
1262     <primary>aggregate function</primary>
1263     <secondary>invocation</secondary>
1264    </indexterm>
1265
1266    <para>
1267     An <firstterm>aggregate expression</firstterm> represents the
1268     application of an aggregate function across the rows selected by a
1269     query.  An aggregate function reduces multiple inputs to a single
1270     output value, such as the sum or average of the inputs.  The
1271     syntax of an aggregate expression is one of the following:
1272
1273 <synopsis>
1274 <replaceable>aggregate_name</replaceable> (<replaceable>expression</replaceable>)
1275 <replaceable>aggregate_name</replaceable> (ALL <replaceable>expression</replaceable>)
1276 <replaceable>aggregate_name</replaceable> (DISTINCT <replaceable>expression</replaceable>)
1277 <replaceable>aggregate_name</replaceable> ( * )
1278 </synopsis>
1279
1280     where <replaceable>aggregate_name</replaceable> is a previously
1281     defined aggregate (possibly qualified with a schema name), and
1282     <replaceable>expression</replaceable> is
1283     any value expression that does not itself contain an aggregate
1284     expression.
1285    </para>
1286
1287    <para>
1288     The first form of aggregate expression invokes the aggregate
1289     across all input rows for which the given expression yields a
1290     non-null value.  (Actually, it is up to the aggregate function
1291     whether to ignore null values or not &mdash; but all the standard ones do.)
1292     The second form is the same as the first, since
1293     <literal>ALL</literal> is the default.  The third form invokes the
1294     aggregate for all distinct non-null values of the expression found
1295     in the input rows.  The last form invokes the aggregate once for
1296     each input row regardless of null or non-null values; since no
1297     particular input value is specified, it is generally only useful
1298     for the <function>count()</function> aggregate function.
1299    </para>
1300
1301    <para>
1302     For example, <literal>count(*)</literal> yields the total number
1303     of input rows; <literal>count(f1)</literal> yields the number of
1304     input rows in which <literal>f1</literal> is non-null;
1305     <literal>count(distinct f1)</literal> yields the number of
1306     distinct non-null values of <literal>f1</literal>.
1307    </para>
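
   <para>
    A small sketch makes the difference concrete.  The table
    <literal>t</> and its contents are hypothetical:
<programlisting>
CREATE TABLE t (f1 int);
INSERT INTO t VALUES (1);
INSERT INTO t VALUES (1);
INSERT INTO t VALUES (2);
INSERT INTO t VALUES (NULL);

-- Four input rows, three non-null values, two distinct non-null values:
SELECT count(*), count(f1), count(DISTINCT f1) FROM t;
</programlisting>
   </para>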
1308
1309    <para>
1310     The predefined aggregate functions are described in <xref
1311     linkend="functions-aggregate">.  Other aggregate functions may be added
1312     by the user.
1313    </para>
1314
1315    <para>
1316     An aggregate expression may only appear in the result list or
1317     <literal>HAVING</> clause of a <command>SELECT</> command.
1318     It is forbidden in other clauses, such as <literal>WHERE</>,
1319     because those clauses are logically evaluated before the results
1320     of aggregates are formed.
1321    </para>
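
   <para>
    For instance, assuming a hypothetical table <literal>cities</> with
    columns <literal>state</> and <literal>pop</>, a condition on an
    aggregate result belongs in <literal>HAVING</>:
<programlisting>
-- Rejected: the aggregate appears in WHERE
SELECT state FROM cities WHERE max(pop) &gt; 1000000;

-- Accepted: the aggregate appears in HAVING, which is applied after grouping
SELECT state FROM cities GROUP BY state HAVING max(pop) &gt; 1000000;
</programlisting>
   </para>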
1322
1323    <para>
1324     When an aggregate expression appears in a subquery (see
1325     <xref linkend="sql-syntax-scalar-subqueries"> and
1326     <xref linkend="functions-subquery">), the aggregate is normally
1327     evaluated over the rows of the subquery.  But an exception occurs
1328     if the aggregate's argument contains only outer-level variables:
1329     the aggregate then belongs to the nearest such outer level, and is
1330     evaluated over the rows of that query.  The aggregate expression
1331     as a whole is then an outer reference for the subquery it appears in,
1332     and acts as a constant over any one evaluation of that subquery.
1333     The restriction about
1334     appearing only in the result list or <literal>HAVING</> clause
1335     applies with respect to the query level that the aggregate belongs to.
1336    </para>
1337   </sect2>
1338
1339   <sect2 id="sql-syntax-type-casts">
1340    <title>Type Casts</title>
1341
1342    <indexterm>
1343     <primary>data type</primary>
1344     <secondary>type cast</secondary>
1345    </indexterm>
1346
1347    <indexterm>
1348     <primary>type cast</primary>
1349    </indexterm>
1350
1351    <para>
1352     A type cast specifies a conversion from one data type to another.
1353     <productname>PostgreSQL</productname> accepts two equivalent syntaxes
1354     for type casts:
1355 <synopsis>
1356 CAST ( <replaceable>expression</replaceable> AS <replaceable>type</replaceable> )
1357 <replaceable>expression</replaceable>::<replaceable>type</replaceable>
1358 </synopsis>
1359     The <literal>CAST</> syntax conforms to SQL; the syntax with
1360     <literal>::</literal> is historical <productname>PostgreSQL</productname>
1361     usage.
1362    </para>
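
   <para>
    As a brief sketch, the two statements below request the same
    conversion of a numeric value to <literal>integer</>:
<programlisting>
SELECT CAST(3.7 AS integer);   -- SQL-conforming syntax
SELECT 3.7::integer;           -- historical PostgreSQL syntax
</programlisting>
   </para>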
1363
1364    <para>
1365     When a cast is applied to a value expression of a known type, it
1366     represents a run-time type conversion.  The cast will succeed only
1367     if a suitable type conversion operation has been defined.  Notice that this
1368     is subtly different from the use of casts with constants, as shown in
1369     <xref linkend="sql-syntax-constants-generic">.  A cast applied to an
1370     unadorned string literal represents the initial assignment of a type
1371     to a literal constant value, and so it will succeed for any type
1372     (if the contents of the string literal are acceptable input syntax for the
1373     data type).
1374    </para>
1375
1376    <para>
1377     An explicit type cast may usually be omitted if there is no ambiguity as
1378     to the type that a value expression must produce (for example, when it is
1379     assigned to a table column); the system will automatically apply a
1380     type cast in such cases.  However, automatic casting is only done for
1381     casts that are marked <quote>OK to apply implicitly</>
1382     in the system catalogs.  Other casts must be invoked with
1383     explicit casting syntax.  This restriction is intended to prevent
1384     surprising conversions from being applied silently.
1385    </para>
1386
1387    <para>
1388     It is also possible to specify a type cast using a function-like
1389     syntax:
1390 <synopsis>
1391 <replaceable>typename</replaceable> ( <replaceable>expression</replaceable> )
1392 </synopsis>
1393     However, this only works for types whose names are also valid as
1394     function names.  For example, <literal>double precision</literal>
1395     can't be used this way, but the equivalent <literal>float8</literal>
1396     can.  Also, the names <literal>interval</>, <literal>time</>, and
1397     <literal>timestamp</> can only be used in this fashion if they are
1398     double-quoted, because of syntactic conflicts.  Therefore, the use of
1399     the function-like cast syntax leads to inconsistencies and should
1400     probably be avoided in new applications.
1401
1402     (The function-like syntax is in fact just a function call.  When
1403     one of the two standard cast syntaxes is used to do a run-time
1404     conversion, it will internally invoke a registered function to
1405     perform the conversion.  By convention, these conversion functions
1406     have the same name as their output type, and thus the <quote>function-like
1407     syntax</> is nothing more than a direct invocation of the underlying
1408     conversion function.  Obviously, this is not something that a portable
1409     application should rely on.)
1410    </para>
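
   <para>
    To illustrate with the type names mentioned above (a sketch, not a
    recommendation to use the function-like form):
<programlisting>
SELECT float8(42);                     -- function-like syntax; float8 is a valid function name
SELECT CAST(42 AS double precision);   -- standard syntax for the same conversion
</programlisting>
   </para>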
1411   </sect2>
1412
1413   <sect2 id="sql-syntax-scalar-subqueries">
1414    <title>Scalar Subqueries</title>
1415
1416    <indexterm>
1417     <primary>subquery</primary>
1418    </indexterm>
1419
1420    <para>
1421     A scalar subquery is an ordinary
1422     <command>SELECT</command> query in parentheses that returns exactly one
1423     row with one column.  (See <xref linkend="queries"> for information about writing queries.)
1424     The <command>SELECT</command> query is executed
1425     and the single returned value is used in the surrounding value expression.
1426     It is an error to use a query that
1427     returns more than one row or more than one column as a scalar subquery.
1428     (But if, during a particular execution, the subquery returns no rows,
1429     there is no error; the scalar result is taken to be null.)
1430     The subquery can refer to variables from the surrounding query,
1431     which will act as constants during any one evaluation of the subquery.
1432     See also <xref linkend="functions-subquery"> for other expressions involving subqueries.
1433    </para>
1434
1435    <para>
1436     For example, the following finds the largest city population in each
1437     state:
1438 <programlisting>
1439 SELECT name, (SELECT max(pop) FROM cities WHERE cities.state = states.name)
1440     FROM states;
1441 </programlisting>
1442    </para>
1443   </sect2>
1444
1445   <sect2 id="sql-syntax-array-constructors">
1446    <title>Array Constructors</title>
1447
1448    <indexterm>
1449     <primary>array</primary>
1450     <secondary>constructor</secondary>
1451    </indexterm>
1452
1453    <indexterm>
1454     <primary>ARRAY</primary>
1455    </indexterm>
1456
1457    <para>
1458     An array constructor is an expression that builds an
1459     array value from values for its member elements.  A simple array
1460     constructor
1461     consists of the key word <literal>ARRAY</literal>, a left square bracket
1462     <literal>[</>, one or more expressions (separated by commas) for the
1463     array element values, and finally a right square bracket <literal>]</>.
1464     For example,
1465 <programlisting>
1466 SELECT ARRAY[1,2,3+4];
1467   array
1468 ---------
1469  {1,2,7}
1470 (1 row)
1471 </programlisting>
1472     The array element type is the common type of the member expressions,
1473     determined using the same rules as for <literal>UNION</> or
1474     <literal>CASE</> constructs (see <xref linkend="typeconv-union-case">).
1475    </para>
1476
1477    <para>
1478     Multidimensional array values can be built by nesting array
1479     constructors.
1480     In the inner constructors, the key word <literal>ARRAY</literal> may
1481     be omitted.  For example, these produce the same result:
1482
1483 <programlisting>
1484 SELECT ARRAY[ARRAY[1,2], ARRAY[3,4]];
1485      array
1486 ---------------
1487  {{1,2},{3,4}}
1488 (1 row)
1489
1490 SELECT ARRAY[[1,2],[3,4]];
1491      array
1492 ---------------
1493  {{1,2},{3,4}}
1494 (1 row)
1495 </programlisting>
1496
1497     Since multidimensional arrays must be rectangular, inner constructors
1498     at the same level must produce sub-arrays of identical dimensions.
1499   </para>
1500
1501   <para>
1502     Multidimensional array constructor elements can be anything yielding
1503     an array of the proper kind, not only a sub-<literal>ARRAY</> construct.
1504     For example:
1505 <programlisting>
1506 CREATE TABLE arr(f1 int[], f2 int[]);
1507
1508 INSERT INTO arr VALUES (ARRAY[[1,2],[3,4]], ARRAY[[5,6],[7,8]]);
1509
1510 SELECT ARRAY[f1, f2, '{{9,10},{11,12}}'::int[]] FROM arr;
1511                      array
1512 ------------------------------------------------
1513  {{{1,2},{3,4}},{{5,6},{7,8}},{{9,10},{11,12}}}
1514 (1 row)
1515 </programlisting>
1516   </para>
1517
1518   <para>
1519    It is also possible to construct an array from the results of a
1520    subquery.  In this form, the array constructor is written with the
1521    key word <literal>ARRAY</literal> followed by a parenthesized (not
1522    bracketed) subquery. For example:
1523 <programlisting>
1524 SELECT ARRAY(SELECT oid FROM pg_proc WHERE proname LIKE 'bytea%');
1525                           ?column?
1526 -------------------------------------------------------------
1527  {2011,1954,1948,1952,1951,1244,1950,2005,1949,1953,2006,31}
1528 (1 row)
1529 </programlisting>
1530    The subquery must return a single column. The resulting
1531    one-dimensional array will have an element for each row in the
1532    subquery result, with an element type matching that of the
1533    subquery's output column.
1534   </para>
1535
1536   <para>
1537    The subscripts of an array value built with <literal>ARRAY</literal>
1538    always begin with one.  For more information about arrays, see
1539    <xref linkend="arrays">.
1540   </para>
1541
1542   </sect2>
1543
1544   <sect2 id="sql-syntax-row-constructors">
1545    <title>Row Constructors</title>
1546
1547    <indexterm>
1548     <primary>composite type</primary>
1549     <secondary>constructor</secondary>
1550    </indexterm>
1551
1552    <indexterm>
1553     <primary>row type</primary>
1554     <secondary>constructor</secondary>
1555    </indexterm>
1556
1557    <indexterm>
1558     <primary>ROW</primary>
1559    </indexterm>
1560
1561    <para>
1562     A row constructor is an expression that builds a row value (also
1563     called a composite value) from values
1564     for its member fields.  A row constructor consists of the key word
1565     <literal>ROW</literal>, a left parenthesis, zero or more
1566     expressions (separated by commas) for the row field values, and finally
1567     a right parenthesis.  For example,
1568 <programlisting>
1569 SELECT ROW(1,2.5,'this is a test');
1570 </programlisting>
1571     The key word <literal>ROW</> is optional when there is more than one
1572     expression in the list.
1573    </para>
1574
1575    <para>
1576     By default, the value created by a <literal>ROW</> expression is of
1577     an anonymous record type.  If necessary, it can be cast to a named
1578     composite type &mdash; either the row type of a table, or a composite type
1579     created with <command>CREATE TYPE AS</>.  An explicit cast may be needed
1580     to avoid ambiguity.  For example:
1581 <programlisting>
1582 CREATE TABLE mytable(f1 int, f2 float, f3 text);
1583
1584 CREATE FUNCTION getf1(mytable) RETURNS int AS 'SELECT $1.f1' LANGUAGE SQL;
1585
1586 -- No cast needed since only one getf1() exists
1587 SELECT getf1(ROW(1,2.5,'this is a test'));
1588  getf1
1589 -------
1590      1
1591 (1 row)
1592
1593 CREATE TYPE myrowtype AS (f1 int, f2 text, f3 numeric);
1594
1595 CREATE FUNCTION getf1(myrowtype) RETURNS int AS 'SELECT $1.f1' LANGUAGE SQL;
1596
1597 -- Now we need a cast to indicate which function to call:
1598 SELECT getf1(ROW(1,2.5,'this is a test'));
1599 ERROR:  function getf1(record) is not unique
1600
1601 SELECT getf1(ROW(1,2.5,'this is a test')::mytable);
1602  getf1
1603 -------
1604      1
1605 (1 row)
1606
1607 SELECT getf1(CAST(ROW(11,'this is a test',2.5) AS myrowtype));
1608  getf1
1609 -------
1610     11
1611 (1 row)
1612 </programlisting>
1613   </para>
1614
1615   <para>
1616    Row constructors can be used to build composite values to be stored
1617    in a composite-type table column, or to be passed to a function that
1618    accepts a composite parameter.  Also,
1619    it is possible to compare two row values or test a row with
1620    <literal>IS NULL</> or <literal>IS NOT NULL</>, for example
1621 <programlisting>
1622 SELECT ROW(1,2.5,'this is a test') = ROW(1, 3, 'not the same');
1623
1624 SELECT ROW(f1, f2, f3) IS NOT NULL FROM mytable;
1625 </programlisting>
1626    For more detail see <xref linkend="functions-comparisons">.
1627    Row constructors can also be used in connection with subqueries,
1628    as discussed in <xref linkend="functions-subquery">.
1629   </para>
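
  <para>
   As a minimal sketch of the first use mentioned above, a row
   constructor can supply the value for a composite-type column.  The
   names <literal>inventory_item</> and <literal>on_hand</> are
   hypothetical, chosen only for illustration:
<programlisting>
-- Hypothetical composite type and table, for illustration only
CREATE TYPE inventory_item AS (name text, supplier_id int, price numeric);

CREATE TABLE on_hand (item inventory_item, count int);

-- The ROW expression is converted to the column's composite type
INSERT INTO on_hand VALUES (ROW('fuzzy dice', 42, 1.99), 1000);

-- Parentheses are needed to select a field from the composite column
SELECT (item).name FROM on_hand WHERE (item).price &gt; 1;
</programlisting>
  </para>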
1630
1631   </sect2>
1632
1633   <sect2 id="syntax-express-eval">
1634    <title>Expression Evaluation Rules</title>
1635
1636    <indexterm>
1637     <primary>expression</primary>
1638     <secondary>order of evaluation</secondary>
1639    </indexterm>
1640
1641    <para>
1642     The order of evaluation of subexpressions is not defined.  In
1643     particular, the inputs of an operator or function are not necessarily
1644     evaluated left-to-right or in any other fixed order.
1645    </para>
1646
1647    <para>
1648     Furthermore, if the result of an expression can be determined by
1649     evaluating only some parts of it, then other subexpressions
1650     might not be evaluated at all.  For instance, if one wrote
1651 <programlisting>
1652 SELECT true OR somefunc();
1653 </programlisting>
1654     then <literal>somefunc()</literal> would (probably) not be called
1655     at all. The same would be the case if one wrote
1656 <programlisting>
1657 SELECT somefunc() OR true;
1658 </programlisting>
1659     Note that this is not the same as the left-to-right
1660     <quote>short-circuiting</quote> of Boolean operators that is found
1661     in some programming languages.
1662    </para>
1663
1664    <para>
1665     As a consequence, it is unwise to use functions with side effects
1666     as part of complex expressions.  It is particularly dangerous to
1667     rely on side effects or evaluation order in <literal>WHERE</> and <literal>HAVING</> clauses,
1668     since those clauses are extensively reprocessed as part of
1669     developing an execution plan.  Boolean
1670     expressions (<literal>AND</>/<literal>OR</>/<literal>NOT</> combinations) in those clauses may be reorganized
1671     in any manner allowed by the laws of Boolean algebra.
1672    </para>
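
   <para>
    The following sketch illustrates the risk.  The tables
    <literal>accounts</> and <literal>access_log</> and the function
    <literal>log_access()</> are hypothetical; the function is assumed
    to record each call and return <literal>true</>:
<programlisting>
-- Hypothetical tables and function, for illustration only
CREATE TABLE accounts (id int, balance numeric);
CREATE TABLE access_log (account_id int);

-- Side effect: every call adds a row to access_log
CREATE FUNCTION log_access(int) RETURNS boolean AS
    'INSERT INTO access_log VALUES ($1); SELECT true'
    LANGUAGE SQL;

-- Unsafe: assumes log_access() runs only when balance &gt; 1000
SELECT * FROM accounts WHERE balance &gt; 1000 AND log_access(id);
</programlisting>
    Nothing guarantees that <literal>log_access()</> is called only for
    rows with a balance above 1000, so the contents of
    <literal>access_log</> cannot be relied upon to match the query
    result.
   </para>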
1673
1674    <para>
1675     When it is essential to force evaluation order, a <literal>CASE</>
1676     construct (see <xref linkend="functions-conditional">) may be
1677     used.  For example, this is an untrustworthy way of trying to
1678     avoid division by zero in a <literal>WHERE</> clause:
1679 <programlisting>
1680 SELECT ... WHERE x &lt;&gt; 0 AND y/x &gt; 1.5;
1681 </programlisting>
1682     But this is safe:
1683 <programlisting>
1684 SELECT ... WHERE CASE WHEN x &lt;&gt; 0 THEN y/x &gt; 1.5 ELSE false END;
1685 </programlisting>
1686     A <literal>CASE</> construct used in this fashion will defeat optimization
1687     attempts, so it should be used only when necessary.  (In this particular
1688     example, it would doubtless be best to sidestep the problem by writing
1689     <literal>y &gt; 1.5*x</> instead.)
1690    </para>
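
   <para>
    As a self-contained sketch of the safe form, the following uses a
    hypothetical table <literal>ratios</>.  The second query shows an
    alternative that is not discussed above: <function>NULLIF</> turns a
    zero divisor into a null value, so the division yields null instead
    of raising an error, and <literal>WHERE</> rejects the row:
<programlisting>
-- Hypothetical table, for illustration only
CREATE TABLE ratios (x numeric, y numeric);
INSERT INTO ratios VALUES (0, 5);
INSERT INTO ratios VALUES (2, 4);
INSERT INTO ratios VALUES (2, 1);

-- Guarded division using CASE, as above
SELECT x, y FROM ratios
    WHERE CASE WHEN x &lt;&gt; 0 THEN y/x &gt; 1.5 ELSE false END;

-- Alternative: y / NULL is null, and a null condition excludes the row
SELECT x, y FROM ratios WHERE y / NULLIF(x, 0) &gt; 1.5;
</programlisting>
    Both queries should return only the row with <literal>x = 2</> and
    <literal>y = 4</>.
   </para>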
1691   </sect2>
1692  </sect1>
1693
1694 </chapter>
1695
1696 <!-- Keep this comment at the end of the file
1697 Local variables:
1698 mode:sgml
1699 sgml-omittag:nil
1700 sgml-shorttag:t
1701 sgml-minimize-attributes:nil
1702 sgml-always-quote-attributes:t
1703 sgml-indent-step:1
1704 sgml-indent-data:t
1705 sgml-parent-document:nil
1706 sgml-default-dtd-file:"./reference.ced"
1707 sgml-exposed-tags:nil
1708 sgml-local-catalogs:("/usr/lib/sgml/catalog")
1709 sgml-local-ecat-files:nil
1710 End:
1711 -->