doc/src/sgml/datatype.sgml

   1 <!-- doc/src/sgml/datatype.sgml -->
   2
   3  <chapter id="datatype">
   4   <title>Data Types</title>
   5
   6   <indexterm zone="datatype">
   7    <primary>data type</primary>
   8   </indexterm>
   9
  10   <indexterm>
  11    <primary>type</primary>
  12    <see>data type</see>
  13   </indexterm>
  14
  15   <para>
  16    <productname>PostgreSQL</productname> has a rich set of native data
  17    types available to users.  Users can add new types to
  18    <productname>PostgreSQL</productname> using the <xref
  19    linkend="sql-createtype"> command.
  20   </para>
  21
  22   <para>
  23    <xref linkend="datatype-table"> shows all the built-in general-purpose data
  24    types. Most of the alternative names listed in the
  25    <quote>Aliases</quote> column are the names used internally by
  26    <productname>PostgreSQL</productname> for historical reasons.  In
  27    addition, some internally used or deprecated types are available,
  28    but are not listed here.
  29   </para>
  30
  31    <table id="datatype-table">
  32     <title>Data Types</title>
  33     <tgroup cols="3">
  34      <thead>
  35       <row>
  36        <entry>Name</entry>
  37        <entry>Aliases</entry>
  38        <entry>Description</entry>
  39       </row>
  40      </thead>
  41
  42      <tbody>
  43       <row>
  44        <entry><type>bigint</type></entry>
  45        <entry><type>int8</type></entry>
  46        <entry>signed eight-byte integer</entry>
  47       </row>
  48
  49       <row>
  50        <entry><type>bigserial</type></entry>
  51        <entry><type>serial8</type></entry>
  52        <entry>autoincrementing eight-byte integer</entry>
  53       </row>
  54
  55       <row>
  56        <entry><type>bit [ (<replaceable>n</replaceable>) ]</type></entry>
  57        <entry></entry>
  58        <entry>fixed-length bit string</entry>
  59       </row>
  60
  61       <row>
  62        <entry><type>bit varying [ (<replaceable>n</replaceable>) ]</type></entry>
  63        <entry><type>varbit</type></entry>
  64        <entry>variable-length bit string</entry>
  65       </row>
  66
  67       <row>
  68        <entry><type>boolean</type></entry>
  69        <entry><type>bool</type></entry>
  70        <entry>logical Boolean (true/false)</entry>
  71       </row>
  72
  73       <row>
  74        <entry><type>box</type></entry>
  75        <entry></entry>
  76        <entry>rectangular box on a plane</entry>
  77       </row>
  78
  79       <row>
  80        <entry><type>bytea</type></entry>
  81        <entry></entry>
  82        <entry>binary data (<quote>byte array</>)</entry>
  83       </row>
  84
  85       <row>
  86        <entry><type>character varying [ (<replaceable>n</replaceable>) ]</type></entry>
  87        <entry><type>varchar [ (<replaceable>n</replaceable>) ]</type></entry>
  88        <entry>variable-length character string</entry>
  89       </row>
  90
  91       <row>
  92        <entry><type>character [ (<replaceable>n</replaceable>) ]</type></entry>
  93        <entry><type>char [ (<replaceable>n</replaceable>) ]</type></entry>
  94        <entry>fixed-length character string</entry>
  95       </row>
  96
  97       <row>
  98        <entry><type>cidr</type></entry>
  99        <entry></entry>
 100        <entry>IPv4 or IPv6 network address</entry>
 101       </row>
 102
 103       <row>
 104        <entry><type>circle</type></entry>
 105        <entry></entry>
 106        <entry>circle on a plane</entry>
 107       </row>
 108
 109       <row>
 110        <entry><type>date</type></entry>
 111        <entry></entry>
 112        <entry>calendar date (year, month, day)</entry>
 113       </row>
 114
 115       <row>
 116        <entry><type>double precision</type></entry>
 117        <entry><type>float8</type></entry>
 118        <entry>double precision floating-point number (8 bytes)</entry>
 119       </row>
 120
 121       <row>
 122        <entry><type>inet</type></entry>
 123        <entry></entry>
 124        <entry>IPv4 or IPv6 host address</entry>
 125       </row>
 126
 127       <row>
 128        <entry><type>integer</type></entry>
 129        <entry><type>int</type>, <type>int4</type></entry>
 130        <entry>signed four-byte integer</entry>
 131       </row>
 132
 133       <row>
 134        <entry><type>interval [ <replaceable>fields</replaceable> ] [ (<replaceable>p</replaceable>) ]</type></entry>
 135        <entry></entry>
 136        <entry>time span</entry>
 137       </row>
 138
 139       <row>
 140        <entry><type>line</type></entry>
 141        <entry></entry>
 142        <entry>infinite line on a plane</entry>
 143       </row>
 144
 145       <row>
 146        <entry><type>lseg</type></entry>
 147        <entry></entry>
 148        <entry>line segment on a plane</entry>
 149       </row>
 150
 151       <row>
 152        <entry><type>macaddr</type></entry>
 153        <entry></entry>
 154        <entry>MAC (Media Access Control) address</entry>
 155       </row>
 156
 157       <row>
 158        <entry><type>money</type></entry>
 159        <entry></entry>
 160        <entry>currency amount</entry>
 161       </row>
 162
 163       <row>
 164        <entry><type>numeric [ (<replaceable>p</replaceable>,
 165          <replaceable>s</replaceable>) ]</type></entry>
 166        <entry><type>decimal [ (<replaceable>p</replaceable>,
 167          <replaceable>s</replaceable>) ]</type></entry>
 168        <entry>exact numeric of selectable precision</entry>
 169       </row>
 170
 171       <row>
 172        <entry><type>path</type></entry>
 173        <entry></entry>
 174        <entry>geometric path on a plane</entry>
 175       </row>
 176
 177       <row>
 178        <entry><type>point</type></entry>
 179        <entry></entry>
 180        <entry>geometric point on a plane</entry>
 181       </row>
 182
 183       <row>
 184        <entry><type>polygon</type></entry>
 185        <entry></entry>
 186        <entry>closed geometric path on a plane</entry>
 187       </row>
 188
 189       <row>
 190        <entry><type>real</type></entry>
 191        <entry><type>float4</type></entry>
 192        <entry>single precision floating-point number (4 bytes)</entry>
 193       </row>
 194
 195       <row>
 196        <entry><type>smallint</type></entry>
 197        <entry><type>int2</type></entry>
 198        <entry>signed two-byte integer</entry>
 199       </row>
 200
 201       <row>
 202        <entry><type>serial</type></entry>
 203        <entry><type>serial4</type></entry>
 204        <entry>autoincrementing four-byte integer</entry>
 205       </row>
 206
 207       <row>
 208        <entry><type>text</type></entry>
 209        <entry></entry>
 210        <entry>variable-length character string</entry>
 211       </row>
 212
 213       <row>
 214        <entry><type>time [ (<replaceable>p</replaceable>) ] [ without time zone ]</type></entry>
 215        <entry></entry>
 216        <entry>time of day (no time zone)</entry>
 217       </row>
 218
 219       <row>
 220        <entry><type>time [ (<replaceable>p</replaceable>) ] with time zone</type></entry>
 221        <entry><type>timetz</type></entry>
 222        <entry>time of day, including time zone</entry>
 223       </row>
 224
 225       <row>
 226        <entry><type>timestamp [ (<replaceable>p</replaceable>) ] [ without time zone ]</type></entry>
 227        <entry></entry>
 228        <entry>date and time (no time zone)</entry>
 229       </row>
 230
 231       <row>
 232        <entry><type>timestamp [ (<replaceable>p</replaceable>) ] with time zone</type></entry>
 233        <entry><type>timestamptz</type></entry>
 234        <entry>date and time, including time zone</entry>
 235       </row>
 236
 237       <row>
 238        <entry><type>tsquery</type></entry>
 239        <entry></entry>
 240        <entry>text search query</entry>
 241       </row>
 242
 243       <row>
 244        <entry><type>tsvector</type></entry>
 245        <entry></entry>
 246        <entry>text search document</entry>
 247       </row>
 248
 249       <row>
 250        <entry><type>txid_snapshot</type></entry>
 251        <entry></entry>
 252        <entry>user-level transaction ID snapshot</entry>
 253       </row>
 254
 255       <row>
 256        <entry><type>uuid</type></entry>
 257        <entry></entry>
 258        <entry>universally unique identifier</entry>
 259       </row>
 260
 261       <row>
 262        <entry><type>xml</type></entry>
 263        <entry></entry>
 264        <entry>XML data</entry>
 265       </row>
 266      </tbody>
 267     </tgroup>
 268    </table>
 269
 270   <note>
 271    <title>Compatibility</title>
 272    <para>
 273     The following types (or spellings thereof) are specified by
 274     <acronym>SQL</acronym>: <type>bigint</type>, <type>bit</type>, <type>bit
 275     varying</type>, <type>boolean</type>, <type>char</type>,
 276     <type>character varying</type>, <type>character</type>,
 277     <type>varchar</type>, <type>date</type>, <type>double
 278     precision</type>, <type>integer</type>, <type>interval</type>,
 279     <type>numeric</type>, <type>decimal</type>, <type>real</type>,
 280     <type>smallint</type>, <type>time</type> (with or without time zone),
 281     <type>timestamp</type> (with or without time zone),
 282     <type>xml</type>.
 283    </para>
 284   </note>
 285
 286   <para>
 287    Each data type has an external representation determined by its input
 288    and output functions.  Many of the built-in types have
 289    obvious external formats.  However, several types are either unique
 290    to <productname>PostgreSQL</productname>, such as geometric
 291    paths, or have several possible formats, such as the date
 292    and time types.
 293    Some of the input and output functions are not invertible, i.e.,
 294    the result of an output function might lose accuracy when compared to
 295    the original input.
 296   </para>
 297
 298   <sect1 id="datatype-numeric">
 299    <title>Numeric Types</title>
 300
 301    <indexterm zone="datatype-numeric">
 302     <primary>data type</primary>
 303     <secondary>numeric</secondary>
 304    </indexterm>
 305
 306    <para>
 307     Numeric types consist of two-, four-, and eight-byte integers,
 308     four- and eight-byte floating-point numbers, and selectable-precision
 309     decimals.  <xref linkend="datatype-numeric-table"> lists the
 310     available types.
 311    </para>
 312
 313     <table id="datatype-numeric-table">
 314      <title>Numeric Types</title>
 315      <tgroup cols="4">
 316       <thead>
 317        <row>
 318         <entry>Name</entry>
 319         <entry>Storage Size</entry>
 320         <entry>Description</entry>
 321         <entry>Range</entry>
 322        </row>
 323       </thead>
 324
 325       <tbody>
 326        <row>
 327         <entry><type>smallint</></entry>
 328         <entry>2 bytes</entry>
 329         <entry>small-range integer</entry>
 330         <entry>-32768 to +32767</entry>
 331        </row>
 332        <row>
 333         <entry><type>integer</></entry>
 334         <entry>4 bytes</entry>
 335         <entry>typical choice for integer</entry>
 336         <entry>-2147483648 to +2147483647</entry>
 337        </row>
 338        <row>
 339         <entry><type>bigint</></entry>
 340         <entry>8 bytes</entry>
 341         <entry>large-range integer</entry>
        <entry>-9223372036854775808 to +9223372036854775807</entry>
 343        </row>
 344
 345        <row>
 346         <entry><type>decimal</></entry>
 347         <entry>variable</entry>
 348         <entry>user-specified precision, exact</entry>
 349         <entry>no limit</entry>
 350        </row>
 351        <row>
 352         <entry><type>numeric</></entry>
 353         <entry>variable</entry>
 354         <entry>user-specified precision, exact</entry>
 355         <entry>no limit</entry>
 356        </row>
 357
 358        <row>
 359         <entry><type>real</></entry>
 360         <entry>4 bytes</entry>
 361         <entry>variable-precision, inexact</entry>
 362         <entry>6 decimal digits precision</entry>
 363        </row>
 364        <row>
 365         <entry><type>double precision</></entry>
 366         <entry>8 bytes</entry>
 367         <entry>variable-precision, inexact</entry>
 368         <entry>15 decimal digits precision</entry>
 369        </row>
 370
 371        <row>
 372         <entry><type>serial</></entry>
 373         <entry>4 bytes</entry>
 374         <entry>autoincrementing integer</entry>
 375         <entry>1 to 2147483647</entry>
 376        </row>
 377
 378        <row>
 379         <entry><type>bigserial</type></entry>
 380         <entry>8 bytes</entry>
 381         <entry>large autoincrementing integer</entry>
 382         <entry>1 to 9223372036854775807</entry>
 383        </row>
 384       </tbody>
 385      </tgroup>
 386     </table>
 387
 388    <para>
 389     The syntax of constants for the numeric types is described in
 390     <xref linkend="sql-syntax-constants">.  The numeric types have a
 391     full set of corresponding arithmetic operators and
 392     functions. Refer to <xref linkend="functions"> for more
 393     information.  The following sections describe the types in detail.
 394    </para>
 395
 396    <sect2 id="datatype-int">
 397     <title>Integer Types</title>
 398
 399     <indexterm zone="datatype-int">
 400      <primary>integer</primary>
 401     </indexterm>
 402
 403     <indexterm zone="datatype-int">
 404      <primary>smallint</primary>
 405     </indexterm>
 406
 407     <indexterm zone="datatype-int">
 408      <primary>bigint</primary>
 409     </indexterm>
 410
 411     <indexterm>
 412      <primary>int4</primary>
 413      <see>integer</see>
 414     </indexterm>
 415
 416     <indexterm>
 417      <primary>int2</primary>
 418      <see>smallint</see>
 419     </indexterm>
 420
 421     <indexterm>
 422      <primary>int8</primary>
 423      <see>bigint</see>
 424     </indexterm>
 425
 426     <para>
 427      The types <type>smallint</type>, <type>integer</type>, and
 428      <type>bigint</type> store whole numbers, that is, numbers without
 429      fractional components, of various ranges.  Attempts to store
 430      values outside of the allowed range will result in an error.
 431     </para>
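
    <para>
     As a brief illustration of the range check (a sketch only; the
     exact error text may differ between versions), casting a value just
     outside the <type>smallint</type> range fails rather than wrapping
     around:
<programlisting>
SELECT 32767::smallint;   -- succeeds: the largest smallint value
SELECT 32768::smallint;   -- fails with an out-of-range error
</programlisting>
    </para>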
 432
 433     <para>
     The type <type>integer</type> is the common choice, as it offers
     the best balance between range, storage size, and performance.
     The <type>smallint</type> type is generally used only if disk
     space is at a premium.  The <type>bigint</type> type should be
     used only if the range of the <type>integer</type> type is
     insufficient, because <type>integer</type> arithmetic is
     generally faster.
 440     </para>
 441
 442     <para>
 443      On very minimal operating systems the <type>bigint</type> type
 444      might not function correctly, because it relies on compiler support
 445      for eight-byte integers.  On such machines, <type>bigint</type>
 446      acts the same as <type>integer</type>, but still takes up eight
 447      bytes of storage.  (We are not aware of any modern
 448      platform where this is the case.)
 449     </para>
 450
 451     <para>
 452      <acronym>SQL</acronym> only specifies the integer types
 453      <type>integer</type> (or <type>int</type>),
 454      <type>smallint</type>, and <type>bigint</type>.  The
 455      type names <type>int2</type>, <type>int4</type>, and
 456      <type>int8</type> are extensions, which are also used by some
 457      other <acronym>SQL</acronym> database systems.
 458     </para>
 459
 460    </sect2>
 461
 462    <sect2 id="datatype-numeric-decimal">
 463     <title>Arbitrary Precision Numbers</title>
 464
 465     <indexterm>
 466      <primary>numeric (data type)</primary>
 467     </indexterm>
 468
 469    <indexterm>
 470     <primary>arbitrary precision numbers</primary>
 471    </indexterm>
 472
 473     <indexterm>
 474      <primary>decimal</primary>
 475      <see>numeric</see>
 476     </indexterm>
 477
 478     <para>
 479      The type <type>numeric</type> can store numbers with up to 1000
 480      digits of precision and perform calculations exactly. It is
 481      especially recommended for storing monetary amounts and other
 482      quantities where exactness is required. However, arithmetic on
 483      <type>numeric</type> values is very slow compared to the integer
 484      types, or to the floating-point types described in the next section.
 485     </para>
 486
 487     <para>
 488      We use the following terms below:  The
 489      <firstterm>scale</firstterm> of a <type>numeric</type> is the
 490      count of decimal digits in the fractional part, to the right of
 491      the decimal point.  The <firstterm>precision</firstterm> of a
 492      <type>numeric</type> is the total count of significant digits in
 493      the whole number, that is, the number of digits to both sides of
 494      the decimal point.  So the number 23.5141 has a precision of 6
 495      and a scale of 4.  Integers can be considered to have a scale of
 496      zero.
 497     </para>
 498
 499     <para>
 500      Both the maximum precision and the maximum scale of a
 501      <type>numeric</type> column can be
 502      configured.  To declare a column of type <type>numeric</type> use
 503      the syntax:
<programlisting>
NUMERIC(<replaceable>precision</replaceable>, <replaceable>scale</replaceable>)
</programlisting>
     The precision must be positive; the scale must be zero or positive.
     Alternatively:
<programlisting>
NUMERIC(<replaceable>precision</replaceable>)
</programlisting>
     selects a scale of 0.  Specifying:
<programlisting>
NUMERIC
</programlisting>
 516      without any precision or scale creates a column in which numeric
 517      values of any precision and scale can be stored, up to the
 518      implementation limit on precision.  A column of this kind will
 519      not coerce input values to any particular scale, whereas
 520      <type>numeric</type> columns with a declared scale will coerce
 521      input values to that scale.  (The <acronym>SQL</acronym> standard
 522      requires a default scale of 0, i.e., coercion to integer
 523      precision.  We find this a bit useless.  If you're concerned
 524      about portability, always specify the precision and scale
 525      explicitly.)
 526     </para>
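
    <para>
     The three forms of declaration can be compared with a few casts.
     This is a minimal sketch; the value <literal>1.5</literal> is
     chosen only for illustration:
<programlisting>
SELECT CAST(1.5 AS numeric(4,1));  -- scale 1: the value is kept as 1.5
SELECT CAST(1.5 AS numeric(4));    -- scale 0: the value is rounded to 2
SELECT CAST(1.5 AS numeric);       -- no declared scale: the value is kept as 1.5
</programlisting>
    </para>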
 527
 528     <para>
 529      If the scale of a value to be stored is greater than the declared
 530      scale of the column, the system will round the value to the specified
 531      number of fractional digits.  Then, if the number of digits to the
 532      left of the decimal point exceeds the declared precision minus the
 533      declared scale, an error is raised.
 534     </para>
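
    <para>
     The order of these two steps matters, as the following sketch
     suggests.  The table <literal>items</literal> and its column
     <literal>price</literal> are hypothetical names used only for
     this example:
<programlisting>
CREATE TABLE items (price numeric(5,2));  -- at most 3 digits left of the point

INSERT INTO items VALUES (123.456);  -- rounded to 123.46, then stored
INSERT INTO items VALUES (999.999);  -- rounds to 1000.00 first, so it is rejected
INSERT INTO items VALUES (1234.5);   -- rejected: four digits left of the point
</programlisting>
     The second <command>INSERT</command> fails even though the input has
     only three digits to the left of the decimal point, because
     rounding happens before the precision check.
    </para>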
 535
 536     <para>
 537      Numeric values are physically stored without any extra leading or
 538      trailing zeroes.  Thus, the declared precision and scale of a column
 539      are maximums, not fixed allocations.  (In this sense the <type>numeric</>
 540      type is more akin to <type>varchar(<replaceable>n</>)</type>
 541      than to <type>char(<replaceable>n</>)</type>.)  The actual storage
 542      requirement is two bytes for each group of four decimal digits,
 543      plus five to eight bytes overhead.
 544     </para>
 545
 546     <indexterm>
 547      <primary>NaN</primary>
 548      <see>not a number</see>
 549    </indexterm>
 550
 551     <indexterm>
 552      <primary>not a number</primary>
 553      <secondary>numeric (data type)</secondary>
 554     </indexterm>
 555
 556     <para>
 557      In addition to ordinary numeric values, the <type>numeric</type>
 558      type allows the special value <literal>NaN</>, meaning
 559      <quote>not-a-number</quote>.  Any operation on <literal>NaN</>
 560      yields another <literal>NaN</>.  When writing this value
 561      as a constant in an SQL command, you must put quotes around it,
 562      for example <literal>UPDATE table SET x = 'NaN'</>.  On input,
 563      the string <literal>NaN</> is recognized in a case-insensitive manner.
 564     </para>
 565
 566     <note>
 567      <para>
 568       In most implementations of the <quote>not-a-number</> concept,
 569       <literal>NaN</> is not considered equal to any other numeric
 570       value (including <literal>NaN</>).  In order to allow
 571       <type>numeric</> values to be sorted and used in tree-based
 572       indexes, <productname>PostgreSQL</> treats <literal>NaN</>
 573       values as equal, and greater than all non-<literal>NaN</>
 574       values.
 575      </para>
 576     </note>
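
    <para>
     The behavior of <literal>NaN</literal> described above can be
     observed with a few queries (a sketch; the constant
     <literal>1e100</literal> merely stands for some large value):
<programlisting>
SELECT 'NaN'::numeric + 1;               -- NaN
SELECT 'nan'::numeric = 'NaN'::numeric;  -- true: NaN values compare as equal
SELECT 'NaN'::numeric &gt; 1e100;           -- true: NaN sorts above other values
</programlisting>
    </para>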
 577
 578     <para>
 579      The types <type>decimal</type> and <type>numeric</type> are
 580      equivalent.  Both types are part of the <acronym>SQL</acronym>
 581      standard.
 582     </para>
 583    </sect2>
 584
 585
 586    <sect2 id="datatype-float">
 587     <title>Floating-Point Types</title>
 588
 589     <indexterm zone="datatype-float">
 590      <primary>real</primary>
 591     </indexterm>
 592
 593     <indexterm zone="datatype-float">
 594      <primary>double precision</primary>
 595     </indexterm>
 596
 597     <indexterm>
 598      <primary>float4</primary>
 599      <see>real</see>
 600     </indexterm>
 601
 602     <indexterm>
 603      <primary>float8</primary>
 604      <see>double precision</see>
 605     </indexterm>
 606
 607     <indexterm zone="datatype-float">
 608      <primary>floating point</primary>
 609     </indexterm>
 610
 611     <para>
 612      The data types <type>real</type> and <type>double
 613      precision</type> are inexact, variable-precision numeric types.
 614      In practice, these types are usually implementations of
 615      <acronym>IEEE</acronym> Standard 754 for Binary Floating-Point
 616      Arithmetic (single and double precision, respectively), to the
 617      extent that the underlying processor, operating system, and
 618      compiler support it.
 619     </para>
 620
 621     <para>
 622      Inexact means that some values cannot be converted exactly to the
 623      internal format and are stored as approximations, so that storing
 624      and retrieving a value might show slight discrepancies.
 625      Managing these errors and how they propagate through calculations
 626      is the subject of an entire branch of mathematics and computer
 627      science and will not be discussed here, except for the
 628      following points:
 629      <itemizedlist>
 630       <listitem>
 631        <para>
 632         If you require exact storage and calculations (such as for
 633         monetary amounts), use the <type>numeric</type> type instead.
 634        </para>
 635       </listitem>
 636
 637       <listitem>
 638        <para>
 639         If you want to do complicated calculations with these types
 640         for anything important, especially if you rely on certain
 641         behavior in boundary cases (infinity, underflow), you should
 642         evaluate the implementation carefully.
 643        </para>
 644       </listitem>
 645
 646       <listitem>
 647        <para>
 648         Comparing two floating-point values for equality might not
 649         always work as expected.
 650        </para>
 651       </listitem>
 652      </itemizedlist>
 653     </para>
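
    <para>
     The last point can be illustrated with a classic case.  This is a
     sketch that assumes IEEE 754 double precision arithmetic; on such
     platforms the decimal fractions involved have no exact binary
     representation:
<programlisting>
SELECT 0.1::float8 + 0.2::float8 = 0.3::float8;     -- false
SELECT 0.1::numeric + 0.2::numeric = 0.3::numeric;  -- true
</programlisting>
    </para>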
 654
 655     <para>
 656      On most platforms, the <type>real</type> type has a range of at least
 657      1E-37 to 1E+37 with a precision of at least 6 decimal digits.  The
 658      <type>double precision</type> type typically has a range of around
 659      1E-307 to 1E+308 with a precision of at least 15 digits.  Values that
 660      are too large or too small will cause an error.  Rounding might
 661      take place if the precision of an input number is too high.
 662      Numbers too close to zero that are not representable as distinct
 663      from zero will cause an underflow error.
 664     </para>
 665
 666     <indexterm>
 667      <primary>not a number</primary>
 668      <secondary>double precision</secondary>
 669     </indexterm>
 670
 671     <para>
 672      In addition to ordinary numeric values, the floating-point types
 673      have several special values:
<literallayout>
<literal>Infinity</literal>
<literal>-Infinity</literal>
<literal>NaN</literal>
</literallayout>
 679      These represent the IEEE 754 special values
 680      <quote>infinity</quote>, <quote>negative infinity</quote>, and
 681      <quote>not-a-number</quote>, respectively.  (On a machine whose
 682      floating-point arithmetic does not follow IEEE 754, these values
 683      will probably not work as expected.)  When writing these values
 684      as constants in an SQL command, you must put quotes around them,
 685      for example <literal>UPDATE table SET x = 'Infinity'</>.  On input,
 686      these strings are recognized in a case-insensitive manner.
 687     </para>
 688
 689     <note>
 690      <para>
      IEEE 754 specifies that <literal>NaN</> should not compare equal
 692       to any other floating-point value (including <literal>NaN</>).
 693       In order to allow floating-point values to be sorted and used
 694       in tree-based indexes, <productname>PostgreSQL</> treats
 695       <literal>NaN</> values as equal, and greater than all
 696       non-<literal>NaN</> values.
 697      </para>
 698     </note>
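
    <para>
     As a short sketch of the special values and their ordering,
     assuming IEEE 754 floating-point arithmetic:
<programlisting>
-- Input is case-insensitive, and the values must be quoted.
SELECT 'Infinity'::float8, '-infinity'::float8, 'nan'::float8;

SELECT 'NaN'::float8 = 'NaN'::float8;        -- true
SELECT 'NaN'::float8 &gt; 'Infinity'::float8;   -- true
</programlisting>
    </para>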
 699
 700     <para>
 701      <productname>PostgreSQL</productname> also supports the SQL-standard
 702      notations <type>float</type> and
 703      <type>float(<replaceable>p</replaceable>)</type> for specifying
 704      inexact numeric types.  Here, <replaceable>p</replaceable> specifies
 705      the minimum acceptable precision in <emphasis>binary</> digits.
 706      <productname>PostgreSQL</productname> accepts
 707      <type>float(1)</type> to <type>float(24)</type> as selecting the
 708      <type>real</type> type, while
 709      <type>float(25)</type> to <type>float(53)</type> select
 710      <type>double precision</type>.  Values of <replaceable>p</replaceable>
 711      outside the allowed range draw an error.
 712      <type>float</type> with no precision specified is taken to mean
 713      <type>double precision</type>.
 714     </para>
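
    <para>
     The mapping can be checked with the built-in function
     <function>pg_typeof</function>, which is not otherwise covered in
     this section and is assumed here to be available in your version.
     The following is a sketch:
<programlisting>
SELECT pg_typeof(CAST(1 AS float(24)));  -- real
SELECT pg_typeof(CAST(1 AS float(25)));  -- double precision
SELECT pg_typeof(CAST(1 AS float));      -- double precision
SELECT CAST(1 AS float(54));             -- fails: precision outside the allowed range
</programlisting>
    </para>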
 715
 716     <note>
 717      <para>
 718       Prior to <productname>PostgreSQL</productname> 7.4, the precision in
      <type>float(<replaceable>p</replaceable>)</type> was interpreted as
      a number of <emphasis>decimal</> digits.  This has been corrected to match the SQL
 721       standard, which specifies that the precision is measured in binary
 722       digits.  The assumption that <type>real</type> and
 723       <type>double precision</type> have exactly 24 and 53 bits in the
 724       mantissa respectively is correct for IEEE-standard floating point
 725       implementations.  On non-IEEE platforms it might be off a little, but
 726       for simplicity the same ranges of <replaceable>p</replaceable> are used
 727       on all platforms.
 728      </para>
 729     </note>
 730
 731    </sect2>
 732
 733    <sect2 id="datatype-serial">
 734     <title>Serial Types</title>
 735
 736     <indexterm zone="datatype-serial">
 737      <primary>serial</primary>
 738     </indexterm>
 739
 740     <indexterm zone="datatype-serial">
 741      <primary>bigserial</primary>
 742     </indexterm>
 743
 744     <indexterm zone="datatype-serial">
 745      <primary>serial4</primary>
 746     </indexterm>
 747
 748     <indexterm zone="datatype-serial">
 749      <primary>serial8</primary>
 750     </indexterm>
 751
 752     <indexterm>
 753      <primary>auto-increment</primary>
 754      <see>serial</see>
 755     </indexterm>
 756
 757     <indexterm>
 758      <primary>sequence</primary>
 759      <secondary>and serial type</secondary>
 760     </indexterm>
 761
 762     <para>
 763      The data types <type>serial</type> and <type>bigserial</type>
 764      are not true types, but merely
 765      a notational convenience for creating unique identifier columns
 766      (similar to the <literal>AUTO_INCREMENT</literal> property
 767      supported by some other databases). In the current
 768      implementation, specifying:
 769
<programlisting>
CREATE TABLE <replaceable class="parameter">tablename</replaceable> (
    <replaceable class="parameter">colname</replaceable> SERIAL
);
</programlisting>
 775
 776      is equivalent to specifying:
 777
<programlisting>
CREATE SEQUENCE <replaceable class="parameter">tablename</replaceable>_<replaceable class="parameter">colname</replaceable>_seq;
CREATE TABLE <replaceable class="parameter">tablename</replaceable> (
    <replaceable class="parameter">colname</replaceable> integer NOT NULL DEFAULT nextval('<replaceable class="parameter">tablename</replaceable>_<replaceable class="parameter">colname</replaceable>_seq')
);
ALTER SEQUENCE <replaceable class="parameter">tablename</replaceable>_<replaceable class="parameter">colname</replaceable>_seq OWNED BY <replaceable class="parameter">tablename</replaceable>.<replaceable class="parameter">colname</replaceable>;
</programlisting>
 785
 786      Thus, we have created an integer column and arranged for its default
 787      values to be assigned from a sequence generator.  A <literal>NOT NULL</>
 788      constraint is applied to ensure that a null value cannot be
 789      inserted.  (In most cases you would also want to attach a
 790      <literal>UNIQUE</> or <literal>PRIMARY KEY</> constraint to prevent
 791      duplicate values from being inserted by accident, but this is
 792      not automatic.)  Lastly, the sequence is marked as <quote>owned by</>
 793      the column, so that it will be dropped if the column or table is dropped.
 794     </para>
 795
 796     <note>
 797      <para>
 798       Prior to <productname>PostgreSQL</productname> 7.3, <type>serial</type>
      implied <literal>UNIQUE</literal>.  This is no longer automatic.  If
      you want a serial column to have a unique constraint or be a
      primary key, you must now specify it, just as for
      any other data type.
 803      </para>
 804     </note>
 805
 806     <para>
 807      To insert the next value of the sequence into the <type>serial</type>
 808      column, specify that the <type>serial</type>
 809      column should be assigned its default value. This can be done
 810      either by excluding the column from the list of columns in
 811      the <command>INSERT</command> statement, or through the use of
 812      the <literal>DEFAULT</literal> key word.
 813     </para>
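
    <para>
     Both ways of requesting the default can be sketched as follows.
     The table <literal>users</literal> and its columns are
     hypothetical names used only for this example:
<programlisting>
CREATE TABLE users (
    id   serial PRIMARY KEY,   -- uniqueness must be requested explicitly
    name text
);

INSERT INTO users (name) VALUES ('alice');             -- column omitted
INSERT INTO users (id, name) VALUES (DEFAULT, 'bob');  -- DEFAULT key word

SELECT id, name FROM users ORDER BY id;
</programlisting>
     In each case the <literal>id</literal> column receives the next
     value from the underlying sequence.
    </para>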
 814
 815     <para>
 816      The type names <type>serial</type> and <type>serial4</type> are
 817      equivalent: both create <type>integer</type> columns.  The type
 818      names <type>bigserial</type> and <type>serial8</type> work
 819      the same way, except that they create a <type>bigint</type>
 820      column.  <type>bigserial</type> should be used if you anticipate
 821      the use of more than 2<superscript>31</> identifiers over the
 822      lifetime of the table.
 823     </para>
 824
 825     <para>
 826      The sequence created for a <type>serial</type> column is
 827      automatically dropped when the owning column is dropped.
 828      You can drop the sequence without dropping the column, but this
 829      will force removal of the column default expression.
 830     </para>
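
    <para>
     The following sketch shows how the owned sequence might be
     inspected and dropped.  The table <literal>orders</literal> is
     hypothetical, the sequence name <literal>orders_id_seq</literal>
     assumes the default naming pattern shown earlier, and the use of
     <literal>CASCADE</literal> is an assumption about how the dependent
     default expression is removed:
<programlisting>
CREATE TABLE orders (id serial, note text);

-- Report the name of the sequence owned by the column.
SELECT pg_get_serial_sequence('orders', 'id');

-- Drop only the sequence; the column remains but loses its default.
DROP SEQUENCE orders_id_seq CASCADE;
</programlisting>
    </para>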
 831    </sect2>
 832   </sect1>
 833
 834   <sect1 id="datatype-money">
 835    <title>Monetary Types</title>
 836
 837    <para>
 838     The <type>money</type> type stores a currency amount with a fixed
 839     fractional precision; see <xref
 840     linkend="datatype-money-table">.  The fractional precision is
 841     determined by the database's <xref linkend="guc-lc-monetary"> setting.
 842     The range shown in the table assumes there are two fractional digits.
 843     Input is accepted in a variety of formats, including integer and
 844     floating-point literals, as well as typical
 845     currency formatting, such as <literal>'$1,000.00'</literal>.
 846     Output is generally in the latter form but depends on the locale.
 847    </para>
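
   <para>
    For instance, each of the following literals should be accepted as
    <type>money</type> input.  This is only a sketch: it assumes an
    <varname>lc_monetary</varname> setting that uses <literal>.</literal>
    as the decimal point, <literal>,</literal> as the group separator,
    and <literal>$</literal> as the currency symbol, and the displayed
    results depend on that setting as well.
<programlisting>
SELECT '1000'::money;        -- integer literal
SELECT '1000.50'::money;     -- floating-point literal
SELECT '$1,000.00'::money;   -- typical currency formatting
</programlisting>
   </para>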
 848
 849     <table id="datatype-money-table">
 850      <title>Monetary Types</title>
 851      <tgroup cols="4">
 852       <thead>
 853        <row>
 854         <entry>Name</entry>
 855         <entry>Storage Size</entry>
 856         <entry>Description</entry>
 857         <entry>Range</entry>
 858        </row>
 859       </thead>
 860       <tbody>
 861        <row>
 862         <entry>money</entry>
 863         <entry>8 bytes</entry>
 864         <entry>currency amount</entry>
 865         <entry>-92233720368547758.08 to +92233720368547758.07</entry>
 866        </row>
 867       </tbody>
 868      </tgroup>
 869     </table>
 870
 871    <para>
 872     Since the output of this data type is locale-sensitive, it might not
 873     work to load <type>money</> data into a database that has a different
 874     setting of <varname>lc_monetary</>.  To avoid problems, before
 875     restoring a dump into a new database make sure <varname>lc_monetary</> has
 876     the same or equivalent value as in the database that was dumped.
 877    </para>
 878
 879    <para>
 880     Values of the <type>numeric</type> data type can be cast to
 881     <type>money</type>.  Other numeric types can be converted to
 882     <type>money</type> by casting to <type>numeric</type> first, for example:
 883 <programlisting>
 884 SELECT 1234::numeric::money;
 885 </programlisting>
 886     A <type>money</type> value can be cast to <type>numeric</type> without
 887     loss of precision. Conversion to other types could potentially lose
 888     precision, and it must be done in two stages, for example:
 889 <programlisting>
 890 SELECT '52093.89'::money::numeric::float8;
 891 </programlisting>
 892    </para>
 893
 894    <para>
 895     When a <type>money</type> value is divided by another <type>money</type>
 896     value, the result is <type>double precision</type> (i.e., a pure number,
 897     not money); the currency units cancel each other out in the division.
 898    </para>
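
   <para>
    A small illustration, with arbitrarily chosen amounts:
<programlisting>
SELECT '10.00'::money / '4.00'::money;              -- 2.5
SELECT pg_typeof('10.00'::money / '4.00'::money);   -- double precision
</programlisting>
   </para>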
 899   </sect1>
 900
 901
 902   <sect1 id="datatype-character">
 903    <title>Character Types</title>
 904
 905    <indexterm zone="datatype-character">
 906     <primary>character string</primary>
 907     <secondary>data types</secondary>
 908    </indexterm>
 909
 910    <indexterm>
 911     <primary>string</primary>
 912     <see>character string</see>
 913    </indexterm>
 914
 915    <indexterm zone="datatype-character">
 916     <primary>character</primary>
 917    </indexterm>
 918
 919    <indexterm zone="datatype-character">
 920     <primary>character varying</primary>
 921    </indexterm>
 922
 923    <indexterm zone="datatype-character">
 924     <primary>text</primary>
 925    </indexterm>
 926
 927    <indexterm zone="datatype-character">
 928     <primary>char</primary>
 929    </indexterm>
 930
 931    <indexterm zone="datatype-character">
 932     <primary>varchar</primary>
 933    </indexterm>
 934
 935     <table id="datatype-character-table">
 936      <title>Character Types</title>
 937      <tgroup cols="2">
 938       <thead>
 939        <row>
 940         <entry>Name</entry>
 941         <entry>Description</entry>
 942        </row>
 943       </thead>
 944       <tbody>
 945        <row>
 946         <entry><type>character varying(<replaceable>n</>)</type>, <type>varchar(<replaceable>n</>)</type></entry>
 947         <entry>variable-length with limit</entry>
 948        </row>
 949        <row>
 950         <entry><type>character(<replaceable>n</>)</type>, <type>char(<replaceable>n</>)</type></entry>
 951         <entry>fixed-length, blank padded</entry>
 952        </row>
 953        <row>
 954         <entry><type>text</type></entry>
 955         <entry>variable unlimited length</entry>
 956        </row>
 957      </tbody>
 958      </tgroup>
 959     </table>
 960
 961    <para>
 962     <xref linkend="datatype-character-table"> shows the
 963     general-purpose character types available in
 964     <productname>PostgreSQL</productname>.
 965    </para>
 966
 967    <para>
 968     <acronym>SQL</acronym> defines two primary character types:
 969     <type>character varying(<replaceable>n</>)</type> and
 970     <type>character(<replaceable>n</>)</type>, where <replaceable>n</>
 971     is a positive integer.  Both of these types can store strings up to
 972     <replaceable>n</> characters (not bytes) in length.  An attempt to store a
 973     longer string into a column of these types will result in an
 974     error, unless the excess characters are all spaces, in which case
 975     the string will be truncated to the maximum length. (This somewhat
 976     bizarre exception is required by the <acronym>SQL</acronym>
 977     standard.) If the string to be stored is shorter than the declared
 978     length, values of type <type>character</type> will be space-padded;
    values of type <type>character varying</type> will simply store the
    shorter string.
 982    </para>
 983
 984    <para>
 985     If one explicitly casts a value to <type>character
 986     varying(<replaceable>n</>)</type> or
 987     <type>character(<replaceable>n</>)</type>, then an over-length
 988     value will be truncated to <replaceable>n</> characters without
 989     raising an error. (This too is required by the
 990     <acronym>SQL</acronym> standard.)
 991    </para>
 992
 993    <para>
 994     The notations <type>varchar(<replaceable>n</>)</type> and
 995     <type>char(<replaceable>n</>)</type> are aliases for <type>character
 996     varying(<replaceable>n</>)</type> and
 997     <type>character(<replaceable>n</>)</type>, respectively.
 998     <type>character</type> without length specifier is equivalent to
 999     <type>character(1)</type>. If <type>character varying</type> is used
1000     without length specifier, the type accepts strings of any size. The
1001     latter is a <productname>PostgreSQL</> extension.
1002    </para>
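
   <para>
    The following sketch illustrates these equivalences.  It relies on
    the explicit-cast truncation rule described above, and the literal
    values are arbitrary:
<programlisting>
SELECT 'abc'::character;    -- treated as character(1), yields 'a'
SELECT 'abc'::char(2);      -- alias for character(2), yields 'ab'
SELECT 'a string of any length'::character varying;   -- no limit applies
</programlisting>
   </para>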
1003
1004    <para>
1005     In addition, <productname>PostgreSQL</productname> provides the
1006     <type>text</type> type, which stores strings of any length.
1007     Although the type <type>text</type> is not in the
1008     <acronym>SQL</acronym> standard, several other SQL database
1009     management systems have it as well.
1010    </para>
1011
1012    <para>
1013     Values of type <type>character</type> are physically padded
1014     with spaces to the specified width <replaceable>n</>, and are
1015     stored and displayed that way.  However, the padding spaces are
1016     treated as semantically insignificant.  Trailing spaces are
1017     disregarded when comparing two values of type <type>character</type>,
1018     and they will be removed when converting a <type>character</type> value
    to one of the other string types.  Note that trailing spaces
    <emphasis>are</> semantically significant in
    <type>character varying</type> and <type>text</type> values, and
    when using pattern matching, for example with <literal>LIKE</> or
    regular expressions.
1024    </para>
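
   <para>
    The difference can be observed with a few comparisons.  This is a
    sketch with arbitrary literal values; the results in the comments
    are those expected from the rules just described:
<programlisting>
-- Trailing spaces are ignored when comparing character values.
SELECT 'ab  '::character(4) = 'ab'::character(4);                    -- true

-- They are significant for character varying and text.
SELECT 'ab  '::character varying(4) = 'ab'::character varying(4);    -- false
SELECT 'ab  '::text = 'ab'::text;                                    -- false

-- Pattern matching sees the padding of a character value.
SELECT 'ab'::character(4) LIKE 'ab';                                 -- false
</programlisting>
   </para>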
1025
1026    <para>
1027     The storage requirement for a short string (up to 126 bytes) is 1 byte
1028     plus the actual string, which includes the space padding in the case of
1029     <type>character</type>.  Longer strings have 4 bytes of overhead instead
1030     of 1.  Long strings are compressed by the system automatically, so
1031     the physical requirement on disk might be less. Very long values are also
1032     stored in background tables so that they do not interfere with rapid
1033     access to shorter column values. In any case, the longest
1034     possible character string that can be stored is about 1 GB. (The
1035     maximum value that will be allowed for <replaceable>n</> in the data
1036     type declaration is less than that. It wouldn't be useful to
1037     change this because with multibyte character encodings the number of
1038     characters and bytes can be quite different. If you desire to
1039     store long strings with no specific upper limit, use
1040     <type>text</type> or <type>character varying</type> without a length
1041     specifier, rather than making up an arbitrary length limit.)
1042    </para>
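
   <para>
    One way to observe these storage requirements is the
    <function>pg_column_size</function> function.  The following is a
    sketch using a hypothetical table <literal>size_demo</literal>; the
    exact numbers reported can vary with the encoding and the values
    stored:
<programlisting>
CREATE TABLE size_demo (
    short_val  text,
    padded_val character(10)
);
INSERT INTO size_demo VALUES ('abc', 'abc');

-- The padded column is expected to need more bytes than the text column,
-- because its space padding is stored along with the string.
SELECT pg_column_size(short_val)  AS text_bytes,
       pg_column_size(padded_val) AS character_bytes
FROM size_demo;
</programlisting>
   </para>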
1043
1044    <tip>
1045     <para>
1046      There is no performance difference among these three types,
1047      apart from increased storage space when using the blank-padded
1048      type, and a few extra CPU cycles to check the length when storing into
1049      a length-constrained column.  While
1050      <type>character(<replaceable>n</>)</type> has performance
1051      advantages in some other database systems, there is no such advantage in
1052      <productname>PostgreSQL</productname>; in fact
1053      <type>character(<replaceable>n</>)</type> is usually the slowest of
1054      the three because of its additional storage costs.  In most situations
1055      <type>text</type> or <type>character varying</type> should be used
1056      instead.
1057     </para>
1058    </tip>
1059
1060    <para>
1061     Refer to <xref linkend="sql-syntax-strings"> for information about
1062     the syntax of string literals, and to <xref linkend="functions">
1063     for information about available operators and functions. The
1064     database character set determines the character set used to store
1065     textual values; for more information on character set support,
1066     refer to <xref linkend="multibyte">.
1067    </para>
1068
1069    <example>
1070     <title>Using the Character Types</title>
1071
1072 <programlisting>
1073 CREATE TABLE test1 (a character(4));
1074 INSERT INTO test1 VALUES ('ok');
1075 SELECT a, char_length(a) FROM test1; -- <co id="co.datatype-char">
1076 <computeroutput>
1077   a   | char_length
1078 ------+-------------
1079  ok   |           2
1080 </computeroutput>
1081
1082 CREATE TABLE test2 (b varchar(5));
1083 INSERT INTO test2 VALUES ('ok');
1084 INSERT INTO test2 VALUES ('good      ');
1085 INSERT INTO test2 VALUES ('too long');
1086 <computeroutput>ERROR:  value too long for type character varying(5)</computeroutput>
1087 INSERT INTO test2 VALUES ('too long'::varchar(5)); -- explicit truncation
1088 SELECT b, char_length(b) FROM test2;
1089 <computeroutput>
1090    b   | char_length
1091 -------+-------------
1092  ok    |           2
1093  good  |           5
1094  too l |           5
1095 </computeroutput>
1096 </programlisting>
1097     <calloutlist>
1098      <callout arearefs="co.datatype-char">
1099       <para>
1100        The <function>char_length</function> function is discussed in
1101        <xref linkend="functions-string">.
1102       </para>
1103      </callout>
1104     </calloutlist>
1105    </example>
1106
1107    <para>
1108     There are two other fixed-length character types in
1109     <productname>PostgreSQL</productname>, shown in <xref
1110     linkend="datatype-character-special-table">. The <type>name</type>
1111     type exists <emphasis>only</emphasis> for the storage of identifiers
1112     in the internal system catalogs and is not intended for use by the general user. Its
1113     length is currently defined as 64 bytes (63 usable characters plus
1114     terminator) but should be referenced using the constant
1115     <symbol>NAMEDATALEN</symbol> in <literal>C</> source code.
1116     The length is set at compile time (and
1117     is therefore adjustable for special uses); the default maximum
1118     length might change in a future release. The type <type>"char"</type>
1119     (note the quotes) is different from <type>char(1)</type> in that it
1120     only uses one byte of storage. It is internally used in the system
1121     catalogs as a simplistic enumeration type.
1122    </para>
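
   <para>
    Both types can be seen at work in the system catalogs.  As an
    illustration, the following query inspects the
    <structname>pg_class</structname> entry that describes
    <structname>pg_class</structname> itself:
<programlisting>
SELECT relname,
       pg_typeof(relname) AS relname_type,   -- name
       relkind,
       pg_typeof(relkind) AS relkind_type    -- "char"
FROM pg_class
WHERE relname = 'pg_class';
</programlisting>
    Here <structfield>relkind</structfield> holds a one-byte code that
    identifies the kind of relation.
   </para>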
1123
1124     <table id="datatype-character-special-table">
1125      <title>Special Character Types</title>
1126      <tgroup cols="3">
1127       <thead>
1128        <row>
1129         <entry>Name</entry>
1130         <entry>Storage Size</entry>
1131         <entry>Description</entry>
1132        </row>
1133       </thead>
1134       <tbody>
1135        <row>
1136         <entry><type>"char"</type></entry>
1137         <entry>1 byte</entry>
1138         <entry>single-byte internal type</entry>
1139        </row>
1140        <row>
1141         <entry><type>name</type></entry>
1142         <entry>64 bytes</entry>
1143         <entry>internal type for object names</entry>
1144        </row>
1145       </tbody>
1146      </tgroup>
1147     </table>
1148
1149   </sect1>
1150
1151  <sect1 id="datatype-binary">
1152   <title>Binary Data Types</title>
1153
1154   <indexterm zone="datatype-binary">
1155    <primary>binary data</primary>
1156   </indexterm>
1157
1158   <indexterm zone="datatype-binary">
1159    <primary>bytea</primary>
1160   </indexterm>
1161
1162    <para>
1163     The <type>bytea</type> data type allows storage of binary strings;
1164     see <xref linkend="datatype-binary-table">.
1165    </para>
1166
1167    <table id="datatype-binary-table">
1168     <title>Binary Data Types</title>
1169     <tgroup cols="3">
1170      <thead>
1171       <row>
1172        <entry>Name</entry>
1173        <entry>Storage Size</entry>
1174        <entry>Description</entry>
1175       </row>
1176      </thead>
1177      <tbody>
1178       <row>
1179        <entry><type>bytea</type></entry>
1180        <entry>1 or 4 bytes plus the actual binary string</entry>
1181        <entry>variable-length binary string</entry>
1182       </row>
1183      </tbody>
1184     </tgroup>
1185    </table>
1186
1187    <para>
1188     A binary string is a sequence of octets (or bytes).  Binary
1189     strings are distinguished from character strings in two
1190     ways.  First, binary strings specifically allow storing
1191     octets of value zero and other <quote>non-printable</quote>
1192     octets (usually, octets outside the range 32 to 126).
1193     Character strings disallow zero octets, and also disallow any
1194     other octet values and sequences of octet values that are invalid
1195     according to the database's selected character set encoding.
1196     Second, operations on binary strings process the actual bytes,
1197     whereas the processing of character strings depends on locale settings.
1198     In short, binary strings are appropriate for storing data that the
1199     programmer thinks of as <quote>raw bytes</>, whereas character
1200     strings are appropriate for storing text.
1201    </para>
1202
1203    <para>
1204     The <type>bytea</type> type supports two external formats for
1205     input and output: <productname>PostgreSQL</productname>'s historical
1206     <quote>escape</quote> format, and <quote>hex</quote> format.  Both
1207     of these are always accepted on input.  The output format depends
1208     on the configuration parameter <xref linkend="guc-bytea-output">;
1209     the default is hex.  (Note that the hex format was introduced in
1210     <productname>PostgreSQL</productname> 9.0; earlier versions and some
1211     tools don't understand it.)
1212    </para>
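
   <para>
    For example, the same three-octet value is displayed differently
    depending on the setting (a minimal sketch; the value is arbitrary):
<programlisting>
SET bytea_output = 'hex';
SELECT 'abc'::bytea;      -- \x616263

SET bytea_output = 'escape';
SELECT 'abc'::bytea;      -- abc
</programlisting>
   </para>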
1213
1214    <para>
1215     The <acronym>SQL</acronym> standard defines a different binary
1216     string type, called <type>BLOB</type> or <type>BINARY LARGE
1217     OBJECT</type>.  The input format is different from
1218     <type>bytea</type>, but the provided functions and operators are
1219     mostly the same.
1220    </para>
1221
1222   <sect2>
1223    <title><type>bytea</> Hex Format</title>
1224
1225    <para>
1226     The <quote>hex</> format encodes binary data as 2 hexadecimal digits
1227     per byte, most significant nibble first.  The entire string is
1228     preceded by the sequence <literal>\x</literal> (to distinguish it
1229     from the escape format).  In some contexts, the initial backslash may
1230     need to be escaped by doubling it, in the same cases in which backslashes
1231     have to be doubled in escape format; details appear below.
1232     The hexadecimal digits can
1233     be either upper or lower case, and whitespace is permitted between
1234     digit pairs (but not within a digit pair nor in the starting
1235     <literal>\x</literal> sequence).
1236     The hex format is compatible with a wide
1237     range of external applications and protocols, and it tends to be
1238     faster to convert than the escape format, so its use is preferred.
1239    </para>
1240
1241    <para>
1242     Example:
1243 <programlisting>
1244 SELECT E'\\xDEADBEEF';
1245 </programlisting>
1246    </para>
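
   <para>
    As a further sketch of the rules above, the following two literals
    should denote the same four-octet value, since digit case is ignored
    and whitespace is allowed between digit pairs.  Escape string syntax
    is used here, so the leading backslash is doubled:
<programlisting>
SELECT E'\\xDEADBEEF'::bytea;
SELECT E'\\xde ad be ef'::bytea;
</programlisting>
    With <varname>bytea_output</varname> set to <literal>hex</literal>,
    both are displayed as <literal>\xdeadbeef</literal>.
   </para>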
1247   </sect2>
1248
1249   <sect2>
1250    <title><type>bytea</> Escape Format</title>
1251
1252    <para>
1253     The <quote>escape</quote> format is the traditional
1254     <productname>PostgreSQL</productname> format for the <type>bytea</type>
1255     type.  It
1256     takes the approach of representing a binary string as a sequence
1257     of ASCII characters, while converting those bytes that cannot be
1258     represented as an ASCII character into special escape sequences.
    If, from the point of view of the application, representing bytes
    as characters makes sense, then this representation can be
    convenient.  In practice, however, it is usually confusing because it
    blurs the distinction between binary strings and character
    strings, and the particular escape mechanism that was chosen is
    somewhat unwieldy.  This format should therefore probably be avoided
    for most new applications.
1266    </para>
1267
1268    <para>
1269     When entering <type>bytea</type> values in escape format,
1270     octets of certain
1271     values <emphasis>must</emphasis> be escaped, while all octet
1272     values <emphasis>can</emphasis> be escaped.  In
1273     general, to escape an octet, convert it into its three-digit
1274     octal value and precede it
1275     by a backslash (or two backslashes, if writing the value as a
1276     literal using escape string syntax).
1277     Backslash itself (octet value 92) can alternatively be represented by
1278     double backslashes.
1279     <xref linkend="datatype-binary-sqlesc">
1280     shows the characters that must be escaped, and gives the alternative
1281     escape sequences where applicable.
1282    </para>
1283
1284    <table id="datatype-binary-sqlesc">
1285     <title><type>bytea</> Literal Escaped Octets</title>
1286     <tgroup cols="5">
1287      <thead>
1288       <row>
1289        <entry>Decimal Octet Value</entry>
1290        <entry>Description</entry>
1291        <entry>Escaped Input Representation</entry>
1292        <entry>Example</entry>
1293        <entry>Output Representation</entry>
1294       </row>
1295      </thead>
1296
1297      <tbody>
1298       <row>
1299        <entry>0</entry>
1300        <entry>zero octet</entry>
1301        <entry><literal>E'\\000'</literal></entry>
1302        <entry><literal>SELECT E'\\000'::bytea;</literal></entry>
1303        <entry><literal>\000</literal></entry>
1304       </row>
1305
1306       <row>
1307        <entry>39</entry>
1308        <entry>single quote</entry>
1309        <entry><literal>''''</literal> or <literal>E'\\047'</literal></entry>
1310        <entry><literal>SELECT E'\''::bytea;</literal></entry>
1311        <entry><literal>'</literal></entry>
1312       </row>
1313
1314       <row>
1315        <entry>92</entry>
1316        <entry>backslash</entry>
1317        <entry><literal>E'\\\\'</literal> or <literal>E'\\134'</literal></entry>
1318        <entry><literal>SELECT E'\\\\'::bytea;</literal></entry>
1319        <entry><literal>\\</literal></entry>
1320       </row>
1321
1322       <row>
1323        <entry>0 to 31 and 127 to 255</entry>
1324        <entry><quote>non-printable</quote> octets</entry>
       <entry><literal>E'\\<replaceable>xxx</>'</literal> (octal value)</entry>
1326        <entry><literal>SELECT E'\\001'::bytea;</literal></entry>
1327        <entry><literal>\001</literal></entry>
1328       </row>
1329
1330      </tbody>
1331     </tgroup>
1332    </table>
1333
1334    <para>
    The requirement to escape <quote>non-printable</quote> octets
    varies depending on locale settings.  In some instances they can be
    left unescaped.  Note that the result in each of the examples
    in <xref linkend="datatype-binary-sqlesc"> is exactly one octet in
    length, even though the output representation is sometimes
    more than one character.
1341    </para>
1342
1343    <para>
1344     The reason multiple backslashes are required, as shown
1345     in <xref linkend="datatype-binary-sqlesc">, is that an input
1346     string written as a string literal must pass through two parse
1347     phases in the <productname>PostgreSQL</productname> server.
1348     The first backslash of each pair is interpreted as an escape
1349     character by the string-literal parser (assuming escape string
1350     syntax is used) and is therefore consumed, leaving the second backslash of the
1351     pair.  (Dollar-quoted strings can be used to avoid this level
1352     of escaping.)  The remaining backslash is then recognized by the
1353     <type>bytea</type> input function as starting either a three
1354     digit octal value or escaping another backslash.  For example,
1355     a string literal passed to the server as <literal>E'\\001'</literal>
1356     becomes <literal>\001</literal> after passing through the
1357     escape string parser. The <literal>\001</literal> is then sent
1358     to the <type>bytea</type> input function, where it is converted
1359     to a single octet with a decimal value of 1.  Note that the
1360     single-quote character is not treated specially by <type>bytea</type>,
1361     so it follows the normal rules for string literals.  (See also
1362     <xref linkend="sql-syntax-strings">.)
1363    </para>
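
   <para>
    The two parse phases can be sketched as follows.  Both statements
    should produce the same single octet, the first with escape string
    syntax and the second with a dollar-quoted string:
<programlisting>
-- The string-literal parser reduces \\ to \, then bytea input sees \001.
SELECT E'\\001'::bytea;

-- Dollar quoting skips the first phase, so one backslash is enough.
SELECT $$\001$$::bytea;

-- Either way the stored value is one octet long.
SELECT octet_length(E'\\001'::bytea);   -- 1
</programlisting>
   </para>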
1364
1365    <para>
    Some <type>bytea</type> octets are escaped when output in escape
    format.  In general, each
    <quote>non-printable</quote> octet is converted into
    its equivalent three-digit octal value and preceded by one backslash.
1369     Most <quote>printable</quote> octets are represented by their standard
1370     representation in the client character set. The octet with decimal
1371     value 92 (backslash) is doubled in the output.
1372     Details are in <xref linkend="datatype-binary-resesc">.
1373    </para>
1374
1375    <table id="datatype-binary-resesc">
1376     <title><type>bytea</> Output Escaped Octets</title>
1377     <tgroup cols="5">
1378      <thead>
1379       <row>
1380        <entry>Decimal Octet Value</entry>
1381        <entry>Description</entry>
1382        <entry>Escaped Output Representation</entry>
1383        <entry>Example</entry>
1384        <entry>Output Result</entry>
1385       </row>
1386      </thead>
1387
1388      <tbody>
1389
1390       <row>
1391        <entry>92</entry>
1392        <entry>backslash</entry>
1393        <entry><literal>\\</literal></entry>
1394        <entry><literal>SELECT E'\\134'::bytea;</literal></entry>
1395        <entry><literal>\\</literal></entry>
1396       </row>
1397
1398       <row>
1399        <entry>0 to 31 and 127 to 255</entry>
1400        <entry><quote>non-printable</quote> octets</entry>
1401        <entry><literal>\<replaceable>xxx</></literal> (octal value)</entry>
1402        <entry><literal>SELECT E'\\001'::bytea;</literal></entry>
1403        <entry><literal>\001</literal></entry>
1404       </row>
1405
1406       <row>
1407        <entry>32 to 126</entry>
1408        <entry><quote>printable</quote> octets</entry>
1409        <entry>client character set representation</entry>
1410        <entry><literal>SELECT E'\\176'::bytea;</literal></entry>
1411        <entry><literal>~</literal></entry>
1412       </row>
1413
1414      </tbody>
1415     </tgroup>
1416    </table>
1417
1418    <para>
1419     Depending on the front end to <productname>PostgreSQL</> you use,
1420     you might have additional work to do in terms of escaping and
1421     unescaping <type>bytea</type> strings. For example, you might also
1422     have to escape line feeds and carriage returns if your interface
1423     automatically translates these.
1424    </para>
1425   </sect2>
1426  </sect1>
1427
1428
1429   <sect1 id="datatype-datetime">
1430    <title>Date/Time Types</title>
1431
1432    <indexterm zone="datatype-datetime">
1433     <primary>date</primary>
1434    </indexterm>
1435    <indexterm zone="datatype-datetime">
1436     <primary>time</primary>
1437    </indexterm>
1438    <indexterm zone="datatype-datetime">
1439     <primary>time without time zone</primary>
1440    </indexterm>
1441    <indexterm zone="datatype-datetime">
1442     <primary>time with time zone</primary>
1443    </indexterm>
1444    <indexterm zone="datatype-datetime">
1445     <primary>timestamp</primary>
1446    </indexterm>
1447    <indexterm zone="datatype-datetime">
1448     <primary>timestamptz</primary>
1449    </indexterm>
1450    <indexterm zone="datatype-datetime">
1451     <primary>timestamp with time zone</primary>
1452    </indexterm>
1453    <indexterm zone="datatype-datetime">
1454     <primary>timestamp without time zone</primary>
1455    </indexterm>
1456    <indexterm zone="datatype-datetime">
1457     <primary>interval</primary>
1458    </indexterm>
1459    <indexterm zone="datatype-datetime">
1460     <primary>time span</primary>
1461    </indexterm>
1462
1463    <para>
1464     <productname>PostgreSQL</productname> supports the full set of
1465     <acronym>SQL</acronym> date and time types, shown in <xref
1466     linkend="datatype-datetime-table">.  The operations available
1467     on these data types are described in
1468     <xref linkend="functions-datetime">.
1469    </para>
1470
1471     <table id="datatype-datetime-table">
1472      <title>Date/Time Types</title>
1473      <tgroup cols="6">
1474       <thead>
1475        <row>
1476         <entry>Name</entry>
1477         <entry>Storage Size</entry>
1478         <entry>Description</entry>
1479         <entry>Low Value</entry>
1480         <entry>High Value</entry>
1481         <entry>Resolution</entry>
1482        </row>
1483       </thead>
1484       <tbody>
1485        <row>
1486         <entry><type>timestamp [ (<replaceable>p</replaceable>) ] [ without time zone ]</type></entry>
1487         <entry>8 bytes</entry>
1488         <entry>both date and time (no time zone)</entry>
1489         <entry>4713 BC</entry>
1490         <entry>294276 AD</entry>
1491         <entry>1 microsecond / 14 digits</entry>
1492        </row>
1493        <row>
1494         <entry><type>timestamp [ (<replaceable>p</replaceable>) ] with time zone</type></entry>
1495         <entry>8 bytes</entry>
1496         <entry>both date and time, with time zone</entry>
1497         <entry>4713 BC</entry>
1498         <entry>294276 AD</entry>
1499         <entry>1 microsecond / 14 digits</entry>
1500        </row>
1501        <row>
1502         <entry><type>date</type></entry>
1503         <entry>4 bytes</entry>
1504         <entry>date (no time of day)</entry>
1505         <entry>4713 BC</entry>
1506         <entry>5874897 AD</entry>
1507         <entry>1 day</entry>
1508        </row>
1509        <row>
1510         <entry><type>time [ (<replaceable>p</replaceable>) ] [ without time zone ]</type></entry>
1511         <entry>8 bytes</entry>
1512         <entry>time of day (no date)</entry>
1513         <entry>00:00:00</entry>
1514         <entry>24:00:00</entry>
1515         <entry>1 microsecond / 14 digits</entry>
1516        </row>
1517        <row>
1518         <entry><type>time [ (<replaceable>p</replaceable>) ] with time zone</type></entry>
1519         <entry>12 bytes</entry>
        <entry>time of day (no date), with time zone</entry>
1521         <entry>00:00:00+1459</entry>
1522         <entry>24:00:00-1459</entry>
1523         <entry>1 microsecond / 14 digits</entry>
1524        </row>
1525        <row>
1526         <entry><type>interval [ <replaceable>fields</replaceable> ] [ (<replaceable>p</replaceable>) ]</type></entry>
1527         <entry>12 bytes</entry>
1528         <entry>time interval</entry>
1529         <entry>-178000000 years</entry>
1530         <entry>178000000 years</entry>
1531         <entry>1 microsecond / 14 digits</entry>
1532        </row>
1533       </tbody>
1534      </tgroup>
1535     </table>
1536
1537    <note>
1538     <para>
1539      The SQL standard requires that writing just <type>timestamp</type>
1540      be equivalent to <type>timestamp without time
1541      zone</type>, and <productname>PostgreSQL</productname> honors that
1542      behavior.  (Releases prior to 7.3 treated it as <type>timestamp
1543      with time zone</type>.)  <type>timestamptz</type> is accepted as an
1544      abbreviation for <type>timestamp with time zone</type>; this is a
1545      <productname>PostgreSQL</productname> extension.
1546     </para>
1547    </note>
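
   <para>
    The resolved type names can be checked with
    <function>pg_typeof</function>; the timestamp value below is
    arbitrary:
<programlisting>
SELECT pg_typeof('2004-10-19 10:23:54'::timestamp);     -- timestamp without time zone
SELECT pg_typeof('2004-10-19 10:23:54'::timestamptz);   -- timestamp with time zone
</programlisting>
   </para>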
1548
1549    <para>
1550     <type>time</type>, <type>timestamp</type>, and
1551     <type>interval</type> accept an optional precision value
1552     <replaceable>p</replaceable> which specifies the number of
1553     fractional digits retained in the seconds field. By default, there
1554     is no explicit bound on precision.  The allowed range of
1555     <replaceable>p</replaceable> is from 0 to 6 for the
1556     <type>timestamp</type> and <type>interval</type> types.
1557    </para>
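
   <para>
    As a short sketch of the effect of <replaceable>p</replaceable>, with
    arbitrary input values and assuming the <literal>ISO</literal> output
    style for the results shown in the comments:
<programlisting>
SELECT '2004-10-19 10:23:54.123456'::timestamp(2);   -- 2004-10-19 10:23:54.12
SELECT '10:23:54.123456'::time(0);                   -- 10:23:54
SELECT '1.123456 seconds'::interval(3);              -- three fractional digits kept
</programlisting>
   </para>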
1558
1559    <note>
1560    <para>
1561     When <type>timestamp</> values are stored as eight-byte integers
1562     (currently the default), microsecond precision is available over
1563     the full range of values. When <type>timestamp</> values are
1564     stored as double precision floating-point numbers instead (a
1565     deprecated compile-time option), the effective limit of precision
1566     might be less than 6. <type>timestamp</type> values are stored as
1567     seconds before or after midnight 2000-01-01.  When
1568     <type>timestamp</type> values are implemented using floating-point
1569     numbers, microsecond precision is achieved for dates within a few
1570     years of 2000-01-01, but the precision degrades for dates further
1571     away. Note that using floating-point datetimes allows a larger
1572     range of <type>timestamp</type> values to be represented than
1573     shown above: from 4713 BC up to 5874897 AD.
1574    </para>
1575
1576    <para>
1577     The same compile-time option also determines whether
1578     <type>time</type> and <type>interval</type> values are stored as
1579     floating-point numbers or eight-byte integers.  In the
1580     floating-point case, large <type>interval</type> values degrade in
1581     precision as the size of the interval increases.
1582    </para>
1583    </note>
1584
1585    <para>
1586     For the <type>time</type> types, the allowed range of
1587     <replaceable>p</replaceable> is from 0 to 6 when eight-byte integer
1588     storage is used, or from 0 to 10 when floating-point storage is used.
1589    </para>
1590
1591    <para>
1592     The <type>interval</type> type has an additional option, which is
1593     to restrict the set of stored fields by writing one of these phrases:
1594 <literallayout class="monospaced">
1595 YEAR
1596 MONTH
1597 DAY
1598 HOUR
1599 MINUTE
1600 SECOND
1601 YEAR TO MONTH
1602 DAY TO HOUR
1603 DAY TO MINUTE
1604 DAY TO SECOND
1605 HOUR TO MINUTE
1606 HOUR TO SECOND
1607 MINUTE TO SECOND
1608 </literallayout>
1609     Note that if both <replaceable>fields</replaceable> and
1610     <replaceable>p</replaceable> are specified, the
1611     <replaceable>fields</replaceable> must include <literal>SECOND</>,
1612     since the precision applies only to the seconds.
1613    </para>
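   <para>
    As a minimal sketch of these options, the following declares two
    <type>interval</type> columns with restricted fields.  The table and
    column names are hypothetical and chosen only for illustration; note
    that the precision is attached to a field list that includes
    <literal>SECOND</literal>:
<programlisting>
-- Hypothetical table, for illustration only
CREATE TABLE task_timing (
    warranty  interval YEAR TO MONTH,      -- restricted to years and months
    elapsed   interval DAY TO SECOND(3)    -- days through seconds, 3 fractional digits
);
</programlisting>
   </para>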
1614
1615    <para>
1616     The type <type>time with time zone</type> is defined by the SQL
1617     standard, but the definition has properties that make its
1618     usefulness questionable. In most cases, a combination of
1619     <type>date</type>, <type>time</type>, <type>timestamp without time
1620     zone</type>, and <type>timestamp with time zone</type> should
1621     provide a complete range of date/time functionality required by
1622     any application.
1623    </para>
1624
1625    <para>
1626     The types <type>abstime</type>
1627     and <type>reltime</type> are lower-precision types that are used internally.
1628     Using these types in applications is discouraged;
1629     these internal types
1630     might disappear in a future release.
1631    </para>
1632
1633    <sect2 id="datatype-datetime-input">
1634     <title>Date/Time Input</title>
1635
1636     <para>
1637      Date and time input is accepted in almost any reasonable format, including
1638      ISO 8601, <acronym>SQL</acronym>-compatible,
1639      traditional <productname>POSTGRES</productname>, and others.
1640      For some formats, ordering of day, month, and year in date input is
1641      ambiguous and there is support for specifying the expected
1642      ordering of these fields.  Set the <xref linkend="guc-datestyle"> parameter
1643      to <literal>MDY</> to select month-day-year interpretation,
1644      <literal>DMY</> to select day-month-year interpretation, or
1645      <literal>YMD</> to select year-month-day interpretation.
1646     </para>
1647
1648     <para>
1649      <productname>PostgreSQL</productname> is more flexible in
1650      handling date/time input than the
1651      <acronym>SQL</acronym> standard requires.
1652      See <xref linkend="datetime-appendix">
1653      for the exact parsing rules of date/time input and for the
1654      recognized text fields including months, days of the week, and
1655      time zones.
1656     </para>
1657
1658     <para>
1659      Remember that any date or time literal input needs to be enclosed
1660      in single quotes, like text strings.  Refer to
1661      <xref linkend="sql-syntax-constants-generic"> for more
1662      information.
1663      <acronym>SQL</acronym> requires the following syntax:
1664 <synopsis>
1665 <replaceable>type</replaceable> [ (<replaceable>p</replaceable>) ] '<replaceable>value</replaceable>'
1666 </synopsis>
1667      where <replaceable>p</replaceable> is an optional precision
1668      specification giving the number of
1669      fractional digits in the seconds field. Precision can be
1670      specified for <type>time</type>, <type>timestamp</type>, and
1671      <type>interval</type> types.  The allowed values are mentioned
1672      above.  If no precision is specified in a constant specification,
1673      it defaults to the precision of the literal value.
1674     </para>
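    <para>
     For instance, the following constants use this syntax.  This is an
     illustrative sketch, and the values themselves are arbitrary:
<programlisting>
SELECT DATE '1999-01-08';
SELECT TIME '04:05:06.789';

-- Optional precision: keep 2 fractional digits in the seconds field
SELECT TIMESTAMP(2) '1999-01-08 04:05:06.789';
</programlisting>
    </para>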
1675
1676     <sect3>
1677     <title>Dates</title>
1678
1679     <indexterm>
1680      <primary>date</primary>
1681     </indexterm>
1682
1683     <para>
1684      <xref linkend="datatype-datetime-date-table"> shows some possible
1685      inputs for the <type>date</type> type.
1686     </para>
1687
1688      <table id="datatype-datetime-date-table">
1689       <title>Date Input</title>
1690       <tgroup cols="2">
1691        <thead>
1692         <row>
1693          <entry>Example</entry>
1694          <entry>Description</entry>
1695         </row>
1696        </thead>
1697        <tbody>
1698         <row>
1699          <entry>1999-01-08</entry>
1700          <entry>ISO 8601; January 8 in any mode
1701          (recommended format)</entry>
1702         </row>
1703         <row>
1704          <entry>January 8, 1999</entry>
1705          <entry>unambiguous in any <varname>datestyle</varname> input mode</entry>
1706         </row>
1707         <row>
1708          <entry>1/8/1999</entry>
1709          <entry>January 8 in <literal>MDY</> mode;
1710           August 1 in <literal>DMY</> mode</entry>
1711         </row>
1712         <row>
1713          <entry>1/18/1999</entry>
1714          <entry>January 18 in <literal>MDY</> mode;
1715           rejected in other modes</entry>
1716         </row>
1717         <row>
1718          <entry>01/02/03</entry>
1719          <entry>January 2, 2003 in <literal>MDY</> mode;
1720           February 1, 2003 in <literal>DMY</> mode;
1721           February 3, 2001 in <literal>YMD</> mode
1722          </entry>
1723         </row>
1724         <row>
1725          <entry>1999-Jan-08</entry>
1726          <entry>January 8 in any mode</entry>
1727         </row>
1728         <row>
1729          <entry>Jan-08-1999</entry>
1730          <entry>January 8 in any mode</entry>
1731         </row>
1732         <row>
1733          <entry>08-Jan-1999</entry>
1734          <entry>January 8 in any mode</entry>
1735         </row>
1736         <row>
1737          <entry>99-Jan-08</entry>
1738          <entry>January 8 in <literal>YMD</> mode, else error</entry>
1739         </row>
1740         <row>
1741          <entry>08-Jan-99</entry>
1742          <entry>January 8, except error in <literal>YMD</> mode</entry>
1743         </row>
1744         <row>
1745          <entry>Jan-08-99</entry>
1746          <entry>January 8, except error in <literal>YMD</> mode</entry>
1747         </row>
1748         <row>
1749          <entry>19990108</entry>
1750          <entry>ISO 8601; January 8, 1999 in any mode</entry>
1751         </row>
1752         <row>
1753          <entry>990108</entry>
1754          <entry>ISO 8601; January 8, 1999 in any mode</entry>
1755         </row>
1756         <row>
1757          <entry>1999.008</entry>
1758          <entry>year and day of year</entry>
1759         </row>
1760         <row>
1761          <entry>J2451187</entry>
1762          <entry>Julian day</entry>
1763         </row>
1764         <row>
1765          <entry>January 8, 99 BC</entry>
1766          <entry>year 99 BC</entry>
1767         </row>
1768        </tbody>
1769       </tgroup>
1770      </table>
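     <para>
      The effect of the field-order setting on an ambiguous input such as
      <literal>1/8/1999</literal> can be sketched as follows, using the
      interpretations listed in
      <xref linkend="datatype-datetime-date-table">:
<programlisting>
SET datestyle TO 'ISO, MDY';
SELECT DATE '1/8/1999';    -- read as January 8, 1999

SET datestyle TO 'ISO, DMY';
SELECT DATE '1/8/1999';    -- read as August 1, 1999

SELECT DATE '1999-01-08';  -- ISO 8601 input: January 8 in any mode
</programlisting>
     </para>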
1771     </sect3>
1772
1773     <sect3>
1774      <title>Times</title>
1775
1776      <indexterm>
1777       <primary>time</primary>
1778      </indexterm>
1779      <indexterm>
1780       <primary>time without time zone</primary>
1781      </indexterm>
1782      <indexterm>
1783       <primary>time with time zone</primary>
1784      </indexterm>
1785
1786      <para>
1787       The time-of-day types are <type>time [
1788       (<replaceable>p</replaceable>) ] without time zone</type> and
1789       <type>time [ (<replaceable>p</replaceable>) ] with time
1790       zone</type>.  <type>time</type> alone is equivalent to
1791       <type>time without time zone</type>.
1792      </para>
1793
1794      <para>
1795       Valid input for these types consists of a time of day followed
1796       by an optional time zone. (See <xref
1797       linkend="datatype-datetime-time-table">
1798       and <xref linkend="datatype-timezone-table">.)  If a time zone is
1799       specified in the input for <type>time without time zone</type>,
1800       it is silently ignored. You can also specify a date, but it will
1801       be ignored, except when you use a time zone name that involves a
1802       daylight-savings rule, such as
1803       <literal>America/New_York</literal>. In this case specifying the date
1804       is required in order to determine whether standard or daylight-savings
1805       time applies.  The appropriate time zone offset is recorded in the
1806       <type>time with time zone</type> value.
1807      </para>
1808
1809       <table id="datatype-datetime-time-table">
1810        <title>Time Input</title>
1811        <tgroup cols="2">
1812         <thead>
1813          <row>
1814           <entry>Example</entry>
1815           <entry>Description</entry>
1816          </row>
1817         </thead>
1818         <tbody>
1819          <row>
1820           <entry><literal>04:05:06.789</literal></entry>
1821           <entry>ISO 8601</entry>
1822          </row>
1823          <row>
1824           <entry><literal>04:05:06</literal></entry>
1825           <entry>ISO 8601</entry>
1826          </row>
1827          <row>
1828           <entry><literal>04:05</literal></entry>
1829           <entry>ISO 8601</entry>
1830          </row>
1831          <row>
1832           <entry><literal>040506</literal></entry>
1833           <entry>ISO 8601</entry>
1834          </row>
1835          <row>
1836           <entry><literal>04:05 AM</literal></entry>
1837           <entry>same as 04:05; AM does not affect value</entry>
1838          </row>
1839          <row>
1840           <entry><literal>04:05 PM</literal></entry>
1841           <entry>same as 16:05; input hour must be &lt;= 12</entry>
1842          </row>
1843          <row>
1844           <entry><literal>04:05:06.789-8</literal></entry>
1845           <entry>ISO 8601</entry>
1846          </row>
1847          <row>
1848           <entry><literal>04:05:06-08:00</literal></entry>
1849           <entry>ISO 8601</entry>
1850          </row>
1851          <row>
1852           <entry><literal>04:05-08:00</literal></entry>
1853           <entry>ISO 8601</entry>
1854          </row>
1855          <row>
1856           <entry><literal>040506-08</literal></entry>
1857           <entry>ISO 8601</entry>
1858          </row>
1859          <row>
1860           <entry><literal>04:05:06 PST</literal></entry>
1861           <entry>time zone specified by abbreviation</entry>
1862          </row>
1863          <row>
1864           <entry><literal>2003-04-12 04:05:06 America/New_York</literal></entry>
1865           <entry>time zone specified by full name</entry>
1866          </row>
1867         </tbody>
1868        </tgroup>
1869       </table>
1870
1871       <table tocentry="1" id="datatype-timezone-table">
1872        <title>Time Zone Input</title>
1873        <tgroup cols="2">
1874         <thead>
1875          <row>
1876           <entry>Example</entry>
1877           <entry>Description</entry>
1878          </row>
1879         </thead>
1880         <tbody>
1881          <row>
1882           <entry><literal>PST</literal></entry>
1883           <entry>Abbreviation (for Pacific Standard Time)</entry>
1884          </row>
1885          <row>
1886           <entry><literal>America/New_York</literal></entry>
1887           <entry>Full time zone name</entry>
1888          </row>
1889          <row>
1890           <entry><literal>PST8PDT</literal></entry>
1891           <entry>POSIX-style time zone specification</entry>
1892          </row>
1893          <row>
1894           <entry><literal>-8:00</literal></entry>
1895           <entry>ISO-8601 offset for PST</entry>
1896          </row>
1897          <row>
1898           <entry><literal>-800</literal></entry>
1899           <entry>ISO-8601 offset for PST</entry>
1900          </row>
1901          <row>
1902           <entry><literal>-8</literal></entry>
1903           <entry>ISO-8601 offset for PST</entry>
1904          </row>
1905          <row>
1906           <entry><literal>zulu</literal></entry>
1907           <entry>Military abbreviation for UTC</entry>
1908          </row>
1909          <row>
1910           <entry><literal>z</literal></entry>
1911           <entry>Short form of <literal>zulu</literal></entry>
1912          </row>
1913         </tbody>
1914        </tgroup>
1915       </table>
1916
1917      <para>
1918      Refer to <xref linkend="datatype-timezones"> for more information on how
1919      to specify time zones.
1920     </para>
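     <para>
      A brief sketch of these input rules, using values from
      <xref linkend="datatype-datetime-time-table">.  The comments restate
      the rules rather than showing verbatim output:
<programlisting>
-- Abbreviation: the corresponding offset is recorded in the value
SELECT TIME WITH TIME ZONE '04:05:06 PST';

-- Full zone name with a daylight-savings rule: the date is needed
-- to decide which offset applies
SELECT TIME WITH TIME ZONE '2003-04-12 04:05:06 America/New_York';

-- time without time zone: the zone is silently ignored
SELECT TIME '04:05:06 PST';
</programlisting>
     </para>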
1921     </sect3>
1922
1923     <sect3>
1924     <title>Time Stamps</title>
1925
1926     <indexterm>
1927      <primary>timestamp</primary>
1928     </indexterm>
1929
1930     <indexterm>
1931      <primary>timestamp with time zone</primary>
1932     </indexterm>
1933
1934     <indexterm>
1935      <primary>timestamp without time zone</primary>
1936     </indexterm>
1937
1938      <para>
1939       Valid input for the time stamp types consists of the concatenation
1940       of a date and a time, followed by an optional time zone,
1941       followed by an optional <literal>AD</literal> or <literal>BC</literal>.
1942       (Alternatively, <literal>AD</literal>/<literal>BC</literal> can appear
1943       before the time zone, but this is not the preferred ordering.)
1944       Thus:
1945
1946 <programlisting>
1947 1999-01-08 04:05:06
1948 </programlisting>
1949       and:
1950 <programlisting>
1951 1999-01-08 04:05:06 -8:00
1952 </programlisting>
1953
1954       are valid values, which follow the <acronym>ISO</acronym> 8601
1955       standard.  In addition, the common format:
1956 <programlisting>
1957 January 8 04:05:06 1999 PST
1958 </programlisting>
1959       is supported.
1960      </para>
1961
1962      <para>
1963       The <acronym>SQL</acronym> standard differentiates
1964       <type>timestamp without time zone</type>
1965       and <type>timestamp with time zone</type> literals by the presence of a
1966       <quote>+</quote> or <quote>-</quote> symbol and time zone offset after
1967       the time.  Hence, according to the standard,
1968
1969       <programlisting>TIMESTAMP '2004-10-19 10:23:54'</programlisting>
1970
1971       is a <type>timestamp without time zone</type>, while
1972
1973       <programlisting>TIMESTAMP '2004-10-19 10:23:54+02'</programlisting>
1974
1975       is a <type>timestamp with time zone</type>.
1976       <productname>PostgreSQL</productname> never examines the content of a
1977       literal string before determining its type, and therefore will treat
1978       both of the above as <type>timestamp without time zone</type>.  To
1979       ensure that a literal is treated as <type>timestamp with time
1980       zone</type>, give it the correct explicit type:
1981
1982       <programlisting>TIMESTAMP WITH TIME ZONE '2004-10-19 10:23:54+02'</programlisting>
1983
1984       In a literal that has been determined to be <type>timestamp without time
1985       zone</type>, <productname>PostgreSQL</productname> will silently ignore
1986       any time zone indication.
1987       That is, the resulting value is derived from the date/time
1988       fields in the input value, and is not adjusted for time zone.
1989      </para>
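      <para>
       The following sketch contrasts the two behaviors.  The comments
       restate the rules above rather than showing verbatim output:
<programlisting>
-- Treated as timestamp without time zone; the +02 is silently ignored
SELECT TIMESTAMP '2004-10-19 10:23:54+02';

-- Explicitly typed, so the +02 offset is honored
SELECT TIMESTAMP WITH TIME ZONE '2004-10-19 10:23:54+02';
</programlisting>
      </para>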
1990
1991      <para>
1992       For <type>timestamp with time zone</type>, the internally stored
1993       value is always in UTC (Coordinated
1994       Universal Time, traditionally known as Greenwich Mean Time,
1995       <acronym>GMT</>).  An input value that has an explicit
1996       time zone specified is converted to UTC using the appropriate offset
1997       for that time zone.  If no time zone is stated in the input string,
1998       then it is assumed to be in the time zone indicated by the system's
1999       <xref linkend="guc-timezone"> parameter, and is converted to UTC using the
2000       offset for the <varname>timezone</> zone.
2001      </para>
2002
2003      <para>
2004       When a <type>timestamp with time
2005       zone</type> value is output, it is always converted from UTC to the
2006       current <varname>timezone</> zone, and displayed as local time in that
2007       zone.  To see the time in another time zone, either change
2008       <varname>timezone</> or use the <literal>AT TIME ZONE</> construct
2009       (see <xref linkend="functions-datetime-zoneconvert">).
2010      </para>
2011
2012      <para>
2013       Conversions between <type>timestamp without time zone</type> and
2014       <type>timestamp with time zone</type> normally assume that the
2015       <type>timestamp without time zone</type> value should be taken or given
2016       as <varname>timezone</> local time.  A different time zone can
2017       be specified for the conversion using <literal>AT TIME ZONE</>.
2018      </para>
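      <para>
       As an illustration, assuming the <literal>ISO</literal> output style,
       the same stored value is displayed differently depending on
       <varname>timezone</varname>, and <literal>AT TIME ZONE</literal>
       converts in either direction:
<programlisting>
SET timezone TO 'UTC';
SELECT TIMESTAMP WITH TIME ZONE '2004-10-19 10:23:54+02';
-- displayed as 2004-10-19 08:23:54+00

-- View the same instant as local time in another zone
SELECT TIMESTAMP WITH TIME ZONE '2004-10-19 10:23:54+02'
       AT TIME ZONE 'America/New_York';

-- Interpret a timestamp without time zone as local time in a given zone
SELECT TIMESTAMP '2004-10-19 10:23:54' AT TIME ZONE 'America/New_York';
</programlisting>
      </para>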
2019     </sect3>
2020
2021     <sect3>
2022      <title>Special Values</title>
2023
2024      <indexterm>
2025       <primary>time</primary>
2026       <secondary>constants</secondary>
2027      </indexterm>
2028
2029      <indexterm>
2030       <primary>date</primary>
2031       <secondary>constants</secondary>
2032      </indexterm>
2033
2034      <para>
2035       <productname>PostgreSQL</productname> supports several
2036       special date/time input values for convenience, as shown in <xref
2037       linkend="datatype-datetime-special-table">.  The values
2038       <literal>infinity</literal> and <literal>-infinity</literal>
2039       are specially represented inside the system and will be displayed
2040       unchanged; but the others are simply notational shorthands
2041       that will be converted to ordinary date/time values when read.
2042       (In particular, <literal>now</> and related strings are converted
2043       to a specific time value as soon as they are read.)
2044       All of these values need to be enclosed in single quotes when used
2045       as constants in SQL commands.
2046      </para>
2047
2048       <table id="datatype-datetime-special-table">
2049        <title>Special Date/Time Inputs</title>
2050        <tgroup cols="3">
2051         <thead>
2052          <row>
2053           <entry>Input String</entry>
2054           <entry>Valid Types</entry>
2055           <entry>Description</entry>
2056          </row>
2057         </thead>
2058         <tbody>
2059          <row>
2060           <entry><literal>epoch</literal></entry>
2061           <entry><type>date</type>, <type>timestamp</type></entry>
2062           <entry>1970-01-01 00:00:00+00 (Unix system time zero)</entry>
2063          </row>
2064          <row>
2065           <entry><literal>infinity</literal></entry>
2066           <entry><type>date</type>, <type>timestamp</type></entry>
2067           <entry>later than all other time stamps</entry>
2068          </row>
2069          <row>
2070           <entry><literal>-infinity</literal></entry>
2071           <entry><type>date</type>, <type>timestamp</type></entry>
2072           <entry>earlier than all other time stamps</entry>
2073          </row>
2074          <row>
2075           <entry><literal>now</literal></entry>
2076           <entry><type>date</type>, <type>time</type>, <type>timestamp</type></entry>
2077           <entry>current transaction's start time</entry>
2078          </row>
2079          <row>
2080           <entry><literal>today</literal></entry>
2081           <entry><type>date</type>, <type>timestamp</type></entry>
2082           <entry>midnight today</entry>
2083          </row>
2084          <row>
2085           <entry><literal>tomorrow</literal></entry>
2086           <entry><type>date</type>, <type>timestamp</type></entry>
2087           <entry>midnight tomorrow</entry>
2088          </row>
2089          <row>
2090           <entry><literal>yesterday</literal></entry>
2091           <entry><type>date</type>, <type>timestamp</type></entry>
2092           <entry>midnight yesterday</entry>
2093          </row>
2094          <row>
2095           <entry><literal>allballs</literal></entry>
2096           <entry><type>time</type></entry>
2097           <entry>00:00:00.00 UTC</entry>
2098          </row>
2099         </tbody>
2100        </tgroup>
2101       </table>
2102
2103      <para>
2104       The following <acronym>SQL</acronym>-compatible functions can also
2105       be used to obtain the current time value for the corresponding data
2106       type:
2107       <literal>CURRENT_DATE</literal>, <literal>CURRENT_TIME</literal>,
2108       <literal>CURRENT_TIMESTAMP</literal>, <literal>LOCALTIME</literal>,
2109       <literal>LOCALTIMESTAMP</literal>.  The latter four accept an
2110       optional subsecond precision specification.  (See <xref
2111       linkend="functions-datetime-current">.)  Note that these are
2112       SQL functions and are <emphasis>not</> recognized in data input strings.
2113      </para>
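     <para>
      A short sketch of the special input strings next to the
      <acronym>SQL</acronym> functions.  The commented results assume the
      <literal>ISO</literal> output style:
<programlisting>
SELECT TIMESTAMP 'epoch';                        -- 1970-01-01 00:00:00
SELECT DATE 'today', DATE 'tomorrow';
SELECT TIMESTAMP 'infinity' &gt; TIMESTAMP 'now';   -- true
SELECT TIME 'allballs';                          -- 00:00:00

-- SQL functions are written without quotes; two of them are shown
-- here with an optional precision
SELECT CURRENT_DATE, CURRENT_TIMESTAMP(0), LOCALTIME(2);
</programlisting>
     </para>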
2114
2115     </sect3>
2116    </sect2>
2117
2118    <sect2 id="datatype-datetime-output">
2119     <title>Date/Time Output</title>
2120
2121     <indexterm>
2122      <primary>date</primary>
2123      <secondary>output format</secondary>
2124      <seealso>formatting</seealso>
2125     </indexterm>
2126
2127     <indexterm>
2128      <primary>time</primary>
2129      <secondary>output format</secondary>
2130      <seealso>formatting</seealso>
2131     </indexterm>
2132
2133     <para>
2134      The output format of the date/time types can be set to one of four
2135      styles: ISO 8601,
2136      <acronym>SQL</acronym> (Ingres), traditional <productname>POSTGRES</>
2137      (Unix <application>date</> format), or
2138      German.  The default
2139      is the <acronym>ISO</acronym> format.  (The
2140      <acronym>SQL</acronym> standard requires the use of the ISO 8601
2141      format.  The name of the <quote>SQL</quote> output format is a
2142      historical accident.)  <xref
2143      linkend="datatype-datetime-output-table"> shows examples of each
2144      output style.  The output of the <type>date</type> and
2145      <type>time</type> types is only the date or time part,
2146      formatted as in the given examples.
2147     </para>
2148
2149      <table id="datatype-datetime-output-table">
2150       <title>Date/Time Output Styles</title>
2151       <tgroup cols="3">
2152        <thead>
2153         <row>
2154          <entry>Style Specification</entry>
2155          <entry>Description</entry>
2156          <entry>Example</entry>
2157         </row>
2158        </thead>
2159        <tbody>
2160         <row>
2161          <entry>ISO</entry>
2162          <entry>ISO 8601/SQL standard</entry>
2163          <entry>1997-12-17 07:37:16-08</entry>
2164         </row>
2165         <row>
2166          <entry>SQL</entry>
2167          <entry>traditional style</entry>
2168          <entry>12/17/1997 07:37:16.00 PST</entry>
2169         </row>
2170         <row>
2171          <entry>POSTGRES</entry>
2172          <entry>original style</entry>
2173          <entry>Wed Dec 17 07:37:16 1997 PST</entry>
2174         </row>
2175         <row>
2176          <entry>German</entry>
2177          <entry>regional style</entry>
2178          <entry>17.12.1997 07:37:16.00 PST</entry>
2179         </row>
2180        </tbody>
2181       </tgroup>
2182      </table>
2183
2184     <para>
2185      In the <acronym>SQL</acronym> and POSTGRES styles, day appears before
2186      month if DMY field ordering has been specified; otherwise, month appears
2187      before day.
2188      (See <xref linkend="datatype-datetime-input">
2189      for how this setting also affects interpretation of input values.)
2190      <xref linkend="datatype-datetime-output2-table"> shows an
2191      example.
2192     </para>
2193
2194      <table id="datatype-datetime-output2-table">
2195       <title>Date Order Conventions</title>
2196       <tgroup cols="3">
2197        <thead>
2198         <row>
2199          <entry><varname>datestyle</varname> Setting</entry>
2200          <entry>Input Ordering</entry>
2201          <entry>Example Output</entry>
2202         </row>
2203        </thead>
2204        <tbody>
2205         <row>
2206          <entry><literal>SQL, DMY</></entry>
2207          <entry><replaceable>day</replaceable>/<replaceable>month</replaceable>/<replaceable>year</replaceable></entry>
2208          <entry>17/12/1997 15:37:16.00 CET</entry>
2209         </row>
2210         <row>
2211          <entry><literal>SQL, MDY</></entry>
2212          <entry><replaceable>month</replaceable>/<replaceable>day</replaceable>/<replaceable>year</replaceable></entry>
2213          <entry>12/17/1997 07:37:16.00 PST</entry>
2214         </row>
2215         <row>
2216          <entry><literal>Postgres, DMY</></entry>
2217          <entry><replaceable>day</replaceable>/<replaceable>month</replaceable>/<replaceable>year</replaceable></entry>
2218          <entry>Wed 17 Dec 07:37:16 1997 PST</entry>
2219         </row>
2220        </tbody>
2221       </tgroup>
2222      </table>
2223
2224     <para>
2225      The date/time styles can be selected by the user using the
2226      <command>SET datestyle</command> command, the <xref
2227      linkend="guc-datestyle"> parameter in the
2228      <filename>postgresql.conf</filename> configuration file, or the
2229      <envar>PGDATESTYLE</envar> environment variable on the server or
2230      client.  The formatting function <function>to_char</function>
2231      (see <xref linkend="functions-formatting">) is also available as
2232      a more flexible way to format date/time output.
2233     </para>
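    <para>
     For example, the style can be changed for the current session and
     compared with <function>to_char</function>.  The format pattern
     here is an arbitrary choice for illustration:
<programlisting>
SET datestyle TO 'SQL, DMY';
SELECT TIMESTAMP WITH TIME ZONE '1997-12-17 07:37:16-08';

SET datestyle TO 'German';
SELECT TIMESTAMP WITH TIME ZONE '1997-12-17 07:37:16-08';

-- to_char output follows the given pattern, not datestyle
SELECT to_char(TIMESTAMP '1997-12-17 07:37:16', 'DD Mon YYYY HH24:MI:SS');
-- 17 Dec 1997 07:37:16
</programlisting>
    </para>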
2234    </sect2>
2235
2236    <sect2 id="datatype-timezones">
2237     <title>Time Zones</title>
2238
2239     <indexterm zone="datatype-timezones">
2240      <primary>time zone</primary>
2241     </indexterm>
2242
2243    <para>
2244     Time zones, and time-zone conventions, are influenced by
2245     political decisions, not just earth geometry. Time zones around the
2246     world became somewhat standardized during the 1900s,
2247     but continue to be prone to arbitrary changes, particularly with
2248     respect to daylight-savings rules.
2249     <productname>PostgreSQL</productname> uses the widely used
2250     <literal>zoneinfo</> time zone database for information about
2251     historical time zone rules.  For times in the future, the assumption
2252     is that the latest known rules for a given time zone will
2253     continue to be observed indefinitely far into the future.
2254    </para>
2255
2256     <para>
2257      <productname>PostgreSQL</productname> endeavors to be compatible with
2258      the <acronym>SQL</acronym> standard definitions for typical usage.
2259      However, the <acronym>SQL</acronym> standard has an odd mix of date and
2260      time types and capabilities. Two obvious problems are:
2261
2262      <itemizedlist>
2263       <listitem>
2264        <para>
2265         Although the <type>date</type> type
2266         cannot have an associated time zone, the
2267         <type>time</type> type can.
2268         Time zones in the real world have little meaning unless
2269         associated with a date as well as a time,
2270         since the offset can vary through the year with daylight-saving
2271         time boundaries.
2272        </para>
2273       </listitem>
2274
2275       <listitem>
2276        <para>
2277         The default time zone is specified as a constant numeric offset
2278         from <acronym>UTC</>. It is therefore impossible to adapt to
2279         daylight-saving time when doing date/time arithmetic across
2280         <acronym>DST</acronym> boundaries.
2281        </para>
2282       </listitem>
2283
2284      </itemizedlist>
2285     </para>
2286
2287     <para>
2288      To address these difficulties, we recommend using date/time types
2289      that contain both date and time when using time zones. We
2290      do <emphasis>not</> recommend using the type <type>time with
2291      time zone</type> (though it is supported by
2292      <productname>PostgreSQL</productname> for legacy applications and
2293      for compliance with the <acronym>SQL</acronym> standard).
2294      <productname>PostgreSQL</productname> assumes
2295      your local time zone for any type containing only date or time.
2296     </para>
2297
2298     <para>
2299      All timezone-aware dates and times are stored internally in
2300      <acronym>UTC</acronym>.  They are converted to local time
2301      in the zone specified by the <xref linkend="guc-timezone"> configuration
2302      parameter before being displayed to the client.
2303     </para>
2304
2305     <para>
2306      <productname>PostgreSQL</productname> allows you to specify time zones in
2307      three different forms:
2308      <itemizedlist>
2309       <listitem>
2310        <para>
2311         A full time zone name, for example <literal>America/New_York</>.
2312         The recognized time zone names are listed in the
2313         <literal>pg_timezone_names</literal> view (see <xref
2314         linkend="view-pg-timezone-names">).
2315         <productname>PostgreSQL</productname> uses the widely used
2316         <literal>zoneinfo</> time zone data for this purpose, so the same
2317         names are also recognized by many other software packages.
2318        </para>
2319       </listitem>
2320       <listitem>
2321        <para>
2322         A time zone abbreviation, for example <literal>PST</>.  Such a
2323         specification merely defines a particular offset from UTC, in
2324         contrast to full time zone names which can imply a set of daylight
2325         savings transition-date rules as well.  The recognized abbreviations
2326         are listed in the <literal>pg_timezone_abbrevs</> view (see <xref
2327         linkend="view-pg-timezone-abbrevs">).  You cannot set the
2328         configuration parameters <xref linkend="guc-timezone"> or
2329         <xref linkend="guc-log-timezone"> to a time
2330         zone abbreviation, but you can use abbreviations in
2331         date/time input values and with the <literal>AT TIME ZONE</>
2332         operator.
2333        </para>
2334       </listitem>
2335       <listitem>
2336        <para>
2337         In addition to the timezone names and abbreviations,
2338         <productname>PostgreSQL</productname> will accept POSIX-style time zone
2339         specifications of the form <replaceable>STD</><replaceable>offset</> or
2340         <replaceable>STD</><replaceable>offset</><replaceable>DST</>, where
2341         <replaceable>STD</> is a zone abbreviation, <replaceable>offset</> is a
2342         numeric offset in hours west from UTC, and <replaceable>DST</> is an
2343         optional daylight-savings zone abbreviation, assumed to stand for one
2344         hour ahead of the given offset. For example, if <literal>EST5EDT</>
2345         were not already a recognized zone name, it would be accepted and would
2346         be functionally equivalent to United States East Coast time.  When a
2347         daylight-savings zone name is present, it is assumed to be used
2348         according to the same daylight-savings transition rules used in the
2349         <literal>zoneinfo</> time zone database's <filename>posixrules</> entry.
2350         In a standard <productname>PostgreSQL</productname> installation,
2351         <filename>posixrules</> is the same as <literal>US/Eastern</>, so
2352         that POSIX-style time zone specifications follow USA daylight-savings
2353         rules.  If needed, you can adjust this behavior by replacing the
2354         <filename>posixrules</> file.
2355        </para>
2356       </listitem>
2357      </itemizedlist>
2358
2359      In short, this is the difference between abbreviations
2360      and full names: abbreviations always represent a fixed offset from
2361      UTC, whereas most of the full names imply a local daylight-savings time
2362      rule, and so have two possible UTC offsets.
2363     </para>
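    <para>
     The three forms can be sketched as follows.  The timestamps are
     arbitrary, and the last two queries simply inspect the views
     mentioned above:
<programlisting>
-- Full name: the daylight-savings rule in effect on that date applies
SELECT TIMESTAMP WITH TIME ZONE '2004-10-19 10:23:54 America/New_York';

-- Abbreviation: always a fixed offset from UTC
SELECT TIMESTAMP WITH TIME ZONE '2004-10-19 10:23:54 PST';

-- POSIX-style specification, here used with AT TIME ZONE
SELECT TIMESTAMP WITH TIME ZONE '2004-10-19 10:23:54+00'
       AT TIME ZONE 'PST8PDT';

-- Inspect what the server recognizes
SELECT name, abbrev, utc_offset, is_dst
  FROM pg_timezone_names WHERE name = 'America/New_York';
SELECT abbrev, utc_offset, is_dst
  FROM pg_timezone_abbrevs WHERE abbrev = 'PST';
</programlisting>
    </para>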
2364
2365     <para>
2366      One should be wary that the POSIX-style time zone feature can
2367      lead to silently accepting bogus input, since there is no check on the
2368      reasonableness of the zone abbreviations.  For example, <literal>SET
2369      TIMEZONE TO FOOBAR0</> will work, leaving the system effectively using
2370      a rather peculiar abbreviation for UTC.
2371      Another issue to keep in mind is that in POSIX time zone names,
2372      positive offsets are used for locations <emphasis>west</> of Greenwich.
2373      Everywhere else, <productname>PostgreSQL</productname> follows the
2374      ISO-8601 convention that positive timezone offsets are <emphasis>east</>
2375      of Greenwich.
2376     </para>
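
    <para>
     The sign difference can be seen in a short sketch such as the
     following, in which <literal>XYZ+3</literal> is an arbitrary,
     made-up POSIX-style specification chosen only for illustration:

<programlisting>
-- POSIX-style name: +3 means three hours west of Greenwich.
SELECT TIMESTAMP WITH TIME ZONE '2009-01-01 12:00:00+00'
       AT TIME ZONE 'XYZ+3';
-- yields 2009-01-01 09:00:00

-- ISO-8601-style offset in a literal: +03 means three hours east,
-- so noon at +03 is the same instant as 09:00 UTC.
SELECT TIMESTAMP WITH TIME ZONE '2009-01-01 12:00:00+03'
     = TIMESTAMP WITH TIME ZONE '2009-01-01 09:00:00+00';
-- yields true
</programlisting>
    </para>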
2377
2378     <para>
2379      In all cases, timezone names are recognized case-insensitively.
2380      (This is a change from <productname>PostgreSQL</productname> versions
2381      prior to 8.2, which were case-sensitive in some contexts but not others.)
2382     </para>
2383
2384     <para>
2385      Neither full names nor abbreviations are hard-wired into the server;
2386      they are obtained from configuration files stored under
2387      <filename>.../share/timezone/</> and <filename>.../share/timezonesets/</>
2388      of the installation directory
2389      (see <xref linkend="datetime-config-files">).
2390     </para>
2391
2392     <para>
2393      The <xref linkend="guc-timezone"> configuration parameter can
2394      be set in the file <filename>postgresql.conf</>, or in any of the
2395      other standard ways described in <xref linkend="runtime-config">.
2396      There are also several special ways to set it:
2397
2398      <itemizedlist>
2399       <listitem>
2400        <para>
2401         If <varname>timezone</> is not specified in
2402         <filename>postgresql.conf</> or as a server command-line option,
2403         the server attempts to use the value of the <envar>TZ</envar>
2404         environment variable as the default time zone.  If <envar>TZ</envar>
2405         is not defined or is not any of the time zone names known to
2406         <productname>PostgreSQL</productname>, the server attempts to
2407         determine the operating system's default time zone by checking the
2408         behavior of the C library function <literal>localtime()</>.  The
2409         default time zone is selected as the closest match among
2410         <productname>PostgreSQL</productname>'s known time zones.
2411         (These rules are also used to choose the default value of
2412         <xref linkend="guc-log-timezone">, if not specified.)
2413        </para>
2414       </listitem>
2415
2416       <listitem>
2417        <para>
2418         The <acronym>SQL</acronym> command <command>SET TIME ZONE</command>
2419         sets the time zone for the session.  This is an alternative spelling
2420         of <command>SET TIMEZONE TO</> with a more SQL-spec-compatible syntax.
2421        </para>
2422       </listitem>
2423
2424       <listitem>
2425        <para>
2426         The <envar>PGTZ</envar> environment variable is used by
2427         <application>libpq</application> clients
2428         to send a <command>SET TIME ZONE</command>
2429         command to the server upon connection.
2430        </para>
2431       </listitem>
2432      </itemizedlist>
2433     </para>
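
    <para>
     For instance, the session-level spellings might be used as in the
     following sketch.  The zone name <literal>Europe/Rome</literal> is
     only an example; any name known to the server could be substituted.

<programlisting>
SET TIME ZONE 'Europe/Rome';     -- SQL-standard spelling
SET TIMEZONE TO 'Europe/Rome';   -- equivalent PostgreSQL spelling
SHOW timezone;                   -- reports the session's current setting
SET TIME ZONE DEFAULT;           -- return to the configured default
</programlisting>
    </para>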
2434    </sect2>
2435
2436    <sect2 id="datatype-interval-input">
2437     <title>Interval Input</title>
2438
2439     <indexterm>
2440      <primary>interval</primary>
2441     </indexterm>
2442
2443      <para>
2444       <type>interval</type> values can be written using the following
2445       verbose syntax:
2446
2447 <synopsis>
2448 <optional>@</> <replaceable>quantity</> <replaceable>unit</> <optional><replaceable>quantity</> <replaceable>unit</>...</> <optional><replaceable>direction</></optional>
2449 </synopsis>
2450
2451      where <replaceable>quantity</> is a number (possibly signed);
2452      <replaceable>unit</> is <literal>microsecond</literal>,
2453      <literal>millisecond</literal>, <literal>second</literal>,
2454      <literal>minute</literal>, <literal>hour</literal>, <literal>day</literal>,
2455      <literal>week</literal>, <literal>month</literal>, <literal>year</literal>,
2456      <literal>decade</literal>, <literal>century</literal>, <literal>millennium</literal>,
2457      or abbreviations or plurals of these units;
2458      <replaceable>direction</> can be <literal>ago</literal> or
2459      empty.  The at sign (<literal>@</>) is optional noise.  The amounts
2460      of the different units are implicitly added with appropriate
2461      sign accounting.  <literal>ago</literal> negates all the fields.
2462      This syntax is also used for interval output, if
2463      <xref linkend="guc-intervalstyle"> is set to
2464      <literal>postgres_verbose</>.
2465     </para>
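
    <para>
     A minimal sketch of the verbose syntax follows.  Comparing two
     literals avoids any dependence on the output style in effect:

<programlisting>
-- The at sign is noise, and "ago" negates every field.
SELECT INTERVAL '@ 1 day 2 hours ago' = INTERVAL '-1 day -2 hours';
-- yields true

-- Abbreviations and plurals of the unit names are accepted.
SELECT INTERVAL '3 mins 10 secs' = INTERVAL '3 minutes 10 seconds';
-- yields true
</programlisting>
    </para>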
2466
2467     <para>
2468      Quantities of days, hours, minutes, and seconds can be specified without
2469      explicit unit markings.  For example, <literal>'1 12:59:10'</> is read
2470      the same as <literal>'1 day 12 hours 59 min 10 sec'</>.  Also,
2471      a combination of years and months can be specified with a dash;
2472      for example <literal>'200-10'</> is read the same as <literal>'200 years
2473      10 months'</>.  (These shorter forms are in fact the only ones allowed
2474      by the <acronym>SQL</acronym> standard, and are used for output when
2475      <varname>IntervalStyle</> is set to <literal>sql_standard</literal>.)
2476     </para>
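
    <para>
     The equivalences described above can be checked directly, for
     example:

<programlisting>
SELECT INTERVAL '1 12:59:10' = INTERVAL '1 day 12 hours 59 min 10 sec';
-- yields true

SELECT INTERVAL '200-10' = INTERVAL '200 years 10 months';
-- yields true
</programlisting>
    </para>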
2477
2478     <para>
2479      Interval values can also be written as ISO 8601 time intervals, using
2480      either the <quote>format with designators</> of the standard's section
2481      4.4.3.2 or the <quote>alternative format</> of section 4.4.3.3.  The
2482      format with designators looks like this:
2483 <synopsis>
2484 P <replaceable>quantity</> <replaceable>unit</> <optional> <replaceable>quantity</> <replaceable>unit</> ...</optional> <optional> T <optional> <replaceable>quantity</> <replaceable>unit</> ...</optional></optional>
2485 </synopsis>
2486       The string must start with a <literal>P</>, and may include a
2487       <literal>T</> that introduces the time-of-day units.  The
2488       available unit abbreviations are given in <xref
2489       linkend="datatype-interval-iso8601-units">.  Units may be
2490       omitted, and may be specified in any order, but units smaller than
2491       a day must appear after <literal>T</>.  In particular, the meaning of
2492       <literal>M</> depends on whether it is before or after
2493       <literal>T</>.
2494      </para>
2495
2496      <table id="datatype-interval-iso8601-units">
2497       <title>ISO 8601 Interval Unit Abbreviations</title>
2498      <tgroup cols="2">
2499        <thead>
2500         <row>
2501          <entry>Abbreviation</entry>
2502          <entry>Meaning</entry>
2503         </row>
2504        </thead>
2505        <tbody>
2506         <row>
2507          <entry>Y</entry>
2508          <entry>Years</entry>
2509         </row>
2510         <row>
2511          <entry>M</entry>
2512          <entry>Months (in the date part)</entry>
2513         </row>
2514         <row>
2515          <entry>W</entry>
2516          <entry>Weeks</entry>
2517         </row>
2518         <row>
2519          <entry>D</entry>
2520          <entry>Days</entry>
2521         </row>
2522         <row>
2523          <entry>H</entry>
2524          <entry>Hours</entry>
2525         </row>
2526         <row>
2527          <entry>M</entry>
2528          <entry>Minutes (in the time part)</entry>
2529         </row>
2530         <row>
2531          <entry>S</entry>
2532          <entry>Seconds</entry>
2533         </row>
2534        </tbody>
2535       </tgroup>
2536      </table>
2537
2538      <para>
2539       In the alternative format:
2540 <synopsis>
2541 P <optional> <replaceable>years</>-<replaceable>months</>-<replaceable>days</> </optional> <optional> T <replaceable>hours</>:<replaceable>minutes</>:<replaceable>seconds</> </optional>
2542 </synopsis>
2543       the string must begin with <literal>P</literal>, and a
2544       <literal>T</> separates the date and time parts of the interval.
2545       The values are given as numbers similar to ISO 8601 dates.
2546     </para>
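
    <para>
     As a short sketch of both ISO 8601 formats, the following
     comparisons show the two meanings of <literal>M</literal> and the
     equivalence of the two notations:

<programlisting>
-- Format with designators: M means months before T, minutes after it.
SELECT INTERVAL 'P1M' = INTERVAL '1 month';     -- yields true
SELECT INTERVAL 'PT1M' = INTERVAL '1 minute';   -- yields true

-- Alternative format, compared with the format with designators.
SELECT INTERVAL 'P0001-02-03T04:05:06' = INTERVAL 'P1Y2M3DT4H5M6S';
-- yields true
</programlisting>
    </para>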
2547
2548     <para>
2549      When writing an interval constant with a <replaceable>fields</>
2550      specification, or when assigning a string to an interval column that was
2551      defined with a <replaceable>fields</> specification, the interpretation of
2552      unmarked quantities depends on the <replaceable>fields</>.  For
2553      example <literal>INTERVAL '1' YEAR</> is read as 1 year, whereas
2554      <literal>INTERVAL '1'</> means 1 second.  Also, field values
2555      <quote>to the right</> of the least significant field allowed by the
2556      <replaceable>fields</> specification are silently discarded.  For
2557      example, writing <literal>INTERVAL '1 day 2:03:04' HOUR TO MINUTE</>
2558      results in dropping the seconds field, but not the day field.
2559     </para>
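
    <para>
     The effect of a <replaceable>fields</replaceable> specification can
     be sketched with comparisons like these:

<programlisting>
-- An unmarked quantity takes its unit from the fields specification.
SELECT (INTERVAL '1' YEAR) = (INTERVAL '1 year');   -- yields true
SELECT INTERVAL '1' = INTERVAL '1 second';          -- yields true

-- Fields to the right of MINUTE are discarded; the day field is kept.
SELECT (INTERVAL '1 day 2:03:04' HOUR TO MINUTE) = (INTERVAL '1 day 2:03:00');
-- yields true
</programlisting>
    </para>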
2560
2561     <para>
2562      According to the <acronym>SQL</> standard all fields of an interval
2563      value must have the same sign, so a leading negative sign applies to all
2564      fields; for example the negative sign in the interval literal
2565      <literal>'-1 2:03:04'</> applies to both the days and hour/minute/second
2566      parts.  <productname>PostgreSQL</> allows the fields to have different
2567      signs, and traditionally treats each field in the textual representation
2568      as independently signed, so that the hour/minute/second part is
2569      considered positive in this example.  If <varname>IntervalStyle</> is
2570      set to <literal>sql_standard</literal> then a leading sign is considered
2571      to apply to all fields (but only if no additional signs appear).
2572      Otherwise the traditional <productname>PostgreSQL</> interpretation is
2573      used.  To avoid ambiguity, it's recommended to attach an explicit sign
2574      to each field if any field is negative.
2575     </para>
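
    <para>
     The following sketch shows how the same literal is read under the
     two interpretations described above:

<programlisting>
SET intervalstyle = 'postgres';
-- Traditional reading: only the day field is negative.
SELECT INTERVAL '-1 2:03:04' = INTERVAL '-1 day +2:03:04';
-- yields true

SET intervalstyle = 'sql_standard';
-- SQL-standard reading: the leading sign applies to every field.
SELECT INTERVAL '-1 2:03:04' = INTERVAL '-1 day -2:03:04';
-- yields true

RESET intervalstyle;
</programlisting>
    </para>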
2576
2577     <para>
2578      Internally <type>interval</> values are stored as months, days,
2579      and seconds. This is done because the number of days in a month
2580      varies, and a day can have 23 or 25 hours if a daylight savings
2581      time adjustment is involved.  The months and days fields are integers
2582      while the seconds field can store fractions.  Because intervals are
2583      usually created from constant strings or <type>timestamp</> subtraction,
2584      this storage method works well in most cases.  The functions
2585      <function>justify_days</> and <function>justify_hours</> are
2586      available for adjusting days and hours that overflow their normal
2587      ranges.
2588     </para>
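
    <para>
     For example, assuming the default <literal>postgres</literal>
     interval output style, the two adjustment functions behave as
     follows:

<programlisting>
-- Hours beyond 24 are carried into the days field.
SELECT justify_hours(INTERVAL '27 hours');
-- yields 1 day 03:00:00

-- Days beyond 30 are carried into the months field.
SELECT justify_days(INTERVAL '35 days');
-- yields 1 mon 5 days
</programlisting>
    </para>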
2589
2590     <para>
2591      In the verbose input format, and in some fields of the more compact
2592      input formats, field values can have fractional parts; for example
2593      <literal>'1.5 week'</> or <literal>'01:02:03.45'</>.  Such input is
2594      converted to the appropriate number of months, days, and seconds
2595      for storage.  When this would result in a fractional number of
2596      months or days, the fraction is added to the lower-order fields
2597      using the conversion factors 1 month = 30 days and 1 day = 24 hours.
2598      For example, <literal>'1.5 month'</> becomes 1 month and 15 days.
2599      Only seconds will ever be shown as fractional on output.
2600     </para>
2601
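
    <para>
     A short sketch of this spill-down, again assuming the default
     <literal>postgres</literal> output style:

<programlisting>
-- Half a month becomes 15 days (1 month = 30 days).
SELECT INTERVAL '1.5 month';
-- yields 1 mon 15 days

-- One and a half weeks is 10.5 days; the half day becomes 12 hours.
SELECT INTERVAL '1.5 week';
-- yields 10 days 12:00:00
</programlisting>
    </para>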
2602     <para>
2603      <xref linkend="datatype-interval-input-examples"> shows some examples
2604      of valid <type>interval</> input.
2605     </para>
2606
2607      <table id="datatype-interval-input-examples">
2608       <title>Interval Input</title>
2609       <tgroup cols="2">
2610        <thead>
2611         <row>
2612          <entry>Example</entry>
2613          <entry>Description</entry>
2614         </row>
2615        </thead>
2616        <tbody>
2617         <row>
2618          <entry>1-2</entry>
2619          <entry>SQL standard format: 1 year 2 months</entry>
2620         </row>
2621         <row>
2622          <entry>3 4:05:06</entry>
2623          <entry>SQL standard format: 3 days 4 hours 5 minutes 6 seconds</entry>
2624         </row>
2625         <row>
2626          <entry>1 year 2 months 3 days 4 hours 5 minutes 6 seconds</entry>
2627          <entry>Traditional Postgres format: 1 year 2 months 3 days 4 hours 5 minutes 6 seconds</entry>
2628         </row>
2629         <row>
2630          <entry>P1Y2M3DT4H5M6S</entry>
2631          <entry>ISO 8601 <quote>format with designators</>: same meaning as above</entry>
2632         </row>
2633         <row>
2634          <entry>P0001-02-03T04:05:06</entry>
2635          <entry>ISO 8601 <quote>alternative format</>: same meaning as above</entry>
2636         </row>
2637        </tbody>
2638       </tgroup>
2639      </table>
2640
2641    </sect2>
2642
2643    <sect2 id="datatype-interval-output">
2644     <title>Interval Output</title>
2645
2646     <indexterm>
2647      <primary>interval</primary>
2648      <secondary>output format</secondary>
2649      <seealso>formatting</seealso>
2650     </indexterm>
2651
2652     <para>
2653      The output format of the interval type can be set to one of the
2654      four styles <literal>sql_standard</>, <literal>postgres</>,
2655      <literal>postgres_verbose</>, or <literal>iso_8601</>,
2656      using the command <literal>SET intervalstyle</literal>.
2657      The default is the <literal>postgres</> format.
2658      <xref linkend="interval-style-output-table"> shows examples of each
2659      output style.
2660     </para>
2661
2662     <para>
2663      The <literal>sql_standard</> style produces output that conforms to
2664      the SQL standard's specification for interval literal strings, if
2665      the interval value meets the standard's restrictions (either year-month
2666      only or day-time only, with no mixing of positive
2667      and negative components).  Otherwise the output looks like a standard
2668      year-month literal string followed by a day-time literal string,
2669      with explicit signs added to disambiguate mixed-sign intervals.
2670     </para>
2671
2672     <para>
2673      The output of the <literal>postgres</> style matches the output of
2674      <productname>PostgreSQL</> releases prior to 8.4 when the
2675      <xref linkend="guc-datestyle"> parameter was set to <literal>ISO</>.
2676     </para>
2677
2678     <para>
2679      The output of the <literal>postgres_verbose</> style matches the output of
2680      <productname>PostgreSQL</> releases prior to 8.4 when the
2681      <varname>DateStyle</> parameter was set to non-<literal>ISO</> output.
2682     </para>
2683
2684     <para>
2685      The output of the <literal>iso_8601</> style matches the <quote>format
2686      with designators</> described in section 4.4.3.2 of the
2687      ISO 8601 standard.
2688     </para>
2689
2690      <table id="interval-style-output-table">
2691        <title>Interval Output Style Examples</title>
2692        <tgroup cols="4">
2693         <thead>
2694          <row>
2695           <entry>Style Specification</entry>
2696           <entry>Year-Month Interval</entry>
2697           <entry>Day-Time Interval</entry>
2698           <entry>Mixed Interval</entry>
2699          </row>
2700         </thead>
2701         <tbody>
2702          <row>
2703           <entry><literal>sql_standard</></entry>
2704           <entry>1-2</entry>
2705           <entry>3 4:05:06</entry>
2706           <entry>-1-2 +3 -4:05:06</entry>
2707          </row>
2708          <row>
2709           <entry><literal>postgres</></entry>
2710           <entry>1 year 2 mons</entry>
2711           <entry>3 days 04:05:06</entry>
2712           <entry>-1 year -2 mons +3 days -04:05:06</entry>
2713          </row>
2714          <row>
2715           <entry><literal>postgres_verbose</></entry>
2716           <entry>@ 1 year 2 mons</entry>
2717           <entry>@ 3 days 4 hours 5 mins 6 secs</entry>
2718           <entry>@ 1 year 2 mons -3 days 4 hours 5 mins 6 secs ago</entry>
2719          </row>
2720          <row>
2721           <entry><literal>iso_8601</></entry>
2722           <entry>P1Y2M</entry>
2723           <entry>P3DT4H5M6S</entry>
2724           <entry>P-1Y-2M3DT-4H-5M-6S</entry>
2725          </row>
2726         </tbody>
2727        </tgroup>
2728     </table>
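
    <para>
     The day-time column of
     <xref linkend="interval-style-output-table"> could be reproduced
     with a session along these lines:

<programlisting>
SET intervalstyle = 'sql_standard';
SELECT INTERVAL '3 days 4 hours 5 minutes 6 seconds';
-- yields 3 4:05:06

SET intervalstyle = 'postgres';
SELECT INTERVAL '3 days 4 hours 5 minutes 6 seconds';
-- yields 3 days 04:05:06

SET intervalstyle = 'postgres_verbose';
SELECT INTERVAL '3 days 4 hours 5 minutes 6 seconds';
-- yields @ 3 days 4 hours 5 mins 6 secs

SET intervalstyle = 'iso_8601';
SELECT INTERVAL '3 days 4 hours 5 minutes 6 seconds';
-- yields P3DT4H5M6S

RESET intervalstyle;
</programlisting>
    </para>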
2729
2730    </sect2>
2731
2732    <sect2 id="datatype-datetime-internals">
2733     <title>Internals</title>
2734
2735     <para>
2736      <productname>PostgreSQL</productname> uses Julian dates
2737      for all date/time calculations. This has the useful property of correctly
2738      calculating dates from 4713 BC
2739      to far into the future, using the assumption that the length of the
2740      year is 365.2425 days.
2741     </para>
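
    <para>
     As an illustrative sketch, the Julian day number of a date can be
     inspected with the <literal>J</literal> template pattern of
     <function>to_char</function>, and the assumed year length is the
     mean year implied by the Gregorian leap-year rule (a leap year every
     4 years, except in century years not divisible by 400):

<programlisting>
SELECT to_char(DATE '2000-01-01', 'J');
-- yields 2451545

SELECT 365 + 1/4.0 - 1/100.0 + 1/400.0 = 365.2425;
-- yields true
</programlisting>
    </para>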
2742
2743     <para>
2744      Date conventions before the 19th century make for interesting reading,
2745      but are not consistent enough to warrant coding into a date/time handler.
2746     </para>
2747    </sect2>
2748
2749   </sect1>
2750
2751   <sect1 id="datatype-boolean">
2752    <title>Boolean Type</title>
2753
2754    <indexterm zone="datatype-boolean">
2755     <primary>Boolean</primary>
2756     <secondary>data type</secondary>
2757    </indexterm>
2758
2759    <indexterm zone="datatype-boolean">
2760     <primary>true</primary>
2761    </indexterm>
2762
2763    <indexterm zone="datatype-boolean">
2764     <primary>false</primary>
2765    </indexterm>
2766
2767    <para>
2768     <productname>PostgreSQL</productname> provides the
2769     standard <acronym>SQL</acronym> type <type>boolean</type>;
2770     see <xref linkend="datatype-boolean-table">.
2771     The <type>boolean</type> type can have several states:
2772     <quote>true</quote>, <quote>false</quote>, and a third state,
2773     <quote>unknown</quote>, which is represented by the
2774     <acronym>SQL</acronym> null value.
2775    </para>
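
   <para>
    A minimal sketch of how the <quote>unknown</quote> state behaves in
    logical expressions:

<programlisting>
SELECT (NULL::boolean) IS UNKNOWN;           -- yields true
SELECT (TRUE AND NULL::boolean) IS NULL;     -- yields true: the result is unknown
SELECT FALSE AND NULL::boolean;              -- yields false
</programlisting>
   </para>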
2776
2777    <table id="datatype-boolean-table">
2778     <title>Boolean Data Type</title>
2779     <tgroup cols="3">
2780      <thead>
2781       <row>
2782        <entry>Name</entry>
2783        <entry>Storage Size</entry>
2784        <entry>Description</entry>
2785       </row>
2786      </thead>
2787      <tbody>
2788       <row>
2789        <entry><type>boolean</type></entry>
2790        <entry>1 byte</entry>
2791        <entry>state of true or false</entry>
2792       </row>
2793      </tbody>
2794     </tgroup>
2795    </table>
2796
2797    <para>
2798     Valid literal values for the <quote>true</quote> state are:
2799     <simplelist>
2800      <member><literal>TRUE</literal></member>
2801      <member><literal>'t'</literal></member>
2802      <member><literal>'true'</literal></member>
2803      <member><literal>'y'</literal></member>
2804      <member><literal>'yes'</literal></member>
2805      <member><literal>'on'</literal></member>
2806      <member><literal>'1'</literal></member>
2807     </simplelist>
2808     For the <quote>false</quote> state, the following values can be
2809     used:
2810     <simplelist>
2811      <member><literal>FALSE</literal></member>
2812      <member><literal>'f'</literal></member>
2813      <member><literal>'false'</literal></member>
2814      <member><literal>'n'</literal></member>
2815      <member><literal>'no'</literal></member>
2816      <member><literal>'off'</literal></member>
2817      <member><literal>'0'</literal></member>
2818     </simplelist>
2819     Leading or trailing whitespace is ignored, and case does not matter.
2820     The key words
2821     <literal>TRUE</literal> and <literal>FALSE</literal> are the preferred
2822     (<acronym>SQL</acronym>-compliant) usage.
2823    </para>
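
   <para>
    For example, the string forms can be checked with explicit casts;
    note the surrounding whitespace and mixed case in the second value:

<programlisting>
SELECT 'yes'::boolean, ' TRUE '::boolean, 'off'::boolean, '0'::boolean;
-- yields t, t, f, f
</programlisting>
   </para>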
2824
2825    <para>
2826     <xref linkend="datatype-boolean-example"> shows that
2827     <type>boolean</type> values are output using the letters
2828     <literal>t</literal> and <literal>f</literal>.
2829    </para>
2830
2831    <example id="datatype-boolean-example">
2832     <title>Using the <type>boolean</type> Type</title>
2833
2834 <programlisting>
2835 CREATE TABLE test1 (a boolean, b text);
2836 INSERT INTO test1 VALUES (TRUE, 'sic est');
2837 INSERT INTO test1 VALUES (FALSE, 'non est');
2838 SELECT * FROM test1;
2839  a |    b
2840 ---+---------
2841  t | sic est
2842  f | non est
2843
2844 SELECT * FROM test1 WHERE a;
2845  a |    b
2846 ---+---------
2847  t | sic est
2848 </programlisting>
2849    </example>
2850   </sect1>
2851
2852   <sect1 id="datatype-enum">
2853    <title>Enumerated Types</title>
2854
2855    <indexterm zone="datatype-enum">
2856     <primary>data type</primary>
2857     <secondary>enumerated (enum)</secondary>
2858    </indexterm>
2859
2860    <indexterm zone="datatype-enum">
2861     <primary>enumerated types</primary>
2862    </indexterm>
2863
2864    <para>
2865     Enumerated (enum) types are data types that
2866     comprise a static, ordered set of values.
2867     They are equivalent to the <type>enum</type>
2868     types supported in a number of programming languages. An example of an enum
2869     type might be the days of the week, or a set of status values for
2870     a piece of data.
2871    </para>
2872
2873    <sect2>
2874     <title>Declaration of Enumerated Types</title>
2875
2876     <para>
2877      Enum types are created using the <xref
2878      linkend="sql-createtype"> command,
2879      for example:
2880
2881 <programlisting>
2882 CREATE TYPE mood AS ENUM ('sad', 'ok', 'happy');
2883 </programlisting>
2884
2885      Once created, the enum type can be used in table and function
2886      definitions much like any other type:
2887 <programlisting>
2888 CREATE TYPE mood AS ENUM ('sad', 'ok', 'happy');
2889 CREATE TABLE person (
2890     name text,
2891     current_mood mood
2892 );
2893 INSERT INTO person VALUES ('Moe', 'happy');
2894 SELECT * FROM person WHERE current_mood = 'happy';
2895  name | current_mood
2896 ------+--------------
2897  Moe  | happy
2898 (1 row)
2899 </programlisting>
2900     </para>
2901     </sect2>
2902
2903     <sect2>
2904      <title>Ordering</title>
2905
2906      <para>
2907       The ordering of the values in an enum type is the
2908       order in which the values were listed when the type was created.
2909       All standard comparison operators and related
2910       aggregate functions are supported for enums.  For example:
2911
2912 <programlisting>
2913 INSERT INTO person VALUES ('Larry', 'sad');
2914 INSERT INTO person VALUES ('Curly', 'ok');
2915 SELECT * FROM person WHERE current_mood > 'sad';
2916  name  | current_mood
2917 -------+--------------
2918  Moe   | happy
2919  Curly | ok
2920 (2 rows)
2921
2922 SELECT * FROM person WHERE current_mood > 'sad' ORDER BY current_mood;
2923  name  | current_mood
2924 -------+--------------
2925  Curly | ok
2926  Moe   | happy
2927 (2 rows)
2928
2929 SELECT name
2930 FROM person
2931 WHERE current_mood = (SELECT MIN(current_mood) FROM person);
2932  name
2933 -------
2934  Larry
2935 (1 row)
2936 </programlisting>
2937      </para>
2938    </sect2>
2939
2940    <sect2>
2941     <title>Type Safety</title>
2942
2943     <para>
2944      Each enumerated data type is separate and cannot
2945      be compared with other enumerated types.  See this example:
2946
2947 <programlisting>
2948 CREATE TYPE happiness AS ENUM ('happy', 'very happy', 'ecstatic');
2949 CREATE TABLE holidays (
2950     num_weeks integer,
2951     happiness happiness
2952 );
2953 INSERT INTO holidays(num_weeks,happiness) VALUES (4, 'happy');
2954 INSERT INTO holidays(num_weeks,happiness) VALUES (6, 'very happy');
2955 INSERT INTO holidays(num_weeks,happiness) VALUES (8, 'ecstatic');
2956 INSERT INTO holidays(num_weeks,happiness) VALUES (2, 'sad');
2957 ERROR:  invalid input value for enum happiness: "sad"
2958 SELECT person.name, holidays.num_weeks FROM person, holidays
2959   WHERE person.current_mood = holidays.happiness;
2960 ERROR:  operator does not exist: mood = happiness
2961 </programlisting>
2962     </para>
2963
2964     <para>
2965      If you really need to do something like that, you can either
2966      write a custom operator or add explicit casts to your query:
2967
2968 <programlisting>
2969 SELECT person.name, holidays.num_weeks FROM person, holidays
2970   WHERE person.current_mood::text = holidays.happiness::text;
2971  name | num_weeks
2972 ------+-----------
2973  Moe  |         4
2974 (1 row)
2975
2976 </programlisting>
2977     </para>
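
    <para>
     The custom-operator alternative might look like the following
     sketch.  The function name <function>mood_eq_happiness</function>
     is hypothetical, and comparing the labels as text is only one
     possible definition of equality between the two types:

<programlisting>
-- Hypothetical helper: two values are equal if their labels match.
CREATE FUNCTION mood_eq_happiness(mood, happiness) RETURNS boolean
    AS 'SELECT $1::text = $2::text'
    LANGUAGE SQL STABLE;

CREATE OPERATOR = (
    LEFTARG = mood,
    RIGHTARG = happiness,
    PROCEDURE = mood_eq_happiness
);

-- The comparison that previously failed now resolves to the new operator.
SELECT person.name, holidays.num_weeks FROM person, holidays
  WHERE person.current_mood = holidays.happiness;
</programlisting>
    </para>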
2978    </sect2>
2979
2980    <sect2>
2981     <title>Implementation Details</title>
2982
2983     <para>
2984      An enum value occupies four bytes on disk.  The length of an enum
2985      value's textual label is limited by the <symbol>NAMEDATALEN</symbol>
2986      setting compiled into <productname>PostgreSQL</productname>; in standard
2987      builds this means at most 63 bytes.
2988     </para>
2989
2990     <para>
2991      Enum labels are case sensitive, so
2992      <literal>'happy'</literal> is not the same as <literal>'HAPPY'</literal>.
2993      White space in the labels is significant too.
2994     </para>
2995
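
    <para>
     For example, with the <type>mood</type> type defined earlier, a
     label that differs only in case is rejected:

<programlisting>
SELECT 'happy'::mood;   -- accepted
SELECT 'HAPPY'::mood;
ERROR:  invalid input value for enum mood: "HAPPY"
</programlisting>
    </para>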
2996     <para>
2997      The translations from internal enum values to textual labels are
2998      kept in the system catalog
2999      <link linkend="catalog-pg-enum"><structname>pg_enum</structname></link>.
3000      Querying this catalog directly can be useful, for example to list an enum type's labels.
3001     </para>
3002
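
    <para>
     As a sketch, the labels of the <type>mood</type> type defined
     earlier could be listed as follows.  No <literal>ORDER BY</literal>
     is given, so the order of the returned rows is not guaranteed:

<programlisting>
SELECT enumlabel FROM pg_enum WHERE enumtypid = 'mood'::regtype;
</programlisting>
    </para>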
3003    </sect2>
3004   </sect1>
3005
3006   <sect1 id="datatype-geometric">
3007    <title>Geometric Types</title>
3008
3009    <para>
3010     Geometric data types represent two-dimensional spatial
3011     objects. <xref linkend="datatype-geo-table"> shows the geometric
3012     types available in <productname>PostgreSQL</productname>.  The
3013     most fundamental type, the point, forms the basis for all of the
3014     other types.
3015    </para>
3016
3017     <table id="datatype-geo-table">
3018      <title>Geometric Types</title>
3019      <tgroup cols="4">
3020       <thead>
3021        <row>
3022         <entry>Name</entry>
3023         <entry>Storage Size</entry>
3024         <entry>Description</entry>
3025         <entry>Representation</entry>
3026        </row>
3027       </thead>
3028       <tbody>
3029        <row>
3030         <entry><type>point</type></entry>
3031         <entry>16 bytes</entry>
3032         <entry>Point on a plane</entry>
3033         <entry>(x,y)</entry>
3034        </row>
3035        <row>
3036         <entry><type>line</type></entry>
3037         <entry>32 bytes</entry>
3038         <entry>Infinite line (not fully implemented)</entry>
3039         <entry>((x1,y1),(x2,y2))</entry>
3040        </row>
3041        <row>
3042         <entry><type>lseg</type></entry>
3043         <entry>32 bytes</entry>
3044         <entry>Finite line segment</entry>
3045         <entry>((x1,y1),(x2,y2))</entry>
3046        </row>
3047        <row>
3048         <entry><type>box</type></entry>
3049         <entry>32 bytes</entry>
3050         <entry>Rectangular box</entry>
3051         <entry>((x1,y1),(x2,y2))</entry>
3052        </row>
3053        <row>
3054         <entry><type>path</type></entry>
3055         <entry>16+16n bytes</entry>
3056         <entry>Closed path (similar to polygon)</entry>
3057         <entry>((x1,y1),...)</entry>
3058        </row>
3059        <row>
3060         <entry><type>path</type></entry>
3061         <entry>16+16n bytes</entry>
3062         <entry>Open path</entry>
3063         <entry>[(x1,y1),...]</entry>
3064        </row>
3065        <row>
3066         <entry><type>polygon</type></entry>
3067         <entry>40+16n bytes</entry>
3068         <entry>Polygon (similar to closed path)</entry>
3069         <entry>((x1,y1),...)</entry>
3070        </row>
3071        <row>
3072         <entry><type>circle</type></entry>
3073         <entry>24 bytes</entry>
3074         <entry>Circle</entry>
3075         <entry>&lt;(x,y),r&gt; (center point and radius)</entry>
3076        </row>
3077       </tbody>
3078      </tgroup>
3079     </table>
3080
3081    <para>
3082     A rich set of functions and operators is available to perform various geometric
3083     operations such as scaling, translation, rotation, and determining
3084     intersections.  They are explained in <xref linkend="functions-geometry">.
3085    </para>
3086
3087    <sect2>
3088     <title>Points</title>
3089
3090     <indexterm>
3091      <primary>point</primary>
3092     </indexterm>
3093
3094     <para>
3095      Points are the fundamental two-dimensional building block for geometric
3096      types.  Values of type <type>point</type> are specified using either of
3097      the following syntaxes:
3098
3099 <synopsis>
3100 ( <replaceable>x</replaceable> , <replaceable>y</replaceable> )
3101   <replaceable>x</replaceable> , <replaceable>y</replaceable>
3102 </synopsis>
3103
3104      where <replaceable>x</> and <replaceable>y</> are the respective
3105      coordinates, as floating-point numbers.
3106     </para>
3107
3108     <para>
3109      Points are output using the first syntax.
3110     </para>
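
    <para>
     For example, both input syntaxes produce the same value:

<programlisting>
SELECT '(1,2)'::point;   -- yields (1,2)
SELECT '1,2'::point;     -- yields (1,2)
</programlisting>
    </para>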
3111    </sect2>
3112
3113    <sect2>
3114     <title>Line Segments</title>
3115
3116     <indexterm>
3117      <primary>lseg</primary>
3118     </indexterm>
3119
3120     <indexterm>
3121      <primary>line segment</primary>
3122     </indexterm>
3123
3124     <para>
3125      Line segments (<type>lseg</type>) are represented by pairs of points.
3126      Values of type <type>lseg</type> are specified using any of the following
3127      syntaxes:
3128
3129 <synopsis>
3130 [ ( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> ) , ( <replaceable>x2</replaceable> , <replaceable>y2</replaceable> ) ]
3131 ( ( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> ) , ( <replaceable>x2</replaceable> , <replaceable>y2</replaceable> ) )
3132   ( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> ) , ( <replaceable>x2</replaceable> , <replaceable>y2</replaceable> )
3133     <replaceable>x1</replaceable> , <replaceable>y1</replaceable>   ,   <replaceable>x2</replaceable> , <replaceable>y2</replaceable>
3134 </synopsis>
3135
3136      where
3137      <literal>(<replaceable>x1</replaceable>,<replaceable>y1</replaceable>)</literal>
3138      and
3139      <literal>(<replaceable>x2</replaceable>,<replaceable>y2</replaceable>)</literal>
3140      are the end points of the line segment.
3141     </para>
3142
3143     <para>
3144      Line segments are output using the first syntax.
3145     </para>
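
    <para>
     A short sketch using the second and fourth input syntaxes:

<programlisting>
SELECT '((0,0),(1,1))'::lseg;   -- yields [(0,0),(1,1)]
SELECT '0,0,1,1'::lseg;         -- yields [(0,0),(1,1)]
</programlisting>
    </para>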
3146    </sect2>
3147
3148    <sect2>
3149     <title>Boxes</title>
3150
3151     <indexterm>
3152      <primary>box (data type)</primary>
3153     </indexterm>
3154
3155     <indexterm>
3156      <primary>rectangle</primary>
3157     </indexterm>
3158
3159     <para>
3160      Boxes are represented by pairs of points that are opposite
3161      corners of the box.
3162      Values of type <type>box</type> are specified using any of the following
3163      syntaxes:
3164
3165 <synopsis>
3166 ( ( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> ) , ( <replaceable>x2</replaceable> , <replaceable>y2</replaceable> ) )
3167   ( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> ) , ( <replaceable>x2</replaceable> , <replaceable>y2</replaceable> )
3168     <replaceable>x1</replaceable> , <replaceable>y1</replaceable>   ,   <replaceable>x2</replaceable> , <replaceable>y2</replaceable>
3169 </synopsis>
3170
3171      where
3172      <literal>(<replaceable>x1</replaceable>,<replaceable>y1</replaceable>)</literal>
3173      and
3174      <literal>(<replaceable>x2</replaceable>,<replaceable>y2</replaceable>)</literal>
3175      are any two opposite corners of the box.
3176     </para>
3177
3178     <para>
3179      Boxes are output using the second syntax.
3180     </para>
3181
3182     <para>
3183      Any two opposite corners can be supplied on input, but the values
3184      will be reordered as needed to store the
3185      upper right and lower left corners, in that order.
3186     </para>
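
    <para>
     The reordering can be seen in a sketch such as this one, where
     either pair of opposite corners yields the same stored box:

<programlisting>
SELECT '((0,0),(1,1))'::box;   -- yields (1,1),(0,0)
SELECT '(1,0),(0,1)'::box;     -- yields (1,1),(0,0)
</programlisting>
    </para>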
3187    </sect2>
3188
3189    <sect2>
3190     <title>Paths</title>
3191
3192     <indexterm>
3193      <primary>path (data type)</primary>
3194     </indexterm>
3195
3196     <para>
3197      Paths are represented by lists of connected points. Paths can be
3198      <firstterm>open</firstterm>, where
3199      the first and last points in the list are considered not connected, or
3200      <firstterm>closed</firstterm>,
3201      where the first and last points are considered connected.
3202     </para>
3203
3204     <para>
3205      Values of type <type>path</type> are specified using any of the following
3206      syntaxes:
3207
3208 <synopsis>
3209 [ ( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> ) , ... , ( <replaceable>xn</replaceable> , <replaceable>yn</replaceable> ) ]
3210 ( ( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> ) , ... , ( <replaceable>xn</replaceable> , <replaceable>yn</replaceable> ) )
3211   ( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> ) , ... , ( <replaceable>xn</replaceable> , <replaceable>yn</replaceable> )
3212   ( <replaceable>x1</replaceable> , <replaceable>y1</replaceable>   , ... ,   <replaceable>xn</replaceable> , <replaceable>yn</replaceable> )
3213     <replaceable>x1</replaceable> , <replaceable>y1</replaceable>   , ... ,   <replaceable>xn</replaceable> , <replaceable>yn</replaceable>
3214 </synopsis>
3215
3216      where the points are the end points of the line segments
3217      comprising the path.  Square brackets (<literal>[]</>) indicate
3218      an open path, while parentheses (<literal>()</>) indicate a
3219      closed path.  When the outermost parentheses are omitted, as
3220      in the third through fifth syntaxes, a closed path is assumed.
3221     </para>
3222
3223     <para>
3224      Paths are output using the first or second syntax, as appropriate.
3225     </para>
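
         <para>
          For instance (a sketch with arbitrary points), square brackets
          yield an open path, while input without any outer delimiters is
          read as a closed path.  The <function>isclosed</function>
          function reports which kind a value is:

     <programlisting>
     SELECT path '[(0,0),(1,1),(2,0)]';
             path
     ---------------------
      [(0,0),(1,1),(2,0)]

     SELECT path '0,0,1,1,2,0';
             path
     ---------------------
      ((0,0),(1,1),(2,0))

     SELECT isclosed(path '0,0,1,1,2,0');
      isclosed
     ----------
      t
     </programlisting>
         </para>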
3226    </sect2>
3227
3228    <sect2>
3229     <title>Polygons</title>
3230
3231     <indexterm>
3232      <primary>polygon</primary>
3233     </indexterm>
3234
3235     <para>
3236      Polygons are represented by lists of points (the vertexes of the
3237      polygon). Polygons are very similar to closed paths, but are
3238      stored differently and have their own set of support routines.
3239     </para>
3240
3241     <para>
3242      Values of type <type>polygon</type> are specified using any of the
3243      following syntaxes:
3244
3245 <synopsis>
3246 ( ( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> ) , ... , ( <replaceable>xn</replaceable> , <replaceable>yn</replaceable> ) )
3247   ( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> ) , ... , ( <replaceable>xn</replaceable> , <replaceable>yn</replaceable> )
3248   ( <replaceable>x1</replaceable> , <replaceable>y1</replaceable>   , ... ,   <replaceable>xn</replaceable> , <replaceable>yn</replaceable> )
3249     <replaceable>x1</replaceable> , <replaceable>y1</replaceable>   , ... ,   <replaceable>xn</replaceable> , <replaceable>yn</replaceable>
3250 </synopsis>
3251
3252      where the points are the end points of the line segments
3253      comprising the boundary of the polygon.
3254     </para>
3255
3256     <para>
3257      Polygons are output using the first syntax.
3258     </para>
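
         <para>
          As a small sketch with arbitrary vertexes, input written in the
          most abbreviated syntax is displayed in the fully parenthesized
          one:

     <programlisting>
     SELECT polygon '0,0,1,0,1,1';
            polygon
     ---------------------
      ((0,0),(1,0),(1,1))
     </programlisting>
         </para>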
3259    </sect2>
3260
3261    <sect2>
3262     <title>Circles</title>
3263
3264     <indexterm>
3265      <primary>circle</primary>
3266     </indexterm>
3267
3268     <para>
3269      Circles are represented by a center point and radius.
3270      Values of type <type>circle</type> are specified using any of the
3271      following syntaxes:
3272
3273 <synopsis>
3274 &lt; ( <replaceable>x</replaceable> , <replaceable>y</replaceable> ) , <replaceable>r</replaceable> &gt;
3275 ( ( <replaceable>x</replaceable> , <replaceable>y</replaceable> ) , <replaceable>r</replaceable> )
3276   ( <replaceable>x</replaceable> , <replaceable>y</replaceable> ) , <replaceable>r</replaceable>
3277     <replaceable>x</replaceable> , <replaceable>y</replaceable>   , <replaceable>r</replaceable>
3278 </synopsis>
3279
3280      where
3281      <literal>(<replaceable>x</replaceable>,<replaceable>y</replaceable>)</>
3282      is the center point and <replaceable>r</replaceable> is the radius of the
3283      circle.
3284     </para>
3285
3286     <para>
3287      Circles are output using the first syntax.
3288     </para>
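
         <para>
          For example (the center and radius here are arbitrary), a circle
          entered in the shortest syntax is displayed with angle brackets:

     <programlisting>
     SELECT circle '0,0,2';
       circle
     -----------
      &lt;(0,0),2&gt;
     </programlisting>
         </para>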
3289    </sect2>
3290
3291   </sect1>
3292
3293   <sect1 id="datatype-net-types">
3294    <title>Network Address Types</title>
3295
3296    <indexterm zone="datatype-net-types">
3297     <primary>network</primary>
3298     <secondary>data types</secondary>
3299    </indexterm>
3300
3301    <para>
3302     <productname>PostgreSQL</> offers data types to store IPv4, IPv6, and MAC
3303     addresses, as shown in <xref linkend="datatype-net-types-table">.  It
3304     is better to use these types instead of plain text types to store
3305     network addresses, because
3306     these types offer input error checking and specialized
3307     operators and functions (see <xref linkend="functions-net">).
3308    </para>
3309
3310     <table tocentry="1" id="datatype-net-types-table">
3311      <title>Network Address Types</title>
3312      <tgroup cols="3">
3313       <thead>
3314        <row>
3315         <entry>Name</entry>
3316         <entry>Storage Size</entry>
3317         <entry>Description</entry>
3318        </row>
3319       </thead>
3320       <tbody>
3321
3322        <row>
3323         <entry><type>cidr</type></entry>
3324         <entry>7 or 19 bytes</entry>
3325         <entry>IPv4 and IPv6 networks</entry>
3326        </row>
3327
3328        <row>
3329         <entry><type>inet</type></entry>
3330         <entry>7 or 19 bytes</entry>
3331         <entry>IPv4 and IPv6 hosts and networks</entry>
3332        </row>
3333
3334        <row>
3335         <entry><type>macaddr</type></entry>
3336         <entry>6 bytes</entry>
3337         <entry>MAC addresses</entry>
3338        </row>
3339
3340       </tbody>
3341      </tgroup>
3342     </table>
3343
3344    <para>
3345     When sorting <type>inet</type> or <type>cidr</type> data types,
3346     IPv4 addresses will always sort before IPv6 addresses, including
3347     IPv4 addresses encapsulated or mapped to IPv6 addresses, such as
3348     ::10.2.3.4 or ::ffff:10.4.3.2.
3349    </para>
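
        <para>
         The sort order can be checked with a query along the following
         lines.  This is only a sketch: the addresses and the alias
         <literal>t(a)</literal> are chosen for illustration.

     <programlisting>
     SELECT a FROM (VALUES (inet '::10.2.3.4'),
                           (inet '192.168.1.1'),
                           (inet '10.0.0.1')) AS t(a)
     ORDER BY a;
           a
     -------------
      10.0.0.1
      192.168.1.1
      ::10.2.3.4
     </programlisting>
        </para>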
3350
3351
3352    <sect2 id="datatype-inet">
3353     <title><type>inet</type></title>
3354
3355     <indexterm>
3356      <primary>inet (data type)</primary>
3357     </indexterm>
3358
3359     <para>
3360      The <type>inet</type> type holds an IPv4 or IPv6 host address, and
3361      optionally its subnet, all in one field.
3362      The subnet is represented by the number of network address bits
3363      present in the host address (the
3364      <quote>netmask</quote>).  If the netmask is 32 and the address is IPv4,
3365      then the value does not indicate a subnet, only a single host.
3366      In IPv6, the address length is 128 bits, so 128 bits specify a
3367      unique host address.  Note that if you
3368      want to accept only networks, you should use the
3369      <type>cidr</type> type rather than <type>inet</type>.
3370     </para>
3371
3372     <para>
3373       The input format for this type is
3374       <replaceable class="parameter">address/y</replaceable>
3375       where
3376       <replaceable class="parameter">address</replaceable>
3377       is an IPv4 or IPv6 address and
3378       <replaceable class="parameter">y</replaceable>
3379       is the number of bits in the netmask.  If the
3380       <replaceable class="parameter">/y</replaceable>
3381       portion is missing, the
3382       netmask is 32 for IPv4 and 128 for IPv6, so the value represents
3383       just a single host.  On display, the
3384       <replaceable class="parameter">/y</replaceable>
3385       portion is suppressed if the netmask specifies a single host.
3386     </para>
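
         <para>
          The display rule can be seen in a short sketch using an
          arbitrary address.  A netmask shorter than the address length is
          shown, while one that denotes a single host is suppressed:

     <programlisting>
     SELECT inet '192.168.1.5/24';
           inet
     ----------------
      192.168.1.5/24

     SELECT inet '192.168.1.5/32';
         inet
     -------------
      192.168.1.5
     </programlisting>
         </para>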
3387    </sect2>
3388
3389    <sect2 id="datatype-cidr">
3390     <title><type>cidr</></title>
3391
3392     <indexterm>
3393      <primary>cidr</primary>
3394     </indexterm>
3395
3396     <para>
3397      The <type>cidr</type> type holds an IPv4 or IPv6 network specification.
3398      Input and output formats follow Classless Inter-Domain Routing
3399      conventions.
3400      The format for specifying networks is <replaceable
3401      class="parameter">address/y</> where <replaceable
3402      class="parameter">address</> is the network represented as an
3403      IPv4 or IPv6 address, and <replaceable
3404      class="parameter">y</> is the number of bits in the netmask.  If
3405      <replaceable class="parameter">y</> is omitted, it is calculated
3406      using assumptions from the older classful network numbering system, except
3407      it will be at least large enough to include all of the octets
3408      written in the input.  It is an error to specify a network address
3409      that has bits set to the right of the specified netmask.
3410     </para>
3411
3412     <para>
3413      <xref linkend="datatype-net-cidr-table"> shows some examples.
3414     </para>
3415
3416      <table id="datatype-net-cidr-table">
3417       <title><type>cidr</> Type Input Examples</title>
3418       <tgroup cols="3">
3419        <thead>
3420         <row>
3421          <entry><type>cidr</type> Input</entry>
3422          <entry><type>cidr</type> Output</entry>
3423          <entry><literal><function>abbrev(<type>cidr</type>)</function></literal></entry>
3424         </row>
3425        </thead>
3426        <tbody>
3427         <row>
3428          <entry>192.168.100.128/25</entry>
3429          <entry>192.168.100.128/25</entry>
3430          <entry>192.168.100.128/25</entry>
3431         </row>
3432         <row>
3433          <entry>192.168/24</entry>
3434          <entry>192.168.0.0/24</entry>
3435          <entry>192.168.0/24</entry>
3436         </row>
3437         <row>
3438          <entry>192.168/25</entry>
3439          <entry>192.168.0.0/25</entry>
3440          <entry>192.168.0.0/25</entry>
3441         </row>
3442         <row>
3443          <entry>192.168.1</entry>
3444          <entry>192.168.1.0/24</entry>
3445          <entry>192.168.1/24</entry>
3446         </row>
3447         <row>
3448          <entry>192.168</entry>
3449          <entry>192.168.0.0/24</entry>
3450          <entry>192.168.0/24</entry>
3451         </row>
3452         <row>
3453          <entry>128.1</entry>
3454          <entry>128.1.0.0/16</entry>
3455          <entry>128.1/16</entry>
3456         </row>
3457         <row>
3458          <entry>128</entry>
3459          <entry>128.0.0.0/16</entry>
3460          <entry>128.0/16</entry>
3461         </row>
3462         <row>
3463          <entry>128.1.2</entry>
3464          <entry>128.1.2.0/24</entry>
3465          <entry>128.1.2/24</entry>
3466         </row>
3467         <row>
3468          <entry>10.1.2</entry>
3469          <entry>10.1.2.0/24</entry>
3470          <entry>10.1.2/24</entry>
3471         </row>
3472         <row>
3473          <entry>10.1</entry>
3474          <entry>10.1.0.0/16</entry>
3475          <entry>10.1/16</entry>
3476         </row>
3477         <row>
3478          <entry>10</entry>
3479          <entry>10.0.0.0/8</entry>
3480          <entry>10/8</entry>
3481         </row>
3482         <row>
3483          <entry>10.1.2.3/32</entry>
3484          <entry>10.1.2.3/32</entry>
3485          <entry>10.1.2.3/32</entry>
3486         </row>
3487         <row>
3488          <entry>2001:4f8:3:ba::/64</entry>
3489          <entry>2001:4f8:3:ba::/64</entry>
3490          <entry>2001:4f8:3:ba::/64</entry>
3491         </row>
3492         <row>
3493          <entry>2001:4f8:3:ba:2e0:81ff:fe22:d1f1/128</entry>
3494          <entry>2001:4f8:3:ba:2e0:81ff:fe22:d1f1/128</entry>
3495          <entry>2001:4f8:3:ba:2e0:81ff:fe22:d1f1</entry>
3496         </row>
3497         <row>
3498          <entry>::ffff:1.2.3.0/120</entry>
3499          <entry>::ffff:1.2.3.0/120</entry>
3500          <entry>::ffff:1.2.3/120</entry>
3501         </row>
3502         <row>
3503          <entry>::ffff:1.2.3.0/128</entry>
3504          <entry>::ffff:1.2.3.0/128</entry>
3505          <entry>::ffff:1.2.3.0/128</entry>
3506         </row>
3507        </tbody>
3508       </tgroup>
3509      </table>
3510    </sect2>
3511
3512    <sect2 id="datatype-inet-vs-cidr">
3513     <title><type>inet</type> vs. <type>cidr</type></title>
3514
3515     <para>
3516     The essential difference between <type>inet</type> and <type>cidr</type>
3517     data types is that <type>inet</type> accepts values with nonzero bits to
3518     the right of the netmask, whereas <type>cidr</type> does not.
3519     </para>
3520
3521       <tip>
3522         <para>
3523         If you do not like the output format for <type>inet</type> or
3524         <type>cidr</type> values, try the functions <function>host</>,
3525         <function>text</>, and <function>abbrev</>.
3526         </para>
3527       </tip>
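
         <para>
          The following sketch, with illustrative addresses, contrasts the
          two types and shows two of the formatting functions.  The exact
          wording of the error message might differ between releases.

     <programlisting>
     SELECT inet '192.168.0.1/24';
           inet
     ----------------
      192.168.0.1/24

     SELECT cidr '192.168.0.1/24';
     ERROR:  invalid cidr value: "192.168.0.1/24"
     DETAIL:  Value has bits set to right of mask.

     SELECT host(inet '192.168.0.1/24'), abbrev(cidr '10.1.0.0/16');
         host     | abbrev
     -------------+---------
      192.168.0.1 | 10.1/16
     </programlisting>
         </para>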
3528    </sect2>
3529
3530    <sect2 id="datatype-macaddr">
3531     <title><type>macaddr</type></title>
3532
3533     <indexterm>
3534      <primary>macaddr (data type)</primary>
3535     </indexterm>
3536
3537     <indexterm>
3538      <primary>MAC address</primary>
3539      <see>macaddr</see>
3540     </indexterm>
3541
3542     <para>
3543      The <type>macaddr</> type stores MAC addresses, known for example
3544      from Ethernet card hardware addresses (although MAC addresses are
3545      used for other purposes as well).  Input is accepted in the
3546      following formats:
3547
3548      <simplelist>
3549       <member><literal>'08:00:2b:01:02:03'</></member>
3550       <member><literal>'08-00-2b-01-02-03'</></member>
3551       <member><literal>'08002b:010203'</></member>
3552       <member><literal>'08002b-010203'</></member>
3553       <member><literal>'0800.2b01.0203'</></member>
3554       <member><literal>'08002b010203'</></member>
3555      </simplelist>
3556
3557      These examples all specify the same address.  Both upper and
3558      lower case are accepted for the digits
3559      <literal>a</> through <literal>f</>.  Output is always in the
3560      first of the forms shown.
3561     </para>
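
         <para>
          As a quick check of the output rule, one of the alternative
          input forms, written here with an upper-case digit, comes back
          in the colon-separated form:

     <programlisting>
     SELECT macaddr '0800.2B01.0203';
           macaddr
     -------------------
      08:00:2b:01:02:03
     </programlisting>
         </para>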
3562
3563     <para>
3564      IEEE Std 802-2001 specifies the second shown form (with hyphens)
3565      as the canonical form for MAC addresses, and specifies the first
3566      form (with colons) as the bit-reversed notation, so that
3567      08-00-2b-01-02-03 = 10:00:D4:80:40:C0.  This convention is widely
3568      ignored nowadays, and it is only relevant for obsolete network
3569      protocols (such as Token Ring).  PostgreSQL makes no provisions
3570      for bit reversal, and all accepted formats use the canonical LSB
3571      order.
3572     </para>
3573
3574     <para>
3575      The remaining four input formats are not part of any standard.
3576     </para>
3577    </sect2>
3578
3579   </sect1>
3580
3581   <sect1 id="datatype-bit">
3582    <title>Bit String Types</title>
3583
3584    <indexterm zone="datatype-bit">
3585     <primary>bit string</primary>
3586     <secondary>data type</secondary>
3587    </indexterm>
3588
3589    <para>
3590     Bit strings are strings of 1's and 0's.  They can be used to store
3591     or visualize bit masks.  There are two SQL bit types:
3592     <type>bit(<replaceable>n</replaceable>)</type> and <type>bit
3593     varying(<replaceable>n</replaceable>)</type>, where
3594     <replaceable>n</replaceable> is a positive integer.
3595    </para>
3596
3597    <para>
3598     <type>bit</type> type data must match the length
3599     <replaceable>n</replaceable> exactly; it is an error to attempt to
3600     store shorter or longer bit strings.  <type>bit varying</type> data is
3601     of variable length up to the maximum length
3602     <replaceable>n</replaceable>; longer strings will be rejected.
3603     Writing <type>bit</type> without a length is equivalent to
3604     <literal>bit(1)</literal>, while <type>bit varying</type> without a length
3605     specification means unlimited length.
3606    </para>
3607
3608    <note>
3609     <para>
3610      If one explicitly casts a bit-string value to
3611      <type>bit(<replaceable>n</>)</type>, it will be truncated or
3612      zero-padded on the right to be exactly <replaceable>n</> bits,
3613      without raising an error.  Similarly,
3614      if one explicitly casts a bit-string value to
3615      <type>bit varying(<replaceable>n</>)</type>, it will be truncated
3616      on the right if it is more than <replaceable>n</> bits.
3617     </para>
3618    </note>
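
        <para>
         The casting behavior described in the note can be sketched as
         follows; the column aliases are chosen only for illustration:

     <programlisting>
     SELECT B'10101'::bit(3) AS truncated,
            B'10'::bit(5) AS padded,
            B'10101'::bit varying(3) AS varying_truncated;
      truncated | padded | varying_truncated
     -----------+--------+-------------------
      101       | 10000  | 101
     </programlisting>
        </para>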
3619
3620    <para>
3621     Refer to <xref
3622     linkend="sql-syntax-bit-strings"> for information about the syntax
3623     of bit string constants.  Bit-logical operators and string
3624     manipulation functions are available; see <xref
3625     linkend="functions-bitstring">.
3626    </para>
3627
3628    <example>
3629     <title>Using the Bit String Types</title>
3630
3631 <programlisting>
3632 CREATE TABLE test (a BIT(3), b BIT VARYING(5));
3633 INSERT INTO test VALUES (B'101', B'00');
3634 INSERT INTO test VALUES (B'10', B'101');
3635 <computeroutput>
3636 ERROR:  bit string length 2 does not match type bit(3)
3637 </computeroutput>
3638 INSERT INTO test VALUES (B'10'::bit(3), B'101');
3639 SELECT * FROM test;
3640 <computeroutput>
3641   a  |  b
3642 -----+-----
3643  101 | 00
3644  100 | 101
3645 </computeroutput>
3646 </programlisting>
3647    </example>
3648
3649    <para>
3650     A bit string value requires 1 byte for each group of 8 bits, plus
3651     5 or 8 bytes overhead depending on the length of the string
3652     (but long values may be compressed or moved out-of-line, as explained
3653     in <xref linkend="datatype-character"> for character strings).
3654    </para>
3655   </sect1>
3656
3657   <sect1 id="datatype-textsearch">
3658    <title>Text Search Types</title>
3659
3660    <indexterm zone="datatype-textsearch">
3661     <primary>full text search</primary>
3662     <secondary>data types</secondary>
3663    </indexterm>
3664
3665    <indexterm zone="datatype-textsearch">
3666     <primary>text search</primary>
3667     <secondary>data types</secondary>
3668    </indexterm>
3669
3670    <para>
3671     <productname>PostgreSQL</productname> provides two data types that
3672     are designed to support full text search, which is the activity of
3673     searching through a collection of natural-language <firstterm>documents</>
3674     to locate those that best match a <firstterm>query</>.
3675     The <type>tsvector</type> type represents a document in a form optimized
3676     for text search; the <type>tsquery</type> type similarly represents
3677     a text query.
3678     <xref linkend="textsearch"> provides a detailed explanation of this
3679     facility, and <xref linkend="functions-textsearch"> summarizes the
3680     related functions and operators.
3681    </para>
3682
3683    <sect2 id="datatype-tsvector">
3684     <title><type>tsvector</type></title>
3685
3686     <indexterm>
3687      <primary>tsvector (data type)</primary>
3688     </indexterm>
3689
3690     <para>
3691      A <type>tsvector</type> value is a sorted list of distinct
3692      <firstterm>lexemes</>, which are words that have been
3693      <firstterm>normalized</> to merge different variants of the same word
3694      (see <xref linkend="textsearch"> for details).  Sorting and
3695      duplicate-elimination are done automatically during input, as shown in
3696      this example:
3697
3698 <programlisting>
3699 SELECT 'a fat cat sat on a mat and ate a fat rat'::tsvector;
3700                       tsvector
3701 ----------------------------------------------------
3702  'a' 'and' 'ate' 'cat' 'fat' 'mat' 'on' 'rat' 'sat'
3703 </programlisting>
3704
3705      To represent
3706      lexemes containing whitespace or punctuation, surround them with quotes:
3707
3708 <programlisting>
3709 SELECT $$the lexeme '    ' contains spaces$$::tsvector;
3710                  tsvector
3711 -------------------------------------------
3712  '    ' 'contains' 'lexeme' 'spaces' 'the'
3713 </programlisting>
3714
3715      (We use dollar-quoted string literals in this example and the next one
3716      to avoid the confusion of having to double quote marks within the
3717      literals.)  Embedded quotes and backslashes must be doubled:
3718
3719 <programlisting>
3720 SELECT $$the lexeme 'Joe''s' contains a quote$$::tsvector;
3721                     tsvector
3722 ------------------------------------------------
3723  'Joe''s' 'a' 'contains' 'lexeme' 'quote' 'the'
3724 </programlisting>
3725
3726      Optionally, integer <firstterm>positions</>
3727      can be attached to lexemes:
3728
3729 <programlisting>
3730 SELECT 'a:1 fat:2 cat:3 sat:4 on:5 a:6 mat:7 and:8 ate:9 a:10 fat:11 rat:12'::tsvector;
3731                                   tsvector
3732 -------------------------------------------------------------------------------
3733  'a':1,6,10 'and':8 'ate':9 'cat':3 'fat':2,11 'mat':7 'on':5 'rat':12 'sat':4
3734 </programlisting>
3735
3736      A position normally indicates the source word's location in the
3737      document.  Positional information can be used for
3738      <firstterm>proximity ranking</firstterm>.  Position values can
3739      range from 1 to 16383; larger numbers are silently set to 16383.
3740      Duplicate positions for the same lexeme are discarded.
3741     </para>
3742
3743     <para>
3744      Lexemes that have positions can further be labeled with a
3745      <firstterm>weight</>, which can be <literal>A</literal>,
3746      <literal>B</literal>, <literal>C</literal>, or <literal>D</literal>.
3747      <literal>D</literal> is the default and hence is not shown on output:
3748
3749 <programlisting>
3750 SELECT 'a:1A fat:2B,4C cat:5D'::tsvector;
3751           tsvector
3752 ----------------------------
3753  'a':1A 'cat':5 'fat':2B,4C
3754 </programlisting>
3755
3756      Weights are typically used to reflect document structure, for example
3757      by marking title words differently from body words.  Text search
3758      ranking functions can assign different priorities to the different
3759      weight markers.
3760     </para>
3761
3762     <para>
3763      It is important to understand that the
3764      <type>tsvector</type> type itself does not perform any normalization;
3765      it assumes the words it is given are normalized appropriately
3766      for the application.  For example,
3767
3768 <programlisting>
3769 SELECT 'The Fat Rats'::tsvector;
3770       tsvector
3771 --------------------
3772  'Fat' 'Rats' 'The'
3773 </programlisting>
3774
3775      For most English-text-searching applications the above words would
3776      be considered non-normalized, but <type>tsvector</type> doesn't care.
3777      Raw document text should usually be passed through
3778      <function>to_tsvector</> to normalize the words appropriately
3779      for searching:
3780
3781 <programlisting>
3782 SELECT to_tsvector('english', 'The Fat Rats');
3783    to_tsvector
3784 -----------------
3785  'fat':2 'rat':3
3786 </programlisting>
3787
3788      Again, see <xref linkend="textsearch"> for more detail.
3789     </para>
3790
3791    </sect2>
3792
3793    <sect2 id="datatype-tsquery">
3794     <title><type>tsquery</type></title>
3795
3796     <indexterm>
3797      <primary>tsquery (data type)</primary>
3798     </indexterm>
3799
3800     <para>
3801      A <type>tsquery</type> value stores lexemes that are to be
3802      searched for, and combines them honoring the Boolean operators
3803      <literal>&amp;</literal> (AND), <literal>|</literal> (OR), and
3804      <literal>!</> (NOT).  Parentheses can be used to enforce grouping
3805      of the operators:
3806
3807 <programlisting>
3808 SELECT 'fat &amp; rat'::tsquery;
3809     tsquery
3810 ---------------
3811  'fat' &amp; 'rat'
3812
3813 SELECT 'fat &amp; (rat | cat)'::tsquery;
3814           tsquery
3815 ---------------------------
3816  'fat' &amp; ( 'rat' | 'cat' )
3817
3818 SELECT 'fat &amp; rat &amp; ! cat'::tsquery;
3819         tsquery
3820 ------------------------
3821  'fat' &amp; 'rat' &amp; !'cat'
3822 </programlisting>
3823
3824      In the absence of parentheses, <literal>!</> (NOT) binds most tightly,
3825      and <literal>&amp;</literal> (AND) binds more tightly than
3826      <literal>|</literal> (OR).
3827     </para>
3828
3829     <para>
3830      Optionally, lexemes in a <type>tsquery</type> can be labeled with
3831      one or more weight letters, which restricts them to match only
3832      <type>tsvector</> lexemes with matching weights:
3833
3834 <programlisting>
3835 SELECT 'fat:ab &amp; cat'::tsquery;
3836     tsquery
3837 ------------------
3838  'fat':AB &amp; 'cat'
3839 </programlisting>
3840     </para>
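
         <para>
          The effect of weight labels on matching can be sketched with the
          <literal>@@</literal> match operator.  The sample vector is made
          up for illustration, and its lexeme <literal>fat</literal>
          carries weight <literal>B</literal>:

     <programlisting>
     SELECT 'fat:2B cat:3'::tsvector @@ 'fat:a'::tsquery;
      ?column?
     ----------
      f

     SELECT 'fat:2B cat:3'::tsvector @@ 'fat:ab'::tsquery;
      ?column?
     ----------
      t
     </programlisting>
         </para>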
3841
3842     <para>
3843      Also, lexemes in a <type>tsquery</type> can be labeled with <literal>*</>
3844      to specify prefix matching:
3845 <programlisting>
3846 SELECT 'super:*'::tsquery;
3847   tsquery
3848 -----------
3849  'super':*
3850 </programlisting>
3851      This query will match any word in a <type>tsvector</> that begins
3852      with <quote>super</>.  Note that prefixes are first processed by
3853      text search configurations, which means this comparison returns
3854      true:
3855 <programlisting>
3856 SELECT to_tsvector( 'postgraduate' ) @@ to_tsquery( 'postgres:*' );
3857  ?column?
3858 ----------
3859  t
3860 (1 row)
3861 </programlisting>
3862      because <literal>postgres</> gets stemmed to <literal>postgr</>:
3863 <programlisting>
3864 SELECT to_tsquery('postgres:*');
3865  to_tsquery
3866 ------------
3867  'postgr':*
3868 (1 row)
3869 </programlisting>
3870      which then matches <literal>postgraduate</>.
3871     </para>
3872
3873     <para>
3874      Quoting rules for lexemes are the same as described previously for
3875      lexemes in <type>tsvector</>; and, as with <type>tsvector</>,
3876      any required normalization of words must be done before converting
3877      to the <type>tsquery</> type.  The <function>to_tsquery</>
3878      function is convenient for performing such normalization:
3879
3880 <programlisting>
3881 SELECT to_tsquery('Fat:ab &amp; Cats');
3882     to_tsquery
3883 ------------------
3884  'fat':AB &amp; 'cat'
3885 </programlisting>
3886     </para>
3887
3888    </sect2>
3889
3890   </sect1>
3891
3892   <sect1 id="datatype-uuid">
3893    <title><acronym>UUID</acronym> Type</title>
3894
3895    <indexterm zone="datatype-uuid">
3896     <primary>UUID</primary>
3897    </indexterm>
3898
3899    <para>
3900     The data type <type>uuid</type> stores Universally Unique Identifiers
3901     (UUID) as defined by RFC 4122, ISO/IEC 9834-8:2005, and related standards.
3902     (Some systems refer to this data type as a globally unique identifier, or
3903     GUID,<indexterm><primary>GUID</primary></indexterm> instead.)  This
3904     identifier is a 128-bit quantity that is generated by an algorithm chosen
3905     to make it very unlikely that the same identifier will be generated by
3906     anyone else in the known universe using the same algorithm.  Therefore,
3907     for distributed systems, these identifiers provide a better uniqueness
3908     guarantee than sequence generators, which
3909     are only unique within a single database.
3910    </para>
3911
3912    <para>
3913     A UUID is written as a sequence of lower-case hexadecimal digits,
3914     in several groups separated by hyphens, specifically a group of 8
3915     digits followed by three groups of 4 digits followed by a group of
3916     12 digits, for a total of 32 digits representing the 128 bits.  An
3917     example of a UUID in this standard form is:
3918 <programlisting>
3919 a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11
3920 </programlisting>
3921     <productname>PostgreSQL</productname> also accepts the following
3922     alternative forms for input:
3923     use of upper-case digits, the standard format surrounded by
3924     braces, omitting some or all hyphens, and adding a hyphen after any
3925     group of four digits.  Examples are:
3926 <programlisting>
3927 A0EEBC99-9C0B-4EF8-BB6D-6BB9BD380A11
3928 {a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11}
3929 a0eebc999c0b4ef8bb6d6bb9bd380a11
3930 a0ee-bc99-9c0b-4ef8-bb6d-6bb9-bd38-0a11
3931 {a0eebc99-9c0b4ef8-bb6d6bb9-bd380a11}
3932 </programlisting>
3933     Output is always in the standard form.
3934    </para>
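
        <para>
         For example, an input that combines several of the alternative
         forms (upper-case digits, braces, and some hyphens omitted) is
         displayed in the standard form:

     <programlisting>
     SELECT '{A0EEBC99-9C0B4EF8-BB6D6BB9-BD380A11}'::uuid;
                      uuid
     --------------------------------------
      a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11
     </programlisting>
        </para>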
3935
3936    <para>
3937     <productname>PostgreSQL</productname> provides storage and comparison
3938     functions for UUIDs, but the core database does not include any
3939     function for generating UUIDs, because no single algorithm is well
3940     suited for every application.  The <xref
3941     linkend="uuid-ossp"> module
3942     provides functions that implement several standard algorithms.
3943     Alternatively, UUIDs could be generated by client applications or
3944     other libraries invoked through a server-side function.
3945    </para>
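
        <para>
         A minimal sketch of server-side generation follows.  It assumes
         that the <filename>uuid-ossp</filename> module has already been
         installed in the current database; the table
         <literal>items</literal> and its columns are hypothetical.  No
         output is shown because each generated value differs.

     <programlisting>
     -- generate one random (version 4) UUID
     SELECT uuid_generate_v4();

     -- use the same function as a column default
     CREATE TABLE items (
         id   uuid PRIMARY KEY DEFAULT uuid_generate_v4(),
         name text
     );
     </programlisting>
        </para>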
3946   </sect1>
3947
3948   <sect1 id="datatype-xml">
3949    <title><acronym>XML</> Type</title>
3950
3951    <indexterm zone="datatype-xml">
3952     <primary>XML</primary>
3953    </indexterm>
3954
3955    <para>
3956     The <type>xml</type> data type can be used to store XML data.  Its
3957     advantage over storing XML data in a <type>text</type> field is that it
3958     checks the input values for well-formedness, and there are support
3959     functions to perform type-safe operations on it; see <xref
3960     linkend="functions-xml">.  Use of this data type requires the
3961     installation to have been built with <command>configure
3962     --with-libxml</>.
3963    </para>
3964
3965    <para>
3966     The <type>xml</type> type can store well-formed
3967     <quote>documents</quote>, as defined by the XML standard, as well
3968     as <quote>content</quote> fragments, which are defined by the
3969     production <literal>XMLDecl? content</literal> in the XML
3970     standard.  Roughly, this means that content fragments can have
3971     more than one top-level element or character node.  The expression
3972     <literal><replaceable>xmlvalue</replaceable> IS DOCUMENT</literal>
3973     can be used to evaluate whether a particular <type>xml</type>
3974     value is a full document or only a content fragment.
3975    </para>
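
        <para>
         As a sketch of the distinction, assuming a build with libxml
         support and the default XML option setting, a value with a
         single top-level element is a document, whereas a value with
         leading character data and two top-level elements is only a
         content fragment:

     <programlisting><![CDATA[
     SELECT xml '<foo>bar</foo>' IS DOCUMENT;
      ?column?
     ----------
      t

     SELECT xml 'abc<foo>bar</foo><bar>foo</bar>' IS DOCUMENT;
      ?column?
     ----------
      f
     ]]></programlisting>
        </para>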
3976
3977    <sect2>
3978     <title>Creating XML Values</title>
3979    <para>
3980     To produce a value of type <type>xml</type> from character data,
3981     use the function
3982     <function>xmlparse</function>:<indexterm><primary>xmlparse</primary></indexterm>
3983 <synopsis>
3984 XMLPARSE ( { DOCUMENT | CONTENT } <replaceable>value</replaceable>)
3985 </synopsis>
3986     Examples:
3987 <programlisting><![CDATA[
3988 XMLPARSE (DOCUMENT '<?xml version="1.0"?><book><title>Manual</title><chapter>...</chapter></book>')
3989 XMLPARSE (CONTENT 'abc<foo>bar</foo><bar>foo</bar>')
3990 ]]></programlisting>
3991     While this is the only way to convert character strings into XML
3992     values according to the SQL standard, the PostgreSQL-specific
3993     syntaxes:
3994 <programlisting><![CDATA[
3995 xml '<foo>bar</foo>'
3996 '<foo>bar</foo>'::xml
3997 ]]></programlisting>
3998     can also be used.
3999    </para>
4000
4001    <para>
4002     The <type>xml</type> type does not validate input values
4003     against a document type declaration
4004     (DTD),<indexterm><primary>DTD</primary></indexterm>
4005     even when the input value specifies a DTD.
4006     There is also currently no built-in support for validating against
4007     other XML schema languages such as XML Schema.
4008    </para>
4009
4010    <para>
4011     The inverse operation, producing a character string value from
4012     <type>xml</type>, uses the function
4013     <function>xmlserialize</function>:<indexterm><primary>xmlserialize</primary></indexterm>
4014 <synopsis>
4015 XMLSERIALIZE ( { DOCUMENT | CONTENT } <replaceable>value</replaceable> AS <replaceable>type</replaceable> )
4016 </synopsis>
4017     <replaceable>type</replaceable> can be
4018     <type>character</type>, <type>character varying</type>, or
4019     <type>text</type> (or an alias for one of those).  Again, according
4020     to the SQL standard, this is the only way to convert between type
4021     <type>xml</type> and character types, but PostgreSQL also allows
4022     you to simply cast the value.
4023    </para>
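   <para>
    As a brief illustration, the following sketch serializes an
    <type>xml</type> value to <type>text</type>, first with
    <function>xmlserialize</function> and then with a plain cast.  The
    input value is an arbitrary example:
<programlisting><![CDATA[
-- standard-conforming form
SELECT XMLSERIALIZE (CONTENT '<foo>bar</foo>'::xml AS text);

-- PostgreSQL-specific cast
SELECT '<foo>bar</foo>'::xml::text;
]]></programlisting>
   </para>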
4024
4025    <para>
4026     When a character string value is cast to or from type
    <type>xml</type> without going through <function>XMLPARSE</function> or
    <function>XMLSERIALIZE</function>, respectively, the choice of
4029     <literal>DOCUMENT</literal> versus <literal>CONTENT</literal> is
4030     determined by the <quote>XML option</quote>
4031     <indexterm><primary>XML option</primary></indexterm>
4032     session configuration parameter, which can be set using the
4033     standard command:
4034 <synopsis>
4035 SET XML OPTION { DOCUMENT | CONTENT };
4036 </synopsis>
4037     or the more PostgreSQL-like syntax
4038 <synopsis>
4039 SET xmloption TO { DOCUMENT | CONTENT };
4040 </synopsis>
4041     The default is <literal>CONTENT</literal>, so all forms of XML
4042     data are allowed.
4043    </para>
4044
4045    <note>
4046     <para>
     With the default XML option setting, you cannot directly cast
     character strings to type <type>xml</type> if they contain a
     document type declaration, because the definition of an XML content
     fragment does not allow one.  If you need to do that, either
     use <literal>XMLPARSE</literal> or change the XML option.
4052     </para>
4053    </note>
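   <para>
    The two ways of handling input that contains a document type
    declaration might look like the following sketch.  The document is
    a made-up example with an internal DTD subset, and the setting
    affects only the current session:
<programlisting><![CDATA[
-- XMLPARSE states DOCUMENT explicitly, independent of the XML option
SELECT XMLPARSE (DOCUMENT '<!DOCTYPE note [<!ELEMENT note (#PCDATA)>]><note>hello</note>');

-- alternatively, change the XML option so that a plain cast parses a document
SET XML OPTION DOCUMENT;
SELECT '<!DOCTYPE note [<!ELEMENT note (#PCDATA)>]><note>hello</note>'::xml;

-- return to the default setting
RESET xmloption;
]]></programlisting>
   </para>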
4054
4055    </sect2>
4056
4057    <sect2>
4058     <title>Encoding Handling</title>
4059    <para>
4060     Care must be taken when dealing with multiple character encodings
4061     on the client, server, and in the XML data passed through them.
    When using the text mode to pass queries to the server and query
    results to the client (which is the normal mode), PostgreSQL
    converts all character data passed in either direction between
    the client and the server to the character encoding of the
    receiving end; see <xref linkend="multibyte">.  This includes string
4067     representations of XML values, such as in the above examples.
4068     This would ordinarily mean that encoding declarations contained in
4069     XML data can become invalid as the character data is converted
4070     to other encodings while travelling between client and server,
4071     because the embedded encoding declaration is not changed.  To cope
4072     with this behavior, encoding declarations contained in
4073     character strings presented for input to the <type>xml</type> type
4074     are <emphasis>ignored</emphasis>, and content is assumed
4075     to be in the current server encoding.  Consequently, for correct
4076     processing, character strings of XML data must be sent
4077     from the client in the current client encoding.  It is the
4078     responsibility of the client to either convert documents to the
4079     current client encoding before sending them to the server, or to
4080     adjust the client encoding appropriately.  On output, values of
4081     type <type>xml</type> will not have an encoding declaration, and
4082     clients should assume all data is in the current client
4083     encoding.
4084    </para>
4085
4086    <para>
4087     When using binary mode to pass query parameters to the server
4088     and query results back to the client, no character set conversion
4089     is performed, so the situation is different.  In this case, an
4090     encoding declaration in the XML data will be observed, and if it
4091     is absent, the data will be assumed to be in UTF-8 (as required by
4092     the XML standard; note that PostgreSQL does not support UTF-16).
4093     On output, data will have an encoding declaration
4094     specifying the client encoding, unless the client encoding is
4095     UTF-8, in which case it will be omitted.
4096    </para>
4097
4098    <para>
4099     Needless to say, processing XML data with PostgreSQL will be less
4100     error-prone and more efficient if the XML data encoding, client encoding,
4101     and server encoding are the same.  Since XML data is internally
4102     processed in UTF-8, computations will be most efficient if the
4103     server encoding is also UTF-8.
4104    </para>
4105
4106    <caution>
4107     <para>
4108      Some XML-related functions may not work at all on non-ASCII data
4109      when the server encoding is not UTF-8.  This is known to be an
4110      issue for <function>xpath()</> in particular.
4111     </para>
4112    </caution>
4113    </sect2>
4114
4115    <sect2>
4116    <title>Accessing XML Values</title>
4117
4118    <para>
4119     The <type>xml</type> data type is unusual in that it does not
4120     provide any comparison operators.  This is because there is no
4121     well-defined and universally useful comparison algorithm for XML
4122     data.  One consequence of this is that you cannot retrieve rows by
4123     comparing an <type>xml</type> column against a search value.  XML
4124     values should therefore typically be accompanied by a separate key
4125     field such as an ID.  An alternative solution for comparing XML
4126     values is to convert them to character strings first, but note
4127     that character string comparison has little to do with a useful
4128     XML comparison method.
4129    </para>
4130
4131    <para>
4132     Since there are no comparison operators for the <type>xml</type>
4133     data type, it is not possible to create an index directly on a
4134     column of this type.  If speedy searches in XML data are desired,
4135     possible workarounds include casting the expression to a
4136     character string type and indexing that, or indexing an XPath
4137     expression.  Of course, the actual query would have to be adjusted
4138     to search by the indexed expression.
4139    </para>
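   <para>
    A minimal sketch of the XPath workaround follows.  The table
    <literal>docs</literal>, its columns, and the XPath expression are
    assumptions chosen for illustration.  The result is cast to
    <type>text</type> because the <type>xml</type> values returned by
    <function>xpath</function> cannot themselves be indexed:
<programlisting>
CREATE TABLE docs (id integer PRIMARY KEY, body xml);

-- index the text of the first /book/title node of each document
CREATE INDEX docs_title_idx
    ON docs (((xpath('/book/title/text()', body))[1]::text));

-- the query must repeat the indexed expression to be able to use the index
SELECT id FROM docs
  WHERE (xpath('/book/title/text()', body))[1]::text = 'Manual';
</programlisting>
   </para>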
4140
4141    <para>
4142     The text-search functionality in PostgreSQL can also be used to speed
4143     up full-document searches of XML data.  The necessary
4144     preprocessing support is, however, not yet available in the PostgreSQL
4145     distribution.
4146    </para>
4147    </sect2>
4148   </sect1>
4149
4150   &array;
4151
4152   &rowtypes;
4153
4154   <sect1 id="datatype-oid">
4155    <title>Object Identifier Types</title>
4156
4157    <indexterm zone="datatype-oid">
4158     <primary>object identifier</primary>
4159     <secondary>data type</secondary>
4160    </indexterm>
4161
4162    <indexterm zone="datatype-oid">
4163     <primary>oid</primary>
4164    </indexterm>
4165
4166    <indexterm zone="datatype-oid">
4167     <primary>regproc</primary>
4168    </indexterm>
4169
4170    <indexterm zone="datatype-oid">
4171     <primary>regprocedure</primary>
4172    </indexterm>
4173
4174    <indexterm zone="datatype-oid">
4175     <primary>regoper</primary>
4176    </indexterm>
4177
4178    <indexterm zone="datatype-oid">
4179     <primary>regoperator</primary>
4180    </indexterm>
4181
4182    <indexterm zone="datatype-oid">
4183     <primary>regclass</primary>
4184    </indexterm>
4185
4186    <indexterm zone="datatype-oid">
4187     <primary>regtype</primary>
4188    </indexterm>
4189
4190    <indexterm zone="datatype-oid">
4191     <primary>regconfig</primary>
4192    </indexterm>
4193
4194    <indexterm zone="datatype-oid">
4195     <primary>regdictionary</primary>
4196    </indexterm>
4197
4198    <indexterm zone="datatype-oid">
4199     <primary>xid</primary>
4200    </indexterm>
4201
4202    <indexterm zone="datatype-oid">
4203     <primary>cid</primary>
4204    </indexterm>
4205
4206    <indexterm zone="datatype-oid">
4207     <primary>tid</primary>
4208    </indexterm>
4209
4210    <para>
4211     Object identifiers (OIDs) are used internally by
4212     <productname>PostgreSQL</productname> as primary keys for various
4213     system tables.  OIDs are not added to user-created tables, unless
4214     <literal>WITH OIDS</literal> is specified when the table is
4215     created, or the <xref linkend="guc-default-with-oids">
4216     configuration variable is enabled.  Type <type>oid</> represents
4217     an object identifier.  There are also several alias types for
4218     <type>oid</>: <type>regproc</>, <type>regprocedure</>,
4219     <type>regoper</>, <type>regoperator</>, <type>regclass</>,
4220     <type>regtype</>, <type>regconfig</>, and <type>regdictionary</>.
4221     <xref linkend="datatype-oid-table"> shows an overview.
4222    </para>
4223
4224    <para>
4225     The <type>oid</> type is currently implemented as an unsigned
4226     four-byte integer.  Therefore, it is not large enough to provide
4227     database-wide uniqueness in large databases, or even in large
4228     individual tables.  So, using a user-created table's OID column as
4229     a primary key is discouraged.  OIDs are best used only for
4230     references to system tables.
4231    </para>
4232
4233    <para>
4234     The <type>oid</> type itself has few operations beyond comparison.
4235     It can be cast to integer, however, and then manipulated using the
4236     standard integer operators.  (Beware of possible
4237     signed-versus-unsigned confusion if you do this.)
4238    </para>
4239
4240    <para>
4241     The OID alias types have no operations of their own except
4242     for specialized input and output routines.  These routines are able
4243     to accept and display symbolic names for system objects, rather than
4244     the raw numeric value that type <type>oid</> would use.  The alias
4245     types allow simplified lookup of OID values for objects.  For example,
4246     to examine the <structname>pg_attribute</> rows related to a table
4247     <literal>mytable</>, one could write:
4248 <programlisting>
4249 SELECT * FROM pg_attribute WHERE attrelid = 'mytable'::regclass;
4250 </programlisting>
4251     rather than:
4252 <programlisting>
4253 SELECT * FROM pg_attribute
4254   WHERE attrelid = (SELECT oid FROM pg_class WHERE relname = 'mytable');
4255 </programlisting>
    While the second form doesn't look all that bad by itself, it is
    still oversimplified: a far more complicated sub-select would be
    needed to select the right OID if there are multiple tables named
    <literal>mytable</> in different schemas.
4260     The <type>regclass</> input converter handles the table lookup according
4261     to the schema path setting, and so it does the <quote>right thing</>
4262     automatically.  Similarly, casting a table's OID to
4263     <type>regclass</> is handy for symbolic display of a numeric OID.
4264    </para>
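   <para>
    For instance, the following queries sketch both directions of the
    conversion; <literal>mytable</> is the same hypothetical table as
    in the example above:
<programlisting>
-- symbolic name to numeric OID
SELECT 'mytable'::regclass::oid;

-- numeric OID to symbolic name
SELECT attrelid::regclass AS table_name, attname
  FROM pg_attribute
  WHERE attrelid = 'mytable'::regclass;
</programlisting>
   </para>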
4265
4266     <table id="datatype-oid-table">
4267      <title>Object Identifier Types</title>
4268      <tgroup cols="4">
4269       <thead>
4270        <row>
4271         <entry>Name</entry>
4272         <entry>References</entry>
4273         <entry>Description</entry>
4274         <entry>Value Example</entry>
4275        </row>
4276       </thead>
4277
4278       <tbody>
4279
4280        <row>
4281         <entry><type>oid</></entry>
4282         <entry>any</entry>
4283         <entry>numeric object identifier</entry>
4284         <entry><literal>564182</></entry>
4285        </row>
4286
4287        <row>
4288         <entry><type>regproc</></entry>
4289         <entry><structname>pg_proc</></entry>
4290         <entry>function name</entry>
4291         <entry><literal>sum</></entry>
4292        </row>
4293
4294        <row>
4295         <entry><type>regprocedure</></entry>
4296         <entry><structname>pg_proc</></entry>
4297         <entry>function with argument types</entry>
4298         <entry><literal>sum(int4)</></entry>
4299        </row>
4300
4301        <row>
4302         <entry><type>regoper</></entry>
4303         <entry><structname>pg_operator</></entry>
4304         <entry>operator name</entry>
4305         <entry><literal>+</></entry>
4306        </row>
4307
4308        <row>
4309         <entry><type>regoperator</></entry>
4310         <entry><structname>pg_operator</></entry>
4311         <entry>operator with argument types</entry>
4312         <entry><literal>*(integer,integer)</> or <literal>-(NONE,integer)</></entry>
4313        </row>
4314
4315        <row>
4316         <entry><type>regclass</></entry>
4317         <entry><structname>pg_class</></entry>
4318         <entry>relation name</entry>
4319         <entry><literal>pg_type</></entry>
4320        </row>
4321
4322        <row>
4323         <entry><type>regtype</></entry>
4324         <entry><structname>pg_type</></entry>
4325         <entry>data type name</entry>
4326         <entry><literal>integer</></entry>
4327        </row>
4328
4329        <row>
4330         <entry><type>regconfig</></entry>
4331         <entry><structname>pg_ts_config</></entry>
4332         <entry>text search configuration</entry>
4333         <entry><literal>english</></entry>
4334        </row>
4335
4336        <row>
4337         <entry><type>regdictionary</></entry>
4338         <entry><structname>pg_ts_dict</></entry>
4339         <entry>text search dictionary</entry>
4340         <entry><literal>simple</></entry>
4341        </row>
4342       </tbody>
4343      </tgroup>
4344     </table>
4345
4346    <para>
4347     All of the OID alias types accept schema-qualified names, and will
4348     display schema-qualified names on output if the object would not
4349     be found in the current search path without being qualified.
4350     The <type>regproc</> and <type>regoper</> alias types will only
4351     accept input names that are unique (not overloaded), so they are
4352     of limited use; for most uses <type>regprocedure</> or
4353     <type>regoperator</> are more appropriate.  For <type>regoperator</>,
4354     unary operators are identified by writing <literal>NONE</> for the unused
4355     operand.
4356    </para>
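   <para>
    The value examples from <xref linkend="datatype-oid-table"> can be
    tried directly, for instance to look up the numeric OID behind a
    name:
<programlisting>
-- function identified by name and argument types
SELECT 'sum(int4)'::regprocedure::oid;

-- binary operator, then a unary operator written with NONE
SELECT '*(integer,integer)'::regoperator::oid;
SELECT '-(NONE,integer)'::regoperator::oid;
</programlisting>
   </para>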
4357
4358    <para>
4359     An additional property of the OID alias types is the creation of
4360     dependencies.  If a
4361     constant of one of these types appears in a stored expression
4362     (such as a column default expression or view), it creates a dependency
4363     on the referenced object.  For example, if a column has a default
4364     expression <literal>nextval('my_seq'::regclass)</>,
4365     <productname>PostgreSQL</productname>
4366     understands that the default expression depends on the sequence
4367     <literal>my_seq</>; the system will not let the sequence be dropped
4368     without first removing the default expression.
4369    </para>
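   <para>
    The behavior can be observed with a sketch like the following, in
    which the table name <literal>my_table</> is hypothetical:
<programlisting>
CREATE SEQUENCE my_seq;
CREATE TABLE my_table (id integer DEFAULT nextval('my_seq'::regclass));

-- rejected, because the column default depends on the sequence
DROP SEQUENCE my_seq;

-- succeeds once the default expression has been removed
ALTER TABLE my_table ALTER COLUMN id DROP DEFAULT;
DROP SEQUENCE my_seq;
</programlisting>
   </para>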
4370
4371    <para>
4372     Another identifier type used by the system is <type>xid</>, or transaction
4373     (abbreviated <abbrev>xact</>) identifier.  This is the data type of the system columns
4374     <structfield>xmin</> and <structfield>xmax</>.  Transaction identifiers are 32-bit quantities.
4375    </para>
4376
4377    <para>
4378     A third identifier type used by the system is <type>cid</>, or
4379     command identifier.  This is the data type of the system columns
    <structfield>cmin</> and <structfield>cmax</>.  Command identifiers are also 32-bit quantities.
4381    </para>
4382
4383    <para>
4384     A final identifier type used by the system is <type>tid</>, or tuple
4385     identifier (row identifier).  This is the data type of the system column
4386     <structfield>ctid</>.  A tuple ID is a pair
4387     (block number, tuple index within block) that identifies the
4388     physical location of the row within its table.
4389    </para>
4390
4391    <para>
4392     (The system columns are further explained in <xref
4393     linkend="ddl-system-columns">.)
4394    </para>
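   <para>
    As an illustration, these system columns can be selected by name
    like ordinary columns; <literal>mytable</> below is a hypothetical
    table:
<programlisting>
SELECT ctid, xmin, xmax, cmin, cmax FROM mytable;
</programlisting>
   </para>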
4395   </sect1>
4396
4397   <sect1 id="datatype-pseudo">
4398    <title>Pseudo-Types</title>
4399
4400    <indexterm zone="datatype-pseudo">
4401     <primary>record</primary>
4402    </indexterm>
4403
4404    <indexterm zone="datatype-pseudo">
4405     <primary>any</primary>
4406    </indexterm>
4407
4408    <indexterm zone="datatype-pseudo">
4409     <primary>anyelement</primary>
4410    </indexterm>
4411
4412    <indexterm zone="datatype-pseudo">
4413     <primary>anyarray</primary>
4414    </indexterm>
4415
4416    <indexterm zone="datatype-pseudo">
4417     <primary>anynonarray</primary>
4418    </indexterm>
4419
4420    <indexterm zone="datatype-pseudo">
4421     <primary>anyenum</primary>
4422    </indexterm>
4423
4424    <indexterm zone="datatype-pseudo">
4425     <primary>void</primary>
4426    </indexterm>
4427
4428    <indexterm zone="datatype-pseudo">
4429     <primary>trigger</primary>
4430    </indexterm>
4431
4432    <indexterm zone="datatype-pseudo">
4433     <primary>language_handler</primary>
4434    </indexterm>
4435
4436    <indexterm zone="datatype-pseudo">
4437     <primary>fdw_handler</primary>
4438    </indexterm>
4439
4440    <indexterm zone="datatype-pseudo">
4441     <primary>cstring</primary>
4442    </indexterm>
4443
4444    <indexterm zone="datatype-pseudo">
4445     <primary>internal</primary>
4446    </indexterm>
4447
4448    <indexterm zone="datatype-pseudo">
4449     <primary>opaque</primary>
4450    </indexterm>
4451
4452    <para>
4453     The <productname>PostgreSQL</productname> type system contains a
4454     number of special-purpose entries that are collectively called
4455     <firstterm>pseudo-types</>.  A pseudo-type cannot be used as a
4456     column data type, but it can be used to declare a function's
4457     argument or result type.  Each of the available pseudo-types is
4458     useful in situations where a function's behavior does not
4459     correspond to simply taking or returning a value of a specific
4460     <acronym>SQL</acronym> data type.  <xref
4461     linkend="datatype-pseudotypes-table"> lists the existing
4462     pseudo-types.
4463    </para>
4464
4465     <table id="datatype-pseudotypes-table">
4466      <title>Pseudo-Types</title>
4467      <tgroup cols="2">
4468       <thead>
4469        <row>
4470         <entry>Name</entry>
4471         <entry>Description</entry>
4472        </row>
4473       </thead>
4474
4475       <tbody>
4476        <row>
4477         <entry><type>any</></entry>
4478         <entry>Indicates that a function accepts any input data type.</entry>
4479        </row>
4480
4481        <row>
4482         <entry><type>anyarray</></entry>
4483         <entry>Indicates that a function accepts any array data type
4484         (see <xref linkend="extend-types-polymorphic">).</entry>
4485        </row>
4486
4487        <row>
4488         <entry><type>anyelement</></entry>
4489         <entry>Indicates that a function accepts any data type
4490         (see <xref linkend="extend-types-polymorphic">).</entry>
4491        </row>
4492
4493        <row>
4494         <entry><type>anyenum</></entry>
4495         <entry>Indicates that a function accepts any enum data type
4496         (see <xref linkend="extend-types-polymorphic"> and
4497         <xref linkend="datatype-enum">).</entry>
4498        </row>
4499
4500        <row>
4501         <entry><type>anynonarray</></entry>
4502         <entry>Indicates that a function accepts any non-array data type
4503         (see <xref linkend="extend-types-polymorphic">).</entry>
4504        </row>
4505
4506        <row>
4507         <entry><type>cstring</></entry>
4508         <entry>Indicates that a function accepts or returns a null-terminated C string.</entry>
4509        </row>
4510
4511        <row>
4512         <entry><type>internal</></entry>
4513         <entry>Indicates that a function accepts or returns a server-internal
4514         data type.</entry>
4515        </row>
4516
4517        <row>
4518         <entry><type>language_handler</></entry>
4519         <entry>A procedural language call handler is declared to return <type>language_handler</>.</entry>
4520        </row>
4521
4522        <row>
4523         <entry><type>fdw_handler</></entry>
4524         <entry>A foreign-data wrapper handler is declared to return <type>fdw_handler</>.</entry>
4525        </row>
4526
4527        <row>
4528         <entry><type>record</></entry>
4529         <entry>Identifies a function returning an unspecified row type.</entry>
4530        </row>
4531
4532        <row>
4533         <entry><type>trigger</></entry>
        <entry>A trigger function is declared to return <type>trigger</>.</entry>
4535        </row>
4536
4537        <row>
4538         <entry><type>void</></entry>
4539         <entry>Indicates that a function returns no value.</entry>
4540        </row>
4541
4542        <row>
4543         <entry><type>opaque</></entry>
4544         <entry>An obsolete type name that formerly served all the above purposes.</entry>
4545        </row>
4546       </tbody>
4547      </tgroup>
4548     </table>
4549
4550    <para>
4551     Functions coded in C (whether built-in or dynamically loaded) can be
4552     declared to accept or return any of these pseudo data types.  It is up to
4553     the function author to ensure that the function will behave safely
4554     when a pseudo-type is used as an argument type.
4555    </para>
4556
4557    <para>
4558     Functions coded in procedural languages can use pseudo-types only as
    allowed by their implementation languages.  At present the procedural
    languages all forbid use of a pseudo-type as an argument type, and allow
    only <type>void</> and <type>record</> as a result type (plus
    <type>trigger</> when the function is used as a trigger).  As an
    exception, some also support polymorphic functions using the types
    <type>anyarray</>, <type>anyelement</>, <type>anyenum</>, and
    <type>anynonarray</>.
4565    </para>
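   <para>
    As a sketch of such a polymorphic function, the following
    <application>PL/pgSQL</application> function accepts any array type
    and returns a value of the matching element type.  The name
    <function>first_element</> is made up for this example:
<programlisting>
CREATE FUNCTION first_element(arr anyarray) RETURNS anyelement AS $$
BEGIN
    -- array subscripts start at 1 by default
    RETURN arr[1];
END;
$$ LANGUAGE plpgsql;

SELECT first_element(ARRAY[10, 20, 30]);
SELECT first_element(ARRAY['a', 'b']);
</programlisting>
   </para>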
4566
4567    <para>
4568     The <type>internal</> pseudo-type is used to declare functions
4569     that are meant only to be called internally by the database
4570     system, and not by direct invocation in an <acronym>SQL</acronym>
4571     query.  If a function has at least one <type>internal</>-type
4572     argument then it cannot be called from <acronym>SQL</acronym>.  To
4573     preserve the type safety of this restriction it is important to
4574     follow this coding rule: do not create any function that is
4575     declared to return <type>internal</> unless it has at least one
4576     <type>internal</> argument.
4577    </para>
4578
4579   </sect1>
4580
4581  </chapter>