I686LINUX/util/I686LINUX/doc/postgresql/html/parser-stage.html

   1 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
   2 <HTML
   3 ><HEAD
   4 ><TITLE
   5 >The Parser Stage</TITLE
   6 ><META
   7 NAME="GENERATOR"
   8 CONTENT="Modular DocBook HTML Stylesheet Version 1.7"><LINK
   9 REV="MADE"
  10 HREF="mailto:pgsql-docs@postgresql.org"><LINK
  11 REL="HOME"
  12 TITLE="PostgreSQL 7.4.1 Documentation"
  13 HREF="index.html"><LINK
  14 REL="UP"
  15 TITLE="Overview of PostgreSQL Internals"
  16 HREF="overview.html"><LINK
  17 REL="PREVIOUS"
  18 TITLE="How Connections are Established"
  19 HREF="connect-estab.html"><LINK
  20 REL="NEXT"
  21 TITLE="The PostgreSQL Rule System"
  22 HREF="rule-system.html"><LINK
  23 REL="STYLESHEET"
  24 TYPE="text/css"
  25 HREF="stylesheet.css"><META
  26 NAME="creation"
  27 CONTENT="2003-12-22T03:48:47"></HEAD
  28 ><BODY
  29 CLASS="SECT1"
  30 ><DIV
  31 CLASS="NAVHEADER"
  32 ><TABLE
  33 SUMMARY="Header navigation table"
  34 WIDTH="100%"
  35 BORDER="0"
  36 CELLPADDING="0"
  37 CELLSPACING="0"
  38 ><TR
  39 ><TH
  40 COLSPAN="5"
  41 ALIGN="center"
  42 VALIGN="bottom"
  43 >PostgreSQL 7.4.1 Documentation</TH
  44 ></TR
  45 ><TR
  46 ><TD
  47 WIDTH="10%"
  48 ALIGN="left"
  49 VALIGN="top"
  50 ><A
  51 HREF="connect-estab.html"
  52 ACCESSKEY="P"
  53 >Prev</A
  54 ></TD
  55 ><TD
  56 WIDTH="10%"
  57 ALIGN="left"
  58 VALIGN="top"
  59 ><A
  60 HREF="overview.html"
  61 >Fast Backward</A
  62 ></TD
  63 ><TD
  64 WIDTH="60%"
  65 ALIGN="center"
  66 VALIGN="bottom"
  67 >Chapter 42. Overview of PostgreSQL Internals</TD
  68 ><TD
  69 WIDTH="10%"
  70 ALIGN="right"
  71 VALIGN="top"
  72 ><A
  73 HREF="overview.html"
  74 >Fast Forward</A
  75 ></TD
  76 ><TD
  77 WIDTH="10%"
  78 ALIGN="right"
  79 VALIGN="top"
  80 ><A
  81 HREF="rule-system.html"
  82 ACCESSKEY="N"
  83 >Next</A
  84 ></TD
  85 ></TR
  86 ></TABLE
  87 ><HR
  88 ALIGN="LEFT"
  89 WIDTH="100%"></DIV
  90 ><DIV
  91 CLASS="SECT1"
  92 ><H1
  93 CLASS="SECT1"
  94 ><A
  95 NAME="PARSER-STAGE"
  96 >42.3. The Parser Stage</A
  97 ></H1
  98 ><P
  99 >    The <I
 100 CLASS="FIRSTTERM"
 101 >parser stage</I
 102 > consists of two parts:
 103
 104     <P
 105 ></P
 106 ></P><UL
 107 ><LI
 108 ><P
 109 >       The <I
 110 CLASS="FIRSTTERM"
 111 >parser</I
 112 > defined in
 113        <TT
 114 CLASS="FILENAME"
 115 >gram.y</TT
 116 > and <TT
 117 CLASS="FILENAME"
 118 >scan.l</TT
 119 > is
 120        built using the Unix tools <SPAN
 121 CLASS="APPLICATION"
 122 >yacc</SPAN
 123 >
 124        and <SPAN
 125 CLASS="APPLICATION"
 126 >lex</SPAN
 127 >.
 128       </P
 129 ></LI
 130 ><LI
 131 ><P
 132 >       The <I
 133 CLASS="FIRSTTERM"
 134 >transformation process</I
 135 > does
 136        modifications and augmentations to the data structures returned by the parser.
 137       </P
 138 ></LI
 139 ></UL
 140 ><P>
 141    </P
 142 ><DIV
 143 CLASS="SECT2"
 144 ><H2
 145 CLASS="SECT2"
 146 ><A
 147 NAME="AEN48540"
 148 >42.3.1. Parser</A
 149 ></H2
 150 ><P
 151 >     The parser has to check the query string (which arrives as plain
  152      ASCII text) for valid syntax. If the syntax is correct, a
 153      <I
 154 CLASS="FIRSTTERM"
 155 >parse tree</I
 156 > is built up and handed back;
 157      otherwise an error is returned. The parser and lexer are
 158      implemented using the well-known Unix tools <SPAN
 159 CLASS="APPLICATION"
 160 >yacc</SPAN
 161 >
 162      and <SPAN
 163 CLASS="APPLICATION"
 164 >lex</SPAN
 165 >.
 166     </P
 167 ><P
 168 >     The <I
 169 CLASS="FIRSTTERM"
 170 >lexer</I
 171 > is defined in the file
 172      <TT
 173 CLASS="FILENAME"
 174 >scan.l</TT
 175 > and is responsible
 176      for recognizing <I
 177 CLASS="FIRSTTERM"
 178 >identifiers</I
 179 >,
  180      <I
  181 CLASS="FIRSTTERM"
  182 >SQL key words</I
  183 >, etc. For
 184      every key word or identifier that is found, a <I
 185 CLASS="FIRSTTERM"
 186 >token</I
 187 >
 188      is generated and handed to the parser.
 189     </P
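><P
>     The lexer's job described above can be pictured with a minimal hand-written sketch in C. It is not the scanner generated from <TT CLASS="FILENAME">scan.l</TT>; the names <TT>TokKind</TT>, <TT>Token</TT> and <TT>next_token</TT>, the lower-case folding, and the five-entry key word list are all assumptions made for this illustration.</P
>

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical token kinds; the real scanner distinguishes many more. */
typedef enum { TOK_EOF, TOK_KEYWORD, TOK_IDENT, TOK_OTHER } TokKind;

typedef struct {
    TokKind kind;
    char    text[64];
} Token;

/* Illustrative key word list only, not PostgreSQL's actual table. */
static const char *const keywords[] = {
    "select", "from", "where", "begin", "rollback"
};

static int is_keyword(const char *word)
{
    size_t i;

    for (i = 0; i < sizeof keywords / sizeof keywords[0]; i++)
        if (strcmp(word, keywords[i]) == 0)
            return 1;
    return 0;
}

/* Read the next token starting at *pos and advance *pos past it. */
Token next_token(const char **pos)
{
    Token       tok = { TOK_EOF, "" };
    const char *p;
    size_t      n = 0;

    assert(pos != NULL && *pos != NULL);
    p = *pos;

    while (isspace((unsigned char) *p))
        p++;

    if (*p == '\0') {
        /* End of input: tok already describes TOK_EOF. */
    } else if (isalpha((unsigned char) *p) || *p == '_') {
        /* Collect a word, folding it to lower case (overlong words are cut). */
        while (isalnum((unsigned char) *p) || *p == '_') {
            if (n < sizeof tok.text - 1)
                tok.text[n++] = (char) tolower((unsigned char) *p);
            p++;
        }
        tok.text[n] = '\0';
        tok.kind = is_keyword(tok.text) ? TOK_KEYWORD : TOK_IDENT;
    } else {
        /* Any other single character: punctuation, operators, digits. */
        tok.text[0] = *p++;
        tok.text[1] = '\0';
        tok.kind = TOK_OTHER;
    }

    *pos = p;
    return tok;
}
```
<P
>     Calling <TT>next_token</TT> repeatedly on a pointer into a query string yields one token per call until <TT>TOK_EOF</TT> is returned, which is roughly the way a parser pulls tokens from its lexer.</P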
 190 ><P
 191 >     The parser is defined in the file <TT
 192 CLASS="FILENAME"
 193 >gram.y</TT
 194 > and
 195      consists of a set of <I
 196 CLASS="FIRSTTERM"
 197 >grammar rules</I
 198 > and
 199      <I
 200 CLASS="FIRSTTERM"
 201 >actions</I
 202 > that are executed whenever a rule
  203      is fired. The actions, which are written in C,
  204      build up the parse tree.
 205     </P
 206 ><P
 207 >     The file <TT
 208 CLASS="FILENAME"
 209 >scan.l</TT
 210 > is transformed to the C
 211      source file <TT
 212 CLASS="FILENAME"
 213 >scan.c</TT
 214 > using the program
 215      <SPAN
 216 CLASS="APPLICATION"
 217 >lex</SPAN
 218 > and <TT
 219 CLASS="FILENAME"
 220 >gram.y</TT
 221 > is
 222      transformed to <TT
 223 CLASS="FILENAME"
 224 >gram.c</TT
 225 > using
 226      <SPAN
 227 CLASS="APPLICATION"
 228 >yacc</SPAN
 229 >.  After these transformations
  230      have taken place, a normal C compiler can be used to create the
  231      parser. Never make any changes to the generated C files, as they
 232      will be overwritten the next time <SPAN
 233 CLASS="APPLICATION"
 234 >lex</SPAN
 235 >
 236      or <SPAN
 237 CLASS="APPLICATION"
 238 >yacc</SPAN
 239 > is called.
 240
 241      </P><DIV
 242 CLASS="NOTE"
 243 ><BLOCKQUOTE
 244 CLASS="NOTE"
 245 ><P
 246 ><B
 247 >Note: </B
  248 >       These transformations and compilations are normally done
 249        automatically using the <I
 250 CLASS="FIRSTTERM"
 251 >makefiles</I
 252 >
 253        shipped with the <SPAN
 254 CLASS="PRODUCTNAME"
 255 >PostgreSQL</SPAN
 256 >
 257        source distribution.
 258       </P
 259 ></BLOCKQUOTE
 260 ></DIV
 261 ><P>
 262     </P
 263 ><P
 264 >     A detailed description of <SPAN
 265 CLASS="APPLICATION"
 266 >yacc</SPAN
 267 > or
 268      the grammar rules given in <TT
 269 CLASS="FILENAME"
 270 >gram.y</TT
 271 > would be
  272      beyond the scope of this chapter. There are many books and
 273      documents dealing with <SPAN
 274 CLASS="APPLICATION"
 275 >lex</SPAN
 276 > and
 277      <SPAN
 278 CLASS="APPLICATION"
 279 >yacc</SPAN
 280 >. You should be familiar with
 281      <SPAN
 282 CLASS="APPLICATION"
 283 >yacc</SPAN
 284 > before you start to study the
 285      grammar given in <TT
 286 CLASS="FILENAME"
 287 >gram.y</TT
  288 >; otherwise you won't
 289      understand what happens there.
 290     </P
 291 ></DIV
 292 ><DIV
 293 CLASS="SECT2"
 294 ><H2
 295 CLASS="SECT2"
 296 ><A
 297 NAME="AEN48576"
 298 >42.3.2. Transformation Process</A
 299 ></H2
 300 ><P
 301 >     The parser stage creates a parse tree using only fixed rules about
 302      the syntactic structure of SQL.  It does not make any lookups in the
  303      system catalogs, so it cannot determine the detailed
 304      semantics of the requested operations.  After the parser completes,
 305      the <I
 306 CLASS="FIRSTTERM"
 307 >transformation process</I
 308 > takes the tree handed
 309      back by the parser as input and does the semantic interpretation needed
 310      to understand which tables, functions, and operators are referenced by
 311      the query.  The data structure that is built to represent this
 312      information is called the <I
 313 CLASS="FIRSTTERM"
 314 >query tree</I
 315 >.
 316     </P
 317 ><P
 318 >     The reason for separating raw parsing from semantic analysis is that
 319      system catalog lookups can only be done within a transaction, and we
 320      do not wish to start a transaction immediately upon receiving a query
 321      string.  The raw parsing stage is sufficient to identify the transaction
 322      control commands (<TT
 323 CLASS="COMMAND"
 324 >BEGIN</TT
 325 >, <TT
 326 CLASS="COMMAND"
 327 >ROLLBACK</TT
  328 >, etc.), and
 329      these can then be correctly executed without any further analysis.
 330      Once we know that we are dealing with an actual query (such as
 331      <TT
 332 CLASS="COMMAND"
 333 >SELECT</TT
 334 > or <TT
 335 CLASS="COMMAND"
 336 >UPDATE</TT
 337 >), it is okay to
 338      start a transaction if we're not already in one.  Only then can the
 339      transformation process be invoked.
 340     </P
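><P
>     The ordering described above can be sketched as a small C model. This is only an illustration under stated assumptions, not the backend's actual control flow: <TT>StmtKind</TT>, <TT>Session</TT> and <TT>handle_statement</TT> are hypothetical names, the <TT>analyzed</TT> counter merely stands in for invoking the transformation process, and the sketch assumes that an implicitly started transaction ends together with its statement.</P
>

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical statement kinds, as identified by raw parsing alone. */
typedef enum {
    STMT_BEGIN, STMT_COMMIT, STMT_ROLLBACK, STMT_SELECT, STMT_UPDATE
} StmtKind;

typedef struct {
    bool in_transaction;  /* is a transaction currently open? */
    int  analyzed;        /* how often the transformation process ran */
} Session;

static bool is_transaction_control(StmtKind kind)
{
    return kind == STMT_BEGIN || kind == STMT_COMMIT || kind == STMT_ROLLBACK;
}

void handle_statement(Session *s, StmtKind kind)
{
    bool implicit;

    assert(s != NULL);

    if (is_transaction_control(kind)) {
        /* Executed directly from the raw parse tree; no catalog lookups. */
        s->in_transaction = (kind == STMT_BEGIN);
        return;
    }

    /* An actual query: make sure a transaction is open first. */
    implicit = !s->in_transaction;
    if (implicit)
        s->in_transaction = true;

    /* Only now may catalog lookups (semantic analysis) take place. */
    s->analyzed++;

    /* Assumption: an implicitly started transaction ends with its statement. */
    if (implicit)
        s->in_transaction = false;
}
```
<P
>     In this model a <TT>STMT_ROLLBACK</TT> never increments <TT>analyzed</TT>, which mirrors the point that transaction control commands need no further analysis.</P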
 341 ><P
 342 >     The query tree created by the transformation process is structurally
 343      similar to the raw parse tree in most places, but it has many differences
 344      in detail.  For example, a <TT
 345 CLASS="STRUCTNAME"
 346 >FuncCall</TT
 347 > node in the
 348      parse tree represents something that looks syntactically like a function
 349      call.  This may be transformed to either a <TT
 350 CLASS="STRUCTNAME"
 351 >FuncExpr</TT
 352 >
 353      or <TT
 354 CLASS="STRUCTNAME"
 355 >Aggref</TT
 356 > node depending on whether the referenced
 357      name turns out to be an ordinary function or an aggregate function.
 358      Also, information about the actual data types of columns and expression
 359      results is added to the query tree.
 360     </P
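><P
>     The <TT CLASS="STRUCTNAME">FuncCall</TT> example can be sketched in C as follows. The structures here are deliberately simplified stand-ins, not the real node definitions: the fields, the <TT>transform_func_call</TT> function, and the three-entry <TT>catalog</TT> array (a toy substitute for a system catalog lookup) are all assumptions made for illustration.</P
>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified node kinds for the query tree. */
typedef enum { NODE_FUNC_EXPR, NODE_AGGREF, NODE_UNKNOWN } NodeKind;

/* Raw parse tree node: only records what the syntax showed. */
typedef struct {
    const char *funcname;
} FuncCall;

/* Query tree node: the kind is resolved and a result type is attached. */
typedef struct {
    NodeKind    kind;
    const char *funcname;
    const char *result_type;
} ExprNode;

/* Toy stand-in for the system catalogs (hypothetical contents). */
typedef struct {
    const char *name;
    int         is_aggregate;
    const char *result_type;
} CatalogEntry;

static const CatalogEntry catalog[] = {
    { "lower", 0, "text"   },
    { "upper", 0, "text"   },
    { "count", 1, "bigint" }
};

ExprNode transform_func_call(const FuncCall *call)
{
    ExprNode node = { NODE_UNKNOWN, NULL, NULL };
    size_t   i;

    assert(call != NULL && call->funcname != NULL);
    node.funcname = call->funcname;

    for (i = 0; i < sizeof catalog / sizeof catalog[0]; i++) {
        if (strcmp(catalog[i].name, call->funcname) == 0) {
            /* The lookup decides between an aggregate and a plain function. */
            node.kind = catalog[i].is_aggregate ? NODE_AGGREF : NODE_FUNC_EXPR;
            node.result_type = catalog[i].result_type;
            break;
        }
    }

    /* In this sketch an unresolved name is left as NODE_UNKNOWN. */
    return node;
}
```
<P
>     Passing a <TT>FuncCall</TT> named <TT>count</TT> produces a node of kind <TT>NODE_AGGREF</TT>, while <TT>lower</TT> produces <TT>NODE_FUNC_EXPR</TT>; in both cases the result type is filled in, which the raw parse tree could not know.</P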
 361 ></DIV
 362 ></DIV
 428 ></BODY
 429 ></HTML
 430 >