Add description of error style.

author Bruce Momjian <bruce@momjian.us>

Sat, 16 Aug 2003 23:53:54 +0000 (23:53 +0000)

committer Bruce Momjian <bruce@momjian.us>

Sat, 16 Aug 2003 23:53:54 +0000 (23:53 +0000)
author Bruce Momjian <bruce@momjian.us>
Sat, 16 Aug 2003 23:53:54 +0000 (23:53 +0000)
committer Bruce Momjian <bruce@momjian.us>
Sat, 16 Aug 2003 23:53:54 +0000 (23:53 +0000)
diff --git a/src/tools/error_text b/src/tools/error_text

new file mode 100644 (file)

index 0000000..de37a3e
--- /dev/null
+++ b/src/tools/error_text
@@ -0,0 +1,299 @@
+To: Peter Eisentraut <peter_e@gmx.net>
+cc: pgsql-hackers@postgresql.org
+Subject: [HACKERS] Error message style guide, take 2
+Date: Wed, 14 May 2003 16:58:18 -0400
+From: Tom Lane <tgl@sss.pgh.pa.us>
+
+I'm about to start going through the backend's elog() calls to update
+them to ereport() style, add error code numbers, polish wording, etc.
+So it's time to nail down our style guide for message wording.  Attached
+is a revision of the draft that Peter posted on 14-March.  Any further
+comments?
+
+BTW, I'd like to SGML-ify this and put it into the developer's guide
+somewhere; any thoughts where exactly?
+
+                       regards, tom lane
+
+
+
+What goes where
+---------------
+
+The primary message should be short, factual, and avoid reference to
+implementation details such as specific function names.  "Short" means
+"should fit on one line under normal conditions".  Use a detail message if
+needed to keep the primary message short, or if you feel a need to mention
+implementation details such as the particular system call that failed.
+Both primary and detail messages should be factual.  Use a hint message
+for suggestions about what to do to fix the problem, especially if the
+suggestion might not always be applicable.
+
+For example, instead of
+       IpcMemoryCreate: shmget(key=%d, size=%u, 0%o) failed: %m
+       (plus a long addendum that is basically a hint)
+write
+       Primary:    Could not create shared memory segment: %m
+       Detail:     Failed syscall was shmget(key=%d, size=%u, 0%o)
+       Hint:       the addendum
+
+RATIONALE: keeping the primary message short helps keep it to the point,
+and lets clients lay out screen space on the assumption that one line is
+enough for error messages.  Detail and hint messages may be relegated to a
+verbose mode, or perhaps a pop-up error-details window.  Also, details and
+hints would normally be suppressed from the server log to save space.
+Reference to implementation details is best avoided since users don't know
+the details anyway.
+
+
+Formatting
+----------
+
+Don't put any specific assumptions about formatting into the message
+texts.  Expect clients and the server log to wrap lines to fit their own
+needs.  In long messages, newline characters (\n) may be used to indicate
+suggested paragraph breaks.  Don't end a message with a newline.  Don't
+use tabs or other formatting characters.  (In error context displays,
+newlines are automatically added to separate levels of context such
+as function calls.)
+
+RATIONALE: Messages are not necessarily displayed on terminal-type
+displays.  In GUI displays or browsers these formatting instructions
+are at best ignored.
+
+
+Quotation marks
+---------------
+
+English text should use double quotes when quoting is appropriate.
+Text in other languages should consistently use one kind of quotes
+that is consistent with publishing customs and computer output of
+other programs.
+
+RATIONALE: The choice of double quotes over single quotes is somewhat
+arbitrary, but tends to be the preferred use.  Some have suggested
+choosing the kind of quotes depending on the type of object according to
+SQL conventions (namely, strings single quoted, identifiers double
+quoted).  But this is a language-internal technical issue that many users
+aren't even familiar with, it won't scale to other kinds of quoted terms,
+it doesn't translate to other languages, and it's pretty pointless, too.
+
+
+Use of quotes
+-------------
+
+Use quotes always to delimit file names, user-supplied identifiers,
+and other variables that might contain words.  Do not use them to
+mark up variables that will not contain words (for example, operator
+names).
+
+There are functions in the backend that will double-quote their own
+output at need (for example, format_type_be()).  Do not put additional
+quotes around the output of such functions.
+
+RATIONALE: Objects can have names that create ambiguity when embedded
+in a message.  Be consistent about denoting where a plugged-in name
+starts and ends.  But don't clutter messages with unnecessary or
+duplicate quote marks.
+
+
+Grammar and punctuation
+-----------------------
+
+The rules are different for primary error messages and for detail/hint
+messages:
+
+Primary error messages: Do not capitalize the first letter.  Do not end a
+message with a period.  Do not even think about ending a message with an
+exclamation point.
+
+Detail and hint messages: Use complete sentences, and end each with
+a period.  Capitalize the starts of sentences.
+
+RATIONALE: Avoiding punctuation makes it easier for client applications to
+embed the message into a variety of grammatical contexts.  Often, primary
+messages are not grammatically complete sentences anyway.  (And if they're
+long enough to be more than one sentence, they should be split into
+primary and detail parts.)  However, detail and hint messages are longer
+and may need to include multiple sentences.  For consistency, they should
+follow complete-sentence style even when there's only one sentence.
+
+
+Upper case vs. lower case
+-------------------------
+
+Use lower case for message wording, including the first letter of a
+primary error message.  Use upper case for SQL commands and key words if
+they appear in the message.
+
+RATIONALE: It's easier to make everything look more consistent this
+way, since some messages are complete sentences and some not.
+
+
+Avoid passive voice
+-------------------
+
+Use the active voice.  Use complete sentences when there is an acting
+subject ("A could not do B").  Use telegram style without subject if
+the subject would be the program itself; do not use "I" for the
+program.
+
+RATIONALE: The program is not human.  Don't pretend otherwise.
+
+
+Present vs past tense
+---------------------
+
+Use past tense if an attempt to do something failed, but could perhaps
+succeed next time (perhaps after fixing some problem).  Use present tense
+if the failure is certainly permanent.
+
+There is a nontrivial semantic difference between sentences of the
+form
+
+       could not open file "%s": %m
+
+and
+
+       cannot open file "%s"
+
+The first one means that the attempt to open the file failed.  The
+message should give a reason, such as "disk full" or "file doesn't
+exist".  The past tense is appropriate because next time the disk
+might not be full anymore or the file in question may exist.
+
+The second form indicates the the functionality of opening the named
+file does not exist at all in the program, or that it's conceptually
+impossible.  The present tense is appropriate because the condition
+will persist indefinitely.
+
+RATIONALE: Granted, the average user will not be able to draw great
+conclusions merely from the tense of the message, but since the
+language provides us with a grammar we should use it correctly.
+
+
+Type of the object
+------------------
+
+When citing the name of an object, state what kind of object it is.
+
+RATIONALE: Else no one will know what "foo.bar.baz" is.
+
+
+Brackets
+--------
+
+Square brackets are only to be used (1) in command synopses to denote
+optional arguments, or (2) to denote an array subscript.
+
+RATIONALE: Anything else does not correspond to widely-known customary
+usage and will confuse people.
+
+
+Assembling error messages
+-------------------------
+
+When a message includes text that is generated elsewhere, embed it in
+this style:
+
+       could not open file %s: %m
+
+RATIONALE: It would be difficult to account for all possible error codes
+to paste this into a single smooth sentence, so some sort of punctuation
+is needed.  Putting the embedded text in parentheses has also been
+suggested, but it's unnatural if the embedded text is likely to be the
+most important part of the message, as is often the case.
+
+
+Reasons for errors
+------------------
+
+Messages should always state the reason why an error occurred.
+For example:
+
+BAD:   could not open file %s
+BETTER:        could not open file %s (I/O failure)
+
+If no reason is known you better fix the code. ;-)
+
+
+Function names
+--------------
+
+Don't include the name of the reporting routine in the error text.
+We have other mechanisms for finding that out when needed, and for
+most users it's not helpful information.  If the error text doesn't
+make as much sense without the function name, reword it.
+
+BAD:   pg_atoi: error in "z": can't parse "z"
+BETTER:        invalid input syntax for integer: "z"
+
+Avoid mentioning called function names, either; instead say what the code
+was trying to do:
+
+BAD:   open() failed: %m
+BETTER:        could not open file %s: %m
+
+If it really seems necessary, mention the system call in the detail
+message.  (In some cases, providing the actual values passed to the
+system call might be appropriate information for the detail message.)
+
+RATIONALE: Users don't know what all those functions do.
+
+
+Tricky words to avoid
+---------------------
+
+unable:
+
+"unable" is nearly the passive voice.  Better use "cannot" or "could
+not", as appropriate.
+
+bad:
+
+Error messages like "bad result" are really hard to interpret
+intelligently.  It's better to write why the result is "bad", e.g.,
+"invalid format".
+
+illegal:
+
+"Illegal" stands for a violation of the law, the rest is "invalid".
+Better yet, say why it's invalid.
+
+unknown:
+
+Try to avoid "unknown".  Consider "error: unknown response".  If you
+don't know what the response is, how do you know it's erroneous?
+"Unrecognized" is often a better choice.  Also, be sure to include the
+value being complained of.
+
+BAD:   unknown node type
+BETTER:        unrecognized node type: 42
+
+find vs. exists:
+
+If the program uses a nontrivial algorithm to locate a resource (e.g., a
+path search) and that algorithm fails, it is fair to say that the program
+couldn't "find" the resource.  If, on the other hand, the expected
+location of the resource is known but the program cannot access it there
+then say that the resource doesn't "exist".  Using "find" in this case
+sounds weak and confuses the issue.
+
+
+Proper spelling
+---------------
+
+Spell out words in full.  For instance, avoid:
+
+spec
+stats
+parens
+auth
+xact
+
+RATIONALE: This will improve consistency.
+
+---------------------------(end of broadcast)---------------------------
+TIP 3: if posting/reading through Usenet, please send an appropriate
+subscribe-nomail command to majordomo@postgresql.org so that your
+message can get through to the mailing list cleanly
author	Bruce Momjian <bruce@momjian.us>
	Sat, 16 Aug 2003 23:53:54 +0000 (23:53 +0000)
committer	Bruce Momjian <bruce@momjian.us>
	Sat, 16 Aug 2003 23:53:54 +0000 (23:53 +0000)