*
*
* IDENTIFICATION
- * $PostgreSQL: pgsql/src/backend/storage/ipc/procarray.c,v 1.57 2010/01/16 17:17:26 tgl Exp $
+ * $PostgreSQL: pgsql/src/backend/storage/ipc/procarray.c,v 1.58 2010/01/21 00:53:58 sriggs Exp $
*
*-------------------------------------------------------------------------
*/
/*
* GetConflictingVirtualXIDs -- returns an array of currently active VXIDs.
*
- * The array is palloc'd and is terminated with an invalid VXID.
- *
* Usage is limited to conflict resolution during recovery on standby servers.
* limitXmin is supplied as either latestRemovedXid, or InvalidTransactionId
* in cases where we cannot accurately determine a value for latestRemovedXid.
- * If limitXmin is InvalidTransactionId then we know that the very
+ *
+ * If limitXmin is InvalidTransactionId then we are forced to assume that
* latest xid that might have caused a cleanup record will be
* latestCompletedXid, so we set limitXmin to be latestCompletedXid instead.
* We then skip any backends with xmin > limitXmin. This means that
* cleanup records don't conflict with some recent snapshots.
*
+ * The reason for using latestCompletedxid is that we aren't certain which
+ * of the xids in KnownAssignedXids are actually FATAL errors that did
+ * not write abort records. In almost every case they won't be, but we
+ * don't know that for certain. So we need to conflict with all current
+ * snapshots whose xmin is less than latestCompletedXid to be safe. This
+ * causes false positives in our assessment of which vxids conflict.
+ *
+ * By using exclusive lock we prevent new snapshots from being taken while
+ * we work out which snapshots to conflict with. This protects those new
+ * snapshots from also being included in our conflict list.
+ *
+ * After the lock is released, we allow snapshots again. It is possible
+ * that we arrive at a snapshot that is identical to one that we just
+ * decided we should conflict with. This a case of false positives, not an
+ * actual problem.
+ *
+ * There are two cases: (1) if we were correct in using latestCompletedXid
+ * then that means that all xids in the snapshot lower than that are FATAL
+ * errors, so not xids that ever commit. We can make no visibility errors
+ * if we allow such xids into the snapshot. (2) if we erred on the side of
+ * caution and in fact the latestRemovedXid should have been earlier than
+ * latestCompletedXid then we conflicted with a snapshot needlessly. Taking
+ * another identical snapshot is OK, because the earlier conflicted
+ * snapshot was a false positive.
+ *
+ * In either case, a snapshot taken after conflict assessment will still be
+ * valid and non-conflicting even if an identical snapshot that existed
+ * before conflict assessment was assessed as conflicting.
+ *
+ * If we allowed concurrent snapshots while we were deciding who to
+ * conflict with we would need to include all concurrent snapshotters in
+ * the conflict list as well. We'd have difficulty in working out exactly
+ * who that was, so it is happier for all concerned if we take an exclusive
+ * lock. Notice that we only hold that lock for as long as it takes to
+ * make the conflict list, not for the whole duration of the conflict
+ * resolution.
+ *
+ * It also means that users waiting for a snapshot is a good thing, since
+ * it is more likely that they will live longer after having waited. So it
+ * is a benefit, not an oversight that we use exclusive lock here.
+ *
* We replace InvalidTransactionId with latestCompletedXid here because
* this is the most convenient place to do that, while we hold ProcArrayLock.
* The originator of the cleanup record wanted to avoid checking the value of