10 Opportunistic Encryption
16 Linux FreeS/WAN Project
20 Opportunistic encryption permits secure
21 (encrypted, authenticated) communication via IPsec
22 without connection-by-connection prearrangement,
23 either explicitly between hosts (when the hosts
24 are capable of it) or transparently via packet-
25 intercepting security gateways. It uses DNS
26 records (authenticated with DNSSEC) to provide the
27 necessary information for gateway discovery and
28 gateway authentication, and constrains negotiation
29 enough to guarantee success.
31 Substantive changes since draft 3: write off
32 inverse queries as a lost cause; use Invalid-SPI
33 rather than Delete as notification of unknown SA;
34 minor wording improvements and clarifications.
35 This document takes over from the older ``Imple-
36 menting Opportunistic Encryption'' document.
41 A major goal of the FreeS/WAN project is opportunistic
42 encryption: a (security) gateway intercepts an outgoing
43 packet aimed at a remote host, and quickly attempts to nego-
44 tiate an IPsec tunnel to that host's security gateway. If
45 the attempt succeeds, traffic can then be secure, transpar-
46 ently (without changes to the host software). If the
47 attempt fails, the packet (or a retry thereof) passes
48 through in clear or is dropped, depending on local policy.
49 Prearranged tunnels bypass the packet interception etc., so
50 static VPNs can coexist with opportunistic encryption.
52 This generalizes trivially to the end-to-end case: host and
security gateway simply are one and the same. Some optimizations
are possible in that case, but the basic scheme is unchanged.
57 The objectives for security systems need to be explicitly
58 stated. Opportunistic encryption is meant to achieve secure
59 communication, without prearrangement of the individual con-
60 nection (although some prearrangement on a per-host basis is
73 required), between any two hosts which implement the proto-
74 col (and, if they act as security gateways, between hosts
75 behind them). Here ``secure'' means strong encryption and
76 authentication of packets, with authentication of partici-
77 pants--to prevent man-in-the-middle and impersonation
78 attacks--dependent on several factors. The biggest factor
79 is the authentication of DNS records, via DNSSEC or equiva-
80 lent means. A lesser factor is which exact variant of the
81 setup procedure (see section 2.2) is used, because there is
82 a tradeoff between strong authentication of the other end
83 and ability to negotiate opportunistic encryption with hosts
84 which have limited or no control of their reverse-map DNS
85 records: without reverse-map information, we can verify that
86 the host has the right to use a particular FQDN (Fully Qual-
87 ified Domain Name), but not whether that FQDN is authorized
88 to use that IP address. Local policy must decide whether
89 authentication or connectivity has higher priority.
91 Apart from careful attention to detail in various areas,
92 there are three crucial design problems for opportunistic
93 encryption. It needs a way to quickly identify the remote
94 host's security gateway. It needs a way to quickly obtain
95 an authentication key for the security gateway. And the
96 numerous options which can be specified with IKE must be
97 constrained sufficiently that two independent implementa-
98 tions are guaranteed to reach agreement, without any
99 explicit prearrangement or preliminary negotiation. The
100 first two problems are solved using DNS, with DNSSEC ensur-
101 ing that the data obtained is reliable; the third is solved
102 by specifying a minimum standard which must be supported.
104 A note on philosophy: we have deliberately avoided providing
105 six different ways to do each job, in favor of specifying
106 one good one. Choices are provided only when they appear to
107 be necessary, or at least important.
109 A note on terminology: to avoid constant circumlocutions, an
110 ISAKMP/IKE SA, possibly recreated occasionally by rekeying,
111 will be referred to as a ``keying channel'', and a set of
112 IPsec SAs providing bidirectional communication between two
113 IPsec hosts, possibly recreated occasionally by rekeying,
114 will be referred to as a ``tunnel'' (it could conceivably
115 use transport mode in the host-to-host case, but we advocate
116 using tunnel mode even there). The word ``connection'' is
117 here used in a more generic sense. The word ``lifetime''
118 will be avoided in favor of ``rekeying interval'', since
119 many of the connections will have useful lives far shorter
120 than any reasonable rekeying interval, and hence the two
121 concepts must be separated.
123 A note on document structure: Discussions of why things were
124 done a particular way, or not done a particular way, are
125 broken out in paragraphs headed ``Rationale:'' (to preserve
126 the flow of the text, many such paragraphs are deferred to
139 the ends of sections). Paragraphs headed ``Ahem:'' are dis-
140 cussions of where the problem is being made significantly
141 harder by problems elsewhere, and how that might be cor-
142 rected. Some meta-comments are enclosed in [].
144 Rationale: The motive is to get the Internet encrypted.
145 That requires encryption without connection-by-connection
146 prearrangement: a system must be able to reliably negotiate
147 an encrypted, authenticated connection with a total
148 stranger. While end-to-end encryption is preferable, doing
149 opportunistic encryption in security gateways gives enormous
150 leverage for quick deployment of this technology, in a world
where end-host software is often primitive, rigid, and outdated.
154 Rationale: Speed is of the essence in tunnel setup: a con-
155 nection-establishment delay longer than about 10 seconds
156 begins to cause problems for users and applications. Thus
the emphasis on rapidity in gateway discovery and key fetching.
Ahem: Host-to-host opportunistic encryption would be utterly
trivial if a fast public-key encryption/signature algorithm
were available. You would do a reverse lookup on the destination
address to obtain a public key for that address, and
164 simply encrypt all packets going to it with that key, sign-
165 ing them with your own private key. Alas, this is impracti-
166 cal with current CPU speeds and current algorithms (although
167 as noted later, it might be of some use for limited pur-
168 poses). Nevertheless, it is a useful model.
For purposes of discussion, the network is taken to look like this:
175 Source----Initiator----...----Responder----Destination
177 The intercepted packet comes from the Source, bound for the
178 Destination, and is intercepted at the Initiator. The Ini-
179 tiator communicates over the insecure Internet to the
180 Responder. The Source and the Initiator might be the same
host, or the Source might be an end-user host and the Initiator
a security gateway (SG). Likewise for the Responder and the
Destination.
185 Given an intercepted packet, whose useful information (for
186 our purposes) is essentially only the Destination's IP
187 address, the Initiator must quickly determine the Responder
188 (the Destination's SG) and fetch everything needed to
189 authenticate it. The Responder must do likewise for the
190 Initiator. Both must eventually also confirm that the other
is authorized to act on behalf of the client host behind it.
205 An important subtlety here is that if the alternative to an
206 IPsec tunnel is plaintext transmission, negative results
207 must be obtained quickly. That is, the decision that no
208 tunnel can be established must also be made rapidly.
210 2.1. Packet Interception
212 Interception of outgoing packets is relatively straightfor-
213 ward in principle. It is preferable to put the intercepted
214 packet on hold rather than dropping it, since higher-level
215 retries are not necessarily well-timed. There is a problem
216 of hosts and applications retrying during negotiations. ARP
217 implementations, which face the same problem, use the
218 approach of keeping the most recent packet for an as-yet-
219 unresolved address, and throwing away older ones. (Incre-
220 menting of request numbers etc. means that replies to older
221 ones may no longer be accepted.)
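The ARP-style holding policy described above can be sketched as follows. This is only an illustration, not FreeS/WAN code: the HoldQueue class and its method names are hypothetical, and packets are treated as opaque byte strings keyed by destination address.

```python
from typing import Dict, Optional


class HoldQueue:
    """Holds at most one pending packet per as-yet-unresolved destination."""

    def __init__(self) -> None:
        self._held: Dict[str, bytes] = {}

    def hold(self, dest_ip: str, packet: bytes) -> Optional[bytes]:
        """Hold the newest packet; return the older one it displaced, if any."""
        displaced = self._held.get(dest_ip)
        self._held[dest_ip] = packet
        return displaced

    def release(self, dest_ip: str) -> Optional[bytes]:
        """Remove and return the held packet once negotiation has finished."""
        return self._held.pop(dest_ip, None)


queue = HoldQueue()
queue.hold("192.0.2.7", b"request #1")
queue.hold("192.0.2.7", b"request #2 (retry)")  # the older packet is thrown away
latest = queue.release("192.0.2.7")
```

Only the retry survives, which matches the observation that replies to older requests may no longer be accepted.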
223 Is it worth intercepting incoming packets, from the outside
224 world, and attempting tunnel setup based on them? No,
225 unless and until a way can be devised to initiate oppor-
226 tunistic encryption to a non-opportunistic responder,
227 because if the other end has not initiated tunnel setup
228 itself, it will not be prepared to do so at our request.
230 Rationale: Note, however, that most incoming packets will
231 promptly be followed by an outgoing packet in response!
232 Conceivably it might be useful to start early stages of
233 negotiation, at least as far as looking up information, in
234 response to an incoming packet.
236 Rationale: If a plaintext incoming packet indicates that the
237 other end is not prepared to do opportunistic encryption, it
238 might seem that this fact should be noted, to avoid consum-
239 ing resources and delaying traffic in an attempt at oppor-
240 tunistic setup which is doomed to fail. However, this would
241 be a major security hole, since the plaintext packet is not
242 authenticated; see section 2.5.
For clarity, the following defers most discussion of error handling to the end.
249 Step 1. Initiator does a DNS reverse lookup on the Destina-
250 tion address, asking not for the usual PTR records,
251 but for TXT records. Meanwhile, Initiator also
252 sends a ping to the Destination, to cause any other
253 dynamic setup actions to start happening. (Ping
254 replies are disregarded; the host might not be
255 reachable with plaintext pings.)
257 Step 2A. If at least one suitable TXT record (see section
258 2.3) comes back, each contains a potential
271 Responder's IP address and that Responder's public
272 key (or where to find it). Initiator picks one TXT
273 record, based on priority (see 2.3), thus picking a
274 Responder. If there was no public key in the TXT
275 record, the Initiator also starts a DNS lookup (as
276 specified by the TXT record) to get KEY records.
278 Step 2B. If no suitable TXT record is available, and policy
279 permits, Initiator designates the Destination
280 itself as the Responder (see section 2.4). If pol-
281 icy does not permit, or the Destination is unre-
282 sponsive to the negotiation, then opportunistic
encryption is not possible, and Initiator gives up.
286 Step 3. If there already is a keying channel to the Respon-
287 der's IP address, the Initiator uses the existing
288 keying channel; skip to step 10. Otherwise, the
289 Initiator starts an IKE Phase 1 negotiation (see
290 section 2.7 for details) with the Responder. The
291 address family of the Responder's IP address dic-
292 tates whether the keying channel and the outside of
293 the tunnel should be IPv4 or IPv6.
295 Step 4. Responder gets the first IKE message, and responds.
296 It also starts a DNS reverse lookup on the Initia-
297 tor's IP address, for KEY records, on speculation.
299 Step 5. Initiator gets Responder's reply, and sends first
300 message of IKE's D-H exchange (see 2.4).
302 Step 6. Responder gets Initiator's D-H message, and
303 responds with a matching one.
305 Step 7. Initiator gets Responder's D-H message; encryption
306 is now established, authentication remains to be
307 done. Initiator sends IKE authentication message,
308 with an FQDN identity if a reverse lookup on its
309 address will not yield a suitable KEY record.
(Note, an FQDN need not actually correspond to a
host--e.g., the DNS data for it need not include an
A record.)
315 If there is no identity included, Responder waits
316 for step 4's speculative DNS lookup to finish; it
317 should yield a suitable KEY record (see 2.3). If
there is an FQDN identity, Responder discards any
319 data obtained from step 4's DNS lookup; does a for-
320 ward lookup on the FQDN, for a KEY record; waits
321 for that lookup to return; it should yield a suit-
322 able KEY record. Either way, Responder uses the
323 KEY data to verify the message's hash. Responder
324 replies with an authentication message, with an
337 FQDN identity if a reverse lookup on its address
338 will not yield a suitable KEY record.
340 Step 9A. (If step 2A was used.) The Initiator gets the
341 Responder's authentication message. Step 2A has
342 provided a key (from the TXT record or via DNS
343 lookup). Verify message's hash. Encrypted and
344 authenticated keying channel established, man-in-
345 middle attack precluded.
347 Step 9B. (If step 2B was used.) The Initiator gets the
348 Responder's authentication message, which must con-
349 tain an FQDN identity (if the Responder can't put a
350 TXT in his reverse map he presumably can't do a KEY
351 either). Do forward lookup on the FQDN, get suit-
352 able KEY record, verify hash. Encrypted keying
353 channel established, man-in-middle attack pre-
354 cluded, but authentication weak (see 2.4).
356 Step 10. Initiator initiates IKE Phase 2 negotiation (see
357 2.7) to establish tunnel, specifying Source and
358 Destination identities as IP addresses (see 2.6).
The address family of those addresses also determines
whether the inside of the tunnel should be IPv4 or IPv6.
363 Step 11. Responder gets first Phase 2 message. Now the
364 Responder finally knows what's going on! Unless
365 the specified Source is identical to the Initiator,
366 Responder initiates DNS reverse lookup on Source IP
367 address, for TXT records; waits for result; gets
368 suitable TXT record(s) (see 2.3), which should con-
369 tain either the Initiator's IP address or an FQDN
370 identity identical to that supplied by the Initia-
371 tor in step 7. This verifies that the Initiator is
372 authorized to act as SG for the Source. Responder
373 replies with second Phase 2 message, selecting
acceptable details (see 2.7), and establishes tunnel.
377 Step 12. Initiator gets second Phase 2 message, establishes
378 tunnel (if he didn't already), and releases the
379 intercepted packet into it, finally.
381 Step 13. Communication proceeds. See section 3 for what
384 As additional information becomes available, notably in
385 steps 1, 2, 4, 8, 9, 11, and 12, there is always a possibil-
386 ity that local policy (e.g., access limitations) might pre-
387 vent further progress. Whenever possible, at least attempt
388 to inform the other end of this.
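As a small illustration of steps 1 and 3, the sketch below builds the reverse-map name on which the Initiator would issue its TXT query, and derives the address family for the keying channel and the outside of the tunnel. The function names are hypothetical, and the IPv6 form produced by Python's standard library (nibble labels under ip6.arpa) is an assumption of this sketch rather than something the text specifies.

```python
import ipaddress


def reverse_map_name(address: str) -> str:
    """Return the reverse-map domain name for an IPv4 or IPv6 address."""
    return ipaddress.ip_address(address).reverse_pointer


def outer_address_family(responder_ip: str) -> int:
    """Return 4 or 6: the family used for the keying channel and tunnel exterior."""
    return ipaddress.ip_address(responder_ip).version


# Step 1: the query name for the Destination; the query type would be TXT, not PTR.
qname = reverse_map_name("192.0.2.7")
# Step 3: the Responder's address family decides IPv4 versus IPv6 outside.
family = outer_address_family("2001:db8::1")
```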
403 At any time, there is a possibility of the negotiation fail-
404 ing due to unexpected responses, e.g. the Responder not
responding at all or rejecting all of the Initiator's proposals.
406 If multiple SGs were found as possible Responders, the Ini-
407 tiator should try at least one more before giving up. The
408 number tried should be influenced by what the alternative
409 is: if the traffic will otherwise be discarded, trying the
410 full list is probably appropriate, while if the alternative
411 is plaintext transmission, it might be based on how long the
412 tries are taking. The Initiator should try as many as it
413 reasonably can, ideally all of them.
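The retry policy just described might look roughly like this. It is a sketch under stated assumptions: pick_working_responder, try_negotiation, and the 10-second default budget are hypothetical names and values chosen for illustration, and try_negotiation stands in for a complete IKE attempt against one SG.

```python
import time
from typing import Callable, Iterable, Optional


def pick_working_responder(
    candidates: Iterable[str],
    try_negotiation: Callable[[str], bool],
    plaintext_fallback: bool,
    budget_seconds: float = 10.0,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[str]:
    """Try candidate SGs in preference order; return the first that succeeds.

    If failure means discarding traffic, the full list is tried.  If failure
    means plaintext, trying stops once the time budget has been spent, but
    never before at least two candidates have been attempted.
    """
    deadline = clock() + budget_seconds
    for index, responder in enumerate(candidates):
        if plaintext_fallback and index >= 2 and clock() >= deadline:
            break  # the tries are taking too long; fall back to plaintext
        if try_negotiation(responder):
            return responder
    return None
```

Passing the clock as a parameter is a design convenience: it lets the time-budget behaviour be exercised without real delays.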
415 There is a sticky problem with timeouts. If the Responder
416 is down or otherwise inaccessible, in the worst case we
417 won't hear about this except by not getting responses. Some
418 other, more pathological or even evil, failure cases can
419 have the same result. The problem is that in the case where
420 plaintext is permitted, we want to decide whether a tunnel
421 is possible quickly. There is no good solution to this,
422 alas; we just have to take the time and do it right. (Pass-
423 ing plaintext meanwhile looks attractive at first glance...
424 but exposing the first few seconds of a connection is often
425 almost as bad as exposing the whole thing. Worse, if the
426 user checks the status of the connection, after that brief
427 window it looks secure!)
429 The flip side of waiting for a timeout is that all other
430 forms of feedback, e.g. ``host not reachable'', arguably
431 should be ignored, because in the absence of authenticated
432 ICMP, you cannot trust them!
434 Rationale: An alternative, sometimes suggested, to the use
435 of explicit DNS records for SG discovery is to directly
436 attempt IKE negotiation with the destination host, and
437 assume that any relevant SG will be on the packet path, will
438 intercept the IKE packets, and will impersonate the destina-
439 tion host for the IKE negotiation. This is superficially
440 attractive but is a very bad idea. It assumes that routing
441 is stable throughout negotiation, that the SG is on the
442 plaintext-packets path, and that the destination host is
443 routable (yes, it is possible to have (private) DNS data for
444 an unroutable host). Playing extra games in the plaintext-
445 packet path hurts performance and can be expected to be
446 unpopular. Various difficulties ensue when there are multi-
447 ple SGs along the path (there is already bad experience with
448 this, in RSVP), and the presence of even one can make it
449 impossible to do IKE direct to the host when that is what's
450 wanted. Worst of all, such impersonation breaks the IP net-
451 work model badly, making problems difficult to diagnose and
452 impossible to work around (and there is already bad experi-
453 ence with this, in areas like web caching).
455 Rationale: (Step 1.) Dynamic setup actions might include
456 establishment of demand-dialed links. These might be
469 present anywhere along the path, so one cannot rely on out-
470 of-band communication at the Initiator to trigger them.
473 Rationale: (Step 2.) In many cases, the IP address on the
474 intercepted packet will be the result of a name lookup just
475 done. Inverse queries, an obscure DNS feature from the dis-
476 tant past, in theory can be used to ask a DNS server to
477 reverse that lookup, giving the name that produced the
478 address. This is not the same as a reverse lookup, and the
479 difference can matter a great deal in cases where a host
480 does not control its reverse map (e.g., when the host's IP
481 address is dynamically assigned). Unfortunately, inverse
queries were never widely implemented and are now considered obsolete.
485 Ahem: Support for a small subset of this admittedly-obscure
486 feature would be useful. Unfortunately, it seems unlikely.
488 Rationale: (Step 3.) Using only IP addresses to decide
489 whether there is already a relevant keying channel avoids
490 some difficult problems. In particular, it might seem that
491 this should be based on identities, but those are not known
492 until very late in IKE Phase 1 negotiations.
494 Rationale: (Step 4.) The DNS lookup is done on speculation
495 because the data will probably be useful and the lookup can
be done in parallel with IKE activity, potentially speeding things up.
499 Rationale: (Steps 7 and 8.) If an SG does not control its
500 reverse map, there is no way it can prove its right to use
501 an IP address, but it can nevertheless supply both an iden-
502 tity (as an FQDN) and proof of its right to use that iden-
503 tity. This is somewhat better than nothing, and may be
504 quite useful if the SG is representing a client host which
505 can prove its right to its IP address. (For example, a
506 fixed-address subnet might live behind an SG with a dynami-
507 cally-assigned address; such an SG has to be the Initiator,
508 not the Responder, so the subnet's TXT records can contain
509 FQDN identities, but with that restriction, this works.) It
510 might sound like this would permit some man-in-the-middle
511 attacks in important cases like Road Warrior, but the RW can
512 still do full authentication of the home base, so a man in
513 the middle cannot successfully impersonate home base, and
514 the D-H exchange doesn't work unless the man in the middle
515 impersonates both ends.
517 Rationale: (Steps 7 and 8.) Another situation where proof
518 of the right to use an identity can be very useful is when
519 access is deliberately limited. While opportunistic encryp-
520 tion is intended as a general-purpose connection mechanism
521 between strangers, it may well be convenient for prearranged
522 connections to use the same mechanism.
535 Rationale: (Steps 7 and 8.) FQDNs as identities are avoided
where possible, since they can involve synchronous DNS lookups.
539 Rationale: (Step 11.) Note that only here, in Phase 2, does
540 the Responder actually learn who the Source and Destination
541 hosts are. This unfortunately demands a synchronous DNS
542 lookup to verify that the Initiator is authorized to repre-
543 sent the Source, unless they are one and the same. This and
544 the initial TXT lookup are the only synchronous DNS lookups
absolutely required by the algorithm, and they appear to be unavoidable.
548 Rationale: While it might seem unlikely that a refusal to
549 cooperate from one SG could be remedied by trying another--
550 presumably they all use the same policies--it's conceivable
551 that one might be misconfigured. Preferably they should all
552 be tried, but it may be necessary to set some limits on this
553 if alternatives exist.
557 Gateway discovery and key lookup are based on TXT and KEY
558 DNS records. The TXT record specifies IP address or other
559 identity of a host's SG, and possibly supplies its public
560 key as well, while the KEY record supplies public keys not
561 found in TXT records.
Opportunistic-encryption SG discovery uses TXT records with contents of the form:
568 X-IPsec-Gateway(nnn)=iii kkk
570 following RFC 1464 attribute/value notation. Records which
571 do not contain an ``='', or which do not have exactly the
572 specified form to the left of it, are ignored. (Near misses
573 perhaps should be reported.)
575 The nnn is an unsigned integer which will fit in 16 bits,
576 specifying an MX-style preference (lower number = stronger
577 preference) to control the order in which multiple SGs are
578 tried. If there are ties, pick one, randomly enough that
579 the choice will probably be different each time. The pref-
580 erence field is not optional; use ``0'' if there is no mean-
581 ingful preference ordering.
583 The iii part identifies the SG. Normally this is a dotted-
584 decimal IPv4 address or a colon-hex IPv6 address. The sole
585 exception is if the SG has no fixed address (see 2.4) but
586 the host(s) behind it do, in which case iii is of the form
587 ``@fqdn'', where fqdn is the FQDN that the SG will use to
588 identify itself (in step 7 of section 2.2); such a record
601 cannot be used for SG discovery by an Initiator, but can be
602 used for SG verification (step 11 of 2.2) by a Responder.
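The verification use of the iii field (step 11 of 2.2) can be sketched as below. The function initiator_authorized and its parameters are hypothetical; it assumes the iii values have already been extracted from the Source's TXT records, and it compares FQDNs case-insensitively, ignoring a trailing dot, as is usual for DNS names.

```python
import ipaddress
from typing import Iterable, Optional


def initiator_authorized(
    source_identities: Iterable[str],
    initiator_ip: str,
    initiator_fqdn: Optional[str] = None,
) -> bool:
    """Check whether any iii value from the Source's TXT records names the Initiator."""
    wanted_ip = ipaddress.ip_address(initiator_ip)
    wanted_name = initiator_fqdn.rstrip(".").lower() if initiator_fqdn else None
    for identity in source_identities:
        if identity.startswith("@"):
            # "@fqdn" form: must match the identity supplied in step 7.
            if wanted_name and identity[1:].rstrip(".").lower() == wanted_name:
                return True
            continue
        try:
            if ipaddress.ip_address(identity) == wanted_ip:
                return True
        except ValueError:
            continue  # not an address; ignore the malformed value
    return False
```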
604 The kkk part is optional. If it is present, it is an RSA-
605 MD5 public key in base-64 notation, as in the text form of
606 an RFC 2535 KEY record. If it is not present, this speci-
607 fies that the public key can be found in a KEY record
608 located based on the SG's identification: if iii is an IP
609 address, do a reverse lookup on that address, else do a for-
610 ward lookup on the FQDN.
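To make the record format concrete, here is a rough parser and selection routine for the TXT contents described above. It is a sketch, not the project's implementation: GatewayRecord, parse_gateway_txt, and choose_for_discovery are hypothetical names, the attribute name is matched exactly as written, and stripping whitespace from kkk is an assumption made so that keys spanning several TXT strings come out as one base-64 token.

```python
import ipaddress
import random
import re
from typing import List, NamedTuple, Optional

_GATEWAY_RE = re.compile(r"^X-IPsec-Gateway\(([0-9]+)\)=(\S+)(?:\s+(\S.*))?$")


class GatewayRecord(NamedTuple):
    preference: int             # nnn: lower number = stronger preference
    identity: str               # iii: IP address text, or "@fqdn"
    public_key: Optional[str]   # kkk: base-64 text, or None (fetch a KEY record)


def parse_gateway_txt(text: str) -> Optional[GatewayRecord]:
    """Parse one TXT record; return None for records that must be ignored."""
    match = _GATEWAY_RE.match(text.strip())
    if match is None:
        return None
    preference = int(match.group(1))
    if preference > 0xFFFF:          # nnn must fit in 16 bits
        return None
    identity = match.group(2)
    if identity.startswith("@"):
        if len(identity) == 1:
            return None
    else:
        try:
            ipaddress.ip_address(identity)
        except ValueError:
            return None
    key = match.group(3)
    public_key = "".join(key.split()) if key else None
    return GatewayRecord(preference, identity, public_key)


def choose_for_discovery(records: List[GatewayRecord]) -> Optional[GatewayRecord]:
    """Pick the most-preferred record usable for SG discovery, ties at random."""
    usable = [r for r in records if not r.identity.startswith("@")]
    if not usable:
        return None
    best = min(r.preference for r in usable)
    return random.choice([r for r in usable if r.preference == best])
```

Records in the "@fqdn" form are skipped during discovery because they are usable only for verification by a Responder.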
612 Rationale: While it is unusual for a reverse lookup to go
613 for records other than PTR records (or possibly CNAME
614 records, for RFC 2317 classless delegation), there's no rea-
615 son why it can't. The TXT record is a temporary stand-in
616 for (we hope, someday) a new DNS record for SG identifica-
617 tion and keying. Keeping the setup process fast requires
618 minimizing the number of DNS lookups, hence the desire to
619 put all the information in one place.
621 Rationale: The use of RFC 1464 notation avoids collisions
622 with other uses of TXT records. The ``X-'' in the attribute
623 name indicates that this format is tentative and experimen-
624 tal; this design will probably need modification after ini-
625 tial experiments. The format is chosen with an eye on even-
626 tual binary encoding. Note, in particular, that the TXT
627 record normally contains the address of the SG, not (repeat,
628 not) its name. Name-to-address conversion is the job of
629 whatever generates the TXT record, which is expected to be a
630 program, not a human--this is conceptually a binary record,
631 temporarily using a text encoding. The ``@fqdn'' form of
the SG identity is for specialized uses and is never mapped to an address.
635 Ahem: A DNS TXT record contains one or more character
636 strings, but RFC 1035 does not describe exactly how a multi-
637 string TXT record is interpreted. This is relevant because
638 a string can be at most 255 characters, and public keys can
639 exceed this. Empirically, the standard pattern is that each
640 string which is both less than 255 characters and not the
641 final string of the record should have a blank appended to
642 it, and the strings of the record should then be concate-
643 nated. (This observation is based on how BIND 8 transforms
644 a TXT record from text to DNS binary.)
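The empirical concatenation rule above can be written down directly. This sketch simply restates the observed pattern; join_txt_strings is a hypothetical helper name, and the strings are assumed to have been decoded from the record already.

```python
from typing import Sequence


def join_txt_strings(strings: Sequence[str]) -> str:
    """Reassemble a multi-string TXT record using the observed pattern.

    A blank is appended to every string that is both shorter than 255
    characters and not the final string; the results are concatenated.
    """
    last = len(strings) - 1
    parts = []
    for index, chunk in enumerate(strings):
        if index != last and len(chunk) < 255:
            chunk += " "
        parts.append(chunk)
    return "".join(parts)
```

A full-length (255-character) string gets no blank, so a long public key split across strings is rejoined without a gap.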
648 An opportunistic-encryption KEY record is an Authentication-
649 permitted, Entity (host), non-Signatory, IPsec, RSA/MD5
650 record (that is, its first four bytes are 0x42000401), as
651 per RFCs 2535 and 2537. KEY records with other flags, pro-
652 tocol, or algorithm values are ignored.
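A check for the four-byte prefix given above might look like this. It is an illustrative sketch: is_oe_key_rdata and the constant names are hypothetical, and the input is assumed to be the raw RDATA of a KEY record (flags, protocol, algorithm, then key material).

```python
import struct

OE_KEY_FLAGS = 0x4200     # authentication-permitted, entity (host), non-signatory
OE_KEY_PROTOCOL = 4       # IPsec
OE_KEY_ALGORITHM = 1      # RSA/MD5


def is_oe_key_rdata(rdata: bytes) -> bool:
    """Return True if KEY RDATA starts with 0x42000401 and carries key material."""
    if len(rdata) <= 4:
        return False
    flags, protocol, algorithm = struct.unpack("!HBB", rdata[:4])
    return (flags, protocol, algorithm) == (
        OE_KEY_FLAGS,
        OE_KEY_PROTOCOL,
        OE_KEY_ALGORITHM,
    )
```

Records failing this check would simply be ignored, as the text specifies.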
658 Draft 4 3 May 2001 10
667 Rationale: Unfortunately, the public key has to be associ-
668 ated with the SG, not the client host behind it. The
669 Responder does not know which client it is supposed to be
670 representing, or which client the Initiator is representing,
673 Ahem: Per-client keys would reduce vulnerability to key com-
674 promise, and simplify key changes, but they would require
675 changes to IKE Phase 1, to separately identify the SG and
676 its initial client(s). (At present, the client identities
677 are not known to the Responder until IKE Phase 2.) While
678 the current IKE standard does not actually specify (!) who
679 is being identified by identity payloads, the overwhelming
680 consensus is that they identify the SG, and as seen earlier,
681 this has important uses.
685 For reference, the minimum set of DNS records needed to make
686 this all work is either:
688 1. TXT in Destination reverse map, identifying Responder
689 and providing public key.
691 2. KEY in Initiator reverse map, providing public key.
3. TXT in Source reverse map, verifying relationship to Initiator.
or:
698 1. TXT in Destination reverse map, identifying Responder.
700 2. KEY in Responder reverse map, providing public key.
702 3. KEY in Initiator reverse map, providing public key.
4. TXT in Source reverse map, verifying relationship to Initiator.
707 Slight complications ensue for dynamic addresses, lack of
708 control over reverse maps, etc.
710 2.3.4. Implementation
712 In the long run, we need either a tree of trust or a web of
713 trust, so we can trust our DNS data. The obvious approach
714 for DNS is a tree of trust, but there are various practical
715 problems with running all of this through the root servers,
716 and a web of trust is arguably more robust anyway. This is
717 logically independent of opportunistic encryption, and a
718 separate design proposal will be prepared.
733 Interim stages of implementation of this will require a bit
734 of thought. Notably, we need some way of dealing with the
735 lack of fully signed DNSSEC records right away. Without
736 user interaction, probably the best we can do is to remember
737 the results of old fetches, compare them to the results of
738 new fetches, and complain and disbelieve all of it if
739 there's a mismatch. This does mean that somebody who gets
740 fake data into our very first fetch will fool us, at least
for a while, but that seems an acceptable tradeoff. (Obviously
there needs to be a way to manually flush the remembered
results for a specific host, to permit deliberate changes.)
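The interim remember-and-compare approach could be sketched as follows. The FetchMemory class is hypothetical, records are treated as opaque strings, and the choice to keep the old data after a mismatch (until it is flushed by hand) is an assumption of this sketch, not something the text prescribes.

```python
from typing import Dict, FrozenSet, Iterable


class FetchMemory:
    """Remembers the first DNS data seen for a name and flags later mismatches."""

    def __init__(self) -> None:
        self._seen: Dict[str, FrozenSet[str]] = {}

    def believe(self, name: str, records: Iterable[str]) -> bool:
        """Return True if the fetched records may be believed.

        The very first fetch for a name is always believed, which is the
        acknowledged weakness of this scheme.
        """
        fetched = frozenset(records)
        remembered = self._seen.setdefault(name, fetched)
        return remembered == fetched

    def flush(self, name: str) -> None:
        """Manually forget the remembered results for one host."""
        self._seen.pop(name, None)
```

A caller would complain to the administrator whenever believe() returns False, and would call flush() after a deliberate key change.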
746 2.4. Responders Without Credentials
748 In cases where the Destination simply does not control its
749 DNS reverse-map entries, there is no verifiable way to
750 determine a suitable SG. This does not make communication
751 utterly impossible, though.
753 Simply attempting negotiation directly with the host is a
754 last resort. (An aggressive implementation might wish to
755 attempt it in parallel, rather than waiting until other
756 options are known to be unavailable.) In particular, in
757 many cases involving dynamic addresses, it will work. It
758 has the disadvantage of delaying the discovery that oppor-
759 tunistic encryption is entirely impossible, but the case
760 seems common enough to justify the overhead.
762 However, there are policy issues here either way, because it
763 is possible to impersonate such a host. The host can supply
764 an FQDN identity and verify its right to use that identity,
765 but except by prearrangement, there is no way to verify that
766 the FQDN is the right one for that IP address. (The data
767 from forward lookups may be controlled by people who do not
768 own the address, so it cannot be trusted.) The encryption
769 is still solid, though, so in many cases this may be useful.
771 2.5. Failure of Opportunism
773 When there is no way to do opportunistic encryption, a pol-
774 icy issue arises: whether to put in a bypass (which allows
775 plaintext traffic through) or a block (which discards it,
776 perhaps with notification back to the sender). The choice
777 is very much a matter of local policy, and may depend on
778 details such as the higher-level protocol being used. For
779 example, an SG might well permit plaintext HTTP but forbid
780 plaintext Telnet, in which case both a block and a bypass
781 would be set up if opportunistic encryption failed.
783 A bypass/block must, in practice, be treated much like an
784 IPsec tunnel. It should persist for a while, so that high-
785 overhead processing doesn't have to be done for every
786 packet, but should go away eventually to return resources.
799 It may be simplest to treat it as a degenerate tunnel. It
800 should have a relatively long lifetime (say 6h) to keep the
801 frequency of negotiation attempts down, except in the case
802 where the other SG simply did not respond to IKE packets,
803 where the lifetime should be short (say 10min) because the
804 other SG is presumably down and might come back up again.
805 (Cases where the other SG responded to IKE with unauthenti-
806 cated error reports like ``port unreachable'' are border-
807 line, and might deserve to be treated as an intermediate
808 case: while such reports cannot be trusted unreservedly, in
809 the absence of any other response, they do give some reason
810 to suspect that the other SG is unable or unwilling to par-
811 ticipate in opportunistic encryption.)
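The lifetime rules for a bypass/block can be summarized in a few lines. This is a sketch only: the Failure enumeration and function name are hypothetical, the 6-hour and 10-minute figures come from the text, and the 1-hour value for the borderline case is purely an assumed placeholder, since the text suggests an intermediate treatment without giving a number.

```python
from enum import Enum, auto


class Failure(Enum):
    NO_RESPONSE = auto()            # other SG never answered IKE packets
    UNAUTHENTICATED_ERROR = auto()  # e.g. "port unreachable" and nothing else
    REFUSED_OR_IMPOSSIBLE = auto()  # any other failure of opportunism


def bypass_lifetime_seconds(failure: Failure) -> int:
    """Return how long a bypass/block should persist before a retry."""
    if failure is Failure.NO_RESPONSE:
        return 10 * 60        # short: the SG may just be down temporarily
    if failure is Failure.UNAUTHENTICATED_ERROR:
        return 60 * 60        # assumed intermediate value, not from the text
    return 6 * 60 * 60        # long: keep negotiation attempts infrequent
```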
813 As noted in section 2.1, one might think that arrival of a
814 plaintext incoming packet should cause a bypass/block to be
815 set up for its source host: such a packet is almost always
816 followed by an outgoing reply packet; the incoming packet is
817 clear evidence that opportunistic encryption is not avail-
818 able at the other end; attempting it will waste resources
819 and delay traffic to no good purpose. Unfortunately, this
820 means that anyone out on the Internet who can forge a source
821 address can prevent encrypted communication! Since their
822 source addresses are not authenticated, plaintext packets
823 cannot be taken as evidence of anything, except perhaps that
824 communication from that host is likely to occur soon.
826 There needs to be a way for local administrators to remove a
827 bypass/block ahead of its normal expiry time, to force a
828 retry after a problem at the other end is known to have been fixed.
831 2.6. Subnet Opportunism
833 In principle, when the Source or Destination host belongs to
834 a subnet and the corresponding SG is willing to provide tun-
835 nels to the whole subnet, this should be done. There is no
836 extra overhead, and considerable potential for avoiding
837 later overhead if similar communication occurs with other
838 members of the subnet. Unfortunately, at the moment, oppor-
839 tunistic tunnels can only have degenerate subnets (single
840 hosts) at their ends. (This does, at least, set up the key-
841 ing channel, so that negotiations for tunnels to other hosts
842 in the same subnets will be considerably faster.)
844 The crucial problem is step 11 of section 2.2: the Responder
845 must verify that the Initiator is authorized to represent
846 the Source, and this is impossible for a subnet because
847 there is no way to do a reverse lookup on it. Information
848 in DNS records for a name or a single address cannot be
849 trusted, because they may be controlled by people who do not
850 control the whole subnet.
865 Ahem: Except in the special case of a subnet masked on a
866 byte boundary (in which case RFC 1035's convention of an
867 incomplete in-addr.arpa name could be used), subnet lookup
868 would need extensions to the reverse-map name space, perhaps
869 along the lines of that commonly done for RFC 2317 delega-
870 tion. IPv6 already has suitable name syntax, as in RFC
871 2874, but has no specific provisions for subnet entries in
872 its reverse maps. Fixing all this is not conceptually
873 difficult, but is logically independent of opportunistic
874 encryption, and will be proposed separately.
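For the byte-boundary special case, the incomplete in-addr.arpa
name can be derived mechanically from the subnet. The sketch
below uses Python's standard ipaddress module; the function
name is hypothetical, and it deliberately rejects every subnet
that is not IPv4 with a /8, /16 or /24 mask, since those are
exactly the cases that need the naming extensions discussed
above.

```python
import ipaddress


def byte_aligned_reverse_name(cidr: str) -> str:
    """Return the incomplete in-addr.arpa name for a byte-aligned IPv4 subnet."""
    # strict=True raises ValueError if host bits are set in the address.
    net = ipaddress.ip_network(cidr, strict=True)
    if net.version != 4 or net.prefixlen not in (8, 16, 24):
        raise ValueError("subnet is not masked on a byte boundary: " + cidr)
    # Keep only the octets covered by the mask, then reverse their order.
    octets = str(net.network_address).split(".")[: net.prefixlen // 8]
    return ".".join(reversed(octets)) + ".in-addr.arpa"
```

For example, byte_aligned_reverse_name("192.0.2.0/24") yields
"2.0.192.in-addr.arpa", while a /25 raises ValueError.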
876 A less-troublesome problem is that the Initiator, in step 10
877 of 2.2, must know exactly what subnet is present on the
878 Responder's end so he can propose a tunnel to it. This
879 information could be included in the TXT record of the Des-
880 tination (it would have to be verified with a subnet lookup,
881 but that could be done in parallel with other operations).
882 The Initiator presumably can be configured to know what sub-
883 net(s) are present on its end.
887 IPsec and IKE have far too many useless options, and a few
888 useful ones. IKE negotiation is quite simplistic, and can-
889 not handle even simple discrepancies between the two SGs.
890 So it is necessary to be quite specific about what should be
891 done and what should be proposed, to guarantee
892 interoperability without prearrangement or other negotiation protocols.
895 Rationale: The prohibition of other negotiations is simply
896 because there is no time. The setup algorithm (section 2.2)
899 [Open question: should opportunistic IKE use a different
900 port than normal IKE?]
902 Somewhat arbitrarily and tentatively, opportunistic SGs must
903 support Main Mode, Oakley group 5 for D-H, 3DES encryption
904 and MD5 authentication for both ISAKMP and IPsec SAs,
905 RSA/MD5 digital-signature authentication with keys between
906 2048 and 8192 bits, and ESP doing both encryption and
907 authentication. They must do key PFS in Quick Mode, but not
908 identity PFS. They may support IPComp, preferably using
909 Deflate, but must not insist on it. They may support AES as
910 an alternative to 3DES, but must not insist on it.
912 Rationale: Identity PFS essentially requires establishing a
913 complete new keying channel for each new tunnel, but key PFS
914 just does a new Diffie-Hellman exchange for each rekeying,
915 which is relatively cheap.
917 Keying channels must remain in existence at least as long as
918 any tunnel created with them remains (they are not costly,
931 and keeping the management path up and available simplifies
932 various issues). See section 3.1 for related issues. Given
933 the use of key PFS, frequent rekeying does not seem critical
934 here. In the absence of strong reason to do otherwise, the
935 Initiator should propose rekeying at 8hr-or-1MB. The
936 Responder must accept any proposal which specifies a rekey-
937 ing time between 1hr and 24hr inclusive and a rekeying vol-
938 ume between 100KB and 10MB inclusive.
940 Given the short expected useful life of most tunnels (see
941 section 3.1), very few of them will survive long enough to
942 be rekeyed. In the absence of strong reason to do other-
943 wise, the Initiator should propose rekeying at 1hr-or-100MB.
944 The Responder must accept any proposal which specifies a
945 rekeying time between 10min and 8hr inclusive and a rekeying
946 volume between 1MB and 1000MB inclusive.
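The two sets of rekeying limits above (one for keying channels,
one for tunnels) can be written down as data plus a range check.
This is a sketch only: the class and constant names are
hypothetical, and treating KB and MB as decimal units (1000 and
1000000 bytes) is an assumption, since the text does not say.

```python
from dataclasses import dataclass

MINUTE = 60
HOUR = 60 * MINUTE
KB = 1000        # assumption: decimal kilobytes
MB = 1000 * KB   # assumption: decimal megabytes


@dataclass(frozen=True)
class RekeyLimits:
    """Inclusive ranges a Responder must accept."""
    min_seconds: int
    max_seconds: int
    min_bytes: int
    max_bytes: int

    def accepts(self, seconds: int, nbytes: int) -> bool:
        """True if a proposed time/volume pair lies within both ranges."""
        return (self.min_seconds <= seconds <= self.max_seconds
                and self.min_bytes <= nbytes <= self.max_bytes)


# Keying channels: 1hr..24hr and 100KB..10MB; default proposal 8hr-or-1MB.
KEYING_CHANNEL_LIMITS = RekeyLimits(1 * HOUR, 24 * HOUR, 100 * KB, 10 * MB)
KEYING_CHANNEL_DEFAULT = (8 * HOUR, 1 * MB)

# Tunnels: 10min..8hr and 1MB..1000MB; default proposal 1hr-or-100MB.
TUNNEL_LIMITS = RekeyLimits(10 * MINUTE, 8 * HOUR, 1 * MB, 1000 * MB)
TUNNEL_DEFAULT = (1 * HOUR, 100 * MB)
```

Both default proposals fall inside the corresponding mandatory
ranges, so two conforming SGs using the defaults always agree.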
948 It is highly desirable to add some random jitter to the
949 times of actual rekeying attempts, to break up ``convoys''
950 of rekeying events; this and certain other aspects of robust
951 rekeying practice will be the subject of a separate design proposal.
954 Rationale: The numbers used here for rekeying intervals are
955 chosen quite arbitrarily and should be re-assessed after
956 some implementation experience is gathered.
958 3. Renewal and Teardown
962 When to tear tunnels down is a bit problematic, but if we're
963 setting up a potentially unbounded number of them, we have
964 to tear them down somehow sometime.
966 Set a short initial tentative lifespan, say 1min, since most
967 net flows in fact last only a few seconds. When that
968 expires, look to see if the tunnel is still in use (defini-
969 tion: has had traffic, in either direction, in the last half
970 of the tentative lifespan). If so, assign it a somewhat
971 longer tentative lifespan, say 20min, after which, look
972 again. If not, close it down. (This tentative lifespan is
973 independent of rekeying; it is just the time when the tun-
974 nel's future is next considered. This should happen reason-
975 ably frequently, unlike rekeying, which is costly and
976 shouldn't be too frequent.) Multi-step backoff algorithms
977 are not worth the trouble; looking every 20min doesn't seem
980 If the security gateway and the client host are one and the
981 same, tunnel teardown decisions might wish to pay attention
982 to TCP connection status, as reported by the local TCP
983 layer. A still-open TCP connection is almost a guarantee
984 that more traffic is coming, while the demise of the only
997 TCP connection through a tunnel is a strong hint that none
998 is. If the SG and the client host are separate machines,
999 though, tracking TCP connection status requires packet
1000 snooping, which is complicated and probably not worthwhile.
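The tentative-lifespan rule described earlier in this section
(1min at first, then 20min at a time while the tunnel is in
use) might be sketched as follows. The function and constant
names are hypothetical, times are plain seconds, and the sketch
ignores TCP connection status entirely.

```python
INITIAL_LIFESPAN = 60.0        # "say 1min": first tentative lifespan
EXTENDED_LIFESPAN = 20 * 60.0  # "say 20min": every later lifespan


def review_tunnel(now, lifespan, last_traffic):
    """Decide a tunnel's fate when its tentative lifespan expires.

    now          -- time of the review, in seconds
    lifespan     -- the tentative lifespan that has just expired
    last_traffic -- time of the last packet in either direction, or None

    Returns the next tentative lifespan, or None to close the tunnel.
    """
    # "In use" means traffic within the last half of the lifespan.
    in_use = last_traffic is not None and (now - last_traffic) <= lifespan / 2
    return EXTENDED_LIFESPAN if in_use else None
```

A tunnel created at time 0 is first reviewed at 60s; traffic at
45s keeps it alive for another 20min, while a tunnel whose last
packet was at 10s is closed.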
1002 IKE keying channels likewise are torn down when it appears
1003 the need has passed. They always linger longer than the
1004 last tunnel they administer, in case they are needed again;
1005 the cost of retaining them is low. Other than that, unless
1006 the number of keying channels on the SG gets large, the SG
1007 should simply retain all of them until rekeying time, since
1008 rekeying is the only costly event. When about to rekey a
1009 keying channel which has no current tunnels, note when the
1010 last actual keying-channel traffic occurred, and close the
1011 keying channel down if it wasn't in the last, say, 30min.
1012 When rekeying a keying channel (or perhaps shortly before
1013 rekeying is expected), Initiator and Responder should re-
1014 fetch the public keys used for SG authentication, against
1015 the possibility that they have changed or disappeared.
1017 See section 2.7 for discussion of rekeying intervals.
1019 Given the low user impact of tearing down and rebuilding a
1020 connection (a tunnel or a keying channel), rekeying attempts
1021 should not be too persistent: one can always just rebuild
1022 when needed, so heroic efforts to preserve an existing con-
1023 nection are unnecessary. Say, try every 10s for a minute
1024 and every minute for 5min, and then give up and declare the
1025 connection (and all other connections to that IKE peer) dead.
1028 Rationale: In future, more sophisticated, versions of this
1029 protocol, examining the initial packet might permit a more
1030 intelligent guess at the tunnel's useful life. HTTP connec-
1031 tions in particular are notoriously bursty and repetitive.
1033 Rationale: Note that rekeying a keying connection basically
1034 consists of building a new keying connection from scratch,
1035 using IKE Phase 1, and abandoning the old one.
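The limited rekeying persistence suggested above ("every 10s
for a minute and every minute for 5min") can be expressed as a
list of waits between attempts. The function name and
parameters are hypothetical, and reading the schedule as six
fast retries followed by five slow ones is one plausible
interpretation, not the only one.

```python
def rekey_retry_delays(fast_interval=10, fast_window=60,
                       slow_interval=60, slow_window=300):
    """Return the waits, in seconds, before each rekeying retry.

    Once the list is exhausted the caller gives up on the connection.
    """
    delays = [fast_interval] * (fast_window // fast_interval)
    delays += [slow_interval] * (slow_window // slow_interval)
    return delays
```

With the default figures this gives eleven retries spread over
six minutes in total.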
1037 3.2. Teardown and Cleanup
1039 Teardown should always be coordinated with the other end.
1040 This means interpreting and sending Delete notifications.
1042 On receiving a Delete for the outbound SAs of a tunnel (or
1043 some subset of them), tear down the inbound ones too, and
1044 notify the other end with a Delete. Tunnels need to be con-
1045 sidered as bidirectional entities, even though the low-level
1046 protocols don't think of them that way.
1048 When the deletion is initiated locally, rather than as a
1049 response to a received Delete, send a Delete for (all) the
1050 inbound SAs of a tunnel. If no responding Delete is
1063 received for the outbound SAs, try re-sending the original
1064 Delete. Three tries spaced 10s apart seems a reasonable
1065 level of effort. (Indefinite persistence is not necessary;
1066 whether the other end isn't cooperating because it doesn't
1067 feel like it, or because it is down/disconnected/etc., the
1068 problem will eventually be cleared up by other means.)
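A locally initiated teardown with bounded persistence might
look like the sketch below. The two callbacks are hypothetical
stand-ins for whatever the IKE implementation provides:
send_delete transmits the Delete for the inbound SAs, and
wait_for_reply blocks for up to the given number of seconds and
reports whether the matching Delete for the outbound SAs
arrived.

```python
def send_delete_with_retries(send_delete, wait_for_reply,
                             tries=3, spacing=10.0):
    """Send a Delete, re-sending until acknowledged or out of tries.

    Returns True if the other end answered with its own Delete,
    False if every try went unanswered.
    """
    for _ in range(tries):
        send_delete()
        if wait_for_reply(spacing):
            return True
    # Give up: the problem will eventually be cleared up by other means.
    return False
```

Either way the caller proceeds to discard its own SAs; the
return value is useful mainly for logging.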
1070 After rekeying, transmission should switch to using the new
1071 SAs (ISAKMP or IPsec) immediately, and the old leftover SAs
1072 should be cleared out promptly (and Deletes sent) rather
1073 than waiting for them to expire. This reduces clutter and
1074 minimizes confusion.
1076 Since there is only one keying channel per remote IP
1077 address, the question of whether a Delete notification has
1078 appeared on a ``suitable'' keying channel does not arise.
1080 Rationale: The pairing of Delete notifications effectively
1081 constitutes an acknowledged Delete, which is highly desirable.
1084 3.3. Outages and Reboots
1086 Tunnels sometimes go down because the other end crashes, or
1087 disconnects, or has a network link break, and there is no
1088 notice of this in the general case. (Even in the event of a
1089 crash and successful reboot, other SGs don't hear about it
1090 unless the rebooted SG has specific reason to talk to them
1091 immediately.) Over-quick response to temporary network out-
1092 ages is undesirable... but note that a tunnel can be torn
1093 down and then re-established without any user-visible effect
1094 except a pause in traffic, whereas if one end does reboot,
1095 the other end can't get packets to it at all (except via
1096 IKE) until the situation is noticed. So a bias toward quick
1097 response is appropriate, even at the cost of occasional
1100 Heartbeat mechanisms are somewhat unsatisfactory for this.
1101 Unless they are very frequent, which causes other problems,
1102 they do not detect the problem promptly.
1104 Ahem: What is really wanted is authenticated ICMP. This
1105 might be a case where public-key encryption/authentication
1106 of network packets is the right thing to do, despite the
1109 In the absence of that, a two-part approach seems warranted.
1111 First, when an SG receives an IPsec packet that is addressed
1112 to it, and otherwise appears healthy, but specifies an
1113 unknown SA and is from a host that the receiver currently
1114 has no keying channel to, the receiver must attempt to
1115 inform the sender via an IKE Initial-Contact notification
1116 (necessarily sent in plaintext, since there is no suitable
1129 keying channel). This must be severely rate-limited on both
1130 ends; one notification per SG pair per minute seems ample.
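The rate limit of one notification per SG pair per minute could
be enforced with a small table of last-send times, as sketched
below. The class name is hypothetical, and the caller supplies
the current time in seconds so that the logic stays independent
of any particular clock.

```python
class NotificationLimiter:
    """Allow at most one notification per SG pair per interval."""

    def __init__(self, min_interval=60.0):
        self.min_interval = min_interval
        self._last_sent = {}  # (local SG, remote SG) -> time of last send

    def allow(self, local_sg, remote_sg, now):
        """Return True, and record the send, if a notification may go out."""
        key = (local_sg, remote_sg)
        last = self._last_sent.get(key)
        if last is not None and now - last < self.min_interval:
            return False
        self._last_sent[key] = now
        return True
```

The same structure works on the receiving end, to bound how
often incoming notifications from one peer are acted upon.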
1132 Second, there is an obvious difficulty with this: the Ini-
1133 tial-Contact notification is unauthenticated and cannot be
1134 trusted. So it must be taken as a hint only: there must be
1135 a way to confirm it.
1137 What is needed here is something that's desirable for debug-
1138 ging and testing anyway: an IKE-level ping mechanism. Ping-
1139 ing direct at the IP level instead will not tell us about a
1140 crash/reboot event. Sending pings through tunnels has vari-
1141 ous complications (they should stop at the far mouth of the
1142 tunnel instead of going on to a subnet; they should not
1143 count against idle timers; etc.). What is needed is a con-
1144 tinuity check on a keying channel. (This could also be used
1145 as a heartbeat, should that seem useful.)
1147 IKE Ping delivery need not be reliable, since the whole
1148 point of a ping is simply to provoke an acknowledgement.
1149 They should preferably be authenticated, but it is not clear
1150 that this is absolutely necessary, although if they are not
1151 they need encryption plus a timestamp or a nonce, to foil
1152 replay mischief. How they are implemented is a secondary
1153 issue, and a separate design proposal will be prepared.
1155 Ahem: Some existing implementations are already using (pri-
1156 vate) notify value 30000 (``LIKE_HELLO'') as ping and (pri-
1157 vate) notify value 30002 (``SHUT_UP'') as ping reply.
1159 If an IKE Ping gets no response, try some (say 8) IP pings,
1160 spaced a few seconds apart, to check IP connectivity; if one
1161 comes back, try another IKE Ping; if that gets no response,
1162 the other end probably has rebooted, or otherwise been
1163 re-initialized, and its tunnels and keying channel(s) should be torn down.
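The confirmation procedure above can be sketched as a short
decision function. Both callbacks are hypothetical: ike_ping
and ip_ping each send one probe, wait as long as the
implementation sees fit (including the few seconds' spacing
between IP pings), and return True if an answer came back. The
three result strings are likewise illustrative.

```python
def check_peer(ike_ping, ip_ping, ip_tries=8):
    """Classify a peer as "alive", "unreachable" or "rebooted"."""
    if ike_ping():
        return "alive"
    # No IKE answer: see whether the peer is reachable at the IP level.
    # any() stops at the first IP ping that comes back.
    if not any(ip_ping() for _ in range(ip_tries)):
        return "unreachable"  # could be a network outage; no conclusion
    # IP connectivity exists, so give IKE one more chance.
    if ike_ping():
        return "alive"
    return "rebooted"  # probably rebooted or otherwise re-initialized
```

Only the "rebooted" result justifies discarding the peer's
tunnels and keying channels; "unreachable" says nothing about
the state of the other SG.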
1166 In a similar vein, given limited rekeying persistence, a
1167 short network outage could take some tunnels down without
1168 disrupting others. On receiving a packet for an unknown SA
1169 from a host that a keying channel is currently open to, send
1170 that host an Invalid-SPI notification for that SA. The other
1171 host can then tear down the half-torn-down tunnel, and
1172 negotiate a new tunnel for the traffic it presumably still wants to send.
1175 Finally, it would be helpful if SGs made some attempt to
1176 deal intelligently with crashes and reboots. A deliberate
1177 shutdown should include an attempt to notify all other SGs
1178 currently connected by keying channels, using Deletes, that
1179 communication is about to fail. (Again, these will be taken
1180 as teardowns; attempts by the other SGs to negotiate new
1181 tunnels as replacements should be ignored at this point.)
1182 And when possible, SGs should attempt to preserve
1195 information about currently-connected SGs in non-volatile
1196 storage, so that after a crash, an Initial-Contact can be
1197 sent to previous partners to indicate loss of all previ-
1198 ously-established connections.
1202 This design appears to achieve the objective of setting up
1203 encryption with strangers. The authentication aspects also
1204 seem adequately addressed if the destination controls its
1205 reverse-map DNS entries and the DNS data itself can be reli-
1206 ably authenticated as having originated from the legitimate
1207 administrators of that subnet/FQDN. The authentication sit-
1208 uation is less satisfactory when DNS is less helpful, but it
1209 is difficult to see what else could be done about it.
1215 6. Appendix: Separate Design Proposals TBW
1217 o How can we build a web of trust with DNSSEC? (See sec-
1220 o How can we extend DNS reverse lookups to permit reverse
1221 lookup on a subnet? (Both address and mask must appear
1222 in the name to be looked up.) (See section 2.6.)
1224 o How can rekeying be done as robustly as possible? (At
1225 least partly, this is just documenting current FreeS/WAN
1226 practice.) (See section 2.7.)
1228 o How should IKE Pings be implemented? (See section 3.3.)
1252 Draft 4 3 May 2001 19