original/man2/ptrace.2

   1 .\" Hey Emacs! This file is -*- nroff -*- source.
   2 .\"
   3 .\" Copyright (c) 1993 Michael Haardt <michael@moria.de>
   4 .\" Fri Apr  2 11:32:09 MET DST 1993
   5 .\"
   6 .\" and changes Copyright (C) 1999 Mike Coleman (mkc@acm.org)
   7 .\" -- major revision to fully document ptrace semantics per recent Linux
   8 .\"    kernel (2.2.10) and glibc (2.1.2)
   9 .\" Sun Nov  7 03:18:35 CST 1999
  10 .\"
  11 .\" and Copyright (c) 2011, Denys Vlasenko <vda.linux@googlemail.com>
  12 .\"
  13 .\" This is free documentation; you can redistribute it and/or
  14 .\" modify it under the terms of the GNU General Public License as
  15 .\" published by the Free Software Foundation; either version 2 of
  16 .\" the License, or (at your option) any later version.
  17 .\"
  18 .\" The GNU General Public License's references to "object code"
  19 .\" and "executables" are to be interpreted as the output of any
  20 .\" document formatting or typesetting system, including
  21 .\" intermediate and printed output.
  22 .\"
  23 .\" This manual is distributed in the hope that it will be useful,
  24 .\" but WITHOUT ANY WARRANTY; without even the implied warranty of
  25 .\" MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
  26 .\" GNU General Public License for more details.
  27 .\"
  28 .\" You should have received a copy of the GNU General Public
  29 .\" License along with this manual; if not, write to the Free
  30 .\" Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111,
  31 .\" USA.
  32 .\"
  33 .\" Modified Fri Jul 23 23:47:18 1993 by Rik Faith <faith@cs.unc.edu>
  34 .\" Modified Fri Jan 31 16:46:30 1997 by Eric S. Raymond <esr@thyrsus.com>
  35 .\" Modified Thu Oct  7 17:28:49 1999 by Andries Brouwer <aeb@cwi.nl>
  36 .\" Modified, 27 May 2004, Michael Kerrisk <mtk.manpages@gmail.com>
  37 .\"     Added notes on capability requirements
  38 .\"
  39 .\" 2006-03-24, Chuck Ebbert <76306.1226@compuserve.com>
  40 .\"    Added    PTRACE_SETOPTIONS, PTRACE_GETEVENTMSG, PTRACE_GETSIGINFO,
  41 .\"        PTRACE_SETSIGINFO, PTRACE_SYSEMU, PTRACE_SYSEMU_SINGLESTEP
  42 .\"    (Thanks to Blaisorblade, Daniel Jacobowitz and others who helped.)
  43 .\" 2011-09, major update by Denys Vlasenko <vda.linux@googlemail.com>
  44 .\"
  45 .\" FIXME Linux 2.6.34 adds PTRACE_GETREGSET/PTRACE_SETREGSET
  46 .\" FIXME Linux 3.1 adds PTRACE_SEIZE, PTRACE_INTERRUPT,
  47 .\"                and PTRACE_LISTEN.
  48 .\"
  49 .TH PTRACE 2 2012-03-24 "Linux" "Linux Programmer's Manual"
  50 .SH NAME
  51 ptrace \- process trace
  52 .SH SYNOPSIS
  53 .nf
  54 .B #include <sys/ptrace.h>
  55 .sp
  56 .BI "long ptrace(enum __ptrace_request " request ", pid_t " pid ", "
  57 .BI "            void *" addr ", void *" data );
  58 .fi
  59 .SH DESCRIPTION
  60 The
  61 .BR ptrace ()
  62 system call provides a means by which one process (the "tracer")
  63 may observe and control the execution of another process (the "tracee"),
  64 and examine and change the tracee's memory and registers.
  65 It is primarily used to implement breakpoint debugging and system
  66 call tracing.
  67 .LP
  68 A tracee first needs to be attached to the tracer.
  69 Attachment and subsequent commands are per thread:
  70 in a multithreaded process,
  71 every thread can be individually attached to a
  72 (potentially different) tracer,
  73 or left not attached and thus not debugged.
  74 Therefore, "tracee" always means "(one) thread",
  75 never "a (possibly multithreaded) process".
  76 Ptrace commands are always sent to
  77 a specific tracee using a call of the form
  78
  79     ptrace(PTRACE_foo, pid, ...)
  80
  81 where
  82 .I pid
  83 is the thread ID of the corresponding Linux thread.
  84 .LP
  85 (Note that in this page, a "multithreaded process"
  86 means a thread group consisting of threads created using the
  87 .BR clone (2)
  88 .B CLONE_THREAD
  89 flag.)
  90 .LP
  91 A process can initiate a trace by calling
  92 .BR fork (2)
  93 and having the resulting child do a
  94 .BR PTRACE_TRACEME ,
  95 followed (typically) by an
  96 .BR execve (2).
  97 Alternatively, one process may commence tracing another process using
  98 .BR PTRACE_ATTACH .
  99 .LP
 100 While being traced, the tracee will stop each time a signal is delivered,
 101 even if the signal is being ignored.
 102 (An exception is
 103 .BR SIGKILL ,
 104 which has its usual effect.)
 105 The tracer will be notified at its next call to
 106 .BR waitpid (2)
 107 (or one of the related "wait" system calls); that call will return a
 108 .I status
 109 value containing information that indicates
 110 the cause of the stop in the tracee.
 111 While the tracee is stopped,
 112 the tracer can use various ptrace requests to inspect and modify the tracee.
 113 The tracer then causes the tracee to continue,
 114 optionally ignoring the delivered signal
 115 (or even delivering a different signal instead).
 116 .LP
 117 When the tracer is finished tracing, it can cause the tracee to continue
 118 executing in a normal, untraced mode via
 119 .BR PTRACE_DETACH .
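.LP
The sequence described above (a trace request in the child, a stop,
notification via
.BR waitpid (2),
and a restart) can be sketched as follows.
This is a minimal illustration rather than a complete tracer.
The function name
.I trace_child
is hypothetical, and for brevity the child stops itself with
.BR raise (3)
instead of calling
.BR execve (2).
.nf
```c
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child that asks to be traced, stops itself, and then exits
 * with exit_code.  The parent observes each stop, restarts the child,
 * and returns the child's exit status, or -1 if tracing failed.
 * (trace_child is an illustrative name, not part of any API.)
 */
int trace_child(int exit_code)
{
    int status;
    pid_t pid = fork();

    if (pid == -1)
        return -1;

    if (pid == 0) {
        /* Tracee: request tracing by the parent, then stop. */
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1)
            _exit(127);
        raise(SIGSTOP);
        _exit(exit_code);
    }

    /* Tracer: the first notification should be the SIGSTOP stop. */
    if (waitpid(pid, &status, 0) == -1 || !WIFSTOPPED(status))
        return -1;

    /* Restart with data == 0, so the SIGSTOP is not delivered. */
    if (ptrace(PTRACE_CONT, pid, NULL, NULL) == -1) {
        kill(pid, SIGKILL);
        waitpid(pid, &status, 0);
        return -1;
    }

    for (;;) {
        if (waitpid(pid, &status, 0) == -1)
            return -1;
        if (WIFEXITED(status))
            return WEXITSTATUS(status);
        if (WIFSIGNALED(status))
            return -1;
        /* Any later stop: pass the reported signal through. */
        if (WIFSTOPPED(status))
            ptrace(PTRACE_CONT, pid, NULL,
                   (void *) (long) WSTOPSIG(status));
    }
}
```
.fi
.LP
In this sketch, a call such as
.I trace_child(42)
is expected to return 42 when tracing is permitted,
and \-1 when the child could not be traced at all.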
 120 .LP
 121 The value of
 122 .I request
 123 determines the action to be performed:
 124 .TP
 125 .B PTRACE_TRACEME
 126 Indicate that this process is to be traced by its parent.
 127 Any signal (except
 128 .BR SIGKILL )
 129 delivered to this process will cause it to stop and its
 130 parent to be notified via
 131 .BR waitpid (2).
 132 In addition, all subsequent calls to
 133 .BR execve (2)
 134 by the traced process will cause a
 135 .B SIGTRAP
 136 to be sent to it,
 137 giving the parent a chance to gain control before the new program
 138 begins execution.
 139 A process probably shouldn't make this request if its parent
 140 isn't expecting to trace it.
 141 .RI ( pid ,
 142 .IR addr ,
 143 and
 144 .IR data
 145 are ignored.)
 146 .LP
 147 The
 148 .B PTRACE_TRACEME
 149 request is used only by the tracee;
 150 the remaining requests are used only by the tracer.
 151 In the following requests,
 152 .I pid
 153 specifies the thread ID of the tracee to be acted on.
 154 For requests other than
 155 .BR PTRACE_ATTACH " and " PTRACE_KILL ,
 156 the tracee must be stopped.
 157 .TP
 158 .BR PTRACE_PEEKTEXT ", " PTRACE_PEEKDATA
 159 Read a word at the address
 160 .I addr
 161 in the tracee's memory, returning the word as the result of the
 162 .BR ptrace ()
 163 call.
 164 Linux does not have separate text and data address spaces,
 165 so these two requests are currently equivalent.
 166 .RI ( data
 167 is ignored.)
 168 .TP
 169 .B PTRACE_PEEKUSER
 170 .\" PTRACE_PEEKUSR in kernel source, but glibc uses PTRACE_PEEKUSER,
 171 .\" and that is the name that seems common on other systems.
 172 Read a word at offset
 173 .I addr
 174 in the tracee's USER area,
 175 which holds the registers and other information about the process
 176 (see
 177 .IR <sys/user.h> ).
 178 The word is returned as the result of the
 179 .BR ptrace ()
 180 call.
 181 Typically, the offset must be word-aligned, though this might vary by
 182 architecture.
 183 See NOTES.
 184 .RI ( data
 185 is ignored.)
 186 .TP
 187 .BR PTRACE_POKETEXT ", " PTRACE_POKEDATA
 188 Copy the word
 189 .I data
 190 to the address
 191 .I addr
 192 in the tracee's memory.
 193 As for
 194 .BR PTRACE_PEEKTEXT
 195 and
 196 .BR PTRACE_PEEKDATA ,
 197 these two requests are currently equivalent.
 198 .TP
 199 .B PTRACE_POKEUSER
 200 .\" PTRACE_POKEUSR in kernel source, but glibc uses PTRACE_POKEUSER,
 201 .\" and that is the name that seems common on other systems.
 202 Copy the word
 203 .I data
 204 to offset
 205 .I addr
 206 in the tracee's USER area.
 207 As for
 208 .BR PTRACE_PEEKUSER ,
 209 the offset must typically be word-aligned.
 210 In order to maintain the integrity of the kernel,
 211 some modifications to the USER area are disallowed.
 212 .\" FIXME In the preceding sentence, which modifications are disallowed,
 213 .\" and when they are disallowed, how does userspace discover that fact?
 214 .TP
 215 .BR PTRACE_GETREGS ", " PTRACE_GETFPREGS
 216 Copy the tracee's general-purpose or floating-point registers,
 217 respectively, to the address
 218 .I data
 219 in the tracer.
 220 See
 221 .I <sys/user.h>
 222 for information on the format of this data.
 223 .RI ( addr
 224 is ignored.)
 225 .TP
 226 .BR PTRACE_GETSIGINFO " (since Linux 2.3.99-pre6)"
 227 Retrieve information about the signal that caused the stop.
 228 Copy a
 229 .I siginfo_t
 230 structure (see
 231 .BR sigaction (2))
 232 from the tracee to the address
 233 .I data
 234 in the tracer.
 235 .RI ( addr
 236 is ignored.)
 237 .TP
 238 .BR PTRACE_SETREGS ", " PTRACE_SETFPREGS
 239 Set the tracee's general-purpose or floating-point registers,
 240 respectively, from the address
 241 .I data
 242 in the tracer.
 243 As for
 244 .BR PTRACE_POKEUSER ,
 245 some general-purpose register modifications may be disallowed.
 246 .\" FIXME In the preceding sentence, which modifications are disallowed,
 247 .\" and when they are disallowed, how does userspace discover that fact?
 248 .RI ( addr
 249 is ignored.)
 250 .TP
 251 .BR PTRACE_SETSIGINFO " (since Linux 2.3.99-pre6)"
 252 Set signal information:
 253 copy a
 254 .I siginfo_t
 255 structure from the address
 256 .I data
 257 in the tracer to the tracee.
 258 This will affect only signals that would normally be delivered to
 259 the tracee and were caught by the tracer.
 260 It may be difficult to tell
 261 these normal signals from synthetic signals generated by
 262 .BR ptrace ()
 263 itself.
 264 .RI ( addr
 265 is ignored.)
 266 .TP
 267 .BR PTRACE_SETOPTIONS " (since Linux 2.4.6; see BUGS for caveats)"
 268 Set ptrace options from
 269 .IR data .
 270 .RI ( addr
 271 is ignored.)
 272 .IR data
 273 is interpreted as a bit mask of options,
 274 which are specified by the following flags:
 275 .RS
 276 .TP
 277 .BR PTRACE_O_TRACESYSGOOD " (since Linux 2.4.6)"
 278 When delivering system call traps, set bit 7 in the signal number
 279 (i.e., deliver
 280 .IR "SIGTRAP|0x80" ).
 281 This makes it easy for the tracer to distinguish
 282 normal traps from those caused by a system call.
 283 .RB ( PTRACE_O_TRACESYSGOOD
 284 may not work on all architectures.)
 285 .TP
 286 .BR PTRACE_O_TRACEFORK " (since Linux 2.5.46)"
 287 Stop the tracee at the next
 288 .BR fork (2)
 289 and automatically start tracing the newly forked process,
 290 which will start with a
 291 .BR SIGSTOP .
 292 A
 293 .BR waitpid (2)
 294 by the tracer will return a
 295 .I status
 296 value such that
 297
 298 .nf
 299   status>>8 == (SIGTRAP | (PTRACE_EVENT_FORK<<8))
 300 .fi
 301
 302 The PID of the new process can be retrieved with
 303 .BR PTRACE_GETEVENTMSG .
 304 .TP
 305 .BR PTRACE_O_TRACEVFORK " (since Linux 2.5.46)"
 306 Stop the tracee at the next
 307 .BR vfork (2)
 308 and automatically start tracing the newly vforked process,
 309 which will start with a
 310 .BR SIGSTOP .
 311 A
 312 .BR waitpid (2)
 313 by the tracer will return a
 314 .I status
 315 value such that
 316
 317 .nf
 318   status>>8 == (SIGTRAP | (PTRACE_EVENT_VFORK<<8))
 319 .fi
 320
 321 The PID of the new process can be retrieved with
 322 .BR PTRACE_GETEVENTMSG .
 323 .TP
 324 .BR PTRACE_O_TRACECLONE " (since Linux 2.5.46)"
 325 Stop the tracee at the next
 326 .BR clone (2)
 327 and automatically start tracing the newly cloned process,
 328 which will start with a
 329 .BR SIGSTOP .
 330 A
 331 .BR waitpid (2)
 332 by the tracer will return a
 333 .I status
 334 value such that
 335
 336 .nf
 337   status>>8 == (SIGTRAP | (PTRACE_EVENT_CLONE<<8))
 338 .fi
 339
 340 The PID of the new process can be retrieved with
 341 .BR PTRACE_GETEVENTMSG .
 342 .IP
 343 This option may not catch
 344 .BR clone (2)
 345 calls in all cases.
 346 If the tracee calls
 347 .BR clone (2)
 348 with the
 349 .B CLONE_VFORK
 350 flag,
 351 .B PTRACE_EVENT_VFORK
 352 will be delivered instead
 353 if
 354 .B PTRACE_O_TRACEVFORK
 355 is set; otherwise if the tracee calls
 356 .BR clone (2)
 357 with the exit signal set to
 358 .BR SIGCHLD ,
 359 .B PTRACE_EVENT_FORK
 360 will be delivered if
 361 .B PTRACE_O_TRACEFORK
 362 is set.
 363 .TP
 364 .BR PTRACE_O_TRACEEXEC " (since Linux 2.5.46)"
 365 Stop the tracee at the next
 366 .BR execve (2).
 367 A
 368 .BR waitpid (2)
 369 by the tracer will return a
 370 .I status
 371 value such that
 372
 373 .nf
 374   status>>8 == (SIGTRAP | (PTRACE_EVENT_EXEC<<8))
 375 .fi
 376
 377 If the execing thread is not a thread group leader,
 378 the thread ID is reset to the thread group leader's ID before this stop.
 379 Since Linux 3.0, the former thread ID can be retrieved with
 380 .BR PTRACE_GETEVENTMSG .
 381 .TP
 382 .BR PTRACE_O_TRACEVFORKDONE " (since Linux 2.5.60)"
 383 Stop the tracee at the completion of the next
 384 .BR vfork (2).
 385 A
 386 .BR waitpid (2)
 387 by the tracer will return a
 388 .I status
 389 value such that
 390
 391 .nf
 392   status>>8 == (SIGTRAP | (PTRACE_EVENT_VFORK_DONE<<8))
 393 .fi
 394
 395 The PID of the new process can (since Linux 2.6.18) be retrieved with
 396 .BR PTRACE_GETEVENTMSG .
 397 .TP
 398 .BR PTRACE_O_TRACEEXIT " (since Linux 2.5.60)"
 399 Stop the tracee at exit.
 400 A
 401 .BR waitpid (2)
 402 by the tracer will return a
 403 .I status
 404 value such that
 405
 406 .nf
 407   status>>8 == (SIGTRAP | (PTRACE_EVENT_EXIT<<8))
 408 .fi
 409
 410 The tracee's exit status can be retrieved with
 411 .BR PTRACE_GETEVENTMSG .
 412 .IP
 413 The tracee is stopped early during process exit,
 414 when registers are still available,
 415 allowing the tracer to see where the exit occurred,
 416 whereas the normal exit notification is done after the process
 417 is finished exiting.
 418 Even though context is available,
 419 the tracer cannot prevent the exit from happening at this point.
 420 .RE
 421 .TP
 422 .BR PTRACE_GETEVENTMSG " (since Linux 2.5.46)"
 423 Retrieve a message (as an
 424 .IR "unsigned long" )
 425 about the ptrace event
 426 that just happened, placing it at the address
 427 .I data
 428 in the tracer.
 429 For
 430 .BR PTRACE_EVENT_EXIT ,
 431 this is the tracee's exit status.
 432 For
 433 .BR PTRACE_EVENT_FORK ,
 434 .BR PTRACE_EVENT_VFORK ,
 435 .BR PTRACE_EVENT_VFORK_DONE ,
 436 and
 437 .BR PTRACE_EVENT_CLONE ,
 438 this is the PID of the new process.
 439 .RI ( addr
 440 is ignored.)
 441 .TP
 442 .B PTRACE_CONT
 443 Restart the stopped tracee process.
 444 If
 445 .I data
 446 is nonzero,
 447 it is interpreted as the number of a signal to be delivered to the tracee;
 448 otherwise, no signal is delivered.
 449 Thus, for example, the tracer can control
 450 whether a signal sent to the tracee is delivered or not.
 451 .RI ( addr
 452 is ignored.)
 453 .TP
 454 .BR PTRACE_SYSCALL ", " PTRACE_SINGLESTEP
 455 Restart the stopped tracee as for
 456 .BR PTRACE_CONT ,
 457 but arrange for the tracee to be stopped at
 458 the next entry to or exit from a system call,
 459 or after execution of a single instruction, respectively.
 460 (The tracee will also, as usual, be stopped upon receipt of a signal.)
 461 From the tracer's perspective, the tracee will appear to have been
 462 stopped by receipt of a
 463 .BR SIGTRAP .
 464 So, for
 465 .BR PTRACE_SYSCALL ,
 466 for example, the idea is to inspect
 467 the arguments to the system call at the first stop,
 468 then do another
 469 .B PTRACE_SYSCALL
 470 and inspect the return value of the system call at the second stop.
 471 The
 472 .I data
 473 argument is treated as for
 474 .BR PTRACE_CONT .
 475 .RI ( addr
 476 is ignored.)
 477 .TP
 478 .BR PTRACE_SYSEMU ", " PTRACE_SYSEMU_SINGLESTEP " (since Linux 2.6.14)"
 479 For
 480 .BR PTRACE_SYSEMU ,
 481 continue and stop on entry to the next system call,
 482 which will not be executed.
 483 For
 484 .BR PTRACE_SYSEMU_SINGLESTEP ,
 485 do the same, but also single-step if the next instruction is not a system call.
 486 This call is used by programs like
 487 User Mode Linux that want to emulate all the tracee's system calls.
 488 The
 489 .I data
 490 argument is treated as for
 491 .BR PTRACE_CONT .
 492 .RI ( addr
 493 is ignored;
 494 not supported on all architectures.)
 495 .TP
 496 .B PTRACE_KILL
 497 Send the tracee a
 498 .B SIGKILL
 499 to terminate it.
 500 .RI ( addr
 501 and
 502 .I data
 503 are ignored.)
 504 .IP
 505 .I This operation is deprecated; do not use it!
 506 Instead, send a
 507 .BR SIGKILL
 508 directly using
 509 .BR kill (2)
 510 or
 511 .BR tgkill (2).
 512 The problem with
 513 .B PTRACE_KILL
 514 is that it requires the tracee to be in signal-delivery-stop,
 515 otherwise it may not work
 516 (i.e., may complete successfully but won't kill the tracee).
 517 By contrast, sending a
 518 .B SIGKILL
 519 directly has no such limitation.
 520 .\" [Note from Denys Vlasenko:
 521 .\"     deprecation suggested by Oleg Nesterov. He prefers to deprecate it
 522 .\"     instead of describing (and needing to support) PTRACE_KILL's quirks.]
 523 .TP
 524 .B PTRACE_ATTACH
 525 Attach to the process specified in
 526 .IR pid ,
 527 making it a tracee of the calling process.
 528 .\" No longer true (removed by Denys Vlasenko, 2011, who remarks:
 529 .\"        "I think it isn't true in non-ancient 2.4 and in 2.6/3.x.
 530 .\"         Basically, it's not true for any Linux in practical use.
 531 .\" ; the behavior of the tracee is as if it had done a
 532 .\" .BR PTRACE_TRACEME .
 533 .\" The calling process actually becomes the parent of the tracee
 534 .\" process for most purposes (e.g., it will receive
 535 .\" notification of tracee events and appears in
 536 .\" .BR ps (1)
 537 .\" output as the tracee's parent), but a
 538 .\" .BR getppid (2)
 539 .\" by the tracee will still return the PID of the original parent.
 540 The tracee is sent a
 541 .BR SIGSTOP ,
 542 but will not necessarily have stopped
 543 by the completion of this call; use
 544 .BR waitpid (2)
 545 to wait for the tracee to stop.
 546 See the "Attaching and detaching" subsection for additional information.
 547 .RI ( addr
 548 and
 549 .I data
 550 are ignored.)
 551 .TP
 552 .B PTRACE_DETACH
 553 Restart the stopped tracee as for
 554 .BR PTRACE_CONT ,
 555 but first detach from it.
 556 Under Linux, a tracee can be detached in this way regardless
 557 of which method was used to initiate tracing.
 558 .RI ( addr
 559 is ignored.)
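.LP
Several of the requests above are commonly combined:
attach, wait for the stop, inspect, and detach.
The following is a minimal sketch under simplifying assumptions,
not a robust implementation.
The function name
.I read_tracee_word
is hypothetical, and the sketch does not check which signal caused
the first stop, so a signal other than the attach
.B SIGSTOP
would be discarded on detach.
Because a word read from the tracee can legitimately be \-1,
the sketch clears
.I errno
before the peek and tests it afterward.
.nf
```c
#include <errno.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>

/*
 * Attach to thread pid, wait for it to stop, read one word at addr,
 * and detach.  Returns 0 and stores the word in *out on success;
 * returns -1 with errno set on failure.
 * (read_tracee_word is an illustrative name, not part of any API.)
 */
int read_tracee_word(pid_t pid, void *addr, long *out)
{
    int status, saved;
    long word;

    if (ptrace(PTRACE_ATTACH, pid, NULL, NULL) == -1)
        return -1;

    /* PTRACE_ATTACH does not wait for the stop; waitpid() does. */
    if (waitpid(pid, &status, __WALL) == -1)
        return -1;
    if (!WIFSTOPPED(status)) {
        errno = ESRCH;          /* tracee died before stopping */
        return -1;
    }

    /* The peeked word may itself be -1, so errno is the indicator. */
    errno = 0;
    word = ptrace(PTRACE_PEEKDATA, pid, addr, NULL);
    saved = errno;

    /* data == 0: no signal is delivered when the tracee restarts. */
    ptrace(PTRACE_DETACH, pid, NULL, NULL);

    if (saved != 0) {
        errno = saved;
        return -1;
    }
    *out = word;
    return 0;
}
```
.fi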
 560 .SS Death under ptrace
 561 When a (possibly multithreaded) process receives a killing signal
 562 (one whose disposition is set to
 563 .B SIG_DFL
 564 and whose default action is to kill the process),
 565 all threads exit.
 566 Tracees report their death to their tracer(s).
 567 Notification of this event is delivered via
 568 .BR waitpid (2).
 569 .LP
 570 Note that the killing signal will first cause signal-delivery-stop
 571 (on one tracee only),
 572 and only after it is injected by the tracer
 573 (or after it was dispatched to a thread which isn't traced),
 574 will death from the signal happen on
 575 .I all
 576 tracees within a multithreaded process.
 577 (The term "signal-delivery-stop" is explained below.)
 578 .LP
 579 .B SIGKILL
 580 operates similarly, with exceptions.
 581 No signal-delivery-stop is generated for
 582 .B SIGKILL
 583 and therefore the tracer can't suppress it.
 584 .B SIGKILL
 585 kills even within system calls
 586 (syscall-exit-stop is not generated prior to death by
 587 .BR SIGKILL ).
 588 The net effect is that
 589 .B SIGKILL
 590 always kills the process (all its threads),
 591 even if some threads of the process are ptraced.
 592 .LP
 593 When the tracee calls
 594 .BR _exit (2),
 595 it reports its death to its tracer.
 596 Other threads are not affected.
 597 .LP
 598 When any thread executes
 599 .BR exit_group (2),
 600 every tracee in its thread group reports its death to its tracer.
 601 .LP
 602 If the
 603 .B PTRACE_O_TRACEEXIT
 604 option is on,
 605 .B PTRACE_EVENT_EXIT
 606 will happen before actual death.
 607 This applies to exits via
 608 .BR _exit (2),
 609 .BR exit_group (2),
 610 and signal deaths (except
 611 .BR SIGKILL ),
 612 and when threads are torn down on
 613 .BR execve (2)
 614 in a multithreaded process.
 615 .LP
 616 The tracer cannot assume that the ptrace-stopped tracee exists.
 617 There are many scenarios when the tracee may die while stopped (such as
 618 .BR SIGKILL ).
 619 Therefore, the tracer must be prepared to handle an
 620 .B ESRCH
 621 error on any ptrace operation.
 622 Unfortunately, the same error is returned if the tracee
 623 exists but is not ptrace-stopped
 624 (for commands which require a stopped tracee),
 625 or if it is not traced by the process which issued the ptrace call.
 626 The tracer needs to keep track of the stopped/running state of the tracee,
 627 and interpret
 628 .B ESRCH
 629 as "tracee died unexpectedly" only if it knows that the tracee has
 630 been observed to enter ptrace-stop.
 631 Note that there is no guarantee that
 632 .I waitpid(WNOHANG)
 633 will reliably report the tracee's death status if a
 634 ptrace operation returned
 635 .BR ESRCH .
 636 .I waitpid(WNOHANG)
 637 may return 0 instead.
 638 In other words, the tracee may be "not yet fully dead",
 639 but already refusing ptrace requests.
 640 .LP
 641 The tracer can't assume that the tracee
 642 .I always
 643 ends its life by reporting
 644 .I WIFEXITED(status)
 645 or
 646 .IR WIFSIGNALED(status) ;
 647 there are cases where this does not occur.
 648 For example, if a thread other than the thread group leader does an
 649 .BR execve (2),
 650 it disappears;
 651 its PID will never be seen again,
 652 and any subsequent ptrace stops will be reported under
 653 the thread group leader's PID.
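.LP
The bookkeeping rule for
.B ESRCH
described above can be sketched as a small decision function.
This is an illustration only: the enum and the function name
.I interpret_ptrace_errno
are hypothetical, and the
.I known_stopped
flag is assumed to be maintained by the tracer from its own
.BR waitpid (2)
results.
.nf
```c
#include <errno.h>
#include <stdbool.h>

/* Illustrative outcome codes; these names are not part of any API. */
enum tracee_fate {
    TRACEE_DIED,        /* ESRCH while observed to be ptrace-stopped */
    TRACEE_AMBIGUOUS,   /* ESRCH, but the tracee was not known stopped */
    TRACEE_OTHER_ERROR  /* any other errno value */
};

/*
 * Interpret the errno value left by a failed ptrace request.
 * known_stopped must be true only if the tracer has seen the tracee
 * enter ptrace-stop and has not restarted it since.
 */
enum tracee_fate interpret_ptrace_errno(int err, bool known_stopped)
{
    if (err != ESRCH)
        return TRACEE_OTHER_ERROR;
    /*
     * ESRCH is also returned when the tracee is running or is not
     * traced by the caller, so it means "died" only if the tracee
     * was known to be stopped.
     */
    return known_stopped ? TRACEE_DIED : TRACEE_AMBIGUOUS;
}
```
.fi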
 654 .SS Stopped states
 655 A tracee can be in two states: running or stopped.
 656 .LP
 657 There are many kinds of states when the tracee is stopped, and in ptrace
 658 discussions they are often conflated.
 659 Therefore, it is important to use precise terms.
 660 .LP
 661 In this manual page, any stopped state in which the tracee is ready
 662 to accept ptrace commands from the tracer is called
 663 .IR ptrace-stop .
 664 Ptrace-stops can
 665 be further subdivided into
 666 .IR signal-delivery-stop ,
 667 .IR group-stop ,
 668 .IR syscall-stop ,
 669 and so on.
 670 These stopped states are described in detail below.
 671 .LP
 672 When the running tracee enters ptrace-stop, it notifies its tracer using
 673 .BR waitpid (2)
 674 (or one of the other "wait" system calls).
 675 Most of this manual page assumes that the tracer waits with:
 676 .LP
 677     pid = waitpid(pid_or_minus_1, &status, __WALL);
 678 .LP
 679 Ptrace-stopped tracees are reported as returns with
 680 .I pid
 681 greater than 0 and
 682 .I WIFSTOPPED(status)
 683 true.
 684 .\" Denys Vlasenko:
 685 .\"     Do we require __WALL usage, or will just using 0 be ok? (With 0,
 686 .\"     I am not 100% sure there aren't ugly corner cases.) Are the
 687 .\"     rules different if user wants to use waitid? Will waitid require
 688 .\"     WEXITED?
 689 .\"
 690 .LP
 691 The
 692 .B __WALL
 693 flag does not include the
 694 .B WSTOPPED
 695 and
 696 .B WEXITED
 697 flags, but implies their functionality.
 698 .LP
 699 Setting the
 700 .B WCONTINUED
 701 flag when calling
 702 .BR waitpid (2)
 703 is not recommended: the "continued" state is per-process and
 704 consuming it can confuse the real parent of the tracee.
 705 .LP
 706 Use of the
 707 .B WNOHANG
 708 flag may cause
 709 .BR waitpid (2)
 710 to return 0 ("no wait results available yet")
 711 even if the tracer knows there should be a notification.
 712 Example:
 713 .nf
 714
 715     kill(tracee, SIGKILL);
 716     waitpid(tracee, &status, __WALL | WNOHANG);
 717 .fi
 718 .\" FIXME:
 719 .\"     waitid usage? WNOWAIT?
 720 .\"     describe how wait notifications queue (or not queue)
 721 .LP
 722 The following kinds of ptrace-stops exist: signal-delivery-stops,
 723 group-stops,
 724 .B PTRACE_EVENT
 725 stops, syscall-stops.
 726 They all are reported by
 727 .BR waitpid (2)
 728 with
 729 .I WIFSTOPPED(status)
 730 true.
 731 They may be differentiated by examining the value
 732 .IR status>>8 ,
 733 and if there is ambiguity in that value, by querying
 734 .BR PTRACE_GETSIGINFO .
 735 (Note: the
 736 .I WSTOPSIG(status)
 737 macro can't be used to perform this examination,
 738 because it returns the value
 739 .IR "(status\>>8)\ &\ 0xff" .)
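.LP
Examining
.I status>>8
as described above might look like the following sketch.
The enum and the function names
.I classify_stop
and
.I stop_event
are hypothetical.
The sketch assumes that
.B PTRACE_O_TRACESYSGOOD
is set; without it, a syscall-stop is reported as a plain
.B SIGTRAP
and falls into the last category.
.nf
```c
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/wait.h>

/* Illustrative categories; these names are not part of any API. */
enum stop_kind {
    NOT_STOPPED,            /* WIFSTOPPED(status) is false */
    EVENT_STOP,             /* PTRACE_EVENT stop */
    SYSCALL_STOP,           /* syscall-stop marked by TRACESYSGOOD */
    SIGNAL_OR_GROUP_STOP    /* needs PTRACE_GETSIGINFO to tell apart */
};

/* Event number stored above the signal byte, or 0 if there is none. */
int stop_event(int status)
{
    return WIFSTOPPED(status) ? (status >> 16) : 0;
}

enum stop_kind classify_stop(int status)
{
    int code;

    if (!WIFSTOPPED(status))
        return NOT_STOPPED;

    /* WSTOPSIG() would mask off the event bits, so shift by hand. */
    code = status >> 8;

    if ((code & 0xff) == SIGTRAP && (code >> 8) != 0)
        return EVENT_STOP;
    if (code == (SIGTRAP | 0x80))
        return SYSCALL_STOP;
    return SIGNAL_OR_GROUP_STOP;
}
```
.fi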
 740 .SS Signal-delivery-stop
 741 When a (possibly multithreaded) process receives any signal except
 742 .BR SIGKILL ,
 743 the kernel selects an arbitrary thread which handles the signal.
 744 (If the signal is generated with
 745 .BR tgkill (2),
 746 the target thread can be explicitly selected by the caller.)
 747 If the selected thread is traced, it enters signal-delivery-stop.
 748 At this point, the signal is not yet delivered to the process,
 749 and can be suppressed by the tracer.
 750 If the tracer doesn't suppress the signal,
 751 it passes the signal to the tracee in the next ptrace restart request.
 752 This second step of signal delivery is called
 753 .I "signal injection"
 754 in this manual page.
 755 Note that if the signal is blocked,
 756 signal-delivery-stop doesn't happen until the signal is unblocked,
 757 with the usual exception that
 758 .B SIGSTOP
 759 can't be blocked.
 760 .LP
 761 Signal-delivery-stop is observed by the tracer as
 762 .BR waitpid (2)
 763 returning with
 764 .I WIFSTOPPED(status)
 765 true, with the signal returned by
 766 .IR WSTOPSIG(status) .
 767 If the signal is
 768 .BR SIGTRAP ,
 769 this may be a different kind of ptrace-stop;
 770 see the "Syscall-stops" and "execve" sections below for details.
 771 If
 772 .I WSTOPSIG(status)
 773 returns a stopping signal, this may be a group-stop; see below.
 774 .SS Signal injection and suppression
 775 After signal-delivery-stop is observed by the tracer,
 776 the tracer should restart the tracee with the call
 777 .LP
 778     ptrace(PTRACE_restart, pid, 0, sig)
 779 .LP
 780 where
 781 .B PTRACE_restart
 782 is one of the restarting ptrace requests.
 783 If
 784 .I sig
 785 is 0, then a signal is not delivered.
 786 Otherwise, the signal
 787 .I sig
 788 is delivered.
 789 This operation is called
 790 .I "signal injection"
 791 in this manual page, to distinguish it from signal-delivery-stop.
 792 .LP
 793 The
 794 .I sig
 795 value may be different from the
 796 .I WSTOPSIG(status)
 797 value: the tracer can cause a different signal to be injected.
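.LP
Choosing between injection and suppression can be sketched as follows.
The function names
.I signal_to_inject
and
.I resume_tracee
are hypothetical, and the sketch assumes that the caller has already
established that the stop is a signal-delivery-stop.
.BR PTRACE_CONT
is used here, but any restarting request takes the same
.I data
argument.
.nf
```c
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>

/*
 * Pick the value for the data argument of a restarting request:
 * 0 suppresses the signal, otherwise the reported signal is injected.
 * (Illustrative helper; not part of any API.)
 */
int signal_to_inject(int status, int suppress)
{
    if (!WIFSTOPPED(status) || suppress)
        return 0;
    return WSTOPSIG(status);
}

/*
 * Restart a tracee that is in signal-delivery-stop.
 * Returns the result of ptrace(): 0 on success, -1 on error.
 */
long resume_tracee(pid_t pid, int status, int suppress)
{
    int sig = signal_to_inject(status, suppress);

    return ptrace(PTRACE_CONT, pid, NULL, (void *) (long) sig);
}
```
.fi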
 798 .LP
 799 Note that a suppressed signal still causes system calls to return
 800 prematurely.
 801 In this case system calls will be restarted: the tracer will
 802 observe the tracee to reexecute the interrupted system call (or
 803 .BR restart_syscall (2)
 804 system call for a few syscalls which use a different mechanism
 805 for restarting) if the tracer uses
 806 .BR PTRACE_SYSCALL .
 807 Even system calls (such as
 808 .BR poll (2))
 809 which are not restartable after a signal are restarted after
 810 the signal is suppressed;
 811 however, kernel bugs exist which cause some syscalls to fail with
 812 .B EINTR
 813 even though no observable signal is injected to the tracee.
 814 .LP
 815 Restarting ptrace commands issued in ptrace-stops other than
 816 signal-delivery-stop are not guaranteed to inject a signal, even if
 817 .I sig
 818 is nonzero.
 819 No error is reported; a nonzero
 820 .I sig
 821 may simply be ignored.
 822 Ptrace users should not try to "create a new signal" this way: use
 823 .BR tgkill (2)
 824 instead.
 825 .LP
 826 The fact that signal injection requests may be ignored
 827 when restarting the tracee after
 828 ptrace stops that are not signal-delivery-stops
 829 is a cause of confusion among ptrace users.
 830 One typical scenario is that the tracer observes group-stop,
 831 mistakes it for signal-delivery-stop, restarts the tracee with
 832
 833     ptrace(PTRACE_restart, pid, 0, stopsig)
 834
 835 with the intention of injecting
 836 .IR stopsig ,
 837 but
 838 .I stopsig
 839 gets ignored and the tracee continues to run.
 840 .LP
 841 The
 842 .B SIGCONT
 843 signal has a side effect of waking up (all threads of)
 844 a group-stopped process.
 845 This side effect happens before signal-delivery-stop.
 846 The tracer can't suppress this side effect (it can
 847 only suppress signal injection, which only causes the
 848 .BR SIGCONT
 849 handler to not be executed in the tracee, if such a handler is installed).
 850 In fact, waking up from group-stop may be followed by
 851 signal-delivery-stop for signal(s)
 852 .I other than
 853 .BR SIGCONT ,
 854 if they were pending when
 855 .B SIGCONT
 856 was delivered.
 857 In other words,
 858 .B SIGCONT
 859 may not be the first signal observed by the tracee after it was sent.
 860 .LP
 861 Stopping signals cause (all threads of) a process to enter group-stop.
 862 This side effect happens after signal injection, and therefore can be
 863 suppressed by the tracer.
 864 .LP
 865 In Linux 2.4 and earlier, the
 866 .B SIGSTOP
 867 signal can't be injected.
 868 .\" In the Linux 2.4 sources, in arch/i386/kernel/signal.c::do_signal(),
 869 .\" there is:
 870 .\"
 871 .\"             /* The debugger continued.  Ignore SIGSTOP.  */
 872 .\"             if (signr == SIGSTOP)
 873 .\"                     continue;
 874 .LP
 875 .B PTRACE_GETSIGINFO
 876 can be used to retrieve a
 877 .I siginfo_t
 878 structure which corresponds to the delivered signal.
 879 .B PTRACE_SETSIGINFO
 880 may be used to modify it.
 881 If
 882 .B PTRACE_SETSIGINFO
 883 has been used to alter
 884 .IR siginfo_t ,
 885 the
 886 .I si_signo
 887 field and the
 888 .I sig
 889 parameter in the restarting command must match,
 890 otherwise the result is undefined.
 891 .SS Group-stop
 892 When a (possibly multithreaded) process receives a stopping signal,
 893 all threads stop.
 894 If some threads are traced, they enter a group-stop.
 895 Note that the stopping signal will first cause signal-delivery-stop
 896 (on one tracee only), and only after it is injected by the tracer
 897 (or after it was dispatched to a thread which isn't traced),
 898 will group-stop be initiated on
 899 .I all
 900 tracees within the multithreaded process.
 901 As usual, every tracee reports its group-stop separately
 902 to the corresponding tracer.
 903 .LP
 904 Group-stop is observed by the tracer as
 905 .BR waitpid (2)
 906 returning with
 907 .I WIFSTOPPED(status)
 908 true, with the stopping signal available via
 909 .IR WSTOPSIG(status) .
 910 The same result is returned by some other classes of ptrace-stops,
 911 therefore the recommended practice is to perform the call
 912 .LP
 913     ptrace(PTRACE_GETSIGINFO, pid, 0, &siginfo)
 914 .LP
 915 The call can be avoided if the signal is not
 916 .BR SIGSTOP ,
 917 .BR SIGTSTP ,
 918 .BR SIGTTIN ,
 919 or
 920 .BR SIGTTOU ;
 921 only these four signals are stopping signals.
 922 If the tracer sees something else, it can't be a group-stop.
 923 Otherwise, the tracer needs to call
 924 .BR PTRACE_GETSIGINFO .
 925 If
 926 .B PTRACE_GETSIGINFO
 927 fails with
 928 .BR EINVAL ,
 929 then it is definitely a group-stop.
 930 (Other failure codes are possible, such as
 931 .B ESRCH
 932 ("no such process") if a
 933 .B SIGKILL
 934 killed the tracee.)
 935 .LP
 936 As of kernel 2.6.38,
 937 after the tracer sees the tracee ptrace-stop and until it
 938 restarts or kills it, the tracee will not run,
 939 and will not send notifications (except
 940 .B SIGKILL
 941 death) to the tracer, even if the tracer enters into another
 942 .BR waitpid (2)
 943 call.
 944 .LP
 945 The kernel behavior described in the previous paragraph
 946 causes a problem with transparent handling of stopping signals.
 947 If the tracer restarts the tracee after group-stop,
 948 the stopping signal
 949 is effectively ignored\(emthe tracee doesn't remain stopped, it runs.
 950 If the tracer doesn't restart the tracee before entering into the next
 951 .BR waitpid (2),
 952 future
 953 .B SIGCONT
 954 signals will not be reported to the tracer;
 955 this would cause the
 956 .B SIGCONT
 957 signals to have no effect on the tracee.
 958 .SS PTRACE_EVENT stops
 959 If the tracer sets
 960 .B PTRACE_O_TRACE_*
 961 options, the tracee will enter ptrace-stops called
 962 .B PTRACE_EVENT
 963 stops.
 964 .LP
 965 .B PTRACE_EVENT
 966 stops are observed by the tracer as
 967 .BR waitpid (2)
 968 returning with
 969 .IR WIFSTOPPED(status) ,
 970 and
 971 .I WSTOPSIG(status)
 972 returns
 973 .BR SIGTRAP .
 974 An additional bit is set in the higher byte of the status word:
 975 the value
 976 .I status>>8
 977 will be
 978
    (SIGTRAP | (PTRACE_EVENT_foo << 8)).
 980
 981 The following events exist:
 982 .TP
 983 .B PTRACE_EVENT_VFORK
 984 Stop before return from
 985 .BR vfork (2)
 986 or
 987 .BR clone (2)
 988 with the
 989 .B CLONE_VFORK
 990 flag.
When the tracee is continued after this stop, it will wait for the child to
exit or exec before continuing its execution
 993 (in other words, the usual behavior on
 994 .BR vfork (2)).
 995 .TP
 996 .B PTRACE_EVENT_FORK
 997 Stop before return from
 998 .BR fork (2)
 999 or
1000 .BR clone (2)
1001 with the exit signal set to
1002 .BR SIGCHLD .
1003 .TP
1004 .B PTRACE_EVENT_CLONE
1005 Stop before return from
1006 .BR clone (2).
1007 .TP
1008 .B PTRACE_EVENT_VFORK_DONE
1009 Stop before return from
1010 .BR vfork (2)
1011 or
1012 .BR clone (2)
1013 with the
1014 .B CLONE_VFORK
1015 flag,
1016 but after the child unblocked this tracee by exiting or execing.
1017 .LP
1018 For all four stops described above,
1019 the stop occurs in the parent (i.e., the tracee),
1020 not in the newly created thread.
1021 .BR PTRACE_GETEVENTMSG
1022 can be used to retrieve the new thread's ID.
1023 .TP
1024 .B PTRACE_EVENT_EXEC
1025 Stop before return from
1026 .BR execve (2).
1027 Since Linux 3.0,
1028 .BR PTRACE_GETEVENTMSG
1029 returns the former thread ID.
1030 .TP
1031 .B PTRACE_EVENT_EXIT
1032 Stop before exit (including death from
1033 .BR exit_group (2)),
1034 signal death, or exit caused by
1035 .BR execve (2)
1036 in a multithreaded process.
1037 .B PTRACE_GETEVENTMSG
1038 returns the exit status.
1039 Registers can be examined
1040 (unlike when "real" exit happens).
1041 The tracee is still alive; it needs to be
1042 .BR PTRACE_CONT ed
1043 or
1044 .BR PTRACE_DETACH ed
1045 to finish exiting.
1046 .LP
1047 .B PTRACE_GETSIGINFO
1048 on
1049 .B PTRACE_EVENT
1050 stops returns
1051 .B SIGTRAP
1052 in
1053 .IR si_signo ,
1054 with
1055 .I si_code
1056 set to
1057 .IR "(event<<8)\ |\ SIGTRAP" .
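.LP
As a minimal sketch of the status layout described in this section,
the event code can be extracted from the value returned by
.BR waitpid (2)
as shown below.
The helper name
.I ptrace_event_of
is hypothetical, and returning 0 for "not an event stop" is a convention
of this sketch only.

```c
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/wait.h>

/*
 * Hypothetical helper: returns the PTRACE_EVENT_* code carried by a
 * waitpid(2) status, or 0 if the status is not a PTRACE_EVENT stop.
 */
static int ptrace_event_of(int status)
{
    if (!WIFSTOPPED(status) || WSTOPSIG(status) != SIGTRAP)
        return 0;
    /* status >> 8 is (SIGTRAP | (event << 8)), so the event is 16 bits up. */
    return (int)((unsigned int)status >> 16);
}
```
.LP
A tracer would typically compare the result with constants such as
.B PTRACE_EVENT_FORK
or
.B PTRACE_EVENT_EXEC
and then call
.B PTRACE_GETEVENTMSG
where the event is documented as providing a message.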
1058 .SS Syscall-stops
1059 If the tracee was restarted by
1060 .BR PTRACE_SYSCALL ,
1061 the tracee enters
1062 syscall-enter-stop just prior to entering any system call.
1063 If the tracer restarts the tracee with
1064 .BR PTRACE_SYSCALL ,
1065 the tracee enters syscall-exit-stop when the system call is finished,
1066 or if it is interrupted by a signal.
1067 (That is, signal-delivery-stop never happens between syscall-enter-stop
1068 and syscall-exit-stop; it happens
1069 .I after
1070 syscall-exit-stop.)
1071 .LP
1072 Other possibilities are that the tracee may stop in a
1073 .B PTRACE_EVENT
1074 stop, exit (if it entered
1075 .BR _exit (2)
1076 or
1077 .BR exit_group (2)),
1078 be killed by
1079 .BR SIGKILL ,
1080 or die silently (if it is a thread group leader, the
1081 .BR execve (2)
1082 happened in another thread,
1083 and that thread is not traced by the same tracer;
1084 this situation is discussed later).
1085 .LP
1086 Syscall-enter-stop and syscall-exit-stop are observed by the tracer as
1087 .BR waitpid (2)
1088 returning with
1089 .I WIFSTOPPED(status)
1090 true, and
1091 .I WSTOPSIG(status)
1092 giving
1093 .BR SIGTRAP .
1094 If the
1095 .B PTRACE_O_TRACESYSGOOD
1096 option was set by the tracer, then
1097 .I WSTOPSIG(status)
1098 will give the value
1099 .IR "(SIGTRAP\ |\ 0x80)" .
1100 .LP
1101 Syscall-stops can be distinguished from signal-delivery-stop with
1102 .B SIGTRAP
1103 by querying
1104 .BR PTRACE_GETSIGINFO
1105 for the following cases:
1106 .TP
1107 .IR si_code " <= 0"
1108 .B SIGTRAP
1109 was delivered as a result of a userspace action,
1110 for example, a system call
1111 .RB ( tgkill (2),
1112 .BR kill (2),
1113 .BR sigqueue (3),
1114 etc.),
1115 expiration of a POSIX timer,
1116 change of state on a POSIX message queue,
1117 or completion of an asynchronous I/O request.
1118 .TP
1119 .IR si_code " == SI_KERNEL (0x80)"
1120 .B SIGTRAP
1121 was sent by the kernel.
1122 .TP
1123 .IR si_code " == SIGTRAP or " si_code " == (SIGTRAP|0x80)"
1124 This is a syscall-stop.
1125 .LP
1126 However, syscall-stops happen very often (twice per system call),
1127 and performing
1128 .B PTRACE_GETSIGINFO
1129 for every syscall-stop may be somewhat expensive.
1130 .LP
1131 Some architectures allow the cases to be distinguished
1132 by examining registers.
1133 For example, on x86,
1134 .I rax
1135 ==
1136 .RB - ENOSYS
1137 in syscall-enter-stop.
1138 Since
1139 .B SIGTRAP
1140 (like any other signal) always happens
1141 .I after
1142 syscall-exit-stop,
1143 and at this point
1144 .I rax
1145 almost never contains
1146 .RB - ENOSYS ,
1147 the
1148 .B SIGTRAP
1149 looks like "syscall-stop which is not syscall-enter-stop";
1150 in other words, it looks like a
1151 "stray syscall-exit-stop" and can be detected this way.
1152 But such detection is fragile and is best avoided.
1153 .LP
1154 Using the
1155 .B PTRACE_O_TRACESYSGOOD
1156 option is the recommended method to distinguish syscall-stops
1157 from other kinds of ptrace-stops,
1158 since it is reliable and does not incur a performance penalty.
1159 .LP
1160 Syscall-enter-stop and syscall-exit-stop are
1161 indistinguishable from each other by the tracer.
1162 The tracer needs to keep track of the sequence of
1163 ptrace-stops in order to not misinterpret syscall-enter-stop as
1164 syscall-exit-stop or vice versa.
1165 The rule is that syscall-enter-stop is
1166 always followed by syscall-exit-stop,
1167 .B PTRACE_EVENT
1168 stop or the tracee's death;
1169 no other kinds of ptrace-stop can occur in between.
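.LP
The bookkeeping described above might look like the following sketch.
It assumes that
.B PTRACE_O_TRACESYSGOOD
is set and that the tracer always restarts the tracee with
.BR PTRACE_SYSCALL ;
the structure and function names are hypothetical.

```c
#include <signal.h>
#include <sys/wait.h>

/* True if status is a syscall-stop, assuming PTRACE_O_TRACESYSGOOD is set. */
static int is_syscall_stop(int status)
{
    return WIFSTOPPED(status) && WSTOPSIG(status) == (SIGTRAP | 0x80);
}

/* Hypothetical per-tracee state: nonzero between enter-stop and exit-stop. */
struct tracee_state {
    int in_syscall;
};

/*
 * Feed every waitpid(2) status for one tracee through this function.
 * Returns 1 for syscall-enter-stop, 2 for syscall-exit-stop, and 0 for
 * any other status. Other stops (such as PTRACE_EVENT stops, which may
 * occur between enter and exit) leave the state unchanged.
 */
static int note_stop(struct tracee_state *t, int status)
{
    if (!is_syscall_stop(status))
        return 0;
    t->in_syscall = !t->in_syscall;
    return t->in_syscall ? 1 : 2;
}
```
.LP
If the tracer restarts the tracee after syscall-enter-stop with a command
other than
.BR PTRACE_SYSCALL ,
it would have to reset
.I in_syscall
itself, since no syscall-exit-stop will follow.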
1170 .LP
1171 If after syscall-enter-stop,
1172 the tracer uses a restarting command other than
1173 .BR PTRACE_SYSCALL ,
1174 syscall-exit-stop is not generated.
1175 .LP
1176 .B PTRACE_GETSIGINFO
1177 on syscall-stops returns
1178 .B SIGTRAP
1179 in
1180 .IR si_signo ,
1181 with
1182 .I si_code
1183 set to
1184 .B SIGTRAP
1185 or
1186 .IR (SIGTRAP|0x80) .
1187 .SS PTRACE_SINGLESTEP, PTRACE_SYSEMU, PTRACE_SYSEMU_SINGLESTEP stops
1188 [Details of these kinds of stops are yet to be documented.]
1189 .\"
1190 .\" FIXME
1191 .\" document stops occurring with PTRACE_SINGLESTEP, PTRACE_SYSEMU,
1192 .\" PTRACE_SYSEMU_SINGLESTEP
1193 .SS Informational and restarting ptrace commands
1194 Most ptrace commands (all except
1195 .BR PTRACE_ATTACH ,
1196 .BR PTRACE_TRACEME ,
1197 and
1198 .BR PTRACE_KILL )
1199 require the tracee to be in a ptrace-stop, otherwise they fail with
1200 .BR ESRCH .
1201 .LP
1202 When the tracee is in ptrace-stop,
1203 the tracer can read and write data to
1204 the tracee using informational commands.
1205 These commands leave the tracee in ptrace-stopped state:
1206 .LP
1207 .nf
1208     ptrace(PTRACE_PEEKTEXT/PEEKDATA/PEEKUSER, pid, addr, 0);
1209     ptrace(PTRACE_POKETEXT/POKEDATA/POKEUSER, pid, addr, long_val);
1210     ptrace(PTRACE_GETREGS/GETFPREGS, pid, 0, &struct);
1211     ptrace(PTRACE_SETREGS/SETFPREGS, pid, 0, &struct);
1212     ptrace(PTRACE_GETSIGINFO, pid, 0, &siginfo);
1213     ptrace(PTRACE_SETSIGINFO, pid, 0, &siginfo);
1214     ptrace(PTRACE_GETEVENTMSG, pid, 0, &long_var);
1215     ptrace(PTRACE_SETOPTIONS, pid, 0, PTRACE_O_flags);
1216 .fi
1217 .LP
1218 Note that some errors are not reported.
1219 For example, setting signal information
1220 .RI ( siginfo )
1221 may have no effect in some ptrace-stops, yet the call may succeed
1222 (return 0 and not set
1223 .IR errno );
1224 querying
1225 .B PTRACE_GETEVENTMSG
may succeed and return some random value if the current ptrace-stop
1227 is not documented as returning a meaningful event message.
1228 .LP
1229 The call
1230
1231     ptrace(PTRACE_SETOPTIONS, pid, 0, PTRACE_O_flags);
1232
1233 affects one tracee.
1234 The tracee's current flags are replaced.
1235 Flags are inherited by new tracees created and "auto-attached" via active
1236 .BR PTRACE_O_TRACEFORK ,
1237 .BR PTRACE_O_TRACEVFORK ,
1238 or
1239 .BR PTRACE_O_TRACECLONE
1240 options.
1241 .LP
1242 Another group of commands makes the ptrace-stopped tracee run.
1243 They have the form:
1244 .LP
1245     ptrace(cmd, pid, 0, sig);
1246 .LP
1247 where
1248 .I cmd
1249 is
1250 .BR PTRACE_CONT ,
1251 .BR PTRACE_DETACH ,
1252 .BR PTRACE_SYSCALL ,
1253 .BR PTRACE_SINGLESTEP ,
1254 .BR PTRACE_SYSEMU ,
1255 or
1256 .BR PTRACE_SYSEMU_SINGLESTEP .
1257 If the tracee is in signal-delivery-stop,
1258 .I sig
1259 is the signal to be injected (if it is nonzero).
1260 Otherwise,
1261 .I sig
1262 may be ignored.
1263 (When restarting a tracee from a ptrace-stop other than signal-delivery-stop,
the recommended practice is to always pass 0 in
1265 .IR sig .)
1266 .SS Attaching and detaching
1267 A thread can be attached to the tracer using the call
1268
1269     ptrace(PTRACE_ATTACH, pid, 0, 0);
1270
1271 This also sends
1272 .B SIGSTOP
1273 to this thread.
1274 If the tracer wants this
1275 .B SIGSTOP
1276 to have no effect, it needs to suppress it.
1277 Note that if other signals are concurrently sent to
1278 this thread during attach,
1279 the tracer may see the tracee enter signal-delivery-stop
1280 with other signal(s) first!
1281 The usual practice is to reinject these signals until
1282 .B SIGSTOP
1283 is seen, then suppress
1284 .B SIGSTOP
1285 injection.
1286 The design bug here is that a ptrace attach and a concurrently delivered
1287 .B SIGSTOP
1288 may race and the concurrent
1289 .B SIGSTOP
1290 may be lost.
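.LP
The usual practice described above can be sketched as follows.
The helper name
.I attach_and_stop
is hypothetical, the use of
.B __WALL
is an assumption made so that threads are waited for as well, and the
sketch does nothing about the race with a concurrently delivered
.BR SIGSTOP .

```c
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>

/*
 * Hypothetical helper: attaches to tid and waits until the SIGSTOP sent
 * by PTRACE_ATTACH is seen. Signals that arrive first are reinjected.
 * Returns 0 with the tracee left in signal-delivery-stop for SIGSTOP,
 * or -1 on error or if the tracee died.
 */
static int attach_and_stop(pid_t tid)
{
    int status;

    if (ptrace(PTRACE_ATTACH, tid, NULL, NULL) == -1)
        return -1;
    for (;;) {
        if (waitpid(tid, &status, __WALL) == -1)
            return -1;
        if (!WIFSTOPPED(status))
            return -1;                  /* exited or was killed */
        if (WSTOPSIG(status) == SIGSTOP)
            return 0;
        /* Some other signal arrived first: pass it through unchanged. */
        if (ptrace(PTRACE_CONT, tid, NULL,
                   (void *)(long)WSTOPSIG(status)) == -1)
            return -1;
    }
}
```
.LP
After a successful return, restarting the tracee with 0 as the signal
argument suppresses the
.BR SIGSTOP .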
1291 .\"
1292 .\" FIXME: Describe how to attach to a thread which is already
1293 .\"        group-stopped.
1294 .LP
1295 Since attaching sends
1296 .B SIGSTOP
1297 and the tracer usually suppresses it, this may cause a stray
1298 .B EINTR
1299 return from the currently executing system call in the tracee,
1300 as described in the "Signal injection and suppression" section.
1301 .LP
1302 The request
1303
1304     ptrace(PTRACE_TRACEME, 0, 0, 0);
1305
1306 turns the calling thread into a tracee.
1307 The thread continues to run (doesn't enter ptrace-stop).
1308 A common practice is to follow the
1309 .B PTRACE_TRACEME
1310 with
1311
1312     raise(SIGSTOP);
1313
1314 and allow the parent (which is our tracer now) to observe our
1315 signal-delivery-stop.
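.LP
A minimal sketch of this common practice is shown below.
The helper name
.I fork_traced
and its callback argument are hypothetical; a real tracer would more
likely call
.BR execve (2)
in the child.

```c
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: forks a child that becomes a tracee of the caller,
 * stops itself, and then runs child_fn (which may be NULL).
 * Returns the child's PID once its first stop has been observed, or -1.
 */
static pid_t fork_traced(void (*child_fn)(void))
{
    int status;
    pid_t pid = fork();

    if (pid == -1)
        return -1;
    if (pid == 0) {
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1)
            _exit(127);
        raise(SIGSTOP);             /* let the tracer observe the stop */
        if (child_fn != NULL)
            child_fn();
        _exit(0);
    }
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    if (!WIFSTOPPED(status) || WSTOPSIG(status) != SIGSTOP)
        return -1;                  /* child died before stopping */
    return pid;
}
```
.LP
When the function returns, the child is in signal-delivery-stop for
.BR SIGSTOP ,
so the tracer can set options before restarting it with 0 as the signal
argument.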
1316 .LP
1317 If the
.BR PTRACE_O_TRACEVFORK ,
.BR PTRACE_O_TRACEFORK ,
1320 or
1321 .BR PTRACE_O_TRACECLONE
1322 options are in effect, then children created by, respectively,
1323 .BR vfork (2)
1324 or
1325 .BR clone (2)
1326 with the
1327 .B CLONE_VFORK
1328 flag,
1329 .BR fork (2)
1330 or
1331 .BR clone (2)
1332 with the exit signal set to
1333 .BR SIGCHLD ,
1334 and other kinds of
1335 .BR clone (2),
1336 are automatically attached to the same tracer which traced their parent.
1337 .B SIGSTOP
1338 is delivered to the children, causing them to enter
1339 signal-delivery-stop after they exit the system call which created them.
1340 .LP
1341 Detaching of the tracee is performed by:
1342
1343     ptrace(PTRACE_DETACH, pid, 0, sig);
1344
1345 .B PTRACE_DETACH
1346 is a restarting operation;
1347 therefore it requires the tracee to be in ptrace-stop.
1348 If the tracee is in signal-delivery-stop, a signal can be injected.
1349 Otherwise, the
1350 .I sig
1351 parameter may be silently ignored.
1352 .LP
1353 If the tracee is running when the tracer wants to detach it,
1354 the usual solution is to send
1355 .B SIGSTOP
1356 (using
1357 .BR tgkill (2),
1358 to make sure it goes to the correct thread),
1359 wait for the tracee to stop in signal-delivery-stop for
1360 .B SIGSTOP
1361 and then detach it (suppressing
1362 .B SIGSTOP
1363 injection).
1364 A design bug is that this can race with concurrent
1365 .BR SIGSTOP s.
1366 Another complication is that the tracee may enter other ptrace-stops
1367 and needs to be restarted and waited for again, until
1368 .B SIGSTOP
1369 is seen.
1370 Yet another complication is to be sure that
1371 the tracee is not already ptrace-stopped,
1372 because no signal delivery happens while it is\(emnot even
1373 .BR SIGSTOP .
1374 .\" FIXME: Describe how to detach from a group-stopped tracee so that it
1375 .\"        doesn't run, but continues to wait for SIGCONT.
1376 .LP
1377 If the tracer dies, all tracees are automatically detached and restarted,
1378 unless they were in group-stop.
1379 Handling of restart from group-stop is currently buggy,
but the "as planned" behavior is to leave the tracee stopped and waiting for
1381 .BR SIGCONT .
1382 If the tracee is restarted from signal-delivery-stop,
1383 the pending signal is injected.
1384 .SS execve(2) under ptrace
.\" clone(2) CLONE_THREAD says:
1386 .\"     If  any  of the threads in a thread group performs an execve(2),
1387 .\"     then all threads other than the thread group leader are terminated,
1388 .\"     and the new program is executed in the thread group leader.
1389 .\"
1390 When one thread in a multithreaded process calls
1391 .BR execve (2),
1392 the kernel destroys all other threads in the process,
1393 .\" In kernel 3.1 sources, see fs/exec.c::de_thread()
1394 and resets the thread ID of the execing thread to the
1395 thread group ID (process ID).
1396 (Or, to put things another way, when a multithreaded process does an
1397 .BR execve (2),
1398 at completion of the call, it appears as though the
1399 .BR execve (2)
1400 occurred in the thread group leader, regardless of which thread did the
1401 .BR execve (2).)
1402 This resetting of the thread ID looks very confusing to tracers:
1403 .IP * 3
1404 All other threads stop in
1405 .B PTRACE_EVENT_EXIT
1406 stop, if the
1407 .BR PTRACE_O_TRACEEXIT
1408 option was turned on.
1409 Then all other threads except the thread group leader report
1410 death as if they exited via
1411 .BR _exit (2)
1412 with exit code 0.
1413 .IP *
1414 The execing tracee changes its thread ID while it is in the
1415 .BR execve (2).
1416 (Remember, under ptrace, the "pid" returned from
1417 .BR waitpid (2),
1418 or fed into ptrace calls, is the tracee's thread ID.)
1419 That is, the tracee's thread ID is reset to be the same as its process ID,
1420 which is the same as the thread group leader's thread ID.
1421 .IP *
1422 Then a
1423 .B PTRACE_EVENT_EXEC
1424 stop happens, if the
1425 .BR PTRACE_O_TRACEEXEC
1426 option was turned on.
1427 .IP *
1428 If the thread group leader has reported its
1429 .B PTRACE_EVENT_EXIT
1430 stop by this time,
1431 it appears to the tracer that
1432 the dead thread leader "reappears from nowhere".
1433 (Note: the thread group leader does not report death via
1434 .I WIFEXITED(status)
as long as there is at least one other live thread.
1436 This eliminates the possibility that the tracer will see
1437 it dying and then reappearing.)
1438 If the thread group leader was still alive,
for the tracer this may look as if the thread group leader
1440 returns from a different system call than it entered,
1441 or even "returned from a system call even though
1442 it was not in any system call".
1443 If the thread group leader was not traced
1444 (or was traced by a different tracer), then during
1445 .BR execve (2)
1446 it will appear as if it has become a tracee of
1447 the tracer of the execing tracee.
1448 .LP
All of the above effects are artifacts of
1450 the thread ID change in the tracee.
1451 .LP
1452 The
1453 .B PTRACE_O_TRACEEXEC
1454 option is the recommended tool for dealing with this situation.
1455 First, it enables
1456 .BR PTRACE_EVENT_EXEC
1457 stop,
1458 which occurs before
1459 .BR execve (2)
1460 returns.
1461 In this stop, the tracer can use
1462 .B PTRACE_GETEVENTMSG
1463 to retrieve the tracee's former thread ID.
(This feature was introduced in Linux 3.0.)
1465 Second, the
1466 .B PTRACE_O_TRACEEXEC
1467 option disables legacy
1468 .B SIGTRAP
1469 generation on
1470 .BR execve (2).
1471 .LP
1472 When the tracer receives
1473 .B PTRACE_EVENT_EXEC
1474 stop notification,
it is guaranteed that, except for this tracee and the thread group leader,
1476 no other threads from the process are alive.
1477 .LP
1478 On receiving the
1479 .B PTRACE_EVENT_EXEC
1480 stop notification,
1481 the tracer should clean up all its internal
1482 data structures describing the threads of this process,
1483 and retain only one data structure\(emone which
1484 describes the single still running tracee, with
1485
1486     thread ID == thread group ID == process ID.
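.LP
The cleanup step can be sketched on a simple array of per-thread records.
Everything here is hypothetical and belongs to the tracer's own
bookkeeping, not to the ptrace API: the
.I thread_entry
structure, the function name, and the choice to put the surviving entry
at the end of the array.

```c
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical tracer bookkeeping: one entry per known tracee thread. */
struct thread_entry {
    pid_t tid;      /* thread ID */
    pid_t tgid;     /* thread group ID (process ID) */
};

/*
 * Called on PTRACE_EVENT_EXEC for process tgid. Drops every entry that
 * belongs to the process and adds a single entry with tid == tgid.
 * Entries of other processes are kept. Returns the new entry count.
 */
static size_t collapse_on_exec(struct thread_entry *tab, size_t n, pid_t tgid)
{
    size_t i, kept = 0;

    for (i = 0; i < n; i++)
        if (tab[i].tgid != tgid)
            tab[kept++] = tab[i];
    if (kept == n)
        return n;           /* no entry for this process: nothing to do */
    tab[kept].tid = tgid;   /* the single still running tracee */
    tab[kept].tgid = tgid;
    return kept + 1;
}
```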
1487 .LP
1488 Example: two threads call
1489 .BR execve (2)
1490 at the same time:
1491 .LP
1492 .nf
1493 *** we get syscall-enter-stop in thread 1: **
1494 PID1 execve("/bin/foo", "foo" <unfinished ...>
1495 *** we issue PTRACE_SYSCALL for thread 1 **
1496 *** we get syscall-enter-stop in thread 2: **
1497 PID2 execve("/bin/bar", "bar" <unfinished ...>
1498 *** we issue PTRACE_SYSCALL for thread 2 **
1499 *** we get PTRACE_EVENT_EXEC for PID0, we issue PTRACE_SYSCALL **
1500 *** we get syscall-exit-stop for PID0: **
1501 PID0 <... execve resumed> )             = 0
1502 .fi
1503 .LP
1504 If the
1505 .B PTRACE_O_TRACEEXEC
1506 option is
1507 .I not
1508 in effect for the execing tracee, the kernel delivers an extra
1509 .B SIGTRAP
1510 to the tracee after
1511 .BR execve (2)
1512 returns.
1513 This is an ordinary signal (similar to one which can be
1514 generated by
1515 .IR "kill -TRAP" ),
1516 not a special kind of ptrace-stop.
1517 Employing
1518 .B PTRACE_GETSIGINFO
1519 for this signal returns
1520 .I si_code
1521 set to 0
1522 .RI ( SI_USER ).
This signal may be blocked by the signal mask,
1524 and thus may be delivered (much) later.
1525 .LP
1526 Usually, the tracer (for example,
1527 .BR strace (1))
1528 would not want to show this extra post-execve
1529 .B SIGTRAP
1530 signal to the user, and would suppress its delivery to the tracee (if
1531 .B SIGTRAP
1532 is set to
1533 .BR SIG_DFL ,
1534 it is a killing signal).
1535 However, determining
1536 .I which
1537 .B SIGTRAP
1538 to suppress is not easy.
1539 Setting the
1540 .B PTRACE_O_TRACEEXEC
1541 option and thus suppressing this extra
1542 .B SIGTRAP
1543 is the recommended approach.
1544 .SS Real parent
1545 The ptrace API (ab)uses the standard UNIX parent/child signaling over
1546 .BR waitpid (2).
1547 This used to cause the real parent of the process to stop receiving
1548 several kinds of
1549 .BR waitpid (2)
1550 notifications when the child process is traced by some other process.
1551 .LP
1552 Many of these bugs have been fixed, but as of Linux 2.6.38 several still
1553 exist; see BUGS below.
1554 .LP
1555 As of Linux 2.6.38, the following is believed to work correctly:
1556 .IP * 3
1557 exit/death by signal is reported first to the tracer, then,
1558 when the tracer consumes the
1559 .BR waitpid (2)
1560 result, to the real parent (to the real parent only when the
1561 whole multithreaded process exits).
1562 If the tracer and the real parent are the same process,
1563 the report is sent only once.
1564 .SH "RETURN VALUE"
1565 On success,
1566 .B PTRACE_PEEK*
1567 requests return the requested data,
1568 while other requests return zero.
1569 On error, all requests return \-1, and
1570 .I errno
1571 is set appropriately.
1572 Since the value returned by a successful
1573 .B PTRACE_PEEK*
1574 request may be \-1, the caller must clear
1575 .I errno
1576 before the call, and then check it afterward
1577 to determine whether or not an error occurred.
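.LP
The
.I errno
handling for
.B PTRACE_PEEK*
requests can be wrapped as in the following sketch.
The helper name
.I peek_word
and its calling convention are hypothetical.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>

/*
 * Hypothetical helper: reads one word at addr in the tracee.
 * Returns 0 and stores the word in *out, or returns -1 with errno set.
 */
static int peek_word(pid_t pid, void *addr, long *out)
{
    long word;

    errno = 0;                      /* -1 may be valid data */
    word = ptrace(PTRACE_PEEKDATA, pid, addr, NULL);
    if (word == -1 && errno != 0)
        return -1;
    *out = word;
    return 0;
}
```
.LP
Returning the data through a pointer keeps the error indication separate
from the value, so callers do not have to inspect
.I errno
themselves to tell the two apart.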
1578 .SH ERRORS
1579 .TP
1580 .B EBUSY
1581 (i386 only) There was an error with allocating or freeing a debug register.
1582 .TP
1583 .B EFAULT
1584 There was an attempt to read from or write to an invalid area in
1585 the tracer's or the tracee's memory,
1586 probably because the area wasn't mapped or accessible.
1587 Unfortunately, under Linux, different variations of this fault
1588 will return
1589 .B EIO
1590 or
1591 .B EFAULT
1592 more or less arbitrarily.
1593 .TP
1594 .B EINVAL
1595 An attempt was made to set an invalid option.
1596 .TP
1597 .B EIO
1598 .I request
1599 is invalid, or an attempt was made to read from or
1600 write to an invalid area in the tracer's or the tracee's memory,
1601 or there was a word-alignment violation,
1602 or an invalid signal was specified during a restart request.
1603 .TP
1604 .B EPERM
1605 The specified process cannot be traced.
1606 This could be because the
1607 tracer has insufficient privileges (the required capability is
1608 .BR CAP_SYS_PTRACE );
1609 unprivileged processes cannot trace processes that they
1610 cannot send signals to or those running
1611 set-user-ID/set-group-ID programs, for obvious reasons.
1612 Alternatively, the process may already be being traced,
1613 or (on kernels before 2.6.26) be
1614 .BR init (8)
1615 (PID 1).
1616 .TP
1617 .B ESRCH
1618 The specified process does not exist, or is not currently being traced
1619 by the caller, or is not stopped
1620 (for requests that require a stopped tracee).
1621 .SH "CONFORMING TO"
1622 SVr4, 4.3BSD.
1623 .SH NOTES
1624 Although arguments to
1625 .BR ptrace ()
1626 are interpreted according to the prototype given,
1627 glibc currently declares
1628 .BR ptrace ()
1629 as a variadic function with only the
1630 .I request
1631 argument fixed.
1632 This means that unneeded trailing arguments may be omitted,
1633 though doing so makes use of undocumented
1634 .BR gcc (1)
1635 behavior.
1636 .LP
1637 In Linux kernels before 2.6.26,
1638 .\" See commit 00cd5c37afd5f431ac186dd131705048c0a11fdb
1639 .BR init (8),
1640 the process with PID 1, may not be traced.
1641 .LP
The layout of the contents of memory and the USER area is
1643 quite operating-system- and architecture-specific.
1644 The offset supplied, and the data returned,
1645 might not entirely match with the definition of
1646 .IR "struct user" .
1647 .\" See http://lkml.org/lkml/2008/5/8/375
1648 .LP
1649 The size of a "word" is determined by the operating-system variant
1650 (e.g., for 32-bit Linux it is 32 bits, etc.).
1651 .LP
1652 This page documents the way the
1653 .BR ptrace ()
1654 call works currently in Linux.
1655 Its behavior differs noticeably on other flavors of UNIX.
1656 In any case, use of
1657 .BR ptrace ()
1658 is highly specific to the operating system and architecture.
1659 .SH BUGS
1660 On hosts with 2.6 kernel headers,
1661 .B PTRACE_SETOPTIONS
1662 is declared with a different value than the one for 2.4.
1663 This leads to applications compiled with 2.6 kernel
1664 headers failing when run on 2.4 kernels.
1665 This can be worked around by redefining
1666 .B PTRACE_SETOPTIONS
1667 to
1668 .BR PTRACE_OLDSETOPTIONS ,
1669 if that is defined.
1670 .LP
Group-stop notifications are sent to the tracer, but not to the real parent.
1672 Last confirmed on 2.6.38.6.
1673 .LP
1674 If a thread group leader is traced and exits by calling
1675 .BR _exit (2),
1676 .\" Note from Denys Vlasenko:
1677 .\"     Here "exits" means any kind of death - _exit, exit_group,
1678 .\"     signal death. Signal death and exit_group cases are trivial,
1679 .\"     though: since signal death and exit_group kill all other threads
1680 .\"     too, "until all other threads exit" thing happens rather soon
1681 .\"     in these cases. Therefore, only _exit presents observably
1682 .\"     puzzling behavior to ptrace users: thread leader _exit's,
1683 .\"     but WIFEXITED isn't reported! We are trying to explain here
1684 .\"     why it is so.
1685 a
1686 .B PTRACE_EVENT_EXIT
1687 stop will happen for it (if requested), but the subsequent
1688 .B WIFEXITED
1689 notification will not be delivered until all other threads exit.
As explained above, if one of the other threads calls
1691 .BR execve (2),
1692 the death of the thread group leader will
1693 .I never
1694 be reported.
1695 If the execed thread is not traced by this tracer,
1696 the tracer will never know that
1697 .BR execve (2)
1698 happened.
1699 One possible workaround is to
1700 .B PTRACE_DETACH
1701 the thread group leader instead of restarting it in this case.
1702 Last confirmed on 2.6.38.6.
1703 .\"  FIXME: ^^^ need to test/verify this scenario
1704 .LP
1705 A
1706 .B SIGKILL
1707 signal may still cause a
1708 .B PTRACE_EVENT_EXIT
1709 stop before actual signal death.
1710 This may be changed in the future;
1711 .B SIGKILL
1712 is meant to always immediately kill tasks even under ptrace.
1713 Last confirmed on 2.6.38.6.
1714 .LP
1715 Some system calls return with
1716 .B EINTR
1717 if a signal was sent to a tracee, but delivery was suppressed by the tracer.
(This is a very typical operation: it is usually
1719 done by debuggers on every attach, in order to not introduce
1720 a bogus
1721 .BR SIGSTOP ).
1722 As of Linux 3.2.9, the following system calls are affected
1723 (this list is likely incomplete):
.BR epoll_wait (2)
1725 and
1726 .BR read (2)
1727 from an
1728 .BR inotify (7)
1729 file descriptor.
1730 .SH "SEE ALSO"
1731 .BR gdb (1),
1732 .BR strace (1),
1733 .BR clone (2),
1734 .BR execve (2),
1735 .BR fork (2),
1736 .BR gettid (2),
1737 .BR sigaction (2),
1738 .BR tgkill (2),
1739 .BR vfork (2),
1740 .BR waitpid (2),
1741 .BR exec (3),
1742 .BR capabilities (7),
1743 .BR signal (7)