------------------------------------------------------------------------------
NVIDIA CUDA Profiler Tools Interface (CUPTI)
------------------------------------------------------------------------------

* <cupti_dir>/include : Contains the CUPTI header files
* <cupti_dir>/lib*    : Contains the CUPTI library
* <cupti_dir>/sample  : Contains samples showing use of the CUPTI APIs
* <cupti_dir>/doc     : Contains the CUPTI release notes

SUPPORTED DISTRIBUTIONS
-----------------------
CUPTI is supported on all platforms for which CUDA Toolkit is supported.

. NVIDIA Display Driver

COMPILING AND RUNNING CUPTI SAMPLES
-----------------------------------
On Windows, compiling and running the CUPTI samples using the included
Makefiles requires the Cygwin environment.

> cd <cupti_dir>/sample/<sample>

INCOMPATIBLE CHANGES FROM CUPTI 4.0
-----------------------------------
A number of non-backward-compatible API changes were made in 4.1. These
changes require minor source modifications to existing code compiled
against CUPTI 4.0. In addition, some previously incorrect and
undefined behavior is now prevented by improved error checking. Your
code may need to be modified to handle these new error cases.

- Multiple CUPTI subscribers are no longer allowed. In 4.0,
  cuptiSubscribe() could be used to enable multiple subscriber callback
  functions to be active at the same time. When multiple callback
  functions were subscribed, invocation of those callbacks did not
  respect the domain registration for those callback functions. In 4.1
  and later, cuptiSubscribe() returns CUPTI_ERROR_MAX_LIMIT_REACHED if
  there is already an active subscriber.
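
Code that relied on several subscribers in 4.0 can keep its separate
handlers by routing them through one callback. The following is a minimal
sketch of that idea, not CUPTI code: the type domain_t stands in for
CUpti_CallbackDomain, and the names handler_fn, dispatcher_add(),
dispatcher_dispatch() and count_calls() are hypothetical. In a real tool,
the single function passed to cuptiSubscribe() would forward to
dispatcher_dispatch().

```c
#include <stddef.h>

/* Stand-in for CUpti_CallbackDomain; the real type comes from the CUPTI headers. */
typedef int domain_t;

/* Hypothetical handler signature used only in this sketch. */
typedef void (*handler_fn)(void *userdata, domain_t domain, int cbid,
                           const void *cbdata);

typedef struct {
    domain_t   domain;    /* domain this handler is interested in */
    handler_fn fn;        /* function to invoke */
    void      *userdata;  /* passed back to fn unchanged */
} handler_entry;

#define MAX_HANDLERS 8

static handler_entry g_handlers[MAX_HANDLERS];
static size_t g_handler_count = 0;

/* Register a handler for one domain.
 * Returns 0 on success, -1 if fn is NULL or the table is full. */
int dispatcher_add(domain_t domain, handler_fn fn, void *userdata)
{
    if (fn == NULL || g_handler_count == MAX_HANDLERS)
        return -1;
    g_handlers[g_handler_count].domain = domain;
    g_handlers[g_handler_count].fn = fn;
    g_handlers[g_handler_count].userdata = userdata;
    g_handler_count++;
    return 0;
}

/* Forward one event to every handler registered for its domain.
 * Returns the number of handlers that were invoked. */
int dispatcher_dispatch(domain_t domain, int cbid, const void *cbdata)
{
    int invoked = 0;
    for (size_t i = 0; i < g_handler_count; i++) {
        if (g_handlers[i].domain == domain) {
            g_handlers[i].fn(g_handlers[i].userdata, domain, cbid, cbdata);
            invoked++;
        }
    }
    return invoked;
}

/* Example handler: counts its invocations through an int in userdata. */
void count_calls(void *userdata, domain_t domain, int cbid, const void *cbdata)
{
    (void)domain;
    (void)cbid;
    (void)cbdata;
    ++*(int *)userdata;
}
```

Because the dispatcher filters by domain itself, each handler only sees
events for the domain it registered for, which is the behavior the 4.0
multi-subscriber setup did not guarantee.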

- The CUpti_EventID values for Tesla devices have changed in 4.1 to
  make all CUpti_EventID values unique across all devices. Going
  forward, CUpti_EventID values will be added for new devices and
  events, but existing values will not be changed. If your application
  has stored CUpti_EventID values (for example, as part of the data
  collected for a profiling session), those CUpti_EventIDs must be
  translated to the new ID values before being used in 4.1 and later.
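
One way to translate stored IDs is a lookup table applied when old
session data is loaded. The sketch below assumes such a table is
available; uint32_t stands in for CUpti_EventID, the function name
translate_event_id() is hypothetical, and the table values are
placeholders, not real CUPTI event IDs. The sketch does not address how
the actual old-to-new mapping is obtained.

```c
#include <stddef.h>
#include <stdint.h>

/* One row of the table: an ID as stored with 4.0, and its 4.1 value. */
typedef struct {
    uint32_t old_id;
    uint32_t new_id;
} event_id_map_entry;

/* PLACEHOLDER values for illustration only; not real CUPTI event IDs. */
static const event_id_map_entry k_event_id_map[] = {
    { 1u, 1001u },
    { 2u, 1002u },
    { 3u, 1003u },
};

/* Translate a stored 4.0 ID.
 * Returns 0 and writes *new_id on success.
 * Returns -1 and leaves *new_id untouched if the ID is not in the table. */
int translate_event_id(uint32_t old_id, uint32_t *new_id)
{
    size_t n = sizeof(k_event_id_map) / sizeof(k_event_id_map[0]);
    for (size_t i = 0; i < n; i++) {
        if (k_event_id_map[i].old_id == old_id) {
            *new_id = k_event_id_map[i].new_id;
            return 0;
        }
    }
    return -1;
}
```

Returning an error for unknown IDs, rather than passing them through
unchanged, keeps a stale 4.0 value from being used silently with 4.1.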

- In enumeration CUpti_EventDomainAttribute,
  CUPTI_EVENT_DOMAIN_MAX_EVENTS has been removed. The number of events
  in an event domain can be retrieved with
  cuptiEventDomainGetNumEvents().

- cuptiDeviceGetAttribute(), cuptiEventGroupGetAttribute() and
  cuptiEventGroupSetAttribute() now take a size parameter and the
  'value' parameter now has type 'void *'.
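
The "size plus void *" convention can be illustrated without CUPTI
itself. The sketch below uses a stand-in getter, mock_get_attribute(),
which is hypothetical and not part of CUPTI, as are the attribute IDs
and the values it returns. It assumes the size is passed by pointer so
the callee can report how many bytes it wrote; whether a given CUPTI
function takes the size by value or by pointer should be checked against
the CUPTI headers.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical attribute IDs for this sketch; not CUPTI enumerators. */
enum { ATTR_COUNT = 0, ATTR_RATE = 1 };

/* Stand-in getter following the "size + void *" convention.
 * On entry *value_size is the size of the caller's buffer in bytes.
 * On success it is set to the number of bytes written and 0 is returned.
 * Returns -1 on bad arguments or if the buffer is too small. */
int mock_get_attribute(int attrib, size_t *value_size, void *value)
{
    uint32_t count = 4;        /* made-up 32-bit attribute value */
    uint64_t rate = 1000000;   /* made-up 64-bit attribute value */
    const void *src;
    size_t need;

    if (value_size == NULL || value == NULL)
        return -1;

    switch (attrib) {
    case ATTR_COUNT: src = &count; need = sizeof(count); break;
    case ATTR_RATE:  src = &rate;  need = sizeof(rate);  break;
    default:         return -1;
    }

    if (*value_size < need)
        return -1;

    memcpy(value, src, need);
    *value_size = need;
    return 0;
}

/* Caller side: pass the address and size of a correctly typed variable. */
int read_count(uint32_t *out)
{
    size_t size = sizeof(*out);
    return mock_get_attribute(ATTR_COUNT, &size, out);
}
```

Taking the size from sizeof on the destination variable, as read_count()
does, lets the callee reject a buffer whose type is too narrow for the
requested attribute.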

- cuptiEventDomainGetAttribute() no longer takes a CUdevice
  parameter. This function is now used to get event domain attributes
  that are device independent. A new function,
  cuptiDeviceGetEventDomainAttribute(), is added to get event domain
  attributes that are device dependent.

- cuptiEventDomainGetNumEvents(), cuptiEventDomainEnumEvents() and
  cuptiEventGetAttribute() no longer take a CUdevice parameter.

- The contextUid field of the CUpti_CallbackData structure has been
  changed from type uint64_t to type uint32_t.
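
Applications that saved contextUid as a 64-bit value may want to check
the range before comparing it with the 32-bit field. The helper below is
a minimal sketch of such a check; the name narrow_context_uid() is
hypothetical and not part of CUPTI.

```c
#include <stdint.h>

/* Convert a context UID stored as 64 bits to the 32-bit width.
 * Returns 0 and writes *out when the value fits.
 * Returns -1 and leaves *out untouched when it would be truncated. */
int narrow_context_uid(uint64_t stored, uint32_t *out)
{
    if (stored > UINT32_MAX)
        return -1;
    *out = (uint32_t)stored;
    return 0;
}
```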

- CUPTI activity record collection must be initialized before any CUDA
  function is invoked. Otherwise, activity collection may be incomplete
  or entirely disabled. Make sure that some CUPTI activity API (such
  as cuptiActivityEnable()) is called before the first CUDA driver or
  runtime API call.

- The activity API functions cuptiActivityEnqueueBuffer() and
  cuptiActivityDequeueBuffer() are deprecated and will be removed in a
  future release. New code should adopt the asynchronous API
  implemented by cuptiActivityRegisterCallbacks(),
  cuptiActivityFlush(), and cuptiActivityFlushAll(). See the CUPTI
  documentation for details.
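
The asynchronous model replaces explicit enqueue and dequeue calls with
two callbacks: one that supplies an empty buffer and one that receives a
filled buffer. The sketch below models only that buffer hand-off in
plain C and does not call CUPTI. The function names, BUF_SIZE, the
void * context parameter and the meaning given to max_num_records are
assumptions made for illustration; the exact callback types expected by
cuptiActivityRegisterCallbacks() are defined in the CUPTI headers.

```c
#include <stdint.h>
#include <stdlib.h>

#define BUF_SIZE (32 * 1024)  /* arbitrary buffer size chosen for this sketch */

/* Running total of record bytes received by buffer_completed(). */
static size_t g_total_valid_bytes = 0;

/* "Buffer requested" side: hand out an empty buffer to be filled.
 * On allocation failure *buffer is NULL and *size is 0. */
void buffer_requested(uint8_t **buffer, size_t *size, size_t *max_num_records)
{
    *buffer = (uint8_t *)malloc(BUF_SIZE);
    *size = (*buffer != NULL) ? BUF_SIZE : 0;
    *max_num_records = 0;  /* in this sketch, 0 means "no explicit record limit" */
}

/* "Buffer completed" side: consume the valid bytes, then release the buffer.
 * Ownership of the buffer returns to the tool here, so it is freed here. */
void buffer_completed(void *context, uint32_t stream_id, uint8_t *buffer,
                      size_t size, size_t valid_size)
{
    (void)context;
    (void)stream_id;
    (void)size;
    /* A real tool would walk the records in buffer[0 .. valid_size) here. */
    g_total_valid_bytes += valid_size;
    free(buffer);
}
```

The point of the pattern is ownership: the tool allocates in one
callback, the library fills the buffer, and the tool frees it in the
other callback once the records have been processed.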