2 PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
3 <html xmlns="http://www.w3.org/1999/xhtml" lang="en-us" xml:lang="en-us">
5 <meta http-equiv="Content-Type" content="text/html; charset=utf-8"></meta>
6 <meta http-equiv="X-UA-Compatible" content="IE=edge"></meta>
7 <meta name="copyright" content="(C) Copyright 2005"></meta>
8 <meta name="DC.rights.owner" content="(C) Copyright 2005"></meta>
9 <meta name="DC.Type" content="concept"></meta>
10 <meta name="DC.Title" content="CUSPARSE"></meta>
11 <meta name="abstract" content="The API reference guide for CUSPARSE, the CUDA sparse matrix library."></meta>
12 <meta name="description" content="The API reference guide for CUSPARSE, the CUDA sparse matrix library."></meta>
13 <meta name="DC.Coverage" content="CUDA API References"></meta>
14 <meta name="DC.subject" content="CUDA CUSPARSE, CUDA CUSPARSE asynchronous execution, CUDA CUSPARSE index format, CUDA CUSPARSE vector format, CUDA CUSPARSE matrix format, CUDA CUSPARSE type, CUDA CUSPARSE helper function, CUDA CUSPARSE format conversion, CUDA CUSPARSE preconditioners, CUDA CUSPARSE level function"></meta>
15 <meta name="keywords" content="CUDA CUSPARSE, CUDA CUSPARSE asynchronous execution, CUDA CUSPARSE index format, CUDA CUSPARSE vector format, CUDA CUSPARSE matrix format, CUDA CUSPARSE type, CUDA CUSPARSE helper function, CUDA CUSPARSE format conversion, CUDA CUSPARSE preconditioners, CUDA CUSPARSE level function"></meta>
16 <meta name="DC.Format" content="XHTML"></meta>
17 <meta name="DC.Identifier" content="abstract"></meta>
18 <link rel="stylesheet" type="text/css" href="../common/formatting/commonltr.css"></link>
19 <link rel="stylesheet" type="text/css" href="../common/formatting/site.css"></link>
20 <title>CUSPARSE :: CUDA Toolkit Documentation</title>
22 <script src="../common/formatting/html5shiv-printshiv.min.js"></script>
24 <script type="text/javascript" charset="utf-8" src="../common/formatting/jquery.min.js"></script>
25 <script type="text/javascript" charset="utf-8" src="../common/formatting/jquery.ba-hashchange.min.js"></script>
26 <link rel="canonical" href="http://docs.nvidia.com/cuda/cusparse/index.html"></link>
27 <link rel="stylesheet" type="text/css" href="../common/formatting/qwcode.highlight.css"></link>
31 <article id="contents">
32 <div id="eqn-warning">This document includes math equations
33 (highlighted in red) which are best viewed with <a target="_blank" href="https://www.mozilla.org/firefox">Firefox</a> version 4.0
34 or higher, or another <a target="_blank" href="http://www.w3.org/Math/Software/mathml_software_cat_browsers.html">MathML-aware
35 browser</a>. There is also a <a href="../../pdf/CUSPARSE_Library.pdf">PDF version of this
39 <div id="eqn-warning-buf"></div>
50 <div class="topic nested0" id="abstract"><a name="abstract" shape="rect">
51 <!-- --></a><h2 class="title topictitle1"><a href="#abstract" name="abstract" shape="rect">CUSPARSE</a></h2>
52 <div class="body conbody"></div>
54 <div class="topic concept nested0" id="introduction"><a name="introduction" shape="rect">
55 <!-- --></a><h2 class="title topictitle1"><a href="#introduction" name="introduction" shape="rect">1. Introduction</a></h2>
56 <div class="body conbody">
57 <p class="p">The CUSPARSE library contains a set of basic linear algebra subroutines used for handling sparse
58 matrices. It is implemented on top of the NVIDIA® CUDA™ runtime (which is part of the
CUDA Toolkit) and is designed to be called from C and C++. The library routines can be classified into four categories:</p>
62 <div class="p"><a name="introduction__ul_wnl_2dp_sg" shape="rect">
63 <!-- --></a><ul class="ul" id="introduction__ul_wnl_2dp_sg">
64 <li class="li">Level 1: operations between a vector in sparse format and a vector in dense format</li>
65 <li class="li">Level 2: operations between a matrix in sparse format and a vector in dense format</li>
66 <li class="li">Level 3: operations between a matrix in sparse format and a set of vectors in dense format
67 (which can also usually be viewed as a dense tall matrix)
69 <li class="li">Conversion: operations that allow conversion between different matrix formats</li>
72 <p class="p">The CUSPARSE library allows developers to access the computational resources of the NVIDIA
73 graphics processing unit (GPU), although it does not auto-parallelize across multiple GPUs. The
74 CUSPARSE API assumes that input and output data reside in GPU (device) memory, unless it is
75 explicitly indicated otherwise by the string <samp class="ph codeph">DevHostPtr</samp> in a function parameter's
76 name (for example, the parameter <samp class="ph codeph">*resultDevHostPtr</samp> in the function
<samp class="ph codeph">cusparse&lt;t&gt;doti()</samp>).
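<p class="p">To make the Level 1 category concrete, the following host-only C sketch computes the kind of quantity a sparse dot product such as <samp class="ph codeph">cusparse&lt;t&gt;doti()</samp> returns: the sum of products between the nonzeros of a sparse vector and the matching entries of a dense vector. The function name <samp class="ph codeph">sparse_dot_ref</samp> and its parameters are hypothetical and are not part of the CUSPARSE API; this is a plain CPU reference under those assumptions, not the library's implementation.</p>

```c
#include <stddef.h>

/* Hypothetical host-side reference (not a CUSPARSE routine).
 * Computes sum over k of xVal[k] * y[xInd[k] - base], where
 *   xVal, xInd : the nnz nonzero values of sparse vector x and their positions
 *   y          : dense vector
 *   base       : 0 for zero-based indices in xInd, 1 for one-based indices */
double sparse_dot_ref(size_t nnz, const double *xVal, const int *xInd,
                      const double *y, int base)
{
    double sum = 0.0;
    for (size_t k = 0; k < nnz; ++k) {
        sum += xVal[k] * y[xInd[k] - base];
    }
    return sum;
}
```

<p class="p">Only the <samp class="ph codeph">nnz</samp> stored entries are visited, which is the point of keeping one operand in sparse format.</p>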
79 <p class="p">It is the responsibility of the developer to allocate memory and to copy data between GPU memory
80 and CPU memory using standard CUDA runtime API routines, such as <samp class="ph codeph">cudaMalloc()</samp>,
81 <samp class="ph codeph">cudaFree()</samp>, <samp class="ph codeph">cudaMemcpy()</samp>, and
82 <samp class="ph codeph">cudaMemcpyAsync()</samp>.
<div class="note note"><span class="notetitle">Note:</span> The CUSPARSE library requires hardware with compute capability (CC) of at least 1.1.
86 Please see the <em class="ph i">NVIDIA CUDA C Programming Guide, Appendix A</em> for a list of the compute
87 capabilities corresponding to all NVIDIA GPUs.
91 <div class="topic concept nested1" id="naming-convention"><a name="naming-convention" shape="rect">
92 <!-- --></a><h3 class="title topictitle2"><a href="#naming-convention" name="naming-convention" shape="rect">1.1. Naming Conventions</a></h3>
93 <div class="body conbody">
94 <p class="p">The CUSPARSE library functions are available for data types <samp class="ph codeph">float</samp>,
95 <samp class="ph codeph">double</samp>, <samp class="ph codeph">cuComplex</samp>, and <samp class="ph codeph">cuDoubleComplex</samp>.
96 The sparse Level 1, Level 2, and Level 3 functions follow this naming convention:
<p class="p"><samp class="ph codeph">cusparse</samp>&lt;<samp class="ph codeph">t</samp>&gt;[&lt;<samp class="ph codeph">matrix data
format</samp>&gt;]&lt;<samp class="ph codeph">operation</samp>&gt;[&lt;<samp class="ph codeph">output matrix
data format</samp>&gt;]
<p class="p">where &lt;<samp class="ph codeph">t</samp>&gt; can be <samp class="ph codeph">S</samp>, <samp class="ph codeph">D</samp>,
103 <samp class="ph codeph">C</samp>, <samp class="ph codeph">Z</samp>, or <samp class="ph codeph">X</samp>, corresponding to the data
104 types <samp class="ph codeph">float</samp>, <samp class="ph codeph">double</samp>, <samp class="ph codeph">cuComplex</samp>,
105 <samp class="ph codeph">cuDoubleComplex</samp>, and the generic type, respectively.
<p class="p">The &lt;<samp class="ph codeph">matrix data format</samp>&gt; can be <samp class="ph codeph">dense</samp>,
108 <samp class="ph codeph">coo</samp>, <samp class="ph codeph">csr</samp>, <samp class="ph codeph">csc</samp>, or <samp class="ph codeph">hyb</samp>,
109 corresponding to the dense, coordinate, compressed sparse row, compressed sparse column, and
110 hybrid storage formats, respectively.
<p class="p">Finally, the &lt;<samp class="ph codeph">operation</samp>&gt; can be <samp class="ph codeph">axpyi</samp>,
113 <samp class="ph codeph">doti</samp>, <samp class="ph codeph">dotci</samp>, <samp class="ph codeph">gthr</samp>,
114 <samp class="ph codeph">gthrz</samp>, <samp class="ph codeph">roti</samp>, or <samp class="ph codeph">sctr</samp>, corresponding to the
Level 1 functions; it can also be <samp class="ph codeph">mv</samp> or <samp class="ph codeph">sv</samp>, corresponding to
116 the Level 2 functions, as well as <samp class="ph codeph">mm</samp> or <samp class="ph codeph">sm</samp>, corresponding to
117 the Level 3 functions.
119 <p class="p">All of the functions have the return type <samp class="ph codeph">cusparseStatus_t</samp> and are explained in
120 more detail in the chapters that follow.
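<p class="p">The naming pattern described in this section can be illustrated with a small C helper that assembles a name from its parts. The helper <samp class="ph codeph">compose_routine_name</samp> is hypothetical and exists only for illustration; it checks the type letter against the five letters listed above, but it does not check whether the resulting routine exists in any particular release of the library.</p>

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of CUSPARSE). Builds a name of the form
 *   cusparse<t>[<matrix data format>]<operation>[<output matrix data format>]
 * Optional parts are passed as "" (empty string), never NULL.
 * Returns 0 on success, or -1 if t is not one of S, D, C, Z, X
 * or if the name does not fit in out (including the terminator). */
int compose_routine_name(char t, const char *fmt, const char *op,
                         const char *outFmt, char *out, size_t outSize)
{
    if (t == '\0' || strchr("SDCZX", t) == NULL) {
        return -1;
    }
    int n = snprintf(out, outSize, "cusparse%c%s%s%s", t, fmt, op, outFmt);
    return (n >= 0 && (size_t)n < outSize) ? 0 : -1;
}
```

<p class="p">For example, passing <samp class="ph codeph">'S'</samp>, <samp class="ph codeph">"csr"</samp>, <samp class="ph codeph">"mv"</samp>, and an empty output format yields <samp class="ph codeph">cusparseScsrmv</samp>, while <samp class="ph codeph">'D'</samp> with an empty matrix format and <samp class="ph codeph">"doti"</samp> yields <samp class="ph codeph">cusparseDdoti</samp>.</p>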
124 <div class="topic concept nested1" id="asynchronous-execution"><a name="asynchronous-execution" shape="rect">
125 <!-- --></a><h3 class="title topictitle2"><a href="#asynchronous-execution" name="asynchronous-execution" shape="rect">1.2. Asynchronous Execution</a></h3>
126 <div class="body conbody">
127 <p class="p">The CUSPARSE library functions are executed asynchronously with respect to the host and may
128 return control to the application on the host before the result is ready. Developers can use
129 the <samp class="ph codeph">cudaDeviceSynchronize()</samp> function to ensure that the execution of a
130 particular CUSPARSE library routine has completed.
132 <p class="p">A developer can also use the <samp class="ph codeph">cudaMemcpy()</samp> routine to copy data from the device
133 to the host and vice versa, using the <samp class="ph codeph">cudaMemcpyDeviceToHost</samp> and
134 <samp class="ph codeph">cudaMemcpyHostToDevice</samp> parameters, respectively. In this case there is no
135 need to add a call to <samp class="ph codeph">cudaDeviceSynchronize()</samp> because the call to
136 <samp class="ph codeph">cudaMemcpy()</samp> with the above parameters is blocking and completes only when
137 the results are ready on the host.
142 <div class="topic concept nested0" id="using-the-cusparse-api"><a name="using-the-cusparse-api" shape="rect">
143 <!-- --></a><h2 class="title topictitle1"><a href="#using-the-cusparse-api" name="using-the-cusparse-api" shape="rect">2. Using the CUSPARSE API</a></h2>
144 <div class="body conbody">
145 <p class="p">This chapter describes how to use the CUSPARSE library API. It is not a reference for the CUSPARSE API data types and functions;
146 that is provided in subsequent chapters.
149 <div class="topic concept nested1" id="thread-safety"><a name="thread-safety" shape="rect">
150 <!-- --></a><h3 class="title topictitle2"><a href="#thread-safety" name="thread-safety" shape="rect">2.1. Thread Safety</a></h3>
151 <div class="body conbody">
152 <p class="p">The library is thread safe and its functions can be called from multiple host threads.</p>
155 <div class="topic concept nested1" id="scalar-parameters2"><a name="scalar-parameters2" shape="rect">
156 <!-- --></a><h3 class="title topictitle2"><a href="#scalar-parameters2" name="scalar-parameters2" shape="rect">2.2. Scalar Parameters</a></h3>
157 <div class="body conbody">
158 <p class="p">In the CUSPARSE API, the scalar parameters
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>α</mi></math> and
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>β</mi></math>
168 can be passed by reference on the host or the device.
170 <p class="p">The few functions that return a scalar result, such as <samp class="ph codeph">doti()</samp> and
171 <samp class="ph codeph">nnz()</samp>, return the resulting value by reference on the host or the device.
172 Even though these functions return immediately, similarly to those that return matrix and
173 vector results, the scalar result is not ready until execution of the routine on the GPU
completes. Proper synchronization is therefore required when reading the result from the host.</p>
<p class="p">This feature allows the CUSPARSE library functions to execute completely asynchronously using
streams, even when
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>α</mi></math> and
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>β</mi></math>
188 are generated by a previous kernel. This situation arises, for example, when the
189 library is used to implement iterative methods for the solution of linear systems and
190 eigenvalue problems [3].
194 <div class="topic concept nested1" id="parallelism-with-streams"><a name="parallelism-with-streams" shape="rect">
195 <!-- --></a><h3 class="title topictitle2"><a href="#parallelism-with-streams" name="parallelism-with-streams" shape="rect">2.3. Parallelism with Streams</a></h3>
196 <div class="body conbody">
197 <p class="p">If the application performs several small independent computations, or if it makes data transfers
198 in parallel with the computation, CUDA streams can be used to overlap these tasks.
200 <p class="p">The application can conceptually associate a stream with each task. To achieve the overlap of
201 computation between the tasks, the developer should create CUDA streams using the
202 function <samp class="ph codeph">cudaStreamCreate()</samp> and set the stream to be used by each
203 individual CUSPARSE library routine by calling
204 <samp class="ph codeph">cusparseSetStream()</samp> just before calling the actual CUSPARSE
routine. Computations performed in separate streams are then overlapped
automatically on the GPU, when possible. This approach is especially useful when
207 the computation performed by a single task is relatively small and is not enough
208 to fill the GPU with work, or when there is a data transfer that can be performed
209 in parallel with the computation.
211 <p class="p">When streams are used, we recommend using the new CUSPARSE API with scalar parameters and results
passed by reference in device memory to achieve maximum computational overlap.</p>
<p class="p">Although a developer can create many streams, in practice it is not possible to have more than 16
kernels executing concurrently.
221 <div class="topic concept nested0" id="cusparse-indexing-and-data-formats"><a name="cusparse-indexing-and-data-formats" shape="rect">
222 <!-- --></a><h2 class="title topictitle1"><a href="#cusparse-indexing-and-data-formats" name="cusparse-indexing-and-data-formats" shape="rect">3. CUSPARSE Indexing and Data Formats</a></h2>
223 <div class="body conbody">
224 <p class="p">The CUSPARSE library supports dense and sparse vector, and dense and sparse matrix formats.</p>
226 <div class="topic concept nested1" id="index-base-format"><a name="index-base-format" shape="rect">
227 <!-- --></a><h3 class="title topictitle2"><a href="#index-base-format" name="index-base-format" shape="rect">3.1. Index Base Format</a></h3>
228 <div class="body conbody">
229 <p class="p">The library supports zero- and one-based indexing. The index base is selected through the <samp class="ph codeph">cusparseIndexBase_t</samp> type, which is passed as a standalone parameter or as a field in the matrix descriptor <samp class="ph codeph">cusparseMatDescr_t</samp> type.
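<p class="p">Because only the index base differs between the two conventions, switching an index array from one to the other is a uniform shift. The sketch below shows this on the host in plain C; the function <samp class="ph codeph">rebase_indices</samp> is a hypothetical helper, and the bases are passed as plain integers (0 or 1) rather than as <samp class="ph codeph">cusparseIndexBase_t</samp> values so that the example does not depend on the library headers.</p>

```c
#include <stddef.h>

/* Hypothetical helper (not part of CUSPARSE).
 * Converts n indices in place from one index base to another.
 * fromBase and toBase are each 0 (zero-based) or 1 (one-based). */
void rebase_indices(int *ind, size_t n, int fromBase, int toBase)
{
    int shift = toBase - fromBase;
    for (size_t k = 0; k < n; ++k) {
        ind[k] += shift;
    }
}
```

<p class="p">The data values themselves are untouched; only the integer positions change.</p>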
233 <div class="topic concept nested1" id="vector-formats"><a name="vector-formats" shape="rect">
234 <!-- --></a><h3 class="title topictitle2"><a href="#vector-formats" name="vector-formats" shape="rect">3.2. Vector Formats</a></h3>
235 <div class="body conbody">
236 <p class="p">This section describes dense and sparse vector formats.</p>
238 <div class="topic concept nested2" id="dense-format"><a name="dense-format" shape="rect">
239 <!-- --></a><h3 class="title topictitle2"><a href="#dense-format" name="dense-format" shape="rect">3.2.1. Dense Format</a></h3>
240 <div class="body conbody">
241 <p class="p">Dense vectors are represented with a single data array that is stored linearly in memory, such as
244 <math xmlns="http://www.w3.org/1998/Math/MathML">
251 <div class="tablenoborder">
252 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="void" border="1" rules="all">
253 <tbody class="tbody">
255 <td class="entry" align="center" valign="top" rowspan="1" colspan="1">
256 <p class="p d4p_eqn_block">
257 <math xmlns="http://www.w3.org/1998/Math/MathML">
258 <mfenced open="[" close="]">
259 <mtable rowspacing="4pt" columnspacing="1em">
292 <p class="p">(This vector is referenced again in the next section.)</p>
295 <div class="topic concept nested2" id="sparse-format"><a name="sparse-format" shape="rect">
296 <!-- --></a><h3 class="title topictitle2"><a href="#sparse-format" name="sparse-format" shape="rect">3.2.2. Sparse Format</a></h3>
297 <div class="body conbody">
298 <p class="p">Sparse vectors are represented with two arrays.</p>
301 <p class="p">The <em class="ph i">data array</em> has the nonzero values from the equivalent array in dense format.
305 <p class="p">The <em class="ph i">integer index array</em> has the positions of the corresponding nonzero values in the equivalent array in dense format.
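<p class="p">The relationship between the two arrays and the equivalent dense array can be sketched in host-only C as follows. The function <samp class="ph codeph">dense_to_sparse_vector</samp> and its parameter names are hypothetical; the sketch assumes the caller has allocated the output arrays with room for up to <samp class="ph codeph">n</samp> entries.</p>

```c
#include <stddef.h>

/* Hypothetical helper (not part of CUSPARSE).
 * Scans dense vector x of length n and writes its nonzero values to val
 * and their positions (offset by base, 0 or 1) to ind.
 * val and ind must each have room for n entries.
 * Returns the number of nonzeros written. Positions come out in
 * increasing order and each position appears at most once. */
size_t dense_to_sparse_vector(const double *x, size_t n, int base,
                              double *val, int *ind)
{
    size_t nnz = 0;
    for (size_t i = 0; i < n; ++i) {
        if (x[i] != 0.0) {
            val[nnz] = x[i];
            ind[nnz] = (int)i + base;
            ++nnz;
        }
    }
    return nnz;
}
```

<p class="p">For an illustrative input with nonzeros at positions 0, 3, 4, and 6, the index array holds those four positions with <samp class="ph codeph">base</samp> 0, or 1, 4, 5, and 7 with <samp class="ph codeph">base</samp> 1.</p>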
309 <p class="p">For example, the dense vector in section 3.2.1 can be stored as a sparse vector with
312 <div class="tablenoborder">
313 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="void" border="1" rules="all">
314 <tbody class="tbody">
316 <td class="entry" align="center" valign="top" rowspan="1" colspan="1">
317 <p class="p d4p_eqn_block">
318 <math xmlns="http://www.w3.org/1998/Math/MathML">
319 <mtable columnalign="right center left" rowspacing=".5em" columnspacing="thickmathspace">
324 <mfenced open="[" close="]">
325 <mtable rowspacing="4pt" columnspacing="1em">
348 <mfenced open="[" close="]">
349 <mtable rowspacing="4pt" columnspacing="1em">
353 <mrow class="MJX-TeXAtom-ORD">
361 <mrow class="MJX-TeXAtom-ORD">
369 <mrow class="MJX-TeXAtom-ORD">
377 <mrow class="MJX-TeXAtom-ORD">
396 <p class="p">It can also be stored as a sparse vector with zero-based indexing.</p>
397 <div class="tablenoborder">
398 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="void" border="1" rules="all">
399 <tbody class="tbody">
401 <td class="entry" align="center" valign="top" rowspan="1" colspan="1">
402 <p class="p d4p_eqn_block">
403 <math xmlns="http://www.w3.org/1998/Math/MathML">
404 <mtable columnalign="right center left" rowspacing=".5em" columnspacing="thickmathspace">
409 <mfenced open="[" close="]">
410 <mtable rowspacing="4pt" columnspacing="1em">
433 <mfenced open="[" close="]">
434 <mtable rowspacing="4pt" columnspacing="1em">
438 <mrow class="MJX-TeXAtom-ORD">
446 <mrow class="MJX-TeXAtom-ORD">
454 <mrow class="MJX-TeXAtom-ORD">
462 <mrow class="MJX-TeXAtom-ORD">
481 <p class="p">In each example, the top row is the data array and the bottom row is the index array, and it
482 is assumed that the indices are provided in increasing order and that each index appears only
488 <div class="topic concept nested1" id="matrix-formats"><a name="matrix-formats" shape="rect">
489 <!-- --></a><h3 class="title topictitle2"><a href="#matrix-formats" name="matrix-formats" shape="rect">3.3. Matrix Formats</a></h3>
490 <div class="body conbody">
491 <p class="p">Dense and several sparse formats for matrices are discussed in this section.</p>
493 <div class="topic concept nested2" id="dense-format2"><a name="dense-format2" shape="rect">
494 <!-- --></a><h3 class="title topictitle2"><a href="#dense-format2" name="dense-format2" shape="rect">3.3.1. Dense Format</a></h3>
495 <div class="body conbody">
496 <p class="p">The dense matrix <samp class="ph codeph">X</samp> is assumed to be stored in column-major format in memory and
497 is represented by the following parameters.
499 <div class="tablenoborder">
500 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all">
501 <tbody class="tbody">
503 <td class="entry" valign="top" width="20%" rowspan="1" colspan="1"><samp class="ph codeph">m</samp></td>
504 <td class="entry" valign="top" width="10%" rowspan="1" colspan="1">(integer)</td>
505 <td class="entry" valign="top" width="70%" rowspan="1" colspan="1">The number of rows in the matrix.</td>
508 <td class="entry" valign="top" width="20%" rowspan="1" colspan="1"><samp class="ph codeph">n</samp></td>
509 <td class="entry" valign="top" width="10%" rowspan="1" colspan="1">(integer)</td>
510 <td class="entry" valign="top" width="70%" rowspan="1" colspan="1">The number of columns in the matrix.</td>
513 <td class="entry" valign="top" width="20%" rowspan="1" colspan="1"><samp class="ph codeph">ldX</samp></td>
514 <td class="entry" valign="top" width="10%" rowspan="1" colspan="1">(integer)</td>
515 <td class="entry" valign="top" width="70%" rowspan="1" colspan="1">The leading dimension of <samp class="ph codeph">X</samp>, which must be greater than or equal to
516 <samp class="ph codeph">m</samp>. If <samp class="ph codeph">ldX</samp> is greater than <samp class="ph codeph">m</samp>, then
517 <samp class="ph codeph">X</samp> represents a sub-matrix of a larger matrix stored in
522 <td class="entry" valign="top" width="20%" rowspan="1" colspan="1"><samp class="ph codeph">X</samp></td>
523 <td class="entry" valign="top" width="10%" rowspan="1" colspan="1">(pointer)</td>
524 <td class="entry" valign="top" width="70%" rowspan="1" colspan="1">Points to the data array containing the matrix elements. It is assumed that enough storage
525 is allocated for <samp class="ph codeph">X</samp> to hold all of the matrix elements and that
526 CUSPARSE library functions may access values outside of the sub-matrix, but will never
<p class="p">For example, an <samp class="ph codeph">m×n</samp> dense matrix <samp class="ph codeph">X</samp> with leading dimension
<samp class="ph codeph">ldX</samp> can be stored with one-based indexing as shown.
536 <div class="tablenoborder">
537 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="void" border="1" rules="all">
538 <tbody class="tbody">
540 <td class="entry" align="center" valign="top" rowspan="1" colspan="1">
541 <p class="p d4p_eqn_block">
542 <math xmlns="http://www.w3.org/1998/Math/MathML">
543 <mfenced open="[" close="]">
544 <mtable rowspacing="4pt" columnspacing="1em">
549 <mrow class="MJX-TeXAtom-ORD">
559 <mrow class="MJX-TeXAtom-ORD">
572 <mrow class="MJX-TeXAtom-ORD">
584 <mrow class="MJX-TeXAtom-ORD">
594 <mrow class="MJX-TeXAtom-ORD">
607 <mrow class="MJX-TeXAtom-ORD">
633 <mrow class="MJX-TeXAtom-ORD">
643 <mrow class="MJX-TeXAtom-ORD">
656 <mrow class="MJX-TeXAtom-ORD">
682 <mrow class="MJX-TeXAtom-ORD">
694 <mrow class="MJX-TeXAtom-ORD">
709 <mrow class="MJX-TeXAtom-ORD">
728 <p class="p">Its elements are arranged linearly in memory in the order below.</p>
729 <div class="tablenoborder">
730 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="void" border="1" rules="all">
731 <tbody class="tbody">
733 <td class="entry" align="center" valign="top" rowspan="1" colspan="1">
734 <p class="p d4p_eqn_block">
735 <math xmlns="http://www.w3.org/1998/Math/MathML">
736 <mfenced open="[" close="]">
737 <mtable rowspacing="4pt" columnspacing="1em">
742 <mrow class="MJX-TeXAtom-ORD">
752 <mrow class="MJX-TeXAtom-ORD">
765 <mrow class="MJX-TeXAtom-ORD">
778 <mrow class="MJX-TeXAtom-ORD">
793 <mrow class="MJX-TeXAtom-ORD">
803 <mrow class="MJX-TeXAtom-ORD">
816 <mrow class="MJX-TeXAtom-ORD">
829 <mrow class="MJX-TeXAtom-ORD">
848 <div class="note note"><span class="notetitle">Note:</span> This format and notation are similar to those used in the NVIDIA CUDA CUBLAS
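<p class="p">The role of the leading dimension in column-major storage can be sketched in host-only C. Both functions below are hypothetical helpers using zero-based row and column numbers; they assume <samp class="ph codeph">ld</samp> is at least the number of rows, as required for <samp class="ph codeph">ldX</samp> above.</p>

```c
#include <stddef.h>

/* Hypothetical helper (not part of CUSPARSE).
 * Linear offset of element (i, j), both zero-based, in a column-major
 * array whose consecutive columns are ld elements apart. */
size_t colmajor_offset(size_t i, size_t j, size_t ld)
{
    return i + j * ld;
}

/* Sums the m x n sub-matrix starting at X. When ld > m, the ld - m
 * trailing entries of each stored column belong to the larger matrix
 * and are skipped. */
double submatrix_sum(const double *X, size_t m, size_t n, size_t ld)
{
    double s = 0.0;
    for (size_t j = 0; j < n; ++j) {
        for (size_t i = 0; i < m; ++i) {
            s += X[colmajor_offset(i, j, ld)];
        }
    }
    return s;
}
```

<p class="p">Iterating with the row number in the inner loop walks memory contiguously, which matches the element order of the layout.</p>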
853 <div class="topic concept nested2" id="coordinate-format-coo"><a name="coordinate-format-coo" shape="rect">
854 <!-- --></a><h3 class="title topictitle2"><a href="#coordinate-format-coo" name="coordinate-format-coo" shape="rect">3.3.2. Coordinate Format (COO)</a></h3>
855 <div class="body conbody">
856 <p class="p">The <samp class="ph codeph">m×n</samp> sparse matrix <samp class="ph codeph">A</samp> is represented in COO format
857 by the following parameters.
859 <div class="tablenoborder">
860 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all">
861 <tbody class="tbody">
863 <td class="entry" valign="top" width="20%" rowspan="1" colspan="1"><samp class="ph codeph">nnz</samp></td>
864 <td class="entry" valign="top" width="10%" rowspan="1" colspan="1">(integer)</td>
865 <td class="entry" valign="top" width="70%" rowspan="1" colspan="1">The number of nonzero elements in the matrix.</td>
868 <td class="entry" valign="top" width="20%" rowspan="1" colspan="1"><samp class="ph codeph">cooValA</samp></td>
869 <td class="entry" valign="top" width="10%" rowspan="1" colspan="1">(pointer)</td>
870 <td class="entry" valign="top" width="70%" rowspan="1" colspan="1">Points to the data array of length <samp class="ph codeph">nnz</samp> that holds all nonzero
871 values of <samp class="ph codeph">A</samp> in row-major format.
875 <td class="entry" valign="top" width="20%" rowspan="1" colspan="1"><samp class="ph codeph">cooRowIndA</samp></td>
876 <td class="entry" valign="top" width="10%" rowspan="1" colspan="1">(pointer)</td>
877 <td class="entry" valign="top" width="70%" rowspan="1" colspan="1">Points to the integer array of length <samp class="ph codeph">nnz</samp> that contains the row
878 indices of the corresponding elements in array <samp class="ph codeph">cooValA</samp>.
882 <td class="entry" valign="top" width="20%" rowspan="1" colspan="1"><samp class="ph codeph">cooColIndA</samp></td>
883 <td class="entry" valign="top" width="10%" rowspan="1" colspan="1">(pointer)</td>
884 <td class="entry" valign="top" width="70%" rowspan="1" colspan="1">Points to the integer array of length <samp class="ph codeph">nnz</samp> that contains the
885 column indices of the corresponding elements in array
886 <samp class="ph codeph">cooValA</samp>.
892 <p class="p">A sparse matrix in COO format is assumed to be stored in row-major format: the index arrays
are first sorted by row indices and then within the same row by column indices. It
894 is assumed that each pair of row and column indices appears only once.
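<p class="p">The ordering assumption can be illustrated by building the three COO arrays from a dense matrix on the host. In this C sketch, <samp class="ph codeph">dense_to_coo</samp> is a hypothetical helper; the dense input is assumed to be column-major with leading dimension <samp class="ph codeph">ldA</samp>, and the output arrays are assumed to have room for <samp class="ph codeph">m*n</samp> entries.</p>

```c
#include <stddef.h>

/* Hypothetical helper (not part of CUSPARSE).
 * Converts the m x n column-major dense matrix A (leading dimension ldA)
 * to COO arrays. Entries are emitted row by row, and by increasing column
 * within a row, so the output is sorted in the row-major order that COO
 * storage assumes. base is 0 or 1. Returns the number of nonzeros. */
size_t dense_to_coo(const double *A, int m, int n, int ldA, int base,
                    double *cooVal, int *cooRowInd, int *cooColInd)
{
    size_t nnz = 0;
    for (int i = 0; i < m; ++i) {
        for (int j = 0; j < n; ++j) {
            double a = A[(size_t)i + (size_t)j * (size_t)ldA];
            if (a != 0.0) {
                cooVal[nnz] = a;
                cooRowInd[nnz] = i + base;
                cooColInd[nnz] = j + base;
                ++nnz;
            }
        }
    }
    return nnz;
}
```

<p class="p">Because each (row, column) position of the dense matrix is visited exactly once, no index pair can appear twice in the result.</p>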
896 <p class="p">For example, consider the following
898 <math xmlns="http://www.w3.org/1998/Math/MathML">
903 matrix <samp class="ph codeph">A</samp>.
905 <div class="tablenoborder">
906 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="void" border="1" rules="all">
907 <tbody class="tbody">
909 <td class="entry" align="center" valign="top" rowspan="1" colspan="1">
910 <p class="p d4p_eqn_block">
911 <math xmlns="http://www.w3.org/1998/Math/MathML">
912 <mfenced open="[" close="]">
913 <mtable rowspacing="4pt" columnspacing="1em">
991 <p class="p">It is stored in COO format with zero-based indexing this way.</p>
992 <div class="tablenoborder">
993 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="void" border="1" rules="all">
994 <tbody class="tbody">
996 <td class="entry" align="center" valign="top" rowspan="1" colspan="1">
997 <p class="p d4p_eqn_block">
998 <math xmlns="http://www.w3.org/1998/Math/MathML">
999 <mtable columnalign="right center left" rowspacing=".5em" columnspacing="thickmathspace">
1002 <mtext>cooValA</mtext>
1008 <mfenced open="[" close="]">
1009 <mtable rowspacing="4pt" columnspacing="1em">
1045 <mtext>cooRowIndA</mtext>
1051 <mfenced open="[" close="]">
1052 <mtable rowspacing="4pt" columnspacing="1em">
1056 <mrow class="MJX-TeXAtom-ORD">
1064 <mrow class="MJX-TeXAtom-ORD">
1072 <mrow class="MJX-TeXAtom-ORD">
1080 <mrow class="MJX-TeXAtom-ORD">
1088 <mrow class="MJX-TeXAtom-ORD">
1096 <mrow class="MJX-TeXAtom-ORD">
1104 <mrow class="MJX-TeXAtom-ORD">
1112 <mrow class="MJX-TeXAtom-ORD">
1120 <mrow class="MJX-TeXAtom-ORD">
1133 <mtext>cooColIndA</mtext>
1139 <mfenced open="[" close="]">
1140 <mtable rowspacing="4pt" columnspacing="1em">
1144 <mrow class="MJX-TeXAtom-ORD">
1152 <mrow class="MJX-TeXAtom-ORD">
1160 <mrow class="MJX-TeXAtom-ORD">
1168 <mrow class="MJX-TeXAtom-ORD">
1176 <mrow class="MJX-TeXAtom-ORD">
1184 <mrow class="MJX-TeXAtom-ORD">
1192 <mrow class="MJX-TeXAtom-ORD">
1200 <mrow class="MJX-TeXAtom-ORD">
1208 <mrow class="MJX-TeXAtom-ORD">
1227 <p class="p">In the COO format with one-based indexing, it is stored as shown.</p>
1228 <div class="tablenoborder">
1229 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="void" border="1" rules="all">
1230 <tbody class="tbody">
1232 <td class="entry" align="center" valign="top" rowspan="1" colspan="1">
1233 <p class="p d4p_eqn_block">
1234 <math xmlns="http://www.w3.org/1998/Math/MathML">
1235 <mtable columnalign="right center left" rowspacing=".5em" columnspacing="thickmathspace">
1238 <mtext>cooValA</mtext>
1244 <mfenced open="[" close="]">
1245 <mtable rowspacing="4pt" columnspacing="1em">
1281 <mtext>cooRowIndA</mtext>
1287 <mfenced open="[" close="]">
1288 <mtable rowspacing="4pt" columnspacing="1em">
1292 <mrow class="MJX-TeXAtom-ORD">
1300 <mrow class="MJX-TeXAtom-ORD">
1308 <mrow class="MJX-TeXAtom-ORD">
1316 <mrow class="MJX-TeXAtom-ORD">
1324 <mrow class="MJX-TeXAtom-ORD">
1332 <mrow class="MJX-TeXAtom-ORD">
1340 <mrow class="MJX-TeXAtom-ORD">
1348 <mrow class="MJX-TeXAtom-ORD">
1356 <mrow class="MJX-TeXAtom-ORD">
1369 <mtext>cooColIndA</mtext>
1375 <mfenced open="[" close="]">
1376 <mtable rowspacing="4pt" columnspacing="1em">
1380 <mrow class="MJX-TeXAtom-ORD">
1388 <mrow class="MJX-TeXAtom-ORD">
1396 <mrow class="MJX-TeXAtom-ORD">
1404 <mrow class="MJX-TeXAtom-ORD">
1412 <mrow class="MJX-TeXAtom-ORD">
1420 <mrow class="MJX-TeXAtom-ORD">
1428 <mrow class="MJX-TeXAtom-ORD">
1436 <mrow class="MJX-TeXAtom-ORD">
1444 <mrow class="MJX-TeXAtom-ORD">
1465 <div class="topic concept nested2" id="compressed-sparse-row-format-csr"><a name="compressed-sparse-row-format-csr" shape="rect">
1466 <!-- --></a><h3 class="title topictitle2"><a href="#compressed-sparse-row-format-csr" name="compressed-sparse-row-format-csr" shape="rect">3.3.3. Compressed Sparse Row Format (CSR)</a></h3>
1467 <div class="body conbody">
<p class="p">The only way the CSR format differs from the COO format is that the array containing the row indices is
1469 compressed in CSR format. The <samp class="ph codeph">m×n</samp> sparse matrix <samp class="ph codeph">A</samp> is
1470 represented in CSR format by the following parameters.
1472 <div class="tablenoborder">
1473 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all">
1474 <tbody class="tbody">
1476 <td class="entry" valign="top" width="20%" rowspan="1" colspan="1"><samp class="ph codeph">nnz</samp></td>
1477 <td class="entry" valign="top" width="10%" rowspan="1" colspan="1">(integer)</td>
1478 <td class="entry" valign="top" width="70%" rowspan="1" colspan="1">The number of nonzero elements in the matrix.</td>
1481 <td class="entry" valign="top" width="20%" rowspan="1" colspan="1"><samp class="ph codeph">csrValA</samp></td>
1482 <td class="entry" valign="top" width="10%" rowspan="1" colspan="1">(pointer)</td>
1483 <td class="entry" valign="top" width="70%" rowspan="1" colspan="1">Points to the data array of length <samp class="ph codeph">nnz</samp> that holds all nonzero values of A
1484 in row-major format.
1488 <td class="entry" valign="top" width="20%" rowspan="1" colspan="1"><samp class="ph codeph">csrRowPtrA</samp></td>
1489 <td class="entry" valign="top" width="10%" rowspan="1" colspan="1">(pointer)</td>
1490 <td class="entry" valign="top" width="70%" rowspan="1" colspan="1">Points to the integer array of length <samp class="ph codeph">m+1</samp> that holds indices into the
1491 arrays <samp class="ph codeph">csrColIndA</samp> and <samp class="ph codeph">csrValA</samp>. The first
1492 <samp class="ph codeph">m</samp> entries of this array contain the indices of the first nonzero
element in the <samp class="ph codeph">i</samp>th row for <samp class="ph codeph">i=1,...,m</samp>, while the last
1494 entry contains <samp class="ph codeph">nnz+csrRowPtrA(0)</samp>. In general,
1495 <samp class="ph codeph">csrRowPtrA(0)</samp> is <samp class="ph codeph">0</samp> or <samp class="ph codeph">1</samp> for zero-
1496 and one-based indexing, respectively.
1500 <td class="entry" valign="top" width="20%" rowspan="1" colspan="1"><samp class="ph codeph">csrColIndA</samp></td>
1501 <td class="entry" valign="top" width="10%" rowspan="1" colspan="1">(pointer)</td>
1502 <td class="entry" valign="top" width="70%" rowspan="1" colspan="1">Points to the integer array of length <samp class="ph codeph">nnz</samp> that contains the column indices of the corresponding elements in array <samp class="ph codeph">csrValA</samp>.
<p class="p">Sparse matrices in CSR format are assumed to be stored in row-major order; in other words,
the index arrays are sorted first by row index and then, within each row, by column
index. Each pair of row and column indices is assumed to appear only once.
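<p class="p">The CSR parameters described above can be sketched in a few lines of host-side Python. This is only an illustration of the layout, not a CUSPARSE routine: the function name <samp class="ph codeph">dense_to_csr</samp>, the <samp class="ph codeph">base</samp> argument, and the 4×5 matrix are assumptions made for this example.</p>

```python
def dense_to_csr(dense, base=0):
    """Return (nnz, csrValA, csrRowPtrA, csrColIndA) for a list-of-rows matrix.

    Hypothetical helper for illustration; `base` is 0 or 1.
    """
    csr_val, csr_row_ptr, csr_col_ind = [], [], []
    for row in dense:
        # Index (in the chosen base) of the first stored element of this row.
        csr_row_ptr.append(len(csr_val) + base)
        for j, value in enumerate(row):  # ascending j keeps columns sorted
            if value != 0:
                csr_val.append(value)
                csr_col_ind.append(j + base)
    # The last entry is nnz + csrRowPtrA(0).
    csr_row_ptr.append(len(csr_val) + base)
    return len(csr_val), csr_val, csr_row_ptr, csr_col_ind


# Illustrative 4x5 matrix (an assumption chosen for this sketch).
A = [
    [1.0, 4.0, 0.0, 0.0, 0.0],
    [0.0, 2.0, 3.0, 0.0, 0.0],
    [5.0, 0.0, 0.0, 7.0, 8.0],
    [0.0, 0.0, 9.0, 0.0, 6.0],
]

nnz, val, row_ptr, col_ind = dense_to_csr(A, base=0)
print(nnz)      # 9
print(row_ptr)  # [0, 2, 4, 7, 9]
print(col_ind)  # [0, 1, 1, 2, 0, 3, 4, 2, 4]
```

<p class="p">Calling the same helper with <samp class="ph codeph">base=1</samp> shifts every entry of the two index arrays by one and leaves the value array unchanged.</p>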
1512 <p class="p">Consider again the
1514 <math xmlns="http://www.w3.org/1998/Math/MathML">
matrix <samp class="ph codeph">A</samp>.
1521 <div class="tablenoborder">
1522 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="void" border="1" rules="all">
1523 <tbody class="tbody">
1525 <td class="entry" align="center" valign="top" rowspan="1" colspan="1">
1526 <p class="p d4p_eqn_block">
1527 <math xmlns="http://www.w3.org/1998/Math/MathML">
1528 <mfenced open="[" close="]">
1529 <mtable rowspacing="4pt" columnspacing="1em">
1607 <p class="p">It is stored in CSR format with zero-based indexing as shown.</p>
1608 <div class="tablenoborder">
1609 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="void" border="1" rules="all">
1610 <tbody class="tbody">
1612 <td class="entry" align="center" valign="top" rowspan="1" colspan="1">
1613 <p class="p d4p_eqn_block">
1614 <math xmlns="http://www.w3.org/1998/Math/MathML">
1615 <mtable columnalign="right center left" rowspacing=".5em" columnspacing="thickmathspace">
1618 <mtext>csrValA</mtext>
1624 <mfenced open="[" close="]">
1625 <mtable rowspacing="4pt" columnspacing="1em">
1661 <mtext>csrRowPtrA</mtext>
1667 <mfenced open="[" close="]">
1668 <mtable rowspacing="4pt" columnspacing="1em">
1672 <mrow class="MJX-TeXAtom-ORD">
1680 <mrow class="MJX-TeXAtom-ORD">
1688 <mrow class="MJX-TeXAtom-ORD">
1696 <mrow class="MJX-TeXAtom-ORD">
1704 <mrow class="MJX-TeXAtom-ORD">
1717 <mtext>csrColIndA</mtext>
1723 <mfenced open="[" close="]">
1724 <mtable rowspacing="4pt" columnspacing="1em">
1728 <mrow class="MJX-TeXAtom-ORD">
1736 <mrow class="MJX-TeXAtom-ORD">
1744 <mrow class="MJX-TeXAtom-ORD">
1752 <mrow class="MJX-TeXAtom-ORD">
1760 <mrow class="MJX-TeXAtom-ORD">
1768 <mrow class="MJX-TeXAtom-ORD">
1776 <mrow class="MJX-TeXAtom-ORD">
1784 <mrow class="MJX-TeXAtom-ORD">
1792 <mrow class="MJX-TeXAtom-ORD">
1811 <p class="p">This is how it is stored in CSR format with one-based indexing.</p>
1812 <div class="tablenoborder">
1813 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="void" border="1" rules="all">
1814 <tbody class="tbody">
1816 <td class="entry" align="center" valign="top" rowspan="1" colspan="1">
1817 <p class="p d4p_eqn_block">
1818 <math xmlns="http://www.w3.org/1998/Math/MathML">
1819 <mtable columnalign="right center left" rowspacing=".5em" columnspacing="thickmathspace">
1822 <mtext>csrValA</mtext>
1828 <mfenced open="[" close="]">
1829 <mtable rowspacing="4pt" columnspacing="1em">
1865 <mtext>csrRowPtrA</mtext>
1871 <mfenced open="[" close="]">
1872 <mtable rowspacing="4pt" columnspacing="1em">
1876 <mrow class="MJX-TeXAtom-ORD">
1884 <mrow class="MJX-TeXAtom-ORD">
1892 <mrow class="MJX-TeXAtom-ORD">
1900 <mrow class="MJX-TeXAtom-ORD">
1908 <mrow class="MJX-TeXAtom-ORD">
1921 <mtext>csrColIndA</mtext>
1927 <mfenced open="[" close="]">
1928 <mtable rowspacing="4pt" columnspacing="1em">
1932 <mrow class="MJX-TeXAtom-ORD">
1940 <mrow class="MJX-TeXAtom-ORD">
1948 <mrow class="MJX-TeXAtom-ORD">
1956 <mrow class="MJX-TeXAtom-ORD">
1964 <mrow class="MJX-TeXAtom-ORD">
1972 <mrow class="MJX-TeXAtom-ORD">
1980 <mrow class="MJX-TeXAtom-ORD">
1988 <mrow class="MJX-TeXAtom-ORD">
1996 <mrow class="MJX-TeXAtom-ORD">
2017 <div class="topic concept nested2" id="compressed-sparse-column-format-csc"><a name="compressed-sparse-column-format-csc" shape="rect">
2018 <!-- --></a><h3 class="title topictitle2"><a href="#compressed-sparse-column-format-csc" name="compressed-sparse-column-format-csc" shape="rect">3.3.4. Compressed Sparse Column Format (CSC)</a></h3>
2019 <div class="body conbody">
<p class="p">The CSC format differs from the COO format in two ways: the matrix is stored in
column-major order, and the array containing the column indices is compressed.
2022 The <samp class="ph codeph">m×n</samp> matrix <samp class="ph codeph">A</samp> is represented in CSC format by the
2023 following parameters.
2025 <div class="tablenoborder">
2026 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all">
2027 <tbody class="tbody">
2029 <td class="entry" valign="top" width="20%" rowspan="1" colspan="1"><samp class="ph codeph">nnz</samp></td>
2030 <td class="entry" valign="top" width="10%" rowspan="1" colspan="1">(integer)</td>
2031 <td class="entry" valign="top" width="70%" rowspan="1" colspan="1">The number of nonzero elements in the matrix.</td>
2034 <td class="entry" valign="top" width="20%" rowspan="1" colspan="1"><samp class="ph codeph">cscValA</samp></td>
2035 <td class="entry" valign="top" width="10%" rowspan="1" colspan="1">(pointer)</td>
2036 <td class="entry" valign="top" width="70%" rowspan="1" colspan="1">Points to the data array of length <samp class="ph codeph">nnz</samp> that holds all nonzero
2037 values of <samp class="ph codeph">A</samp> in column-major format.
2041 <td class="entry" valign="top" width="20%" rowspan="1" colspan="1"><samp class="ph codeph">cscRowIndA</samp></td>
2042 <td class="entry" valign="top" width="10%" rowspan="1" colspan="1">(pointer)</td>
2043 <td class="entry" valign="top" width="70%" rowspan="1" colspan="1">Points to the integer array of length <samp class="ph codeph">nnz</samp> that contains the row
2044 indices of the corresponding elements in array <samp class="ph codeph">cscValA</samp>.
2048 <td class="entry" valign="top" width="20%" rowspan="1" colspan="1"><samp class="ph codeph">cscColPtrA</samp></td>
2049 <td class="entry" valign="top" width="10%" rowspan="1" colspan="1">(pointer)</td>
2050 <td class="entry" valign="top" width="70%" rowspan="1" colspan="1">Points to the integer array of length <samp class="ph codeph">n+1</samp> that holds indices
2051 into the arrays <samp class="ph codeph">cscRowIndA</samp> and <samp class="ph codeph">cscValA</samp>. The first
2052 <samp class="ph codeph">n</samp> entries of this array contain the indices of the first nonzero
element in the <samp class="ph codeph">i</samp>th column for <samp class="ph codeph">i=1,...,n</samp>, while the last
2054 entry contains <samp class="ph codeph">nnz+cscColPtrA(0)</samp>. In general,
2055 <samp class="ph codeph">cscColPtrA(0)</samp> is <samp class="ph codeph">0</samp> or <samp class="ph codeph">1</samp> for zero-
2056 and one-based indexing, respectively.
2062 <div class="note note"><span class="notetitle">Note:</span> The matrix <samp class="ph codeph">A</samp> in CSR format has exactly the same memory layout as its
2063 transpose in CSC format (and vice versa).
2065 <p class="p">For example, consider once again the
2067 <math xmlns="http://www.w3.org/1998/Math/MathML">
2072 matrix <samp class="ph codeph">A</samp>.
2074 <div class="tablenoborder">
2075 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="void" border="1" rules="all">
2076 <tbody class="tbody">
2078 <td class="entry" align="center" valign="top" rowspan="1" colspan="1">
2079 <p class="p d4p_eqn_block">
2080 <math xmlns="http://www.w3.org/1998/Math/MathML">
2081 <mfenced open="[" close="]">
2082 <mtable rowspacing="4pt" columnspacing="1em">
2160 <p class="p">It is stored in CSC format with zero-based indexing this way.</p>
2161 <div class="tablenoborder">
2162 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="void" border="1" rules="all">
2163 <tbody class="tbody">
2165 <td class="entry" align="center" valign="top" rowspan="1" colspan="1">
2166 <p class="p d4p_eqn_block">
2167 <math xmlns="http://www.w3.org/1998/Math/MathML">
2168 <mtable columnalign="right center left" rowspacing=".5em" columnspacing="thickmathspace">
2171 <mtext>cscValA</mtext>
2177 <mfenced open="[" close="]">
2178 <mtable rowspacing="4pt" columnspacing="1em">
2214 <mtext>cscRowIndA</mtext>
2220 <mfenced open="[" close="]">
2221 <mtable rowspacing="4pt" columnspacing="1em">
2225 <mrow class="MJX-TeXAtom-ORD">
2233 <mrow class="MJX-TeXAtom-ORD">
2241 <mrow class="MJX-TeXAtom-ORD">
2249 <mrow class="MJX-TeXAtom-ORD">
2257 <mrow class="MJX-TeXAtom-ORD">
2265 <mrow class="MJX-TeXAtom-ORD">
2273 <mrow class="MJX-TeXAtom-ORD">
2281 <mrow class="MJX-TeXAtom-ORD">
2289 <mrow class="MJX-TeXAtom-ORD">
2302 <mtext>cscColPtrA</mtext>
2308 <mfenced open="[" close="]">
2309 <mtable rowspacing="4pt" columnspacing="1em">
2313 <mrow class="MJX-TeXAtom-ORD">
2321 <mrow class="MJX-TeXAtom-ORD">
2329 <mrow class="MJX-TeXAtom-ORD">
2337 <mrow class="MJX-TeXAtom-ORD">
2345 <mrow class="MJX-TeXAtom-ORD">
2353 <mrow class="MJX-TeXAtom-ORD">
2372 <p class="p">In CSC format with one-based indexing, this is how it is stored.</p>
2373 <div class="tablenoborder">
2374 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="void" border="1" rules="all">
2375 <tbody class="tbody">
2377 <td class="entry" align="center" valign="top" rowspan="1" colspan="1">
2378 <p class="p d4p_eqn_block">
2379 <math xmlns="http://www.w3.org/1998/Math/MathML">
2380 <mtable columnalign="right center left" rowspacing=".5em" columnspacing="thickmathspace">
2383 <mtext>cscValA</mtext>
2389 <mfenced open="[" close="]">
2390 <mtable rowspacing="4pt" columnspacing="1em">
2426 <mtext>cscRowIndA</mtext>
2432 <mfenced open="[" close="]">
2433 <mtable rowspacing="4pt" columnspacing="1em">
2437 <mrow class="MJX-TeXAtom-ORD">
2445 <mrow class="MJX-TeXAtom-ORD">
2453 <mrow class="MJX-TeXAtom-ORD">
2461 <mrow class="MJX-TeXAtom-ORD">
2469 <mrow class="MJX-TeXAtom-ORD">
2477 <mrow class="MJX-TeXAtom-ORD">
2485 <mrow class="MJX-TeXAtom-ORD">
2493 <mrow class="MJX-TeXAtom-ORD">
2501 <mrow class="MJX-TeXAtom-ORD">
2514 <mtext>cscColPtrA</mtext>
2520 <mfenced open="[" close="]">
2521 <mtable rowspacing="4pt" columnspacing="1em">
2525 <mrow class="MJX-TeXAtom-ORD">
2533 <mrow class="MJX-TeXAtom-ORD">
2541 <mrow class="MJX-TeXAtom-ORD">
2549 <mrow class="MJX-TeXAtom-ORD">
2557 <mrow class="MJX-TeXAtom-ORD">
2565 <mrow class="MJX-TeXAtom-ORD">
<p class="p">Each pair of row and column indices appears only once.</p>
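<p class="p">The relationship between CSC and CSR noted in this section can be checked with a small host-side sketch. The helper names and the 4×5 matrix below are assumptions made for illustration and are not part of the CUSPARSE API. The sketch builds the CSC arrays of <samp class="ph codeph">A</samp> and the CSR arrays of its transpose independently, then compares them.</p>

```python
def dense_to_csc(dense, base=0):
    """Return (cscValA, cscRowIndA, cscColPtrA); hypothetical helper."""
    n_rows, n_cols = len(dense), len(dense[0])
    csc_val, csc_row_ind, csc_col_ptr = [], [], []
    for j in range(n_cols):
        # Index of the first stored element of column j.
        csc_col_ptr.append(len(csc_val) + base)
        for i in range(n_rows):  # ascending i keeps rows sorted per column
            if dense[i][j] != 0:
                csc_val.append(dense[i][j])
                csc_row_ind.append(i + base)
    csc_col_ptr.append(len(csc_val) + base)  # nnz + cscColPtrA(0)
    return csc_val, csc_row_ind, csc_col_ptr


def dense_to_csr(dense, base=0):
    """Return (csrValA, csrColIndA, csrRowPtrA); hypothetical helper."""
    csr_val, csr_col_ind, csr_row_ptr = [], [], []
    for row in dense:
        csr_row_ptr.append(len(csr_val) + base)
        for j, value in enumerate(row):
            if value != 0:
                csr_val.append(value)
                csr_col_ind.append(j + base)
    csr_row_ptr.append(len(csr_val) + base)
    return csr_val, csr_col_ind, csr_row_ptr


# Illustrative 4x5 matrix (an assumption chosen for this sketch).
A = [
    [1.0, 4.0, 0.0, 0.0, 0.0],
    [0.0, 2.0, 3.0, 0.0, 0.0],
    [5.0, 0.0, 0.0, 7.0, 8.0],
    [0.0, 0.0, 9.0, 0.0, 6.0],
]
A_T = [list(col) for col in zip(*A)]  # 5x4 transpose

csc_of_A = dense_to_csc(A)
csr_of_A_T = dense_to_csr(A_T)
print(csc_of_A[2])             # [0, 2, 4, 6, 7, 9]
print(csc_of_A == csr_of_A_T)  # True
```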
2587 <div class="topic concept nested2" id="ellpack-itpack-format-ell"><a name="ellpack-itpack-format-ell" shape="rect">
2588 <!-- --></a><h3 class="title topictitle2"><a href="#ellpack-itpack-format-ell" name="ellpack-itpack-format-ell" shape="rect">3.3.5. Ellpack-Itpack Format (ELL)</a></h3>
2589 <div class="body conbody">
2590 <p class="p">An <samp class="ph codeph">m×n</samp> sparse matrix <samp class="ph codeph">A</samp> with at most
2591 <samp class="ph codeph">k</samp> nonzero elements per row is stored in the Ellpack-Itpack (ELL) format
2592 [2] using two dense arrays of dimension <samp class="ph codeph">m×k</samp>. The first data
2593 array contains the values of the nonzero elements in the matrix, while the second integer
2594 array contains the corresponding column indices.
2596 <p class="p">For example, consider the
2598 <math xmlns="http://www.w3.org/1998/Math/MathML">
2603 matrix <samp class="ph codeph">A</samp>.
2605 <div class="tablenoborder">
2606 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="void" border="1" rules="all">
2607 <tbody class="tbody">
2609 <td class="entry" align="center" valign="top" rowspan="1" colspan="1">
2610 <p class="p d4p_eqn_block">
2611 <math xmlns="http://www.w3.org/1998/Math/MathML">
2612 <mfenced open="[" close="]">
2613 <mtable rowspacing="4pt" columnspacing="1em">
2691 <p class="p">This is how it is stored in ELL format with zero-based indexing.</p>
2692 <div class="tablenoborder">
2693 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="void" border="1" rules="all">
2694 <tbody class="tbody">
2696 <td class="entry" align="center" valign="top" rowspan="1" colspan="1">
2697 <p class="p d4p_eqn_block">
2698 <math xmlns="http://www.w3.org/1998/Math/MathML">
2699 <mtable columnalign="right center left" rowspacing=".5em" columnspacing="thickmathspace">
2708 <mfenced open="[" close="]">
2709 <mtable rowspacing="4pt" columnspacing="1em">
2760 <mtext>indices</mtext>
2766 <mfenced open="[" close="]">
2767 <mtable rowspacing="4pt" columnspacing="1em">
2771 <mrow class="MJX-TeXAtom-ORD">
2779 <mrow class="MJX-TeXAtom-ORD">
2788 <mrow class="MJX-TeXAtom-ORD">
2798 <mrow class="MJX-TeXAtom-ORD">
2806 <mrow class="MJX-TeXAtom-ORD">
2815 <mrow class="MJX-TeXAtom-ORD">
2825 <mrow class="MJX-TeXAtom-ORD">
2833 <mrow class="MJX-TeXAtom-ORD">
2840 <mrow class="MJX-TeXAtom-ORD">
2846 <mrow class="MJX-TeXAtom-ORD">
2856 <mrow class="MJX-TeXAtom-ORD">
2864 <mrow class="MJX-TeXAtom-ORD">
2873 <mrow class="MJX-TeXAtom-ORD">
2892 <p class="p">It is stored this way in ELL format with one-based indexing.</p>
2893 <div class="tablenoborder">
2894 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="void" border="1" rules="all">
2895 <tbody class="tbody">
2897 <td class="entry" align="center" valign="top" rowspan="1" colspan="1">
2898 <p class="p d4p_eqn_block">
2899 <math xmlns="http://www.w3.org/1998/Math/MathML">
2900 <mtable columnalign="right center left" rowspacing=".5em" columnspacing="thickmathspace">
2909 <mfenced open="[" close="]">
2910 <mtable rowspacing="4pt" columnspacing="1em">
2961 <mtext>indices</mtext>
2967 <mfenced open="[" close="]">
2968 <mtable rowspacing="4pt" columnspacing="1em">
2972 <mrow class="MJX-TeXAtom-ORD">
2980 <mrow class="MJX-TeXAtom-ORD">
2989 <mrow class="MJX-TeXAtom-ORD">
2999 <mrow class="MJX-TeXAtom-ORD">
3007 <mrow class="MJX-TeXAtom-ORD">
3016 <mrow class="MJX-TeXAtom-ORD">
3026 <mrow class="MJX-TeXAtom-ORD">
3034 <mrow class="MJX-TeXAtom-ORD">
3041 <mrow class="MJX-TeXAtom-ORD">
3047 <mrow class="MJX-TeXAtom-ORD">
3057 <mrow class="MJX-TeXAtom-ORD">
3065 <mrow class="MJX-TeXAtom-ORD">
3074 <mrow class="MJX-TeXAtom-ORD">
<p class="p">Sparse matrices in ELL format are assumed to be stored in column-major format in memory.
Also, rows with fewer than <samp class="ph codeph">k</samp> nonzero elements are padded in the
<samp class="ph codeph">data</samp> and <samp class="ph codeph">indices</samp> arrays with zero and
3097 <math xmlns="http://www.w3.org/1998/Math/MathML">
3103 <p class="p">The ELL format is not supported directly, but it is used to store the regular part of the
3104 matrix in the HYB format that is described in the next section.
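<p class="p">As a rough illustration of the ELL layout described in this section, the following host-side sketch packs a dense matrix into the two <samp class="ph codeph">m×k</samp> arrays, flattened in column-major order. The helper name, the 4×5 matrix, and the use of <samp class="ph codeph">-1</samp> as the padding index are assumptions made for this example.</p>

```python
def dense_to_ell(dense, base=0, pad_index=-1):
    """Return (k, data, indices) with both arrays flattened column-major.

    Hypothetical helper; `pad_index` is an assumed padding marker.
    """
    m = len(dense)
    rows = [[(j + base, v) for j, v in enumerate(row) if v != 0]
            for row in dense]
    k = max((len(entries) for entries in rows), default=0)
    data = [0.0] * (m * k)
    indices = [pad_index] * (m * k)
    for i, entries in enumerate(rows):
        for slot, (j, v) in enumerate(entries):
            # Column-major m-by-k layout: element (i, slot) is at slot*m + i.
            data[slot * m + i] = v
            indices[slot * m + i] = j
    return k, data, indices


# Illustrative 4x5 matrix (an assumption chosen for this sketch).
A = [
    [1.0, 4.0, 0.0, 0.0, 0.0],
    [0.0, 2.0, 3.0, 0.0, 0.0],
    [5.0, 0.0, 0.0, 7.0, 8.0],
    [0.0, 0.0, 9.0, 0.0, 6.0],
]

k, data, indices = dense_to_ell(A)
print(k)        # 3
print(indices)  # [0, 1, 0, 2, 1, 2, 3, 4, -1, -1, 4, -1]
```

<p class="p">Only the third row of this matrix has three nonzero elements, so every other row carries one padded slot.</p>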
3108 <div class="topic concept nested2" id="hybrid-format-hyb"><a name="hybrid-format-hyb" shape="rect">
3109 <!-- --></a><h3 class="title topictitle2"><a href="#hybrid-format-hyb" name="hybrid-format-hyb" shape="rect">3.3.6. Hybrid Format (HYB)</a></h3>
3110 <div class="body conbody">
3111 <p class="p">The HYB sparse storage format is composed of a regular part, usually stored in ELL format,
3112 and an irregular part, usually stored in COO format [1]. The ELL and COO parts are
3113 always stored using zero-based indexing. HYB is implemented as an opaque data format that
3114 requires the use of a conversion operation to store a matrix in it. The conversion operation
3115 partitions the general matrix into the regular and irregular parts automatically or according
3116 to developer-specified criteria.
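<p class="p">The idea of splitting a matrix into a regular and an irregular part can be sketched conceptually as follows. Because HYB is opaque, this does not reproduce the library's internal layout or its partitioning rule. The function name, the <samp class="ph codeph">ell_width</samp> threshold, and the 4×5 matrix are assumptions made for illustration: up to <samp class="ph codeph">ell_width</samp> entries per row go to the regular part, and any remaining entries go to a COO-style list.</p>

```python
def split_regular_irregular(dense, ell_width):
    """Split nonzeros into per-row regular slots and COO-style leftovers.

    Hypothetical helper; indices are zero-based in both parts.
    """
    regular_rows, irregular_coo = [], []
    for i, row in enumerate(dense):
        entries = [(j, v) for j, v in enumerate(row) if v != 0]
        # The first ell_width entries of each row form the regular part.
        regular_rows.append(entries[:ell_width])
        # Anything beyond that is kept as (row, column, value) triples.
        irregular_coo.extend((i, j, v) for j, v in entries[ell_width:])
    return regular_rows, irregular_coo


# Illustrative 4x5 matrix (an assumption chosen for this sketch).
A = [
    [1.0, 4.0, 0.0, 0.0, 0.0],
    [0.0, 2.0, 3.0, 0.0, 0.0],
    [5.0, 0.0, 0.0, 7.0, 8.0],
    [0.0, 0.0, 9.0, 0.0, 6.0],
]

regular, irregular = split_regular_irregular(A, ell_width=2)
print(irregular)  # [(2, 4, 8.0)]
```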
<p class="p">For more information, see the description of the
<samp class="ph codeph">cusparseHybPartition_t</samp> type and of the conversion
routines <samp class="ph codeph">dense2hyb</samp>, <samp class="ph codeph">csc2hyb</samp>, and <samp class="ph codeph">csr2hyb</samp>.
3124 <div class="topic concept nested2" id="block-compressed-sparse-row-format-bsr"><a name="block-compressed-sparse-row-format-bsr" shape="rect">
3125 <!-- --></a><h3 class="title topictitle2"><a href="#block-compressed-sparse-row-format-bsr" name="block-compressed-sparse-row-format-bsr" shape="rect">3.3.7. Block Compressed Sparse Row Format (BSR)</a></h3>
3126 <div class="body conbody">
<p class="p">The only difference between the CSR and BSR formats is the format of the storage element. The
former stores primitive data types (<samp class="ph codeph">float</samp>, <samp class="ph codeph">double</samp>,
<samp class="ph codeph">cuComplex</samp>, and <samp class="ph codeph">cuDoubleComplex</samp>), whereas the latter stores
a two-dimensional square block of primitive data types. The dimension of the square block is
3132 <math xmlns="http://www.w3.org/1998/Math/MathML">
3142 . The <samp class="ph codeph">m×n</samp> sparse matrix <samp class="ph codeph">A</samp> is equivalent
3143 to a block sparse matrix
3145 <math xmlns="http://www.w3.org/1998/Math/MathML">
3153 <math xmlns="http://www.w3.org/1998/Math/MathML">
3186 <math xmlns="http://www.w3.org/1998/Math/MathML">
3219 <math xmlns="http://www.w3.org/1998/Math/MathML">
3224 <math xmlns="http://www.w3.org/1998/Math/MathML">
3229 <math xmlns="http://www.w3.org/1998/Math/MathML">
3239 , then zeros are filled into
3241 <math xmlns="http://www.w3.org/1998/Math/MathML">
3249 <p class="p"><samp class="ph codeph">A</samp> is represented in BSR format by the following parameters.
3251 <div class="tablenoborder">
3252 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all">
3253 <tbody class="tbody">
3255 <td class="entry" valign="top" width="20%" rowspan="1" colspan="1"><samp class="ph codeph">blockDim</samp></td>
3256 <td class="entry" valign="top" width="10%" rowspan="1" colspan="1">(integer)</td>
3257 <td class="entry" valign="top" width="70%" rowspan="1" colspan="1">Block dimension of matrix <samp class="ph codeph">A</samp>.
3261 <td class="entry" valign="top" width="20%" rowspan="1" colspan="1"><samp class="ph codeph">mb</samp></td>
3262 <td class="entry" valign="top" width="10%" rowspan="1" colspan="1">(integer)</td>
3263 <td class="entry" valign="top" width="70%" rowspan="1" colspan="1">The number of block rows of <samp class="ph codeph">A</samp>.
3267 <td class="entry" valign="top" width="20%" rowspan="1" colspan="1"><samp class="ph codeph">nb</samp></td>
3268 <td class="entry" valign="top" width="10%" rowspan="1" colspan="1">(integer)</td>
3269 <td class="entry" valign="top" width="70%" rowspan="1" colspan="1">The number of block columns of <samp class="ph codeph">A</samp>.
3273 <td class="entry" valign="top" width="20%" rowspan="1" colspan="1"><samp class="ph codeph">nnzb</samp></td>
3274 <td class="entry" valign="top" width="10%" rowspan="1" colspan="1">(integer)</td>
3275 <td class="entry" valign="top" width="70%" rowspan="1" colspan="1">The number of nonzero blocks in the matrix.</td>
3278 <td class="entry" valign="top" width="20%" rowspan="1" colspan="1"><samp class="ph codeph">bsrValA</samp></td>
3279 <td class="entry" valign="top" width="10%" rowspan="1" colspan="1">(pointer)</td>
3280 <td class="entry" valign="top" width="70%" rowspan="1" colspan="1">Points to the data array of length
3282 <math xmlns="http://www.w3.org/1998/Math/MathML">
3300 that holds all elements of nonzero blocks of <samp class="ph codeph">A</samp>. The
3301 block elements are stored in either column-major order or row-major order.
3305 <td class="entry" valign="top" width="20%" rowspan="1" colspan="1"><samp class="ph codeph">bsrRowPtrA</samp></td>
3306 <td class="entry" valign="top" width="10%" rowspan="1" colspan="1">(pointer)</td>
3307 <td class="entry" valign="top" width="70%" rowspan="1" colspan="1">Points to the integer array of length <samp class="ph codeph">mb+1</samp> that holds indices into the
3308 arrays <samp class="ph codeph">bsrColIndA</samp> and <samp class="ph codeph">bsrValA</samp>. The first
3309 <samp class="ph codeph">mb</samp> entries of this array contain the indices of the first nonzero
3310 block in the <samp class="ph codeph">i</samp>th block row for <samp class="ph codeph">i=1,...,mb</samp>, while the
3311 last entry contains <samp class="ph codeph">nnzb+bsrRowPtrA(0)</samp>. In general,
3312 <samp class="ph codeph">bsrRowPtrA(0)</samp> is <samp class="ph codeph">0</samp> or <samp class="ph codeph">1</samp> for zero-
3313 and one-based indexing, respectively.
3317 <td class="entry" valign="top" width="20%" rowspan="1" colspan="1"><samp class="ph codeph">bsrColIndA</samp></td>
3318 <td class="entry" valign="top" width="10%" rowspan="1" colspan="1">(pointer)</td>
3319 <td class="entry" valign="top" width="70%" rowspan="1" colspan="1">Points to the integer array of length <samp class="ph codeph">nnzb</samp> that contains the column indices of the corresponding blocks in array <samp class="ph codeph">bsrValA</samp>.
3325 <p class="p">As with CSR format, (row, column) indices of BSR are stored in row-major order. The index arrays
3326 are first sorted by row indices and then within the same row by column indices.
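<p class="p">The BSR parameters listed above can be illustrated with a host-side sketch that pads a dense matrix to whole blocks and keeps only the nonzero blocks. The helper name, its arguments, and the 4×5 matrix are assumptions made for this example; it is not a CUSPARSE conversion routine.</p>

```python
def dense_to_bsr(dense, block_dim, base=0, column_major=True):
    """Return (mb, nb, nnzb, bsrValA, bsrRowPtrA, bsrColIndA); hypothetical helper."""
    m, n = len(dense), len(dense[0])
    mb = (m + block_dim - 1) // block_dim  # number of block rows
    nb = (n + block_dim - 1) // block_dim  # number of block columns

    def at(i, j):
        # Positions outside A belong to the zero padding of the block matrix.
        return dense[i][j] if i < m and j < n else 0.0

    bsr_val, bsr_row_ptr, bsr_col_ind = [], [], []
    nnzb = 0
    for bi in range(mb):
        bsr_row_ptr.append(nnzb + base)
        for bj in range(nb):
            block = [[at(bi * block_dim + r, bj * block_dim + c)
                      for c in range(block_dim)]
                     for r in range(block_dim)]
            if any(v != 0 for line in block for v in line):
                if column_major:
                    block = [list(col) for col in zip(*block)]
                bsr_val.extend(v for line in block for v in line)
                bsr_col_ind.append(bj + base)
                nnzb += 1
    bsr_row_ptr.append(nnzb + base)  # nnzb + bsrRowPtrA(0)
    return mb, nb, nnzb, bsr_val, bsr_row_ptr, bsr_col_ind


# Illustrative 4x5 matrix (an assumption chosen for this sketch).
A = [
    [1.0, 4.0, 0.0, 0.0, 0.0],
    [0.0, 2.0, 3.0, 0.0, 0.0],
    [5.0, 0.0, 0.0, 7.0, 8.0],
    [0.0, 0.0, 9.0, 0.0, 6.0],
]

mb, nb, nnzb, val, row_ptr, col_ind = dense_to_bsr(A, block_dim=2)
print(mb, nb, nnzb)  # 2 3 5
print(row_ptr)       # [0, 2, 5]
print(col_ind)       # [0, 1, 0, 1, 2]
```

<p class="p">Passing <samp class="ph codeph">column_major=False</samp> changes only the order of the elements inside each block of the value array; the two index arrays stay the same.</p>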
3328 <p class="p">For example, consider again the <samp class="ph codeph">4×5</samp> matrix <samp class="ph codeph">A</samp>.
3330 <div class="tablenoborder">
3331 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="void" border="1" rules="all">
3332 <tbody class="tbody">
3334 <td class="entry" align="center" valign="top" rowspan="1" colspan="1">
3335 <p class="p d4p_eqn_block">
3336 <math xmlns="http://www.w3.org/1998/Math/MathML">
3337 <mfenced open="[" close="]">
3338 <mtable rowspacing="4pt" columnspacing="1em">
3418 <math xmlns="http://www.w3.org/1998/Math/MathML">
3430 <math xmlns="http://www.w3.org/1998/Math/MathML">
3436 <math xmlns="http://www.w3.org/1998/Math/MathML">
3440 is 3, and matrix <samp class="ph codeph">A</samp> is split into <samp class="ph codeph">2×3</samp>
3443 <math xmlns="http://www.w3.org/1998/Math/MathML">
3451 <math xmlns="http://www.w3.org/1998/Math/MathML">
3457 is <samp class="ph codeph">4×6</samp>, slightly bigger than matrix
3459 <math xmlns="http://www.w3.org/1998/Math/MathML">
3462 , so zeros are filled in the last column of
3464 <math xmlns="http://www.w3.org/1998/Math/MathML">
3470 . The element-wise view of
3472 <math xmlns="http://www.w3.org/1998/Math/MathML">
3480 <div class="tablenoborder">
3481 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="void" border="1" rules="all">
3482 <tbody class="tbody">
3484 <td class="entry" align="center" valign="top" rowspan="1" colspan="1">
3485 <p class="p d4p_eqn_block">
3486 <math xmlns="http://www.w3.org/1998/Math/MathML">
3487 <mfenced open="[" close="]">
3488 <mtable rowspacing="4pt" columnspacing="1em">
3578 <p class="p">Based on zero-based indexing, the block-wise view of
3580 <math xmlns="http://www.w3.org/1998/Math/MathML">
3586 can be represented as follows.
3588 <div class="tablenoborder">
3589 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="void" border="1" rules="all">
3590 <tbody class="tbody">
3592 <td class="entry" align="center" valign="top" rowspan="1" colspan="1">
3593 <math xmlns="http://www.w3.org/1998/Math/MathML">
3596 <mrow class="MJX-TeXAtom-ORD">
3601 <mfenced open="[" close="]">
3602 <mtable rowspacing="4pt" columnspacing="1em">
3607 <mrow class="MJX-TeXAtom-ORD">
3615 <mrow class="MJX-TeXAtom-ORD">
3623 <mrow class="MJX-TeXAtom-ORD">
3633 <mrow class="MJX-TeXAtom-ORD">
3641 <mrow class="MJX-TeXAtom-ORD">
3649 <mrow class="MJX-TeXAtom-ORD">
3663 <p class="p">The basic element of BSR is a nonzero
3665 <math xmlns="http://www.w3.org/1998/Math/MathML">
3668 <mrow class="MJX-TeXAtom-ORD">
3674 block, one that contains at least one nonzero element of <samp class="ph codeph">A</samp>.
3675 Five of six blocks are nonzero in
3677 <math xmlns="http://www.w3.org/1998/Math/MathML">
3685 <div class="tablenoborder">
3686 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="void" border="1" rules="all">
3687 <tbody class="tbody">
3689 <td class="entry" align="center" valign="top" rowspan="1" colspan="1">
3690 <math xmlns="http://www.w3.org/1998/Math/MathML">
3693 <mrow class="MJX-TeXAtom-ORD">
3698 <mfenced open="[" close="]">
3699 <mtable rowspacing="4pt" columnspacing="1em">
3721 <mrow class="MJX-TeXAtom-ORD">
3726 <mfenced open="[" close="]">
3727 <mtable rowspacing="4pt" columnspacing="1em">
3749 <mrow class="MJX-TeXAtom-ORD">
3754 <mfenced open="[" close="]">
3755 <mtable rowspacing="4pt" columnspacing="1em">
3777 <mrow class="MJX-TeXAtom-ORD">
3782 <mfenced open="[" close="]">
3783 <mtable rowspacing="4pt" columnspacing="1em">
3805 <mrow class="MJX-TeXAtom-ORD">
3810 <mfenced open="[" close="]">
3811 <mtable rowspacing="4pt" columnspacing="1em">
3836 <p class="p">BSR format only stores the information of nonzero blocks, including block indices
3838 <math xmlns="http://www.w3.org/1998/Math/MathML">
3839 <mo stretchy="false">(</mo>
3843 <mo stretchy="false">)</mo>
3847 <math xmlns="http://www.w3.org/1998/Math/MathML">
3850 <mrow class="MJX-TeXAtom-ORD">
. Also, row indices are compressed as in CSR format.
3858 <div class="tablenoborder">
3859 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="void" border="1" rules="all">
3860 <tbody class="tbody">
3862 <td class="entry" align="center" valign="top" rowspan="1" colspan="1">
3863 <p class="p d4p_eqn_block">
3864 <math xmlns="http://www.w3.org/1998/Math/MathML">
3865 <mtable columnalign="right center left" rowspacing=".5em" columnspacing="thickmathspace">
3868 <mtext>bsrValA</mtext>
3874 <mfenced open="[" close="]">
3875 <mtable rowspacing="4pt" columnspacing="0.8em">
3880 <mrow class="MJX-TeXAtom-ORD">
3888 <mrow class="MJX-TeXAtom-ORD">
3896 <mrow class="MJX-TeXAtom-ORD">
3904 <mrow class="MJX-TeXAtom-ORD">
3912 <mrow class="MJX-TeXAtom-ORD">
3924 <mtext>bsrRowPtrA</mtext>
3930 <mfenced open="[" close="]">
3931 <mtable rowspacing="4pt" columnspacing="1em">
3935 <mrow class="MJX-TeXAtom-ORD">
3943 <mrow class="MJX-TeXAtom-ORD">
3959 <mtext>bsrColIndA</mtext>
3965 <mfenced open="[" close="]">
3966 <mtable rowspacing="4pt" columnspacing="1em">
3970 <mrow class="MJX-TeXAtom-ORD">
3978 <mrow class="MJX-TeXAtom-ORD">
3986 <mrow class="MJX-TeXAtom-ORD">
3994 <mrow class="MJX-TeXAtom-ORD">
<p class="p">There are two ways to arrange the data elements of block
4019 <math xmlns="http://www.w3.org/1998/Math/MathML">
4022 <mrow class="MJX-TeXAtom-ORD">
4028 : row-major order and column-major order. Under column-major order, the physical
4029 storage of <samp class="ph codeph">bsrValA</samp> is this.
4031 <div class="tablenoborder">
4032 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="void" border="1" rules="all">
4033 <tbody class="tbody">
4035 <td class="entry" align="center" valign="top" rowspan="1" colspan="1">
4036 <math xmlns="http://www.w3.org/1998/Math/MathML">
4037 <mtable columnalign="center center center center center" rowspacing="4pt" columnspacing="1em" columnlines="solid solid solid solid">
4048 <mo stretchy="false">[</mo>
4050 <mrow class="MJX-TeXAtom-ORD">
4056 <mrow class="MJX-TeXAtom-ORD">
4062 <mrow class="MJX-TeXAtom-ORD">
4068 <mrow class="MJX-TeXAtom-ORD">
4076 <mrow class="MJX-TeXAtom-ORD">
4082 <mrow class="MJX-TeXAtom-ORD">
4088 <mrow class="MJX-TeXAtom-ORD">
4094 <mrow class="MJX-TeXAtom-ORD">
4102 <mrow class="MJX-TeXAtom-ORD">
4108 <mrow class="MJX-TeXAtom-ORD">
4114 <mrow class="MJX-TeXAtom-ORD">
4120 <mrow class="MJX-TeXAtom-ORD">
4128 <mrow class="MJX-TeXAtom-ORD">
4134 <mrow class="MJX-TeXAtom-ORD">
4140 <mrow class="MJX-TeXAtom-ORD">
4146 <mrow class="MJX-TeXAtom-ORD">
4154 <mrow class="MJX-TeXAtom-ORD">
4160 <mrow class="MJX-TeXAtom-ORD">
4166 <mrow class="MJX-TeXAtom-ORD">
4172 <mrow class="MJX-TeXAtom-ORD">
4177 <mo stretchy="false">]</mo>
4187 <p class="p">Under row-major order, the physical storage of <samp class="ph codeph">bsrValA</samp> is this.
4189 <div class="tablenoborder">
4190 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="void" border="1" rules="all">
4191 <tbody class="tbody">
4193 <td class="entry" align="center" valign="top" rowspan="1" colspan="1">
4194 <math xmlns="http://www.w3.org/1998/Math/MathML">
4195 <mtable columnalign="center center center center center" rowspacing="4pt" columnspacing="1em" columnlines="solid solid solid solid">
4206 <mo stretchy="false">[</mo>
4208 <mrow class="MJX-TeXAtom-ORD">
4214 <mrow class="MJX-TeXAtom-ORD">
4220 <mrow class="MJX-TeXAtom-ORD">
4226 <mrow class="MJX-TeXAtom-ORD">
4234 <mrow class="MJX-TeXAtom-ORD">
4240 <mrow class="MJX-TeXAtom-ORD">
4246 <mrow class="MJX-TeXAtom-ORD">
4252 <mrow class="MJX-TeXAtom-ORD">
4260 <mrow class="MJX-TeXAtom-ORD">
4266 <mrow class="MJX-TeXAtom-ORD">
4272 <mrow class="MJX-TeXAtom-ORD">
4278 <mrow class="MJX-TeXAtom-ORD">
4286 <mrow class="MJX-TeXAtom-ORD">
4292 <mrow class="MJX-TeXAtom-ORD">
4298 <mrow class="MJX-TeXAtom-ORD">
4304 <mrow class="MJX-TeXAtom-ORD">
4312 <mrow class="MJX-TeXAtom-ORD">
4318 <mrow class="MJX-TeXAtom-ORD">
4324 <mrow class="MJX-TeXAtom-ORD">
4330 <mrow class="MJX-TeXAtom-ORD">
4335 <mo stretchy="false">]</mo>
4345 <p class="p">Similarly, in BSR format with one-based indexing and column-major order, <samp class="ph codeph">A</samp> can
4346 be represented by the following.
4348 <div class="tablenoborder">
4349 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="void" border="1" rules="all">
4350 <tbody class="tbody">
4352 <td class="entry" align="center" valign="top" rowspan="1" colspan="1">
4353 <math xmlns="http://www.w3.org/1998/Math/MathML">
4356 <mrow class="MJX-TeXAtom-ORD">
4361 <mfenced open="[" close="]">
4362 <mtable rowspacing="4pt" columnspacing="1em">
4367 <mrow class="MJX-TeXAtom-ORD">
4375 <mrow class="MJX-TeXAtom-ORD">
4383 <mrow class="MJX-TeXAtom-ORD">
4393 <mrow class="MJX-TeXAtom-ORD">
4401 <mrow class="MJX-TeXAtom-ORD">
4409 <mrow class="MJX-TeXAtom-ORD">
4423 <div class="tablenoborder">
4424 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="void" border="1" rules="all">
4425 <tbody class="tbody">
4427 <td class="entry" align="center" valign="top" rowspan="1" colspan="1">
4428 <math xmlns="http://www.w3.org/1998/Math/MathML">
4429 <mtable columnalign="center center center center center" rowspacing="4pt" columnspacing="1em" columnlines="solid solid solid solid">
4440 <mo stretchy="false">[</mo>
4442 <mrow class="MJX-TeXAtom-ORD">
4448 <mrow class="MJX-TeXAtom-ORD">
4454 <mrow class="MJX-TeXAtom-ORD">
4460 <mrow class="MJX-TeXAtom-ORD">
4468 <mrow class="MJX-TeXAtom-ORD">
4474 <mrow class="MJX-TeXAtom-ORD">
4480 <mrow class="MJX-TeXAtom-ORD">
4486 <mrow class="MJX-TeXAtom-ORD">
4494 <mrow class="MJX-TeXAtom-ORD">
4500 <mrow class="MJX-TeXAtom-ORD">
4506 <mrow class="MJX-TeXAtom-ORD">
4512 <mrow class="MJX-TeXAtom-ORD">
4520 <mrow class="MJX-TeXAtom-ORD">
4526 <mrow class="MJX-TeXAtom-ORD">
4532 <mrow class="MJX-TeXAtom-ORD">
4538 <mrow class="MJX-TeXAtom-ORD">
4546 <mrow class="MJX-TeXAtom-ORD">
4552 <mrow class="MJX-TeXAtom-ORD">
4558 <mrow class="MJX-TeXAtom-ORD">
4564 <mrow class="MJX-TeXAtom-ORD">
4569 <mo stretchy="false">]</mo>
4579 <div class="tablenoborder">
4580 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="void" border="1" rules="all">
4581 <tbody class="tbody">
4583 <td class="entry" align="center" valign="top" rowspan="1" colspan="1">
4584 <p class="p d4p_eqn_block">
4585 <math xmlns="http://www.w3.org/1998/Math/MathML">
4586 <mtable columnalign="right center left" rowspacing=".5em" columnspacing="thickmathspace">
4589 <mtext>bsrRowPtrA</mtext>
4595 <mfenced open="[" close="]">
4596 <mtable rowspacing="4pt" columnspacing="1em">
4600 <mrow class="MJX-TeXAtom-ORD">
4608 <mrow class="MJX-TeXAtom-ORD">
4624 <mtext>bsrColIndA</mtext>
4630 <mfenced open="[" close="]">
4631 <mtable rowspacing="4pt" columnspacing="1em">
4635 <mrow class="MJX-TeXAtom-ORD">
4643 <mrow class="MJX-TeXAtom-ORD">
4651 <mrow class="MJX-TeXAtom-ORD">
4659 <mrow class="MJX-TeXAtom-ORD">
4681 <div class="note note"><span class="notetitle">Note:</span> The storage format of blocks in BSR format can be column-major or row-major, independently
4682 of the base index. However, if the developer has BSR format from the Math Kernel Library (MKL)
4683 and wants to directly copy it to BSR in CUSPARSE, then <samp class="ph codeph">cusparseDirection_t</samp> is
4684 <samp class="ph codeph">CUSPARSE_DIRECTION_COLUMN</samp> if the base index is one; otherwise,
4685 <samp class="ph codeph">cusparseDirection_t</samp> is <samp class="ph codeph">CUSPARSE_DIRECTION_ROW</samp>.
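<p class="p">As a rough illustration of the two block layouts, the sketch below computes where a single block element lives inside <samp class="ph codeph">bsrValA</samp>, assuming each stored block occupies <samp class="ph codeph">blockDim*blockDim</samp> consecutive entries. The helper <samp class="ph codeph">bsr_element_offset</samp> and the enum <samp class="ph codeph">block_layout_t</samp> are hypothetical names used only for this example; they are not part of CUSPARSE, and all arguments are zero-based regardless of the base index of the matrix.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout tag for this sketch; CUSPARSE itself uses
 * cusparseDirection_t for the same purpose. */
typedef enum { BLOCK_ROW_MAJOR, BLOCK_COLUMN_MAJOR } block_layout_t;

/* Hypothetical helper, not a CUSPARSE routine: offset into bsrValA of
 * element (r, c) of the k-th stored block. k, r and c are zero-based. */
size_t bsr_element_offset(size_t k, size_t r, size_t c,
                          size_t blockDim, block_layout_t layout)
{
    assert(r < blockDim && c < blockDim);
    size_t base = k * blockDim * blockDim;   /* start of block k */
    return (layout == BLOCK_ROW_MAJOR)
               ? base + r * blockDim + c     /* rows are contiguous */
               : base + c * blockDim + r;    /* columns are contiguous */
}
```

<p class="p">With <samp class="ph codeph">blockDim = 2</samp>, element (0,1) of the first block sits at offset 1 in row-major order and at offset 2 in column-major order, which is why the direction must match the layout of the source data when arrays are copied unchanged.</p>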
4689 <div class="topic concept nested2" id="extended-bsr-format-bsrx"><a name="extended-bsr-format-bsrx" shape="rect">
4690 <!-- --></a><h3 class="title topictitle2"><a href="#extended-bsr-format-bsrx" name="extended-bsr-format-bsrx" shape="rect">3.3.8. Extended BSR Format (BSRX)</a></h3>
4691 <div class="body conbody">
4692 <p class="p">BSRX is the same as the BSR format, but the array <samp class="ph codeph">bsrRowPtrA</samp> is separated into
4693 two parts. The first nonzero block of each row is still specified by the array
4694 <samp class="ph codeph">bsrRowPtrA</samp>, which is the same as in BSR, but the position next to the last
nonzero block of each row is specified by the array <samp class="ph codeph">bsrEndPtrA</samp>. In short, BSRX
is a 4-vector variant of the BSR format.
4698 <p class="p">Matrix <samp class="ph codeph">A</samp> is represented in BSRX format by the following parameters.
4700 <div class="tablenoborder">
4701 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all">
4702 <tbody class="tbody">
4704 <td class="entry" valign="top" width="20%" rowspan="1" colspan="1"><samp class="ph codeph">blockDim</samp></td>
4705 <td class="entry" valign="top" width="10%" rowspan="1" colspan="1">(integer)</td>
4706 <td class="entry" valign="top" width="70%" rowspan="1" colspan="1">Block dimension of matrix <samp class="ph codeph">A</samp>.
4710 <td class="entry" valign="top" width="20%" rowspan="1" colspan="1"><samp class="ph codeph">mb</samp></td>
4711 <td class="entry" valign="top" width="10%" rowspan="1" colspan="1">(integer)</td>
4712 <td class="entry" valign="top" width="70%" rowspan="1" colspan="1">The number of block rows of <samp class="ph codeph">A</samp>.
4716 <td class="entry" valign="top" width="20%" rowspan="1" colspan="1"><samp class="ph codeph">nb</samp></td>
4717 <td class="entry" valign="top" width="10%" rowspan="1" colspan="1">(integer)</td>
4718 <td class="entry" valign="top" width="70%" rowspan="1" colspan="1">The number of block columns of <samp class="ph codeph">A</samp>.
4722 <td class="entry" valign="top" width="20%" rowspan="1" colspan="1"><samp class="ph codeph">nnzb</samp></td>
4723 <td class="entry" valign="top" width="10%" rowspan="1" colspan="1">(integer)</td>
4724 <td class="entry" valign="top" width="70%" rowspan="1" colspan="1">The size of <samp class="ph codeph">bsrColIndA</samp> and <samp class="ph codeph">bsrValA</samp>; <samp class="ph codeph">nnzb</samp>
4725 is greater than or equal to the number of nonzero blocks in the matrix
4726 <samp class="ph codeph">A</samp>.
4730 <td class="entry" valign="top" width="20%" rowspan="1" colspan="1"><samp class="ph codeph">bsrValA</samp></td>
4731 <td class="entry" valign="top" width="10%" rowspan="1" colspan="1">(pointer)</td>
4732 <td class="entry" valign="top" width="70%" rowspan="1" colspan="1">Points to the data array of length
4734 <math xmlns="http://www.w3.org/1998/Math/MathML">
4752 that holds all the elements of the nonzero blocks of <samp class="ph codeph">A</samp>.
4753 The block elements are stored in either column-major order or row-major order.
4757 <td class="entry" valign="top" width="20%" rowspan="1" colspan="1"><samp class="ph codeph">bsrRowPtrA</samp></td>
4758 <td class="entry" valign="top" width="10%" rowspan="1" colspan="1">(pointer)</td>
4759 <td class="entry" valign="top" width="70%" rowspan="1" colspan="1">Points to the integer array of length <samp class="ph codeph">mb</samp> that holds indices into the arrays
4760 <samp class="ph codeph">bsrColIndA</samp> and <samp class="ph codeph">bsrValA</samp>;
<samp class="ph codeph">bsrRowPtrA(i)</samp> is the position of the first nonzero block of the
4762 <samp class="ph codeph">i</samp>th block row in <samp class="ph codeph">bsrColIndA</samp> and
4763 <samp class="ph codeph">bsrValA</samp>.
4767 <td class="entry" valign="top" width="20%" rowspan="1" colspan="1"><samp class="ph codeph">bsrEndPtrA</samp></td>
4768 <td class="entry" valign="top" width="10%" rowspan="1" colspan="1">(pointer)</td>
4769 <td class="entry" valign="top" width="70%" rowspan="1" colspan="1">Points to the integer array of length <samp class="ph codeph">mb</samp> that holds indices into the arrays
4770 <samp class="ph codeph">bsrColIndA</samp> and <samp class="ph codeph">bsrValA</samp>;
<samp class="ph codeph">bsrEndPtrA(i)</samp> is the position next to the last nonzero block of the
4772 <samp class="ph codeph">i</samp>th block row in <samp class="ph codeph">bsrColIndA</samp> and
4773 <samp class="ph codeph">bsrValA</samp>.
4777 <td class="entry" valign="top" width="20%" rowspan="1" colspan="1"><samp class="ph codeph">bsrColIndA</samp></td>
4778 <td class="entry" valign="top" width="10%" rowspan="1" colspan="1">(pointer)</td>
4779 <td class="entry" valign="top" width="70%" rowspan="1" colspan="1">Points to the integer array of length <samp class="ph codeph">nnzb</samp> that contains the column indices of the corresponding blocks in array <samp class="ph codeph">bsrValA</samp>.
4785 <p class="p">A simple conversion between BSR and BSRX can be done as follows. Suppose the developer has a
4786 <samp class="ph codeph">2×3</samp> block sparse matrix
4788 <math xmlns="http://www.w3.org/1998/Math/MathML">
4794 represented as shown.
4796 <div class="tablenoborder">
4797 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="void" border="1" rules="all">
4798 <tbody class="tbody">
4800 <td class="entry" align="center" valign="top" rowspan="1" colspan="1">
4801 <math xmlns="http://www.w3.org/1998/Math/MathML">
4804 <mrow class="MJX-TeXAtom-ORD">
4809 <mfenced open="[" close="]">
4810 <mtable rowspacing="4pt" columnspacing="1em">
4815 <mrow class="MJX-TeXAtom-ORD">
4823 <mrow class="MJX-TeXAtom-ORD">
4831 <mrow class="MJX-TeXAtom-ORD">
4841 <mrow class="MJX-TeXAtom-ORD">
4849 <mrow class="MJX-TeXAtom-ORD">
4857 <mrow class="MJX-TeXAtom-ORD">
4871 <p class="p">Assume it has this BSR format.</p>
4872 <div class="tablenoborder">
4873 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="void" border="1" rules="all">
4874 <tbody class="tbody">
4876 <td class="entry" align="center" valign="top" rowspan="1" colspan="1">
4877 <p class="p d4p_eqn_block">
4878 <math xmlns="http://www.w3.org/1998/Math/MathML">
4879 <mtable columnalign="right center left" rowspacing=".5em" columnspacing="thickmathspace">
4882 <mtext>bsrValA of BSR</mtext>
4888 <mfenced open="[" close="]">
4889 <mtable rowspacing="4pt" columnspacing="0.8em">
4894 <mrow class="MJX-TeXAtom-ORD">
4902 <mrow class="MJX-TeXAtom-ORD">
4910 <mrow class="MJX-TeXAtom-ORD">
4918 <mrow class="MJX-TeXAtom-ORD">
4926 <mrow class="MJX-TeXAtom-ORD">
4938 <mtext>bsrRowPtrA of BSR</mtext>
4944 <mfenced open="[" close="]">
4945 <mtable rowspacing="4pt" columnspacing="1em">
4949 <mrow class="MJX-TeXAtom-ORD">
4957 <mrow class="MJX-TeXAtom-ORD">
4973 <mtext>bsrColIndA of BSR</mtext>
4979 <mfenced open="[" close="]">
4980 <mtable rowspacing="4pt" columnspacing="1em">
4984 <mrow class="MJX-TeXAtom-ORD">
4992 <mrow class="MJX-TeXAtom-ORD">
5000 <mrow class="MJX-TeXAtom-ORD">
5008 <mrow class="MJX-TeXAtom-ORD">
<p class="p">The <samp class="ph codeph">bsrRowPtrA</samp> of the BSRX format is simply the first two elements of the
<samp class="ph codeph">bsrRowPtrA</samp> of the BSR format. The <samp class="ph codeph">bsrEndPtrA</samp> of the BSRX format is
the last two elements of the <samp class="ph codeph">bsrRowPtrA</samp> of the BSR format.
5034 <div class="tablenoborder">
5035 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="void" border="1" rules="all">
5036 <tbody class="tbody">
5038 <td class="entry" align="center" valign="top" rowspan="1" colspan="1">
5039 <p class="p d4p_eqn_block">
5040 <math xmlns="http://www.w3.org/1998/Math/MathML">
5041 <mtable columnalign="right center left" rowspacing=".5em" columnspacing="thickmathspace">
5044 <mtext>bsrRowPtrA of BSRX</mtext>
5050 <mfenced open="[" close="]">
5051 <mtable rowspacing="4pt" columnspacing="1em">
5055 <mrow class="MJX-TeXAtom-ORD">
5071 <mtext>bsrEndPtrA of BSRX</mtext>
5077 <mfenced open="[" close="]">
5078 <mtable rowspacing="4pt" columnspacing="1em">
5082 <mrow class="MJX-TeXAtom-ORD">
5104 <p class="p">The power of the BSRX format is that the developer can specify a submatrix in the original BSR
5105 format by modifying <samp class="ph codeph">bsrRowPtrA</samp> and <samp class="ph codeph">bsrEndPtrA</samp> while keeping
5106 <samp class="ph codeph">bsrColIndA</samp> and <samp class="ph codeph">bsrValA</samp> unchanged.
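<p class="p">The split of the BSR row pointer into the two BSRX arrays can be sketched for a general number of block rows as follows. This is a minimal host-side illustration, assuming the BSR <samp class="ph codeph">bsrRowPtrA</samp> has <samp class="ph codeph">mb + 1</samp> entries; the function names <samp class="ph codeph">bsr_to_bsrx_ptrs</samp> and <samp class="ph codeph">bsrx_row_nnzb</samp> are hypothetical and do not exist in CUSPARSE.</p>

```c
#include <assert.h>

/* Hypothetical helper, not a CUSPARSE routine: derive the BSRX row arrays
 * (each of length mb) from a BSR row pointer of length mb + 1. */
void bsr_to_bsrx_ptrs(const int *bsrRowPtr, int mb,
                      int *bsrxRowPtr, int *bsrxEndPtr)
{
    assert(mb >= 0);
    for (int i = 0; i < mb; ++i) {
        bsrxRowPtr[i] = bsrRowPtr[i];      /* first block of block row i */
        bsrxEndPtr[i] = bsrRowPtr[i + 1];  /* one past the last block of row i */
    }
}

/* Number of blocks that block row i contributes in BSRX. The result does
 * not depend on the base index, because both arrays share the same base. */
int bsrx_row_nnzb(const int *bsrxRowPtr, const int *bsrxEndPtr, int i)
{
    return bsrxEndPtr[i] - bsrxRowPtr[i];
}
```

<p class="p">Lowering an entry of <samp class="ph codeph">bsrEndPtrA</samp> (or raising an entry of <samp class="ph codeph">bsrRowPtrA</samp>) afterwards hides blocks of that row without touching <samp class="ph codeph">bsrColIndA</samp> or <samp class="ph codeph">bsrValA</samp>, which is how a submatrix can be selected.</p>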
5108 <p class="p">For example, to create another block matrix
5110 <math xmlns="http://www.w3.org/1998/Math/MathML">
5111 <mrow class="MJX-TeXAtom-ORD">
5114 <mo stretchy="false">˜</mo>
5118 <mfenced open="[" close="]">
5119 <mtable rowspacing="4pt" columnspacing="1em">
5138 <mrow class="MJX-TeXAtom-ORD">
5150 that is slightly different from
5152 <math xmlns="http://www.w3.org/1998/Math/MathML">
5155 , the developer can keep <samp class="ph codeph">bsrColIndA</samp> and
5156 <samp class="ph codeph">bsrValA</samp>, but reconstruct
5158 <math xmlns="http://www.w3.org/1998/Math/MathML">
5159 <mrow class="MJX-TeXAtom-ORD">
5162 <mo stretchy="false">˜</mo>
by properly setting <samp class="ph codeph">bsrRowPtrA</samp> and
5167 <samp class="ph codeph">bsrEndPtrA</samp>. The following 4-vector characterizes
5169 <math xmlns="http://www.w3.org/1998/Math/MathML">
5170 <mrow class="MJX-TeXAtom-ORD">
5173 <mo stretchy="false">˜</mo>
5179 <div class="tablenoborder">
5180 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="void" border="1" rules="all">
5181 <tbody class="tbody">
5183 <td class="entry" align="center" valign="top" rowspan="1" colspan="1">
5184 <p class="p d4p_eqn_block">
5185 <math xmlns="http://www.w3.org/1998/Math/MathML">
5186 <mtable columnalign="right center left" rowspacing=".5em" columnspacing="thickmathspace">
5189 <mtext>bsrValA of </mtext>
5190 <mrow class="MJX-TeXAtom-ORD">
5193 <mo stretchy="false">˜</mo>
5201 <mfenced open="[" close="]">
5202 <mtable rowspacing="4pt" columnspacing="0.8em">
5207 <mrow class="MJX-TeXAtom-ORD">
5215 <mrow class="MJX-TeXAtom-ORD">
5223 <mrow class="MJX-TeXAtom-ORD">
5231 <mrow class="MJX-TeXAtom-ORD">
5239 <mrow class="MJX-TeXAtom-ORD">
5251 <mtext>bsrColIndA of </mtext>
5252 <mrow class="MJX-TeXAtom-ORD">
5255 <mo stretchy="false">˜</mo>
5263 <mfenced open="[" close="]">
5264 <mtable rowspacing="4pt" columnspacing="1em">
5268 <mrow class="MJX-TeXAtom-ORD">
5276 <mrow class="MJX-TeXAtom-ORD">
5284 <mrow class="MJX-TeXAtom-ORD">
5292 <mrow class="MJX-TeXAtom-ORD">
5308 <mtext>bsrRowPtrA of </mtext>
5309 <mrow class="MJX-TeXAtom-ORD">
5312 <mo stretchy="false">˜</mo>
5320 <mfenced open="[" close="]">
5321 <mtable rowspacing="4pt" columnspacing="1em">
5325 <mrow class="MJX-TeXAtom-ORD">
5341 <mtext>bsrEndPtrA of </mtext>
5342 <mrow class="MJX-TeXAtom-ORD">
5345 <mo stretchy="false">˜</mo>
5353 <mfenced open="[" close="]">
5354 <mtable rowspacing="4pt" columnspacing="1em">
5358 <mrow class="MJX-TeXAtom-ORD">
5384 <div class="topic concept nested0" id="cusparse-types-reference"><a name="cusparse-types-reference" shape="rect">
5385 <!-- --></a><h2 class="title topictitle1"><a href="#cusparse-types-reference" name="cusparse-types-reference" shape="rect">4. CUSPARSE Types Reference</a></h2>
5386 <div class="topic concept nested1" id="data-types"><a name="data-types" shape="rect">
5387 <!-- --></a><h3 class="title topictitle2"><a href="#data-types" name="data-types" shape="rect">4.1. Data types</a></h3>
5388 <div class="body conbody">
5389 <p class="p">The <samp class="ph codeph">float</samp>, <samp class="ph codeph">double</samp>, <samp class="ph codeph">cuComplex</samp>, and <samp class="ph codeph">cuDoubleComplex</samp> data types are supported. The first two are standard C data types, while the last two are exported from <samp class="ph codeph">cuComplex.h</samp>.
5393 <div class="topic concept nested1" id="cusparseactiont"><a name="cusparseactiont" shape="rect">
5394 <!-- --></a><h3 class="title topictitle2"><a href="#cusparseactiont" name="cusparseactiont" shape="rect">4.2. cusparseAction_t</a></h3>
5395 <div class="body conbody">
5396 <p class="p">This type indicates whether the operation is performed only on indices or on data and indices.</p>
5397 <div class="tablenoborder">
5398 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all">
5399 <thead class="thead" align="left">
5401 <th class="entry" valign="top" width="50%" id="d54e10236" rowspan="1" colspan="1">
5405 <th class="entry" valign="top" width="50%" id="d54e10239" rowspan="1" colspan="1">
5411 <tbody class="tbody">
5413 <td class="entry" valign="top" width="50%" headers="d54e10236" rowspan="1" colspan="1">
5414 <p class="p"><samp class="ph codeph">CUSPARSE_ACTION_SYMBOLIC</samp></p>
5416 <td class="entry" valign="top" width="50%" headers="d54e10239" rowspan="1" colspan="1">
5417 <p class="p">the operation is performed only on indices.</p>
5421 <td class="entry" valign="top" width="50%" headers="d54e10236" rowspan="1" colspan="1">
5422 <p class="p"><samp class="ph codeph">CUSPARSE_ACTION_NUMERIC</samp></p>
5424 <td class="entry" valign="top" width="50%" headers="d54e10239" rowspan="1" colspan="1">
5425 <p class="p">the operation is performed on data and indices.</p>
5433 <div class="topic concept nested1" id="cusparsedirectiont"><a name="cusparsedirectiont" shape="rect">
5434 <!-- --></a><h3 class="title topictitle2"><a href="#cusparsedirectiont" name="cusparsedirectiont" shape="rect">4.3. cusparseDirection_t</a></h3>
5435 <div class="body conbody">
5436 <p class="p">This type indicates whether the elements of a dense matrix should be parsed by rows or by columns (assuming column-major storage
in memory of the dense matrix) in function <samp class="ph codeph">cusparse[S|D|C|Z]nnz</samp>. In addition, the storage format of blocks in BSR format is also controlled by this type.
5440 <div class="tablenoborder">
5441 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all">
5442 <thead class="thead" align="left">
5444 <th class="entry" valign="top" width="50%" id="d54e10305" rowspan="1" colspan="1">
5448 <th class="entry" valign="top" width="50%" id="d54e10308" rowspan="1" colspan="1">
5454 <tbody class="tbody">
5456 <td class="entry" valign="top" width="50%" headers="d54e10305" rowspan="1" colspan="1">
5457 <p class="p"><samp class="ph codeph">CUSPARSE_DIRECTION_ROW</samp></p>
5459 <td class="entry" valign="top" width="50%" headers="d54e10308" rowspan="1" colspan="1">
5460 <p class="p">the matrix should be parsed by rows.</p>
5464 <td class="entry" valign="top" width="50%" headers="d54e10305" rowspan="1" colspan="1">
5465 <p class="p"><samp class="ph codeph">CUSPARSE_DIRECTION_COLUMN</samp></p>
5467 <td class="entry" valign="top" width="50%" headers="d54e10308" rowspan="1" colspan="1">
5468 <p class="p">the matrix should be parsed by columns.</p>
5476 <div class="topic concept nested1" id="cusparsehandlet"><a name="cusparsehandlet" shape="rect">
5477 <!-- --></a><h3 class="title topictitle2"><a href="#cusparsehandlet" name="cusparsehandlet" shape="rect">4.4. cusparseHandle_t</a></h3>
5478 <div class="body conbody">
<p class="p">This is a pointer type to an opaque CUSPARSE context, which the user must initialize by calling <samp class="ph codeph">cusparseCreate()</samp> prior to calling any other library function. The handle created and returned by <samp class="ph codeph">cusparseCreate()</samp> must be passed to every CUSPARSE function.
5483 <div class="topic concept nested1" id="cusparsehybmatt"><a name="cusparsehybmatt" shape="rect">
5484 <!-- --></a><h3 class="title topictitle2"><a href="#cusparsehybmatt" name="cusparsehybmatt" shape="rect">4.5. cusparseHybMat_t</a></h3>
5485 <div class="body conbody">
5486 <p class="p">This is a pointer type to an opaque structure holding the matrix in HYB format, which is created by <samp class="ph codeph">cusparseCreateHybMat</samp> and destroyed by <samp class="ph codeph">cusparseDestroyHybMat</samp>.
5489 <div class="topic concept nested2" id="cusparsehybpartitiont"><a name="cusparsehybpartitiont" shape="rect">
5490 <!-- --></a><h3 class="title topictitle2"><a href="#cusparsehybpartitiont" name="cusparsehybpartitiont" shape="rect">4.5.1. cusparseHybPartition_t</a></h3>
5491 <div class="body conbody">
5492 <p class="p">This type indicates how to perform the partitioning of the matrix into regular (ELL) and irregular (COO) parts of the HYB
5495 <p class="p">The partitioning is performed during the conversion of the matrix from a dense or sparse format into the HYB format and is
5496 governed by the following rules. When <samp class="ph codeph">CUSPARSE_HYB_PARTITION_AUTO</samp> is selected, the CUSPARSE library automatically decides how much data to put into the regular and irregular parts of the
HYB format. When <samp class="ph codeph">CUSPARSE_HYB_PARTITION_USER</samp> is selected, the width of the regular part of the HYB format should be specified by the caller. When <samp class="ph codeph">CUSPARSE_HYB_PARTITION_MAX</samp> is selected, the width of the regular part of the HYB format equals the maximum number of nonzero elements per row; in
other words, the entire matrix is stored in the regular part of the HYB format.
5500 <p class="p">The <em class="ph i">default</em> is to let the library automatically decide how to split the data.
5502 <div class="tablenoborder">
5503 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all">
5504 <thead class="thead" align="left">
5506 <th class="entry" valign="top" width="50%" id="d54e10429" rowspan="1" colspan="1">
5510 <th class="entry" valign="top" width="50%" id="d54e10432" rowspan="1" colspan="1">
5516 <tbody class="tbody">
5518 <td class="entry" valign="top" width="50%" headers="d54e10429" rowspan="1" colspan="1">
5519 <p class="p"><samp class="ph codeph">CUSPARSE_HYB_PARTITION_AUTO</samp></p>
5521 <td class="entry" valign="top" width="50%" headers="d54e10432" rowspan="1" colspan="1">
5522 <p class="p">the automatic partitioning is selected (<em class="ph i">default</em>).
5527 <td class="entry" valign="top" width="50%" headers="d54e10429" rowspan="1" colspan="1">
5528 <p class="p"><samp class="ph codeph">CUSPARSE_HYB_PARTITION_USER</samp></p>
5530 <td class="entry" valign="top" width="50%" headers="d54e10432" rowspan="1" colspan="1">
<p class="p">the user-specified threshold is used.</p>
5535 <td class="entry" valign="top" width="50%" headers="d54e10429" rowspan="1" colspan="1">
5536 <p class="p"><samp class="ph codeph">CUSPARSE_HYB_PARTITION_MAX</samp></p>
5538 <td class="entry" valign="top" width="50%" headers="d54e10432" rowspan="1" colspan="1">
5539 <p class="p">the data is stored in ELL format.</p>
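<p class="p">To make the partitioning rules more concrete, the sketch below works on a CSR row pointer on the host. It computes the width that a maximum-width partition would need, and counts how many entries would not fit into the regular part for a caller-chosen width. It assumes that entries beyond the chosen width in a row go to the irregular (COO) part, which is the usual definition of a hybrid ELL/COO layout; the heuristic used by the automatic mode is not modelled. Both function names are hypothetical and are not CUSPARSE routines.</p>

```c
#include <assert.h>

/* Hypothetical helper: width of the regular (ELL) part when every row must
 * fit, i.e. the largest number of stored entries in any row of a CSR matrix
 * with m rows. csrRowPtr has m + 1 entries. */
int max_nnz_per_row(const int *csrRowPtr, int m)
{
    assert(m >= 0);
    int width = 0;
    for (int i = 0; i < m; ++i) {
        int row_nnz = csrRowPtr[i + 1] - csrRowPtr[i];
        if (row_nnz > width)
            width = row_nnz;
    }
    return width;
}

/* Hypothetical helper: number of entries left over for the irregular (COO)
 * part when the regular part is limited to `width` entries per row.
 * Assumption: each row keeps at most `width` entries in the regular part. */
int coo_overflow(const int *csrRowPtr, int m, int width)
{
    assert(m >= 0 && width >= 0);
    int overflow = 0;
    for (int i = 0; i < m; ++i) {
        int row_nnz = csrRowPtr[i + 1] - csrRowPtr[i];
        if (row_nnz > width)
            overflow += row_nnz - width;
    }
    return overflow;
}
```

<p class="p">When the width returned by <samp class="ph codeph">max_nnz_per_row</samp> is used, <samp class="ph codeph">coo_overflow</samp> is zero, which matches the statement that the entire matrix is then stored in the regular part.</p>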
5548 <div class="topic concept nested1" id="cusparsematdescrt"><a name="cusparsematdescrt" shape="rect">
5549 <!-- --></a><h3 class="title topictitle2"><a href="#cusparsematdescrt" name="cusparsematdescrt" shape="rect">4.6. cusparseMatDescr_t</a></h3>
5550 <div class="body conbody">
<p class="p">This structure is used to describe the shape and properties of a matrix.</p><pre xml:space="preserve">typedef struct {
    cusparseMatrixType_t MatrixType;
    cusparseFillMode_t FillMode;
    cusparseDiagType_t DiagType;
    cusparseIndexBase_t IndexBase;
} cusparseMatDescr_t;</pre></div>
5557 <div class="topic concept nested2" id="cusparsediagtypet"><a name="cusparsediagtypet" shape="rect">
5558 <!-- --></a><h3 class="title topictitle2"><a href="#cusparsediagtypet" name="cusparsediagtypet" shape="rect">4.6.1. cusparseDiagType_t</a></h3>
5559 <div class="body conbody">
5560 <p class="p">This type indicates if the matrix diagonal entries are unity. The diagonal elements are always assumed to be present, but
5561 if <samp class="ph codeph">CUSPARSE_DIAG_TYPE_UNIT</samp> is passed to an API routine, then the routine will assume that all diagonal entries are unity and will not read or modify
5562 those entries. Note that in this case the routine assumes the diagonal entries are equal to one, regardless of what those
entries are actually set to in memory.
5565 <div class="tablenoborder">
5566 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all">
5567 <thead class="thead" align="left">
5569 <th class="entry" valign="top" width="50%" id="d54e10535" rowspan="1" colspan="1">
5573 <th class="entry" valign="top" width="50%" id="d54e10538" rowspan="1" colspan="1">
5579 <tbody class="tbody">
5581 <td class="entry" valign="top" width="50%" headers="d54e10535" rowspan="1" colspan="1">
5582 <p class="p"><samp class="ph codeph">CUSPARSE_DIAG_TYPE_NON_UNIT</samp></p>
5584 <td class="entry" valign="top" width="50%" headers="d54e10538" rowspan="1" colspan="1">
5585 <p class="p">the matrix diagonal has non-unit elements.</p>
5589 <td class="entry" valign="top" width="50%" headers="d54e10535" rowspan="1" colspan="1">
5590 <p class="p"><samp class="ph codeph">CUSPARSE_DIAG_TYPE_UNIT</samp></p>
5592 <td class="entry" valign="top" width="50%" headers="d54e10538" rowspan="1" colspan="1">
5593 <p class="p">the matrix diagonal has unit elements.</p>
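<p class="p">The effect of the unit-diagonal setting can be illustrated with a small forward substitution on a dense lower triangular matrix. This is only a stand-in for the behaviour described above, not the way CUSPARSE implements its sparse triangular solves; the function <samp class="ph codeph">lower_solve</samp> and its <samp class="ph codeph">unit_diag</samp> flag are hypothetical names for this example.</p>

```c
#include <assert.h>

/* Solve L * x = b by forward substitution, where L is an n x n lower
 * triangular matrix stored densely in row-major order.
 * If unit_diag is nonzero, the diagonal is treated as 1 and the stored
 * diagonal entries are never read, whatever values they hold. */
void lower_solve(const double *L, const double *b, double *x,
                 int n, int unit_diag)
{
    assert(n >= 0);
    for (int i = 0; i < n; ++i) {
        double sum = b[i];
        for (int j = 0; j < i; ++j)
            sum -= L[i * n + j] * x[j];   /* strictly lower part only */
        x[i] = unit_diag ? sum : sum / L[i * n + i];
    }
}
```

<p class="p">Calling the function twice on the same arrays, once with <samp class="ph codeph">unit_diag = 0</samp> and once with <samp class="ph codeph">unit_diag = 1</samp>, generally gives different solutions when the stored diagonal is not all ones, because the second call ignores the stored values.</p>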
5601 <div class="topic concept nested2" id="cusparsefillmodet"><a name="cusparsefillmodet" shape="rect">
5602 <!-- --></a><h3 class="title topictitle2"><a href="#cusparsefillmodet" name="cusparsefillmodet" shape="rect">4.6.2. cusparseFillMode_t</a></h3>
5603 <div class="body conbody">
5604 <p class="p">This type indicates if the lower or upper part of a matrix is stored in sparse storage.</p>
5605 <div class="tablenoborder">
5606 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all">
5607 <thead class="thead" align="left">
5609 <th class="entry" valign="top" width="50%" id="d54e10604" rowspan="1" colspan="1">
5613 <th class="entry" valign="top" width="50%" id="d54e10607" rowspan="1" colspan="1">
5619 <tbody class="tbody">
5621 <td class="entry" valign="top" width="50%" headers="d54e10604" rowspan="1" colspan="1">
5622 <p class="p"><samp class="ph codeph">CUSPARSE_FILL_MODE_LOWER</samp></p>
5624 <td class="entry" valign="top" width="50%" headers="d54e10607" rowspan="1" colspan="1">
5625 <p class="p">the lower triangular part is stored.</p>
5629 <td class="entry" valign="top" width="50%" headers="d54e10604" rowspan="1" colspan="1">
5630 <p class="p"><samp class="ph codeph">CUSPARSE_FILL_MODE_UPPER</samp></p>
5632 <td class="entry" valign="top" width="50%" headers="d54e10607" rowspan="1" colspan="1">
5633 <p class="p">the upper triangular part is stored.</p>
5641 <div class="topic concept nested2" id="cusparseindexbaset"><a name="cusparseindexbaset" shape="rect">
5642 <!-- --></a><h3 class="title topictitle2"><a href="#cusparseindexbaset" name="cusparseindexbaset" shape="rect">4.6.3. cusparseIndexBase_t</a></h3>
5643 <div class="body conbody">
5644 <p class="p">This type indicates if the base of the matrix indices is zero or one.</p>
5645 <div class="tablenoborder">
5646 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all">
5647 <thead class="thead" align="left">
5649 <th class="entry" valign="top" width="50%" id="d54e10673" rowspan="1" colspan="1">
5653 <th class="entry" valign="top" width="50%" id="d54e10676" rowspan="1" colspan="1">
5659 <tbody class="tbody">
5661 <td class="entry" valign="top" width="50%" headers="d54e10673" rowspan="1" colspan="1">
5662 <p class="p"><samp class="ph codeph">CUSPARSE_INDEX_BASE_ZERO</samp></p>
5664 <td class="entry" valign="top" width="50%" headers="d54e10676" rowspan="1" colspan="1">
5665 <p class="p">the base index is zero.</p>
5669 <td class="entry" valign="top" width="50%" headers="d54e10673" rowspan="1" colspan="1">
5670 <p class="p"><samp class="ph codeph">CUSPARSE_INDEX_BASE_ONE</samp></p>
5672 <td class="entry" valign="top" width="50%" headers="d54e10676" rowspan="1" colspan="1">
5673 <p class="p">the base index is one.</p>
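<p class="p">When index arrays come from a source that uses a different base, one option is to set the base in the matrix descriptor accordingly; another is to shift the arrays. The sketch below shows the second option on the host. The helper <samp class="ph codeph">shift_index_base</samp> is a hypothetical name, not a CUSPARSE routine, and it would have to be applied to both the row pointer array and the column index array so that they stay consistent.</p>

```c
#include <assert.h>

/* Hypothetical helper: rebase an index array in place, for example from
 * one-based (from_base = 1) to zero-based (to_base = 0). */
void shift_index_base(int *idx, int len, int from_base, int to_base)
{
    assert(len >= 0);
    assert(from_base == 0 || from_base == 1);
    assert(to_base == 0 || to_base == 1);
    int delta = to_base - from_base;   /* -1, 0 or +1 */
    for (int i = 0; i < len; ++i)
        idx[i] += delta;
}
```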
5681 <div class="topic concept nested2" id="cusparsematrixtypet"><a name="cusparsematrixtypet" shape="rect">
5682 <!-- --></a><h3 class="title topictitle2"><a href="#cusparsematrixtypet" name="cusparsematrixtypet" shape="rect">4.6.4. cusparseMatrixType_t</a></h3>
5683 <div class="body conbody">
5684 <p class="p">This type indicates the type of matrix stored in sparse storage. Notice that for symmetric, Hermitian and triangular matrices
5685 only their lower or upper part is assumed to be stored.
5687 <div class="tablenoborder">
5688 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all">
5689 <thead class="thead" align="left">
5691 <th class="entry" valign="top" width="50%" id="d54e10743" rowspan="1" colspan="1">
5695 <th class="entry" valign="top" width="50%" id="d54e10746" rowspan="1" colspan="1">
5701 <tbody class="tbody">
5703 <td class="entry" valign="top" width="50%" headers="d54e10743" rowspan="1" colspan="1">
5704 <p class="p"><samp class="ph codeph">CUSPARSE_MATRIX_TYPE_GENERAL</samp></p>
5706 <td class="entry" valign="top" width="50%" headers="d54e10746" rowspan="1" colspan="1">
5707 <p class="p">the matrix is general.</p>
5711 <td class="entry" valign="top" width="50%" headers="d54e10743" rowspan="1" colspan="1">
5712 <p class="p"><samp class="ph codeph">CUSPARSE_MATRIX_TYPE_SYMMETRIC</samp></p>
5714 <td class="entry" valign="top" width="50%" headers="d54e10746" rowspan="1" colspan="1">
5715 <p class="p">the matrix is symmetric.</p>
5719 <td class="entry" valign="top" width="50%" headers="d54e10743" rowspan="1" colspan="1">
5720 <p class="p"><samp class="ph codeph">CUSPARSE_MATRIX_TYPE_HERMITIAN</samp></p>
5722 <td class="entry" valign="top" width="50%" headers="d54e10746" rowspan="1" colspan="1">
5723 <p class="p">the matrix is Hermitian.</p>
5727 <td class="entry" valign="top" width="50%" headers="d54e10743" rowspan="1" colspan="1">
5728 <p class="p"><samp class="ph codeph">CUSPARSE_MATRIX_TYPE_TRIANGULAR</samp></p>
5730 <td class="entry" valign="top" width="50%" headers="d54e10746" rowspan="1" colspan="1">
5731 <p class="p">the matrix is triangular.</p>
5740 <div class="topic concept nested1" id="cusparseoperationt"><a name="cusparseoperationt" shape="rect">
5741 <!-- --></a><h3 class="title topictitle2"><a href="#cusparseoperationt" name="cusparseoperationt" shape="rect">4.7. cusparseOperation_t</a></h3>
5742 <div class="body conbody">
5743 <p class="p">This type indicates which operations need to be performed with the sparse matrix.</p>
5744 <div class="tablenoborder">
5745 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all">
5746 <thead class="thead" align="left">
5748 <th class="entry" valign="top" width="50%" id="d54e10844" rowspan="1" colspan="1">
5752 <th class="entry" valign="top" width="50%" id="d54e10847" rowspan="1" colspan="1">
5758 <tbody class="tbody">
5760 <td class="entry" valign="top" width="50%" headers="d54e10844" rowspan="1" colspan="1">
5761 <p class="p"><samp class="ph codeph">CUSPARSE_OPERATION_NON_TRANSPOSE</samp></p>
5763 <td class="entry" valign="top" width="50%" headers="d54e10847" rowspan="1" colspan="1">
5764 <p class="p">the non-transpose operation is selected.</p>
5768 <td class="entry" valign="top" width="50%" headers="d54e10844" rowspan="1" colspan="1">
5769 <p class="p"><samp class="ph codeph">CUSPARSE_OPERATION_TRANSPOSE</samp></p>
5771 <td class="entry" valign="top" width="50%" headers="d54e10847" rowspan="1" colspan="1">
5772 <p class="p">the transpose operation is selected.</p>
5776 <td class="entry" valign="top" width="50%" headers="d54e10844" rowspan="1" colspan="1">
5777 <p class="p"><samp class="ph codeph">CUSPARSE_OPERATION_CONJUGATE_TRANSPOSE</samp></p>
5779 <td class="entry" valign="top" width="50%" headers="d54e10847" rowspan="1" colspan="1">
5782 <p class="p">the conjugate transpose operation is selected.</p>
5790 <div class="topic concept nested1" id="cusparsepointermode_t"><a name="cusparsepointermode_t" shape="rect">
5791 <!-- --></a><h3 class="title topictitle2"><a href="#cusparsepointermode_t" name="cusparsepointermode_t" shape="rect">4.8. cusparsePointerMode_t</a></h3>
5792 <div class="body conbody">
5793 <p class="p">This type indicates whether the scalar values are passed by reference on the host or device. It is important to point out
5794 that if several scalar values are passed by reference in the function call, all of them will conform to the same single pointer
5795 mode. The pointer mode can be set and retrieved using <samp class="ph codeph">cusparseSetPointerMode()</samp> and <samp class="ph codeph">cusparseGetPointerMode()</samp> routines, respectively.
5797 <div class="tablenoborder">
5798 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all">
5799 <thead class="thead" align="left">
5801 <th class="entry" valign="top" width="50%" id="d54e10938" rowspan="1" colspan="1">
5805 <th class="entry" valign="top" width="50%" id="d54e10941" rowspan="1" colspan="1">
5811 <tbody class="tbody">
5813 <td class="entry" valign="top" width="50%" headers="d54e10938" rowspan="1" colspan="1">
5814 <p class="p"><samp class="ph codeph">CUSPARSE_POINTER_MODE_HOST</samp></p>
5816 <td class="entry" valign="top" width="50%" headers="d54e10941" rowspan="1" colspan="1">
5817 <p class="p">the scalars are passed by reference on the host.</p>
5821 <td class="entry" valign="top" width="50%" headers="d54e10938" rowspan="1" colspan="1">
5822 <p class="p"><samp class="ph codeph">CUSPARSE_POINTER_MODE_DEVICE</samp></p>
5824 <td class="entry" valign="top" width="50%" headers="d54e10941" rowspan="1" colspan="1">
5825 <p class="p">the scalars are passed by reference on the device.</p>
5833 <div class="topic concept nested1" id="cusparsesolveanalysisinfot"><a name="cusparsesolveanalysisinfot" shape="rect">
5834 <!-- --></a><h3 class="title topictitle2"><a href="#cusparsesolveanalysisinfot" name="cusparsesolveanalysisinfot" shape="rect">4.9. cusparseSolveAnalysisInfo_t</a></h3>
5835 <div class="body conbody">
5836 <p class="p">This is a pointer type to an opaque structure holding the information collected in the analysis phase of the solution of the
5837 sparse triangular linear system. It is expected to be passed unchanged to the solution phase of the sparse triangular linear
5842 <div class="topic concept nested1" id="cusparsestatust"><a name="cusparsestatust" shape="rect">
5843 <!-- --></a><h3 class="title topictitle2"><a href="#cusparsestatust" name="cusparsestatust" shape="rect">4.10. cusparseStatus_t</a></h3>
5844 <div class="body conbody">
5845 <p class="p">This is a status type returned by the library functions and it can have the following values.</p>
5846 <div class="tablenoborder">
5847 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all">
5848 <tbody class="tbody">
5850 <td class="entry" valign="top" width="30%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_SUCCESS</samp></td>
5851 <td class="entry" valign="top" width="70%" rowspan="1" colspan="1">
5854 <p class="p">The operation completed successfully.</p>
5858 <td class="entry" valign="top" width="30%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_NOT_INITIALIZED</samp></td>
5859 <td class="entry" valign="top" width="70%" rowspan="1" colspan="1">
5862 <p class="p">The CUSPARSE library was not initialized. This is usually caused by the lack of a prior <samp class="ph codeph">cusparseCreate()</samp> call, an error in the CUDA Runtime
5863 API called by the CUSPARSE routine, or an error in the hardware setup.
5865 <p class="p"><strong class="ph b">To correct:</strong> call <samp class="ph codeph">cusparseCreate()</samp> prior to the function call; and check that the hardware, an appropriate version of the driver, and the CUSPARSE library are
5866 correctly installed.
5871 <td class="entry" valign="top" width="30%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_ALLOC_FAILED</samp></td>
5872 <td class="entry" valign="top" width="70%" rowspan="1" colspan="1">
5875 <p class="p">Resource allocation failed inside the CUSPARSE library. This is usually caused by a <samp class="ph codeph">cudaMalloc()</samp> failure.
5877 <p class="p"><strong class="ph b">To correct:</strong> prior to the function call, deallocate as much previously allocated memory as possible.
5882 <td class="entry" valign="top" width="30%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_INVALID_VALUE</samp></td>
5883 <td class="entry" valign="top" width="70%" rowspan="1" colspan="1">
5886 <p class="p">An unsupported value or parameter was passed to the function (a negative vector size, for example).</p>
5887 <p class="p"><strong class="ph b">To correct:</strong> ensure that all the parameters being passed have valid values.
5892 <td class="entry" valign="top" width="30%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_ARCH_MISMATCH</samp></td>
5893 <td class="entry" valign="top" width="70%" rowspan="1" colspan="1">
5896 <p class="p">The function requires a feature absent from the device architecture; usually caused by the lack of support for atomic operations
5897 or double precision.
5899 <p class="p"><strong class="ph b">To correct:</strong> compile and run the application on a device with appropriate compute capability, which is 1.1 for 32-bit atomic operations
5900 and 1.3 for double precision.
5905 <td class="entry" valign="top" width="30%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_MAPPING_ERROR</samp></td>
5906 <td class="entry" valign="top" width="70%" rowspan="1" colspan="1">
5909 <p class="p">An access to GPU memory space failed, which is usually caused by a failure to bind a texture.</p>
5910 <p class="p"><strong class="ph b">To correct:</strong> prior to the function call, unbind any previously bound textures.
5915 <td class="entry" valign="top" width="30%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_EXECUTION_FAILED</samp></td>
5916 <td class="entry" valign="top" width="70%" rowspan="1" colspan="1">
5919 <p class="p">The GPU program failed to execute. This is often caused by a launch failure of the kernel on the GPU, which can have
5920 multiple causes.
5922 <p class="p"><strong class="ph b">To correct:</strong> check that the hardware, an appropriate version of the driver, and the CUSPARSE library are correctly installed.
5927 <td class="entry" valign="top" width="30%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_INTERNAL_ERROR</samp></td>
5928 <td class="entry" valign="top" width="70%" rowspan="1" colspan="1">
5931 <p class="p">An internal CUSPARSE operation failed. This error is usually caused by a <samp class="ph codeph">cudaMemcpyAsync()</samp> failure.
5934 <p class="p"><strong class="ph b">To correct:</strong> check that the hardware, an appropriate version of the driver, and the CUSPARSE library are correctly installed. Also, check
5935 that the memory passed as a parameter to the routine is not being deallocated prior to the routine’s completion.
5940 <td class="entry" valign="top" width="30%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_MATRIX_TYPE_NOT_SUPPORTED</samp></td>
5941 <td class="entry" valign="top" width="70%" rowspan="1" colspan="1">
5944 <p class="p">The matrix type is not supported by this function. This is usually caused by passing an invalid matrix descriptor to the function.</p>
5945 <p class="p"><strong class="ph b">To correct:</strong> check that the fields in <samp class="ph codeph">cusparseMatDescr_t descrA</samp> were set correctly.
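<p class="p">The status values above lend themselves to a small lookup helper that turns a return code into a diagnostic hint. The following is a minimal sketch only: so that it compiles without the CUDA toolkit, it uses a stand-in enumeration, <samp class="ph codeph">demo_status_t</samp>, and a hypothetical helper, <samp class="ph codeph">demo_status_hint()</samp>, neither of which is part of the library. Real code would include <samp class="ph codeph">cusparse.h</samp> and switch on <samp class="ph codeph">cusparseStatus_t</samp> and its <samp class="ph codeph">CUSPARSE_STATUS_*</samp> constants instead.</p>

```c
/* Stand-in for cusparseStatus_t (hypothetical names). It mirrors the status
 * names listed in the table so the sketch builds without <cusparse.h>. */
typedef enum {
    DEMO_STATUS_SUCCESS,
    DEMO_STATUS_NOT_INITIALIZED,
    DEMO_STATUS_ALLOC_FAILED,
    DEMO_STATUS_INVALID_VALUE,
    DEMO_STATUS_ARCH_MISMATCH,
    DEMO_STATUS_MAPPING_ERROR,
    DEMO_STATUS_EXECUTION_FAILED,
    DEMO_STATUS_INTERNAL_ERROR,
    DEMO_STATUS_MATRIX_TYPE_NOT_SUPPORTED
} demo_status_t;

/* Map a status code to a short hint based on the "To correct" advice. */
const char *demo_status_hint(demo_status_t status)
{
    switch (status) {
    case DEMO_STATUS_SUCCESS:
        return "operation completed successfully";
    case DEMO_STATUS_NOT_INITIALIZED:
        return "create the library handle first; check driver and hardware";
    case DEMO_STATUS_ALLOC_FAILED:
        return "free previously allocated memory before the call";
    case DEMO_STATUS_INVALID_VALUE:
        return "check that every parameter has a valid value";
    case DEMO_STATUS_ARCH_MISMATCH:
        return "run on a device with the required compute capability";
    case DEMO_STATUS_MAPPING_ERROR:
        return "unbind previously bound textures before the call";
    case DEMO_STATUS_EXECUTION_FAILED:
        return "check hardware, driver, and library installation";
    case DEMO_STATUS_INTERNAL_ERROR:
        return "check installation and lifetime of memory passed in";
    case DEMO_STATUS_MATRIX_TYPE_NOT_SUPPORTED:
        return "check the fields of the matrix descriptor";
    }
    return "unknown status";
}
```

<p class="p">A caller could compare each returned status against the success value and print the hint only on failure, which keeps error handling in one place.</p>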
5955 <div class="topic concept nested0" id="cusparse-helper-function-reference"><a name="cusparse-helper-function-reference" shape="rect">
5956 <!-- --></a><h2 class="title topictitle1"><a href="#cusparse-helper-function-reference" name="cusparse-helper-function-reference" shape="rect">5. CUSPARSE Helper Function Reference</a></h2>
5957 <div class="body conbody">
5958 <p class="p">The CUSPARSE helper functions are described in this section.</p>
5960 <div class="topic concept nested1" id="cusparsecreate"><a name="cusparsecreate" shape="rect">
5961 <!-- --></a><h3 class="title topictitle2"><a href="#cusparsecreate" name="cusparsecreate" shape="rect">5.1. cusparseCreate()</a></h3>
5962 <div class="body conbody"><pre xml:space="preserve">cusparseStatus_t
5963 cusparseCreate(cusparseHandle_t *handle)</pre><p class="p">This function initializes the CUSPARSE library and creates a handle on the CUSPARSE context. It must be called before any
5964 other CUSPARSE API function is invoked. It allocates hardware resources necessary for accessing the GPU.
5966 <div class="tablenoborder">
5967 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Output</strong></span><tbody class="tbody">
5969 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">handle</samp></td>
5970 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the pointer to the handle to the CUSPARSE context.</td>
5975 <div class="tablenoborder">
5976 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Status Returned</strong></span><tbody class="tbody">
5978 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_SUCCESS</samp></td>
5979 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the initialization succeeded.</td>
5982 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_NOT_INITIALIZED</samp></td>
5983 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the CUDA Runtime initialization failed.</td>
5986 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_ALLOC_FAILED</samp></td>
5987 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the resources could not be allocated.</td>
5990 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_ARCH_MISMATCH</samp></td>
5991 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the device compute capability (CC) is less than 1.1. A CC of at least 1.1 is required.</td>
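<p class="p">An application may want to test the compute capability itself before relying on a feature. The sketch below is one way to compare a device's capability against a requirement such as 1.1 or 1.3. The helper name <samp class="ph codeph">cc_at_least()</samp> is hypothetical and not part of the library. In a CUDA program the two numbers would typically come from the <samp class="ph codeph">major</samp> and <samp class="ph codeph">minor</samp> fields of <samp class="ph codeph">cudaDeviceProp</samp>, as filled in by <samp class="ph codeph">cudaGetDeviceProperties()</samp>.</p>

```c
#include <stdbool.h>

/* Hypothetical helper: returns true when compute capability (major, minor)
 * is at least (req_major, req_minor). The pair is compared field by field
 * rather than as a floating-point number, so 1.10 would not be confused
 * with 1.1. */
bool cc_at_least(int major, int minor, int req_major, int req_minor)
{
    if (major != req_major) {
        return major > req_major;
    }
    return minor >= req_minor;
}
```

<p class="p">For example, <samp class="ph codeph">cc_at_least(1, 0, 1, 1)</samp> is false, while <samp class="ph codeph">cc_at_least(2, 0, 1, 3)</samp> is true.</p>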
5998 <div class="topic concept nested1" id="cusparsecreatehybmat"><a name="cusparsecreatehybmat" shape="rect">
5999 <!-- --></a><h3 class="title topictitle2"><a href="#cusparsecreatehybmat" name="cusparsecreatehybmat" shape="rect">5.2. cusparseCreateHybMat()</a></h3>
6000 <div class="body conbody"><pre xml:space="preserve">cusparseStatus_t
6001 cusparseCreateHybMat(cusparseHybMat_t *hybA)</pre><p class="p">This function creates and initializes the <samp class="ph codeph">hybA</samp> opaque data structure.
6003 <div class="tablenoborder">
6004 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Input</strong></span><tbody class="tbody">
6006 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">hybA</samp></td>
6007 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the pointer to the hybrid format storage structure.</td>
6012 <div class="tablenoborder">
6013 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Status Returned</strong></span><tbody class="tbody">
6015 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_SUCCESS</samp></td>
6016 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the structure was initialized successfully.</td>
6019 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_ALLOC_FAILED</samp></td>
6020 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the resources could not be allocated.</td>
6027 <div class="topic concept nested1" id="cusparsecreatematdescr"><a name="cusparsecreatematdescr" shape="rect">
6028 <!-- --></a><h3 class="title topictitle2"><a href="#cusparsecreatematdescr" name="cusparsecreatematdescr" shape="rect">5.3. cusparseCreateMatDescr()</a></h3>
6029 <div class="body conbody"><pre xml:space="preserve">cusparseStatus_t
6030 cusparseCreateMatDescr(cusparseMatDescr_t *descrA)</pre><p class="p">This function initializes the matrix descriptor. It sets the fields <samp class="ph codeph">MatrixType</samp> and <samp class="ph codeph">IndexBase</samp> to the <em class="ph i">default</em> values <samp class="ph codeph">CUSPARSE_MATRIX_TYPE_GENERAL</samp> and <samp class="ph codeph">CUSPARSE_INDEX_BASE_ZERO</samp>, respectively, while leaving other fields uninitialized.
6032 <div class="tablenoborder">
6033 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Input</strong></span><tbody class="tbody">
6035 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">descrA</samp></td>
6036 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the pointer to the matrix descriptor.</td>
6041 <div class="tablenoborder">
6042 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Status Returned</strong></span><tbody class="tbody">
6044 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_SUCCESS</samp></td>
6045 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the descriptor was initialized successfully.</td>
6048 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_ALLOC_FAILED</samp></td>
6049 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the resources could not be allocated.</td>
6056 <div class="topic concept nested1" id="cusparsecreatesolveanalysisinfo"><a name="cusparsecreatesolveanalysisinfo" shape="rect">
6057 <!-- --></a><h3 class="title topictitle2"><a href="#cusparsecreatesolveanalysisinfo" name="cusparsecreatesolveanalysisinfo" shape="rect">5.4. cusparseCreateSolveAnalysisInfo()</a></h3>
6058 <div class="body conbody"><pre xml:space="preserve">cusparseStatus_t
6059 cusparseCreateSolveAnalysisInfo(cusparseSolveAnalysisInfo_t *info)</pre><p class="p">This function creates and initializes the solve and analysis structure to <em class="ph i">default</em> values.
6061 <div class="tablenoborder">
6062 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Input</strong></span><tbody class="tbody">
6064 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">info</samp></td>
6065 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the pointer to the solve and analysis structure.</td>
6070 <div class="tablenoborder">
6071 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Status Returned</strong></span><tbody class="tbody">
6073 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_SUCCESS</samp></td>
6074 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the structure was initialized successfully.</td>
6077 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_ALLOC_FAILED</samp></td>
6078 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the resources could not be allocated.</td>
6085 <div class="topic concept nested1" id="cusparsedestroy"><a name="cusparsedestroy" shape="rect">
6086 <!-- --></a><h3 class="title topictitle2"><a href="#cusparsedestroy" name="cusparsedestroy" shape="rect">5.5. cusparseDestroy()</a></h3>
6087 <div class="body conbody"><pre xml:space="preserve">cusparseStatus_t
6088 cusparseDestroy(cusparseHandle_t handle)</pre><p class="p">This function releases CPU-side resources used by the CUSPARSE library. The release of GPU-side resources may be deferred
6089 until the application shuts down.
6091 <div class="tablenoborder">
6092 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Input</strong></span><tbody class="tbody">
6094 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">handle</samp></td>
6095 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the handle to the CUSPARSE context.</td>
6100 <div class="tablenoborder">
6101 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Status Returned</strong></span><tbody class="tbody">
6103 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_SUCCESS</samp></td>
6104 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the shutdown succeeded.</td>
6107 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_NOT_INITIALIZED</samp></td>
6108 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the library was not initialized.</td>
6115 <div class="topic concept nested1" id="cusparsedestroyhybmat"><a name="cusparsedestroyhybmat" shape="rect">
6116 <!-- --></a><h3 class="title topictitle2"><a href="#cusparsedestroyhybmat" name="cusparsedestroyhybmat" shape="rect">5.6. cusparseDestroyHybMat()</a></h3>
6117 <div class="body conbody"><pre xml:space="preserve">cusparseStatus_t
6118 cusparseDestroyHybMat(cusparseHybMat_t hybA)
6119 </pre><p class="p">This function destroys and releases any memory required by the <samp class="ph codeph">hybA</samp> structure.
6121 <div class="tablenoborder">
6122 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Input</strong></span><tbody class="tbody">
6124 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">hybA</samp></td>
6125 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the hybrid format storage structure.</td>
6130 <div class="tablenoborder">
6131 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Status Returned</strong></span><tbody class="tbody">
6133 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_SUCCESS</samp></td>
6134 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the resources were released successfully.</td>
6141 <div class="topic concept nested1" id="cusparsedestroymatdescr"><a name="cusparsedestroymatdescr" shape="rect">
6142 <!-- --></a><h3 class="title topictitle2"><a href="#cusparsedestroymatdescr" name="cusparsedestroymatdescr" shape="rect">5.7. cusparseDestroyMatDescr()</a></h3>
6143 <div class="body conbody"><pre xml:space="preserve">cusparseStatus_t
6144 cusparseDestroyMatDescr(cusparseMatDescr_t descrA)
6145 </pre><p class="p">This function releases the memory allocated for the matrix descriptor.</p>
6146 <div class="tablenoborder">
6147 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Input</strong></span><tbody class="tbody">
6149 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">descrA</samp></td>
6150 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the matrix descriptor.</td>
6155 <div class="tablenoborder">
6156 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Status Returned</strong></span><tbody class="tbody">
6158 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_SUCCESS</samp></td>
6159 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the resources were released successfully.</td>
6166 <div class="topic concept nested1" id="cusparsedestroysolveanalysisinfo"><a name="cusparsedestroysolveanalysisinfo" shape="rect">
6167 <!-- --></a><h3 class="title topictitle2"><a href="#cusparsedestroysolveanalysisinfo" name="cusparsedestroysolveanalysisinfo" shape="rect">5.8. cusparseDestroySolveAnalysisInfo()</a></h3>
6168 <div class="body conbody"><pre xml:space="preserve">cusparseStatus_t
6169 cusparseDestroySolveAnalysisInfo(cusparseSolveAnalysisInfo_t info)</pre><p class="p">This function destroys and releases any memory required by the structure.</p>
6170 <p class="p"><strong class="ph b">Input</strong></p>
6171 <div class="tablenoborder">
6172 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all">
6173 <tbody class="tbody">
6175 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">info</samp></td>
6176 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the solve and analysis structure.</td>
6181 <p class="p"><strong class="ph b">Status Returned</strong></p>
6182 <div class="tablenoborder">
6183 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all">
6184 <tbody class="tbody">
6186 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_SUCCESS</samp></td>
6187 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the resources were released successfully.</td>
6194 <div class="topic concept nested1" id="cusparsegetlevelinfo"><a name="cusparsegetlevelinfo" shape="rect">
6195 <!-- --></a><h3 class="title topictitle2"><a href="#cusparsegetlevelinfo" name="cusparsegetlevelinfo" shape="rect">5.9. cusparseGetLevelInfo()</a></h3>
6196 <div class="body conbody"><pre xml:space="preserve">cusparseStatus_t
6197 cusparseGetLevelInfo(cusparseHandle_t handle,
6198 cusparseSolveAnalysisInfo_t info,
int *nlevels,
int **levelPtr,
6201 int **levelInd)</pre><p class="p">This function returns the number of levels and the assignment of rows into the levels computed by either the csrsv_analysis,
6202 csrsm_analysis or hybsv_analysis routines.
6204 <div class="tablenoborder">
6205 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Input</strong></span><tbody class="tbody">
6207 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">handle</samp></td>
6208 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">handle to the CUSPARSE library context.</td>
6211 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">info</samp></td>
6212 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the pointer to the solve and analysis structure.</td>
6217 <div class="tablenoborder">
6218 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Output</strong></span><tbody class="tbody">
6220 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">nlevels</samp></td>
6221 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">number of levels.</td>
6224 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">levelPtr</samp></td>
6225 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">integer array of <samp class="ph codeph">nlevels+1</samp> elements that contains the start of every level and the end of the last level plus one.
6229 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">levelInd</samp></td>
6230 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">integer array of <samp class="ph codeph">m</samp> (number of rows in the matrix) elements that contains the row indices belonging to every level.
6236 <div class="tablenoborder">
6237 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Status Returned</strong></span><tbody class="tbody">
6239 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_SUCCESS</samp></td>
6240 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the level information was returned successfully.</td>
6243 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_NOT_INITIALIZED</samp></td>
6244 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the library or the solve analysis structure was not initialized.</td>
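<p class="p">The <samp class="ph codeph">levelPtr</samp> and <samp class="ph codeph">levelInd</samp> arrays form an offset-plus-index layout: the rows of level <samp class="ph codeph">k</samp> occupy positions <samp class="ph codeph">levelPtr[k]</samp> up to, but not including, <samp class="ph codeph">levelPtr[k+1]</samp> of <samp class="ph codeph">levelInd</samp>. The sketch below walks that layout in plain C. It assumes that the offsets are zero-based and that both arrays are readable from the host, which this section does not state. The function names are hypothetical.</p>

```c
#include <stdio.h>

/* Number of rows assigned to one level (hypothetical helper). */
int rows_in_level(const int *levelPtr, int level)
{
    return levelPtr[level + 1] - levelPtr[level];
}

/* Print the row indices of every level and return how many rows were
 * visited. Assumes zero-based offsets in levelPtr and host-readable arrays. */
int print_levels(int nlevels, const int *levelPtr, const int *levelInd)
{
    int visited = 0;
    for (int k = 0; k < nlevels; ++k) {
        printf("level %d:", k);
        for (int i = levelPtr[k]; i < levelPtr[k + 1]; ++i) {
            printf(" %d", levelInd[i]);
            ++visited;
        }
        printf("\n");
    }
    return visited;
}
```

<p class="p">With illustrative data <samp class="ph codeph">nlevels = 3</samp>, <samp class="ph codeph">levelPtr = {0, 2, 4, 5}</samp> and <samp class="ph codeph">levelInd = {0, 3, 1, 4, 2}</samp>, level 0 holds rows 0 and 3, level 1 holds rows 1 and 4, and level 2 holds row 2, so all five rows are visited.</p>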
6251 <div class="topic concept nested1" id="cusparsegetmatdiagtype"><a name="cusparsegetmatdiagtype" shape="rect">
6252 <!-- --></a><h3 class="title topictitle2"><a href="#cusparsegetmatdiagtype" name="cusparsegetmatdiagtype" shape="rect">5.10. cusparseGetMatDiagType()</a></h3>
6253 <div class="body conbody"><pre xml:space="preserve">cusparseDiagType_t
6254 cusparseGetMatDiagType(const cusparseMatDescr_t descrA)
6255 </pre><p class="p">This function returns the <samp class="ph codeph">DiagType</samp> field of the matrix descriptor <samp class="ph codeph">descrA</samp>.
6257 <div class="tablenoborder">
6258 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Input</strong></span><tbody class="tbody">
6260 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">descrA</samp></td>
6261 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the matrix descriptor.</td>
6266 <div class="tablenoborder">
6267 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Returned</strong></span><tbody class="tbody">
6269 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph"></samp></td>
6270 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">One of the enumerated diagType types.</td>
6277 <div class="topic concept nested1" id="cusparsegetmatfillmode"><a name="cusparsegetmatfillmode" shape="rect">
6278 <!-- --></a><h3 class="title topictitle2"><a href="#cusparsegetmatfillmode" name="cusparsegetmatfillmode" shape="rect">5.11. cusparseGetMatFillMode()</a></h3>
6279 <div class="body conbody"><pre xml:space="preserve">cusparseFillMode_t
6280 cusparseGetMatFillMode(const cusparseMatDescr_t descrA)
6281 </pre><p class="p">This function returns the <samp class="ph codeph">FillMode</samp> field of the matrix descriptor <samp class="ph codeph">descrA</samp>.
6283 <div class="tablenoborder">
6284 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Input</strong></span><tbody class="tbody">
6286 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">descrA</samp></td>
6287 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the matrix descriptor.</td>
6292 <div class="tablenoborder">
6293 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Returned</strong></span><tbody class="tbody">
6295 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph"></samp></td>
6296 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">One of the enumerated fillMode types.</td>
6303 <div class="topic concept nested1" id="cusparsegetmatindexbase"><a name="cusparsegetmatindexbase" shape="rect">
6304 <!-- --></a><h3 class="title topictitle2"><a href="#cusparsegetmatindexbase" name="cusparsegetmatindexbase" shape="rect">5.12. cusparseGetMatIndexBase()</a></h3>
6305 <div class="body conbody"><pre xml:space="preserve">cusparseIndexBase_t
6306 cusparseGetMatIndexBase(const cusparseMatDescr_t descrA)
6307 </pre><p class="p">This function returns the <samp class="ph codeph">IndexBase</samp> field of the matrix descriptor <samp class="ph codeph">descrA</samp>.
6309 <div class="tablenoborder">
6310 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Input</strong></span><tbody class="tbody">
6312 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">descrA</samp></td>
6313 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the matrix descriptor.</td>
6318 <div class="tablenoborder">
6319 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Returned</strong></span><tbody class="tbody">
6321 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph"></samp></td>
6322 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">One of the enumerated indexBase types.</td>
6329 <div class="topic concept nested1" id="cusparsegetmattype"><a name="cusparsegetmattype" shape="rect">
6330 <!-- --></a><h3 class="title topictitle2"><a href="#cusparsegetmattype" name="cusparsegetmattype" shape="rect">5.13. cusparseGetMatType()</a></h3>
6331 <div class="body conbody"><pre xml:space="preserve">cusparseMatrixType_t
6332 cusparseGetMatType(const cusparseMatDescr_t descrA)
6333 </pre><p class="p">This function returns the <samp class="ph codeph">MatrixType</samp> field of the matrix descriptor <samp class="ph codeph">descrA</samp>.
6335 <div class="tablenoborder">
6336 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Input</strong></span><tbody class="tbody">
6338 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">descrA</samp></td>
6339 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the matrix descriptor.</td>
6344 <div class="tablenoborder">
6345 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Returned</strong></span><tbody class="tbody">
6347 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph"></samp></td>
6348 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">One of the enumerated matrix types.</td>
6355 <div class="topic concept nested1" id="cusparsegetpointermode"><a name="cusparsegetpointermode" shape="rect">
6356 <!-- --></a><h3 class="title topictitle2"><a href="#cusparsegetpointermode" name="cusparsegetpointermode" shape="rect">5.14. cusparseGetPointerMode()</a></h3>
6357 <div class="body conbody"><pre xml:space="preserve">cusparseStatus_t
6358 cusparseGetPointerMode(cusparseHandle_t handle,
6359 cusparsePointerMode_t *mode)
6360 </pre><p class="p">This function obtains the pointer mode used by the CUSPARSE library. Please see the section on the <samp class="ph codeph">cusparsePointerMode_t</samp> type for more details.
6362 <div class="tablenoborder">
6363 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Input</strong></span><tbody class="tbody">
6365 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">handle</samp></td>
6366 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the handle to the CUSPARSE context.</td>
6371 <div class="tablenoborder">
6372 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Output</strong></span><tbody class="tbody">
6374 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">mode</samp></td>
6375 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">One of the enumerated pointer mode types.</td>
6380 <div class="tablenoborder">
6381 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Status Returned</strong></span><tbody class="tbody">
6383 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_SUCCESS</samp></td>
6384 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the pointer mode was returned successfully.</td>
6387 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_NOT_INITIALIZED</samp></td>
6388 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the library was not initialized.</td>
6395 <div class="topic concept nested1" id="cusparsegetversion"><a name="cusparsegetversion" shape="rect">
6396 <!-- --></a><h3 class="title topictitle2"><a href="#cusparsegetversion" name="cusparsegetversion" shape="rect">5.15. cusparseGetVersion()</a></h3>
6397 <div class="body conbody"><pre xml:space="preserve">cusparseStatus_t
6398 cusparseGetVersion(cusparseHandle_t handle, int *version)
6399 </pre><p class="p">This function returns the version number of the CUSPARSE library.</p>
6400 <div class="tablenoborder">
6401 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Input</strong></span><tbody class="tbody">
6403 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">handle</samp></td>
6404 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the handle to the CUSPARSE context.</td>
6409 <div class="tablenoborder">
6410 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Output</strong></span><tbody class="tbody">
6412 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">version</samp></td>
6413 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the version number of the library.</td>
6418 <div class="tablenoborder">
6419 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Status Returned</strong></span><tbody class="tbody">
6421 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_SUCCESS</samp></td>
6422 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the version was returned successfully.</td>
6425 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_NOT_INITIALIZED</samp></td>
6426 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the library was not initialized.</td>
6433 <div class="topic concept nested1" id="cusparsesetmatdiagtype"><a name="cusparsesetmatdiagtype" shape="rect">
6434 <!-- --></a><h3 class="title topictitle2"><a href="#cusparsesetmatdiagtype" name="cusparsesetmatdiagtype" shape="rect">5.16. cusparseSetMatDiagType()</a></h3>
<div class="body conbody"><pre xml:space="preserve">cusparseStatus_t
cusparseSetMatDiagType(cusparseMatDescr_t descrA,
                       cusparseDiagType_t diagType)
</pre><p class="p">This function sets the <samp class="ph codeph">DiagType</samp> field of the matrix descriptor <samp class="ph codeph">descrA</samp>.
6440 <div class="tablenoborder">
6441 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Input</strong></span><tbody class="tbody">
6443 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">diagType</samp></td>
6444 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">One of the enumerated diagType types.</td>
6449 <div class="tablenoborder">
6450 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Output</strong></span><tbody class="tbody">
6452 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">descrA</samp></td>
6453 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the matrix descriptor.</td>
6458 <div class="tablenoborder">
6459 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Status Returned</strong></span><tbody class="tbody">
6461 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_SUCCESS</samp></td>
6462 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the field <samp class="ph codeph">DiagType</samp> was set successfully.
6466 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_INVALID_VALUE</samp></td>
6467 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">An invalid <samp class="ph codeph">diagType</samp> parameter was passed.
6475 <div class="topic concept nested1" id="cusparsesetmatfillmode"><a name="cusparsesetmatfillmode" shape="rect">
6476 <!-- --></a><h3 class="title topictitle2"><a href="#cusparsesetmatfillmode" name="cusparsesetmatfillmode" shape="rect">5.17. cusparseSetMatFillMode()</a></h3>
<div class="body conbody"><pre xml:space="preserve">cusparseStatus_t
cusparseSetMatFillMode(cusparseMatDescr_t descrA,
                       cusparseFillMode_t fillMode)
</pre><p class="p">This function sets the <samp class="ph codeph">FillMode</samp> field of the matrix descriptor <samp class="ph codeph">descrA</samp>.
6482 <div class="tablenoborder">
6483 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Input</strong></span><tbody class="tbody">
6485 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">fillMode</samp></td>
6486 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">One of the enumerated fillMode types.</td>
6491 <div class="tablenoborder">
6492 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Output</strong></span><tbody class="tbody">
6494 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">descrA</samp></td>
6495 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the matrix descriptor.</td>
6500 <div class="tablenoborder">
6501 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Status Returned</strong></span><tbody class="tbody">
6503 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_SUCCESS</samp></td>
6504 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the <samp class="ph codeph">FillMode</samp> field was set successfully.
6508 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_INVALID_VALUE</samp></td>
6509 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">An invalid <samp class="ph codeph">fillMode</samp> parameter was passed.
6517 <div class="topic concept nested1" id="cusparsesetmatindexbase"><a name="cusparsesetmatindexbase" shape="rect">
6518 <!-- --></a><h3 class="title topictitle2"><a href="#cusparsesetmatindexbase" name="cusparsesetmatindexbase" shape="rect">5.18. cusparseSetMatIndexBase()</a></h3>
<div class="body conbody"><pre xml:space="preserve">cusparseStatus_t
cusparseSetMatIndexBase(cusparseMatDescr_t descrA,
                        cusparseIndexBase_t base)
</pre><p class="p">This function sets the <samp class="ph codeph">IndexBase</samp> field of the matrix descriptor <samp class="ph codeph">descrA</samp>.
6524 <div class="tablenoborder">
6525 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Input</strong></span><tbody class="tbody">
6527 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">base</samp></td>
6528 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">One of the enumerated indexBase types.</td>
6533 <div class="tablenoborder">
6534 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Output</strong></span><tbody class="tbody">
6536 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">descrA</samp></td>
6537 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the matrix descriptor.</td>
6542 <div class="tablenoborder">
6543 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Status Returned</strong></span><tbody class="tbody">
6545 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_SUCCESS</samp></td>
6546 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the <samp class="ph codeph">IndexBase</samp> field was set successfully.
6550 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_INVALID_VALUE</samp></td>
6551 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">An invalid <samp class="ph codeph">base</samp> parameter was passed.
6559 <div class="topic concept nested1" id="cusparsesetmattype"><a name="cusparsesetmattype" shape="rect">
6560 <!-- --></a><h3 class="title topictitle2"><a href="#cusparsesetmattype" name="cusparsesetmattype" shape="rect">5.19. cusparseSetMatType()</a></h3>
<div class="body conbody"><pre xml:space="preserve">cusparseStatus_t
cusparseSetMatType(cusparseMatDescr_t descrA, cusparseMatrixType_t type)
</pre><p class="p">This function sets the <samp class="ph codeph">MatrixType</samp> field of the matrix descriptor <samp class="ph codeph">descrA</samp>.
6565 <div class="tablenoborder">
6566 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Input</strong></span><tbody class="tbody">
6568 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">type</samp></td>
6569 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">One of the enumerated matrix types.</td>
6574 <div class="tablenoborder">
6575 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Output</strong></span><tbody class="tbody">
6577 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">descrA</samp></td>
6578 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the matrix descriptor.</td>
6583 <div class="tablenoborder">
6584 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Status Returned</strong></span><tbody class="tbody">
6586 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_SUCCESS</samp></td>
6587 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the <samp class="ph codeph">MatrixType</samp> field was set successfully.
6591 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_INVALID_VALUE</samp></td>
6592 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">An invalid <samp class="ph codeph">type</samp> parameter was passed.
6600 <div class="topic concept nested1" id="cusparsesetpointermode"><a name="cusparsesetpointermode" shape="rect">
6601 <!-- --></a><h3 class="title topictitle2"><a href="#cusparsesetpointermode" name="cusparsesetpointermode" shape="rect">5.20. cusparseSetPointerMode()</a></h3>
<div class="body conbody"><pre xml:space="preserve">cusparseStatus_t
cusparseSetPointerMode(cusparseHandle_t handle,
                       cusparsePointerMode_t mode)
</pre><p class="p">This function sets the pointer mode used by the CUSPARSE library. The <em class="ph i">default</em> is for the values to be passed by reference on the host. See the section on the <samp class="ph codeph">cusparsePointerMode_t</samp> type for more details.
6607 <div class="tablenoborder">
6608 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Input</strong></span><tbody class="tbody">
6610 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">handle</samp></td>
6611 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the handle to the CUSPARSE context.</td>
6614 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">mode</samp></td>
6615 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">One of the enumerated pointer mode types.</td>
6620 <div class="tablenoborder">
6621 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Status Returned</strong></span><tbody class="tbody">
6623 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_SUCCESS</samp></td>
6624 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the pointer mode was set successfully.</td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_NOT_INITIALIZED</samp></td>
6628 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the library was not initialized.</td>
6635 <div class="topic concept nested1" id="cusparsesetstream"><a name="cusparsesetstream" shape="rect">
6636 <!-- --></a><h3 class="title topictitle2"><a href="#cusparsesetstream" name="cusparsesetstream" shape="rect">5.21. cusparseSetStream()</a></h3>
<div class="body conbody"><pre xml:space="preserve">cusparseStatus_t
cusparseSetStream(cusparseHandle_t handle, cudaStream_t streamId)
</pre><p class="p">This function sets the stream to be used by the CUSPARSE library to execute its routines.</p>
6640 <div class="tablenoborder">
6641 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Input</strong></span><tbody class="tbody">
6643 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">handle</samp></td>
6644 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the handle to the CUSPARSE context.</td>
6647 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">streamId</samp></td>
6648 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the stream to be used by the library.</td>
6653 <div class="tablenoborder">
6654 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Status Returned</strong></span><tbody class="tbody">
6656 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_SUCCESS</samp></td>
6657 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the stream was set successfully.</td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_NOT_INITIALIZED</samp></td>
6661 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the library was not initialized.</td>
6669 <div class="topic concept nested0" id="cusparse-level-1-function-reference"><a name="cusparse-level-1-function-reference" shape="rect">
6670 <!-- --></a><h2 class="title topictitle1"><a href="#cusparse-level-1-function-reference" name="cusparse-level-1-function-reference" shape="rect">6. CUSPARSE Level 1 Function Reference</a></h2>
6671 <div class="body conbody">
6672 <p class="p">This chapter describes sparse linear algebra functions that perform operations between dense and sparse vectors.</p>
6674 <div class="topic concept nested1" id="cusparse-lt-t-gt-axpyi"><a name="cusparse-lt-t-gt-axpyi" shape="rect">
<!-- --></a><h3 class="title topictitle2"><a href="#cusparse-lt-t-gt-axpyi" name="cusparse-lt-t-gt-axpyi" shape="rect">6.1. cusparse&lt;t&gt;axpyi</a></h3>
<div class="body conbody"><pre xml:space="preserve">cusparseStatus_t
cusparseSaxpyi(cusparseHandle_t handle, int nnz,
               const float *alpha,
               const float *xVal, const int *xInd,
               float *y, cusparseIndexBase_t idxBase)
cusparseStatus_t
cusparseDaxpyi(cusparseHandle_t handle, int nnz,
               const double *alpha,
               const double *xVal, const int *xInd,
               double *y, cusparseIndexBase_t idxBase)
cusparseStatus_t
cusparseCaxpyi(cusparseHandle_t handle, int nnz,
               const cuComplex *alpha,
               const cuComplex *xVal, const int *xInd,
               cuComplex *y, cusparseIndexBase_t idxBase)
cusparseStatus_t
cusparseZaxpyi(cusparseHandle_t handle, int nnz,
               const cuDoubleComplex *alpha,
               const cuDoubleComplex *xVal, const int *xInd,
               cuDoubleComplex *y, cusparseIndexBase_t idxBase)</pre><p class="p">This function multiplies the vector <samp class="ph codeph">x</samp> in sparse format by the constant
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>α</mi></math>
6702 and adds the result to the vector <samp class="ph codeph">y</samp> in dense format. This operation can be written as
6704 <div class="tablenoborder">
6705 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="void" border="1" rules="all">
6706 <tbody class="tbody">
6708 <td class="entry" align="center" valign="top" rowspan="1" colspan="1">
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>y</mi><mo>=</mo><mi>y</mi><mo>+</mo><mi>α</mi><mo>∗</mo><mi>x</mi></math>
<p class="p">in other words,</p><pre xml:space="preserve">for i=0 to nnz-1
    y[xInd[i]-idxBase] = y[xInd[i]-idxBase] + alpha*xVal[i]</pre><p class="p">This function requires no extra storage. It is executed asynchronously with respect to the host and it may return control
to the application on the host before the result is ready.
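<p class="p">The update rule above can be checked against a small host-side reference. The following is only a sketch of the arithmetic in single precision, not the library's GPU implementation; the name <samp class="ph codeph">ref_saxpyi</samp> and the plain <samp class="ph codeph">int</samp> index base (0 or 1, standing in for <samp class="ph codeph">CUSPARSE_INDEX_BASE_ZERO</samp> and <samp class="ph codeph">CUSPARSE_INDEX_BASE_ONE</samp>) are assumptions made for illustration.</p>

```c
#include <assert.h>

/* Hypothetical host-side reference for the axpyi update (single precision).
 * Not part of the CUSPARSE API. idx_base is 0 or 1 and stands in for
 * CUSPARSE_INDEX_BASE_ZERO / CUSPARSE_INDEX_BASE_ONE. */
void ref_saxpyi(int nnz, float alpha,
                const float *xVal, const int *xInd,
                float *y, int idx_base)
{
    assert(nnz >= 0);
    assert(idx_base == 0 || idx_base == 1);
    for (int i = 0; i < nnz; ++i) {
        int j = xInd[i] - idx_base;  /* zero-based position in the dense vector */
        y[j] += alpha * xVal[i];     /* y is left unchanged when nnz == 0 */
    }
}
```

<p class="p">For example, with <samp class="ph codeph">y = {1, 2, 3, 4}</samp>, one-based indices <samp class="ph codeph">xInd = {1, 4}</samp>, values <samp class="ph codeph">xVal = {10, 20}</samp> and <samp class="ph codeph">alpha = 0.5</samp>, this reference updates <samp class="ph codeph">y</samp> to <samp class="ph codeph">{6, 2, 3, 14}</samp>.</p>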
6727 <div class="tablenoborder">
6728 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Input</strong></span><tbody class="tbody">
6730 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">handle</samp></td>
6731 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">handle to the CUSPARSE library context.</td>
6734 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">nnz</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">number of non-zero elements in vector <samp class="ph codeph">x</samp>.
6739 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">alpha</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">&lt;type&gt; scalar used for multiplication.</td>
6743 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">xVal</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">&lt;type&gt; vector with <samp class="ph codeph">nnz</samp> non-zero values of vector <samp class="ph codeph">x</samp>.
6748 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">xInd</samp></td>
6749 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">integer vector with <samp class="ph codeph">nnz</samp> indices of the non-zero values of vector <samp class="ph codeph">x</samp>.
6753 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">y</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">&lt;type&gt; vector in dense format.</td>
6757 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">idxBase</samp></td>
6758 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_INDEX_BASE_ZERO</samp> or <samp class="ph codeph">CUSPARSE_INDEX_BASE_ONE</samp></td>
6763 <div class="tablenoborder">
6764 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Output</strong></span><tbody class="tbody">
6766 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">y</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">&lt;type&gt; updated vector in dense format (that is unchanged if <samp class="ph codeph">nnz == 0</samp>).
6773 <div class="tablenoborder">
6774 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Status Returned</strong></span><tbody class="tbody">
6776 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_SUCCESS</samp></td>
6777 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the operation completed successfully.</td>
6780 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_NOT_INITIALIZED</samp></td>
6781 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the library was not initialized.</td>
6784 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_INVALID_VALUE</samp></td>
6785 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the <samp class="ph codeph">idxBase</samp> is neither <samp class="ph codeph">CUSPARSE_INDEX_BASE_ZERO</samp> nor <samp class="ph codeph">CUSPARSE_INDEX_BASE_ONE</samp>.
6789 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_ARCH_MISMATCH</samp></td>
6790 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the device does not support double precision.</td>
6793 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_EXECUTION_FAILED</samp></td>
6794 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the function failed to launch on the GPU.</td>
6801 <div class="topic concept nested1" id="cusparse-lt-t-gt-doti"><a name="cusparse-lt-t-gt-doti" shape="rect">
<!-- --></a><h3 class="title topictitle2"><a href="#cusparse-lt-t-gt-doti" name="cusparse-lt-t-gt-doti" shape="rect">6.2. cusparse&lt;t&gt;doti</a></h3>
<div class="body conbody"><pre xml:space="preserve">cusparseStatus_t
cusparseSdoti(cusparseHandle_t handle, int nnz,
              const float *xVal,
              const int *xInd, const float *y,
              float *resultDevHostPtr,
              cusparseIndexBase_t idxBase)
cusparseStatus_t
cusparseDdoti(cusparseHandle_t handle, int nnz,
              const double *xVal,
              const int *xInd, const double *y,
              double *resultDevHostPtr,
              cusparseIndexBase_t idxBase)
cusparseStatus_t
cusparseCdoti(cusparseHandle_t handle, int nnz,
              const cuComplex *xVal,
              const int *xInd, const cuComplex *y,
              cuComplex *resultDevHostPtr,
              cusparseIndexBase_t idxBase)
cusparseStatus_t
cusparseZdoti(cusparseHandle_t handle, int nnz,
              const cuDoubleComplex *xVal,
              const int *xInd, const cuDoubleComplex *y,
              cuDoubleComplex *resultDevHostPtr,
              cusparseIndexBase_t idxBase)</pre><p class="p">This function returns the dot product of a vector <samp class="ph codeph">x</samp> in sparse format and a vector <samp class="ph codeph">y</samp> in dense format. This operation can be written as
6828 <div class="tablenoborder">
6829 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="void" border="1" rules="all">
6830 <tbody class="tbody">
6832 <td class="entry" align="center" valign="top" rowspan="1" colspan="1">
<math xmlns="http://www.w3.org/1998/Math/MathML"><mtext>result</mtext><mo>=</mo><msup><mi>y</mi><mi>T</mi></msup><mi>x</mi></math>
<p class="p">in other words,</p><pre xml:space="preserve">for i=0 to nnz-1
    resultDevHostPtr += xVal[i]*y[xInd[i]-idxBase]
</pre><p class="p">This function requires some temporary extra storage that is allocated internally. It is executed asynchronously with respect
to the host and it may return control to the application on the host before the result is ready.
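<p class="p">The sum above can be sketched as a host-side reference in double precision. This is an illustration of the arithmetic only, not how the library computes the result on the GPU; the name <samp class="ph codeph">ref_ddoti</samp>, the by-value return, and the plain <samp class="ph codeph">int</samp> index base (0 or 1) are assumptions made for illustration.</p>

```c
#include <assert.h>

/* Hypothetical host-side reference for the sparse-dense dot product
 * (double precision). Not part of the CUSPARSE API. idx_base is 0 or 1
 * and stands in for CUSPARSE_INDEX_BASE_ZERO / CUSPARSE_INDEX_BASE_ONE. */
double ref_ddoti(int nnz, const double *xVal, const int *xInd,
                 const double *y, int idx_base)
{
    assert(nnz >= 0);
    assert(idx_base == 0 || idx_base == 1);
    double result = 0.0;  /* stays zero when nnz == 0 */
    for (int i = 0; i < nnz; ++i) {
        /* the index base is subtracted from the stored index, not from i */
        result += xVal[i] * y[xInd[i] - idx_base];
    }
    return result;
}
```

<p class="p">For example, with zero-based indices <samp class="ph codeph">xInd = {0, 2}</samp>, values <samp class="ph codeph">xVal = {2, 3}</samp> and <samp class="ph codeph">y = {1, 5, 4}</samp>, this reference returns 2*1 + 3*4 = 14.</p>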
6857 <div class="tablenoborder">
6858 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Input</strong></span><tbody class="tbody">
6860 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">handle</samp></td>
6861 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">handle to the CUSPARSE library context.</td>
6864 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">nnz</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">number of non-zero elements in vector <samp class="ph codeph">x</samp>.
6869 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">xVal</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">&lt;type&gt; vector with <samp class="ph codeph">nnz</samp> non-zero values of vector <samp class="ph codeph">x</samp>.
6874 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">xInd</samp></td>
6875 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">integer vector with <samp class="ph codeph">nnz</samp> indices of the non-zero values of vector <samp class="ph codeph">x</samp>.
6879 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">y</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">&lt;type&gt; vector in dense format.</td>
6883 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">resultDevHostPtr</samp></td>
6884 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">pointer to the location of the result in the device or host memory.</td>
6887 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">idxBase</samp></td>
6888 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_INDEX_BASE_ZERO</samp> or <samp class="ph codeph">CUSPARSE_INDEX_BASE_ONE</samp></td>
6893 <div class="tablenoborder">
6894 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Output</strong></span><tbody class="tbody">
6896 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">resultDevHostPtr</samp></td>
6897 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">scalar result in the device or host memory (that is zero if <samp class="ph codeph">nnz == 0</samp>).
6903 <div class="tablenoborder">
6904 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Status Returned</strong></span><tbody class="tbody">
6906 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_SUCCESS</samp></td>
6907 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the operation completed successfully.</td>
6910 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_NOT_INITIALIZED</samp></td>
6911 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the library was not initialized.</td>
6914 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_ALLOC_FAILED</samp></td>
6915 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the reduction buffer could not be allocated.</td>
6918 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_INVALID_VALUE</samp></td>
6919 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the <samp class="ph codeph">idxBase</samp> is neither <samp class="ph codeph">CUSPARSE_INDEX_BASE_ZERO</samp> nor <samp class="ph codeph">CUSPARSE_INDEX_BASE_ONE</samp>.
6923 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_ARCH_MISMATCH</samp></td>
6924 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the device does not support double precision.</td>
6927 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_EXECUTION_FAILED</samp></td>
6928 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the function failed to launch on the GPU.</td>
6931 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_INTERNAL_ERROR</samp></td>
6932 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">an internal operation failed.</td>
6939 <div class="topic concept nested1" id="cusparse-lt-t-gt-dotci"><a name="cusparse-lt-t-gt-dotci" shape="rect">
<!-- --></a><h3 class="title topictitle2"><a href="#cusparse-lt-t-gt-dotci" name="cusparse-lt-t-gt-dotci" shape="rect">6.3. cusparse&lt;t&gt;dotci</a></h3>
<div class="body conbody"><pre xml:space="preserve">cusparseStatus_t
cusparseCdotci(cusparseHandle_t handle, int nnz,
               const cuComplex *xVal,
               const int *xInd, const cuComplex *y,
               cuComplex *resultDevHostPtr, cusparseIndexBase_t idxBase)
cusparseStatus_t
cusparseZdotci(cusparseHandle_t handle, int nnz,
               const cuDoubleComplex *xVal,
               const int *xInd, const cuDoubleComplex *y,
               cuDoubleComplex *resultDevHostPtr, cusparseIndexBase_t idxBase)</pre><p class="p">This function returns the dot product of the complex conjugate of a vector <samp class="ph codeph">x</samp> in sparse format and a vector <samp class="ph codeph">y</samp> in dense format. This operation can be written as
6952 <div class="tablenoborder">
6953 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="void" border="1" rules="all">
6954 <tbody class="tbody">
6956 <td class="entry" align="center" valign="top" rowspan="1" colspan="1">
<math xmlns="http://www.w3.org/1998/Math/MathML"><mtext>result</mtext><mo>=</mo><msup><mi>x</mi><mi>H</mi></msup><mi>y</mi></math>
<p class="p">in other words,</p><pre xml:space="preserve">for i=0 to nnz-1
    resultDevHostPtr += <math xmlns="http://www.w3.org/1998/Math/MathML"><mover><mrow class="MJX-TeXAtom-ORD"><mrow class="MJX-TeXAtom-ORD"><mtext>xVal[i]</mtext></mrow></mrow><mo accent="false">¯</mo></mover></math>*y[xInd[i]-idxBase]
</pre><p class="p">This function requires some temporary extra storage that is allocated internally. It is executed asynchronously with respect
to the host and it may return control to the application on the host before the result is ready.
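<p class="p">The conjugated sum above can be sketched on the host as follows. This is only an illustration of the arithmetic, not the library's GPU implementation. The type <samp class="ph codeph">ref_zcomplex</samp> is a hypothetical stand-in for <samp class="ph codeph">cuDoubleComplex</samp> (real part in <samp class="ph codeph">x</samp>, imaginary part in <samp class="ph codeph">y</samp>), and the name <samp class="ph codeph">ref_zdotci</samp> and the plain <samp class="ph codeph">int</samp> index base (0 or 1) are likewise assumptions.</p>

```c
#include <assert.h>

/* Hypothetical stand-in for cuDoubleComplex: x is the real part,
 * y is the imaginary part. Not part of the CUSPARSE API. */
typedef struct { double x; double y; } ref_zcomplex;

/* Hypothetical host-side reference for the conjugated sparse-dense dot
 * product: result = sum over i of conj(xVal[i]) * y[xInd[i] - idx_base]. */
ref_zcomplex ref_zdotci(int nnz, const ref_zcomplex *xVal, const int *xInd,
                        const ref_zcomplex *y, int idx_base)
{
    assert(nnz >= 0);
    assert(idx_base == 0 || idx_base == 1);
    ref_zcomplex result = {0.0, 0.0};  /* stays zero when nnz == 0 */
    for (int i = 0; i < nnz; ++i) {
        ref_zcomplex a = xVal[i];
        ref_zcomplex b = y[xInd[i] - idx_base];
        /* conj(a) * b = (a.x - i*a.y) * (b.x + i*b.y) */
        result.x += a.x * b.x + a.y * b.y;
        result.y += a.x * b.y - a.y * b.x;
    }
    return result;
}
```

<p class="p">For example, with a single non-zero value 1 + 2i at one-based index 1 and a dense entry 3 + 4i, this reference returns (1 - 2i)(3 + 4i) = 11 - 2i.</p>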
6981 <div class="tablenoborder">
6982 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Input</strong></span><tbody class="tbody">
6984 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">handle</samp></td>
6985 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">handle to the CUSPARSE library context.</td>
6988 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">nnz</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">number of non-zero elements in vector <samp class="ph codeph">x</samp>.
6993 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">xVal</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">&lt;type&gt; vector with <samp class="ph codeph">nnz</samp> non-zero values of vector <samp class="ph codeph">x</samp>.
6998 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">xInd</samp></td>
6999 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">integer vector with <samp class="ph codeph">nnz</samp> indices of the non-zero values of vector <samp class="ph codeph">x</samp>.
7003 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">y</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">&lt;type&gt; vector in dense format.</td>
7007 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">resultDevHostPtr</samp></td>
7008 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">pointer to the location of the result in the device or host memory.</td>
7011 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">idxBase</samp></td>
7012 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_INDEX_BASE_ZERO</samp> or <samp class="ph codeph">CUSPARSE_INDEX_BASE_ONE</samp></td>
7017 <div class="tablenoborder">
7018 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Output</strong></span><tbody class="tbody">
7020 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">resultDevHostPtr</samp></td>
7021 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">scalar result in the device or host memory (that is zero if <samp class="ph codeph">nnz == 0</samp>).
7027 <div class="tablenoborder">
7028 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Status Returned</strong></span><tbody class="tbody">
7030 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_SUCCESS</samp></td>
7031 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the operation completed successfully.</td>
7034 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_NOT_INITIALIZED</samp></td>
7035 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the library was not initialized.</td>
7038 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_ALLOC_FAILED</samp></td>
7039 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the reduction buffer could not be allocated.</td>
7042 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_INVALID_VALUE</samp></td>
7043 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the <samp class="ph codeph">idxBase</samp> is neither <samp class="ph codeph">CUSPARSE_INDEX_BASE_ZERO</samp> nor <samp class="ph codeph">CUSPARSE_INDEX_BASE_ONE</samp>.
7047 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_ARCH_MISMATCH</samp></td>
7048 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the device does not support double precision.</td>
7051 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_EXECUTION_FAILED</samp></td>
7052 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the function failed to launch on the GPU.</td>
7055 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_INTERNAL_ERROR</samp></td>
7056 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">an internal operation failed.</td>
</div>
</div>
<div class="topic concept nested1" id="cusparse-lt-t-gt-gthr"><a name="cusparse-lt-t-gt-gthr" shape="rect">
<!-- --></a><h3 class="title topictitle2"><a href="#cusparse-lt-t-gt-gthr" name="cusparse-lt-t-gt-gthr" shape="rect">6.4. cusparse&lt;t&gt;gthr</a></h3>
<div class="body conbody"><pre xml:space="preserve">cusparseStatus_t
cusparseSgthr(cusparseHandle_t handle, int nnz,
              const float *y,
              float *xVal, const int *xInd,
              cusparseIndexBase_t idxBase)
cusparseStatus_t
cusparseDgthr(cusparseHandle_t handle, int nnz,
              const double *y,
              double *xVal, const int *xInd,
              cusparseIndexBase_t idxBase)
cusparseStatus_t
cusparseCgthr(cusparseHandle_t handle, int nnz,
              const cuComplex *y,
              cuComplex *xVal, const int *xInd,
              cusparseIndexBase_t idxBase)
cusparseStatus_t
cusparseZgthr(cusparseHandle_t handle, int nnz,
              const cuDoubleComplex *y,
              cuDoubleComplex *xVal, const int *xInd,
              cusparseIndexBase_t idxBase)</pre><p class="p">This function gathers the elements of the vector <samp class="ph codeph">y</samp> listed in the index array <samp class="ph codeph">xInd</samp> into the data array <samp class="ph codeph">xVal</samp>.
</p>
<p class="p">This function requires no extra storage. It is executed asynchronously with respect to the host and it may return control
to the application on the host before the result is ready.
</p>
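<p class="p">The gather semantics can be sketched on the host as follows. This is a minimal single-precision illustration under the assumption that <samp class="ph codeph">idxBase</samp> is passed as the integer 0 or 1; the name <samp class="ph codeph">ref_gthr</samp> is hypothetical and the loop is not the library's implementation.</p>
<pre xml:space="preserve">/* Host-side reference (hypothetical helper): xVal[i] = y[xInd[i] - idxBase]
   for every i in [0, nnz). y is only read; xVal is untouched when nnz == 0. */
void ref_gthr(int nnz, const float *y, float *xVal, const int *xInd,
              int idxBase)
{
    for (int i = 0; i &lt; nnz; ++i) {
        xVal[i] = y[xInd[i] - idxBase];
    }
}
</pre>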
<div class="tablenoborder">
<table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Input</strong></span><tbody class="tbody">
<tr>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">handle</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">handle to the CUSPARSE library context.</td>
</tr>
<tr>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">nnz</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">number of elements in vector <samp class="ph codeph">x</samp>.</td>
</tr>
<tr>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">y</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">&lt;type&gt; vector in dense format (of <samp class="ph codeph">size≥max(xInd)-idxBase+1</samp>).</td>
</tr>
<tr>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">xInd</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">integer vector with <samp class="ph codeph">nnz</samp> indices of the non-zero values of vector <samp class="ph codeph">x</samp>.</td>
</tr>
<tr>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">idxBase</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_INDEX_BASE_ZERO</samp> or <samp class="ph codeph">CUSPARSE_INDEX_BASE_ONE</samp></td>
</tr>
</tbody>
</table>
</div>
<div class="tablenoborder">
<table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Output</strong></span><tbody class="tbody">
<tr>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">xVal</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">&lt;type&gt; vector with <samp class="ph codeph">nnz</samp> non-zero values that were gathered from vector <samp class="ph codeph">y</samp> (that is unchanged if <samp class="ph codeph">nnz == 0</samp>).</td>
</tr>
</tbody>
</table>
</div>
<div class="tablenoborder">
<table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Status Returned</strong></span><tbody class="tbody">
<tr>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_SUCCESS</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the operation completed successfully.</td>
</tr>
<tr>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_NOT_INITIALIZED</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the library was not initialized.</td>
</tr>
<tr>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_INVALID_VALUE</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the <samp class="ph codeph">idxBase</samp> is neither <samp class="ph codeph">CUSPARSE_INDEX_BASE_ZERO</samp> nor <samp class="ph codeph">CUSPARSE_INDEX_BASE_ONE</samp>.</td>
</tr>
<tr>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_ARCH_MISMATCH</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the device does not support double precision.</td>
</tr>
<tr>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_EXECUTION_FAILED</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the function failed to launch on the GPU.</td>
</tr>
</tbody>
</table>
</div>
</div>
</div>
<div class="topic concept nested1" id="cusparse-lt-t-gt-gthrz"><a name="cusparse-lt-t-gt-gthrz" shape="rect">
<!-- --></a><h3 class="title topictitle2"><a href="#cusparse-lt-t-gt-gthrz" name="cusparse-lt-t-gt-gthrz" shape="rect">6.5. cusparse&lt;t&gt;gthrz</a></h3>
<div class="body conbody"><pre xml:space="preserve">cusparseStatus_t
cusparseSgthrz(cusparseHandle_t handle, int nnz, float *y,
               float *xVal, const int *xInd,
               cusparseIndexBase_t idxBase)
cusparseStatus_t
cusparseDgthrz(cusparseHandle_t handle, int nnz, double *y,
               double *xVal, const int *xInd,
               cusparseIndexBase_t idxBase)
cusparseStatus_t
cusparseCgthrz(cusparseHandle_t handle, int nnz, cuComplex *y,
               cuComplex *xVal, const int *xInd,
               cusparseIndexBase_t idxBase)
cusparseStatus_t
cusparseZgthrz(cusparseHandle_t handle, int nnz, cuDoubleComplex *y,
               cuDoubleComplex *xVal, const int *xInd,
               cusparseIndexBase_t idxBase)</pre><p class="p">This function gathers the elements of the vector <samp class="ph codeph">y</samp> listed in the index array <samp class="ph codeph">xInd</samp> into the data array <samp class="ph codeph">xVal</samp>. It also zeroes out the gathered elements in the vector <samp class="ph codeph">y</samp>.
</p>
<p class="p">This function requires no extra storage. It is executed asynchronously with respect to the host and it may return control
to the application on the host before the result is ready.
</p>
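<p class="p">A host-side sketch of gather-and-zero follows. It is a minimal single-precision illustration, assuming <samp class="ph codeph">idxBase</samp> is passed as the integer 0 or 1; the name <samp class="ph codeph">ref_gthrz</samp> is hypothetical and the loop is not the library's implementation.</p>
<pre xml:space="preserve">/* Host-side reference (hypothetical helper): copies y[xInd[i] - idxBase]
   into xVal[i], then sets that element of y to zero. Both arrays are left
   unchanged when nnz == 0. */
void ref_gthrz(int nnz, float *y, float *xVal, const int *xInd, int idxBase)
{
    for (int i = 0; i &lt; nnz; ++i) {
        int j = xInd[i] - idxBase;
        xVal[i] = y[j];
        y[j] = 0.0f;
    }
}
</pre>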
<div class="tablenoborder">
<table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Input</strong></span><tbody class="tbody">
<tr>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">handle</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">handle to the CUSPARSE library context.</td>
</tr>
<tr>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">nnz</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">number of elements in vector <samp class="ph codeph">x</samp>.</td>
</tr>
<tr>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">y</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">&lt;type&gt; vector in dense format (of <samp class="ph codeph">size≥max(xInd)-idxBase+1</samp>).</td>
</tr>
<tr>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">xInd</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">integer vector with <samp class="ph codeph">nnz</samp> indices of the non-zero values of vector <samp class="ph codeph">x</samp>.</td>
</tr>
<tr>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">idxBase</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_INDEX_BASE_ZERO</samp> or <samp class="ph codeph">CUSPARSE_INDEX_BASE_ONE</samp></td>
</tr>
</tbody>
</table>
</div>
<div class="tablenoborder">
<table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Output</strong></span><tbody class="tbody">
<tr>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">xVal</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">&lt;type&gt; vector with <samp class="ph codeph">nnz</samp> non-zero values that were gathered from vector <samp class="ph codeph">y</samp> (that is unchanged if <samp class="ph codeph">nnz == 0</samp>).</td>
</tr>
<tr>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">y</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">&lt;type&gt; vector in dense format with elements indexed by <samp class="ph codeph">xInd</samp> set to zero (it is unchanged if <samp class="ph codeph">nnz == 0</samp>).</td>
</tr>
</tbody>
</table>
</div>
<div class="tablenoborder">
<table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Status Returned</strong></span><tbody class="tbody">
<tr>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_SUCCESS</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the operation completed successfully.</td>
</tr>
<tr>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_NOT_INITIALIZED</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the library was not initialized.</td>
</tr>
<tr>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_INVALID_VALUE</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the <samp class="ph codeph">idxBase</samp> is neither <samp class="ph codeph">CUSPARSE_INDEX_BASE_ZERO</samp> nor <samp class="ph codeph">CUSPARSE_INDEX_BASE_ONE</samp>.</td>
</tr>
<tr>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_ARCH_MISMATCH</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the device does not support double precision.</td>
</tr>
<tr>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_EXECUTION_FAILED</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the function failed to launch on the GPU.</td>
</tr>
</tbody>
</table>
</div>
</div>
</div>
<div class="topic concept nested1" id="cusparse-lt-t-gt-roti"><a name="cusparse-lt-t-gt-roti" shape="rect">
<!-- --></a><h3 class="title topictitle2"><a href="#cusparse-lt-t-gt-roti" name="cusparse-lt-t-gt-roti" shape="rect">6.6. cusparse&lt;t&gt;roti</a></h3>
<div class="body conbody"><pre xml:space="preserve">cusparseStatus_t
cusparseSroti(cusparseHandle_t handle, int nnz, float *xVal,
              const int *xInd,
              float *y, const float *c, const float *s,
              cusparseIndexBase_t idxBase)
cusparseStatus_t
cusparseDroti(cusparseHandle_t handle, int nnz, double *xVal,
              const int *xInd,
              double *y, const double *c, const double *s,
              cusparseIndexBase_t idxBase)</pre><p class="p">This function applies the Givens rotation matrix</p>
<div class="tablenoborder">
<table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="void" border="1" rules="all">
<tbody class="tbody">
<tr>
<td class="entry" align="center" valign="top" rowspan="1" colspan="1">
<p class="p d4p_eqn_block">
<math xmlns="http://www.w3.org/1998/Math/MathML">
<mi>G</mi>
<mo>=</mo>
<mfenced open="(" close=")">
<mtable columnalign="right right" rowspacing="4pt" columnspacing="1em">
<mtr><mtd><mi>c</mi></mtd><mtd><mi>s</mi></mtd></mtr>
<mtr><mtd><mo>−</mo><mi>s</mi></mtd><mtd><mi>c</mi></mtd></mtr>
</mtable>
</mfenced>
</math>
</p>
</td>
</tr>
</tbody>
</table>
</div>
<p class="p">to sparse <samp class="ph codeph">x</samp> and dense <samp class="ph codeph">y</samp> vectors. In other words,
</p><pre xml:space="preserve">for i=0 to nnz-1
    yOld               = y[xInd[i]-idxBase]
    y[xInd[i]-idxBase] = c * yOld - s * xVal[i]
    xVal[i]            = c * xVal[i] + s * yOld
</pre><div class="tablenoborder">
<table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Input</strong></span><tbody class="tbody">
<tr>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">handle</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">handle to the CUSPARSE library context.</td>
</tr>
<tr>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">nnz</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">number of elements in vector <samp class="ph codeph">x</samp>.</td>
</tr>
<tr>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">xVal</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">&lt;type&gt; vector with <samp class="ph codeph">nnz</samp> non-zero values of vector <samp class="ph codeph">x</samp>.</td>
</tr>
<tr>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">xInd</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">integer vector with <samp class="ph codeph">nnz</samp> indices of the non-zero values of vector <samp class="ph codeph">x</samp>.</td>
</tr>
<tr>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">y</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">&lt;type&gt; vector in dense format.</td>
</tr>
<tr>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">c</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">cosine element of the rotation matrix.</td>
</tr>
<tr>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">s</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">sine element of the rotation matrix.</td>
</tr>
<tr>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">idxBase</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_INDEX_BASE_ZERO</samp> or <samp class="ph codeph">CUSPARSE_INDEX_BASE_ONE</samp></td>
</tr>
</tbody>
</table>
</div>
<div class="tablenoborder">
<table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Output</strong></span><tbody class="tbody">
<tr>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">xVal</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">&lt;type&gt; updated vector in sparse format (that is unchanged if <samp class="ph codeph">nnz == 0</samp>).</td>
</tr>
<tr>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">y</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">&lt;type&gt; updated vector in dense format (that is unchanged if <samp class="ph codeph">nnz == 0</samp>).</td>
</tr>
</tbody>
</table>
</div>
<div class="tablenoborder">
<table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Status Returned</strong></span><tbody class="tbody">
<tr>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_SUCCESS</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the operation completed successfully.</td>
</tr>
<tr>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_NOT_INITIALIZED</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the library was not initialized.</td>
</tr>
<tr>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_INVALID_VALUE</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the <samp class="ph codeph">idxBase</samp> is neither <samp class="ph codeph">CUSPARSE_INDEX_BASE_ZERO</samp> nor <samp class="ph codeph">CUSPARSE_INDEX_BASE_ONE</samp>.</td>
</tr>
<tr>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_ARCH_MISMATCH</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the device does not support double precision.</td>
</tr>
<tr>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_EXECUTION_FAILED</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the function failed to launch on the GPU.</td>
</tr>
</tbody>
</table>
</div>
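<p class="p">The rotation can be sketched on the host as follows. This is a minimal single-precision illustration in which both updates are computed from the values held before the rotation; the name <samp class="ph codeph">ref_roti</samp> is hypothetical, <samp class="ph codeph">c</samp> and <samp class="ph codeph">s</samp> are passed by value for brevity, and <samp class="ph codeph">idxBase</samp> is assumed to be the integer 0 or 1. It is not the library's implementation.</p>
<pre xml:space="preserve">/* Host-side reference (hypothetical helper) for the Givens rotation
       x' =  c * x + s * y
       y' = -s * x + c * y
   applied to each stored element of sparse x and the matching element of
   dense y. Temporaries keep the old values so neither update sees the other. */
void ref_roti(int nnz, float *xVal, const int *xInd, float *y,
              float c, float s, int idxBase)
{
    for (int i = 0; i &lt; nnz; ++i) {
        int j = xInd[i] - idxBase;
        float xOld = xVal[i];
        float yOld = y[j];
        xVal[i] = c * xOld + s * yOld;
        y[j]    = c * yOld - s * xOld;
    }
}
</pre>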
</div>
</div>
<div class="topic concept nested1" id="cusparse-lt-t-gt-sctr"><a name="cusparse-lt-t-gt-sctr" shape="rect">
<!-- --></a><h3 class="title topictitle2"><a href="#cusparse-lt-t-gt-sctr" name="cusparse-lt-t-gt-sctr" shape="rect">6.7. cusparse&lt;t&gt;sctr</a></h3>
<div class="body conbody"><pre xml:space="preserve">cusparseStatus_t
cusparseSsctr(cusparseHandle_t handle, int nnz,
              const float *xVal,
              const int *xInd, float *y,
              cusparseIndexBase_t idxBase)
cusparseStatus_t
cusparseDsctr(cusparseHandle_t handle, int nnz,
              const double *xVal,
              const int *xInd, double *y,
              cusparseIndexBase_t idxBase)
cusparseStatus_t
cusparseCsctr(cusparseHandle_t handle, int nnz,
              const cuComplex *xVal,
              const int *xInd, cuComplex *y,
              cusparseIndexBase_t idxBase)
cusparseStatus_t
cusparseZsctr(cusparseHandle_t handle, int nnz,
              const cuDoubleComplex *xVal,
              const int *xInd, cuDoubleComplex *y,
              cusparseIndexBase_t idxBase)</pre><p class="p">This function scatters the elements of the vector <samp class="ph codeph">x</samp> in sparse format into the vector <samp class="ph codeph">y</samp> in dense format. It modifies only the elements of <samp class="ph codeph">y</samp> whose indices are listed in the array <samp class="ph codeph">xInd</samp>.
</p>
<p class="p">This function requires no extra storage. It is executed asynchronously with respect to the host and it may return control
to the application on the host before the result is ready.
</p>
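<p class="p">The scatter semantics can be sketched on the host as follows. This is a minimal single-precision illustration, assuming <samp class="ph codeph">idxBase</samp> is passed as the integer 0 or 1; the name <samp class="ph codeph">ref_sctr</samp> is hypothetical and the loop is not the library's implementation.</p>
<pre xml:space="preserve">/* Host-side reference (hypothetical helper): y[xInd[i] - idxBase] = xVal[i]
   for every i in [0, nnz). Elements of y not listed in xInd are untouched. */
void ref_sctr(int nnz, const float *xVal, const int *xInd, float *y,
              int idxBase)
{
    for (int i = 0; i &lt; nnz; ++i) {
        y[xInd[i] - idxBase] = xVal[i];
    }
}
</pre>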
<div class="tablenoborder">
<table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Input</strong></span><tbody class="tbody">
<tr>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">handle</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">handle to the CUSPARSE library context.</td>
</tr>
<tr>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">nnz</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">number of elements in vector <samp class="ph codeph">x</samp>.</td>
</tr>
<tr>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">xVal</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">&lt;type&gt; vector with <samp class="ph codeph">nnz</samp> non-zero values of vector <samp class="ph codeph">x</samp>.</td>
</tr>
<tr>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">xInd</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">integer vector with <samp class="ph codeph">nnz</samp> indices of the non-zero values of vector <samp class="ph codeph">x</samp>.</td>
</tr>
<tr>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">y</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">&lt;type&gt; dense vector (of <samp class="ph codeph">size≥max(xInd)-idxBase+1</samp>).</td>
</tr>
<tr>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">idxBase</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_INDEX_BASE_ZERO</samp> or <samp class="ph codeph">CUSPARSE_INDEX_BASE_ONE</samp></td>
</tr>
</tbody>
</table>
</div>
<div class="tablenoborder">
<table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Output</strong></span><tbody class="tbody">
<tr>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">y</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">&lt;type&gt; vector with <samp class="ph codeph">nnz</samp> non-zero values that were scattered from vector <samp class="ph codeph">x</samp> (that is unchanged if <samp class="ph codeph">nnz == 0</samp>).</td>
</tr>
</tbody>
</table>
</div>
<div class="tablenoborder">
<table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Status Returned</strong></span><tbody class="tbody">
<tr>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_SUCCESS</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the operation completed successfully.</td>
</tr>
<tr>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_NOT_INITIALIZED</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the library was not initialized.</td>
</tr>
<tr>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_INVALID_VALUE</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the <samp class="ph codeph">idxBase</samp> is neither <samp class="ph codeph">CUSPARSE_INDEX_BASE_ZERO</samp> nor <samp class="ph codeph">CUSPARSE_INDEX_BASE_ONE</samp>.</td>
</tr>
<tr>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_ARCH_MISMATCH</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the device does not support double precision.</td>
</tr>
<tr>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_EXECUTION_FAILED</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the function failed to launch on the GPU.</td>
</tr>
</tbody>
</table>
</div>
</div>
</div>
</div>
<div class="topic concept nested0" id="cusparse-level-2-function-reference"><a name="cusparse-level-2-function-reference" shape="rect">
<!-- --></a><h2 class="title topictitle1"><a href="#cusparse-level-2-function-reference" name="cusparse-level-2-function-reference" shape="rect">7. CUSPARSE Level 2 Function Reference</a></h2>
<div class="body conbody">
<p class="p">This chapter describes the sparse linear algebra functions that perform operations between sparse matrices and dense vectors.</p>
<p class="p">In particular, the solution of sparse triangular linear systems is implemented in two phases. First, during the analysis phase,
the sparse triangular matrix is analyzed to determine the dependencies between its elements by calling the appropriate <samp class="ph codeph">csrsv_analysis()</samp> function. The analysis is specific to the sparsity pattern of the given matrix and to the selected <samp class="ph codeph">cusparseOperation_t</samp> type. The information from the analysis phase is stored in the parameter of type <samp class="ph codeph">cusparseSolveAnalysisInfo_t</samp> that has been initialized previously with a call to <samp class="ph codeph">cusparseCreateSolveAnalysisInfo()</samp>.
</p>
<p class="p">Second, during the solve phase, the given sparse triangular linear system is solved using the information stored in the <samp class="ph codeph">cusparseSolveAnalysisInfo_t</samp> parameter by calling the appropriate <samp class="ph codeph">csrsv_solve()</samp> function. The solve phase may be performed multiple times with different right-hand sides, while the analysis phase needs
to be performed only once. This is especially useful when a sparse triangular linear system must be solved for a set of different
right-hand sides one at a time, while its coefficient matrix remains the same.
</p>
<p class="p">Finally, once all the solves have completed, the opaque data structure pointed to by the <samp class="ph codeph">cusparseSolveAnalysisInfo_t</samp> parameter can be released by calling <samp class="ph codeph">cusparseDestroySolveAnalysisInfo()</samp>. For more information, please refer to [3].
</p>
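<p class="p">The analyze-once, solve-many pattern can be illustrated with a small host-side sketch. Everything in it is an assumption made for illustration and not the algorithm CUSPARSE uses: the names <samp class="ph codeph">RefAnalysisInfo</samp>, <samp class="ph codeph">ref_analysis</samp>, <samp class="ph codeph">ref_solve</samp> and <samp class="ph codeph">ref_destroy</samp> are hypothetical, the matrix is taken to be lower triangular in zero-based CSR format, and the analysis simply assigns each row a dependency level derived from the sparsity pattern.</p>
<pre xml:space="preserve">#include &lt;stdlib.h&gt;

/* Hypothetical stand-in for the opaque analysis data. */
typedef struct {
    int  m;          /* number of rows */
    int *level;      /* level[i] = 1 + largest level among rows that row i needs */
    int  numLevels;
} RefAnalysisInfo;

/* Analysis phase: uses only the sparsity pattern (rowPtr, colInd). */
RefAnalysisInfo ref_analysis(int m, const int *rowPtr, const int *colInd)
{
    RefAnalysisInfo info = { m, malloc((size_t)m * sizeof(int)), 0 };
    if (info.level == NULL) {      /* allocation failed: return an empty result */
        info.m = 0;
        return info;
    }
    for (int i = 0; i &lt; m; ++i) {
        int lvl = 0;
        for (int k = rowPtr[i]; k &lt; rowPtr[i + 1]; ++k) {
            int j = colInd[k];
            if (j &lt; i) {
                if (info.level[j] + 1 &gt; lvl) lvl = info.level[j] + 1;
            }
        }
        info.level[i] = lvl;
        if (lvl + 1 &gt; info.numLevels) info.numLevels = lvl + 1;
    }
    return info;
}

/* Solve phase: solves L * y = x, reusing the analysis. Rows that share a
   level do not depend on each other and could be processed in parallel.
   A row without a stored diagonal entry is treated as having a unit diagonal. */
void ref_solve(const RefAnalysisInfo *info, const float *val,
               const int *rowPtr, const int *colInd,
               const float *x, float *y)
{
    for (int lvl = 0; lvl &lt; info-&gt;numLevels; ++lvl) {
        for (int i = 0; i &lt; info-&gt;m; ++i) {
            if (info-&gt;level[i] != lvl) continue;
            float sum = x[i];
            float diag = 1.0f;
            for (int k = rowPtr[i]; k &lt; rowPtr[i + 1]; ++k) {
                int j = colInd[k];
                if (j &lt; i)       sum -= val[k] * y[j];
                else if (j == i) diag = val[k];
            }
            y[i] = sum / diag;
        }
    }
}

/* Releases the analysis data once all solves have completed. */
void ref_destroy(RefAnalysisInfo *info)
{
    free(info-&gt;level);
    info-&gt;level = NULL;
}
</pre>
<p class="p">In this sketch <samp class="ph codeph">ref_analysis</samp> is called once per matrix, <samp class="ph codeph">ref_solve</samp> once per right-hand side, and <samp class="ph codeph">ref_destroy</samp> at the end, mirroring the create, analyze, solve, destroy sequence described above.</p>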
</div>
<div class="topic concept nested1" id="cusparse-lt-t-gt-bsrmv"><a name="cusparse-lt-t-gt-bsrmv" shape="rect">
<!-- --></a><h3 class="title topictitle2"><a href="#cusparse-lt-t-gt-bsrmv" name="cusparse-lt-t-gt-bsrmv" shape="rect">7.1. cusparse&lt;t&gt;bsrmv</a></h3>
<div class="body conbody"><pre xml:space="preserve">cusparseStatus_t
cusparseSbsrmv(cusparseHandle_t handle, cusparseDirection_t dir,
               cusparseOperation_t trans, int mb, int nb, int nnzb,
               const float *alpha, const cusparseMatDescr_t descr,
               const float *bsrVal, const int *bsrRowPtr, const int *bsrColInd,
               int blockDim, const float *x,
               const float *beta, float *y)
cusparseStatus_t
cusparseDbsrmv(cusparseHandle_t handle, cusparseDirection_t dir,
               cusparseOperation_t trans, int mb, int nb, int nnzb,
               const double *alpha, const cusparseMatDescr_t descr,
               const double *bsrVal, const int *bsrRowPtr, const int *bsrColInd,
               int blockDim, const double *x,
               const double *beta, double *y)
cusparseStatus_t
cusparseCbsrmv(cusparseHandle_t handle, cusparseDirection_t dir,
               cusparseOperation_t trans, int mb, int nb, int nnzb,
               const cuComplex *alpha, const cusparseMatDescr_t descr,
               const cuComplex *bsrVal, const int *bsrRowPtr, const int *bsrColInd,
               int blockDim, const cuComplex *x,
               const cuComplex *beta, cuComplex *y)
cusparseStatus_t
cusparseZbsrmv(cusparseHandle_t handle, cusparseDirection_t dir,
               cusparseOperation_t trans, int mb, int nb, int nnzb,
               const cuDoubleComplex *alpha, const cusparseMatDescr_t descr,
               const cuDoubleComplex *bsrVal, const int *bsrRowPtr, const int *bsrColInd,
               int blockDim, const cuDoubleComplex *x,
               const cuDoubleComplex *beta, cuDoubleComplex *y)</pre><p class="p">This function performs the matrix-vector operation</p>
<div class="tablenoborder">
<table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="void" border="1" rules="all">
<tbody class="tbody">
<tr>
<td class="entry" align="center" valign="top" rowspan="1" colspan="1">
<math xmlns="http://www.w3.org/1998/Math/MathML">
<mi>y</mi><mo>=</mo><mi>α</mi><mo>∗</mo><mtext>op</mtext><mo stretchy="false">(</mo><mi>A</mi><mo stretchy="false">)</mo><mo>∗</mo><mi>x</mi><mo>+</mo><mi>β</mi><mo>∗</mo><mi>y</mi>
</math>
</td>
</tr>
</tbody>
</table>
</div>
<p class="p">where
<math xmlns="http://www.w3.org/1998/Math/MathML">
<mi>A</mi><mtext> is an </mtext><mo stretchy="false">(</mo><mi>mb</mi><mo>∗</mo><mi>blockDim</mi><mo stretchy="false">)</mo><mo>×</mo><mo stretchy="false">(</mo><mi>nb</mi><mo>∗</mo><mi>blockDim</mi><mo stretchy="false">)</mo>
</math>
sparse matrix (that is defined in BSR storage format by the three arrays <samp class="ph codeph">bsrVal</samp>, <samp class="ph codeph">bsrRowPtr</samp>, and <samp class="ph codeph">bsrColInd</samp>), <samp class="ph codeph">x</samp> and <samp class="ph codeph">y</samp> are vectors,
<math xmlns="http://www.w3.org/1998/Math/MathML">
<mi>α</mi><mtext> and </mtext><mi>β</mi>
</math>
are scalars, and
</p>
<div class="tablenoborder">
<table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="void" border="1" rules="all">
<tbody class="tbody">
<tr>
<td class="entry" align="center" valign="top" rowspan="1" colspan="1">
<math xmlns="http://www.w3.org/1998/Math/MathML">
<mtext>op</mtext><mo stretchy="false">(</mo><mi>A</mi><mo stretchy="false">)</mo><mo>=</mo>
<mfenced open="{" close="">
<mtable columnalign="left left" rowspacing=".1em" columnspacing="1em">
<mtr><mtd><mi>A</mi></mtd><mtd><mtext>if trans == CUSPARSE_OPERATION_NON_TRANSPOSE</mtext></mtd></mtr>
<mtr><mtd><msup><mi>A</mi><mi>T</mi></msup></mtd><mtd><mtext>if trans == CUSPARSE_OPERATION_TRANSPOSE</mtext></mtd></mtr>
<mtr><mtd><msup><mi>A</mi><mi>H</mi></msup></mtd><mtd><mtext>if trans == CUSPARSE_OPERATION_CONJUGATE_TRANSPOSE</mtext></mtd></mtr>
</mtable>
</mfenced>
</math>
</td>
</tr>
</tbody>
</table>
</div>
<p class="p">Several comments on bsrmv:</p>
<p class="p">1. Only <samp class="ph codeph">CUSPARSE_OPERATION_NON_TRANSPOSE</samp> is supported, i.e.
</p>
<div class="tablenoborder">
<table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="void" border="1" rules="all">
<tbody class="tbody">
<tr>
<td class="entry" align="center" valign="top" rowspan="1" colspan="1">
<math xmlns="http://www.w3.org/1998/Math/MathML">
<mi>y</mi><mo>=</mo><mi>α</mi><mo>∗</mo><mi>A</mi><mo>∗</mo><mi>x</mi><mo>+</mo><mi>β</mi><mo>∗</mo><mi>y</mi>
</math>
</td>
</tr>
</tbody>
</table>
</div>
<p class="p">2. Only <samp class="ph codeph">CUSPARSE_MATRIX_TYPE_GENERAL</samp> is supported.
</p>
7707 <p class="p">3. The size of vector <samp class="ph codeph">x</samp> should be
7710 <math xmlns="http://www.w3.org/1998/Math/MathML">
7711 <mo stretchy="false">(</mo>
7723 <mo stretchy="false">)</mo>
7726 at least and the size of vector <samp class="ph codeph">y</samp> should be
7729 <math xmlns="http://www.w3.org/1998/Math/MathML">
7730 <mo stretchy="false">(</mo>
7742 <mo stretchy="false">)</mo>
at least. Otherwise, the kernel may return <samp class="ph codeph">CUSPARSE_STATUS_EXECUTION_FAILED</samp> because of an out-of-bounds array access.
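<p class="p">The size requirement in comment 3 can be illustrated with a host-side reference sketch. It is not part of the CUSPARSE API: the names <samp class="ph codeph">padded_len</samp> and <samp class="ph codeph">bsrmv_ref</samp> are hypothetical, <samp class="ph codeph">rowMajor</samp> stands in for <samp class="ph codeph">dir</samp>, and <samp class="ph codeph">base</samp> stands in for the index base of <samp class="ph codeph">descr</samp>. The sketch only shows why every block row reads and writes a full <samp class="ph codeph">blockDim</samp> entries of <samp class="ph codeph">x</samp> and <samp class="ph codeph">y</samp>.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: length of a vector of n entries after padding
 * to a whole number of blocks. */
static size_t padded_len(int n, int blockDim)
{
    assert(n >= 0 && blockDim > 0);
    return (size_t)((n + blockDim - 1) / blockDim) * (size_t)blockDim;
}

/* Hypothetical host reference for y = alpha*A*x + beta*y with A in BSR.
 * rowMajor != 0 models CUSPARSE_DIRECTION_ROW, 0 models
 * CUSPARSE_DIRECTION_COLUMN; base is 0 or 1.
 * x must hold nb*blockDim entries and y must hold mb*blockDim entries. */
static void bsrmv_ref(int rowMajor, int base, int mb, int nb,
                      float alpha, const float *bsrVal,
                      const int *bsrRowPtr, const int *bsrColInd,
                      int blockDim, const float *x, float beta, float *y)
{
    assert(blockDim > 0 && (base == 0 || base == 1));
    for (int ib = 0; ib < mb; ++ib) {
        for (int r = 0; r < blockDim; ++r) {
            float sum = 0.0f;
            for (int k = bsrRowPtr[ib] - base; k < bsrRowPtr[ib + 1] - base; ++k) {
                int jb = bsrColInd[k] - base;      /* block column index */
                assert(jb >= 0 && jb < nb);
                const float *blk = bsrVal + (size_t)k * blockDim * blockDim;
                for (int c = 0; c < blockDim; ++c) {
                    float a = rowMajor ? blk[r * blockDim + c]
                                       : blk[c * blockDim + r];
                    sum += a * x[jb * blockDim + c];
                }
            }
            int i = ib * blockDim + r;
            /* When beta is zero, the old contents of y are never read. */
            y[i] = alpha * sum + (beta == 0.0f ? 0.0f : beta * y[i]);
        }
    }
}
```

<p class="p">For a 3 x 3 matrix with <samp class="ph codeph">blockDim</samp> equal to 2, both vectors need 4 entries, and the padded entry of <samp class="ph codeph">x</samp> should hold a finite value such as zero.</p>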
<p class="p">Example: suppose the user has a matrix in CSR format and wants to try bsrmv. The following code demonstrates csr2bsr and bsrmv in single precision.
</p><pre xml:space="preserve"><span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-comment">// Suppose that A is an m x n sparse matrix represented in CSR format, </span>
<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-comment">// hx is a host vector of size n, and hy is a host vector of size m. </span>
<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-comment">// m and n are not multiples of blockDim.</span>
<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-comment">// step 1: transform CSR to BSR with column-major order </span>
<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">int</span> base, nnzb;
cusparseDirection_t dirA = CUSPARSE_DIRECTION_COLUMN;
<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">int</span> mb = (m + <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-attribute">blockDim</span>-1)/<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-attribute">blockDim</span>;
<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">int</span> nb = (n + <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-attribute">blockDim</span>-1)/<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-attribute">blockDim</span>;
cudaMalloc((<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">void</span>**)&bsrRowPtrC, <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">sizeof</span>(<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">int</span>) *(mb+1));
cusparseXcsr2bsrNnz(handle, dirA, m, n,
        descrA, csrRowPtrA, csrColIndA, <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-attribute">blockDim</span>,
        descrC, bsrRowPtrC);
cudaMemcpy(&nnzb, bsrRowPtrC+mb, <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">sizeof</span>(<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">int</span>), cudaMemcpyDeviceToHost);
cudaMemcpy(&base, bsrRowPtrC , <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">sizeof</span>(<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">int</span>), cudaMemcpyDeviceToHost);
<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-comment">// number of non-zero blocks = bsrRowPtrC[mb] - bsrRowPtrC[0]</span>
nnzb -= base;
cudaMalloc((<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">void</span>**)&bsrColIndC, <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">sizeof</span>(<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">int</span>)*nnzb);
cudaMalloc((<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">void</span>**)&bsrValC, <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">sizeof</span>(<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">float</span>)*(<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-attribute">blockDim</span>*<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-attribute">blockDim</span>)*nnzb);
cusparseScsr2bsr(handle, dirA, m, n,
        descrA, csrValA, csrRowPtrA, csrColIndA, <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-attribute">blockDim</span>,
        descrC, bsrValC, bsrRowPtrC, bsrColIndC);
<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-comment">// step 2: allocate vector x and vector y large enough for bsrmv </span>
cudaMalloc((<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">void</span>**)&x, <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">sizeof</span>(<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">float</span>)*(nb*<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-attribute">blockDim</span>));
cudaMalloc((<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">void</span>**)&y, <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">sizeof</span>(<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">float</span>)*(mb*<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-attribute">blockDim</span>));
cudaMemcpy(x, hx, <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">sizeof</span>(<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">float</span>)*n, cudaMemcpyHostToDevice);
cudaMemcpy(y, hy, <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">sizeof</span>(<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">float</span>)*m, cudaMemcpyHostToDevice);
<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-comment">// step 3: perform bsrmv</span>
cusparseSbsrmv(handle, dirA, transA, mb, nb, nnzb, alpha, descrC, bsrValC, bsrRowPtrC, bsrColIndC, <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-attribute">blockDim</span>, x, beta, y);</pre><div class="tablenoborder">
7776 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Input</strong></span><tbody class="tbody">
7778 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">handle</samp></td>
7779 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">handle to the CUSPARSE library context.</td>
7782 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">dir</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">storage format of blocks, either <samp class="ph codeph">CUSPARSE_DIRECTION_ROW</samp> or <samp class="ph codeph">CUSPARSE_DIRECTION_COLUMN</samp>.
7787 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">trans</samp></td>
7788 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the operation
7791 <math xmlns="http://www.w3.org/1998/Math/MathML">
7792 <mrow class="MJX-TeXAtom-ORD">
7793 <mrow class="MJX-TeXAtom-ORD">
7797 <mo stretchy="false">(</mo>
7799 <mo stretchy="false">)</mo>
7802 . Only <samp class="ph codeph">CUSPARSE_OPERATION_NON_TRANSPOSE</samp> is supported.
7806 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">mb</samp></td>
7807 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">number of block rows of matrix
7808 <math xmlns="http://www.w3.org/1998/Math/MathML">
7814 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">nb</samp></td>
7815 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">number of block columns of matrix
7816 <math xmlns="http://www.w3.org/1998/Math/MathML">
7822 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">nnzb</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">number of non-zero blocks of matrix
7824 <math xmlns="http://www.w3.org/1998/Math/MathML">
7830 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">alpha</samp></td>
7831 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><type> scalar used for multiplication.</td>
7834 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">descr</samp></td>
7835 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the descriptor of matrix
7836 <math xmlns="http://www.w3.org/1998/Math/MathML">
7838 </math>. The supported matrix type is <samp class="ph codeph">CUSPARSE_MATRIX_TYPE_GENERAL</samp>. Also, the supported index bases are <samp class="ph codeph">CUSPARSE_INDEX_BASE_ZERO</samp> and <samp class="ph codeph">CUSPARSE_INDEX_BASE_ONE</samp>.
7842 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">bsrVal</samp></td>
7843 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><type> array of
<samp class="ph codeph">nnzb</samp><math xmlns="http://www.w3.org/1998/Math/MathML">
<mo stretchy="false">(</mo>
</math><samp class="ph codeph">bsrRowPtr(mb)</samp><math xmlns="http://www.w3.org/1998/Math/MathML">
</math><samp class="ph codeph">bsrRowPtr(0)</samp><math xmlns="http://www.w3.org/1998/Math/MathML">
7850 <mo stretchy="false">)</mo>
7852 non-zero blocks of matrix
7853 <math xmlns="http://www.w3.org/1998/Math/MathML">
7859 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">bsrRowPtr</samp></td>
7860 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">integer array of <samp class="ph codeph">mb</samp><math xmlns="http://www.w3.org/1998/Math/MathML">
7865 elements that contains the start of every block row and the end of the last block row plus one.
7869 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">bsrColInd</samp></td>
7870 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">integer array of
<samp class="ph codeph">nnzb</samp><math xmlns="http://www.w3.org/1998/Math/MathML">
<mo stretchy="false">(</mo>
</math><samp class="ph codeph">bsrRowPtr(mb)</samp><math xmlns="http://www.w3.org/1998/Math/MathML">
</math><samp class="ph codeph">bsrRowPtr(0)</samp><math xmlns="http://www.w3.org/1998/Math/MathML">
7877 <mo stretchy="false">)</mo>
7879 column indices of the non-zero blocks of matrix
7880 <math xmlns="http://www.w3.org/1998/Math/MathML">
7886 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">blockDim</samp></td>
7887 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">block dimension of sparse matrix
7888 <math xmlns="http://www.w3.org/1998/Math/MathML">
7890 </math>, larger than zero.
7894 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">x</samp></td>
7895 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><type> vector of
7898 <math xmlns="http://www.w3.org/1998/Math/MathML">
7916 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">beta</samp></td>
7917 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><type> scalar used for multiplication. If <samp class="ph codeph">beta</samp> is zero, <samp class="ph codeph">y</samp> does not have to be a valid input.
7921 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">y</samp></td>
7922 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><type> vector of
7925 <math xmlns="http://www.w3.org/1998/Math/MathML">
7945 <div class="tablenoborder">
7946 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Output</strong></span><tbody class="tbody">
7948 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">y</samp></td>
7949 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><type> updated vector.</td>
7954 <div class="tablenoborder">
7955 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Status Returned</strong></span><tbody class="tbody">
7957 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_SUCCESS</samp></td>
7958 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the operation completed successfully.</td>
7961 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_NOT_INITIALIZED</samp></td>
7962 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the library was not initialized.</td>
7965 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_INVALID_VALUE</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">invalid parameters were passed (<samp class="ph codeph">mb,nb,nnzb<0</samp>, <samp class="ph codeph">trans != CUSPARSE_OPERATION_NON_TRANSPOSE</samp>,
7969 <math xmlns="http://www.w3.org/1998/Math/MathML">
7982 , <samp class="ph codeph">dir</samp> is not row-major or column-major, or <samp class="ph codeph">IndexBase of descr</samp> is not base-0 or base-1 ).
7986 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_ARCH_MISMATCH</samp></td>
7987 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the device does not support double precision.</td>
7990 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_EXECUTION_FAILED</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the function failed to launch on the GPU.</td>
7994 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_INTERNAL_ERROR</samp></td>
7995 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">an internal operation failed.</td>
7998 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_MATRIX_TYPE_NOT_SUPPORTED</samp></td>
7999 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">
8002 the matrix type is not supported.
8010 <div class="topic concept nested1" id="cusparse-lt-t-gt-bsrxmv"><a name="cusparse-lt-t-gt-bsrxmv" shape="rect">
8011 <!-- --></a><h3 class="title topictitle2"><a href="#cusparse-lt-t-gt-bsrxmv" name="cusparse-lt-t-gt-bsrxmv" shape="rect">7.2. cusparse<t>bsrxmv</a></h3>
<div class="body conbody"><pre xml:space="preserve">cusparseStatus_t
cusparseSbsrxmv(cusparseHandle_t handle, cusparseDirection_t dir,
                cusparseOperation_t trans, int sizeOfMask,
                int mb, int nb, int nnzb,
                const float *alpha, const cusparseMatDescr_t descr,
                const float *bsrVal, const int *bsrMaskPtr,
                const int *bsrRowPtr, const int *bsrEndPtr, const int *bsrColInd,
                int blockDim, const float *x,
                const float *beta, float *y)
cusparseStatus_t
cusparseDbsrxmv(cusparseHandle_t handle, cusparseDirection_t dir,
                cusparseOperation_t trans, int sizeOfMask,
                int mb, int nb, int nnzb,
                const double *alpha, const cusparseMatDescr_t descr,
                const double *bsrVal, const int *bsrMaskPtr,
                const int *bsrRowPtr, const int *bsrEndPtr, const int *bsrColInd,
                int blockDim, const double *x,
                const double *beta, double *y)
cusparseStatus_t
cusparseCbsrxmv(cusparseHandle_t handle, cusparseDirection_t dir,
                cusparseOperation_t trans, int sizeOfMask,
                int mb, int nb, int nnzb,
                const cuComplex *alpha, const cusparseMatDescr_t descr,
                const cuComplex *bsrVal, const int *bsrMaskPtr,
                const int *bsrRowPtr, const int *bsrEndPtr, const int *bsrColInd,
                int blockDim, const cuComplex *x,
                const cuComplex *beta, cuComplex *y)
cusparseStatus_t
cusparseZbsrxmv(cusparseHandle_t handle, cusparseDirection_t dir,
                cusparseOperation_t trans, int sizeOfMask,
                int mb, int nb, int nnzb,
                const cuDoubleComplex *alpha, const cusparseMatDescr_t descr,
                const cuDoubleComplex *bsrVal, const int *bsrMaskPtr,
                const int *bsrRowPtr, const int *bsrEndPtr, const int *bsrColInd,
                int blockDim, const cuDoubleComplex *x,
                const cuDoubleComplex *beta, cuDoubleComplex *y)</pre><p class="p">This function performs a bsrmv and a mask operation</p>
8048 <div class="tablenoborder">
8049 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="void" border="1" rules="all">
8050 <tbody class="tbody">
8052 <td class="entry" align="center" valign="top" rowspan="1" colspan="1">
8053 <math xmlns="http://www.w3.org/1998/Math/MathML">
8054 <mrow class="MJX-TeXAtom-ORD">
8055 <mrow class="MJX-TeXAtom-ORD">
8056 <mtext>y(mask)</mtext>
8060 <mrow class="MJX-TeXAtom-ORD">
8061 <mo maxsize="1.2em" minsize="1.2em">(</mo>
8065 <mrow class="MJX-TeXAtom-ORD">
8066 <mrow class="MJX-TeXAtom-ORD">
8070 <mo stretchy="false">(</mo>
8072 <mo stretchy="false">)</mo>
8074 <mrow class="MJX-TeXAtom-ORD">
8075 <mrow class="MJX-TeXAtom-ORD">
8082 <mrow class="MJX-TeXAtom-ORD">
8083 <mrow class="MJX-TeXAtom-ORD">
8087 <mrow class="MJX-TeXAtom-ORD">
8088 <mo maxsize="1.2em" minsize="1.2em">)</mo>
8090 <mrow class="MJX-TeXAtom-ORD">
8091 <mrow class="MJX-TeXAtom-ORD">
8092 <mtext>(mask)</mtext>
8104 <math xmlns="http://www.w3.org/1998/Math/MathML">
8106 <mtext> is </mtext>
8107 <mo stretchy="false">(</mo>
8119 <mo stretchy="false">)</mo>
8121 <mo stretchy="false">(</mo>
8133 <mo stretchy="false">)</mo>
8136 sparse matrix (that is defined in BSRX storage format by the four arrays <samp class="ph codeph">bsrVal</samp>, <samp class="ph codeph">bsrRowPtr</samp>, <samp class="ph codeph">bsrEndPtr</samp>, and <samp class="ph codeph">bsrColInd</samp>), <samp class="ph codeph">x</samp> and <samp class="ph codeph">y</samp> are vectors,
8139 <math xmlns="http://www.w3.org/1998/Math/MathML">
8141 <mtext> and </mtext>
8147 <math xmlns="http://www.w3.org/1998/Math/MathML">
8148 <mrow class="MJX-TeXAtom-ORD">
8149 <mrow class="MJX-TeXAtom-ORD">
8153 <mo stretchy="false">(</mo>
8155 <mo stretchy="false">)</mo>
8157 <mfenced open="{" close="">
8158 <mtable columnalign="left left" rowspacing=".1em" columnspacing="1em">
8164 <mtext>if trans == CUSPARSE_OPERATION_NON_TRANSPOSE</mtext>
8175 <mtext>if trans == CUSPARSE_OPERATION_TRANSPOSE</mtext>
8186 <mtext>if trans == CUSPARSE_OPERATION_CONJUGATE_TRANSPOSE</mtext>
8192 <p class="p">The mask operation is defined by array <samp class="ph codeph">bsrMaskPtr</samp> which contains updated row indices of
8193 <math xmlns="http://www.w3.org/1998/Math/MathML">
8196 <math xmlns="http://www.w3.org/1998/Math/MathML">
8198 </math> is not specified in <samp class="ph codeph">bsrMaskPtr</samp>, then bsrxmv does not touch row block
8199 <math xmlns="http://www.w3.org/1998/Math/MathML">
8202 <math xmlns="http://www.w3.org/1998/Math/MathML">
8206 <math xmlns="http://www.w3.org/1998/Math/MathML">
8208 <mo stretchy="false">[</mo>
8210 <mo stretchy="false">]</mo>
8214 <p class="p">For example, consider the
8216 <math xmlns="http://www.w3.org/1998/Math/MathML">
8222 <math xmlns="http://www.w3.org/1998/Math/MathML">
8226 <div class="tablenoborder">
8227 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="void" border="1" rules="all">
8228 <tbody class="tbody">
8230 <td class="entry" align="center" valign="top" rowspan="1" colspan="1">
8231 <math xmlns="http://www.w3.org/1998/Math/MathML">
8232 <mtable columnalign="right left right left right left right left right left right left" rowspacing=".5em" columnspacing="0.2777777777777778em 2em 0.2777777777777778em 2em 0.2777777777777778em 2em 0.2777777777777778em 2em 0.2777777777777778em 2em 0.2777777777777778em">
8237 <mfenced open="[" close="]">
8238 <mtable rowspacing="4pt" columnspacing="1em">
8243 <mrow class="MJX-TeXAtom-ORD">
8251 <mrow class="MJX-TeXAtom-ORD">
8264 <mrow class="MJX-TeXAtom-ORD">
8272 <mrow class="MJX-TeXAtom-ORD">
8280 <mrow class="MJX-TeXAtom-ORD">
8297 <p class="p">and its one-based BSR format (three vector form) is</p>
8298 <div class="tablenoborder">
8299 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="void" border="1" rules="all">
8300 <tbody class="tbody">
8302 <td class="entry" align="center" valign="top" rowspan="1" colspan="1">
8303 <math xmlns="http://www.w3.org/1998/Math/MathML">
8304 <mtable columnalign="right center left" rowspacing=".5em" columnspacing="thickmathspace">
8307 <mtext>bsrVal</mtext>
8313 <mfenced open="[" close="]">
8314 <mtable rowspacing="4pt" columnspacing="1em">
8319 <mrow class="MJX-TeXAtom-ORD">
8327 <mrow class="MJX-TeXAtom-ORD">
8335 <mrow class="MJX-TeXAtom-ORD">
8343 <mrow class="MJX-TeXAtom-ORD">
8351 <mrow class="MJX-TeXAtom-ORD">
8363 <mtext>bsrRowPtr</mtext>
8369 <mfenced open="[" close="]">
8370 <mtable rowspacing="4pt" columnspacing="1em">
8374 <mrow class="MJX-TeXAtom-ORD">
8382 <mrow class="MJX-TeXAtom-ORD">
8398 <mtext>bsrColInd</mtext>
8404 <mfenced open="[" close="]">
8405 <mtable rowspacing="4pt" columnspacing="1em">
8409 <mrow class="MJX-TeXAtom-ORD">
8417 <mrow class="MJX-TeXAtom-ORD">
8425 <mrow class="MJX-TeXAtom-ORD">
8433 <mrow class="MJX-TeXAtom-ORD">
8454 <p class="p">Suppose we want to do the following bsrmv operation on a matrix
8456 <math xmlns="http://www.w3.org/1998/Math/MathML">
8459 <mo accent="false">¯</mo>
8462 which is slightly different from
8463 <math xmlns="http://www.w3.org/1998/Math/MathML">
8467 <div class="tablenoborder">
8468 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="void" border="1" rules="all">
8469 <tbody class="tbody">
8471 <td class="entry" align="center" valign="top" rowspan="1" colspan="1">
8472 <math xmlns="http://www.w3.org/1998/Math/MathML">
8473 <mfenced open="[" close="]">
8474 <mtable rowspacing="4pt" columnspacing="1em">
8479 <mrow class="MJX-TeXAtom-ORD">
8489 <mrow class="MJX-TeXAtom-ORD">
8504 <mrow class="MJX-TeXAtom-ORD">
8505 <mo maxsize="2.470em" minsize="2.470em">(</mo>
8507 <mrow class="MJX-TeXAtom-ORD">
8510 <mo stretchy="false">˜</mo>
8514 <mfenced open="[" close="]">
8515 <mtable rowspacing="4pt" columnspacing="1em">
8534 <mrow class="MJX-TeXAtom-ORD">
8545 <mrow class="MJX-TeXAtom-ORD">
8546 <mo maxsize="2.470em" minsize="2.470em">)</mo>
8549 <mfenced open="[" close="]">
8550 <mtable rowspacing="4pt" columnspacing="1em">
8555 <mrow class="MJX-TeXAtom-ORD">
8565 <mrow class="MJX-TeXAtom-ORD">
8575 <mrow class="MJX-TeXAtom-ORD">
8584 <mfenced open="[" close="]">
8585 <mtable rowspacing="4pt" columnspacing="1em">
8590 <mrow class="MJX-TeXAtom-ORD">
8605 <mrow class="MJX-TeXAtom-ORD">
8619 <p class="p">We don’t need to create another BSR format for the new matrix
8621 <math xmlns="http://www.w3.org/1998/Math/MathML">
8624 <mo accent="false">¯</mo>
; we only need to keep <samp class="ph codeph">bsrVal</samp> and <samp class="ph codeph">bsrColInd</samp> unchanged, modify <samp class="ph codeph">bsrRowPtr</samp>, and add an additional array <samp class="ph codeph">bsrEndPtr</samp> that marks the end (the last nonzero block plus one) of each block row of
8629 <math xmlns="http://www.w3.org/1998/Math/MathML">
8632 <mo accent="false">¯</mo>
8637 <p class="p">For example, the following <samp class="ph codeph">bsrRowPtr</samp> and <samp class="ph codeph">bsrEndPtr</samp> can represent matrix
8639 <math xmlns="http://www.w3.org/1998/Math/MathML">
8642 <mo accent="false">¯</mo>
8647 <div class="tablenoborder">
8648 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="void" border="1" rules="all">
8649 <tbody class="tbody">
8651 <td class="entry" align="center" valign="top" rowspan="1" colspan="1">
8652 <math xmlns="http://www.w3.org/1998/Math/MathML">
8653 <mtable columnalign="right center left" rowspacing=".5em" columnspacing="thickmathspace">
8656 <mtext>bsrRowPtr</mtext>
8662 <mfenced open="[" close="]">
8663 <mtable rowspacing="4pt" columnspacing="1em">
8667 <mrow class="MJX-TeXAtom-ORD">
8683 <mtext>bsrEndPtr</mtext>
8689 <mfenced open="[" close="]">
8690 <mtable rowspacing="4pt" columnspacing="1em">
8694 <mrow class="MJX-TeXAtom-ORD">
<p class="p">Further, we can use the mask operator (specified by array <samp class="ph codeph">bsrMaskPtr</samp>) to update particular row indices of
8716 <math xmlns="http://www.w3.org/1998/Math/MathML">
8718 </math> only because
8719 <math xmlns="http://www.w3.org/1998/Math/MathML">
8724 </math> is never changed. In this case, <samp class="ph codeph">bsrMaskPtr</samp><math xmlns="http://www.w3.org/1998/Math/MathML">
8728 <p class="p">The mask operator is equivalent to the following operation (? stands for don’t care) </p>
8729 <div class="tablenoborder">
8730 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="void" border="1" rules="all">
8731 <tbody class="tbody">
8733 <td class="entry" align="center" valign="top" rowspan="1" colspan="1">
8734 <math xmlns="http://www.w3.org/1998/Math/MathML">
8735 <mfenced open="[" close="]">
8736 <mtable rowspacing="4pt" columnspacing="1em">
8746 <mrow class="MJX-TeXAtom-ORD">
8761 <mfenced open="[" close="]">
8762 <mtable rowspacing="4pt" columnspacing="1em">
8781 <mrow class="MJX-TeXAtom-ORD">
8793 <mfenced open="[" close="]">
8794 <mtable rowspacing="4pt" columnspacing="1em">
8799 <mrow class="MJX-TeXAtom-ORD">
8809 <mrow class="MJX-TeXAtom-ORD">
8819 <mrow class="MJX-TeXAtom-ORD">
8833 <mfenced open="[" close="]">
8834 <mtable rowspacing="4pt" columnspacing="1em">
8844 <mrow class="MJX-TeXAtom-ORD">
<p class="p">In other words, <samp class="ph codeph">bsrRowPtr[0]</samp> and <samp class="ph codeph">bsrEndPtr[0]</samp> are don’t-care values.
8860 <div class="tablenoborder">
8861 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="void" border="1" rules="all">
8862 <tbody class="tbody">
8864 <td class="entry" align="center" valign="top" rowspan="1" colspan="1">
8865 <math xmlns="http://www.w3.org/1998/Math/MathML">
8866 <mtable columnalign="right center left" rowspacing=".5em" columnspacing="thickmathspace">
8869 <mtext>bsrRowPtr</mtext>
8875 <mfenced open="[" close="]">
8876 <mtable rowspacing="4pt" columnspacing="1em">
8880 <mrow class="MJX-TeXAtom-ORD">
8896 <mtext>bsrEndPtr</mtext>
8902 <mfenced open="[" close="]">
8903 <mtable rowspacing="4pt" columnspacing="1em">
8907 <mrow class="MJX-TeXAtom-ORD">
8928 <p class="p">Several comments on bsrxmv:</p>
8929 <p class="p">Only <samp class="ph codeph">CUSPARSE_OPERATION_NON_TRANSPOSE</samp> and <samp class="ph codeph">CUSPARSE_MATRIX_TYPE_GENERAL</samp> are supported.
<p class="p"><samp class="ph codeph">bsrMaskPtr</samp>, <samp class="ph codeph">bsrRowPtr</samp>, <samp class="ph codeph">bsrEndPtr</samp> and <samp class="ph codeph">bsrColInd</samp> must be consistent with the index base, either one-based or zero-based. The above example is one-based.
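<p class="p">The combined effect of <samp class="ph codeph">bsrMaskPtr</samp>, <samp class="ph codeph">bsrRowPtr</samp> and <samp class="ph codeph">bsrEndPtr</samp> can be sketched with a host-side reference loop. This is an illustration under stated assumptions, not the library's implementation: the name <samp class="ph codeph">bsrxmv_ref</samp> is hypothetical, <samp class="ph codeph">rowMajor</samp> stands in for <samp class="ph codeph">dir</samp>, and <samp class="ph codeph">base</samp> stands in for the index base of <samp class="ph codeph">descr</samp>. Only block rows listed in the mask are visited, and each visited row uses the blocks in the range from <samp class="ph codeph">bsrRowPtr</samp> up to, but not including, <samp class="ph codeph">bsrEndPtr</samp>.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical host reference for the masked update
 *   y(mask) = (alpha*A*x + beta*y)(mask)
 * with A in BSRX format. rowMajor != 0 models CUSPARSE_DIRECTION_ROW,
 * 0 models CUSPARSE_DIRECTION_COLUMN; base is 0 or 1 and applies to
 * bsrMaskPtr, bsrRowPtr, bsrEndPtr and bsrColInd alike. */
static void bsrxmv_ref(int rowMajor, int base, int sizeOfMask, int mb, int nb,
                       float alpha, const float *bsrVal, const int *bsrMaskPtr,
                       const int *bsrRowPtr, const int *bsrEndPtr,
                       const int *bsrColInd, int blockDim,
                       const float *x, float beta, float *y)
{
    assert(blockDim > 0 && (base == 0 || base == 1));
    for (int m = 0; m < sizeOfMask; ++m) {
        int ib = bsrMaskPtr[m] - base;             /* block row to update */
        assert(ib >= 0 && ib < mb);
        for (int r = 0; r < blockDim; ++r) {
            float sum = 0.0f;
            /* Entries of bsrRowPtr/bsrEndPtr for unmasked rows are never read. */
            for (int k = bsrRowPtr[ib] - base; k < bsrEndPtr[ib] - base; ++k) {
                int jb = bsrColInd[k] - base;      /* block column index */
                assert(jb >= 0 && jb < nb);
                const float *blk = bsrVal + (size_t)k * blockDim * blockDim;
                for (int c = 0; c < blockDim; ++c) {
                    float a = rowMajor ? blk[r * blockDim + c]
                                       : blk[c * blockDim + r];
                    sum += a * x[jb * blockDim + c];
                }
            }
            int i = ib * blockDim + r;
            y[i] = alpha * sum + (beta == 0.0f ? 0.0f : beta * y[i]);
        }
    }
}
```

<p class="p">With a one-based mask that lists only block row 2, the entries of <samp class="ph codeph">y</samp> belonging to block row 1 keep their previous values, which matches the don’t-care entries in the example above.</p>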
8933 <div class="tablenoborder">
8934 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Input</strong></span><tbody class="tbody">
8936 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">handle</samp></td>
8937 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">handle to the CUSPARSE library context.</td>
8940 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">dir</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">storage format of blocks, either <samp class="ph codeph">CUSPARSE_DIRECTION_ROW</samp> or <samp class="ph codeph">CUSPARSE_DIRECTION_COLUMN</samp>.
8945 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">trans</samp></td>
8946 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the operation
8949 <math xmlns="http://www.w3.org/1998/Math/MathML">
8950 <mrow class="MJX-TeXAtom-ORD">
8951 <mrow class="MJX-TeXAtom-ORD">
8955 <mo stretchy="false">(</mo>
8957 <mo stretchy="false">)</mo>
8960 . Only <samp class="ph codeph">CUSPARSE_OPERATION_NON_TRANSPOSE</samp> is supported.
8964 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">sizeOfMask</samp></td>
8965 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">number of updated rows of
8966 <math xmlns="http://www.w3.org/1998/Math/MathML">
8972 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">mb</samp></td>
8973 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">number of block rows of matrix
8974 <math xmlns="http://www.w3.org/1998/Math/MathML">
8980 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">nb</samp></td>
8981 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">number of block columns of matrix
8982 <math xmlns="http://www.w3.org/1998/Math/MathML">
8988 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">nnzb</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">number of non-zero blocks of matrix
8990 <math xmlns="http://www.w3.org/1998/Math/MathML">
8996 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">alpha</samp></td>
8997 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><type> scalar used for multiplication.</td>
9000 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">descr</samp></td>
9001 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the descriptor of matrix
9002 <math xmlns="http://www.w3.org/1998/Math/MathML">
9004 </math>. The supported matrix type is <samp class="ph codeph">CUSPARSE_MATRIX_TYPE_GENERAL</samp>. Also, the supported index bases are <samp class="ph codeph">CUSPARSE_INDEX_BASE_ZERO</samp> and <samp class="ph codeph">CUSPARSE_INDEX_BASE_ONE</samp>.
9008 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">bsrVal</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><type> array of <samp class="ph codeph">nnzb</samp> non-zero blocks of matrix
9010 <math xmlns="http://www.w3.org/1998/Math/MathML">
9016 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">bsrRowPtr</samp></td>
9017 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">integer array of <samp class="ph codeph">mb</samp>
elements that contains the start of every block row.
9022 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">bsrEndPtr</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">integer array of <samp class="ph codeph">mb</samp> elements that contains the end of every block row plus one.
9027 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">bsrColInd</samp></td>
9028 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">integer array of <samp class="ph codeph">nnzb</samp> column indices of the non-zero blocks of matrix
9029 <math xmlns="http://www.w3.org/1998/Math/MathML">
9035 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">blockDim</samp></td>
9036 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">block dimension of sparse matrix
9037 <math xmlns="http://www.w3.org/1998/Math/MathML">
9039 </math>, larger than zero.
9043 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">x</samp></td>
9044 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><type> vector of
9047 <math xmlns="http://www.w3.org/1998/Math/MathML">
9065 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">beta</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">&lt;type&gt; scalar used for multiplication. If <samp class="ph codeph">beta</samp> is zero, <samp class="ph codeph">y</samp> does not have to be a valid input.
9070 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">y</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">&lt;type&gt; vector of
9074 <math xmlns="http://www.w3.org/1998/Math/MathML">
9094 <div class="tablenoborder">
9095 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Status Returned</strong></span><tbody class="tbody">
9097 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_SUCCESS</samp></td>
9098 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the operation completed successfully.</td>
9101 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_NOT_INITIALIZED</samp></td>
9102 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the library was not initialized.</td>
9105 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_INVALID_VALUE</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">invalid parameters were passed (<samp class="ph codeph">m,n,nnz&lt;0</samp>, <samp class="ph codeph">trans != CUSPARSE_OPERATION_NON_TRANSPOSE</samp>,
9109 <math xmlns="http://www.w3.org/1998/Math/MathML">
, <samp class="ph codeph">dir</samp> is not row-major or column-major, or the <samp class="ph codeph">IndexBase</samp> of <samp class="ph codeph">descr</samp> is not base-0 or base-1).
9126 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_ARCH_MISMATCH</samp></td>
9127 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the device does not support double precision.</td>
9130 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_EXECUTION_FAILED</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the function failed to launch on the GPU.</td>
9134 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_INTERNAL_ERROR</samp></td>
9135 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">an internal operation failed.</td>
9138 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_MATRIX_TYPE_NOT_SUPPORTED</samp></td>
9139 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">
9142 the matrix type is not supported.
9150 <div class="topic concept nested1" id="cusparse-lt-t-gt-csrmv"><a name="cusparse-lt-t-gt-csrmv" shape="rect">
<!-- --></a><h3 class="title topictitle2"><a href="#cusparse-lt-t-gt-csrmv" name="cusparse-lt-t-gt-csrmv" shape="rect">7.3. cusparse&lt;t&gt;csrmv</a></h3>
<div class="body conbody"><pre xml:space="preserve">cusparseStatus_t
cusparseScsrmv(cusparseHandle_t handle, cusparseOperation_t transA,
               int m, int n, int nnz, const float *alpha,
               const cusparseMatDescr_t descrA,
               const float *csrValA,
               const int *csrRowPtrA, const int *csrColIndA,
               const float *x, const float *beta,
               float *y)
cusparseStatus_t
cusparseDcsrmv(cusparseHandle_t handle, cusparseOperation_t transA,
               int m, int n, int nnz, const double *alpha,
               const cusparseMatDescr_t descrA,
               const double *csrValA,
               const int *csrRowPtrA, const int *csrColIndA,
               const double *x, const double *beta,
               double *y)
cusparseStatus_t
cusparseCcsrmv(cusparseHandle_t handle, cusparseOperation_t transA,
               int m, int n, int nnz, const cuComplex *alpha,
               const cusparseMatDescr_t descrA,
               const cuComplex *csrValA,
               const int *csrRowPtrA, const int *csrColIndA,
               const cuComplex *x, const cuComplex *beta,
               cuComplex *y)
cusparseStatus_t
cusparseZcsrmv(cusparseHandle_t handle, cusparseOperation_t transA,
               int m, int n, int nnz, const cuDoubleComplex *alpha,
               const cusparseMatDescr_t descrA,
               const cuDoubleComplex *csrValA,
               const int *csrRowPtrA, const int *csrColIndA,
               const cuDoubleComplex *x, const cuDoubleComplex *beta,
               cuDoubleComplex *y)</pre><p class="p">This function performs the matrix-vector operation</p>
9184 <div class="tablenoborder">
9185 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="void" border="1" rules="all">
9186 <tbody class="tbody">
9188 <td class="entry" align="center" valign="top" rowspan="1" colspan="1">
9189 <math xmlns="http://www.w3.org/1998/Math/MathML">
9190 <mrow class="MJX-TeXAtom-ORD">
9191 <mrow class="MJX-TeXAtom-ORD">
9198 <mrow class="MJX-TeXAtom-ORD">
9199 <mrow class="MJX-TeXAtom-ORD">
9203 <mo stretchy="false">(</mo>
9205 <mo stretchy="false">)</mo>
9207 <mrow class="MJX-TeXAtom-ORD">
9208 <mrow class="MJX-TeXAtom-ORD">
9215 <mrow class="MJX-TeXAtom-ORD">
9216 <mrow class="MJX-TeXAtom-ORD">
9227 <math xmlns="http://www.w3.org/1998/Math/MathML">
9230 <samp class="ph codeph">m×n</samp> sparse matrix (that is defined in CSR storage format by the three arrays <samp class="ph codeph">csrValA</samp>, <samp class="ph codeph">csrRowPtrA</samp>, and <samp class="ph codeph">csrColIndA</samp>), <samp class="ph codeph">x</samp> and <samp class="ph codeph">y</samp> are vectors,
9233 <math xmlns="http://www.w3.org/1998/Math/MathML">
9235 <mtext> and </mtext>
9241 <math xmlns="http://www.w3.org/1998/Math/MathML">
9242 <mrow class="MJX-TeXAtom-ORD">
9243 <mrow class="MJX-TeXAtom-ORD">
9247 <mo stretchy="false">(</mo>
9249 <mo stretchy="false">)</mo>
9251 <mfenced open="{" close="">
9252 <mtable columnalign="left left" rowspacing=".1em" columnspacing="1em">
9258 <mtext>if trans == CUSPARSE_OPERATION_NON_TRANSPOSE</mtext>
9269 <mtext>if trans == CUSPARSE_OPERATION_TRANSPOSE</mtext>
9280 <mtext>if trans == CUSPARSE_OPERATION_CONJUGATE_TRANSPOSE</mtext>
<p class="p">When using the (conjugate) transpose of a general matrix, or a Hermitian/symmetric matrix, this routine may produce slightly
different results across runs with the same input parameters. For these matrix types it uses atomic
operations to compute the final result, so many threads may add floating-point numbers to the same memory
location in no particular order. Because floating-point addition is not associative, the rounding can differ from run to run.</p>
9291 <p class="p">If exactly the same output is required for any input when multiplying by the transpose of a general matrix, the following
9292 procedure can be used:
9294 <p class="p">1. Convert the matrix from CSR to CSC format using one of the <samp class="ph codeph">csr2csc()</samp> functions. Notice that by interchanging the rows and columns of the result you are implicitly transposing the matrix.
9296 <p class="p">2. Call the <samp class="ph codeph">csrmv()</samp> function with the <samp class="ph codeph">cusparseOperation_t</samp> parameter set to <samp class="ph codeph">CUSPARSE_OPERATION_NON_TRANSPOSE</samp> and with the interchanged rows and columns of the matrix stored in CSC format. This (implicitly) multiplies the vector by
9297 the transpose of the matrix in the original CSR format.
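<p class="p">The two steps above can be sketched on the host as follows. This is a minimal illustration, not the library's implementation: <samp class="ph codeph">csr_to_csc_host</samp> and <samp class="ph codeph">csr_mv_notrans_host</samp> are hypothetical helper names standing in for <samp class="ph codeph">csr2csc()</samp> and a non-transpose <samp class="ph codeph">csrmv()</samp>, and the sketch assumes zero-based indices, real double-precision values, and alpha = 1, beta = 0.</p>

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical host-side stand-in for csr2csc(), zero-based indices.
   Converts an m x n matrix from CSR to CSC. Read as CSR arrays, the
   output (cscColPtr, cscRowInd, cscVal) describes the n x m transpose. */
static void csr_to_csc_host(int m, int n, int nnz,
                            const double *csrVal, const int *csrRowPtr,
                            const int *csrColInd,
                            double *cscVal, int *cscRowInd, int *cscColPtr)
{
    int *next = malloc((size_t)(n > 0 ? n : 1) * sizeof *next);
    assert(next != NULL);

    /* Count the entries of each column, then prefix-sum into column starts. */
    for (int j = 0; j <= n; ++j) cscColPtr[j] = 0;
    for (int k = 0; k < nnz; ++k) cscColPtr[csrColInd[k] + 1]++;
    for (int j = 0; j < n; ++j) cscColPtr[j + 1] += cscColPtr[j];

    /* Scatter every CSR entry into the next free slot of its column. */
    for (int j = 0; j < n; ++j) next[j] = cscColPtr[j];
    for (int i = 0; i < m; ++i) {
        for (int k = csrRowPtr[i]; k < csrRowPtr[i + 1]; ++k) {
            int dst = next[csrColInd[k]]++;
            cscVal[dst] = csrVal[k];
            cscRowInd[dst] = i;
        }
    }
    free(next);
}

/* Hypothetical non-transpose product y = A * x for a CSR matrix with
   `rows` rows. Each y[i] is accumulated by one sequential loop, so the
   summation order, and therefore the rounding, is fixed. */
static void csr_mv_notrans_host(int rows, const double *val,
                                const int *rowPtr, const int *colInd,
                                const double *x, double *y)
{
    for (int i = 0; i < rows; ++i) {
        double sum = 0.0;
        for (int k = rowPtr[i]; k < rowPtr[i + 1]; ++k)
            sum += val[k] * x[colInd[k]];
        y[i] = sum;
    }
}
```

<p class="p">Passing the CSC arrays to <samp class="ph codeph">csr_mv_notrans_host</samp> in the CSR argument positions, with <samp class="ph codeph">rows</samp> set to <samp class="ph codeph">n</samp>, multiplies by the transpose of the original matrix without any unordered accumulation.</p>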
<p class="p">This function requires no extra storage for general matrices when operation <samp class="ph codeph">CUSPARSE_OPERATION_NON_TRANSPOSE</samp> is selected. It requires some extra storage for Hermitian/symmetric matrices, and for general matrices when an operation
other than <samp class="ph codeph">CUSPARSE_OPERATION_NON_TRANSPOSE</samp> is selected. It is executed asynchronously with respect to the host and may return control to the application on the host
before the result is ready.</p>
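<p class="p">As a reading aid for the parameter tables that follow, the operation y = alpha * op(A) * x + beta * y can be written as a short host-side reference. This is a sketch under stated assumptions: <samp class="ph codeph">csrmv_ref</samp> is a hypothetical name, it handles only real double-precision values and the general matrix type, and it represents the operation as a plain flag rather than a <samp class="ph codeph">cusparseOperation_t</samp>.</p>

```c
#include <assert.h>

/* Hypothetical host reference for y = alpha * op(A) * x + beta * y.
   transpose == 0: op(A) = A,   x has n elements, y has m elements.
   transpose != 0: op(A) = A^T, x has m elements, y has n elements.
   base is 0 or 1 and must match the index base of the CSR arrays. */
static void csrmv_ref(int transpose, int m, int n, double alpha, int base,
                      const double *csrVal, const int *csrRowPtr,
                      const int *csrColInd,
                      const double *x, double beta, double *y)
{
    int ylen = transpose ? n : m;
    assert(base == 0 || base == 1);

    /* When beta is zero, y is overwritten without being read, so it
       does not have to hold valid input. */
    for (int i = 0; i < ylen; ++i)
        y[i] = (beta == 0.0) ? 0.0 : beta * y[i];

    for (int i = 0; i < m; ++i) {
        for (int k = csrRowPtr[i] - base; k < csrRowPtr[i + 1] - base; ++k) {
            int j = csrColInd[k] - base;
            if (transpose)
                y[j] += alpha * csrVal[k] * x[i];  /* scatter along row i */
            else
                y[i] += alpha * csrVal[k] * x[j];  /* gather into row i */
        }
    }
}
```

<p class="p">In the transpose branch every row scatters into several entries of <samp class="ph codeph">y</samp>. A parallel version of that branch has to combine contributions from different rows, which is where the atomic additions described above come from.</p>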
9303 <div class="tablenoborder">
9304 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Input</strong></span><tbody class="tbody">
9306 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">handle</samp></td>
9307 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">handle to the CUSPARSE library context.</td>
9310 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">trans</samp></td>
9311 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the operation
9314 <math xmlns="http://www.w3.org/1998/Math/MathML">
9315 <mrow class="MJX-TeXAtom-ORD">
9316 <mrow class="MJX-TeXAtom-ORD">
9320 <mo stretchy="false">(</mo>
9322 <mo stretchy="false">)</mo>
9327 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">m</samp></td>
9328 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">number of rows of matrix
9329 <math xmlns="http://www.w3.org/1998/Math/MathML">
9335 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">n</samp></td>
9336 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">number of columns of matrix
9337 <math xmlns="http://www.w3.org/1998/Math/MathML">
9343 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">nnz</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">number of non-zero elements of matrix
9345 <math xmlns="http://www.w3.org/1998/Math/MathML">
9351 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">alpha</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">&lt;type&gt; scalar used for multiplication.</td>
9355 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">descrA</samp></td>
9356 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the descriptor of matrix
9357 <math xmlns="http://www.w3.org/1998/Math/MathML">
</math>. The supported matrix types are <samp class="ph codeph">CUSPARSE_MATRIX_TYPE_GENERAL</samp>, <samp class="ph codeph">CUSPARSE_MATRIX_TYPE_SYMMETRIC</samp>, and <samp class="ph codeph">CUSPARSE_MATRIX_TYPE_HERMITIAN</samp>. Also, the supported index bases are <samp class="ph codeph">CUSPARSE_INDEX_BASE_ZERO</samp> and <samp class="ph codeph">CUSPARSE_INDEX_BASE_ONE</samp>.
9363 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">csrValA</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">&lt;type&gt; array of
9365 <samp class="ph codeph">nnz</samp><math xmlns="http://www.w3.org/1998/Math/MathML">
9366 <mo stretchy="false">(</mo>
9368 </math><samp class="ph codeph">csrRowPtrA(m)</samp><math xmlns="http://www.w3.org/1998/Math/MathML">
9370 </math><samp class="ph codeph">csrRowPtrA(0)</samp><math xmlns="http://www.w3.org/1998/Math/MathML">
9371 <mo stretchy="false">)</mo>
9373 non-zero elements of matrix
9374 <math xmlns="http://www.w3.org/1998/Math/MathML">
9380 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">csrRowPtrA</samp></td>
9381 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">integer array of <samp class="ph codeph">m+1</samp> elements that contains the start of every row and the end of the last row plus one.
9385 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">csrColIndA</samp></td>
9386 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">integer array of
9387 <samp class="ph codeph">nnz</samp><math xmlns="http://www.w3.org/1998/Math/MathML">
9388 <mo stretchy="false">(</mo>
9390 </math><samp class="ph codeph">csrRowPtrA(m)</samp><math xmlns="http://www.w3.org/1998/Math/MathML">
9392 </math><samp class="ph codeph">csrRowPtrA(0)</samp><math xmlns="http://www.w3.org/1998/Math/MathML">
9393 <mo stretchy="false">)</mo>
9395 column indices of the non-zero elements of matrix
9396 <math xmlns="http://www.w3.org/1998/Math/MathML">
9402 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">x</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">&lt;type&gt; vector of <samp class="ph codeph">n</samp> elements if
9406 <math xmlns="http://www.w3.org/1998/Math/MathML">
9407 <mrow class="MJX-TeXAtom-ORD">
9408 <mrow class="MJX-TeXAtom-ORD">
9412 <mo stretchy="false">(</mo>
9414 <mo stretchy="false">)</mo>
9419 , and <samp class="ph codeph">m</samp> elements if
9422 <math xmlns="http://www.w3.org/1998/Math/MathML">
9423 <mrow class="MJX-TeXAtom-ORD">
9424 <mrow class="MJX-TeXAtom-ORD">
9428 <mo stretchy="false">(</mo>
9430 <mo stretchy="false">)</mo>
9441 <math xmlns="http://www.w3.org/1998/Math/MathML">
9442 <mrow class="MJX-TeXAtom-ORD">
9443 <mrow class="MJX-TeXAtom-ORD">
9447 <mo stretchy="false">(</mo>
9449 <mo stretchy="false">)</mo>
9459 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">beta</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">&lt;type&gt; scalar used for multiplication. If <samp class="ph codeph">beta</samp> is zero, <samp class="ph codeph">y</samp> does not have to be a valid input.
9464 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">y</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">&lt;type&gt; vector of <samp class="ph codeph">m</samp> elements if
9468 <math xmlns="http://www.w3.org/1998/Math/MathML">
9469 <mrow class="MJX-TeXAtom-ORD">
9470 <mrow class="MJX-TeXAtom-ORD">
9474 <mo stretchy="false">(</mo>
9476 <mo stretchy="false">)</mo>
9481 , and <samp class="ph codeph">n</samp> elements if
9484 <math xmlns="http://www.w3.org/1998/Math/MathML">
9485 <mrow class="MJX-TeXAtom-ORD">
9486 <mrow class="MJX-TeXAtom-ORD">
9490 <mo stretchy="false">(</mo>
9492 <mo stretchy="false">)</mo>
9503 <math xmlns="http://www.w3.org/1998/Math/MathML">
9504 <mrow class="MJX-TeXAtom-ORD">
9505 <mrow class="MJX-TeXAtom-ORD">
9509 <mo stretchy="false">(</mo>
9511 <mo stretchy="false">)</mo>
9523 <div class="tablenoborder">
9524 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Output</strong></span><tbody class="tbody">
9526 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">y</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">&lt;type&gt; updated vector.</td>
9532 <div class="tablenoborder">
9533 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Status Returned</strong></span><tbody class="tbody">
9535 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_SUCCESS</samp></td>
9536 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the operation completed successfully.</td>
9539 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_NOT_INITIALIZED</samp></td>
9540 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the library was not initialized.</td>
9543 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_ALLOC_FAILED</samp></td>
9544 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the resources could not be allocated.</td>
9547 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_INVALID_VALUE</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">invalid parameters were passed (<samp class="ph codeph">m,n,nnz&lt;0</samp>).
9552 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_ARCH_MISMATCH</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the device does not support double precision (compute capability (c.c.) &gt;= 1.3), symmetric/Hermitian matrices (c.c. &gt;= 1.2),
or the transpose operation (c.c. &gt;= 1.1).
9558 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_INTERNAL_ERROR</samp></td>
9559 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">an internal operation failed.</td>
9562 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_MATRIX_TYPE_NOT_SUPPORTED</samp></td>
9563 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">
9566 the matrix type is not supported.
9574 <div class="topic concept nested1" id="cusparse-lt-t-gt-csrsvanalysis"><a name="cusparse-lt-t-gt-csrsvanalysis" shape="rect">
<!-- --></a><h3 class="title topictitle2"><a href="#cusparse-lt-t-gt-csrsvanalysis" name="cusparse-lt-t-gt-csrsvanalysis" shape="rect">7.4. cusparse&lt;t&gt;csrsv_analysis</a></h3>
<div class="body conbody"><pre xml:space="preserve">cusparseStatus_t
cusparseScsrsv_analysis(cusparseHandle_t handle,
                        cusparseOperation_t transA,
                        int m, int nnz, const cusparseMatDescr_t descrA,
                        const float *csrValA,
                        const int *csrRowPtrA, const int *csrColIndA,
                        cusparseSolveAnalysisInfo_t info)
cusparseStatus_t
cusparseDcsrsv_analysis(cusparseHandle_t handle,
                        cusparseOperation_t transA,
                        int m, int nnz, const cusparseMatDescr_t descrA,
                        const double *csrValA,
                        const int *csrRowPtrA, const int *csrColIndA,
                        cusparseSolveAnalysisInfo_t info)
cusparseStatus_t
cusparseCcsrsv_analysis(cusparseHandle_t handle,
                        cusparseOperation_t transA,
                        int m, int nnz, const cusparseMatDescr_t descrA,
                        const cuComplex *csrValA,
                        const int *csrRowPtrA, const int *csrColIndA,
                        cusparseSolveAnalysisInfo_t info)
cusparseStatus_t
cusparseZcsrsv_analysis(cusparseHandle_t handle,
                        cusparseOperation_t transA,
                        int m, int nnz, const cusparseMatDescr_t descrA,
                        const cuDoubleComplex *csrValA,
                        const int *csrRowPtrA, const int *csrColIndA,
                        cusparseSolveAnalysisInfo_t info)</pre><p class="p">This function performs the analysis phase of the solution of a sparse triangular linear system</p>
9604 <div class="tablenoborder">
9605 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="void" border="1" rules="all">
9606 <tbody class="tbody">
9608 <td class="entry" align="center" valign="top" rowspan="1" colspan="1">
9609 <math xmlns="http://www.w3.org/1998/Math/MathML">
9610 <mrow class="MJX-TeXAtom-ORD">
9611 <mrow class="MJX-TeXAtom-ORD">
9615 <mo stretchy="false">(</mo>
9617 <mo stretchy="false">)</mo>
9619 <mrow class="MJX-TeXAtom-ORD">
9620 <mrow class="MJX-TeXAtom-ORD">
9627 <mrow class="MJX-TeXAtom-ORD">
9628 <mrow class="MJX-TeXAtom-ORD">
9639 <math xmlns="http://www.w3.org/1998/Math/MathML">
</math> is an <samp class="ph codeph">m×m</samp> sparse matrix (defined in CSR storage format by the three arrays <samp class="ph codeph">csrValA</samp>, <samp class="ph codeph">csrRowPtrA</samp>, and <samp class="ph codeph">csrColIndA</samp>), <samp class="ph codeph">x</samp> and <samp class="ph codeph">y</samp> are the right-hand-side and the solution vectors,
9642 <math xmlns="http://www.w3.org/1998/Math/MathML">
9644 </math> is a scalar, and
9646 <math xmlns="http://www.w3.org/1998/Math/MathML">
9647 <mrow class="MJX-TeXAtom-ORD">
9648 <mrow class="MJX-TeXAtom-ORD">
9652 <mo stretchy="false">(</mo>
9654 <mo stretchy="false">)</mo>
9656 <mfenced open="{" close="">
9657 <mtable columnalign="left left" rowspacing=".1em" columnspacing="1em">
9663 <mtext>if trans == CUSPARSE_OPERATION_NON_TRANSPOSE</mtext>
9674 <mtext>if trans == CUSPARSE_OPERATION_TRANSPOSE</mtext>
9685 <mtext>if trans == CUSPARSE_OPERATION_CONJUGATE_TRANSPOSE</mtext>
9691 <p class="p">It is expected that this function will be executed only once for a given matrix and a particular operation type.</p>
<p class="p">This function requires a significant amount of extra storage that is proportional to the matrix size. It is executed asynchronously
with respect to the host and may return control to the application on the host before the result is ready.</p>
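<p class="p">This section does not say what the analysis phase computes. As an illustration only, one technique commonly used by parallel sparse triangular solvers is level scheduling: rows are grouped into levels so that every row depends only on rows in earlier levels. The sketch below is an assumption about the kind of information such a phase might collect, not a description of cuSPARSE internals. <samp class="ph codeph">lower_tri_levels</samp> is a hypothetical name, and the code assumes a lower triangular matrix with zero-based indices.</p>

```c
#include <assert.h>

/* Hypothetical level-scheduling analysis for a lower triangular CSR
   matrix with zero-based indices. level[i] receives the level of row i:
   0 if the row has no off-diagonal entries, otherwise one more than the
   highest level among the rows it depends on. Returns the number of
   levels (0 for an empty matrix). */
static int lower_tri_levels(int m, const int *csrRowPtr,
                            const int *csrColInd, int *level)
{
    int nlevels = 0;
    for (int i = 0; i < m; ++i) {
        int lv = 0;
        for (int k = csrRowPtr[i]; k < csrRowPtr[i + 1]; ++k) {
            int j = csrColInd[k];
            assert(j >= 0 && j <= i);  /* only the lower triangle is stored */
            if (j < i && level[j] + 1 > lv)
                lv = level[j] + 1;     /* row i must wait for row j */
        }
        level[i] = lv;
        if (lv + 1 > nlevels)
            nlevels = lv + 1;
    }
    return nlevels;
}
```

<p class="p">The result depends only on the sparsity pattern, which is consistent with running the analysis once for a given matrix and operation type and reusing its output for every subsequent solve.</p>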
9695 <div class="tablenoborder">
9696 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Input</strong></span><tbody class="tbody">
9698 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">handle</samp></td>
9699 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">handle to the CUSPARSE library context.</td>
9702 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">trans</samp></td>
9703 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the operation
9706 <math xmlns="http://www.w3.org/1998/Math/MathML">
9707 <mrow class="MJX-TeXAtom-ORD">
9708 <mrow class="MJX-TeXAtom-ORD">
9712 <mo stretchy="false">(</mo>
9714 <mo stretchy="false">)</mo>
9719 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">m</samp></td>
9720 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">number of rows of matrix
9721 <math xmlns="http://www.w3.org/1998/Math/MathML">
9727 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">nnz</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">number of non-zero elements of matrix
9729 <math xmlns="http://www.w3.org/1998/Math/MathML">
9735 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">descrA</samp></td>
9736 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the descriptor of matrix
9737 <math xmlns="http://www.w3.org/1998/Math/MathML">
9739 </math>. The supported matrix types are <samp class="ph codeph">CUSPARSE_MATRIX_TYPE_TRIANGULAR</samp> and <samp class="ph codeph">CUSPARSE_MATRIX_TYPE_GENERAL</samp>, while the supported diagonal types are <samp class="ph codeph">CUSPARSE_DIAG_TYPE_UNIT</samp> and <samp class="ph codeph">CUSPARSE_DIAG_TYPE_NON_UNIT</samp>.
9743 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">csrValA</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">&lt;type&gt; array of
9745 <samp class="ph codeph">nnz</samp><math xmlns="http://www.w3.org/1998/Math/MathML">
9746 <mo stretchy="false">(</mo>
9748 </math><samp class="ph codeph">csrRowPtrA(m)</samp><math xmlns="http://www.w3.org/1998/Math/MathML">
9750 </math><samp class="ph codeph">csrRowPtrA(0)</samp><math xmlns="http://www.w3.org/1998/Math/MathML">
9751 <mo stretchy="false">)</mo>
9753 non-zero elements of matrix
9754 <math xmlns="http://www.w3.org/1998/Math/MathML">
9760 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">csrRowPtrA</samp></td>
9761 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">integer array of <samp class="ph codeph">m</samp><math xmlns="http://www.w3.org/1998/Math/MathML">
9766 elements that contains the start of every row and the end of the last row plus one.
9770 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">csrColIndA</samp></td>
9771 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">integer array of
9772 <samp class="ph codeph">nnz</samp><math xmlns="http://www.w3.org/1998/Math/MathML">
9773 <mo stretchy="false">(</mo>
9775 </math><samp class="ph codeph">csrRowPtrA(m)</samp><math xmlns="http://www.w3.org/1998/Math/MathML">
9777 </math><samp class="ph codeph">csrRowPtrA(0)</samp><math xmlns="http://www.w3.org/1998/Math/MathML">
9778 <mo stretchy="false">)</mo>
9780 column indices of the non-zero elements of matrix
9781 <math xmlns="http://www.w3.org/1998/Math/MathML">
9787 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">info</samp></td>
9788 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">structure initialized using <samp class="ph codeph">cusparseCreateSolveAnalysisInfo</samp>.
9794 <div class="tablenoborder">
9795 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Output</strong></span><tbody class="tbody">
9797 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">info</samp></td>
9798 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">structure filled with information collected during the analysis phase (that should be passed to the solve phase unchanged).</td>
9803 <div class="tablenoborder">
9804 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Status Returned</strong></span><tbody class="tbody">
9806 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_SUCCESS</samp></td>
9807 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the operation completed successfully.</td>
9810 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_NOT_INITIALIZED</samp></td>
9811 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the library was not initialized.</td>
9814 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_ALLOC_FAILED</samp></td>
9815 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the resources could not be allocated.</td>
9818 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_INVALID_VALUE</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">invalid parameters were passed (<samp class="ph codeph">m,nnz&lt;0</samp>).
9823 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_ARCH_MISMATCH</samp></td>
9824 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the device does not support double precision.</td>
9827 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_EXECUTION_FAILED</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the function failed to launch on the GPU.</td>
9831 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_INTERNAL_ERROR</samp></td>
9832 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">an internal operation failed.</td>
9835 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_MATRIX_TYPE_NOT_SUPPORTED</samp></td>
9836 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">
9839 the matrix type is not supported.
9847 <div class="topic concept nested1" id="cusparse-lt-t-gt-csrsvsolve"><a name="cusparse-lt-t-gt-csrsvsolve" shape="rect">
<!-- --></a><h3 class="title topictitle2"><a href="#cusparse-lt-t-gt-csrsvsolve" name="cusparse-lt-t-gt-csrsvsolve" shape="rect">7.5. cusparse&lt;t&gt;csrsv_solve</a></h3>
<div class="body conbody"><pre xml:space="preserve">cusparseStatus_t
cusparseScsrsv_solve(cusparseHandle_t handle,
                     cusparseOperation_t transA,
                     int m, const float *alpha,
                     const cusparseMatDescr_t descrA,
                     const float *csrValA,
                     const int *csrRowPtrA, const int *csrColIndA,
                     cusparseSolveAnalysisInfo_t info,
                     const float *x, float *y)
cusparseStatus_t
cusparseDcsrsv_solve(cusparseHandle_t handle,
                     cusparseOperation_t transA,
                     int m, const double *alpha,
                     const cusparseMatDescr_t descrA,
                     const double *csrValA,
                     const int *csrRowPtrA, const int *csrColIndA,
                     cusparseSolveAnalysisInfo_t info,
                     const double *x, double *y)
cusparseStatus_t
cusparseCcsrsv_solve(cusparseHandle_t handle,
                     cusparseOperation_t transA,
                     int m, const cuComplex *alpha,
                     const cusparseMatDescr_t descrA,
                     const cuComplex *csrValA,
                     const int *csrRowPtrA, const int *csrColIndA,
                     cusparseSolveAnalysisInfo_t info,
                     const cuComplex *x, cuComplex *y)
cusparseStatus_t
cusparseZcsrsv_solve(cusparseHandle_t handle,
                     cusparseOperation_t transA,
                     int m, const cuDoubleComplex *alpha,
                     const cusparseMatDescr_t descrA,
                     const cuDoubleComplex *csrValA,
                     const int *csrRowPtrA, const int *csrColIndA,
                     cusparseSolveAnalysisInfo_t info,
                     const cuDoubleComplex *x, cuDoubleComplex *y)</pre><p class="p">This function performs the solve phase of the solution of a sparse triangular linear system</p>
9885 <div class="tablenoborder">
9886 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="void" border="1" rules="all">
9887 <tbody class="tbody">
9889 <td class="entry" align="center" valign="top" rowspan="1" colspan="1">
9890 <math xmlns="http://www.w3.org/1998/Math/MathML">
9891 <mrow class="MJX-TeXAtom-ORD">
9892 <mrow class="MJX-TeXAtom-ORD">
9896 <mo stretchy="false">(</mo>
9898 <mo stretchy="false">)</mo>
9900 <mrow class="MJX-TeXAtom-ORD">
9901 <mrow class="MJX-TeXAtom-ORD">
9908 <mrow class="MJX-TeXAtom-ORD">
9909 <mrow class="MJX-TeXAtom-ORD">
9920 <math xmlns="http://www.w3.org/1998/Math/MathML">
</math> is an <samp class="ph codeph">m×m</samp> sparse matrix (defined in CSR storage format by the three arrays <samp class="ph codeph">csrValA</samp>, <samp class="ph codeph">csrRowPtrA</samp>, and <samp class="ph codeph">csrColIndA</samp>), <samp class="ph codeph">x</samp> and <samp class="ph codeph">y</samp> are the right-hand-side and the solution vectors,
9923 <math xmlns="http://www.w3.org/1998/Math/MathML">
9925 </math> is a scalar, and
9927 <math xmlns="http://www.w3.org/1998/Math/MathML">
9928 <mrow class="MJX-TeXAtom-ORD">
9929 <mrow class="MJX-TeXAtom-ORD">
9933 <mo stretchy="false">(</mo>
9935 <mo stretchy="false">)</mo>
9937 <mfenced open="{" close="">
9938 <mtable columnalign="left left" rowspacing=".1em" columnspacing="1em">
9944 <mtext>if trans == CUSPARSE_OPERATION_NON_TRANSPOSE</mtext>
9955 <mtext>if trans == CUSPARSE_OPERATION_TRANSPOSE</mtext>
9966 <mtext>if trans == CUSPARSE_OPERATION_CONJUGATE_TRANSPOSE</mtext>
9972 <p class="p">This function may be executed multiple times for a given matrix and a particular operation type.</p>
9973 <p class="p">This function requires no extra storage. It is executed asynchronously with respect to the host and it may return control
9974 to the application on the host before the result is ready.
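<p class="p">For checking results, the solve op(A) * y = alpha * x can be reproduced on the host by forward substitution. The following is a minimal sketch under stated assumptions: <samp class="ph codeph">csrsv_lower_ref</samp> is a hypothetical name, and it covers only the non-transpose operation on a lower triangular matrix with zero-based indices and real double-precision values. The <samp class="ph codeph">unit_diag</samp> flag mirrors the unit and non-unit diagonal types of the matrix descriptor.</p>

```c
#include <assert.h>

/* Hypothetical host reference: solves A * y = alpha * x by forward
   substitution, where A is lower triangular in zero-based CSR.
   With unit_diag != 0 the diagonal is taken to be 1 and stored diagonal
   entries are ignored. Entries above the diagonal are skipped.
   Returns 0 on success and -1 if a required diagonal entry is missing
   or zero. */
static int csrsv_lower_ref(int m, double alpha, int unit_diag,
                           const double *csrVal, const int *csrRowPtr,
                           const int *csrColInd,
                           const double *x, double *y)
{
    assert(m >= 0);
    for (int i = 0; i < m; ++i) {
        double sum = alpha * x[i];
        double diag = unit_diag ? 1.0 : 0.0;
        for (int k = csrRowPtr[i]; k < csrRowPtr[i + 1]; ++k) {
            int j = csrColInd[k];
            if (j < i)
                sum -= csrVal[k] * y[j];   /* y[j] is already solved */
            else if (j == i && !unit_diag)
                diag = csrVal[k];
        }
        if (diag == 0.0)
            return -1;
        y[i] = sum / diag;
    }
    return 0;
}
```

<p class="p">Row <samp class="ph codeph">i</samp> reads only entries of <samp class="ph codeph">y</samp> that earlier iterations have produced, so the loop order is what makes the substitution valid.</p>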
<div class="tablenoborder">
<table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Input</strong></span><tbody class="tbody">
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">handle</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">handle to the CUSPARSE library context.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">trans</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the operation <math xmlns="http://www.w3.org/1998/Math/MathML"><mtext>op</mtext><mo stretchy="false">(</mo><mi>A</mi><mo stretchy="false">)</mo></math>.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">m</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">number of rows and columns of matrix <math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math>.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">alpha</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">&lt;type&gt; scalar used for multiplication.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">descrA</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the descriptor of matrix <math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math>. The supported matrix types are <samp class="ph codeph">CUSPARSE_MATRIX_TYPE_TRIANGULAR</samp> and <samp class="ph codeph">CUSPARSE_MATRIX_TYPE_GENERAL</samp>, while the supported diagonal types are <samp class="ph codeph">CUSPARSE_DIAG_TYPE_UNIT</samp> and <samp class="ph codeph">CUSPARSE_DIAG_TYPE_NON_UNIT</samp>.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">csrValA</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">&lt;type&gt; array of <samp class="ph codeph">nnz</samp> <math xmlns="http://www.w3.org/1998/Math/MathML"><mo stretchy="false">(</mo><mo>=</mo></math><samp class="ph codeph">csrRowPtrA(m)</samp><math xmlns="http://www.w3.org/1998/Math/MathML"><mo>−</mo></math><samp class="ph codeph">csrRowPtrA(0)</samp><math xmlns="http://www.w3.org/1998/Math/MathML"><mo stretchy="false">)</mo></math> non-zero elements of matrix <math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math>.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">csrRowPtrA</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">integer array of <samp class="ph codeph">m</samp><math xmlns="http://www.w3.org/1998/Math/MathML"><mo>+</mo><mn>1</mn></math> elements that contains the start of every row and the end of the last row plus one.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">csrColIndA</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">integer array of <samp class="ph codeph">nnz</samp> <math xmlns="http://www.w3.org/1998/Math/MathML"><mo stretchy="false">(</mo><mo>=</mo></math><samp class="ph codeph">csrRowPtrA(m)</samp><math xmlns="http://www.w3.org/1998/Math/MathML"><mo>−</mo></math><samp class="ph codeph">csrRowPtrA(0)</samp><math xmlns="http://www.w3.org/1998/Math/MathML"><mo stretchy="false">)</mo></math> column indices of the non-zero elements of matrix <math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math>.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">info</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">structure with information collected during the analysis phase (that should have been passed to the solve phase unchanged).</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">x</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">&lt;type&gt; right-hand-side vector of size <samp class="ph codeph">m</samp>.</td>
</tr>
</tbody></table>
</div>
<div class="tablenoborder">
<table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Output</strong></span><tbody class="tbody">
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">y</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">&lt;type&gt; solution vector of size <samp class="ph codeph">m</samp>.</td>
</tr>
</tbody></table>
</div>
<div class="tablenoborder">
<table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Status Returned</strong></span><tbody class="tbody">
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_SUCCESS</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the operation completed successfully.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_NOT_INITIALIZED</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the library was not initialized.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_INVALID_VALUE</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">invalid parameters were passed (<samp class="ph codeph">m&lt;0</samp>).</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_ARCH_MISMATCH</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the device does not support double precision.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_MAPPING_ERROR</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the texture binding failed.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_EXECUTION_FAILED</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the function failed to launch on the GPU.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_INTERNAL_ERROR</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">an internal operation failed.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_MATRIX_TYPE_NOT_SUPPORTED</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the matrix type is not supported.</td>
</tr>
</tbody></table>
</div>
10129 <div class="topic concept nested1" id="cusparse-lt-t-gt-hybmv"><a name="cusparse-lt-t-gt-hybmv" shape="rect">
<!-- --></a><h3 class="title topictitle2"><a href="#cusparse-lt-t-gt-hybmv" name="cusparse-lt-t-gt-hybmv" shape="rect">7.6. cusparse&lt;t&gt;hybmv</a></h3>
<div class="body conbody"><pre xml:space="preserve">cusparseStatus_t
cusparseShybmv(cusparseHandle_t handle, cusparseOperation_t transA,
               const float *alpha,
               const cusparseMatDescr_t descrA,
               const cusparseHybMat_t hybA, const float *x,
               const float *beta, float *y)

cusparseStatus_t
cusparseDhybmv(cusparseHandle_t handle, cusparseOperation_t transA,
               const double *alpha,
               const cusparseMatDescr_t descrA,
               const cusparseHybMat_t hybA, const double *x,
               const double *beta, double *y)

cusparseStatus_t
cusparseChybmv(cusparseHandle_t handle, cusparseOperation_t transA,
               const cuComplex *alpha,
               const cusparseMatDescr_t descrA,
               const cusparseHybMat_t hybA, const cuComplex *x,
               const cuComplex *beta, cuComplex *y)

cusparseStatus_t
cusparseZhybmv(cusparseHandle_t handle, cusparseOperation_t transA,
               const cuDoubleComplex *alpha,
               const cusparseMatDescr_t descrA,
               const cusparseHybMat_t hybA, const cuDoubleComplex *x,
               const cuDoubleComplex *beta, cuDoubleComplex *y)</pre><p class="p">This function performs the matrix-vector operation</p>
<div class="tablenoborder">
<table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="void" border="1" rules="all">
<tbody class="tbody">
<tr class="row">
<td class="entry" align="center" valign="top" rowspan="1" colspan="1">
<math xmlns="http://www.w3.org/1998/Math/MathML">
<mi>y</mi><mo>=</mo><mi>α</mi><mo>∗</mo><mtext>op</mtext><mo stretchy="false">(</mo><mi>A</mi><mo stretchy="false">)</mo><mo>∗</mo><mi>x</mi><mo>+</mo><mi>β</mi><mo>∗</mo><mi>y</mi>
</math>
</td>
</tr>
</tbody>
</table>
</div>
<p class="p">where
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math> is an <samp class="ph codeph">m×n</samp> sparse matrix (that is defined in the HYB storage format by an opaque data structure <samp class="ph codeph">hybA</samp>), <samp class="ph codeph">x</samp> and <samp class="ph codeph">y</samp> are vectors,
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>α</mi><mtext> and </mtext><mi>β</mi></math> are scalars, and</p>
<p class="p">
<math xmlns="http://www.w3.org/1998/Math/MathML">
<mtext>op</mtext><mo stretchy="false">(</mo><mi>A</mi><mo stretchy="false">)</mo><mo>=</mo>
<mfenced open="{" close="">
<mtable columnalign="left left" rowspacing=".1em" columnspacing="1em">
<mtr><mtd><mi>A</mi></mtd><mtd><mtext> if transA == CUSPARSE_OPERATION_NON_TRANSPOSE</mtext></mtd></mtr>
</mtable>
</mfenced>
</math>
</p>
<p class="p">Note that currently only
<math xmlns="http://www.w3.org/1998/Math/MathML"><mtext>op</mtext><mo stretchy="false">(</mo><mi>A</mi><mo stretchy="false">)</mo><mo>=</mo><mi>A</mi></math>
is supported.</p>
<p class="p">This function requires no extra storage. It is executed asynchronously with respect to the host and may return control to the application on the host before the result is ready.</p>
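<p class="p">The operation can be pictured with a small CPU sketch. Because <samp class="ph codeph">hybA</samp> is opaque, the layout below is an assumption made purely for illustration: a regular ELL part stored column by column with <samp class="ph codeph">-1</samp> marking padding, plus an irregular COO part. The names <samp class="ph codeph">RefHybMatrix</samp> and <samp class="ph codeph">ref_hybmv</samp> are hypothetical and are not part of the library.</p>

```c
#include <assert.h>

/* Hypothetical host-side layout, for illustration only. The real
 * cusparseHybMat_t is opaque and its fields are not described here. */
typedef struct {
    int m, n;               /* matrix dimensions */
    int ell_width;          /* entries stored per row in the ELL part */
    const double *ell_val;  /* m * ell_width values, stored column by column */
    const int *ell_col;     /* matching column indices, -1 marks padding */
    int coo_nnz;            /* number of entries in the COO part */
    const double *coo_val;
    const int *coo_row;
    const int *coo_col;
} RefHybMatrix;

/* Computes y = alpha * A * x + beta * y on the CPU (non-transpose only). */
void ref_hybmv(const RefHybMatrix *A, double alpha, const double *x,
               double beta, double *y)
{
    assert(A->m >= 0 && A->n >= 0);
    for (int i = 0; i < A->m; ++i) {
        /* When beta is zero, y is never read, so it need not hold valid input. */
        double acc = (beta == 0.0) ? 0.0 : beta * y[i];
        double row = 0.0;
        for (int k = 0; k < A->ell_width; ++k) {
            int idx = k * A->m + i;
            int j = A->ell_col[idx];
            if (j >= 0) {
                row += A->ell_val[idx] * x[j];
            }
        }
        y[i] = acc + alpha * row;
    }
    /* Entries that did not fit in the ELL part are added from the COO part. */
    for (int k = 0; k < A->coo_nnz; ++k) {
        y[A->coo_row[k]] += alpha * A->coo_val[k] * x[A->coo_col[k]];
    }
}
```

<p class="p">The sketch also shows why <samp class="ph codeph">y</samp> does not have to be a valid input when <samp class="ph codeph">beta</samp> is zero: in that case the old contents of <samp class="ph codeph">y</samp> are never read.</p>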
<div class="tablenoborder">
<table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Input</strong></span><tbody class="tbody">
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">handle</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">handle to the CUSPARSE library context.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">transA</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the operation <math xmlns="http://www.w3.org/1998/Math/MathML"><mtext>op</mtext><mo stretchy="false">(</mo><mi>A</mi><mo stretchy="false">)</mo></math> (currently only <math xmlns="http://www.w3.org/1998/Math/MathML"><mtext>op</mtext><mo stretchy="false">(</mo><mi>A</mi><mo stretchy="false">)</mo><mo>=</mo><mi>A</mi></math> is supported).</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">m</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">number of rows of matrix <math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math>.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">n</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">number of columns of matrix <math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math>.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">alpha</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">&lt;type&gt; scalar used for multiplication.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">descrA</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the descriptor of matrix <math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math>. The supported matrix type is <samp class="ph codeph">CUSPARSE_MATRIX_TYPE_GENERAL</samp>.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">hybA</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the matrix <math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math> in HYB storage format.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">x</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">&lt;type&gt; vector of <samp class="ph codeph">n</samp> elements.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">beta</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">&lt;type&gt; scalar used for multiplication. If <samp class="ph codeph">beta</samp> is zero, <samp class="ph codeph">y</samp> does not have to be a valid input.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">y</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">&lt;type&gt; vector of <samp class="ph codeph">m</samp> elements.</td>
</tr>
</tbody></table>
</div>
<div class="tablenoborder">
<table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Output</strong></span><tbody class="tbody">
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">y</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">&lt;type&gt; updated vector.</td>
</tr>
</tbody></table>
</div>
<div class="tablenoborder">
<table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Status Returned</strong></span><tbody class="tbody">
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_SUCCESS</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the operation completed successfully.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_NOT_INITIALIZED</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the library was not initialized.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_ALLOC_FAILED</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the resources could not be allocated.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_INVALID_VALUE</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the internally stored HYB format parameters are invalid.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_ARCH_MISMATCH</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the device does not support double precision.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_EXECUTION_FAILED</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the function failed to launch on the GPU.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_INTERNAL_ERROR</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">an internal operation failed.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_MATRIX_TYPE_NOT_SUPPORTED</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the matrix type is not supported.</td>
</tr>
</tbody></table>
</div>
10403 <div class="topic concept nested1" id="cusparse-lt-t-gt-hybsvanalysis"><a name="cusparse-lt-t-gt-hybsvanalysis" shape="rect">
<!-- --></a><h3 class="title topictitle2"><a href="#cusparse-lt-t-gt-hybsvanalysis" name="cusparse-lt-t-gt-hybsvanalysis" shape="rect">7.7. cusparse&lt;t&gt;hybsv_analysis</a></h3>
<div class="body conbody"><pre xml:space="preserve">cusparseStatus_t
cusparseShybsv_analysis(cusparseHandle_t handle,
                        cusparseOperation_t transA,
                        const cusparseMatDescr_t descrA,
                        cusparseHybMat_t hybA,
                        cusparseSolveAnalysisInfo_t info)

cusparseStatus_t
cusparseDhybsv_analysis(cusparseHandle_t handle,
                        cusparseOperation_t transA,
                        const cusparseMatDescr_t descrA,
                        cusparseHybMat_t hybA,
                        cusparseSolveAnalysisInfo_t info)

cusparseStatus_t
cusparseChybsv_analysis(cusparseHandle_t handle,
                        cusparseOperation_t transA,
                        const cusparseMatDescr_t descrA,
                        cusparseHybMat_t hybA,
                        cusparseSolveAnalysisInfo_t info)

cusparseStatus_t
cusparseZhybsv_analysis(cusparseHandle_t handle,
                        cusparseOperation_t transA,
                        const cusparseMatDescr_t descrA,
                        cusparseHybMat_t hybA,
                        cusparseSolveAnalysisInfo_t info)</pre><p class="p">This function performs the analysis phase of the solution of a sparse triangular linear system</p>
<div class="tablenoborder">
<table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="void" border="1" rules="all">
<tbody class="tbody">
<tr class="row">
<td class="entry" align="center" valign="top" rowspan="1" colspan="1">
<math xmlns="http://www.w3.org/1998/Math/MathML">
<mtext>op</mtext><mo stretchy="false">(</mo><mi>A</mi><mo stretchy="false">)</mo><mo>∗</mo><mi>y</mi><mo>=</mo><mi>α</mi><mo>∗</mo><mi>x</mi>
</math>
</td>
</tr>
</tbody>
</table>
</div>
<p class="p">where
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math> is an <samp class="ph codeph">m×m</samp> sparse matrix (that is defined in HYB storage format by an opaque data structure <samp class="ph codeph">hybA</samp>), <samp class="ph codeph">x</samp> and <samp class="ph codeph">y</samp> are the right-hand-side and the solution vectors,
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>α</mi></math> is a scalar, and</p>
<p class="p">
<math xmlns="http://www.w3.org/1998/Math/MathML">
<mtext>op</mtext><mo stretchy="false">(</mo><mi>A</mi><mo stretchy="false">)</mo><mo>=</mo>
<mfenced open="{" close="">
<mtable columnalign="left left" rowspacing=".1em" columnspacing="1em">
<mtr><mtd><mi>A</mi></mtd><mtd><mtext> if transA == CUSPARSE_OPERATION_NON_TRANSPOSE</mtext></mtd></mtr>
</mtable>
</mfenced>
</math>
</p>
<p class="p">Note that currently only
<math xmlns="http://www.w3.org/1998/Math/MathML"><mtext>op</mtext><mo stretchy="false">(</mo><mi>A</mi><mo stretchy="false">)</mo><mo>=</mo><mi>A</mi></math>
is supported.</p>
10512 <p class="p">It is expected that this function will be executed only once for a given matrix and a particular operation type.</p>
<p class="p">This function requires a significant amount of extra storage that is proportional to the matrix size. It is executed asynchronously with respect to the host and may return control to the application on the host before the result is ready.</p>
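<p class="p">To give a feel for what a dependency analysis of a triangular matrix can produce, the sketch below uses level scheduling, a common technique; it is not claimed to be the method used by the library. It assumes a lower triangular matrix given in zero-based CSR arrays rather than in HYB format, and the function name <samp class="ph codeph">ref_lower_levels</samp> is hypothetical.</p>

```c
#include <assert.h>

/* Hypothetical analysis sketch based on level scheduling; not the library's
 * method. For a lower triangular m-by-m matrix in zero-based CSR, assigns
 * each row a level: level[i] is 0 when row i has no entries left of the
 * diagonal, otherwise 1 + the largest level among the rows it depends on.
 * Rows that share a level do not depend on each other.
 * Returns the number of distinct levels. */
int ref_lower_levels(int m, const int *csrRowPtrA, const int *csrColIndA,
                     int *level)
{
    assert(m >= 0);
    int num_levels = 0;
    for (int i = 0; i < m; ++i) {
        int lev = 0;
        for (int k = csrRowPtrA[i]; k < csrRowPtrA[i + 1]; ++k) {
            int j = csrColIndA[k];
            if (j < i && level[j] + 1 > lev) {
                lev = level[j] + 1;   /* row i must wait for row j */
            }
        }
        level[i] = lev;
        if (lev + 1 > num_levels) {
            num_levels = lev + 1;
        }
    }
    return num_levels;
}
```

<p class="p">The result depends only on the sparsity pattern, which is consistent with running the analysis once per matrix and operation type and then reusing its output across many solves. The <samp class="ph codeph">level</samp> array is also a small example of extra storage that grows with the matrix size.</p>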
<div class="tablenoborder">
<table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Input</strong></span><tbody class="tbody">
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">handle</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">handle to the CUSPARSE library context.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">transA</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the operation <math xmlns="http://www.w3.org/1998/Math/MathML"><mtext>op</mtext><mo stretchy="false">(</mo><mi>A</mi><mo stretchy="false">)</mo></math> (currently only <math xmlns="http://www.w3.org/1998/Math/MathML"><mtext>op</mtext><mo stretchy="false">(</mo><mi>A</mi><mo stretchy="false">)</mo><mo>=</mo><mi>A</mi></math> is supported).</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">descrA</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the descriptor of matrix <math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math>. The supported matrix type is <samp class="ph codeph">CUSPARSE_MATRIX_TYPE_TRIANGULAR</samp> and the supported diagonal type is <samp class="ph codeph">CUSPARSE_DIAG_TYPE_NON_UNIT</samp>.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">hybA</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the matrix <math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math> in HYB storage format.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">info</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">structure initialized using <samp class="ph codeph">cusparseCreateSolveAnalysisInfo</samp>.</td>
</tr>
</tbody></table>
</div>
<div class="tablenoborder">
<table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Output</strong></span><tbody class="tbody">
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">info</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">structure filled with information collected during the analysis phase (that should be passed to the solve phase unchanged).</td>
</tr>
</tbody></table>
</div>
<div class="tablenoborder">
<table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Status Returned</strong></span><tbody class="tbody">
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_SUCCESS</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the operation completed successfully.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_NOT_INITIALIZED</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the library was not initialized.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_ALLOC_FAILED</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the resources could not be allocated.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_INVALID_VALUE</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the internally stored HYB format parameters are invalid.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_ARCH_MISMATCH</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the device does not support double precision.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_EXECUTION_FAILED</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the function failed to launch on the GPU.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_INTERNAL_ERROR</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">an internal operation failed.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_MATRIX_TYPE_NOT_SUPPORTED</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the matrix type is not supported.</td>
</tr>
</tbody></table>
</div>
10634 <div class="topic concept nested1" id="cusparse-lt-t-gt-hybsvsolve"><a name="cusparse-lt-t-gt-hybsvsolve" shape="rect">
<!-- --></a><h3 class="title topictitle2"><a href="#cusparse-lt-t-gt-hybsvsolve" name="cusparse-lt-t-gt-hybsvsolve" shape="rect">7.8. cusparse&lt;t&gt;hybsv_solve</a></h3>
<div class="body conbody"><pre xml:space="preserve">cusparseStatus_t
cusparseShybsv_solve(cusparseHandle_t handle,
                     cusparseOperation_t transA,
                     const float *alpha,
                     const cusparseMatDescr_t descrA,
                     cusparseHybMat_t hybA,
                     cusparseSolveAnalysisInfo_t info,
                     const float *x, float *y)

cusparseStatus_t
cusparseDhybsv_solve(cusparseHandle_t handle,
                     cusparseOperation_t transA,
                     const double *alpha,
                     const cusparseMatDescr_t descrA,
                     cusparseHybMat_t hybA,
                     cusparseSolveAnalysisInfo_t info,
                     const double *x, double *y)

cusparseStatus_t
cusparseChybsv_solve(cusparseHandle_t handle,
                     cusparseOperation_t transA,
                     const cuComplex *alpha,
                     const cusparseMatDescr_t descrA,
                     cusparseHybMat_t hybA,
                     cusparseSolveAnalysisInfo_t info,
                     const cuComplex *x, cuComplex *y)

cusparseStatus_t
cusparseZhybsv_solve(cusparseHandle_t handle,
                     cusparseOperation_t transA,
                     const cuDoubleComplex *alpha,
                     const cusparseMatDescr_t descrA,
                     cusparseHybMat_t hybA,
                     cusparseSolveAnalysisInfo_t info,
                     const cuDoubleComplex *x, cuDoubleComplex *y)</pre><p class="p">This function performs the solve phase of the solution of a sparse triangular linear system</p>
<div class="tablenoborder">
<table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="void" border="1" rules="all">
<tbody class="tbody">
<tr class="row">
<td class="entry" align="center" valign="top" rowspan="1" colspan="1">
<math xmlns="http://www.w3.org/1998/Math/MathML">
<mtext>op</mtext><mo stretchy="false">(</mo><mi>A</mi><mo stretchy="false">)</mo><mo>∗</mo><mi>y</mi><mo>=</mo><mi>α</mi><mo>∗</mo><mi>x</mi>
</math>
</td>
</tr>
</tbody>
</table>
</div>
<p class="p">where
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math> is an <samp class="ph codeph">m×m</samp> sparse matrix (that is defined in HYB storage format by an opaque data structure <samp class="ph codeph">hybA</samp>), <samp class="ph codeph">x</samp> and <samp class="ph codeph">y</samp> are the right-hand-side and the solution vectors,
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>α</mi></math> is a scalar, and</p>
<p class="p">
<math xmlns="http://www.w3.org/1998/Math/MathML">
<mtext>op</mtext><mo stretchy="false">(</mo><mi>A</mi><mo stretchy="false">)</mo><mo>=</mo>
<mfenced open="{" close="">
<mtable columnalign="left left" rowspacing=".1em" columnspacing="1em">
<mtr><mtd><mi>A</mi></mtd><mtd><mtext> if transA == CUSPARSE_OPERATION_NON_TRANSPOSE</mtext></mtd></mtr>
</mtable>
</mfenced>
</math>
</p>
<p class="p">Note that currently only
<math xmlns="http://www.w3.org/1998/Math/MathML"><mtext>op</mtext><mo stretchy="false">(</mo><mi>A</mi><mo stretchy="false">)</mo><mo>=</mo><mi>A</mi></math>
is supported.</p>
10751 <p class="p">This function may be executed multiple times for a given matrix and a particular operation type.</p>
<p class="p">This function requires no extra storage. It is executed asynchronously with respect to the host and may return control to the application on the host before the result is ready.</p>
<div class="tablenoborder">
<table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Input</strong></span><tbody class="tbody">
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">handle</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">handle to the CUSPARSE library context.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">transA</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the operation <math xmlns="http://www.w3.org/1998/Math/MathML"><mtext>op</mtext><mo stretchy="false">(</mo><mi>A</mi><mo stretchy="false">)</mo></math> (currently only <math xmlns="http://www.w3.org/1998/Math/MathML"><mtext>op</mtext><mo stretchy="false">(</mo><mi>A</mi><mo stretchy="false">)</mo><mo>=</mo><mi>A</mi></math> is supported).</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">alpha</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">&lt;type&gt; scalar used for multiplication.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">descrA</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the descriptor of matrix <math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math>. The supported matrix type is <samp class="ph codeph">CUSPARSE_MATRIX_TYPE_TRIANGULAR</samp> and the supported diagonal type is <samp class="ph codeph">CUSPARSE_DIAG_TYPE_NON_UNIT</samp>.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">hybA</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the matrix <math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math> in HYB storage format.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">info</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">structure with information collected during the analysis phase (that should be passed to the solve phase unchanged).</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">x</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">&lt;type&gt; right-hand-side vector of size <samp class="ph codeph">m</samp>.</td>
</tr>
</tbody></table>
</div>
<div class="tablenoborder">
<table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Output</strong></span><tbody class="tbody">
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">y</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">&lt;type&gt; solution vector of size <samp class="ph codeph">m</samp>.</td>
</tr>
</tbody></table>
</div>
<div class="tablenoborder">
<table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Status Returned</strong></span><tbody class="tbody">
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_SUCCESS</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the operation completed successfully.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_NOT_INITIALIZED</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the library was not initialized.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_INVALID_VALUE</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the internally stored HYB format parameters are invalid.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_ARCH_MISMATCH</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the device does not support double precision.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_MAPPING_ERROR</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the texture binding failed.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_EXECUTION_FAILED</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the function failed to launch on the GPU.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_INTERNAL_ERROR</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">an internal operation failed.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_MATRIX_TYPE_NOT_SUPPORTED</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the matrix type is not supported.</td>
</tr>
</tbody></table>
</div>
10883 <div class="topic concept nested0" id="cusparse-level-3-function-reference"><a name="cusparse-level-3-function-reference" shape="rect">
10884 <!-- --></a><h2 class="title topictitle1"><a href="#cusparse-level-3-function-reference" name="cusparse-level-3-function-reference" shape="rect">8. CUSPARSE Level 3 Function Reference</a></h2>
10885 <div class="body conbody">
10886 <p class="p">This chapter describes sparse linear algebra functions that perform operations between sparse and (usually tall) dense matrices.</p>
10887 <p class="p">In particular, the solution of sparse triangular linear systems with multiple right-hand-sides is implemented in two phases.
10888 First, during the analysis phase, the sparse triangular matrix is analyzed to determine the dependencies between its elements
10889 by calling the appropriate <samp class="ph codeph">csrsm_analysis()</samp> function. The analysis is specific to the sparsity pattern of the given matrix and to the selected <samp class="ph codeph">cusparseOperation_t</samp> type. The information from the analysis phase is stored in the parameter of type <samp class="ph codeph">cusparseSolveAnalysisInfo_t</samp> that has been initialized previously with a call to <samp class="ph codeph">cusparseCreateSolveAnalysisInfo()</samp>.
<p class="p">Second, during the solve phase, the given sparse triangular linear system is solved using the information stored in the <samp class="ph codeph">cusparseSolveAnalysisInfo_t</samp> parameter by calling the appropriate <samp class="ph codeph">csrsm_solve()</samp> function. The solve phase may be performed multiple times with different sets of right-hand-sides, while the analysis phase
needs to be performed only once. This is especially useful when a sparse triangular linear system must be solved for several
sets of right-hand-sides, one set at a time, while its coefficient matrix remains the same.</p>
10895 <p class="p">Finally, once all the solves have completed, the opaque data structure pointed to by the <samp class="ph codeph">cusparseSolveAnalysisInfo_t</samp> parameter can be released by calling <samp class="ph codeph">cusparseDestroySolveAnalysisInfo()</samp>. For more information please refer to [3].
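<p class="p">The split between an analysis phase and a solve phase can be illustrated on the CPU. The sketch below is not the CUSPARSE implementation, whose analysis data is opaque; it assumes a lower triangular matrix in zero-based CSR format and uses level scheduling, one common way to record row dependencies. The names <samp class="ph codeph">tri_analysis_t</samp>, <samp class="ph codeph">tri_analysis()</samp>, <samp class="ph codeph">tri_solve()</samp>, and <samp class="ph codeph">tri_analysis_destroy()</samp> are hypothetical.</p>

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the opaque analysis information:
   level[i] is the length of the longest dependency chain that ends at row i. */
typedef struct {
    int  m;
    int  num_levels;
    int *level;
} tri_analysis_t;

/* Analysis phase: looks only at the sparsity pattern of the lower triangular
   CSR matrix (zero-based indices). Returns 0 on success, -1 on allocation failure. */
int tri_analysis(int m, const int *rowPtr, const int *colInd, tri_analysis_t *info)
{
    info->m = m;
    info->num_levels = 0;
    info->level = malloc((size_t)(m > 0 ? m : 1) * sizeof *info->level);
    if (info->level == NULL) return -1;
    for (int i = 0; i < m; ++i) {
        int lv = 0;
        for (int p = rowPtr[i]; p < rowPtr[i + 1]; ++p) {
            int j = colInd[p];
            /* row i needs the solution of every earlier row j it references */
            if (j < i && info->level[j] + 1 > lv) lv = info->level[j] + 1;
        }
        info->level[i] = lv;
        if (lv + 1 > info->num_levels) info->num_levels = lv + 1;
    }
    return 0;
}

/* Solve phase: solves L * Y = X for n right-hand-sides stored column-major.
   Rows that share a level are independent of each other, so a parallel
   implementation could process each level concurrently. */
void tri_solve(const tri_analysis_t *info, int n,
               const double *val, const int *rowPtr, const int *colInd,
               const double *X, int ldx, double *Y, int ldy)
{
    assert(ldx >= info->m && ldy >= info->m);
    for (int lv = 0; lv < info->num_levels; ++lv) {
        for (int i = 0; i < info->m; ++i) {
            if (info->level[i] != lv) continue;
            for (int c = 0; c < n; ++c) {
                double sum = X[i + c * ldx];
                double diag = 1.0;
                for (int p = rowPtr[i]; p < rowPtr[i + 1]; ++p) {
                    int j = colInd[p];
                    if (j < i)       sum -= val[p] * Y[j + c * ldy];
                    else if (j == i) diag = val[p];
                }
                Y[i + c * ldy] = sum / diag;
            }
        }
    }
}

/* Releases the analysis data once all solves have completed. */
void tri_analysis_destroy(tri_analysis_t *info)
{
    free(info->level);
    info->level = NULL;
}
```

<p class="p">The analysis result depends only on <samp class="ph codeph">rowPtr</samp> and <samp class="ph codeph">colInd</samp>, so one call to <samp class="ph codeph">tri_analysis()</samp> can serve any number of later <samp class="ph codeph">tri_solve()</samp> calls with different right-hand-sides.</p>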
10898 <div class="topic concept nested1" id="cusparse-lt-t-gt-csrmm"><a name="cusparse-lt-t-gt-csrmm" shape="rect">
<!-- --></a><h3 class="title topictitle2"><a href="#cusparse-lt-t-gt-csrmm" name="cusparse-lt-t-gt-csrmm" shape="rect">8.1. cusparse&lt;t&gt;csrmm</a></h3>
<div class="body conbody"><pre xml:space="preserve">cusparseStatus_t
cusparseScsrmm(cusparseHandle_t handle, cusparseOperation_t transA,
               int m, int n, int k, int nnz,
               const float *alpha,
               const cusparseMatDescr_t descrA,
               const float *csrValA,
               const int *csrRowPtrA, const int *csrColIndA,
               const float *B, int ldb,
               const float *beta, float *C, int ldc)

cusparseStatus_t
cusparseDcsrmm(cusparseHandle_t handle, cusparseOperation_t transA,
               int m, int n, int k, int nnz,
               const double *alpha,
               const cusparseMatDescr_t descrA,
               const double *csrValA,
               const int *csrRowPtrA, const int *csrColIndA,
               const double *B, int ldb,
               const double *beta, double *C, int ldc)

cusparseStatus_t
cusparseCcsrmm(cusparseHandle_t handle, cusparseOperation_t transA,
               int m, int n, int k, int nnz,
               const cuComplex *alpha,
               const cusparseMatDescr_t descrA,
               const cuComplex *csrValA,
               const int *csrRowPtrA, const int *csrColIndA,
               const cuComplex *B, int ldb,
               const cuComplex *beta, cuComplex *C, int ldc)

cusparseStatus_t
cusparseZcsrmm(cusparseHandle_t handle, cusparseOperation_t transA,
               int m, int n, int k, int nnz,
               const cuDoubleComplex *alpha,
               const cusparseMatDescr_t descrA,
               const cuDoubleComplex *csrValA,
               const int *csrRowPtrA, const int *csrColIndA,
               const cuDoubleComplex *B, int ldb,
               const cuDoubleComplex *beta, cuDoubleComplex *C, int ldc)</pre><p class="p">This function performs one of the following matrix-matrix operations:</p>
<div class="tablenoborder">
<table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="void" border="1" rules="all">
<tbody class="tbody">
<tr class="row">
<td class="entry" align="center" valign="top" rowspan="1" colspan="1">
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>C</mi><mo>=</mo><mi>α</mi><mo>∗</mo><mi mathvariant="normal">op</mi><mo stretchy="false">(</mo><mi>A</mi><mo stretchy="false">)</mo><mo>∗</mo><mi>B</mi><mo>+</mo><mi>β</mi><mo>∗</mo><mi>C</mi></math>
</td>
</tr>
</tbody>
</table>
</div>
<p class="p">where
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math> is an <samp class="ph codeph">m×k</samp> sparse matrix (that is defined in CSR storage format by the three arrays <samp class="ph codeph">csrValA</samp>, <samp class="ph codeph">csrRowPtrA</samp>, and <samp class="ph codeph">csrColIndA</samp>),
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>B</mi><mtext> and </mtext><mi>C</mi></math>
are dense matrices,
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>α</mi><mtext> and </mtext><mi>β</mi></math>
are scalars, and</p>
<p class="p">
<math xmlns="http://www.w3.org/1998/Math/MathML">
<mi mathvariant="normal">op</mi><mo stretchy="false">(</mo><mi>A</mi><mo stretchy="false">)</mo><mo>=</mo>
<mfenced open="{" close="">
<mtable columnalign="left left" rowspacing=".1em" columnspacing="1em">
<mtr><mtd><mi>A</mi></mtd><mtd><mtext>if transA == CUSPARSE_OPERATION_NON_TRANSPOSE</mtext></mtd></mtr>
<mtr><mtd><msup><mi>A</mi><mi>T</mi></msup></mtd><mtd><mtext>if transA == CUSPARSE_OPERATION_TRANSPOSE</mtext></mtd></mtr>
<mtr><mtd><msup><mi>A</mi><mi>H</mi></msup></mtd><mtd><mtext>if transA == CUSPARSE_OPERATION_CONJUGATE_TRANSPOSE</mtext></mtd></mtr>
</mtable>
</mfenced>
</math>
</p>
<p class="p">When using the (conjugate) transpose of a general matrix or a Hermitian/symmetric matrix, this routine may produce slightly
different results across runs with the same input parameters. For these matrix types it uses atomic
operations to compute the final result; consequently, many threads may add floating-point numbers to the same memory
location in no particular order, and because floating-point addition is not associative, the result may differ slightly from run to run.</p>
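<p class="p">The effect of summation order can be reproduced on the CPU with three single-precision values. This is a minimal sketch, not library code; the helper names <samp class="ph codeph">sum_forward()</samp> and <samp class="ph codeph">sum_backward()</samp> are hypothetical.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Adds x[0], x[1], ..., x[n-1] in that order, rounding to float after every step.
   The volatile accumulator prevents the compiler from keeping extra precision. */
float sum_forward(const float *x, size_t n)
{
    assert(x != NULL);
    volatile float acc = 0.0f;
    for (size_t i = 0; i < n; ++i) acc = acc + x[i];
    return acc;
}

/* Adds the same values in the opposite order. */
float sum_backward(const float *x, size_t n)
{
    assert(x != NULL);
    volatile float acc = 0.0f;
    for (size_t i = n; i-- > 0; ) acc = acc + x[i];
    return acc;
}
```

<p class="p">For the input <samp class="ph codeph">{1.0e8f, -1.0e8f, 1.0f}</samp>, the forward sum is <samp class="ph codeph">1.0f</samp> while the backward sum is <samp class="ph codeph">0.0f</samp>, because <samp class="ph codeph">1.0f</samp> is lost when it is added to <samp class="ph codeph">-1.0e8f</samp> first.</p>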
11039 <p class="p">If exactly the same output is required for any input when multiplying by the transpose of a general matrix, the following
11040 procedure can be used:
11042 <p class="p">1. Convert the matrix from CSR to CSC format using one of the <samp class="ph codeph">csr2csc()</samp> functions. Notice that by interchanging the rows and columns of the result you are implicitly transposing the matrix.
<p class="p">2. Call the <samp class="ph codeph">csrmm()</samp> function with the <samp class="ph codeph">cusparseOperation_t</samp> parameter set to <samp class="ph codeph">CUSPARSE_OPERATION_NON_TRANSPOSE</samp> and with the interchanged rows and columns of the matrix stored in CSC format. This (implicitly) multiplies the dense matrix by
the transpose of the matrix in the original CSR format.</p>
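<p class="p">The reason step 1 yields the transpose can be seen from a CPU version of the conversion. The function below is a hypothetical reference routine (<samp class="ph codeph">csr_to_csc()</samp> is not a CUSPARSE function) that assumes zero-based indices; the CSC arrays it produces for an <samp class="ph codeph">m×k</samp> matrix are, read as CSR arrays, exactly the <samp class="ph codeph">k×m</samp> transpose.</p>

```c
#include <assert.h>
#include <string.h>

/* Converts an m-by-k matrix from CSR to CSC (zero-based indices).
   cscColPtr must hold k+1 entries; cscRowInd and cscVal must hold nnz entries. */
void csr_to_csc(int m, int k,
                const double *csrVal, const int *csrRowPtr, const int *csrColInd,
                double *cscVal, int *cscRowInd, int *cscColPtr)
{
    assert(m >= 0 && k >= 0 && csrRowPtr[0] == 0);
    int nnz = csrRowPtr[m];

    /* count the entries in every column */
    memset(cscColPtr, 0, (size_t)(k + 1) * sizeof(int));
    for (int p = 0; p < nnz; ++p) cscColPtr[csrColInd[p] + 1]++;

    /* prefix sum: cscColPtr[j] becomes the start of column j */
    for (int j = 0; j < k; ++j) cscColPtr[j + 1] += cscColPtr[j];

    /* scatter row by row, so row indices inside each column stay sorted */
    for (int i = 0; i < m; ++i) {
        for (int p = csrRowPtr[i]; p < csrRowPtr[i + 1]; ++p) {
            int q = cscColPtr[csrColInd[p]]++;
            cscRowInd[q] = i;
            cscVal[q]    = csrVal[p];
        }
    }

    /* the scatter advanced every start to the next column's start; shift back */
    for (int j = k; j > 0; --j) cscColPtr[j] = cscColPtr[j - 1];
    cscColPtr[0] = 0;
}
```

<p class="p">For the <samp class="ph codeph">2×3</samp> matrix with rows <samp class="ph codeph">(1, 0, 2)</samp> and <samp class="ph codeph">(0, 3, 0)</samp>, the output arrays are <samp class="ph codeph">cscColPtr = {0, 1, 2, 3}</samp>, <samp class="ph codeph">cscRowInd = {0, 1, 0}</samp>, and <samp class="ph codeph">cscVal = {1, 3, 2}</samp>, which is the CSR representation of the <samp class="ph codeph">3×2</samp> transpose.</p>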
<p class="p">This function requires no extra storage for general matrices when operation <samp class="ph codeph">CUSPARSE_OPERATION_NON_TRANSPOSE</samp> is selected. It requires some extra storage for Hermitian/symmetric matrices, and for general matrices when an operation
other than <samp class="ph codeph">CUSPARSE_OPERATION_NON_TRANSPOSE</samp> is selected. It is executed asynchronously with respect to the host and may return control to the application on the host
before the result is ready.</p>
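<p class="p">The operation and the column-major layout of the dense matrices can be summarized by a sequential CPU reference. This is a sketch for checking results, not the GPU algorithm; <samp class="ph codeph">csrmm_reference()</samp> is a hypothetical name, and the sketch assumes real values, a general matrix, zero-based indices, and only the non-transpose and transpose operations.</p>

```c
#include <assert.h>

/* Reference for C := alpha * op(A) * B + beta * C.
   A is m-by-k in zero-based CSR; B and C are column-major.
   transpose == 0: op(A) = A,   B is k-by-n, C is m-by-n.
   transpose != 0: op(A) = A^T, B is m-by-n, C is k-by-n. */
void csrmm_reference(int transpose, int m, int n, int k, double alpha,
                     const double *csrValA, const int *csrRowPtrA,
                     const int *csrColIndA,
                     const double *B, int ldb,
                     double beta, double *C, int ldc)
{
    int rowsB = transpose ? m : k;
    int rowsC = transpose ? k : m;
    assert(ldb >= (rowsB > 1 ? rowsB : 1));
    assert(ldc >= (rowsC > 1 ? rowsC : 1));

    for (int j = 0; j < n; ++j) {
        /* scale column j of C; with beta == 0 the old contents are never read */
        for (int i = 0; i < rowsC; ++i) {
            C[i + j * ldc] = (beta == 0.0) ? 0.0 : beta * C[i + j * ldc];
        }
        /* accumulate the contribution of every stored element A(i, c) */
        for (int i = 0; i < m; ++i) {
            for (int p = csrRowPtrA[i]; p < csrRowPtrA[i + 1]; ++p) {
                int    c = csrColIndA[p];
                double a = alpha * csrValA[p];
                if (transpose) C[c + j * ldc] += a * B[i + j * ldb];
                else           C[i + j * ldc] += a * B[c + j * ldb];
            }
        }
    }
}
```

<p class="p">In the transpose branch several rows of <samp class="ph codeph">A</samp> update the same element of <samp class="ph codeph">C</samp>, which is where a parallel implementation would need atomic additions.</p>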
<div class="tablenoborder">
<table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Input</strong></span><tbody class="tbody">
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">handle</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">handle to the CUSPARSE library context.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">transA</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the operation
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi mathvariant="normal">op</mi><mo stretchy="false">(</mo><mi>A</mi><mo stretchy="false">)</mo></math>.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">m</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">number of rows of sparse matrix
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math>.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">n</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">number of columns of dense matrices
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>B</mi></math> and
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>C</mi></math>.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">k</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">number of columns of sparse matrix
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math>.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">nnz</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">number of non-zero elements of sparse matrix
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math>.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">alpha</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">&lt;type&gt; scalar used for multiplication.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">descrA</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the descriptor of matrix
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math>. The supported matrix types are <samp class="ph codeph">CUSPARSE_MATRIX_TYPE_GENERAL</samp>, <samp class="ph codeph">CUSPARSE_MATRIX_TYPE_SYMMETRIC</samp>, and <samp class="ph codeph">CUSPARSE_MATRIX_TYPE_HERMITIAN</samp>. Also, the supported index bases are <samp class="ph codeph">CUSPARSE_INDEX_BASE_ZERO</samp> and <samp class="ph codeph">CUSPARSE_INDEX_BASE_ONE</samp>.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">csrValA</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">&lt;type&gt; array of
<samp class="ph codeph">nnz</samp> <math xmlns="http://www.w3.org/1998/Math/MathML"><mo stretchy="false">(</mo><mo>=</mo></math> <samp class="ph codeph">csrRowPtrA(m)</samp> <math xmlns="http://www.w3.org/1998/Math/MathML"><mo>−</mo></math> <samp class="ph codeph">csrRowPtrA(0)</samp> <math xmlns="http://www.w3.org/1998/Math/MathML"><mo stretchy="false">)</mo></math>
non-zero elements of matrix
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math>.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">csrRowPtrA</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">integer array of <samp class="ph codeph">m</samp> <math xmlns="http://www.w3.org/1998/Math/MathML"><mo>+</mo><mn>1</mn></math>
elements that contains the start of every row and the end of the last row plus one.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">csrColIndA</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">integer array of
<samp class="ph codeph">nnz</samp> <math xmlns="http://www.w3.org/1998/Math/MathML"><mo stretchy="false">(</mo><mo>=</mo></math> <samp class="ph codeph">csrRowPtrA(m)</samp> <math xmlns="http://www.w3.org/1998/Math/MathML"><mo>−</mo></math> <samp class="ph codeph">csrRowPtrA(0)</samp> <math xmlns="http://www.w3.org/1998/Math/MathML"><mo stretchy="false">)</mo></math>
column indices of the non-zero elements of matrix
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math>.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">B</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">array of dimensions <samp class="ph codeph">(ldb, n)</samp>.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">ldb</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">leading dimension of <samp class="ph codeph">B</samp>. It must be at least
<math xmlns="http://www.w3.org/1998/Math/MathML"><mo movablelimits="true">max</mo><mtext>(1, k)</mtext></math> if
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi mathvariant="normal">op</mi><mo stretchy="false">(</mo><mi>A</mi><mo stretchy="false">)</mo><mo>=</mo><mi>A</mi></math> and at least
<math xmlns="http://www.w3.org/1998/Math/MathML"><mo movablelimits="true">max</mo><mtext>(1, m)</mtext></math> otherwise.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">beta</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">&lt;type&gt; scalar used for multiplication. If <samp class="ph codeph">beta</samp> is zero, <samp class="ph codeph">C</samp> does not have to be a valid input.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">C</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">array of dimensions <samp class="ph codeph">(ldc, n)</samp>.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">ldc</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">leading dimension of <samp class="ph codeph">C</samp>. It must be at least
<math xmlns="http://www.w3.org/1998/Math/MathML"><mo movablelimits="true">max</mo><mtext>(1, m)</mtext></math> if
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi mathvariant="normal">op</mi><mo stretchy="false">(</mo><mi>A</mi><mo stretchy="false">)</mo><mo>=</mo><mi>A</mi></math> and at least
<math xmlns="http://www.w3.org/1998/Math/MathML"><mo movablelimits="true">max</mo><mtext>(1, k)</mtext></math> otherwise.</td>
</tr>
</tbody></table>
</div>
<div class="tablenoborder">
<table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Output</strong></span><tbody class="tbody">
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">C</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">&lt;type&gt; updated array of dimensions <samp class="ph codeph">(ldc, n)</samp>.</td>
</tr>
</tbody></table>
</div>
<div class="tablenoborder">
<table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Status Returned</strong></span><tbody class="tbody">
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_SUCCESS</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the operation completed successfully.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_NOT_INITIALIZED</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the library was not initialized.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_ALLOC_FAILED</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the resources could not be allocated.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_INVALID_VALUE</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">invalid parameters were passed (<samp class="ph codeph">m,n,k,nnz&lt;0</samp> or <samp class="ph codeph">ldb</samp> and <samp class="ph codeph">ldc</samp> are incorrect).</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_ARCH_MISMATCH</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the device does not support double precision.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_EXECUTION_FAILED</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the function failed to launch on the GPU.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_INTERNAL_ERROR</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">an internal operation failed.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_MATRIX_TYPE_NOT_SUPPORTED</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the matrix type is not supported.</td>
</tr>
</tbody></table>
</div>
11327 <div class="topic concept nested1" id="cusparse-lt-t-gt-csrmm2"><a name="cusparse-lt-t-gt-csrmm2" shape="rect">
<!-- --></a><h3 class="title topictitle2"><a href="#cusparse-lt-t-gt-csrmm2" name="cusparse-lt-t-gt-csrmm2" shape="rect">8.2. cusparse&lt;t&gt;csrmm2</a></h3>
<div class="body conbody"><pre xml:space="preserve">cusparseStatus_t
cusparseScsrmm2(cusparseHandle_t handle,
                cusparseOperation_t transA,
                cusparseOperation_t transB,
                int m,
                int n,
                int k,
                int nnz,
                const float *alpha,
                const cusparseMatDescr_t descrA,
                const float *csrValA,
                const int *csrRowPtrA,
                const int *csrColIndA,
                const float *B,
                int ldb,
                const float *beta,
                float *C,
                int ldc)

cusparseStatus_t
cusparseDcsrmm2(cusparseHandle_t handle,
                cusparseOperation_t transA,
                cusparseOperation_t transB,
                int m,
                int n,
                int k,
                int nnz,
                const double *alpha,
                const cusparseMatDescr_t descrA,
                const double *csrValA,
                const int *csrRowPtrA,
                const int *csrColIndA,
                const double *B,
                int ldb,
                const double *beta,
                double *C,
                int ldc)

cusparseStatus_t
cusparseCcsrmm2(cusparseHandle_t handle,
                cusparseOperation_t transA,
                cusparseOperation_t transB,
                int m,
                int n,
                int k,
                int nnz,
                const cuComplex *alpha,
                const cusparseMatDescr_t descrA,
                const cuComplex *csrValA,
                const int *csrRowPtrA,
                const int *csrColIndA,
                const cuComplex *B,
                int ldb,
                const cuComplex *beta,
                cuComplex *C,
                int ldc)

cusparseStatus_t
cusparseZcsrmm2(cusparseHandle_t handle,
                cusparseOperation_t transA,
                cusparseOperation_t transB,
                int m,
                int n,
                int k,
                int nnz,
                const cuDoubleComplex *alpha,
                const cusparseMatDescr_t descrA,
                const cuDoubleComplex *csrValA,
                const int *csrRowPtrA,
                const int *csrColIndA,
                const cuDoubleComplex *B,
                int ldb,
                const cuDoubleComplex *beta,
                cuDoubleComplex *C,
                int ldc)</pre><p class="p">This function performs one of the following matrix-matrix operations:</p>
<div class="tablenoborder">
<table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="void" border="1" rules="all">
<tbody class="tbody">
<tr class="row">
<td class="entry" align="center" valign="top" rowspan="1" colspan="1">
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>C</mi><mo>=</mo><mi>α</mi><mo>∗</mo><mi mathvariant="normal">op</mi><mo stretchy="false">(</mo><mi>A</mi><mo stretchy="false">)</mo><mo>∗</mo><mi mathvariant="normal">op</mi><mo stretchy="false">(</mo><mi>B</mi><mo stretchy="false">)</mo><mo>+</mo><mi>β</mi><mo>∗</mo><mi>C</mi></math>
</td>
</tr>
</tbody>
</table>
</div>
<p class="p">where
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math> is an <samp class="ph codeph">m×k</samp> sparse matrix (that is defined in CSR storage format by the three arrays <samp class="ph codeph">csrValA</samp>, <samp class="ph codeph">csrRowPtrA</samp>, and <samp class="ph codeph">csrColIndA</samp>),
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>B</mi><mtext> and </mtext><mi>C</mi></math>
are dense matrices,
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>α</mi><mtext> and </mtext><mi>β</mi></math>
are scalars, and</p>
<p class="p">
<math xmlns="http://www.w3.org/1998/Math/MathML">
<mi mathvariant="normal">op</mi><mo stretchy="false">(</mo><mi>A</mi><mo stretchy="false">)</mo><mo>=</mo>
<mfenced open="{" close="">
<mtable columnalign="left left" rowspacing=".1em" columnspacing="1em">
<mtr><mtd><mi>A</mi></mtd><mtd><mtext>if transA == CUSPARSE_OPERATION_NON_TRANSPOSE</mtext></mtd></mtr>
<mtr><mtd><msup><mi>A</mi><mi>T</mi></msup></mtd><mtd><mtext>if transA == CUSPARSE_OPERATION_TRANSPOSE</mtext></mtd></mtr>
<mtr><mtd><msup><mi>A</mi><mi>H</mi></msup></mtd><mtd><mtext>if transA == CUSPARSE_OPERATION_CONJUGATE_TRANSPOSE</mtext></mtd></mtr>
</mtable>
</mfenced>
</math>
, 
<math xmlns="http://www.w3.org/1998/Math/MathML">
<mi mathvariant="normal">op</mi><mo stretchy="false">(</mo><mi>B</mi><mo stretchy="false">)</mo><mo>=</mo>
<mfenced open="{" close="">
<mtable columnalign="left left" rowspacing=".1em" columnspacing="1em">
<mtr><mtd><mi>B</mi></mtd><mtd><mtext>if transB == CUSPARSE_OPERATION_NON_TRANSPOSE</mtext></mtd></mtr>
<mtr><mtd><msup><mi>B</mi><mi>T</mi></msup></mtd><mtd><mtext>if transB == CUSPARSE_OPERATION_TRANSPOSE</mtext></mtd></mtr>
<mtr><mtd><msup><mi>B</mi><mi>H</mi></msup></mtd><mtd><mtext>not supported</mtext></mtd></mtr>
</mtable>
</mfenced>
</math>
</p>
<p class="p">If <samp class="ph codeph">op(B)=B</samp>, cusparse&lt;t&gt;csrmm2() is the same as cusparse&lt;t&gt;csrmm().
Otherwise, only <samp class="ph codeph">op(A)=A</samp> is supported and the matrix type must be <samp class="ph codeph">CUSPARSE_MATRIX_TYPE_GENERAL</samp>.</p>
<p class="p">The motivation for <samp class="ph codeph">transpose(B)</samp> is to improve the memory access pattern of matrix B.
The computational pattern of <samp class="ph codeph">A*transpose(B)</samp> with matrix B in column-major order is equivalent to
<samp class="ph codeph">A*B</samp> with matrix B in row-major order.</p>
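<p class="p">The layout equivalence can be checked with plain index arithmetic. In the sketch below, the helper names <samp class="ph codeph">colmajor_offset()</samp> and <samp class="ph codeph">rowmajor_offset()</samp> are hypothetical; they return the position of an element in a column-major and in a row-major array, respectively.</p>

```c
#include <assert.h>

/* Offset of element (row, col) in a column-major array with leading dimension ld. */
int colmajor_offset(int row, int col, int ld)
{
    assert(row >= 0 && col >= 0 && row < ld);
    return row + col * ld;
}

/* Offset of element (row, col) in a row-major array with leading dimension ld. */
int rowmajor_offset(int row, int col, int ld)
{
    assert(row >= 0 && col >= 0 && col < ld);
    return row * ld + col;
}
```

<p class="p">For a <samp class="ph codeph">k*n</samp> matrix B, element <samp class="ph codeph">(i, j)</samp> of B stored row-major with leading dimension <samp class="ph codeph">n</samp> sits at <samp class="ph codeph">rowmajor_offset(i, j, n)</samp>, which equals <samp class="ph codeph">colmajor_offset(j, i, n)</samp>, the position of element <samp class="ph codeph">(j, i)</samp> of <samp class="ph codeph">transpose(B)</samp> stored column-major. Consecutive elements of one row of B are therefore adjacent in memory.</p>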
<p class="p">In practice, no operation in an iterative solver or eigenvalue solver uses <samp class="ph codeph">A*transpose(B)</samp>.
However, we can perform <samp class="ph codeph">A*transpose(transpose(B))</samp>, which is the same as <samp class="ph codeph">A*B</samp>.
For example, suppose A is <samp class="ph codeph">m*k</samp>, B is <samp class="ph codeph">k*n</samp>, and C is <samp class="ph codeph">m*n</samp>; the following code shows the usage of cusparseDcsrmm().
11564 </p><pre xml:space="preserve">
<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-comment">// A is m*k, B is k*n and C is m*n</span>
<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">const</span> <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">int</span> ldb_B = k; <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-comment">// leading dimension of B</span>
<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">const</span> <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">int</span> ldc = m; <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-comment">// leading dimension of C</span>
<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-comment">// perform C:=alpha*A*B + beta*C</span>
cusparseSetMatType(descrA, CUSPARSE_MATRIX_TYPE_GENERAL);
cusparseDcsrmm(cusparse_handle,
               CUSPARSE_OPERATION_NON_TRANSPOSE,
               m, n, k, nnz, alpha,
               descrA, csrValA, csrRowPtrA, csrColIndA,
               B, ldb_B,
               beta, C, ldc);
</pre><p class="p">Instead of using <samp class="ph codeph">A*B</samp>, our proposal is to first transpose B to Bt by calling cublas&lt;t&gt;geam(), and then to perform <samp class="ph codeph">A*transpose(Bt)</samp>.
11577 </p><pre xml:space="preserve">
<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-comment">// step 1: Bt := transpose(B)</span>
<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">double</span> *Bt;
<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">const</span> <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">int</span> ldb_Bt = n; <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-comment">// leading dimension of Bt</span>
cudaMalloc((<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">void</span>**)&amp;Bt, <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">sizeof</span>(<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">double</span>)*ldb_Bt*k);
<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">double</span> one = 1.0;
<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">double</span> zero = 0.0;
cublasSetPointerMode(cublas_handle, CUBLAS_POINTER_MODE_HOST);
<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-comment">// Bt (n*k) := 1.0*transpose(B) + 0.0*transpose(B)</span>
cublasDgeam(cublas_handle, CUBLAS_OP_T, CUBLAS_OP_T,
            n, k, &amp;one, B, ldb_B, &amp;zero, B, ldb_B, Bt, ldb_Bt);

<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-comment">// step 2: perform C:=alpha*A*transpose(Bt) + beta*C</span>
cusparseDcsrmm2(cusparse_handle,
                CUSPARSE_OPERATION_NON_TRANSPOSE,
                CUSPARSE_OPERATION_TRANSPOSE,
                m, n, k, nnz, alpha,
                descrA, csrValA, csrRowPtrA, csrColIndA,
                Bt, ldb_Bt,
                beta, C, ldc);
</pre><p class="p">Remark 1: cublas&lt;t&gt;geam() and cusparse&lt;t&gt;csrmm2() are memory-bound.
The complexity of cublas&lt;t&gt;geam() is <samp class="ph codeph">2*n*k</samp>, and the minimum complexity of cusparse&lt;t&gt;csrmm2() is about <samp class="ph codeph">(nnz + nnz*n + 2*m*n)</samp>.
If the number of non-zeros per column <samp class="ph codeph">(=nnz/k)</samp> is large, the extra cost of the transposition is worth paying, because <samp class="ph codeph">A*transpose(B)</samp> may be 2x faster than <samp class="ph codeph">A*B</samp> when the sparsity pattern of A is unfavorable.</p>
<p class="p">Remark 2: <samp class="ph codeph">A*transpose(B)</samp> is only supported on compute capability 2.0 and above.</p>
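<p class="p">The two estimates in Remark 1 can be compared for a given problem size before deciding whether to transpose. The helpers below simply evaluate the formulas quoted above; the function names are hypothetical, and the counts are rough element-traffic estimates rather than measured timings.</p>

```c
#include <assert.h>

/* Estimated cost of transposing the k-by-n matrix B with geam: 2*n*k. */
long long geam_cost(long long n, long long k)
{
    assert(n >= 0 && k >= 0);
    return 2 * n * k;
}

/* Estimated minimum cost of csrmm2: nnz + nnz*n + 2*m*n. */
long long csrmm2_min_cost(long long m, long long n, long long nnz)
{
    assert(m >= 0 && n >= 0 && nnz >= 0);
    return nnz + nnz * n + 2 * m * n;
}
```

<p class="p">For example, with <samp class="ph codeph">m = k = 1000</samp>, <samp class="ph codeph">n = 8</samp>, and <samp class="ph codeph">nnz = 50000</samp> (50 non-zeros per column), the transposition estimate is 16000 against 466000 for the multiplication, so the transposition adds only a few percent.</p>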
<div class="tablenoborder">
<table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Input</strong></span><tbody class="tbody">
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">handle</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">handle to the CUSPARSE library context.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">transA</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the operation
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi mathvariant="normal">op</mi><mo stretchy="false">(</mo><mi>A</mi><mo stretchy="false">)</mo></math>.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">transB</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the operation
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi mathvariant="normal">op</mi><mo stretchy="false">(</mo><mi>B</mi><mo stretchy="false">)</mo></math>.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">m</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">number of rows of sparse matrix
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math>.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">n</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">number of columns of dense matrices
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi mathvariant="normal">op</mi><mo stretchy="false">(</mo><mi>B</mi><mo stretchy="false">)</mo></math> and
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>C</mi></math>.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">k</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">number of columns of sparse matrix
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math>.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">nnz</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">number of non-zero elements of sparse matrix
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math>.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">alpha</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">&lt;type&gt; scalar used for multiplication.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">descrA</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the descriptor of matrix
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math>. The supported matrix types are <samp class="ph codeph">CUSPARSE_MATRIX_TYPE_GENERAL</samp>, <samp class="ph codeph">CUSPARSE_MATRIX_TYPE_SYMMETRIC</samp>, and <samp class="ph codeph">CUSPARSE_MATRIX_TYPE_HERMITIAN</samp>. Also, the supported index bases are <samp class="ph codeph">CUSPARSE_INDEX_BASE_ZERO</samp> and <samp class="ph codeph">CUSPARSE_INDEX_BASE_ONE</samp>.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">csrValA</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">&lt;type&gt; array of
<samp class="ph codeph">nnz</samp> <math xmlns="http://www.w3.org/1998/Math/MathML"><mo stretchy="false">(</mo><mo>=</mo></math> <samp class="ph codeph">csrRowPtrA(m)</samp> <math xmlns="http://www.w3.org/1998/Math/MathML"><mo>−</mo></math> <samp class="ph codeph">csrRowPtrA(0)</samp> <math xmlns="http://www.w3.org/1998/Math/MathML"><mo stretchy="false">)</mo></math>
non-zero elements of matrix
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math>.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">csrRowPtrA</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">integer array of <samp class="ph codeph">m</samp> <math xmlns="http://www.w3.org/1998/Math/MathML"><mo>+</mo><mn>1</mn></math>
elements that contains the start of every row and the end of the last row plus one.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">csrColIndA</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">integer array of
<samp class="ph codeph">nnz</samp> <math xmlns="http://www.w3.org/1998/Math/MathML"><mo stretchy="false">(</mo><mo>=</mo></math> <samp class="ph codeph">csrRowPtrA(m)</samp> <math xmlns="http://www.w3.org/1998/Math/MathML"><mo>−</mo></math> <samp class="ph codeph">csrRowPtrA(0)</samp> <math xmlns="http://www.w3.org/1998/Math/MathML"><mo stretchy="false">)</mo></math>
column indices of the non-zero elements of matrix
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math>.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">B</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">array of dimensions <samp class="ph codeph">(ldb, n)</samp> if op(B)=B and <samp class="ph codeph">(ldb, k)</samp> otherwise.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">ldb</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">leading dimension of <samp class="ph codeph">B</samp>. If op(B)=B, it must be at least
<math xmlns="http://www.w3.org/1998/Math/MathML"><mo movablelimits="true">max</mo><mtext>(1, k)</mtext></math> if
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi mathvariant="normal">op</mi><mo stretchy="false">(</mo><mi>A</mi><mo stretchy="false">)</mo><mo>=</mo><mi>A</mi></math> and at least
<math xmlns="http://www.w3.org/1998/Math/MathML"><mo movablelimits="true">max</mo><mtext>(1, m)</mtext></math> otherwise.
If op(B) != B, it must be at least max(1, n).</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">beta</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">&lt;type&gt; scalar used for multiplication. If <samp class="ph codeph">beta</samp> is zero, <samp class="ph codeph">C</samp> does not have to be a valid input.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">C</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">array of dimensions <samp class="ph codeph">(ldc, n)</samp>.</td>
</tr>
11798 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">ldc</samp></td>
11799 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">leading dimension of <samp class="ph codeph">C</samp>. It must be at least
11802 <math xmlns="http://www.w3.org/1998/Math/MathML">
11803 <mo movablelimits="true">max</mo>
11804 <mrow class="MJX-TeXAtom-ORD">
11805 <mrow class="MJX-TeXAtom-ORD">
11806 <mtext>(1, k)</mtext>
11814 <math xmlns="http://www.w3.org/1998/Math/MathML">
11815 <mrow class="MJX-TeXAtom-ORD">
11816 <mrow class="MJX-TeXAtom-ORD">
11820 <mo stretchy="false">(</mo>
11822 <mo stretchy="false">)</mo>
11830 <math xmlns="http://www.w3.org/1998/Math/MathML">
11831 <mo movablelimits="true">max</mo>
11832 <mrow class="MJX-TeXAtom-ORD">
11833 <mrow class="MJX-TeXAtom-ORD">
11834 <mtext>(1, m)</mtext>
11845 <div class="tablenoborder">
11846 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Output</strong></span><tbody class="tbody">
11848 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">C</samp></td>
11849 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><type> updated array of dimensions <samp class="ph codeph">(ldc, n)</samp>.
11855 <div class="tablenoborder">
11856 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Status Returned</strong></span><tbody class="tbody">
11858 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_SUCCESS</samp></td>
11859 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the operation completed successfully.</td>
11862 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_NOT_INITIALIZED</samp></td>
11863 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the library was not initialized.</td>
11866 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_ALLOC_FAILED</samp></td>
11867 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the resources could not be allocated.</td>
11870 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_INVALID_VALUE</samp></td>
11871 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">invalid parameters were passed (<samp class="ph codeph">m,n,k,nnz<0</samp> or <samp class="ph codeph">ldb</samp> and <samp class="ph codeph">ldc</samp> are incorrect).
11875 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_ARCH_MISMATCH</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">if op(B)=B, the device does not support double precision,
or if op(B)=transpose(B), the device is below compute capability 2.0.
11881 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_EXECUTION_FAILED</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the function failed to launch on the GPU.
11885 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_INTERNAL_ERROR</samp></td>
11886 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">an internal operation failed.</td>
11889 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_MATRIX_TYPE_NOT_SUPPORTED</samp></td>
11890 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">
11892 <p class="p"></p><samp class="ph codeph">CUSPARSE_MATRIX_TYPE_TRIANGULAR</samp> is not supported if op(B)=B and only <samp class="ph codeph">CUSPARSE_MATRIX_TYPE_GENERAL</samp> is supported otherwise.
11900 <div class="topic concept nested1" id="cusparse-lt-t-gt-csrsmanalysis"><a name="cusparse-lt-t-gt-csrsmanalysis" shape="rect">
11901 <!-- --></a><h3 class="title topictitle2"><a href="#cusparse-lt-t-gt-csrsmanalysis" name="cusparse-lt-t-gt-csrsmanalysis" shape="rect">8.3. cusparse<t>csrsm_analysis</a></h3>
<div class="body conbody"><pre xml:space="preserve">cusparseStatus_t
cusparseScsrsm_analysis(cusparseHandle_t handle,
                        cusparseOperation_t transA,
                        int m, int nnz,
                        const cusparseMatDescr_t descrA,
                        const float *csrValA,
                        const int *csrRowPtrA, const int *csrColIndA,
                        cusparseSolveAnalysisInfo_t info)
cusparseStatus_t
cusparseDcsrsm_analysis(cusparseHandle_t handle,
                        cusparseOperation_t transA,
                        int m, int nnz,
                        const cusparseMatDescr_t descrA,
                        const double *csrValA,
                        const int *csrRowPtrA, const int *csrColIndA,
                        cusparseSolveAnalysisInfo_t info)
cusparseStatus_t
cusparseCcsrsm_analysis(cusparseHandle_t handle,
                        cusparseOperation_t transA,
                        int m, int nnz,
                        const cusparseMatDescr_t descrA,
                        const cuComplex *csrValA,
                        const int *csrRowPtrA, const int *csrColIndA,
                        cusparseSolveAnalysisInfo_t info)
cusparseStatus_t
cusparseZcsrsm_analysis(cusparseHandle_t handle,
                        cusparseOperation_t transA,
                        int m, int nnz,
                        const cusparseMatDescr_t descrA,
                        const cuDoubleComplex *csrValA,
                        const int *csrRowPtrA, const int *csrColIndA,
                        cusparseSolveAnalysisInfo_t info)</pre><p class="p">This function performs the analysis phase of the solution of a sparse triangular linear system</p>
11934 <div class="tablenoborder">
11935 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="void" border="1" rules="all">
11936 <tbody class="tbody">
11938 <td class="entry" align="center" valign="top" rowspan="1" colspan="1">
11939 <math xmlns="http://www.w3.org/1998/Math/MathML">
11940 <mrow class="MJX-TeXAtom-ORD">
11941 <mrow class="MJX-TeXAtom-ORD">
11945 <mo stretchy="false">(</mo>
11947 <mo stretchy="false">)</mo>
11949 <mrow class="MJX-TeXAtom-ORD">
11950 <mrow class="MJX-TeXAtom-ORD">
11957 <mrow class="MJX-TeXAtom-ORD">
11958 <mrow class="MJX-TeXAtom-ORD">
11968 <p class="p">with multiple right-hand-sides, where
11969 <math xmlns="http://www.w3.org/1998/Math/MathML">
</math> is an <samp class="ph codeph">m×m</samp> sparse matrix (that is defined in CSR storage format by the three arrays <samp class="ph codeph">csrValA</samp>, <samp class="ph codeph">csrRowPtrA</samp>, and <samp class="ph codeph">csrColIndA</samp>),
11974 <math xmlns="http://www.w3.org/1998/Math/MathML">
11976 <mtext> and </mtext>
11980 are the right-hand-side and the solution dense matrices,
11981 <math xmlns="http://www.w3.org/1998/Math/MathML">
11983 </math> is a scalar, and
11985 <math xmlns="http://www.w3.org/1998/Math/MathML">
11986 <mrow class="MJX-TeXAtom-ORD">
11987 <mrow class="MJX-TeXAtom-ORD">
11991 <mo stretchy="false">(</mo>
11993 <mo stretchy="false">)</mo>
11995 <mfenced open="{" close="">
11996 <mtable columnalign="left left" rowspacing=".1em" columnspacing="1em">
12002 <mtext>if trans == CUSPARSE_OPERATION_NON_TRANSPOSE</mtext>
12013 <mtext>if trans == CUSPARSE_OPERATION_TRANSPOSE</mtext>
12024 <mtext>if trans == CUSPARSE_OPERATION_CONJUGATE_TRANSPOSE</mtext>
12030 <p class="p">It is expected that this function will be executed only once for a given matrix and a particular operation type.</p>
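<p class="p">The source does not say what the analysis phase computes. As an illustration of why such a phase is worth running once and reusing, the sketch below shows level scheduling, a widely used technique for sparse triangular solves. It is an assumption for illustration only, not a description of CUSPARSE internals. The helper name <samp class="ph codeph">csr_lower_levels</samp> and the <samp class="ph codeph">level</samp> array are hypothetical, and the matrix is assumed to be lower triangular.</p>

```c
#include <assert.h>

/* Hypothetical helper, not a CUSPARSE function.
 * For a lower triangular CSR matrix, row i depends on row j whenever row i
 * stores an entry in column j < i. level[i] is the length of the longest
 * dependency chain ending at row i; rows that share a level are independent
 * of each other and could be solved in parallel.
 * base is the index base of the CSR arrays (0 or 1).
 * Returns the number of distinct levels. */
int csr_lower_levels(int m, int base,
                     const int *csrRowPtrA, const int *csrColIndA,
                     int *level)
{
    int numLevels = 0;
    assert(m >= 0 && (base == 0 || base == 1));
    for (int i = 0; i < m; ++i) {
        int lvl = 0;
        for (int p = csrRowPtrA[i] - base; p < csrRowPtrA[i + 1] - base; ++p) {
            int j = csrColIndA[p] - base;
            /* Only entries strictly below the diagonal create a dependency. */
            if (j < i && level[j] + 1 > lvl) {
                lvl = level[j] + 1;
            }
        }
        level[i] = lvl;
        if (lvl + 1 > numLevels) {
            numLevels = lvl + 1;
        }
    }
    return numLevels;
}
```

<p class="p">The result depends only on the sparsity pattern and the operation type, so it can be computed once and reused by every later solve with the same matrix.</p>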
<p class="p">This function requires a significant amount of extra storage that is proportional to the matrix size. It is executed asynchronously
with respect to the host and it may return control to the application on the host before the result is ready.
12034 <div class="tablenoborder">
12035 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Input</strong></span><tbody class="tbody">
12037 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">handle</samp></td>
12038 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">handle to the CUSPARSE library context.</td>
12041 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">transA</samp></td>
12042 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the operation
12045 <math xmlns="http://www.w3.org/1998/Math/MathML">
12046 <mrow class="MJX-TeXAtom-ORD">
12047 <mrow class="MJX-TeXAtom-ORD">
12051 <mo stretchy="false">(</mo>
12053 <mo stretchy="false">)</mo>
12058 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">m</samp></td>
12059 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">number of rows of matrix
12060 <math xmlns="http://www.w3.org/1998/Math/MathML">
12066 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">nnz</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">number of non-zero elements of matrix
12068 <math xmlns="http://www.w3.org/1998/Math/MathML">
12074 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">descrA</samp></td>
12075 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the descriptor of matrix
12076 <math xmlns="http://www.w3.org/1998/Math/MathML">
12078 </math>. The supported matrix types are <samp class="ph codeph">CUSPARSE_MATRIX_TYPE_TRIANGULAR</samp> and <samp class="ph codeph">CUSPARSE_MATRIX_TYPE_GENERAL</samp>, while the supported diagonal types are <samp class="ph codeph">CUSPARSE_DIAG_TYPE_UNIT</samp> and <samp class="ph codeph">CUSPARSE_DIAG_TYPE_NON_UNIT</samp>.
12082 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">csrValA</samp></td>
12083 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><type> array of
12084 <samp class="ph codeph">nnz</samp><math xmlns="http://www.w3.org/1998/Math/MathML">
12085 <mo stretchy="false">(</mo>
12087 </math><samp class="ph codeph">csrRowPtrA(m)</samp><math xmlns="http://www.w3.org/1998/Math/MathML">
12089 </math><samp class="ph codeph">csrRowPtrA(0)</samp><math xmlns="http://www.w3.org/1998/Math/MathML">
12090 <mo stretchy="false">)</mo>
12092 non-zero elements of matrix
12093 <math xmlns="http://www.w3.org/1998/Math/MathML">
12099 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">csrRowPtrA</samp></td>
12100 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">integer array of <samp class="ph codeph">m</samp><math xmlns="http://www.w3.org/1998/Math/MathML">
12105 elements that contains the start of every row and the end of the last row plus one.
12109 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">csrColIndA</samp></td>
12110 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">integer array of
12111 <samp class="ph codeph">nnz</samp><math xmlns="http://www.w3.org/1998/Math/MathML">
12112 <mo stretchy="false">(</mo>
12114 </math><samp class="ph codeph">csrRowPtrA(m)</samp><math xmlns="http://www.w3.org/1998/Math/MathML">
12116 </math><samp class="ph codeph">csrRowPtrA(0)</samp><math xmlns="http://www.w3.org/1998/Math/MathML">
12117 <mo stretchy="false">)</mo>
12119 column indices of the non-zero elements of matrix
12120 <math xmlns="http://www.w3.org/1998/Math/MathML">
12126 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">info </samp></td>
12127 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">structure initialized using <samp class="ph codeph">cusparseCreateSolveAnalysisInfo</samp>.
12133 <div class="tablenoborder">
12134 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Output</strong></span><tbody class="tbody">
12136 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">info</samp></td>
12137 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">structure filled with information collected during the analysis phase (that should be passed to the solve phase unchanged).</td>
12142 <div class="tablenoborder">
12143 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Status Returned</strong></span><tbody class="tbody">
12145 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_SUCCESS</samp></td>
12146 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the operation completed successfully.</td>
12149 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_NOT_INITIALIZED</samp></td>
12150 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the library was not initialized.</td>
12153 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_ALLOC_FAILED</samp></td>
12154 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the resources could not be allocated.</td>
12157 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_INVALID_VALUE</samp></td>
12158 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">invalid parameters were passed (<samp class="ph codeph">m,nnz<0</samp>).
12162 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_ARCH_MISMATCH</samp></td>
12163 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the device does not support double precision.</td>
12166 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_EXECUTION_FAILED</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the function failed to launch on the GPU.</td>
12170 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_INTERNAL_ERROR</samp></td>
12171 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">an internal operation failed.</td>
12174 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_MATRIX_TYPE_NOT_SUPPORTED</samp></td>
12175 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">
12178 the matrix type is not supported.
12186 <div class="topic concept nested1" id="cusparse-lt-t-gt-csrsmsolve"><a name="cusparse-lt-t-gt-csrsmsolve" shape="rect">
12187 <!-- --></a><h3 class="title topictitle2"><a href="#cusparse-lt-t-gt-csrsmsolve" name="cusparse-lt-t-gt-csrsmsolve" shape="rect">8.4. cusparse<t>csrsm_solve</a></h3>
<div class="body conbody"><pre xml:space="preserve">cusparseStatus_t
cusparseScsrsm_solve(cusparseHandle_t handle,
                     cusparseOperation_t transA,
                     int m, int n, const float *alpha,
                     const cusparseMatDescr_t descrA,
                     const float *csrValA,
                     const int *csrRowPtrA, const int *csrColIndA,
                     cusparseSolveAnalysisInfo_t info,
                     const float *X, int ldx,
                     float *Y, int ldy)
cusparseStatus_t
cusparseDcsrsm_solve(cusparseHandle_t handle,
                     cusparseOperation_t transA,
                     int m, int n, const double *alpha,
                     const cusparseMatDescr_t descrA,
                     const double *csrValA,
                     const int *csrRowPtrA, const int *csrColIndA,
                     cusparseSolveAnalysisInfo_t info,
                     const double *X, int ldx,
                     double *Y, int ldy)
cusparseStatus_t
cusparseCcsrsm_solve(cusparseHandle_t handle,
                     cusparseOperation_t transA,
                     int m, int n, const cuComplex *alpha,
                     const cusparseMatDescr_t descrA,
                     const cuComplex *csrValA,
                     const int *csrRowPtrA, const int *csrColIndA,
                     cusparseSolveAnalysisInfo_t info,
                     const cuComplex *X, int ldx,
                     cuComplex *Y, int ldy)
cusparseStatus_t
cusparseZcsrsm_solve(cusparseHandle_t handle,
                     cusparseOperation_t transA,
                     int m, int n, const cuDoubleComplex *alpha,
                     const cusparseMatDescr_t descrA,
                     const cuDoubleComplex *csrValA,
                     const int *csrRowPtrA, const int *csrColIndA,
                     cusparseSolveAnalysisInfo_t info,
                     const cuDoubleComplex *X, int ldx,
                     cuDoubleComplex *Y, int ldy)</pre><p class="p">This function performs the solve phase of the solution of a sparse triangular linear system</p>
12228 <div class="tablenoborder">
12229 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="void" border="1" rules="all">
12230 <tbody class="tbody">
12232 <td class="entry" align="center" valign="top" rowspan="1" colspan="1">
12233 <math xmlns="http://www.w3.org/1998/Math/MathML">
12234 <mrow class="MJX-TeXAtom-ORD">
12235 <mrow class="MJX-TeXAtom-ORD">
12239 <mo stretchy="false">(</mo>
12241 <mo stretchy="false">)</mo>
12243 <mrow class="MJX-TeXAtom-ORD">
12244 <mrow class="MJX-TeXAtom-ORD">
12251 <mrow class="MJX-TeXAtom-ORD">
12252 <mrow class="MJX-TeXAtom-ORD">
12262 <p class="p">with multiple right-hand-sides, where
12263 <math xmlns="http://www.w3.org/1998/Math/MathML">
</math> is an <samp class="ph codeph">m×m</samp> sparse matrix (that is defined in CSR storage format by the three arrays <samp class="ph codeph">csrValA</samp>, <samp class="ph codeph">csrRowPtrA</samp>, and <samp class="ph codeph">csrColIndA</samp>),
12268 <math xmlns="http://www.w3.org/1998/Math/MathML">
12270 <mtext> and </mtext>
12274 are the right-hand-side and the solution dense matrices,
12275 <math xmlns="http://www.w3.org/1998/Math/MathML">
12277 </math> is a scalar, and
12279 <math xmlns="http://www.w3.org/1998/Math/MathML">
12280 <mrow class="MJX-TeXAtom-ORD">
12281 <mrow class="MJX-TeXAtom-ORD">
12285 <mo stretchy="false">(</mo>
12287 <mo stretchy="false">)</mo>
12289 <mfenced open="{" close="">
12290 <mtable columnalign="left left" rowspacing=".1em" columnspacing="1em">
12296 <mtext>if trans == CUSPARSE_OPERATION_NON_TRANSPOSE</mtext>
12307 <mtext>if trans == CUSPARSE_OPERATION_TRANSPOSE</mtext>
12318 <mtext>if trans == CUSPARSE_OPERATION_CONJUGATE_TRANSPOSE</mtext>
12324 <p class="p">This function may be executed multiple times for a given matrix and a particular operation type.</p>
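<p class="p">For checking results on small inputs, the following CPU sketch solves the same system for one restricted case. It assumes a lower triangular matrix with a stored non-zero diagonal, zero-based indices, op(A) = A, and single precision. The function name <samp class="ph codeph">csrsm_lower_reference</samp> is hypothetical and is not part of CUSPARSE. The dense matrices are column-major, so entry (i, c) of <samp class="ph codeph">X</samp> is stored at <samp class="ph codeph">X[i + c * ldx]</samp>.</p>

```c
#include <assert.h>

/* Hypothetical CPU reference, not a CUSPARSE function.
 * Solves A * Y = alpha * X by forward substitution, one right-hand side
 * (column of X) at a time. A is m-by-m lower triangular in CSR format with
 * zero-based indices and a stored non-zero diagonal. X and Y hold n columns
 * in column-major order with leading dimensions ldx and ldy. */
void csrsm_lower_reference(int m, int n, float alpha,
                           const float *csrValA, const int *csrRowPtrA,
                           const int *csrColIndA,
                           const float *X, int ldx, float *Y, int ldy)
{
    int minLd = (m > 1) ? m : 1;  /* max(1, m), as required for ldx and ldy */
    assert(m >= 0 && n >= 0 && ldx >= minLd && ldy >= minLd);
    for (int c = 0; c < n; ++c) {
        for (int i = 0; i < m; ++i) {
            float sum = alpha * X[i + c * ldx];
            float diag = 0.0f;
            for (int p = csrRowPtrA[i]; p < csrRowPtrA[i + 1]; ++p) {
                int j = csrColIndA[p];
                if (j < i) {
                    sum -= csrValA[p] * Y[j + c * ldy];  /* already solved */
                } else if (j == i) {
                    diag = csrValA[p];
                }
                /* Entries above the diagonal are ignored in this sketch. */
            }
            Y[i + c * ldy] = sum / diag;
        }
    }
}
```

<p class="p">Because the matrix arrays are only read, the same arrays can be passed to repeated calls with different <samp class="ph codeph">X</samp> and <samp class="ph codeph">Y</samp>, which mirrors the reuse described above.</p>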
12325 <p class="p">This function requires no extra storage. It is executed asynchronously with respect to the host and it may return control
12326 to the application on the host before the result is ready.
12328 <div class="tablenoborder">
12329 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Input</strong></span><tbody class="tbody">
12331 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">handle</samp></td>
12332 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">handle to the CUSPARSE library context.</td>
12335 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">transA</samp></td>
12336 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the operation
12339 <math xmlns="http://www.w3.org/1998/Math/MathML">
12340 <mrow class="MJX-TeXAtom-ORD">
12341 <mrow class="MJX-TeXAtom-ORD">
12345 <mo stretchy="false">(</mo>
12347 <mo stretchy="false">)</mo>
12352 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">m</samp></td>
12353 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">number of rows and columns of matrix
12354 <math xmlns="http://www.w3.org/1998/Math/MathML">
12360 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">n</samp></td>
12361 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">number of columns of matrix
12362 <math xmlns="http://www.w3.org/1998/Math/MathML">
12365 <math xmlns="http://www.w3.org/1998/Math/MathML">
12371 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">alpha</samp></td>
12372 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><type> scalar used for multiplication.</td>
12375 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">descrA</samp></td>
12376 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the descriptor of matrix
12377 <math xmlns="http://www.w3.org/1998/Math/MathML">
12379 </math>. The supported matrix types are <samp class="ph codeph">CUSPARSE_MATRIX_TYPE_TRIANGULAR</samp> and <samp class="ph codeph">CUSPARSE_MATRIX_TYPE_GENERAL</samp>, while the supported diagonal types are <samp class="ph codeph">CUSPARSE_DIAG_TYPE_UNIT</samp> and <samp class="ph codeph">CUSPARSE_DIAG_TYPE_NON_UNIT</samp>.
12383 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">csrValA</samp></td>
12384 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><type> array of
12385 <samp class="ph codeph">nnz</samp><math xmlns="http://www.w3.org/1998/Math/MathML">
12386 <mo stretchy="false">(</mo>
12388 </math><samp class="ph codeph">csrRowPtrA(m)</samp><math xmlns="http://www.w3.org/1998/Math/MathML">
12390 </math><samp class="ph codeph">csrRowPtrA(0)</samp><math xmlns="http://www.w3.org/1998/Math/MathML">
12391 <mo stretchy="false">)</mo>
12393 non-zero elements of matrix
12394 <math xmlns="http://www.w3.org/1998/Math/MathML">
12400 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">csrRowPtrA</samp></td>
12401 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">integer array of <samp class="ph codeph">m</samp><math xmlns="http://www.w3.org/1998/Math/MathML">
12406 elements that contains the start of every row and the end of the last row plus one.
12410 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">csrColIndA</samp></td>
12411 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">integer array of
12412 <samp class="ph codeph">nnz</samp><math xmlns="http://www.w3.org/1998/Math/MathML">
12413 <mo stretchy="false">(</mo>
12415 </math><samp class="ph codeph">csrRowPtrA(m)</samp><math xmlns="http://www.w3.org/1998/Math/MathML">
12417 </math><samp class="ph codeph">csrRowPtrA(0)</samp><math xmlns="http://www.w3.org/1998/Math/MathML">
12418 <mo stretchy="false">)</mo>
12420 column indices of the non-zero elements of matrix
12421 <math xmlns="http://www.w3.org/1998/Math/MathML">
12427 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">info </samp></td>
12428 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">structure with information collected during the analysis phase (that should be passed to the solve phase unchanged).</td>
12431 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">X</samp></td>
12432 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><type> right-hand-side array of dimensions <samp class="ph codeph">(ldx, n)</samp>.
12436 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">ldx</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">leading dimension of <samp class="ph codeph">X</samp> (that is ≥
12440 <math xmlns="http://www.w3.org/1998/Math/MathML">
12441 <mo movablelimits="true">max</mo>
12442 <mrow class="MJX-TeXAtom-ORD">
12443 <mrow class="MJX-TeXAtom-ORD">
12444 <mtext>(1, m)</mtext>
12455 <div class="tablenoborder">
12456 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Output</strong></span><tbody class="tbody">
12458 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">Y</samp></td>
12459 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><type> solution array of dimensions <samp class="ph codeph">(ldy, n)</samp>.
12463 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">ldy</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">leading dimension of <samp class="ph codeph">Y</samp> (that is ≥
12467 <math xmlns="http://www.w3.org/1998/Math/MathML">
12468 <mo movablelimits="true">max</mo>
12469 <mrow class="MJX-TeXAtom-ORD">
12470 <mrow class="MJX-TeXAtom-ORD">
12471 <mtext>(1, m)</mtext>
12482 <div class="tablenoborder">
12483 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Status Returned</strong></span><tbody class="tbody">
12485 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_SUCCESS</samp></td>
12486 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the operation completed successfully.</td>
12489 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_NOT_INITIALIZED</samp></td>
12490 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the library was not initialized.</td>
12493 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_INVALID_VALUE</samp></td>
12494 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">invalid parameters were passed (<samp class="ph codeph">m<0</samp>).
12498 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_ARCH_MISMATCH</samp></td>
12499 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the device does not support double precision.</td>
12502 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_MAPPING_ERROR</samp></td>
12503 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the texture binding failed.</td>
12506 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_EXECUTION_FAILED</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the function failed to launch on the GPU.</td>
12510 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_INTERNAL_ERROR</samp></td>
12511 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">an internal operation failed.</td>
12514 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_MATRIX_TYPE_NOT_SUPPORTED</samp></td>
12515 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">
12518 the matrix type is not supported.
12527 <div class="topic concept nested0" id="cusparse-extra-function-reference"><a name="cusparse-extra-function-reference" shape="rect">
12528 <!-- --></a><h2 class="title topictitle1"><a href="#cusparse-extra-function-reference" name="cusparse-extra-function-reference" shape="rect">9. CUSPARSE Extra Function Reference</a></h2>
12529 <div class="body conbody">
12530 <p class="p">This chapter describes the extra routines used to manipulate sparse matrices.</p>
12532 <div class="topic concept nested1" id="cusparse-lt-t-gt-csrgeam"><a name="cusparse-lt-t-gt-csrgeam" shape="rect">
12533 <!-- --></a><h3 class="title topictitle2"><a href="#cusparse-lt-t-gt-csrgeam" name="cusparse-lt-t-gt-csrgeam" shape="rect">9.1. cusparse<t>csrgeam</a></h3>
<div class="body conbody"><pre xml:space="preserve">cusparseStatus_t
cusparseXcsrgeamNnz(cusparseHandle_t handle, int m, int n,
                    const cusparseMatDescr_t descrA, int nnzA,
                    const int *csrRowPtrA, const int *csrColIndA,
                    const cusparseMatDescr_t descrB, int nnzB,
                    const int *csrRowPtrB, const int *csrColIndB,
                    const cusparseMatDescr_t descrC, int *csrRowPtrC,
                    int *nnzTotalDevHostPtr)
cusparseStatus_t
cusparseScsrgeam(cusparseHandle_t handle, int m, int n,
                 const float *alpha,
                 const cusparseMatDescr_t descrA, int nnzA,
                 const float *csrValA, const int *csrRowPtrA, const int *csrColIndA,
                 const float *beta,
                 const cusparseMatDescr_t descrB, int nnzB,
                 const float *csrValB, const int *csrRowPtrB, const int *csrColIndB,
                 const cusparseMatDescr_t descrC,
                 float *csrValC, int *csrRowPtrC, int *csrColIndC)
cusparseStatus_t
cusparseDcsrgeam(cusparseHandle_t handle, int m, int n,
                 const double *alpha,
                 const cusparseMatDescr_t descrA, int nnzA,
                 const double *csrValA, const int *csrRowPtrA, const int *csrColIndA,
                 const double *beta,
                 const cusparseMatDescr_t descrB, int nnzB,
                 const double *csrValB, const int *csrRowPtrB, const int *csrColIndB,
                 const cusparseMatDescr_t descrC,
                 double *csrValC, int *csrRowPtrC, int *csrColIndC)
cusparseStatus_t
cusparseCcsrgeam(cusparseHandle_t handle, int m, int n,
                 const cuComplex *alpha,
                 const cusparseMatDescr_t descrA, int nnzA,
                 const cuComplex *csrValA, const int *csrRowPtrA, const int *csrColIndA,
                 const cuComplex *beta,
                 const cusparseMatDescr_t descrB, int nnzB,
                 const cuComplex *csrValB, const int *csrRowPtrB, const int *csrColIndB,
                 const cusparseMatDescr_t descrC,
                 cuComplex *csrValC, int *csrRowPtrC, int *csrColIndC)
cusparseStatus_t
cusparseZcsrgeam(cusparseHandle_t handle, int m, int n,
                 const cuDoubleComplex *alpha,
                 const cusparseMatDescr_t descrA, int nnzA,
                 const cuDoubleComplex *csrValA, const int *csrRowPtrA,
                 const int *csrColIndA,
                 const cuDoubleComplex *beta,
                 const cusparseMatDescr_t descrB, int nnzB,
                 const cuDoubleComplex *csrValB, const int *csrRowPtrB,
                 const int *csrColIndB,
                 const cusparseMatDescr_t descrC,
                 cuDoubleComplex *csrValC, int *csrRowPtrC, int *csrColIndC)</pre><p class="p">This function performs the following matrix-matrix operation</p>
<div class="tablenoborder">
<table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="void" border="1" rules="all">
<tbody class="tbody">
<tr class="row">
<td class="entry" align="center" valign="top" rowspan="1" colspan="1">
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>C</mi><mo>=</mo><mi>α</mi><mo>∗</mo><mi>A</mi><mo>+</mo><mi>β</mi><mo>∗</mo><mi>B</mi></math>
</td>
</tr>
</tbody>
</table>
</div>
<p class="p">where <samp class="ph codeph">A</samp>, <samp class="ph codeph">B</samp>, and <samp class="ph codeph">C</samp> are <samp class="ph codeph">m×n</samp> sparse matrices (defined in CSR storage format by the three arrays <samp class="ph codeph">csrValA|csrValB|csrValC</samp>, <samp class="ph codeph">csrRowPtrA|csrRowPtrB|csrRowPtrC</samp>, and <samp class="ph codeph">csrColIndA|csrColIndB|csrColIndC</samp>, respectively), and
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>α</mi><mtext> and </mtext><mi>β</mi></math> are scalars. Since <samp class="ph codeph">A</samp> and <samp class="ph codeph">B</samp> have different sparsity patterns, CUSPARSE adopts a two-step approach to complete the sparse matrix <samp class="ph codeph">C</samp>. In the first step, the user
allocates <samp class="ph codeph">csrRowPtrC</samp> of <samp class="ph codeph">m+1</samp> elements and uses the function cusparseXcsrgeamNnz to determine <samp class="ph codeph">csrRowPtrC</samp> and the total number of non-zero elements. In the second step, the user gathers nnzC (the number of non-zero elements of matrix C)
from either <samp class="ph codeph">(nnzC=*nnzTotalDevHostPtr)</samp> or <samp class="ph codeph">(nnzC=csrRowPtrC(m)-csrRowPtrC(0))</samp>,
allocates <samp class="ph codeph">csrValC</samp> and <samp class="ph codeph">csrColIndC</samp> of nnzC elements each, and finally calls the function cusparse[S|D|C|Z]csrgeam to complete matrix C.
</p>
<p class="p">The general procedure is as follows:</p><pre xml:space="preserve"><span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">int</span> baseC, nnzC;
<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-comment">// nnzTotalDevHostPtr points to host memory</span>
<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">int</span> *nnzTotalDevHostPtr = &nnzC;
cusparseSetPointerMode(handle, CUSPARSE_POINTER_MODE_HOST);
<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-comment">// step 1: determine csrRowPtrC and the total number of non-zero elements</span>
cudaMalloc((<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">void</span>**)&csrRowPtrC, <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">sizeof</span>(<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">int</span>)*(m+1));
cusparseXcsrgeamNnz(handle, m, n,
        descrA, nnzA, csrRowPtrA, csrColIndA,
        descrB, nnzB, csrRowPtrB, csrColIndB,
        descrC, csrRowPtrC, nnzTotalDevHostPtr);
<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">if</span> (NULL != nnzTotalDevHostPtr){
    nnzC = *nnzTotalDevHostPtr;
}<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">else</span>{
    cudaMemcpy(&nnzC , csrRowPtrC+m, <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">sizeof</span>(<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">int</span>), cudaMemcpyDeviceToHost);
    cudaMemcpy(&baseC, csrRowPtrC , <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">sizeof</span>(<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">int</span>), cudaMemcpyDeviceToHost);
    nnzC -= baseC;
}
<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-comment">// step 2: allocate csrColIndC and csrValC, then complete matrix C</span>
cudaMalloc((<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">void</span>**)&csrColIndC, <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">sizeof</span>(<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">int</span>)*nnzC);
cudaMalloc((<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">void</span>**)&csrValC , <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">sizeof</span>(<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">float</span>)*nnzC);
cusparseScsrgeam(handle, m, n,
        alpha,
        descrA, nnzA,
        csrValA, csrRowPtrA, csrColIndA,
        beta,
        descrB, nnzB,
        csrValB, csrRowPtrB, csrColIndB,
        descrC,
        csrValC, csrRowPtrC, csrColIndC);</pre><p class="p">Several comments on csrgeam:</p>
<p class="p">1. CUSPARSE does not support the other three combinations, NT, TN, and TT. To perform any of these three, the user should
use the routine csr2csc to convert <samp class="ph codeph">A</samp> or <samp class="ph codeph">B</samp> to
<math xmlns="http://www.w3.org/1998/Math/MathML"><msup><mi>A</mi><mi>T</mi></msup></math> or
<math xmlns="http://www.w3.org/1998/Math/MathML"><msup><mi>B</mi><mi>T</mi></msup></math> first.
</p>
<p class="p">2. Only <samp class="ph codeph">CUSPARSE_MATRIX_TYPE_GENERAL</samp> is supported. If either
<samp class="ph codeph">A</samp> or <samp class="ph codeph">B</samp> is symmetric or Hermitian, the user must extend the matrix to a full one and set the MatrixType field of the descriptor
to <samp class="ph codeph">CUSPARSE_MATRIX_TYPE_GENERAL</samp>.
</p>
<p class="p">3. If the sparsity pattern of matrix C is known, the user can skip the call to the function cusparseXcsrgeamNnz. For example,
suppose the user has an iterative algorithm that updates
<samp class="ph codeph">A</samp> and <samp class="ph codeph">B</samp> at each iteration but keeps their sparsity patterns. The user can call cusparseXcsrgeamNnz once to set up the sparsity pattern of
C, then call only cusparse[S|D|C|Z]csrgeam in each iteration.
</p>
<p class="p">4. The pointers alpha and beta must be valid.</p>
<p class="p">5. CUSPARSE does not treat the case where alpha or beta is zero specially. The sparsity pattern of C is independent of the values
of alpha and beta. If the user wants
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>C</mi><mo>=</mo><mn>0</mn><mo>×</mo><mi>A</mi><mo>+</mo><mn>1</mn><mo>×</mo><msup><mi>B</mi><mi>T</mi></msup></math>
, then csr2csc is better than csrgeam.
</p>
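<p class="p">The two-step approach can be illustrated with a host-only reference in plain C. This is a minimal sketch, not the library's implementation: the names <samp class="ph codeph">ref_csrgeam_nnz</samp> and <samp class="ph codeph">ref_scsrgeam</samp> are hypothetical, and the code assumes zero-based indices and column indices sorted within each row. It takes the pattern of C to be the union of the patterns of A and B, so the pattern does not change when alpha or beta is zero.</p>

```c
#include <assert.h>

/* Step 1 (host analogue of cusparseXcsrgeamNnz; hypothetical helper):
 * fill rowPtrC (m+1 entries) and return nnzC.
 * Assumes zero-based indices and sorted column indices in every row. */
int ref_csrgeam_nnz(int m,
                    const int *rowPtrA, const int *colIndA,
                    const int *rowPtrB, const int *colIndB,
                    int *rowPtrC)
{
    assert(m >= 0);
    rowPtrC[0] = 0;
    for (int i = 0; i < m; ++i) {
        int a = rowPtrA[i], aEnd = rowPtrA[i + 1];
        int b = rowPtrB[i], bEnd = rowPtrB[i + 1];
        int count = 0;
        while (a < aEnd || b < bEnd) {            /* merge two sorted rows */
            if (b >= bEnd || (a < aEnd && colIndA[a] < colIndB[b])) {
                ++a;                              /* column only in A */
            } else if (a >= aEnd || colIndB[b] < colIndA[a]) {
                ++b;                              /* column only in B */
            } else {
                ++a; ++b;                         /* column in both: one entry */
            }
            ++count;
        }
        rowPtrC[i + 1] = rowPtrC[i] + count;
    }
    return rowPtrC[m];
}

/* Step 2 (host analogue of cusparseScsrgeam; hypothetical helper):
 * fill colIndC and valC, which must hold rowPtrC[m] entries each. */
void ref_scsrgeam(int m, float alpha,
                  const float *valA, const int *rowPtrA, const int *colIndA,
                  float beta,
                  const float *valB, const int *rowPtrB, const int *colIndB,
                  float *valC, const int *rowPtrC, int *colIndC)
{
    assert(m >= 0);
    for (int i = 0; i < m; ++i) {
        int a = rowPtrA[i], aEnd = rowPtrA[i + 1];
        int b = rowPtrB[i], bEnd = rowPtrB[i + 1];
        int c = rowPtrC[i];
        while (a < aEnd || b < bEnd) {
            if (b >= bEnd || (a < aEnd && colIndA[a] < colIndB[b])) {
                colIndC[c] = colIndA[a];
                valC[c] = alpha * valA[a];
                ++a;
            } else if (a >= aEnd || colIndB[b] < colIndA[a]) {
                colIndC[c] = colIndB[b];
                valC[c] = beta * valB[b];
                ++b;
            } else {
                colIndC[c] = colIndA[a];
                valC[c] = alpha * valA[a] + beta * valB[b];
                ++a; ++b;
            }
            ++c;
        }
    }
}
```

<p class="p">As on the device, the caller runs the counting step once, allocates <samp class="ph codeph">colIndC</samp> and <samp class="ph codeph">valC</samp> from the returned count, and can then repeat only the second step while the patterns of A and B stay fixed.</p>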
<div class="tablenoborder">
<table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Input</strong></span><tbody class="tbody">
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">handle</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">handle to the CUSPARSE library context.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">m</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">number of rows of sparse matrices <samp class="ph codeph">A,B,C</samp>.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">n</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">number of columns of sparse matrices <samp class="ph codeph">A,B,C</samp>.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">alpha</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><type> scalar used for multiplication.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">descrA</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the descriptor of matrix <samp class="ph codeph">A</samp>. The supported matrix type is <samp class="ph codeph">CUSPARSE_MATRIX_TYPE_GENERAL</samp> only.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">nnzA</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">number of non-zero elements of sparse matrix <samp class="ph codeph">A</samp>.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">csrValA</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><type> array of <samp class="ph codeph">nnzA</samp> (= <samp class="ph codeph">csrRowPtrA(m)</samp> − <samp class="ph codeph">csrRowPtrA(0)</samp>) non-zero elements of matrix <samp class="ph codeph">A</samp>.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">csrRowPtrA</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">integer array of <samp class="ph codeph">m+1</samp> elements that contains the start of every row and the end of the last row plus one.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">csrColIndA</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">integer array of <samp class="ph codeph">nnzA</samp> (= <samp class="ph codeph">csrRowPtrA(m)</samp> − <samp class="ph codeph">csrRowPtrA(0)</samp>) column indices of the non-zero elements of matrix <samp class="ph codeph">A</samp>.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">beta</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><type> scalar used for multiplication.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">descrB</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the descriptor of matrix <samp class="ph codeph">B</samp>. The supported matrix type is <samp class="ph codeph">CUSPARSE_MATRIX_TYPE_GENERAL</samp> only.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">nnzB</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">number of non-zero elements of sparse matrix <samp class="ph codeph">B</samp>.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">csrValB</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><type> array of <samp class="ph codeph">nnzB</samp> (= <samp class="ph codeph">csrRowPtrB(m)</samp> − <samp class="ph codeph">csrRowPtrB(0)</samp>) non-zero elements of matrix <samp class="ph codeph">B</samp>.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">csrRowPtrB</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">integer array of <samp class="ph codeph">m+1</samp> elements that contains the start of every row and the end of the last row plus one.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">csrColIndB</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">integer array of <samp class="ph codeph">nnzB</samp> (= <samp class="ph codeph">csrRowPtrB(m)</samp> − <samp class="ph codeph">csrRowPtrB(0)</samp>) column indices of the non-zero elements of matrix <samp class="ph codeph">B</samp>.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">descrC</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the descriptor of matrix <samp class="ph codeph">C</samp>. The supported matrix type is <samp class="ph codeph">CUSPARSE_MATRIX_TYPE_GENERAL</samp> only.</td>
</tr>
</tbody>
</table>
</div>
<div class="tablenoborder">
<table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Output</strong></span><tbody class="tbody">
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">csrValC</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><type> array of <samp class="ph codeph">nnzC</samp> (= <samp class="ph codeph">csrRowPtrC(m)</samp> − <samp class="ph codeph">csrRowPtrC(0)</samp>) non-zero elements of matrix <samp class="ph codeph">C</samp>.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">csrRowPtrC</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">integer array of <samp class="ph codeph">m+1</samp> elements that contains the start of every row and the end of the last row plus one.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">csrColIndC</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">integer array of <samp class="ph codeph">nnzC</samp> (= <samp class="ph codeph">csrRowPtrC(m)</samp> − <samp class="ph codeph">csrRowPtrC(0)</samp>) column indices of the non-zero elements of matrix <samp class="ph codeph">C</samp>.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">nnzTotalDevHostPtr</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">total number of non-zero elements, in device or host memory. It is equal to <samp class="ph codeph">(csrRowPtrC(m)-csrRowPtrC(0))</samp>.</td>
</tr>
</tbody>
</table>
</div>
<div class="tablenoborder">
<table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Status Returned</strong></span><tbody class="tbody">
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_SUCCESS</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the operation completed successfully.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_NOT_INITIALIZED</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the library was not initialized.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_ALLOC_FAILED</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the resources could not be allocated.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_INVALID_VALUE</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">invalid parameters were passed (<samp class="ph codeph">m,n,nnz<0</samp>, <samp class="ph codeph">IndexBase</samp> of <samp class="ph codeph">descrA,descrB,descrC</samp> is not base-0 or base-1, or <samp class="ph codeph">alpha</samp> or <samp class="ph codeph">beta</samp> is NULL).</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_ARCH_MISMATCH</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the device does not support double precision.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_EXECUTION_FAILED</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the function failed to launch on the GPU.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_MATRIX_TYPE_NOT_SUPPORTED</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the matrix type is not supported.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_INTERNAL_ERROR</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">an internal operation failed.</td>
</tr>
</tbody>
</table>
</div>
<div class="topic concept nested1" id="cusparse-lt-t-gt-csrgemm"><a name="cusparse-lt-t-gt-csrgemm" shape="rect">
<!-- --></a><h3 class="title topictitle2"><a href="#cusparse-lt-t-gt-csrgemm" name="cusparse-lt-t-gt-csrgemm" shape="rect">9.2. cusparse<t>csrgemm</a></h3>
<div class="body conbody"><pre xml:space="preserve">cusparseStatus_t
cusparseXcsrgemmNnz(cusparseHandle_t handle,
                    cusparseOperation_t transA, cusparseOperation_t transB,
                    int m, int n, int k,
                    const cusparseMatDescr_t descrA, const int nnzA,
                    const int *csrRowPtrA, const int *csrColIndA,
                    const cusparseMatDescr_t descrB, const int nnzB,
                    const int *csrRowPtrB, const int *csrColIndB,
                    const cusparseMatDescr_t descrC, int *csrRowPtrC,
                    int *nnzTotalDevHostPtr )

cusparseStatus_t
cusparseScsrgemm(cusparseHandle_t handle,
                 cusparseOperation_t transA, cusparseOperation_t transB,
                 int m, int n, int k,
                 const cusparseMatDescr_t descrA, const int nnzA,
                 const float *csrValA,
                 const int *csrRowPtrA, const int *csrColIndA,
                 const cusparseMatDescr_t descrB, const int nnzB,
                 const float *csrValB,
                 const int *csrRowPtrB, const int *csrColIndB,
                 const cusparseMatDescr_t descrC,
                 float *csrValC,
                 const int *csrRowPtrC, int *csrColIndC )

cusparseStatus_t
cusparseDcsrgemm(cusparseHandle_t handle,
                 cusparseOperation_t transA, cusparseOperation_t transB,
                 int m, int n, int k,
                 const cusparseMatDescr_t descrA, const int nnzA,
                 const double *csrValA,
                 const int *csrRowPtrA, const int *csrColIndA,
                 const cusparseMatDescr_t descrB, const int nnzB,
                 const double *csrValB,
                 const int *csrRowPtrB, const int *csrColIndB,
                 const cusparseMatDescr_t descrC,
                 double *csrValC,
                 const int *csrRowPtrC, int *csrColIndC )

cusparseStatus_t
cusparseCcsrgemm(cusparseHandle_t handle,
                 cusparseOperation_t transA, cusparseOperation_t transB,
                 int m, int n, int k,
                 const cusparseMatDescr_t descrA, const int nnzA,
                 const cuComplex *csrValA,
                 const int *csrRowPtrA, const int *csrColIndA,
                 const cusparseMatDescr_t descrB, const int nnzB,
                 const cuComplex *csrValB,
                 const int *csrRowPtrB, const int *csrColIndB,
                 const cusparseMatDescr_t descrC,
                 cuComplex *csrValC,
                 const int *csrRowPtrC, int *csrColIndC )

cusparseStatus_t
cusparseZcsrgemm(cusparseHandle_t handle,
                 cusparseOperation_t transA, cusparseOperation_t transB,
                 int m, int n, int k,
                 const cusparseMatDescr_t descrA, const int nnzA,
                 const cuDoubleComplex *csrValA,
                 const int *csrRowPtrA, const int *csrColIndA,
                 const cusparseMatDescr_t descrB, const int nnzB,
                 const cuDoubleComplex *csrValB,
                 const int *csrRowPtrB, const int *csrColIndB,
                 const cusparseMatDescr_t descrC,
                 cuDoubleComplex *csrValC,
                 const int *csrRowPtrC, int *csrColIndC )</pre><p class="p">This function performs the following matrix-matrix operation:</p>
<div class="tablenoborder">
<table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="void" border="1" rules="all">
<tbody class="tbody">
<tr class="row">
<td class="entry" align="center" valign="top" rowspan="1" colspan="1">
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>C</mi><mo>=</mo><mtext>op</mtext><mo stretchy="false">(</mo><mi>A</mi><mo stretchy="false">)</mo><mo>∗</mo><mtext>op</mtext><mo stretchy="false">(</mo><mi>B</mi><mo stretchy="false">)</mo></math>
</td>
</tr>
</tbody>
</table>
</div>
<p class="p">where
<math xmlns="http://www.w3.org/1998/Math/MathML"><mtext>op</mtext><mo stretchy="false">(</mo><mi>A</mi><mo stretchy="false">)</mo></math>,
<math xmlns="http://www.w3.org/1998/Math/MathML"><mtext>op</mtext><mo stretchy="false">(</mo><mi>B</mi><mo stretchy="false">)</mo></math>, and
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>C</mi></math> are <samp class="ph codeph">m×k</samp>, <samp class="ph codeph">k×n</samp>, and <samp class="ph codeph">m×n</samp> sparse matrices (defined in CSR storage format by the three arrays <samp class="ph codeph">csrValA|csrValB|csrValC</samp>, <samp class="ph codeph">csrRowPtrA|csrRowPtrB|csrRowPtrC</samp>, and <samp class="ph codeph">csrColIndA|csrColIndB|csrColIndC</samp>, respectively). The operation is defined by
</p>
<p class="p">
<math xmlns="http://www.w3.org/1998/Math/MathML"><mtext>op</mtext><mo stretchy="false">(</mo><mi>A</mi><mo stretchy="false">)</mo><mo>=</mo>
<mfenced open="{" close="">
<mtable columnalign="left left" rowspacing=".1em" columnspacing="1em">
<mtr><mtd><mi>A</mi></mtd><mtd><mtext>if trans == CUSPARSE_OPERATION_NON_TRANSPOSE</mtext></mtd></mtr>
<mtr><mtd><msup><mi>A</mi><mi>T</mi></msup></mtd><mtd><mtext>if trans != CUSPARSE_OPERATION_NON_TRANSPOSE</mtext></mtd></mtr>
</mtable>
</mfenced>
</math>
</p>
<p class="p">There are four versions, NN, NT, TN and TT. NN stands for
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>C</mi><mo>=</mo><mi>A</mi><mo>∗</mo><mi>B</mi></math>, NT stands for
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>C</mi><mo>=</mo><mi>A</mi><mo>∗</mo><msup><mi>B</mi><mi>T</mi></msup></math>, TN stands for
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>C</mi><mo>=</mo><msup><mi>A</mi><mi>T</mi></msup><mo>∗</mo><mi>B</mi></math>, and TT stands for
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>C</mi><mo>=</mo><msup><mi>A</mi><mi>T</mi></msup><mo>∗</mo><msup><mi>B</mi><mi>T</mi></msup></math>.
</p>
<p class="p">CUSPARSE adopts a two-step approach to complete the sparse matrix <samp class="ph codeph">C</samp>. In the first step, the user allocates <samp class="ph codeph">csrRowPtrC</samp> of <samp class="ph codeph">m+1</samp> elements and uses the function cusparseXcsrgemmNnz to determine <samp class="ph codeph">csrRowPtrC</samp> and the total number of non-zero elements. In the second step, the user gathers nnzC (the number of non-zero elements of matrix C)
from either <samp class="ph codeph">(nnzC=*nnzTotalDevHostPtr)</samp> or <samp class="ph codeph">(nnzC=csrRowPtrC(m)-csrRowPtrC(0))</samp>,
allocates <samp class="ph codeph">csrValC</samp> and <samp class="ph codeph">csrColIndC</samp> of nnzC elements each, and finally calls the function cusparse[S|D|C|Z]csrgemm to complete matrix C.
</p>
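<p class="p">For intuition, the same two steps can be written as a host-only reference in plain C for the NN case. This is a sketch under stated assumptions, not the library's algorithm: the names <samp class="ph codeph">ref_csrgemm_nnz</samp> and <samp class="ph codeph">ref_scsrgemm</samp> are hypothetical, indices are assumed zero-based, and each function allocates a temporary array of <samp class="ph codeph">n</samp> integers with <samp class="ph codeph">malloc</samp>. Column indices within a row of C come out in first-occurrence order, not necessarily sorted.</p>

```c
#include <assert.h>
#include <stdlib.h>

/* Step 1 (host analogue of cusparseXcsrgemmNnz, NN case; hypothetical helper):
 * fill rowPtrC (m+1 entries) and return nnzC, or -1 if malloc fails.
 * A is m-by-k, B is k-by-n, indices are zero-based. */
int ref_csrgemm_nnz(int m, int n,
                    const int *rowPtrA, const int *colIndA,
                    const int *rowPtrB, const int *colIndB,
                    int *rowPtrC)
{
    assert(m >= 0 && n >= 0);
    int *mark = malloc(sizeof(int) * (size_t)(n > 0 ? n : 1));
    if (!mark) return -1;
    for (int j = 0; j < n; ++j) mark[j] = -1;   /* mark[j] == i: column j seen in row i */
    rowPtrC[0] = 0;
    for (int i = 0; i < m; ++i) {
        int count = 0;
        for (int p = rowPtrA[i]; p < rowPtrA[i + 1]; ++p) {
            int r = colIndA[p];                 /* row of B selected by A(i, r) */
            for (int q = rowPtrB[r]; q < rowPtrB[r + 1]; ++q) {
                int j = colIndB[q];
                if (mark[j] != i) { mark[j] = i; ++count; }
            }
        }
        rowPtrC[i + 1] = rowPtrC[i] + count;
    }
    free(mark);
    return rowPtrC[m];
}

/* Step 2 (host analogue of cusparseScsrgemm, NN case; hypothetical helper):
 * fill colIndC and valC (rowPtrC[m] entries each). Returns 0, or -1 if malloc fails. */
int ref_scsrgemm(int m, int n,
                 const float *valA, const int *rowPtrA, const int *colIndA,
                 const float *valB, const int *rowPtrB, const int *colIndB,
                 float *valC, const int *rowPtrC, int *colIndC)
{
    assert(m >= 0 && n >= 0);
    int *pos = malloc(sizeof(int) * (size_t)(n > 0 ? n : 1));
    if (!pos) return -1;
    for (int j = 0; j < n; ++j) pos[j] = -1;    /* slot of column j in the current row */
    for (int i = 0; i < m; ++i) {
        int next = rowPtrC[i];
        for (int p = rowPtrA[i]; p < rowPtrA[i + 1]; ++p) {
            int r = colIndA[p];
            float a = valA[p];
            for (int q = rowPtrB[r]; q < rowPtrB[r + 1]; ++q) {
                int j = colIndB[q];
                if (pos[j] < rowPtrC[i]) {      /* first time column j appears in row i */
                    pos[j] = next;
                    colIndC[next] = j;
                    valC[next] = a * valB[q];
                    ++next;
                } else {
                    valC[pos[j]] += a * valB[q];
                }
            }
        }
    }
    free(pos);
    return 0;
}
```

<p class="p">The counting pass touches only the index arrays, which is why the value arrays of C can be allocated after it returns.</p>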
<p class="p">The general procedure is as follows:</p><pre xml:space="preserve"><span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">int</span> baseC, nnzC;
<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-comment">// nnzTotalDevHostPtr points to host memory</span>
<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">int</span> *nnzTotalDevHostPtr = &nnzC;
cusparseSetPointerMode(handle, CUSPARSE_POINTER_MODE_HOST);
<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-comment">// step 1: determine csrRowPtrC and the total number of non-zero elements</span>
cudaMalloc((<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">void</span>**)&csrRowPtrC, <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">sizeof</span>(<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">int</span>)*(m+1));
cusparseXcsrgemmNnz(handle, transA, transB, m, n, k,
        descrA, nnzA, csrRowPtrA, csrColIndA,
        descrB, nnzB, csrRowPtrB, csrColIndB,
        descrC, csrRowPtrC, nnzTotalDevHostPtr );
<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">if</span> (NULL != nnzTotalDevHostPtr){
    nnzC = *nnzTotalDevHostPtr;
}<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">else</span>{
    cudaMemcpy(&nnzC , csrRowPtrC+m, <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">sizeof</span>(<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">int</span>), cudaMemcpyDeviceToHost);
    cudaMemcpy(&baseC, csrRowPtrC , <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">sizeof</span>(<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">int</span>), cudaMemcpyDeviceToHost);
    nnzC -= baseC;
}
<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-comment">// step 2: allocate csrColIndC and csrValC, then complete matrix C</span>
cudaMalloc((<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">void</span>**)&csrColIndC, <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">sizeof</span>(<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">int</span>)*nnzC);
cudaMalloc((<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">void</span>**)&csrValC , <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">sizeof</span>(<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">float</span>)*nnzC);
cusparseScsrgemm(handle, transA, transB, m, n, k,
        descrA, nnzA,
        csrValA, csrRowPtrA, csrColIndA,
        descrB, nnzB,
        csrValB, csrRowPtrB, csrColIndB,
        descrC,
        csrValC, csrRowPtrC, csrColIndC);</pre><p class="p">Several comments on csrgemm:</p>
<p class="p">1. Only the NN version is implemented. For the NT version, matrix
<samp class="ph codeph">B</samp> is converted to
<math xmlns="http://www.w3.org/1998/Math/MathML"><msup><mi>B</mi><mi>T</mi></msup></math>
by csr2csc and the NN version is then called. The same technique applies to TN and TT. The csr2csc routine allocates working space
implicitly; if the user needs to control memory management, the NN version is the better choice.
</p>
<p class="p">2. The NN version needs working space of at least <samp class="ph codeph">nnzA</samp> integers.
</p>
<p class="p">3. Only <samp class="ph codeph">CUSPARSE_MATRIX_TYPE_GENERAL</samp> is supported. If either
<samp class="ph codeph">A</samp> or <samp class="ph codeph">B</samp> is symmetric or Hermitian, the user must extend the matrix to a full one and set the MatrixType field of the descriptor
to <samp class="ph codeph">CUSPARSE_MATRIX_TYPE_GENERAL</samp>.
</p>
<p class="p">4. Only devices of compute capability 2.0 or above are supported.</p>
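<p class="p">Extending a symmetric matrix to a full one can be done on the host before the data is copied to the device. The following is a minimal sketch with hypothetical helper names (<samp class="ph codeph">sym_upper_full_rowptr</samp>, <samp class="ph codeph">sym_upper_full_fill</samp>); it assumes a real-valued matrix whose upper triangle, diagonal included, is stored in zero-based CSR. For a Hermitian matrix the mirrored value would be the complex conjugate instead.</p>

```c
#include <assert.h>
#include <stdlib.h>

/* Pass 1 (hypothetical helper): compute the row pointer of the full matrix
 * from the upper-triangular CSR pattern. fullRowPtr holds n+1 entries.
 * Returns the number of non-zero elements of the full matrix. */
int sym_upper_full_rowptr(int n, const int *rowPtr, const int *colInd,
                          int *fullRowPtr)
{
    assert(n >= 0);
    for (int i = 0; i <= n; ++i) fullRowPtr[i] = 0;
    for (int i = 0; i < n; ++i) {
        for (int p = rowPtr[i]; p < rowPtr[i + 1]; ++p) {
            int j = colInd[p];
            ++fullRowPtr[i + 1];                /* entry (i, j) itself */
            if (j != i) ++fullRowPtr[j + 1];    /* mirrored entry (j, i) */
        }
    }
    for (int i = 0; i < n; ++i) fullRowPtr[i + 1] += fullRowPtr[i];
    return fullRowPtr[n];
}

/* Pass 2 (hypothetical helper): write values and column indices of the full
 * matrix. fullVal and fullColInd hold fullRowPtr[n] entries each.
 * Returns 0 on success, -1 if malloc fails. */
int sym_upper_full_fill(int n, const float *val,
                        const int *rowPtr, const int *colInd,
                        const int *fullRowPtr,
                        float *fullVal, int *fullColInd)
{
    assert(n >= 0);
    int *next = malloc(sizeof(int) * (size_t)(n > 0 ? n : 1));
    if (!next) return -1;
    for (int i = 0; i < n; ++i) next[i] = fullRowPtr[i];   /* write cursor per row */
    for (int i = 0; i < n; ++i) {
        for (int p = rowPtr[i]; p < rowPtr[i + 1]; ++p) {
            int j = colInd[p];
            int d = next[i]++;
            fullColInd[d] = j;
            fullVal[d] = val[p];
            if (j != i) {                        /* mirror below the diagonal */
                int s = next[j]++;
                fullColInd[s] = i;
                fullVal[s] = val[p];
            }
        }
    }
    free(next);
    return 0;
}
```

<p class="p">After the full arrays are copied to the device, the descriptor's matrix type is set to <samp class="ph codeph">CUSPARSE_MATRIX_TYPE_GENERAL</samp> (for example with <samp class="ph codeph">cusparseSetMatType</samp>) before csrgeam or csrgemm is called.</p>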
13252 <div class="tablenoborder">
13253 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Input</strong></span><tbody class="tbody">
13255 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">handle</samp></td>
13256 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">handle to the CUSPARSE library context.</td>
13259 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">transA</samp></td>
13260 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the operation
13263 <math xmlns="http://www.w3.org/1998/Math/MathML">
13264 <mrow class="MJX-TeXAtom-ORD">
13265 <mrow class="MJX-TeXAtom-ORD">
13269 <mo stretchy="false">(</mo>
13271 <mo stretchy="false">)</mo>
13276 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">transB</samp></td>
13277 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the operation
13280 <math xmlns="http://www.w3.org/1998/Math/MathML">
13281 <mrow class="MJX-TeXAtom-ORD">
13282 <mrow class="MJX-TeXAtom-ORD">
13286 <mo stretchy="false">(</mo>
13288 <mo stretchy="false">)</mo>
13293 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">m</samp></td>
13294 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">number of rows of sparse matrix
13297 <math xmlns="http://www.w3.org/1998/Math/MathML">
13298 <mrow class="MJX-TeXAtom-ORD">
13299 <mrow class="MJX-TeXAtom-ORD">
13303 <mo stretchy="false">(</mo>
13305 <mo stretchy="false">)</mo>
13308 and <samp class="ph codeph">C</samp>.
13312 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">n</samp></td>
13313 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">number of columns of sparse matrix
13316 <math xmlns="http://www.w3.org/1998/Math/MathML">
13317 <mrow class="MJX-TeXAtom-ORD">
13318 <mrow class="MJX-TeXAtom-ORD">
13322 <mo stretchy="false">(</mo>
13324 <mo stretchy="false">)</mo>
13327 and <samp class="ph codeph">C</samp>.
13331 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">k</samp></td>
13332 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">number of columns/rows of sparse matrix
13335 <math xmlns="http://www.w3.org/1998/Math/MathML">
13336 <mrow class="MJX-TeXAtom-ORD">
13337 <mrow class="MJX-TeXAtom-ORD">
13341 <mo stretchy="false">(</mo>
13343 <mo stretchy="false">)</mo>
13349 <math xmlns="http://www.w3.org/1998/Math/MathML">
13350 <mrow class="MJX-TeXAtom-ORD">
13351 <mrow class="MJX-TeXAtom-ORD">
13355 <mo stretchy="false">(</mo>
13357 <mo stretchy="false">)</mo>
13364 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">descrA</samp></td>
13365 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the descriptor of matrix
13366 <math xmlns="http://www.w3.org/1998/Math/MathML">
13368 </math>. The supported matrix type is <samp class="ph codeph">CUSPARSE_MATRIX_TYPE_GENERAL</samp> only.
13372 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">nnzA</samp></td>
13373 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">number of nonz-zero elements of sparse matrix <samp class="ph codeph">A</samp>.
13377 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">csrValA</samp></td>
13378 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><type> array of
13379 <samp class="ph codeph">nnzA</samp><math xmlns="http://www.w3.org/1998/Math/MathML">
13380 <mo stretchy="false">(</mo>
13382 </math><samp class="ph codeph">csrRowPtrA(m)</samp><math xmlns="http://www.w3.org/1998/Math/MathML">
13384 </math><samp class="ph codeph">csrRowPtrA(0)</samp><math xmlns="http://www.w3.org/1998/Math/MathML">
13385 <mo stretchy="false">)</mo>
13387 non-zero elements of matrix
13388 <math xmlns="http://www.w3.org/1998/Math/MathML">
13394 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">csrRowPtrA</samp></td>
13395 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">integer array of
13398 <math xmlns="http://www.w3.org/1998/Math/MathML">
13399 <mrow class="MJX-TeXAtom-ORD">
13402 <mo stretchy="false">˜</mo>
13409 elements that contains the start of every row and the end of the last row plus one.
13412 <math xmlns="http://www.w3.org/1998/Math/MathML">
13413 <mrow class="MJX-TeXAtom-ORD">
13416 <mo stretchy="false">˜</mo>
13423 if <samp class="ph codeph">transA</samp> == <samp class="ph codeph">CUSPARSE_OPERATION_NON_TRANSPOSE</samp>, otherwise
13426 <math xmlns="http://www.w3.org/1998/Math/MathML">
13427 <mrow class="MJX-TeXAtom-ORD">
13430 <mo stretchy="false">˜</mo>
13441 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">csrColIndA</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">integer array of
<samp class="ph codeph">nnzA</samp> <math xmlns="http://www.w3.org/1998/Math/MathML"><mo stretchy="false">(</mo><mo>=</mo></math> <samp class="ph codeph">csrRowPtrA(m)</samp> <math xmlns="http://www.w3.org/1998/Math/MathML"><mo>-</mo></math> <samp class="ph codeph">csrRowPtrA(0)</samp> <math xmlns="http://www.w3.org/1998/Math/MathML"><mo stretchy="false">)</mo></math>
column indices of the non-zero elements of matrix <math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math>.</td>
13458 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">descrB</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the descriptor of matrix
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>B</mi></math>. The supported matrix type is <samp class="ph codeph">CUSPARSE_MATRIX_TYPE_GENERAL</samp> only.
13466 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">nnzB</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">number of non-zero elements of sparse matrix <samp class="ph codeph">B</samp>.
13471 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">csrValB</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">&lt;type&gt; array of
<samp class="ph codeph">nnzB</samp>
non-zero elements of matrix <math xmlns="http://www.w3.org/1998/Math/MathML"><mi>B</mi></math>.</td>
13481 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">csrRowPtrB</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">integer array of
<math xmlns="http://www.w3.org/1998/Math/MathML"><mrow class="MJX-TeXAtom-ORD"><mover><mi>k</mi><mo stretchy="false">˜</mo></mover></mrow><mo>+</mo><mn>1</mn></math>
elements that contains the start of every row and the end of the last row plus one.
<math xmlns="http://www.w3.org/1998/Math/MathML"><mrow class="MJX-TeXAtom-ORD"><mover><mi>k</mi><mo stretchy="false">˜</mo></mover></mrow><mo>=</mo><mi>k</mi></math>
if <samp class="ph codeph">transB == CUSPARSE_OPERATION_NON_TRANSPOSE</samp>, otherwise
<math xmlns="http://www.w3.org/1998/Math/MathML"><mrow class="MJX-TeXAtom-ORD"><mover><mi>k</mi><mo stretchy="false">˜</mo></mover></mrow><mo>=</mo><mi>n</mi></math>.</td>
13526 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">csrColIndB</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">integer array of <samp class="ph codeph">nnzB</samp> column indices of the non-zero elements of matrix
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>B</mi></math>.</td>
13534 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">descrC</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the descriptor of matrix
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>C</mi></math>. The supported matrix type is <samp class="ph codeph">CUSPARSE_MATRIX_TYPE_GENERAL</samp> only.
13544 <div class="tablenoborder">
13545 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Output</strong></span><tbody class="tbody">
13547 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">csrValC</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">&lt;type&gt; array of
<samp class="ph codeph">nnzC</samp> <math xmlns="http://www.w3.org/1998/Math/MathML"><mo stretchy="false">(</mo><mo>=</mo></math> <samp class="ph codeph">csrRowPtrC(m)</samp> <math xmlns="http://www.w3.org/1998/Math/MathML"><mo>-</mo></math> <samp class="ph codeph">csrRowPtrC(0)</samp> <math xmlns="http://www.w3.org/1998/Math/MathML"><mo stretchy="false">)</mo></math>
non-zero elements of matrix <math xmlns="http://www.w3.org/1998/Math/MathML"><mi>C</mi></math>.</td>
13564 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">csrRowPtrC</samp></td>
13565 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">integer array of <samp class="ph codeph">m+1</samp> elements that contains the start of every row and the end of the last row plus one.
13569 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">csrColIndC</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">integer array of
<samp class="ph codeph">nnzC</samp> <math xmlns="http://www.w3.org/1998/Math/MathML"><mo stretchy="false">(</mo><mo>=</mo></math> <samp class="ph codeph">csrRowPtrC(m)</samp> <math xmlns="http://www.w3.org/1998/Math/MathML"><mo>-</mo></math> <samp class="ph codeph">csrRowPtrC(0)</samp> <math xmlns="http://www.w3.org/1998/Math/MathML"><mo stretchy="false">)</mo></math>
column indices of the non-zero elements of matrix <math xmlns="http://www.w3.org/1998/Math/MathML"><mi>C</mi></math>.</td>
13586 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">nnzTotalDevHostPtr</samp></td>
13587 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">total number of nonzero elements in device or host memory. It is equal to <samp class="ph codeph">(csrRowPtrC(m)-csrRowPtrC(0))</samp>.
13593 <div class="tablenoborder">
13594 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Status Returned</strong></span><tbody class="tbody">
13596 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_SUCCESS</samp></td>
13597 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the operation completed successfully.</td>
13600 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_NOT_INITIALIZED</samp></td>
13601 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the library was not initialized.</td>
13604 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_ALLOC_FAILED</samp></td>
13605 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the resources could not be allocated.</td>
13608 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_INVALID_VALUE</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">invalid parameters were passed (<samp class="ph codeph">m,n,k&lt;0</samp>, <samp class="ph codeph">IndexBase</samp> of <samp class="ph codeph">descrA,descrB,descrC</samp> is not base-0 or base-1, or alpha or beta is nil).
13613 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_ARCH_MISMATCH</samp></td>
13614 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the device does not support double precision.</td>
13617 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_EXECUTION_FAILED</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the function failed to launch on the GPU.
13621 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_MATRIX_TYPE_NOT_SUPPORTED</samp></td>
13622 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">
13625 the matrix type is not supported.
13629 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_INTERNAL_ERROR</samp></td>
13630 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">an internal operation failed.</td>
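<p class="p">As a small illustration of the relation <samp class="ph codeph">nnzC = csrRowPtrC(m) - csrRowPtrC(0)</samp> used in the tables above, the following host-side C sketch computes the number of stored elements from a CSR row-pointer array. The helper name <samp class="ph codeph">csr_nnz</samp> is hypothetical and not part of CUSPARSE; it only shows that the difference gives the same count for base-0 and base-1 indexing.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a CUSPARSE routine): number of stored elements
 * of an m-row CSR matrix. The index base cancels in the difference, so the
 * result is the same for base-0 and base-1 row pointers. */
static int csr_nnz(const int *csrRowPtr, int m)
{
    assert(csrRowPtr != NULL && m >= 0);
    return csrRowPtr[m] - csrRowPtr[0];
}
```

<p class="p">For a 3-row matrix with row pointers <samp class="ph codeph">{0, 2, 3, 5}</samp> (base-0) or <samp class="ph codeph">{1, 3, 4, 6}</samp> (base-1), the helper returns 5 in both cases.</p>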
13638 <div class="topic concept nested0" id="cusparse-preconditioners-reference"><a name="cusparse-preconditioners-reference" shape="rect">
13639 <!-- --></a><h2 class="title topictitle1"><a href="#cusparse-preconditioners-reference" name="cusparse-preconditioners-reference" shape="rect">10. CUSPARSE Preconditioners Reference</a></h2>
13640 <div class="body conbody">
13641 <p class="p">This chapter describes the routines that implement different preconditioners.</p>
13642 <p class="p">In particular, the incomplete factorizations are implemented in two phases. First, during the analysis phase, the sparse triangular
13643 matrix is analyzed to determine the dependencies between its elements by calling the appropriate <samp class="ph codeph">csrsv_analysis()</samp> function. The analysis is specific to the sparsity pattern of the given matrix and selected <samp class="ph codeph">cusparseOperation_t</samp> type. The information from the analysis phase is stored in the parameter of type <samp class="ph codeph">cusparseSolveAnalysisInfo_t</samp> that has been initialized previously with a call to <samp class="ph codeph">cusparseCreateSolveAnalysisInfo()</samp>.
13645 <p class="p">Second, during the numerical factorization phase, the given coefficient matrix is factorized using the information stored
13646 in the <samp class="ph codeph">cusparseSolveAnalysisInfo_t</samp> parameter by calling the appropriate <samp class="ph codeph">csrilu0</samp> or <samp class="ph codeph">csric0</samp> function.
<p class="p">The analysis phase is shared across the sparse triangular solve and the incomplete factorization and needs to be performed only
once, while the resulting information can be passed to the numerical factorization and the sparse triangular solve multiple times.
13652 <p class="p">Finally, once the incomplete factorization and all the sparse triangular solves have completed, the opaque data structure
13653 pointed to by the <samp class="ph codeph">cusparseSolveAnalysisInfo_t</samp> parameter can be released by calling <samp class="ph codeph">cusparseDestroySolveAnalysisInfo()</samp>.
13656 <div class="topic concept nested1" id="cusparse-lt-t-gt-csric0"><a name="cusparse-lt-t-gt-csric0" shape="rect">
<!-- --></a><h3 class="title topictitle2"><a href="#cusparse-lt-t-gt-csric0" name="cusparse-lt-t-gt-csric0" shape="rect">10.1. cusparse&lt;t&gt;csric0</a></h3>
<div class="body conbody"><pre xml:space="preserve">cusparseStatus_t
cusparseScsric0(cusparseHandle_t handle, cusparseOperation_t trans,
                int m, const cusparseMatDescr_t descrA,
                float *csrValM,
                const int *csrRowPtrA, const int *csrColIndA,
                cusparseSolveAnalysisInfo_t info)

cusparseStatus_t
cusparseDcsric0(cusparseHandle_t handle, cusparseOperation_t trans,
                int m, const cusparseMatDescr_t descrA,
                double *csrValM,
                const int *csrRowPtrA, const int *csrColIndA,
                cusparseSolveAnalysisInfo_t info)

cusparseStatus_t
cusparseCcsric0(cusparseHandle_t handle, cusparseOperation_t trans,
                int m, const cusparseMatDescr_t descrA,
                cuComplex *csrValM,
                const int *csrRowPtrA, const int *csrColIndA,
                cusparseSolveAnalysisInfo_t info)

cusparseStatus_t
cusparseZcsric0(cusparseHandle_t handle, cusparseOperation_t trans,
                int m, const cusparseMatDescr_t descrA,
                cuDoubleComplex *csrValM,
                const int *csrRowPtrA, const int *csrColIndA,
                cusparseSolveAnalysisInfo_t info)</pre><p class="p">This function computes the incomplete-Cholesky factorization with
<math xmlns="http://www.w3.org/1998/Math/MathML"><mn>0</mn></math> fill-in and no pivoting
</p>
<div class="tablenoborder">
<table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="void" border="1" rules="all">
<tbody class="tbody">
<tr>
<td class="entry" align="center" valign="top" rowspan="1" colspan="1">
<math xmlns="http://www.w3.org/1998/Math/MathML"><mtext>op</mtext><mo stretchy="false">(</mo><mi>A</mi><mo stretchy="false">)</mo><mo>≈</mo><msup><mi>R</mi><mrow class="MJX-TeXAtom-ORD"><mi>T</mi></mrow></msup><mi>R</mi></math>
</td>
</tr>
</tbody>
</table>
</div>
<p class="p">
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math> is an <samp class="ph codeph">m</samp><math xmlns="http://www.w3.org/1998/Math/MathML"><mo>×</mo></math><samp class="ph codeph">m</samp> Hermitian/symmetric positive definite sparse matrix (that is defined in CSR storage format by the three arrays <samp class="ph codeph">csrValM</samp>, <samp class="ph codeph">csrRowPtrA</samp> and <samp class="ph codeph">csrColIndA</samp>) and
<math xmlns="http://www.w3.org/1998/Math/MathML"><mtext>op</mtext><mo stretchy="false">(</mo><mi>A</mi><mo stretchy="false">)</mo><mo>=</mo>
<mfenced open="{" close="">
<mtable columnalign="left left" rowspacing=".1em" columnspacing="1em">
<mtr><mtd><mi>A</mi></mtd><mtd><mtext>if trans == CUSPARSE_OPERATION_NON_TRANSPOSE</mtext></mtd></mtr>
<mtr><mtd><msup><mi>A</mi><mi>T</mi></msup></mtd><mtd><mtext>if trans == CUSPARSE_OPERATION_TRANSPOSE</mtext></mtd></mtr>
<mtr><mtd><msup><mi>A</mi><mi>H</mi></msup></mtd><mtd><mtext>if trans == CUSPARSE_OPERATION_CONJUGATE_TRANSPOSE</mtext></mtd></mtr>
</mtable>
</mfenced>
</math>
</p>
<p class="p">Notice that only a lower or upper Hermitian/symmetric part of the matrix
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math> is actually stored. It is overwritten by the lower or upper triangular factor
<math xmlns="http://www.w3.org/1998/Math/MathML"><msup><mi>R</mi><mi>T</mi></msup></math> or
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>R</mi></math>, respectively.
</p>
<p class="p">A call to this routine must be preceded by a call to the <samp class="ph codeph">csrsv_analysis</samp> routine.
13780 <p class="p">This function requires some extra storage. It is executed asynchronously with respect to the host and it may return control
13781 to the application on the host before the result is ready.
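<p class="p">To make the idea of a zero fill-in incomplete-Cholesky factorization concrete, here is a minimal host-side reference sketch in C. It is not the algorithm CUSPARSE runs on the GPU and it does not call the library. The names <samp class="ph codeph">ic0_lower_csr</samp> and <samp class="ph codeph">sqrt_newton</samp> are hypothetical. The sketch assumes a real-valued matrix whose lower triangle is stored in base-0 CSR with sorted column indices and the diagonal entry last in each row; it overwrites the values with a lower factor whose transpose plays the role of the upper factor described above.</p>

```c
#include <assert.h>

/* Newton iteration for sqrt(x), x > 0. Used instead of libm's sqrt only so
 * that this sketch builds without extra link flags. */
static double sqrt_newton(double x)
{
    double r = x > 1.0 ? x : 1.0;        /* start at or above sqrt(x) */
    for (;;) {
        double next = 0.5 * (r + x / r);
        if (next >= r) return r;         /* no further decrease: converged */
        r = next;
    }
}

/* Hypothetical host reference for IC(0) on the stored lower triangle.
 * Assumptions: base-0 CSR, sorted columns, diagonal last in every row.
 * val is overwritten in place; entries outside the pattern are never
 * created (zero fill-in). Returns 0 on success, -1 on a non-positive pivot. */
static int ic0_lower_csr(int m, double *val, const int *rowPtr, const int *colInd)
{
    for (int i = 0; i < m; ++i) {
        for (int p = rowPtr[i]; p < rowPtr[i + 1]; ++p) {
            int j = colInd[p];           /* entry (i, j) with j <= i */
            assert(j <= i);
            double s = val[p];
            /* subtract sum over k < j of L(i,k) * L(j,k), shared columns only */
            int a = rowPtr[i], b = rowPtr[j];
            while (a < p && b < rowPtr[j + 1] && colInd[b] < j) {
                if (colInd[a] == colInd[b]) {
                    s -= val[a] * val[b];
                    ++a;
                    ++b;
                } else if (colInd[a] < colInd[b]) {
                    ++a;
                } else {
                    ++b;
                }
            }
            if (j < i) {
                val[p] = s / val[rowPtr[j + 1] - 1];   /* divide by L(j,j) */
            } else {
                if (s <= 0.0) return -1;               /* not positive definite */
                val[p] = sqrt_newton(s);
            }
        }
    }
    return 0;
}
```

<p class="p">For the lower triangle of the tridiagonal matrix with rows (4, 2, 0), (2, 5, 2), (0, 2, 5), the stored values 4, 2, 5, 2, 5 become approximately 2, 1, 2, 1, 2. Because a tridiagonal pattern produces no fill-in, the incomplete factor coincides with the exact Cholesky factor in this case.</p>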
13783 <div class="tablenoborder">
13784 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Input</strong></span><tbody class="tbody">
13786 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">handle</samp></td>
13787 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">handle to the CUSPARSE library context.</td>
13790 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">trans</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the operation <samp class="ph codeph">op</samp><math xmlns="http://www.w3.org/1998/Math/MathML"><mo stretchy="false">(</mo><mi>A</mi><mo stretchy="false">)</mo></math>.</td>
13799 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">m</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">number of rows and columns of matrix
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math>.</td>
13807 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">descrA</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the descriptor of matrix
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math>. The supported matrix types are <samp class="ph codeph">CUSPARSE_MATRIX_TYPE_SYMMETRIC</samp> and <samp class="ph codeph">CUSPARSE_MATRIX_TYPE_HERMITIAN</samp>. Also, the supported index bases are <samp class="ph codeph">CUSPARSE_INDEX_BASE_ZERO</samp> and <samp class="ph codeph">CUSPARSE_INDEX_BASE_ONE</samp>.
13815 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">csrValM</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">&lt;type&gt; array of
<samp class="ph codeph">nnz</samp> <math xmlns="http://www.w3.org/1998/Math/MathML"><mo stretchy="false">(</mo><mo>=</mo></math> <samp class="ph codeph">csrRowPtrA(m)</samp> <math xmlns="http://www.w3.org/1998/Math/MathML"><mo>-</mo></math> <samp class="ph codeph">csrRowPtrA(0)</samp> <math xmlns="http://www.w3.org/1998/Math/MathML"><mo stretchy="false">)</mo></math>
non-zero elements of matrix <math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math>.</td>
13832 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">csrRowPtrA</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">integer array of <samp class="ph codeph">m</samp><math xmlns="http://www.w3.org/1998/Math/MathML"><mo>+</mo><mn>1</mn></math>
elements that contains the start of every row and the end of the last row plus one.</td>
13842 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">csrColIndA</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">integer array of
<samp class="ph codeph">nnz</samp> <math xmlns="http://www.w3.org/1998/Math/MathML"><mo stretchy="false">(</mo><mo>=</mo></math> <samp class="ph codeph">csrRowPtrA(m)</samp> <math xmlns="http://www.w3.org/1998/Math/MathML"><mo>-</mo></math> <samp class="ph codeph">csrRowPtrA(0)</samp> <math xmlns="http://www.w3.org/1998/Math/MathML"><mo stretchy="false">)</mo></math>
column indices of the non-zero elements of matrix <math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math>.</td>
13859 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">info</samp></td>
13860 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">structure with information collected during the analysis phase (that should have been passed to the solve phase unchanged).</td>
13865 <div class="tablenoborder">
13866 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Output</strong></span><tbody class="tbody">
13868 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">csrValM</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">&lt;type&gt; matrix containing the incomplete-Cholesky lower or upper triangular factor.</td>
13874 <div class="tablenoborder">
13875 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Status Returned</strong></span><tbody class="tbody">
13877 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_SUCCESS</samp></td>
13878 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the operation completed successfully.</td>
13881 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_NOT_INITIALIZED</samp></td>
13882 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the library was not initialized.</td>
13885 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_ALLOC_FAILED</samp></td>
13886 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the resources could not be allocated.</td>
13889 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_INVALID_VALUE</samp></td>
13890 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">invalid parameters were passed (<samp class="ph codeph">m<0</samp>).
13894 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_ARCH_MISMATCH</samp></td>
13895 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the device does not support double precision.</td>
13898 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_EXECUTION_FAILED</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the function failed to launch on the GPU.
13902 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_INTERNAL_ERROR</samp></td>
13903 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">an internal operation failed.</td>
13906 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_MATRIX_TYPE_NOT_SUPPORTED</samp></td>
13907 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">
13910 the matrix type is not supported.
13918 <div class="topic concept nested1" id="cusparse-lt-t-gt-csrilu0"><a name="cusparse-lt-t-gt-csrilu0" shape="rect">
<!-- --></a><h3 class="title topictitle2"><a href="#cusparse-lt-t-gt-csrilu0" name="cusparse-lt-t-gt-csrilu0" shape="rect">10.2. cusparse&lt;t&gt;csrilu0</a></h3>
<div class="body conbody"><pre xml:space="preserve">cusparseStatus_t
cusparseScsrilu0(cusparseHandle_t handle, cusparseOperation_t trans,
                 int m, const cusparseMatDescr_t descrA,
                 float *csrValM,
                 const int *csrRowPtrA, const int *csrColIndA,
                 cusparseSolveAnalysisInfo_t info)

cusparseStatus_t
cusparseDcsrilu0(cusparseHandle_t handle, cusparseOperation_t trans,
                 int m, const cusparseMatDescr_t descrA,
                 double *csrValM,
                 const int *csrRowPtrA, const int *csrColIndA,
                 cusparseSolveAnalysisInfo_t info)

cusparseStatus_t
cusparseCcsrilu0(cusparseHandle_t handle, cusparseOperation_t trans,
                 int m, const cusparseMatDescr_t descrA,
                 cuComplex *csrValM,
                 const int *csrRowPtrA, const int *csrColIndA,
                 cusparseSolveAnalysisInfo_t info)

cusparseStatus_t
cusparseZcsrilu0(cusparseHandle_t handle, cusparseOperation_t trans,
                 int m, const cusparseMatDescr_t descrA,
                 cuDoubleComplex *csrValM,
                 const int *csrRowPtrA, const int *csrColIndA,
                 cusparseSolveAnalysisInfo_t info)</pre><p class="p">This function computes the incomplete-LU factorization with
<math xmlns="http://www.w3.org/1998/Math/MathML"><mn>0</mn></math> fill-in and no pivoting
</p>
<div class="tablenoborder">
<table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="void" border="1" rules="all">
<tbody class="tbody">
<tr>
<td class="entry" align="center" valign="top" rowspan="1" colspan="1">
<math xmlns="http://www.w3.org/1998/Math/MathML"><mtext>op</mtext><mo stretchy="false">(</mo><mi>A</mi><mo stretchy="false">)</mo><mo>≈</mo><mi>L</mi><mi>U</mi></math>
</td>
</tr>
</tbody>
</table>
</div>
<p class="p">
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math> is an <samp class="ph codeph">m</samp><math xmlns="http://www.w3.org/1998/Math/MathML"><mo>×</mo></math><samp class="ph codeph">m</samp> sparse matrix (that is defined in CSR storage format by the three arrays <samp class="ph codeph">csrValM</samp>, <samp class="ph codeph">csrRowPtrA</samp> and <samp class="ph codeph">csrColIndA</samp>) and
<math xmlns="http://www.w3.org/1998/Math/MathML"><mtext>op</mtext><mo stretchy="false">(</mo><mi>A</mi><mo stretchy="false">)</mo><mo>=</mo>
<mfenced open="{" close="">
<mtable columnalign="left left" rowspacing=".1em" columnspacing="1em">
<mtr><mtd><mi>A</mi></mtd><mtd><mtext>if trans == CUSPARSE_OPERATION_NON_TRANSPOSE</mtext></mtd></mtr>
<mtr><mtd><msup><mi>A</mi><mi>T</mi></msup></mtd><mtd><mtext>if trans == CUSPARSE_OPERATION_TRANSPOSE</mtext></mtd></mtr>
<mtr><mtd><msup><mi>A</mi><mi>H</mi></msup></mtd><mtd><mtext>if trans == CUSPARSE_OPERATION_CONJUGATE_TRANSPOSE</mtext></mtd></mtr>
</mtable>
</mfenced>
</math>
</p>
<p class="p">Notice that the diagonal of the lower triangular factor
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>L</mi></math> is unit (all ones) and need not be stored. Therefore, the input matrix is overwritten with the resulting lower and upper triangular factors
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>L</mi></math> and
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>U</mi></math>, respectively.
</p>
<p class="p">A call to this routine must be preceded by a call to the <samp class="ph codeph">csrsv_analysis</samp> routine.
14034 <p class="p">This function requires some extra storage. It is executed asynchronously with respect to the host and it may return control
14035 to the application on the host before the result is ready.
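<p class="p">The in-place storage of both factors can be illustrated with a small host-side reference sketch in C. It is not the GPU algorithm used by CUSPARSE and it does not call the library. The names <samp class="ph codeph">csr_find</samp> and <samp class="ph codeph">ilu0_csr</samp> are hypothetical. The sketch assumes a real-valued matrix in base-0 CSR with sorted column indices and every diagonal entry present. After the call, the strictly lower part of each row holds the lower factor (its unit diagonal is implied) and the remaining entries hold the upper factor.</p>

```c
#include <assert.h>

/* Hypothetical helper: position of entry (row, col) in a base-0 CSR
 * pattern, or -1 if the entry is not stored. */
static int csr_find(const int *rowPtr, const int *colInd, int row, int col)
{
    for (int p = rowPtr[row]; p < rowPtr[row + 1]; ++p)
        if (colInd[p] == col) return p;
    return -1;
}

/* Hypothetical host reference for ILU(0), row-by-row (IKJ) ordering.
 * Assumptions: base-0 CSR, sorted columns, all diagonal entries stored.
 * Updates touch only entries already in the pattern (zero fill-in) and
 * no pivoting is performed.
 * Returns 0 on success, -1 if a diagonal entry is missing or zero. */
static int ilu0_csr(int m, double *val, const int *rowPtr, const int *colInd)
{
    assert(m >= 0);
    for (int i = 0; i < m; ++i) {
        for (int p = rowPtr[i]; p < rowPtr[i + 1] && colInd[p] < i; ++p) {
            int k = colInd[p];
            int dk = csr_find(rowPtr, colInd, k, k);
            if (dk < 0 || val[dk] == 0.0) return -1;
            val[p] /= val[dk];                           /* L(i,k) */
            for (int q = p + 1; q < rowPtr[i + 1]; ++q) {
                int u = csr_find(rowPtr, colInd, k, colInd[q]);
                if (u >= 0) val[q] -= val[p] * val[u];   /* A(i,j) -= L(i,k) * U(k,j) */
            }
        }
        int di = csr_find(rowPtr, colInd, i, i);
        if (di < 0 || val[di] == 0.0) return -1;
    }
    return 0;
}
```

<p class="p">For the tridiagonal matrix with rows (4, 2, 0), (2, 5, 2), (0, 2, 5), the stored values 4, 2, 2, 5, 2, 2, 5 become 4, 2, 0.5, 4, 2, 0.5, 4: the entries 0.5 belong to the lower factor and the rest to the upper factor.</p>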
14037 <div class="tablenoborder">
14038 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Input</strong></span><tbody class="tbody">
14040 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">handle</samp></td>
14041 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">handle to the CUSPARSE library context.</td>
14044 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">trans</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the operation <samp class="ph codeph">op</samp><math xmlns="http://www.w3.org/1998/Math/MathML"><mo stretchy="false">(</mo><mi>A</mi><mo stretchy="false">)</mo></math>.</td>
14053 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">m</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">number of rows and columns of matrix
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math>.</td>
14061 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">descrA</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the descriptor of matrix
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math>. The supported matrix type is <samp class="ph codeph">CUSPARSE_MATRIX_TYPE_GENERAL</samp>. Also, the supported index bases are <samp class="ph codeph">CUSPARSE_INDEX_BASE_ZERO</samp> and <samp class="ph codeph">CUSPARSE_INDEX_BASE_ONE</samp>.
14069 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">csrValM</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">&lt;type&gt; array of
<samp class="ph codeph">nnz</samp> <math xmlns="http://www.w3.org/1998/Math/MathML"><mo stretchy="false">(</mo><mo>=</mo></math> <samp class="ph codeph">csrRowPtrA(m)</samp> <math xmlns="http://www.w3.org/1998/Math/MathML"><mo>-</mo></math> <samp class="ph codeph">csrRowPtrA(0)</samp> <math xmlns="http://www.w3.org/1998/Math/MathML"><mo stretchy="false">)</mo></math>
non-zero elements of matrix <math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math>.</td>
14086 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">csrRowPtrA</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">integer array of <samp class="ph codeph">m</samp><math xmlns="http://www.w3.org/1998/Math/MathML"><mo>+</mo><mn>1</mn></math>
elements that contains the start of every row and the end of the last row plus one.</td>
14096 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">csrColIndA</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">integer array of
<samp class="ph codeph">nnz</samp> <math xmlns="http://www.w3.org/1998/Math/MathML"><mo stretchy="false">(</mo><mo>=</mo></math> <samp class="ph codeph">csrRowPtrA(m)</samp> <math xmlns="http://www.w3.org/1998/Math/MathML"><mo>-</mo></math> <samp class="ph codeph">csrRowPtrA(0)</samp> <math xmlns="http://www.w3.org/1998/Math/MathML"><mo stretchy="false">)</mo></math>
column indices of the non-zero elements of matrix <math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math>.</td>
14113 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">info</samp></td>
14114 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">structure with information collected during the analysis phase (that should have been passed to the solve phase unchanged).</td>
14119 <div class="tablenoborder">
14120 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Output</strong></span><tbody class="tbody">
14122 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">csrValM</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">&lt;type&gt; matrix containing the incomplete-LU lower and upper triangular factors.</td>
14128 <div class="tablenoborder">
14129 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Status Returned</strong></span><tbody class="tbody">
14131 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_SUCCESS</samp></td>
14132 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the operation completed successfully.</td>
14135 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_NOT_INITIALIZED</samp></td>
14136 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the library was not initialized.</td>
14139 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_ALLOC_FAILED</samp></td>
14140 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the resources could not be allocated.</td>
14143 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_INVALID_VALUE</samp></td>
14144 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">invalid parameters were passed (<samp class="ph codeph">m<0</samp>).
14148 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_ARCH_MISMATCH</samp></td>
14149 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the device does not support double precision.</td>
14152 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_EXECUTION_FAILED</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the function failed to launch on the GPU.
14156 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_INTERNAL_ERROR</samp></td>
14157 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">an internal operation failed.</td>
14160 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_MATRIX_TYPE_NOT_SUPPORTED</samp></td>
14161 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">
14164 the matrix type is not supported.
14172 <div class="topic concept nested1" id="cusparse-lt-t-gt-gtsv"><a name="cusparse-lt-t-gt-gtsv" shape="rect">
<!-- --></a><h3 class="title topictitle2"><a href="#cusparse-lt-t-gt-gtsv" name="cusparse-lt-t-gt-gtsv" shape="rect">10.3. cusparse&lt;t&gt;gtsv</a></h3>
<div class="body conbody"><pre xml:space="preserve">cusparseStatus_t
cusparseSgtsv(cusparseHandle_t handle, int m, int n,
              const float *dl, const float *d,
              const float *du, float *B, int ldb)

cusparseStatus_t
cusparseDgtsv(cusparseHandle_t handle, int m, int n,
              const double *dl, const double *d,
              const double *du, double *B, int ldb)

cusparseStatus_t
cusparseCgtsv(cusparseHandle_t handle, int m, int n,
              const cuComplex *dl, const cuComplex *d,
              const cuComplex *du, cuComplex *B, int ldb)

cusparseStatus_t
cusparseZgtsv(cusparseHandle_t handle, int m, int n,
              const cuDoubleComplex *dl, const cuDoubleComplex *d,
              const cuDoubleComplex *du, cuDoubleComplex *B, int ldb)</pre><p class="p">This function computes the solution of a tridiagonal linear system</p>
<div class="tablenoborder">
<table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="void" border="1" rules="all">
<tbody class="tbody">
<tr>
<td class="entry" align="center" valign="top" rowspan="1" colspan="1">
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi><mo>∗</mo><mi>X</mi><mo>=</mo><mi>B</mi></math>
</td>
</tr>
</tbody>
</table>
</div>
<p class="p">with multiple right-hand-sides.</p>
<p class="p">The coefficient matrix
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math> of each of these tri-diagonal linear systems is defined with three vectors corresponding to its lower (<strong class="ph b">dl</strong>), main (<strong class="ph b">d</strong>) and upper (<strong class="ph b">du</strong>) matrix diagonals, while the right-hand-sides are stored in the dense matrix
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>B</mi></math>. Notice that the solutions
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>X</mi></math> overwrite the right-hand-sides
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>B</mi></math> on exit.
</p>
<p class="p">The routine performs pivoting, which usually gives more accurate and more stable results than cusparse&lt;t&gt;gtsv_nopivot
at the expense of some execution time.
14227 <p class="p">This routine requires a significant amount of temporary extra storage (<samp class="ph codeph">min(m,8)×(3+n)×sizeof(<type>)</samp>). It is executed asynchronously with respect to the host and may return control to the application on the host before
14228 the result is ready.
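<p class="p">The storage convention for <samp class="ph codeph">dl</samp>, <samp class="ph codeph">d</samp>, <samp class="ph codeph">du</samp>, and <samp class="ph codeph">B</samp> can be illustrated with a small host-side reference. This is only a sketch under stated assumptions: <samp class="ph codeph">ref_gtsv_host</samp> is a hypothetical helper, not a CUSPARSE function, and it uses the Thomas algorithm without pivoting rather than whatever algorithm the library uses. It may still be handy for checking device results on small, well-conditioned systems.</p>

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical host-side reference, not a CUSPARSE routine.
 * Solves A*X = B for n right-hand sides with the Thomas algorithm
 * (plain elimination, no pivoting), using the storage convention above:
 *   dl[0] and du[m-1] are not part of the matrix and should be zero,
 *   B is column-major with leading dimension ldb >= max(1, m),
 *   and the solution overwrites B.
 * Returns 0 on success, -1 on invalid sizes, -2 if allocation fails. */
static int ref_gtsv_host(int m, int n, const double *dl, const double *d,
                         const double *du, double *B, int ldb)
{
    if (m < 3 || n < 0 || ldb < m) return -1;
    assert(dl != NULL && d != NULL && du != NULL);

    double *c = malloc((size_t)m * sizeof *c); /* scaled upper diagonal */
    double *p = malloc((size_t)m * sizeof *p); /* pivots after elimination */
    if (c == NULL || p == NULL) { free(c); free(p); return -2; }

    /* Eliminate the lower diagonal once; the factors are reused per column. */
    p[0] = d[0];
    for (int i = 1; i < m; ++i) {
        c[i - 1] = du[i - 1] / p[i - 1];
        p[i] = d[i] - dl[i] * c[i - 1];
    }

    for (int j = 0; j < n; ++j) {
        double *b = B + (size_t)j * (size_t)ldb; /* column j of B */
        b[0] /= p[0];
        for (int i = 1; i < m; ++i)               /* forward sweep */
            b[i] = (b[i] - dl[i] * b[i - 1]) / p[i];
        for (int i = m - 2; i >= 0; --i)          /* back substitution */
            b[i] -= c[i] * b[i + 1];
    }

    free(c);
    free(p);
    return 0;
}
```

<p class="p">For example, with <samp class="ph codeph">m=4</samp>, a main diagonal of 2, off-diagonals of -1, and the right-hand side (0, 0, 0, 5), the column is overwritten with (1, 2, 3, 4). Elements of <samp class="ph codeph">B</samp> between row <samp class="ph codeph">m</samp> and row <samp class="ph codeph">ldb</samp> are padding and are left untouched.</p>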
14230 <div class="tablenoborder">
14231 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Input</strong></span><tbody class="tbody">
14233 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">handle</samp></td>
14234 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">handle to the CUSPARSE library context.</td>
14237 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">m</samp></td>
14238 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the size of the linear system (must be ≥ 3).</td>
14241 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">n</samp></td>
14242 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">number of right-hand-sides, columns of matrix <samp class="ph codeph">B</samp>.
14246 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">dl</samp></td>
14247 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><type> dense array containing the lower diagonal of the tri-diagonal linear system. The first element of each lower diagonal
14252 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">d</samp></td>
14253 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><type> dense array containing the main diagonal of the tri-diagonal linear system.</td>
14256 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">du</samp></td>
14257 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><type> dense array containing the upper diagonal of the tri-diagonal linear system. The last element of each upper diagonal
14262 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">B</samp></td>
14263 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><type> dense right-hand-side array of dimensions <samp class="ph codeph">(ldb, n)</samp>.
14267 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">ldb</samp></td>
14268 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">leading dimension of <samp class="ph codeph">B</samp> (must be ≥
14271 <math xmlns="http://www.w3.org/1998/Math/MathML">
14272 <mo movablelimits="true">max</mo>
14273 <mrow class="MJX-TeXAtom-ORD">
14274 <mrow class="MJX-TeXAtom-ORD">
14275 <mtext>(1, m))</mtext>
14284 <div class="tablenoborder">
14285 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Output</strong></span><tbody class="tbody">
14287 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">B</samp></td>
14288 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><type> dense solution array of dimensions <samp class="ph codeph">(ldb, n)</samp>.
14294 <div class="tablenoborder">
14295 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Status Returned</strong></span><tbody class="tbody">
14297 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_SUCCESS</samp></td>
14298 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the operation completed successfully.</td>
14301 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_NOT_INITIALIZED</samp></td>
14302 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the library was not initialized.</td>
14305 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_ALLOC_FAILED</samp></td>
14306 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the resources could not be allocated.</td>
14309 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_INVALID_VALUE</samp></td>
14310 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">invalid parameters were passed (<samp class="ph codeph">m<3, n<0</samp>).
14314 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_ARCH_MISMATCH</samp></td>
14315 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the device does not support double precision.</td>
14318 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_EXECUTION_FAILED</samp></td>
14319 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the function failed to launch on the GPU.</td>
14322 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_INTERNAL_ERROR</samp></td>
14323 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">an internal operation failed.</td>
14326 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_MATRIX_TYPE_NOT_SUPPORTED</samp></td>
14327 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">
14330 the matrix type is not supported.
14338 <div class="topic concept nested1" id="cusparse-lt-t-gt-gtsv_nopivot"><a name="cusparse-lt-t-gt-gtsv_nopivot" shape="rect">
14339 <!-- --></a><h3 class="title topictitle2"><a href="#cusparse-lt-t-gt-gtsv_nopivot" name="cusparse-lt-t-gt-gtsv_nopivot" shape="rect">10.4. cusparse<t>gtsv_nopivot</a></h3>
14340 <div class="body conbody"><pre xml:space="preserve">cusparseStatus_t
14341 cusparseSgtsv_nopivot(cusparseHandle_t handle, int m, int n,
14342 const float *dl, const float *d,
14343 const float *du, float *B, int ldb)
14344 cusparseStatus_t
14345 cusparseDgtsv_nopivot(cusparseHandle_t handle, int m, int n,
14346 const double *dl, const double *d,
14347 const double *du, double *B, int ldb)
14348 cusparseStatus_t
14349 cusparseCgtsv_nopivot(cusparseHandle_t handle, int m, int n,
14350 const cuComplex *dl, const cuComplex *d,
14351 const cuComplex *du, cuComplex *B, int ldb)
14352 cusparseStatus_t
14353 cusparseZgtsv_nopivot(cusparseHandle_t handle, int m, int n,
14354 const cuDoubleComplex *dl, const cuDoubleComplex *d,
14355 const cuDoubleComplex *du, cuDoubleComplex *B, int ldb)</pre><p class="p">This function computes the solution of a tridiagonal linear system</p>
14356 <div class="tablenoborder">
14357 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="void" border="1" rules="all">
14358 <tbody class="tbody">
14360 <td class="entry" align="center" valign="top" rowspan="1" colspan="1">
14361 <math xmlns="http://www.w3.org/1998/Math/MathML">
14375 <p class="p">with multiple right-hand-sides.</p>
14376 <p class="p">The coefficient matrix
14377 <math xmlns="http://www.w3.org/1998/Math/MathML">
14379 </math> of each of these tri-diagonal linear systems is defined by three vectors corresponding to its lower (<strong class="ph b">dl</strong>), main (<strong class="ph b">d</strong>), and upper (<strong class="ph b">du</strong>) matrix diagonals, while the right-hand-sides are stored in the dense matrix
14380 <math xmlns="http://www.w3.org/1998/Math/MathML">
14382 </math>. Notice that the solutions
14383 <math xmlns="http://www.w3.org/1998/Math/MathML">
14385 </math> overwrite the right-hand-sides
14386 <math xmlns="http://www.w3.org/1998/Math/MathML">
14390 <p class="p">The routine does not perform any pivoting and uses a combination of the Cyclic Reduction (CR) and Parallel Cyclic Reduction
14391 (PCR) algorithms to find the solution. It achieves better performance when <samp class="ph codeph">m</samp> is a power of 2.
14393 <p class="p">This routine requires a significant amount of temporary extra storage (<samp class="ph codeph">m×(3+n)×sizeof(<type>)</samp>). It is executed asynchronously with respect to the host and may return control to the application on the host before
14394 the result is ready.
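<p class="p">As a worked example of the storage estimate, the sketch below evaluates the formula quoted for the non-pivoting routine next to the one quoted for the pivoting routine, and tests whether <samp class="ph codeph">m</samp> is a power of 2. The helper names are hypothetical and are not part of the library; <samp class="ph codeph">elem_size</samp> stands in for the size of the element type, and the formulas are taken at face value from the text.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Estimate quoted for the non-pivoting solver: m x (3+n) x sizeof(<type>). */
static size_t gtsv_nopivot_temp_bytes(int m, int n, size_t elem_size)
{
    assert(m >= 3 && n >= 0);
    return (size_t)m * (3u + (size_t)n) * elem_size;
}

/* Estimate quoted for the pivoting solver: min(m,8) x (3+n) x sizeof(<type>). */
static size_t gtsv_temp_bytes(int m, int n, size_t elem_size)
{
    assert(m >= 3 && n >= 0);
    size_t rows = (m < 8) ? (size_t)m : 8u;
    return rows * (3u + (size_t)n) * elem_size;
}

/* Returns 1 when m is a power of 2, the case in which the CR/PCR-based
 * routine is said to perform best, and 0 otherwise. */
static int is_power_of_two(int m)
{
    return m > 0 && (m & (m - 1)) == 0;
}
```

<p class="p">For <samp class="ph codeph">m=1024</samp>, <samp class="ph codeph">n=4</samp>, and single precision (4 bytes per element), the estimates are 28672 bytes for the non-pivoting routine and 224 bytes for the pivoting one.</p>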
14396 <div class="tablenoborder">
14397 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Input</strong></span><tbody class="tbody">
14399 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">handle</samp></td>
14400 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">handle to the CUSPARSE library context.</td>
14403 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">m</samp></td>
14404 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the size of the linear system (must be ≥ 3).</td>
14407 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">n</samp></td>
14408 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">number of right-hand-sides, columns of matrix <samp class="ph codeph">B</samp>.
14412 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">dl</samp></td>
14413 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><type> dense array containing the lower diagonal of the tri-diagonal linear system. The first element of each lower diagonal
14418 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">d</samp></td>
14419 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><type> dense array containing the main diagonal of the tri-diagonal linear system.</td>
14422 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">du</samp></td>
14423 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><type> dense array containing the upper diagonal of the tri-diagonal linear system. The last element of each upper diagonal
14428 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">B</samp></td>
14429 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><type> dense right-hand-side array of dimensions <samp class="ph codeph">(ldb, n)</samp>.
14433 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">ldb</samp></td>
14434 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">leading dimension of <samp class="ph codeph">B</samp> (must be ≥
14437 <math xmlns="http://www.w3.org/1998/Math/MathML">
14438 <mo movablelimits="true">max</mo>
14439 <mrow class="MJX-TeXAtom-ORD">
14440 <mrow class="MJX-TeXAtom-ORD">
14441 <mtext>(1, m))</mtext>
14450 <div class="tablenoborder">
14451 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Output</strong></span><tbody class="tbody">
14453 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">B</samp></td>
14454 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><type> dense solution array of dimensions <samp class="ph codeph">(ldb, n)</samp>.
14460 <div class="tablenoborder">
14461 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Status Returned</strong></span><tbody class="tbody">
14463 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_SUCCESS</samp></td>
14464 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the operation completed successfully.</td>
14467 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_NOT_INITIALIZED</samp></td>
14468 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the library was not initialized.</td>
14471 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_ALLOC_FAILED</samp></td>
14472 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the resources could not be allocated.</td>
14475 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_INVALID_VALUE</samp></td>
14476 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">invalid parameters were passed (<samp class="ph codeph">m<3, n<0</samp>).
14480 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_ARCH_MISMATCH</samp></td>
14481 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the device does not support double precision.</td>
14484 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_EXECUTION_FAILED</samp></td>
14485 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the function failed to launch on the GPU.</td>
14488 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_INTERNAL_ERROR</samp></td>
14489 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">an internal operation failed.</td>
14492 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_MATRIX_TYPE_NOT_SUPPORTED</samp></td>
14493 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">
14496 the matrix type is not supported.
14504 <div class="topic concept nested1" id="cusparse-lt-t-gt-gtsvstridedbatch"><a name="cusparse-lt-t-gt-gtsvstridedbatch" shape="rect">
14505 <!-- --></a><h3 class="title topictitle2"><a href="#cusparse-lt-t-gt-gtsvstridedbatch" name="cusparse-lt-t-gt-gtsvstridedbatch" shape="rect">10.5. cusparse<t>gtsvStridedBatch</a></h3>
14506 <div class="body conbody"><pre xml:space="preserve">cusparseStatus_t
14507 cusparseSgtsvStridedBatch(cusparseHandle_t handle, int m,
14508 const float *dl,
14509 const float *d,
14510 const float *du, float *x,
14511 int batchCount, int batchStride)
14512 cusparseStatus_t
14513 cusparseDgtsvStridedBatch(cusparseHandle_t handle, int m,
14514 const double *dl,
14515 const double *d,
14516 const double *du, double *x,
14517 int batchCount, int batchStride)
14518 cusparseStatus_t
14519 cusparseCgtsvStridedBatch(cusparseHandle_t handle, int m,
14520 const cuComplex *dl,
14521 const cuComplex *d,
14522 const cuComplex *du, cuComplex *x,
14523 int batchCount, int batchStride)
14524 cusparseStatus_t
14525 cusparseZgtsvStridedBatch(cusparseHandle_t handle, int m,
14526 const cuDoubleComplex *dl,
14527 const cuDoubleComplex *d,
14528 const cuDoubleComplex *du, cuDoubleComplex *x,
14529 int batchCount, int batchStride)</pre><p class="p">This function computes the solution of multiple tridiagonal linear systems</p>
14530 <div class="tablenoborder">
14531 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="void" border="1" rules="all">
14532 <tbody class="tbody">
14534 <td class="entry" align="center" valign="top" rowspan="1" colspan="1">
14535 <math xmlns="http://www.w3.org/1998/Math/MathML">
14538 <mrow class="MJX-TeXAtom-ORD">
14539 <mo stretchy="false">(</mo>
14541 <mo stretchy="false">)</mo>
14546 <mrow class="MJX-TeXAtom-ORD">
14547 <mrow class="MJX-TeXAtom-ORD">
14551 <mrow class="MJX-TeXAtom-ORD">
14552 <mo stretchy="false">(</mo>
14554 <mo stretchy="false">)</mo>
14561 <mrow class="MJX-TeXAtom-ORD">
14562 <mrow class="MJX-TeXAtom-ORD">
14566 <mrow class="MJX-TeXAtom-ORD">
14567 <mo stretchy="false">(</mo>
14569 <mo stretchy="false">)</mo>
14578 <p class="p">for <em class="ph i">i</em>=0,…,<samp class="ph codeph">batchCount</samp>−1.
14580 <p class="p">The coefficient matrix
14581 <math xmlns="http://www.w3.org/1998/Math/MathML">
14583 </math> of each of these tri-diagonal linear systems is defined by three vectors corresponding to its lower (<strong class="ph b">dl</strong>), main (<strong class="ph b">d</strong>), and upper (<strong class="ph b">du</strong>) matrix diagonals, while the right-hand-side is stored in the vector <samp class="ph codeph">x</samp>. Note that the solution <samp class="ph codeph">y</samp> overwrites the right-hand-side <samp class="ph codeph">x</samp> on exit. The different matrices are assumed to be of the same size and are stored with a fixed <samp class="ph codeph">batchStride</samp> in memory.
14585 <p class="p">The routine does not perform any pivoting and uses a combination of the Cyclic Reduction (CR) and Parallel Cyclic Reduction
14586 (PCR) algorithms to find the solution. It achieves better performance when <samp class="ph codeph">m</samp> is a power of 2.
14588 <p class="p">This routine requires a significant amount of temporary extra storage (<samp class="ph codeph">batchCount×(4×m+2048)×sizeof(<type>)</samp>). It is executed asynchronously with respect to the host and may return control to the application on the host before
14589 the result is ready.
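<p class="p">The strided layout can be sketched on the host as follows. The helpers are hypothetical and not part of CUSPARSE: <samp class="ph codeph">pack_strided</samp> copies one length-<samp class="ph codeph">m</samp> vector per system into a single buffer so that system <em class="ph i">i</em> starts at offset <samp class="ph codeph">batchStride×i</samp>, <samp class="ph codeph">check_strided_batch</samp> tests the constraints stated in this section, and <samp class="ph codeph">gtsv_strided_batch_temp_bytes</samp> evaluates the storage estimate quoted above. Single precision is assumed for brevity.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Copy one vector of length m per system into dst so that system i
 * starts at dst + batchStride*i. dst needs batchCount*batchStride elements. */
static void pack_strided(int m, int batchCount, int batchStride,
                         const float *const *src, float *dst)
{
    assert(batchStride >= m);
    for (int i = 0; i < batchCount; ++i)
        for (int k = 0; k < m; ++k)
            dst[(size_t)i * (size_t)batchStride + (size_t)k] = src[i][k];
}

/* Returns 1 when the documented constraints hold: m >= 3, batchCount > 0,
 * batchStride >= m, first element of every lower diagonal is zero, and
 * last element of every upper diagonal is zero. Returns 0 otherwise. */
static int check_strided_batch(int m, int batchCount, int batchStride,
                               const float *dl, const float *du)
{
    if (m < 3 || batchCount <= 0 || batchStride < m) return 0;
    for (int i = 0; i < batchCount; ++i) {
        const float *dl_i = dl + (size_t)i * (size_t)batchStride;
        const float *du_i = du + (size_t)i * (size_t)batchStride;
        if (dl_i[0] != 0.0f || du_i[m - 1] != 0.0f) return 0;
    }
    return 1;
}

/* Estimate quoted above: batchCount x (4*m + 2048) x sizeof(<type>). */
static size_t gtsv_strided_batch_temp_bytes(int m, int batchCount,
                                            size_t elem_size)
{
    return (size_t)batchCount * (4u * (size_t)m + 2048u) * elem_size;
}
```

<p class="p">With <samp class="ph codeph">m=3</samp>, <samp class="ph codeph">batchCount=2</samp>, and <samp class="ph codeph">batchStride=4</samp>, the second system starts at element 4 of each array, element 3 is unused padding, and the storage estimate for single precision is 2×(4×3+2048)×4 = 16480 bytes.</p>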
14591 <div class="tablenoborder">
14592 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Input</strong></span><tbody class="tbody">
14594 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">handle</samp></td>
14595 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">handle to the CUSPARSE library context.</td>
14598 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">m</samp></td>
14599 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the size of the linear system (must be ≥ 3).</td>
14602 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">dl</samp></td>
14603 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><type> dense array containing the lower diagonal of the tri-diagonal linear system. The lower diagonal
14606 <math xmlns="http://www.w3.org/1998/Math/MathML">
14610 <mrow class="MJX-TeXAtom-ORD">
14611 <mo stretchy="false">(</mo>
14613 <mo stretchy="false">)</mo>
14618 that corresponds to the <em class="ph i">i</em>th linear system starts at location <samp class="ph codeph">dl+batchStride×i</samp> in memory. Also, the first element of each lower diagonal must be zero.
14622 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">d</samp></td>
14623 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><type> dense array containing the main diagonal of the tri-diagonal linear system. The main diagonal
14626 <math xmlns="http://www.w3.org/1998/Math/MathML">
14629 <mrow class="MJX-TeXAtom-ORD">
14630 <mo stretchy="false">(</mo>
14632 <mo stretchy="false">)</mo>
14637 that corresponds to the <em class="ph i">i</em>th linear system starts at location <samp class="ph codeph">d+batchStride×i</samp> in memory.
14641 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">du</samp></td>
14642 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><type> dense array containing the upper diagonal of the tri-diagonal linear system. The upper diagonal
14645 <math xmlns="http://www.w3.org/1998/Math/MathML">
14649 <mrow class="MJX-TeXAtom-ORD">
14650 <mo stretchy="false">(</mo>
14652 <mo stretchy="false">)</mo>
14657 that corresponds to the <em class="ph i">i</em>th linear system starts at location <samp class="ph codeph">du+batchStride×i</samp> in memory. Also, the last element of each upper diagonal must be zero.
14661 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">x</samp></td>
14662 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><type> dense array that contains the right-hand-side of the tri-diagonal linear system. The right-hand-side
14665 <math xmlns="http://www.w3.org/1998/Math/MathML">
14668 <mrow class="MJX-TeXAtom-ORD">
14669 <mo stretchy="false">(</mo>
14671 <mo stretchy="false">)</mo>
14676 that corresponds to the <em class="ph i">i</em>th linear system starts at location <samp class="ph codeph">x+batchStride×i</samp> in memory.
14680 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">batchCount</samp></td>
14681 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">number of systems to solve.</td>
14684 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">batchStride</samp></td>
14685 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">stride (number of elements) that separates the vectors of every system (must be at least <samp class="ph codeph">m</samp>).
14691 <div class="tablenoborder">
14692 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Output</strong></span><tbody class="tbody">
14694 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">x</samp></td>
14695 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><type> dense array that contains the solution of the tri-diagonal linear system. The solution
14698 <math xmlns="http://www.w3.org/1998/Math/MathML">
14701 <mrow class="MJX-TeXAtom-ORD">
14702 <mo stretchy="false">(</mo>
14704 <mo stretchy="false">)</mo>
14709 that corresponds to the <em class="ph i">i</em>th linear system starts at location <samp class="ph codeph">x+batchStride×i</samp> in memory.
14715 <div class="tablenoborder">
14716 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Status Returned</strong></span><tbody class="tbody">
14718 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_SUCCESS</samp></td>
14719 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the operation completed successfully.</td>
14722 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_NOT_INITIALIZED</samp></td>
14723 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the library was not initialized.</td>
14726 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_ALLOC_FAILED</samp></td>
14727 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the resources could not be allocated.</td>
14730 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_INVALID_VALUE</samp></td>
14731 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">invalid parameters were passed (<samp class="ph codeph">m<3, batchCount≤0, batchStride<m</samp>).
14735 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_ARCH_MISMATCH</samp></td>
14736 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the device does not support double precision.</td>
14739 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_EXECUTION_FAILED</samp></td>
14740 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the function failed to launch on the GPU.</td>
14743 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_INTERNAL_ERROR</samp></td>
14744 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">an internal operation failed.</td>
14752 <div class="topic concept nested0" id="cusparse-format-conversion-reference"><a name="cusparse-format-conversion-reference" shape="rect">
14753 <!-- --></a><h2 class="title topictitle1"><a href="#cusparse-format-conversion-reference" name="cusparse-format-conversion-reference" shape="rect">11. CUSPARSE Format Conversion Reference</a></h2>
14754 <div class="body conbody">
14755 <p class="p">This chapter describes the conversion routines between different sparse and dense storage formats.</p>
14757 <div class="topic concept nested1" id="cusparse-lt-t-gt-bsr2csr"><a name="cusparse-lt-t-gt-bsr2csr" shape="rect">
14758 <!-- --></a><h3 class="title topictitle2"><a href="#cusparse-lt-t-gt-bsr2csr" name="cusparse-lt-t-gt-bsr2csr" shape="rect">11.1. cusparse<t>bsr2csr</a></h3>
14759 <div class="body conbody"><pre xml:space="preserve">cusparseStatus_t
14760 cusparseSbsr2csr(cusparseHandle_t handle, cusparseDirection_t dirA,
14761 int mb, int nb,
14762 const cusparseMatDescr_t descrA, const float *bsrValA,
14763 const int *bsrRowPtrA, const int *bsrColIndA,
14764 int blockDim,
14765 const cusparseMatDescr_t descrC,
14766 float *csrValC, int *csrRowPtrC, int *csrColIndC)
14767 cusparseStatus_t
14768 cusparseDbsr2csr(cusparseHandle_t handle, cusparseDirection_t dirA,
14769 int mb, int nb,
14770 const cusparseMatDescr_t descrA, const double *bsrValA,
14771 const int *bsrRowPtrA, const int *bsrColIndA,
14772 int blockDim,
14773 const cusparseMatDescr_t descrC,
14774 double *csrValC, int *csrRowPtrC, int *csrColIndC)
14775 cusparseStatus_t
14776 cusparseCbsr2csr(cusparseHandle_t handle, cusparseDirection_t dirA,
14777 int mb, int nb,
14778 const cusparseMatDescr_t descrA, const cuComplex *bsrValA,
14779 const int *bsrRowPtrA, const int *bsrColIndA,
14780 int blockDim,
14781 const cusparseMatDescr_t descrC,
14782 cuComplex *csrValC, int *csrRowPtrC, int *csrColIndC)
14783 cusparseStatus_t
14784 cusparseZbsr2csr(cusparseHandle_t handle, cusparseDirection_t dirA,
14785 int mb, int nb,
14786 const cusparseMatDescr_t descrA, const cuDoubleComplex *bsrValA,
14787 const int *bsrRowPtrA, const int *bsrColIndA,
14788 int blockDim,
14789 const cusparseMatDescr_t descrC,
14790 cuDoubleComplex *csrValC, int *csrRowPtrC, int *csrColIndC)</pre><p class="p">This function converts a sparse matrix in BSR format (that is defined by the three arrays <samp class="ph codeph">bsrValA</samp>, <samp class="ph codeph">bsrRowPtrA</samp>, and <samp class="ph codeph">bsrColIndA</samp>) into a sparse matrix in CSR format (that is defined by arrays <samp class="ph codeph">csrValC</samp>, <samp class="ph codeph">csrRowPtrC</samp>, and <samp class="ph codeph">csrColIndC</samp>).
14795 <math xmlns="http://www.w3.org/1998/Math/MathML">
14797 <mo stretchy="false">(</mo>
14810 <mo stretchy="false">)</mo>
14813 be the number of rows of
14814 <math xmlns="http://www.w3.org/1998/Math/MathML">
14819 <math xmlns="http://www.w3.org/1998/Math/MathML">
14821 <mo stretchy="false">(</mo>
14834 <mo stretchy="false">)</mo>
14837 be the number of columns of
14838 <math xmlns="http://www.w3.org/1998/Math/MathML">
14841 <math xmlns="http://www.w3.org/1998/Math/MathML">
14844 <math xmlns="http://www.w3.org/1998/Math/MathML">
14846 </math> are <samp class="ph codeph">m×n</samp> sparse matrices. The BSR format of
14847 <math xmlns="http://www.w3.org/1998/Math/MathML">
14852 <math xmlns="http://www.w3.org/1998/Math/MathML">
14857 <mo stretchy="false">(</mo>
14859 </math><samp class="ph codeph">bsrRowPtrA(mb) − bsrRowPtrA(0)</samp><math xmlns="http://www.w3.org/1998/Math/MathML">
14860 <mo stretchy="false">)</mo>
14863 non-zero blocks whereas sparse matrix
14864 <math xmlns="http://www.w3.org/1998/Math/MathML">
14869 <math xmlns="http://www.w3.org/1998/Math/MathML">
14873 <mo stretchy="false">(</mo>
14891 <mo stretchy="false">)</mo>
14894 elements. The user must allocate enough space for arrays <samp class="ph codeph">csrRowPtrC</samp>, <samp class="ph codeph">csrColIndC</samp>, and <samp class="ph codeph">csrValC</samp>. The requirements are:
14896 <p class="p"><samp class="ph codeph">csrRowPtrC</samp> of <samp class="ph codeph">m+1</samp> elements,
14898 <p class="p"><samp class="ph codeph">csrValC</samp> of nnz elements, and
14900 <p class="p"><samp class="ph codeph">csrColIndC</samp> of nnz elements.
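<p class="p">These sizes follow from expanding every stored block into <samp class="ph codeph">blockDim×blockDim</samp> CSR entries, including any explicit zeros inside a block. As a rough illustration of what the conversion produces, here is a host-side reference sketch. <samp class="ph codeph">ref_bsr2csr_host</samp> is a hypothetical helper, not the library implementation; it assumes zero-based indices in both formats and takes a plain flag in place of <samp class="ph codeph">cusparseDirection_t</samp>.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical host-side reference, not the library implementation.
 * Assumes zero-based indices in both formats. col_major_blocks is nonzero
 * when each block is stored column by column (the column direction) and
 * zero when each block is stored row by row.
 * Output arrays must hold m+1, nnz, and nnz elements respectively, where
 * m = mb*blockDim and nnz = nnzb*blockDim*blockDim. */
static void ref_bsr2csr_host(int col_major_blocks, int mb, int nb, int blockDim,
                             const float *bsrValA, const int *bsrRowPtrA,
                             const int *bsrColIndA,
                             float *csrValC, int *csrRowPtrC, int *csrColIndC)
{
    assert(mb >= 0 && nb >= 0 && blockDim > 0);
    int out = 0; /* next free position in csrValC / csrColIndC */
    for (int ib = 0; ib < mb; ++ib) {              /* block row */
        for (int r = 0; r < blockDim; ++r) {       /* row inside the block row */
            csrRowPtrC[ib * blockDim + r] = out;
            for (int k = bsrRowPtrA[ib]; k < bsrRowPtrA[ib + 1]; ++k) {
                assert(bsrColIndA[k] >= 0 && bsrColIndA[k] < nb);
                const float *blk = bsrValA + (size_t)k * blockDim * blockDim;
                for (int c = 0; c < blockDim; ++c) { /* column inside the block */
                    csrValC[out] = col_major_blocks ? blk[c * blockDim + r]
                                                    : blk[r * blockDim + c];
                    csrColIndC[out] = bsrColIndA[k] * blockDim + c;
                    ++out;
                }
            }
        }
    }
    csrRowPtrC[mb * blockDim] = out; /* equals nnz */
}
```

<p class="p">For a 4×4 matrix with <samp class="ph codeph">blockDim=2</samp> and two diagonal blocks, <samp class="ph codeph">nnzb</samp> is 2, so <samp class="ph codeph">csrRowPtrC</samp> needs 5 elements and <samp class="ph codeph">csrValC</samp> and <samp class="ph codeph">csrColIndC</samp> need 8 elements each.</p>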
14902 <p class="p">The general procedure is as follows:</p><pre xml:space="preserve"><span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-comment">// Given BSR format (bsrRowPtrA, bsrColIndA, bsrValA) and </span>
14903 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-comment">// blocks of BSR format are stored in column-major order.</span>
14904 cusparseDirection_t dirA = CUSPARSE_DIRECTION_COLUMN;
14905 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">int</span> m = mb*<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-attribute">blockDim</span>;
14906 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">int</span> nnzb = bsrRowPtrA[mb] - bsrRowPtrA[0]; <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-comment">// number of blocks</span>
14907 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">int</span> nnz = nnzb * <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-attribute">blockDim</span> * <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-attribute">blockDim</span>; <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-comment">// number of elements</span>
14908 cudaMalloc((<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">void</span>**)&csrRowPtrC, <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">sizeof</span>(<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">int</span>)*(m+1));
14909 cudaMalloc((<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">void</span>**)&csrColIndC, <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">sizeof</span>(<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">int</span>)*nnz);
14910 cudaMalloc((<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">void</span>**)&csrValC , <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">sizeof</span>(<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">float</span>)*nnz);
14911 cusparseSbsr2csr(handle, dirA, mb, nb,
14912 descrA,
14913 bsrValA, bsrRowPtrA, bsrColIndA,
14914 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-attribute">blockDim</span>,
14915 descrC,
14916 csrValC, csrRowPtrC, csrColIndC);</pre><div class="tablenoborder">
14917 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Input</strong></span><tbody class="tbody">
14919 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">handle</samp></td>
14920 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">handle to the CUSPARSE library context.</td>
14923 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">dirA</samp></td>
14924 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">storage format of blocks, either <samp class="ph codeph">CUSPARSE_DIRECTION_ROW</samp> or <samp class="ph codeph">CUSPARSE_DIRECTION_COLUMN</samp>.
14928 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">mb</samp></td>
14929 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">number of block rows of sparse matrix <samp class="ph codeph">A</samp>. The number of rows of sparse matrix <samp class="ph codeph">C</samp> is <dfn class="term">m</dfn> (= <dfn class="term">mb</dfn> * <dfn class="term">blockDim</dfn>).
14933 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">nb</samp></td>
14934 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">number of block columns of sparse matrix <samp class="ph codeph">A</samp>. The number of columns of sparse matrix <samp class="ph codeph">C</samp> is <dfn class="term">n</dfn> (= <dfn class="term">nb</dfn> * <dfn class="term">blockDim</dfn>).
14938 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">descrA</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the descriptor of matrix <samp class="ph codeph">A</samp>.</td>
14946 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">bsrValA</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">&lt;type&gt; array of <samp class="ph codeph">nnzb*blockDim*blockDim</samp> non-zero elements of matrix <samp class="ph codeph">A</samp>.</td>
14968 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">bsrRowPtrA</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">integer array of <samp class="ph codeph">mb+1</samp> elements that contains the start of every block row and the end of the last block row plus one.</td>
14973 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">bsrColIndA</samp></td>
14974 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">integer array of <samp class="ph codeph">nnzb</samp> column indices of the non-zero blocks of matrix <samp class="ph codeph">A</samp>.
14978 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">blockDim</samp></td>
14979 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">block dimension of sparse matrix <samp class="ph codeph">A</samp>, larger than zero.
14983 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">descrC</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the descriptor of matrix <samp class="ph codeph">C</samp>.</td>
14993 <div class="tablenoborder">
14994 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Output</strong></span><tbody class="tbody">
14996 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">csrValC</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">&lt;type&gt; array of <samp class="ph codeph">nnz</samp> (= <samp class="ph codeph">csrRowPtrC(m)</samp> - <samp class="ph codeph">csrRowPtrC(0)</samp>) non-zero elements of matrix <samp class="ph codeph">C</samp>.</td>
15013 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">csrRowPtrC</samp></td>
15014 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">integer array of <samp class="ph codeph">m+1</samp> elements that contains the start of every row and the end of the last row plus one.
15018 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">csrColIndC</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">integer array of <samp class="ph codeph">nnz</samp> column indices of the non-zero elements of matrix <samp class="ph codeph">C</samp>.</td>
15028 <div class="tablenoborder">
15029 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Status Returned</strong></span><tbody class="tbody">
15031 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_SUCCESS</samp></td>
15032 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the operation completed successfully.</td>
15035 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_NOT_INITIALIZED</samp></td>
15036 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the library was not initialized.</td>
15039 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_ALLOC_FAILED</samp></td>
15040 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the resources could not be allocated.</td>
15043 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_INVALID_VALUE</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">invalid parameters were passed (<samp class="ph codeph">mb,nb&lt;0</samp>, <samp class="ph codeph">IndexBase</samp> of <samp class="ph codeph">descrA, descrC</samp> is neither base-0 nor base-1, <samp class="ph codeph">dirA</samp> is neither row-major nor column-major, or <dfn class="term">blockDim</dfn>&lt;1).</td>
15048 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_ARCH_MISMATCH</samp></td>
15049 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the device does not support double precision.</td>
15052 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_EXECUTION_FAILED</samp></td>
15053 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the function failed to launch on the GPU</td>
15056 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_MATRIX_TYPE_NOT_SUPPORTED</samp></td>
15057 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">
15060 the matrix type is not supported.
15064 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_INTERNAL_ERROR</samp></td>
15065 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">an internal operation failed.</td>
15072 <div class="topic concept nested1" id="cusparse-lt-t-gt-coo2csr"><a name="cusparse-lt-t-gt-coo2csr" shape="rect">
<!-- --></a><h3 class="title topictitle2"><a href="#cusparse-lt-t-gt-coo2csr" name="cusparse-lt-t-gt-coo2csr" shape="rect">11.2. cusparse&lt;t&gt;coo2csr</a></h3>
15074 <div class="body conbody"><pre xml:space="preserve">cusparseStatus_t
15075 cusparseXcoo2csr(cusparseHandle_t handle, const int *cooRowInd,
15076 int nnz, int m, int *csrRowPtr, cusparseIndexBase_t idxBase)</pre><p class="p">This function converts the array containing the uncompressed row indices (corresponding to COO format) into an array of compressed
15077 row pointers (corresponding to CSR format).
15079 <p class="p">It can also be used to convert the array containing the uncompressed column indices (corresponding to COO format) into an
15080 array of column pointers (corresponding to CSC format).
15082 <p class="p">This function requires no extra storage. It is executed asynchronously with respect to the host and it may return control
15083 to the application on the host before the result is ready.
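<p class="p">As a rough host-side illustration of the compression described above, the sketch below counts the entries of each row and turns the counts into row pointers with a prefix sum. The function name <samp class="ph codeph">coo2csr_host</samp> and its integer <samp class="ph codeph">idx_base</samp> argument are hypothetical and not part of CUSPARSE; the sketch assumes the row indices are valid for an <samp class="ph codeph">m</samp>-row matrix and makes no claim about how the library implements the routine on the GPU.</p>

```c
#include <assert.h>

/* Hypothetical host-side reference, not a CUSPARSE routine.
 * Builds the m+1 compressed row pointers from nnz uncompressed row
 * indices. idx_base (0 or 1) applies to both the input indices and
 * the output pointers. */
static void coo2csr_host(const int *coo_row_ind, int nnz, int m,
                         int *csr_row_ptr, int idx_base)
{
    assert(nnz >= 0 && m >= 0);
    assert(idx_base == 0 || idx_base == 1);

    /* Count the entries of row i into slot i+1. */
    for (int i = 0; i <= m; ++i)
        csr_row_ptr[i] = 0;
    for (int k = 0; k < nnz; ++k)
        csr_row_ptr[coo_row_ind[k] - idx_base + 1] += 1;

    /* A prefix sum turns the counts into zero-based start offsets. */
    for (int i = 0; i < m; ++i)
        csr_row_ptr[i + 1] += csr_row_ptr[i];

    /* Shift the offsets to the requested index base. */
    for (int i = 0; i <= m; ++i)
        csr_row_ptr[i] += idx_base;
}
```

<p class="p">Passing the uncompressed column indices and the number of columns instead gives the column pointers of the CSC format in the same way.</p>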
15085 <div class="tablenoborder">
15086 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Input</strong></span><tbody class="tbody">
15088 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">handle</samp></td>
15089 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">handle to the CUSPARSE library context.</td>
15092 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">cooRowInd</samp></td>
15093 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">integer array of <samp class="ph codeph">nnz</samp> uncompressed row indices.
15097 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">nnz</samp></td>
15098 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">number of non-zeros of the sparse matrix (that is also the length of array <samp class="ph codeph">cooRowInd</samp>).
15102 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">m</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">number of rows of matrix <samp class="ph codeph">A</samp>.</td>
15110 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">idxBase</samp></td>
15111 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_INDEX_BASE_ZERO</samp> or <samp class="ph codeph">CUSPARSE_INDEX_BASE_ONE</samp>.
15117 <div class="tablenoborder">
15118 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Output</strong></span><tbody class="tbody">
15120 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">csrRowPtr</samp></td>
15121 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">integer array of <samp class="ph codeph">m+1</samp> elements that contains the start of every row and the end of the last row plus one.
15127 <div class="tablenoborder">
15128 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Status Returned</strong></span><tbody class="tbody">
15130 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_SUCCESS</samp></td>
15131 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the operation completed successfully.</td>
15134 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_NOT_INITIALIZED</samp></td>
15135 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the library was not initialized.</td>
15138 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_INVALID_VALUE</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">idxBase</samp> is neither <samp class="ph codeph">CUSPARSE_INDEX_BASE_ZERO</samp> nor <samp class="ph codeph">CUSPARSE_INDEX_BASE_ONE</samp>.</td>
15143 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_EXECUTION_FAILED</samp></td>
15144 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the function failed to launch on the GPU</td>
15151 <div class="topic concept nested1" id="cusparse-lt-t-gt-csc2dense"><a name="cusparse-lt-t-gt-csc2dense" shape="rect">
<!-- --></a><h3 class="title topictitle2"><a href="#cusparse-lt-t-gt-csc2dense" name="cusparse-lt-t-gt-csc2dense" shape="rect">11.3. cusparse&lt;t&gt;csc2dense</a></h3>
<div class="body conbody"><pre xml:space="preserve">cusparseStatus_t
cusparseScsc2dense(cusparseHandle_t handle, int m, int n,
                   const cusparseMatDescr_t descrA,
                   const float *cscValA,
                   const int *cscRowIndA, const int *cscColPtrA,
                   float *A, int lda)
cusparseStatus_t
cusparseDcsc2dense(cusparseHandle_t handle, int m, int n,
                   const cusparseMatDescr_t descrA,
                   const double *cscValA,
                   const int *cscRowIndA, const int *cscColPtrA,
                   double *A, int lda)
cusparseStatus_t
cusparseCcsc2dense(cusparseHandle_t handle, int m, int n,
                   const cusparseMatDescr_t descrA,
                   const cuComplex *cscValA,
                   const int *cscRowIndA, const int *cscColPtrA,
                   cuComplex *A, int lda)
cusparseStatus_t
cusparseZcsc2dense(cusparseHandle_t handle, int m, int n,
                   const cusparseMatDescr_t descrA,
                   const cuDoubleComplex *cscValA,
                   const int *cscRowIndA, const int *cscColPtrA,
                   cuDoubleComplex *A, int lda)</pre><p class="p">This function converts the sparse matrix in CSC format (defined by the three arrays <samp class="ph codeph">cscValA</samp>, <samp class="ph codeph">cscColPtrA</samp> and <samp class="ph codeph">cscRowIndA</samp>) into the matrix <samp class="ph codeph">A</samp> in dense format. The dense matrix <samp class="ph codeph">A</samp> is filled in with the values of the sparse matrix and with zeros elsewhere.</p>
15178 <p class="p">This function requires no extra storage. It is executed asynchronously with respect to the host and it may return control
15179 to the application on the host before the result is ready.
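<p class="p">The expansion can be pictured with the following host-side sketch. It assumes the dense array is stored column by column with leading dimension <samp class="ph codeph">lda</samp> (consistent with the <samp class="ph codeph">(lda, n)</samp> dimensions given for <samp class="ph codeph">A</samp>), and the function name <samp class="ph codeph">csc2dense_host</samp> and its integer <samp class="ph codeph">idx_base</samp> argument are hypothetical. It is a reference for the result only, not a description of the GPU implementation.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical host-side reference, not a CUSPARSE routine.
 * Expands an m-by-n CSC matrix into a dense array stored column by
 * column with leading dimension lda >= m (an assumption of this
 * sketch). idx_base is 0 or 1. */
static void csc2dense_host(int m, int n, int idx_base,
                           const float *csc_val, const int *csc_row_ind,
                           const int *csc_col_ptr, float *A, int lda)
{
    assert(m >= 0 && n >= 0 && lda >= m);
    assert(idx_base == 0 || idx_base == 1);

    for (int j = 0; j < n; ++j) {
        float *col = A + (size_t)j * (size_t)lda;

        /* Entries that are not stored in the sparse matrix are zero. */
        for (int i = 0; i < m; ++i)
            col[i] = 0.0f;

        /* Scatter the stored entries of column j. */
        for (int k = csc_col_ptr[j] - idx_base;
             k < csc_col_ptr[j + 1] - idx_base; ++k)
            col[csc_row_ind[k] - idx_base] = csc_val[k];
    }
}
```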
15181 <div class="tablenoborder">
15182 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Input</strong></span><tbody class="tbody">
15184 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">handle</samp></td>
15185 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">handle to the CUSPARSE library context.</td>
15188 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">m</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">number of rows of matrix <samp class="ph codeph">A</samp>.</td>
15196 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">n</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">number of columns of matrix <samp class="ph codeph">A</samp>.</td>
15204 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">descrA</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the descriptor of matrix <samp class="ph codeph">A</samp>. The supported matrix type is <samp class="ph codeph">CUSPARSE_MATRIX_TYPE_GENERAL</samp>. Also, the supported index bases are <samp class="ph codeph">CUSPARSE_INDEX_BASE_ZERO</samp> and <samp class="ph codeph">CUSPARSE_INDEX_BASE_ONE</samp>.</td>
15212 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">cscValA</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">&lt;type&gt; array of <samp class="ph codeph">nnz</samp> (= <samp class="ph codeph">cscColPtrA(n)</samp> - <samp class="ph codeph">cscColPtrA(0)</samp>) non-zero elements of matrix <samp class="ph codeph">A</samp>.</td>
15229 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">cscRowIndA</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">integer array of <samp class="ph codeph">nnz</samp> (= <samp class="ph codeph">cscColPtrA(n)</samp> - <samp class="ph codeph">cscColPtrA(0)</samp>) row indices of the non-zero elements of matrix <samp class="ph codeph">A</samp>.</td>
15246 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">cscColPtrA</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">integer array of <samp class="ph codeph">n+1</samp> elements that contains the start of every column and the end of the last column plus one.</td>
15251 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">lda</samp></td>
15252 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">leading dimension of dense array <samp class="ph codeph">A</samp>.
15258 <div class="tablenoborder">
15259 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Output</strong></span><tbody class="tbody">
15261 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">A</samp></td>
15262 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">array of dimensions <samp class="ph codeph">(lda, n)</samp> that is filled in with the values of the sparse matrix.
15268 <div class="tablenoborder">
15269 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Status Returned</strong></span><tbody class="tbody">
15271 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_SUCCESS</samp></td>
15272 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the operation completed successfully.</td>
15275 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_NOT_INITIALIZED</samp></td>
15276 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the library was not initialized.</td>
15279 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_INVALID_VALUE</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">invalid parameters were passed (<samp class="ph codeph">m,n&lt;0</samp>).</td>
15284 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_ARCH_MISMATCH</samp></td>
15285 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the device does not support double precision.</td>
15288 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_EXECUTION_FAILED</samp></td>
15289 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the function failed to launch on the GPU</td>
15292 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_MATRIX_TYPE_NOT_SUPPORTED</samp></td>
15293 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">
15296 the matrix type is not supported.
15304 <div class="topic concept nested1" id="cusparse-lt-t-gt-csc2hyb"><a name="cusparse-lt-t-gt-csc2hyb" shape="rect">
<!-- --></a><h3 class="title topictitle2"><a href="#cusparse-lt-t-gt-csc2hyb" name="cusparse-lt-t-gt-csc2hyb" shape="rect">11.4. cusparse&lt;t&gt;csc2hyb</a></h3>
<div class="body conbody"><pre xml:space="preserve">cusparseStatus_t
cusparseScsc2hyb(cusparseHandle_t handle, int m, int n,
                 const cusparseMatDescr_t descrA,
                 const float *cscValA,
                 const int *cscRowIndA, const int *cscColPtrA,
                 cusparseHybMat_t hybA, int userEllWidth,
                 cusparseHybPartition_t partitionType)
cusparseStatus_t
cusparseDcsc2hyb(cusparseHandle_t handle, int m, int n,
                 const cusparseMatDescr_t descrA,
                 const double *cscValA,
                 const int *cscRowIndA, const int *cscColPtrA,
                 cusparseHybMat_t hybA, int userEllWidth,
                 cusparseHybPartition_t partitionType)
cusparseStatus_t
cusparseCcsc2hyb(cusparseHandle_t handle, int m, int n,
                 const cusparseMatDescr_t descrA,
                 const cuComplex *cscValA,
                 const int *cscRowIndA, const int *cscColPtrA,
                 cusparseHybMat_t hybA, int userEllWidth,
                 cusparseHybPartition_t partitionType)
cusparseStatus_t
cusparseZcsc2hyb(cusparseHandle_t handle, int m, int n,
                 const cusparseMatDescr_t descrA,
                 const cuDoubleComplex *cscValA,
                 const int *cscRowIndA, const int *cscColPtrA,
                 cusparseHybMat_t hybA, int userEllWidth,
                 cusparseHybPartition_t partitionType)</pre><p class="p">This function converts a sparse matrix in CSC format into a sparse matrix in HYB format. It assumes that the <samp class="ph codeph">hybA</samp> parameter has been initialized with the <samp class="ph codeph">cusparseCreateHybMat</samp> routine before calling this function.</p>
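<p class="p">When the ELL width is chosen by the caller, it can help to know how many non-zeros would not fit into the regular part. The sketch below is a hypothetical host-side helper (the name <samp class="ph codeph">hyb_overflow_count</samp> is not a CUSPARSE function) that assumes zero-based CSC input and assumes that every entry beyond the first <samp class="ph codeph">ell_width</samp> entries of a row falls outside the regular (ELL) part. It only counts entries and does not build a HYB matrix.</p>

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical illustration, not a CUSPARSE routine.
 * For an m-by-n zero-based CSC matrix, returns the number of non-zeros
 * that exceed ell_width entries per row, i.e. the entries that would
 * not fit into a regular part of that width.
 * Returns -1 if the temporary array cannot be allocated. */
static int hyb_overflow_count(int m, int n, const int *csc_row_ind,
                              const int *csc_col_ptr, int ell_width)
{
    assert(m >= 0 && n >= 0 && ell_width >= 0);

    /* Per-row non-zero counts; allocate at least one element. */
    int *row_nnz = calloc((size_t)(m > 0 ? m : 1), sizeof *row_nnz);
    if (row_nnz == NULL)
        return -1;

    for (int k = csc_col_ptr[0]; k < csc_col_ptr[n]; ++k)
        row_nnz[csc_row_ind[k]] += 1;

    int overflow = 0;
    for (int i = 0; i < m; ++i)
        if (row_nnz[i] > ell_width)
            overflow += row_nnz[i] - ell_width;

    free(row_nnz);
    return overflow;
}
```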
<p class="p">This function requires some amount of temporary storage and a significant amount of storage for the matrix in HYB format. It is executed asynchronously with respect to the host and it may return control to the application on the host before the result is ready.</p>
15339 <div class="tablenoborder">
15340 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Input</strong></span><tbody class="tbody">
15342 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">handle</samp></td>
15343 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">handle to the CUSPARSE library context.</td>
15346 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">m</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">number of rows of matrix <samp class="ph codeph">A</samp>.</td>
15354 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">n</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">number of columns of matrix <samp class="ph codeph">A</samp>.</td>
15362 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">descrA</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the descriptor of matrix <samp class="ph codeph">A</samp>. The supported matrix type is <samp class="ph codeph">CUSPARSE_MATRIX_TYPE_GENERAL</samp>. Also, the supported index bases are <samp class="ph codeph">CUSPARSE_INDEX_BASE_ZERO</samp> and <samp class="ph codeph">CUSPARSE_INDEX_BASE_ONE</samp>.</td>
15370 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">cscValA</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">&lt;type&gt; array of <samp class="ph codeph">nnz</samp> (= <samp class="ph codeph">cscColPtrA(n)</samp> - <samp class="ph codeph">cscColPtrA(0)</samp>) non-zero elements of matrix <samp class="ph codeph">A</samp>.</td>
15387 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">cscRowIndA</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">integer array of <samp class="ph codeph">nnz</samp> (= <samp class="ph codeph">cscColPtrA(n)</samp> - <samp class="ph codeph">cscColPtrA(0)</samp>) row indices of the non-zero elements of matrix <samp class="ph codeph">A</samp>.</td>
15404 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">cscColPtrA</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">integer array of <samp class="ph codeph">n+1</samp> elements that contains the start of every column and the end of the last column plus one.</td>
15409 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">userEllWidth</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">width of the regular (ELL) part of the matrix in HYB format, which should be less than the maximum number of non-zeros per row and is only required if <samp class="ph codeph">partitionType</samp> == <samp class="ph codeph">CUSPARSE_HYB_PARTITION_USER</samp>.</td>
15415 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">partitionType</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">partitioning method to be used in the conversion (see <samp class="ph codeph">cusparseHybPartition_t</samp> for details).</td>
15422 <div class="tablenoborder">
15423 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Output</strong></span><tbody class="tbody">
15425 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">hybA</samp></td>
15426 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the matrix <samp class="ph codeph">A</samp> in HYB storage format.
15432 <div class="tablenoborder">
15433 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Status Returned</strong></span><tbody class="tbody">
15435 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_SUCCESS</samp></td>
15436 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the operation completed successfully.</td>
15439 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_NOT_INITIALIZED</samp></td>
15440 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the library was not initialized.</td>
15443 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_ALLOC_FAILED</samp></td>
15444 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the resources could not be allocated.</td>
15447 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_INVALID_VALUE</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">invalid parameters were passed (<samp class="ph codeph">m,n&lt;0</samp>).</td>
15452 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_ARCH_MISMATCH</samp></td>
15453 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the device does not support double precision.</td>
15456 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_EXECUTION_FAILED</samp></td>
15457 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the function failed to launch on the GPU</td>
15460 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_INTERNAL_ERROR</samp></td>
15461 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">an internal operation failed.</td>
15464 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_MATRIX_TYPE_NOT_SUPPORTED</samp></td>
15465 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">
15468 the matrix type is not supported.
15476 <div class="topic concept nested1" id="cusparse-lt-t-gt-csr2bsr"><a name="cusparse-lt-t-gt-csr2bsr" shape="rect">
<!-- --></a><h3 class="title topictitle2"><a href="#cusparse-lt-t-gt-csr2bsr" name="cusparse-lt-t-gt-csr2bsr" shape="rect">11.5. cusparse&lt;t&gt;csr2bsr</a></h3>
<div class="body conbody"><pre xml:space="preserve">cusparseStatus_t
cusparseXcsr2bsrNnz(cusparseHandle_t handle, cusparseDirection_t dirA,
                    int m, int n,
                    const cusparseMatDescr_t descrA,
                    const int *csrRowPtrA, const int *csrColIndA,
                    int blockDim,
                    const cusparseMatDescr_t descrC,
                    int *bsrRowPtrC,
                    int *nnzTotalDevHostPtr)
cusparseStatus_t
cusparseScsr2bsr(cusparseHandle_t handle, cusparseDirection_t dirA,
                 int m, int n,
                 const cusparseMatDescr_t descrA, const float *csrValA,
                 const int *csrRowPtrA, const int *csrColIndA,
                 int blockDim,
                 const cusparseMatDescr_t descrC,
                 float *bsrValC, int *bsrRowPtrC, int *bsrColIndC)
cusparseStatus_t
cusparseDcsr2bsr(cusparseHandle_t handle, cusparseDirection_t dirA,
                 int m, int n,
                 const cusparseMatDescr_t descrA, const double *csrValA,
                 const int *csrRowPtrA, const int *csrColIndA,
                 int blockDim,
                 const cusparseMatDescr_t descrC,
                 double *bsrValC, int *bsrRowPtrC, int *bsrColIndC)
cusparseStatus_t
cusparseCcsr2bsr(cusparseHandle_t handle, cusparseDirection_t dirA,
                 int m, int n,
                 const cusparseMatDescr_t descrA, const cuComplex *csrValA,
                 const int *csrRowPtrA, const int *csrColIndA,
                 int blockDim,
                 const cusparseMatDescr_t descrC,
                 cuComplex *bsrValC, int *bsrRowPtrC, int *bsrColIndC)
cusparseStatus_t
cusparseZcsr2bsr(cusparseHandle_t handle, cusparseDirection_t dirA,
                 int m, int n,
                 const cusparseMatDescr_t descrA, const cuDoubleComplex *csrValA,
                 const int *csrRowPtrA, const int *csrColIndA,
                 int blockDim,
                 const cusparseMatDescr_t descrC,
                 cuDoubleComplex *bsrValC, int *bsrRowPtrC, int *bsrColIndC)</pre><p class="p">This function converts a sparse matrix in CSR format (defined by the three arrays <samp class="ph codeph">csrValA</samp>, <samp class="ph codeph">csrRowPtrA</samp> and <samp class="ph codeph">csrColIndA</samp>) into a sparse matrix in BSR format (defined by the arrays <samp class="ph codeph">bsrValC</samp>, <samp class="ph codeph">bsrRowPtrC</samp> and <samp class="ph codeph">bsrColIndC</samp>).</p>
<p class="p"><samp class="ph codeph">A</samp> is an <samp class="ph codeph">m×n</samp> sparse matrix and <samp class="ph codeph">C</samp> is a <samp class="ph codeph">(mb*blockDim)×(nb*blockDim)</samp> sparse matrix, where <samp class="ph codeph">mb = (m+blockDim-1)/blockDim</samp> is the number of block rows of <samp class="ph codeph">A</samp> and <samp class="ph codeph">nb = (n+blockDim-1)/blockDim</samp> is the number of block columns of <samp class="ph codeph">A</samp> (both computed with integer division). Neither <samp class="ph codeph">m</samp> nor <samp class="ph codeph">n</samp> needs to be a multiple of <samp class="ph codeph">blockDim</samp>; when one is not, the partial blocks are padded with zeros.</p>
<p class="p">CUSPARSE adopts a two-step approach to the conversion. First, the user allocates <samp class="ph codeph">bsrRowPtrC</samp> of <samp class="ph codeph">mb+1</samp> elements and uses the function <samp class="ph codeph">cusparseXcsr2bsrNnz</samp> to determine the number of non-zero block columns per block row. Second, the user obtains <samp class="ph codeph">nnzb</samp> (the number of non-zero block columns of matrix <samp class="ph codeph">A</samp>) from either <samp class="ph codeph">(nnzb=*nnzTotalDevHostPtr)</samp> or <samp class="ph codeph">(nnzb=bsrRowPtrC(mb)-bsrRowPtrC(0))</samp> and allocates <samp class="ph codeph">bsrValC</samp> of <samp class="ph codeph">nnzb*blockDim*blockDim</samp> elements and <samp class="ph codeph">bsrColIndC</samp> of <samp class="ph codeph">nnzb</samp> elements. Finally, the function <samp class="ph codeph">cusparse[S|D|C|Z]csr2bsr</samp> is called to complete the conversion.</p>
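<p class="p">To make the counting step concrete, the following host-side sketch computes the block row pointers and <samp class="ph codeph">nnzb</samp> from a CSR sparsity pattern. The function name <samp class="ph codeph">csr2bsr_nnz_host</samp> is hypothetical, the sketch assumes zero-based indices for both the input and the output, and it is only meant to show what result the first step produces, not how the library computes it on the GPU.</p>

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical host-side reference, not a CUSPARSE routine.
 * Fills bsr_row_ptr (mb+1 zero-based entries) from a zero-based CSR
 * pattern and returns nnzb, the number of non-zero blocks.
 * Returns -1 if the temporary array cannot be allocated. */
static int csr2bsr_nnz_host(int m, int n, const int *csr_row_ptr,
                            const int *csr_col_ind, int block_dim,
                            int *bsr_row_ptr)
{
    assert(m >= 0 && n >= 0 && block_dim > 0);

    int mb = (m + block_dim - 1) / block_dim;   /* block rows    */
    int nb = (n + block_dim - 1) / block_dim;   /* block columns */

    /* seen[jb] == ib + 1 means block column jb was already counted
     * for block row ib; zero-initialised, so nothing is marked yet. */
    int *seen = calloc((size_t)(nb > 0 ? nb : 1), sizeof *seen);
    if (seen == NULL)
        return -1;

    bsr_row_ptr[0] = 0;
    for (int ib = 0; ib < mb; ++ib) {
        int count = 0;
        int row_begin = ib * block_dim;
        int row_end = row_begin + block_dim < m ? row_begin + block_dim : m;

        /* Every scalar row of the block row contributes its columns. */
        for (int i = row_begin; i < row_end; ++i) {
            for (int k = csr_row_ptr[i]; k < csr_row_ptr[i + 1]; ++k) {
                int jb = csr_col_ind[k] / block_dim;
                if (seen[jb] != ib + 1) {
                    seen[jb] = ib + 1;
                    ++count;
                }
            }
        }
        bsr_row_ptr[ib + 1] = bsr_row_ptr[ib] + count;
    }

    free(seen);
    return bsr_row_ptr[mb];
}
```

<p class="p">The returned value corresponds to <samp class="ph codeph">bsrRowPtrC(mb)-bsrRowPtrC(0)</samp> and determines the sizes of <samp class="ph codeph">bsrColIndC</samp> and <samp class="ph codeph">bsrValC</samp>.</p>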
<p class="p">The general procedure is as follows:</p><pre xml:space="preserve"><span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-comment">// Given CSR format (csrRowPtrA, csrColIndA, csrValA) and </span>
<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-comment">// blocks of BSR format are stored in column-major order.</span>
cusparseDirection_t dirA = CUSPARSE_DIRECTION_COLUMN;
<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">int</span> base, nnzb;
<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">int</span> mb = (m + <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-attribute">blockDim</span>-1)/<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-attribute">blockDim</span>;
<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-comment">// nnzTotalDevHostPtr points to host memory</span>
<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">int</span> *nnzTotalDevHostPtr = &amp;nnzb;
cudaMalloc((<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">void</span>**)&amp;bsrRowPtrC, <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">sizeof</span>(<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">int</span>) *(mb+1));
<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-comment">// Step 1: fill bsrRowPtrC and count the non-zero blocks.</span>
cusparseXcsr2bsrNnz(handle, dirA, m, n,
        descrA, csrRowPtrA, csrColIndA,
        <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-attribute">blockDim</span>,
        descrC, bsrRowPtrC,
        nnzTotalDevHostPtr);
<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">if</span> (NULL != nnzTotalDevHostPtr){
    nnzb = *nnzTotalDevHostPtr;
}<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">else</span>{
    <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-comment">// nnzb = bsrRowPtrC(mb) - bsrRowPtrC(0)</span>
    cudaMemcpy(&amp;nnzb, bsrRowPtrC+mb, <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">sizeof</span>(<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">int</span>), cudaMemcpyDeviceToHost);
    cudaMemcpy(&amp;base, bsrRowPtrC , <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">sizeof</span>(<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">int</span>), cudaMemcpyDeviceToHost);
    nnzb -= base;
}
<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-comment">// Step 2: allocate the BSR arrays and run the conversion.</span>
cudaMalloc((<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">void</span>**)&amp;bsrColIndC, <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">sizeof</span>(<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">int</span>)*nnzb);
cudaMalloc((<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">void</span>**)&amp;bsrValC, <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">sizeof</span>(<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">float</span>)*(<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-attribute">blockDim</span>*<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-attribute">blockDim</span>)*nnzb);
cusparseScsr2bsr(handle, dirA, m, n,
        descrA,
        csrValA, csrRowPtrA, csrColIndA,
        <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-attribute">blockDim</span>,
        descrC,
        bsrValC, bsrRowPtrC, bsrColIndC);</pre><p class="p">If
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>blockDim</mi></math> is large (typically a block cannot fit into shared memory), then the csr2bsr routines will allocate a temporary integer array of
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>mb</mi><mo>*</mo><mi>blockDim</mi></math> integers. If device memory is not available, then <samp class="ph codeph">CUSPARSE_STATUS_ALLOC_FAILED</samp> is returned.</p>
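<p class="p">To make the counting step of the two-step conversion concrete, the following is a minimal host-side sketch of what the non-zero block count represents. It assumes zero-based indices, and <samp class="ph codeph">csr2bsr_nnz_ref</samp> is a hypothetical helper name; it is not the library's GPU implementation.</p>

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical host-side reference: fills bsrRowPtrC[0..mb] and returns nnzb,
 * the number of non-zero blocks, or -1 if the scratch array cannot be
 * allocated. Indices are assumed to be zero-based. */
static int csr2bsr_nnz_ref(int m, int n,
                           const int *csrRowPtrA, const int *csrColIndA,
                           int blockDim, int *bsrRowPtrC)
{
    assert(m >= 0 && n >= 0 && blockDim >= 1);
    int mb = (m + blockDim - 1) / blockDim;   /* number of block rows    */
    int nb = (n + blockDim - 1) / blockDim;   /* number of block columns */

    /* mark[jb] == ib + 1 means block column jb was already counted
     * while scanning block row ib. */
    int *mark = (int *)calloc((size_t)(nb > 0 ? nb : 1), sizeof(int));
    if (mark == NULL) return -1;

    bsrRowPtrC[0] = 0;
    for (int ib = 0; ib < mb; ++ib) {
        int count = 0;
        int rowEnd = (ib + 1) * blockDim < m ? (ib + 1) * blockDim : m;
        for (int i = ib * blockDim; i < rowEnd; ++i) {
            for (int k = csrRowPtrA[i]; k < csrRowPtrA[i + 1]; ++k) {
                int jb = csrColIndA[k] / blockDim;
                if (mark[jb] != ib + 1) {     /* first hit in this block row */
                    mark[jb] = ib + 1;
                    ++count;
                }
            }
        }
        bsrRowPtrC[ib + 1] = bsrRowPtrC[ib] + count;
    }
    free(mark);
    return bsrRowPtrC[mb];
}
```

<p class="p">The returned value equals <samp class="ph codeph">bsrRowPtrC(mb)-bsrRowPtrC(0)</samp>, which is the quantity used to size <samp class="ph codeph">bsrColIndC</samp> and <samp class="ph codeph">bsrValC</samp>.</p>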
<div class="tablenoborder">
<table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Input</strong></span><tbody class="tbody">
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">handle</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">handle to the CUSPARSE library context.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">dirA</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">storage format of blocks, either <samp class="ph codeph">CUSPARSE_DIRECTION_ROW</samp> or <samp class="ph codeph">CUSPARSE_DIRECTION_COLUMN</samp>.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">m</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">number of rows of sparse matrix <math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math>.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">n</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">number of columns of sparse matrix <math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math>.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">descrA</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the descriptor of matrix <math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math>.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">nnzA</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">number of non-zero elements of sparse matrix <math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math>.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">csrValA</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">&lt;type&gt; array of <samp class="ph codeph">nnz</samp> (= <samp class="ph codeph">csrRowPtrA(m)</samp> - <samp class="ph codeph">csrRowPtrA(0)</samp>) non-zero elements of matrix <math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math>.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">csrRowPtrA</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">integer array of <samp class="ph codeph">m+1</samp> elements that contains the start of every row and the end of the last row plus one.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">csrColIndA</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">integer array of <samp class="ph codeph">nnz</samp> (= <samp class="ph codeph">csrRowPtrA(m)</samp> - <samp class="ph codeph">csrRowPtrA(0)</samp>) column indices of the non-zero elements of matrix <math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math>.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">blockDim</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">block dimension of sparse matrix <samp class="ph codeph">A</samp>. The range of <math xmlns="http://www.w3.org/1998/Math/MathML"><mi>blockDim</mi></math> is between 1 and <math xmlns="http://www.w3.org/1998/Math/MathML"><mi>min</mi><mo stretchy="false">(</mo><mi>m</mi><mo>,</mo><mi>n</mi><mo stretchy="false">)</mo></math>.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">descrC</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the descriptor of matrix <math xmlns="http://www.w3.org/1998/Math/MathML"><mi>C</mi></math>.</td>
</tr>
</tbody></table>
</div>
<div class="tablenoborder">
<table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Output</strong></span><tbody class="tbody">
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">bsrValC</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">&lt;type&gt; array of <samp class="ph codeph">nnzb</samp> * <math xmlns="http://www.w3.org/1998/Math/MathML"><msup><mi>blockDim</mi><mn>2</mn></msup></math> non-zero elements of matrix <math xmlns="http://www.w3.org/1998/Math/MathML"><mi>C</mi></math>.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">bsrRowPtrC</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">integer array of <samp class="ph codeph">mb+1</samp> elements that contains the start of every block row and the end of the last block row plus one.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">bsrColIndC</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">integer array of <samp class="ph codeph">nnzb</samp> column indices of the non-zero blocks of matrix <math xmlns="http://www.w3.org/1998/Math/MathML"><mi>C</mi></math>.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">nnzTotalDevHostPtr</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">total number of non-zero blocks, stored in device or host memory. It is equal to <samp class="ph codeph">(bsrRowPtrC(mb)-bsrRowPtrC(0))</samp>.</td>
</tr>
</tbody></table>
</div>
<div class="tablenoborder">
<table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Status Returned</strong></span><tbody class="tbody">
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_SUCCESS</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the operation completed successfully.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_NOT_INITIALIZED</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the library was not initialized.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_ALLOC_FAILED</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the resources could not be allocated.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_INVALID_VALUE</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">invalid parameters were passed (<samp class="ph codeph">m,n&lt;0</samp>), the IndexBase field of <samp class="ph codeph">descrA</samp> or <samp class="ph codeph">descrC</samp> is not base-0 or base-1, <samp class="ph codeph">dirA</samp> is not row-major or column-major, or <math xmlns="http://www.w3.org/1998/Math/MathML"><mi>blockDim</mi></math> is not between 1 and <math xmlns="http://www.w3.org/1998/Math/MathML"><mi>min</mi><mo stretchy="false">(</mo><mi>m</mi><mo>,</mo><mi>n</mi><mo stretchy="false">)</mo></math>.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_ARCH_MISMATCH</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the device does not support double precision.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_INTERNAL_ERROR</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">an internal operation failed.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_MATRIX_TYPE_NOT_SUPPORTED</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the matrix type is not supported.</td>
</tr>
</tbody></table>
</div>
<div class="topic concept nested1" id="cusparse-lt-t-gt-csr2coo"><a name="cusparse-lt-t-gt-csr2coo" shape="rect">
<!-- --></a><h3 class="title topictitle2"><a href="#cusparse-lt-t-gt-csr2coo" name="cusparse-lt-t-gt-csr2coo" shape="rect">11.6. cusparse&lt;t&gt;csr2coo</a></h3>
<div class="body conbody"><pre xml:space="preserve">cusparseStatus_t
cusparseXcsr2coo(cusparseHandle_t handle, const int *csrRowPtr,
                 int nnz, int m, int *cooRowInd,
                 cusparseIndexBase_t idxBase)</pre><p class="p">This function converts the array containing the compressed row pointers (corresponding to CSR format) into an array of uncompressed
row indices (corresponding to COO format).</p>
<p class="p">It can also be used to convert the array containing the compressed column pointers (corresponding to CSC format) into an array
of uncompressed column indices (corresponding to COO format).</p>
<p class="p">This function requires no extra storage. It is executed asynchronously with respect to the host, and it may return control
to the application on the host before the result is ready.</p>
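<p class="p">The expansion from compressed row pointers to explicit row indices can be illustrated with a short host-side sketch. The name <samp class="ph codeph">csr2coo_ref</samp> is hypothetical, and the sketch assumes that the same index base applies to the pointers and to the output indices; it is an illustration of the mapping, not the library's GPU code.</p>

```c
#include <assert.h>

/* Hypothetical host-side reference: expand m+1 compressed row pointers
 * into nnz explicit row indices. base is 0 or 1 and is assumed to apply
 * to both csrRowPtr and the produced cooRowInd entries. */
static void csr2coo_ref(const int *csrRowPtr, int nnz, int m,
                        int *cooRowInd, int base)
{
    assert(m >= 0 && nnz >= 0 && (base == 0 || base == 1));
    for (int row = 0; row < m; ++row) {
        /* Entries csrRowPtr[row]-base .. csrRowPtr[row+1]-base-1
         * all belong to this row. */
        for (int k = csrRowPtr[row] - base; k < csrRowPtr[row + 1] - base; ++k) {
            assert(k >= 0 && k < nnz);
            cooRowInd[k] = row + base;
        }
    }
}
```

<p class="p">An empty row contributes no entries, so consecutive equal pointers simply skip that row index in the output.</p>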
<div class="tablenoborder">
<table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Input</strong></span><tbody class="tbody">
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">handle</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">handle to the CUSPARSE library context.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">csrRowPtr</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">integer array of <samp class="ph codeph">m+1</samp> elements that contains the start of every row and the end of the last row plus one.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">nnz</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">number of non-zeros of the sparse matrix (that is also the length of array <samp class="ph codeph">cooRowInd</samp>).</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">m</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">number of rows of matrix <math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math>.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">idxBase</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_INDEX_BASE_ZERO</samp> or <samp class="ph codeph">CUSPARSE_INDEX_BASE_ONE</samp>.</td>
</tr>
</tbody></table>
</div>
<div class="tablenoborder">
<table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Output</strong></span><tbody class="tbody">
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">cooRowInd</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">integer array of <samp class="ph codeph">nnz</samp> uncompressed row indices.</td>
</tr>
</tbody></table>
</div>
<div class="tablenoborder">
<table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Status Returned</strong></span><tbody class="tbody">
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_SUCCESS</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the operation completed successfully.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_NOT_INITIALIZED</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the library was not initialized.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_INVALID_VALUE</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">idxBase</samp> is neither <samp class="ph codeph">CUSPARSE_INDEX_BASE_ZERO</samp> nor <samp class="ph codeph">CUSPARSE_INDEX_BASE_ONE</samp>.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_EXECUTION_FAILED</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the function failed to launch on the GPU.</td>
</tr>
</tbody></table>
</div>
</div>
</div>
<div class="topic concept nested1" id="cusparse-lt-t-gt-csr2csc"><a name="cusparse-lt-t-gt-csr2csc" shape="rect">
<!-- --></a><h3 class="title topictitle2"><a href="#cusparse-lt-t-gt-csr2csc" name="cusparse-lt-t-gt-csr2csc" shape="rect">11.7. cusparse&lt;t&gt;csr2csc</a></h3>
<div class="body conbody"><pre xml:space="preserve">cusparseStatus_t
cusparseScsr2csc(cusparseHandle_t handle, int m, int n, int nnz,
                 const float *csrVal, const int *csrRowPtr,
                 const int *csrColInd, float *cscVal,
                 int *cscRowInd, int *cscColPtr,
                 cusparseAction_t copyValues,
                 cusparseIndexBase_t idxBase)

cusparseStatus_t
cusparseDcsr2csc(cusparseHandle_t handle, int m, int n, int nnz,
                 const double *csrVal, const int *csrRowPtr,
                 const int *csrColInd, double *cscVal,
                 int *cscRowInd, int *cscColPtr,
                 cusparseAction_t copyValues,
                 cusparseIndexBase_t idxBase)

cusparseStatus_t
cusparseCcsr2csc(cusparseHandle_t handle, int m, int n, int nnz,
                 const cuComplex *csrVal, const int *csrRowPtr,
                 const int *csrColInd, cuComplex *cscVal,
                 int *cscRowInd, int *cscColPtr,
                 cusparseAction_t copyValues,
                 cusparseIndexBase_t idxBase)

cusparseStatus_t
cusparseZcsr2csc(cusparseHandle_t handle, int m, int n, int nnz,
                 const cuDoubleComplex *csrVal, const int *csrRowPtr,
                 const int *csrColInd, cuDoubleComplex *cscVal,
                 int *cscRowInd, int *cscColPtr,
                 cusparseAction_t copyValues,
                 cusparseIndexBase_t idxBase)</pre><p class="p">This function converts a sparse matrix in CSR format (that is defined by the three arrays <samp class="ph codeph">csrVal</samp>, <samp class="ph codeph">csrRowPtr</samp>, and <samp class="ph codeph">csrColInd</samp>) into a sparse matrix in CSC format (that is defined by arrays <samp class="ph codeph">cscVal</samp>, <samp class="ph codeph">cscRowInd</samp>, and <samp class="ph codeph">cscColPtr</samp>). The resulting matrix can also be seen as the transpose of the original sparse matrix. Notice that this routine can also
be used to convert a matrix in CSC format into a matrix in CSR format.</p>
<p class="p">This function requires a significant amount of extra storage that is proportional to the matrix size. It is executed asynchronously
with respect to the host, and it may return control to the application on the host before the result is ready.</p>
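<p class="p">As an illustration of the conversion described above, the following host-side sketch builds the CSC arrays with a count, prefix-sum, and scatter pass. It assumes zero-based indices and single precision, and <samp class="ph codeph">csr2csc_ref</samp> is a hypothetical name; the library's GPU algorithm may differ.</p>

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical host-side reference for CSR -> CSC with zero-based indices.
 * Returns 0 on success, -1 if the scratch array cannot be allocated. */
static int csr2csc_ref(int m, int n, int nnz,
                       const float *csrVal, const int *csrRowPtr,
                       const int *csrColInd,
                       float *cscVal, int *cscRowInd, int *cscColPtr)
{
    assert(m >= 0 && n >= 0 && nnz >= 0);
    int *next = (int *)malloc(sizeof(int) * (size_t)(n > 0 ? n : 1));
    if (next == NULL) return -1;

    /* 1. Count the entries that fall into each column. */
    for (int j = 0; j <= n; ++j) cscColPtr[j] = 0;
    for (int k = 0; k < nnz; ++k) cscColPtr[csrColInd[k] + 1]++;

    /* 2. A prefix sum turns the counts into column start offsets. */
    for (int j = 0; j < n; ++j) cscColPtr[j + 1] += cscColPtr[j];
    for (int j = 0; j < n; ++j) next[j] = cscColPtr[j];

    /* 3. Scatter; visiting rows in order keeps the row indices
     *    sorted within every column. */
    for (int i = 0; i < m; ++i) {
        for (int k = csrRowPtr[i]; k < csrRowPtr[i + 1]; ++k) {
            int dst = next[csrColInd[k]]++;
            cscRowInd[dst] = i;
            cscVal[dst] = csrVal[k];
        }
    }
    free(next);
    return 0;
}
```

<p class="p">Reading the three output arrays as a CSR matrix yields the transpose of the input, which is why the same routine also serves for the CSC to CSR direction.</p>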
<div class="tablenoborder">
<table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Input</strong></span><tbody class="tbody">
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">handle</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">handle to the CUSPARSE library context.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">m</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">number of rows of matrix <math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math>.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">n</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">number of columns of matrix <math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math>.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">nnz</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">number of non-zero elements of matrix <math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math>.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">csrVal</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">&lt;type&gt; array of <samp class="ph codeph">nnz</samp> (= <samp class="ph codeph">csrRowPtr(m)</samp> - <samp class="ph codeph">csrRowPtr(0)</samp>) non-zero elements of matrix <math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math>.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">csrRowPtr</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">integer array of <samp class="ph codeph">m+1</samp> elements that contains the start of every row and the end of the last row plus one.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">csrColInd</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">integer array of <samp class="ph codeph">nnz</samp> (= <samp class="ph codeph">csrRowPtr(m)</samp> - <samp class="ph codeph">csrRowPtr(0)</samp>) column indices of the non-zero elements of matrix <math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math>.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">copyValues</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_ACTION_SYMBOLIC</samp> or <samp class="ph codeph">CUSPARSE_ACTION_NUMERIC</samp>.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">idxBase</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_INDEX_BASE_ZERO</samp> or <samp class="ph codeph">CUSPARSE_INDEX_BASE_ONE</samp>.</td>
</tr>
</tbody></table>
</div>
<div class="tablenoborder">
<table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Output</strong></span><tbody class="tbody">
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">cscVal</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">&lt;type&gt; array of <samp class="ph codeph">nnz</samp> (= <samp class="ph codeph">cscColPtr(n)</samp> - <samp class="ph codeph">cscColPtr(0)</samp>) non-zero elements of matrix <math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math>. It is only filled in if <samp class="ph codeph">copyValues</samp> is set to <samp class="ph codeph">CUSPARSE_ACTION_NUMERIC</samp>.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">cscRowInd</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">integer array of <samp class="ph codeph">nnz</samp> (= <samp class="ph codeph">cscColPtr(n)</samp> - <samp class="ph codeph">cscColPtr(0)</samp>) row indices of the non-zero elements of matrix <math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math>.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">cscColPtr</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">integer array of <samp class="ph codeph">n+1</samp> elements that contains the start of every column and the end of the last column plus one.</td>
</tr>
</tbody></table>
</div>
<div class="tablenoborder">
<table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Status Returned</strong></span><tbody class="tbody">
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_SUCCESS</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the operation completed successfully.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_NOT_INITIALIZED</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the library was not initialized.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_ALLOC_FAILED</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the resources could not be allocated.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_INVALID_VALUE</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">invalid parameters were passed (<samp class="ph codeph">m,n,nnz&lt;0</samp>).</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_ARCH_MISMATCH</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the device does not support double precision.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_EXECUTION_FAILED</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the function failed to launch on the GPU.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_INTERNAL_ERROR</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">an internal operation failed.</td>
</tr>
</tbody></table>
</div>
</div>
</div>
<div class="topic concept nested1" id="cusparse-lt-t-gt-csr2dense"><a name="cusparse-lt-t-gt-csr2dense" shape="rect">
<!-- --></a><h3 class="title topictitle2"><a href="#cusparse-lt-t-gt-csr2dense" name="cusparse-lt-t-gt-csr2dense" shape="rect">11.8. cusparse&lt;t&gt;csr2dense</a></h3>
<div class="body conbody"><pre xml:space="preserve">cusparseStatus_t
cusparseScsr2dense(cusparseHandle_t handle, int m, int n,
                   const cusparseMatDescr_t descrA,
                   const float *csrValA,
                   const int *csrRowPtrA, const int *csrColIndA,
                   float *A, int lda)

cusparseStatus_t
cusparseDcsr2dense(cusparseHandle_t handle, int m, int n,
                   const cusparseMatDescr_t descrA,
                   const double *csrValA,
                   const int *csrRowPtrA, const int *csrColIndA,
                   double *A, int lda)

cusparseStatus_t
cusparseCcsr2dense(cusparseHandle_t handle, int m, int n,
                   const cusparseMatDescr_t descrA,
                   const cuComplex *csrValA,
                   const int *csrRowPtrA, const int *csrColIndA,
                   cuComplex *A, int lda)

cusparseStatus_t
cusparseZcsr2dense(cusparseHandle_t handle, int m, int n,
                   const cusparseMatDescr_t descrA,
                   const cuDoubleComplex *csrValA,
                   const int *csrRowPtrA, const int *csrColIndA,
                   cuDoubleComplex *A, int lda)</pre><p class="p">This function converts the sparse matrix in CSR format (that is defined by the three arrays <samp class="ph codeph">csrValA</samp>, <samp class="ph codeph">csrRowPtrA</samp>, and <samp class="ph codeph">csrColIndA</samp>) into the matrix <samp class="ph codeph">A</samp> in dense format. The dense matrix <samp class="ph codeph">A</samp> is filled in with the values of the sparse matrix and with zeros elsewhere.</p>
<p class="p">This function requires no extra storage. It is executed asynchronously with respect to the host, and it may return control
to the application on the host before the result is ready.</p>
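<p class="p">The role of <samp class="ph codeph">lda</samp> in the dense output can be illustrated with a short host-side sketch. It assumes zero-based indices, single precision, and column-major storage in which element (i, j) is located at <samp class="ph codeph">A[i + j*lda]</samp>; <samp class="ph codeph">csr2dense_ref</samp> is a hypothetical name and not part of the library.</p>

```c
#include <assert.h>

/* Hypothetical host-side reference for CSR -> dense, zero-based indices.
 * A is column-major with leading dimension lda >= m; rows m..lda-1 of
 * each column are padding and are left untouched. */
static void csr2dense_ref(int m, int n,
                          const float *csrValA, const int *csrRowPtrA,
                          const int *csrColIndA, float *A, int lda)
{
    assert(m >= 0 && n >= 0 && lda >= m);

    /* Zero the m-by-n logical matrix first. */
    for (int j = 0; j < n; ++j)
        for (int i = 0; i < m; ++i)
            A[i + j * lda] = 0.0f;

    /* Scatter every stored entry to its (row, column) position. */
    for (int i = 0; i < m; ++i)
        for (int k = csrRowPtrA[i]; k < csrRowPtrA[i + 1]; ++k)
            A[i + csrColIndA[k] * lda] = csrValA[k];
}
```

<p class="p">With <samp class="ph codeph">lda</samp> larger than <samp class="ph codeph">m</samp>, consecutive columns are separated by unused padding rows, which the sketch does not write.</p>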
<div class="tablenoborder">
<table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Input</strong></span><tbody class="tbody">
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">handle</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">handle to the CUSPARSE library context.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">m</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">number of rows of matrix <math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math>.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">n</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">number of columns of matrix <math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math>.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">descrA</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the descriptor of matrix <math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math>. The supported matrix type is <samp class="ph codeph">CUSPARSE_MATRIX_TYPE_GENERAL</samp>. Also, the supported index bases are <samp class="ph codeph">CUSPARSE_INDEX_BASE_ZERO</samp> and <samp class="ph codeph">CUSPARSE_INDEX_BASE_ONE</samp>.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">csrValA</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">&lt;type&gt; array of <samp class="ph codeph">nnz</samp> (= <samp class="ph codeph">csrRowPtrA(m)</samp> - <samp class="ph codeph">csrRowPtrA(0)</samp>) non-zero elements of matrix <math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math>.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">csrRowPtrA</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">integer array of <samp class="ph codeph">m+1</samp> elements that contains the start of every row and the end of the last row plus one.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">csrColIndA</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">integer array of <samp class="ph codeph">nnz</samp> (= <samp class="ph codeph">csrRowPtrA(m)</samp> - <samp class="ph codeph">csrRowPtrA(0)</samp>) column indices of the non-zero elements of matrix <math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math>.</td>
</tr>
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">lda</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">leading dimension of the dense array <samp class="ph codeph">A</samp>.</td>
</tr>
</tbody></table>
</div>
<div class="tablenoborder">
<table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Output</strong></span><tbody class="tbody">
<tr class="row">
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">A</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">array of dimensions <samp class="ph codeph">(lda,n)</samp> that is filled in with the values of the sparse matrix.</td>
</tr>
</tbody></table>
</div>
16332 <div class="tablenoborder">
16333 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Status Returned</strong></span><tbody class="tbody">
16335 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_SUCCESS</samp></td>
16336 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the operation completed successfully.</td>
16339 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_NOT_INITIALIZED</samp></td>
16340 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the library was not initialized.</td>
16343 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_INVALID_VALUE</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">invalid parameters were passed (<samp class="ph codeph">m,n&lt;0</samp>).</td>
16348 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_ARCH_MISMATCH</samp></td>
16349 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the device does not support double precision.</td>
16352 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_EXECUTION_FAILED</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the function failed to launch on the GPU.</td>
16356 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_MATRIX_TYPE_NOT_SUPPORTED</samp></td>
16357 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">
16360 the matrix type is not supported.
16368 <div class="topic concept nested1" id="cusparse-lt-t-gt-csr2hyb"><a name="cusparse-lt-t-gt-csr2hyb" shape="rect">
<!-- --></a><h3 class="title topictitle2"><a href="#cusparse-lt-t-gt-csr2hyb" name="cusparse-lt-t-gt-csr2hyb" shape="rect">11.9. cusparse&lt;t&gt;csr2hyb</a></h3>
16370 <div class="body conbody"><pre xml:space="preserve">cusparseStatus_t
cusparseScsr2hyb(cusparseHandle_t handle, int m, int n,
                 const cusparseMatDescr_t descrA,
                 const float *csrValA,
                 const int *csrRowPtrA, const int *csrColIndA,
                 cusparseHybMat_t hybA, int userEllWidth,
                 cusparseHybPartition_t partitionType)

cusparseStatus_t
cusparseDcsr2hyb(cusparseHandle_t handle, int m, int n,
                 const cusparseMatDescr_t descrA,
                 const double *csrValA,
                 const int *csrRowPtrA, const int *csrColIndA,
                 cusparseHybMat_t hybA, int userEllWidth,
                 cusparseHybPartition_t partitionType)

cusparseStatus_t
cusparseCcsr2hyb(cusparseHandle_t handle, int m, int n,
                 const cusparseMatDescr_t descrA,
                 const cuComplex *csrValA,
                 const int *csrRowPtrA, const int *csrColIndA,
                 cusparseHybMat_t hybA, int userEllWidth,
                 cusparseHybPartition_t partitionType)

cusparseStatus_t
cusparseZcsr2hyb(cusparseHandle_t handle, int m, int n,
                 const cusparseMatDescr_t descrA,
                 const cuDoubleComplex *csrValA,
                 const int *csrRowPtrA, const int *csrColIndA,
                 cusparseHybMat_t hybA, int userEllWidth,
                 cusparseHybPartition_t partitionType)</pre><p class="p">This function converts a sparse matrix in CSR format into a sparse matrix in HYB format. It assumes that the <samp class="ph codeph">hybA</samp> parameter has been initialized with the <samp class="ph codeph">cusparseCreateHybMat</samp> routine before this function is called.</p>
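<p class="p">As a host-side illustration of what <samp class="ph codeph">userEllWidth</samp> controls, the sketch below counts how many non-zeros would fit in the regular (ELL) part for a given width and how many would be left over for the irregular part of the HYB matrix. The helper <samp class="ph codeph">hyb_split_counts</samp> is hypothetical and not part of CUSPARSE. It assumes that each row contributes at most <samp class="ph codeph">ellWidth</samp> entries to the ELL part, and it does not reproduce the partitioning heuristics the library applies for the other partition types.</p>

```c
#include <assert.h>

/* Hypothetical helper, not a CUSPARSE routine.
 * For a CSR matrix with m rows, count the non-zeros that fit in an ELL part
 * of width ellWidth and the non-zeros that overflow into the irregular part.
 * Only differences of csrRowPtrA are used, so the index base cancels out. */
void hyb_split_counts(int m, const int *csrRowPtrA, int ellWidth,
                      int *nnzEll, int *nnzIrregular)
{
    assert(m >= 0 && ellWidth >= 0);

    *nnzEll = 0;
    *nnzIrregular = 0;
    for (int i = 0; i < m; ++i) {
        int rowLen = csrRowPtrA[i + 1] - csrRowPtrA[i];
        if (rowLen > ellWidth) {
            /* The first ellWidth entries stay in ELL, the rest overflow. */
            *nnzEll += ellWidth;
            *nnzIrregular += rowLen - ellWidth;
        } else {
            *nnzEll += rowLen;
        }
    }
}
```

<p class="p">A small width keeps the ELL part compact but pushes more entries into the irregular part; a width close to the longest row does the opposite.</p>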
<p class="p">This function requires some temporary storage and a significant amount of storage for the matrix in HYB format.
It is executed asynchronously with respect to the host and it may return control to the application on the host before the
result is ready.
</p>
16403 <div class="tablenoborder">
16404 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Input</strong></span><tbody class="tbody">
16406 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">handle</samp></td>
16407 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">handle to the CUSPARSE library context.</td>
16410 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">m</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">number of rows of matrix
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math>.
</td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">n</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">number of columns of matrix
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math>.
</td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">descrA</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the descriptor of matrix
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math>. The supported matrix type is <samp class="ph codeph">CUSPARSE_MATRIX_TYPE_GENERAL</samp>. Also, the supported index bases are <samp class="ph codeph">CUSPARSE_INDEX_BASE_ZERO</samp> and <samp class="ph codeph">CUSPARSE_INDEX_BASE_ONE</samp>.
</td>
16434 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">csrValA</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">&lt;type&gt; array of
<samp class="ph codeph">nnz</samp><math xmlns="http://www.w3.org/1998/Math/MathML">
<mo stretchy="false">(</mo>
<mo>=</mo>
</math><samp class="ph codeph">csrRowPtrA(m)</samp><math xmlns="http://www.w3.org/1998/Math/MathML">
<mo>-</mo>
</math><samp class="ph codeph">csrRowPtrA(0)</samp><math xmlns="http://www.w3.org/1998/Math/MathML">
<mo stretchy="false">)</mo>
</math>
non-zero elements of matrix
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math>.
</td>
16451 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">csrRowPtrA</samp></td>
16452 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">integer array of <samp class="ph codeph">m+1</samp> elements that contains the start of every row and the end of the last row plus one.
16456 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">csrColIndA</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">integer array of
<samp class="ph codeph">nnz</samp><math xmlns="http://www.w3.org/1998/Math/MathML">
<mo stretchy="false">(</mo>
<mo>=</mo>
</math><samp class="ph codeph">csrRowPtrA(m)</samp><math xmlns="http://www.w3.org/1998/Math/MathML">
<mo>-</mo>
</math><samp class="ph codeph">csrRowPtrA(0)</samp><math xmlns="http://www.w3.org/1998/Math/MathML">
<mo stretchy="false">)</mo>
</math>
column indices of the non-zero elements of matrix
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math>.
</td>
16473 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">userEllWidth</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">width of the regular (ELL) part of the matrix in HYB format, which should be less than the maximum number of non-zeros per row.
It is only required if <samp class="ph codeph">partitionType</samp> == <samp class="ph codeph">CUSPARSE_HYB_PARTITION_USER</samp>.
</td>
16479 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">partitionType</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">partitioning method to be used in the conversion (refer to <samp class="ph codeph">cusparseHybPartition_t</samp> for details).</td>
16486 <div class="tablenoborder">
16487 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Output</strong></span><tbody class="tbody">
16489 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">hybA</samp></td>
16490 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the matrix <samp class="ph codeph">A</samp> in HYB storage format.
16496 <div class="tablenoborder">
16497 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Status Returned</strong></span><tbody class="tbody">
16499 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_SUCCESS</samp></td>
16500 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the operation completed successfully.</td>
16503 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_NOT_INITIALIZED</samp></td>
16504 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the library was not initialized.</td>
16507 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_ALLOC_FAILED</samp></td>
16508 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the resources could not be allocated.</td>
16511 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_INVALID_VALUE</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">invalid parameters were passed (<samp class="ph codeph">m,n&lt;0</samp>).</td>
16516 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_ARCH_MISMATCH</samp></td>
16517 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the device does not support double precision.</td>
16520 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_EXECUTION_FAILED</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the function failed to launch on the GPU.</td>
16524 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_INTERNAL_ERROR</samp></td>
16525 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">an internal operation failed.</td>
16528 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_MATRIX_TYPE_NOT_SUPPORTED</samp></td>
16529 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">
16532 the matrix type is not supported.
16540 <div class="topic concept nested1" id="cusparse-lt-t-gt-dense2csc"><a name="cusparse-lt-t-gt-dense2csc" shape="rect">
<!-- --></a><h3 class="title topictitle2"><a href="#cusparse-lt-t-gt-dense2csc" name="cusparse-lt-t-gt-dense2csc" shape="rect">11.10. cusparse&lt;t&gt;dense2csc</a></h3>
16542 <div class="body conbody"><pre xml:space="preserve">cusparseStatus_t
cusparseSdense2csc(cusparseHandle_t handle, int m, int n,
                   const cusparseMatDescr_t descrA,
                   const float *A,
                   int lda, const int *nnzPerCol,
                   float *cscValA,
                   int *cscRowIndA, int *cscColPtrA)

cusparseStatus_t
cusparseDdense2csc(cusparseHandle_t handle, int m, int n,
                   const cusparseMatDescr_t descrA,
                   const double *A,
                   int lda, const int *nnzPerCol,
                   double *cscValA,
                   int *cscRowIndA, int *cscColPtrA)

cusparseStatus_t
cusparseCdense2csc(cusparseHandle_t handle, int m, int n,
                   const cusparseMatDescr_t descrA,
                   const cuComplex *A,
                   int lda, const int *nnzPerCol,
                   cuComplex *cscValA,
                   int *cscRowIndA, int *cscColPtrA)

cusparseStatus_t
cusparseZdense2csc(cusparseHandle_t handle, int m, int n,
                   const cusparseMatDescr_t descrA,
                   const cuDoubleComplex *A,
                   int lda, const int *nnzPerCol,
                   cuDoubleComplex *cscValA,
                   int *cscRowIndA, int *cscColPtrA)</pre><p class="p">This function converts the matrix <samp class="ph codeph">A</samp> in dense format into a sparse matrix in CSC format. All arrays are assumed to have been pre-allocated by the user,
and the output arrays are filled in based on <samp class="ph codeph">nnzPerCol</samp>, which can be pre-computed with <samp class="ph codeph">cusparse&lt;t&gt;nnz()</samp>.</p>
16572 <p class="p">This function requires no extra storage. It is executed asynchronously with respect to the host and it may return control
16573 to the application on the host before the result is ready.
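<p class="p">To make the role of <samp class="ph codeph">nnzPerCol</samp> and of the index base concrete, here is a minimal host-side sketch of the same conversion. It is not the library's implementation: <samp class="ph codeph">dense2csc_ref</samp> and its <samp class="ph codeph">base</samp> argument are hypothetical names, and the sketch assumes that <samp class="ph codeph">A</samp> is stored column-major, so that element (i, j) sits at <samp class="ph codeph">A[i + j*lda]</samp>.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical host-side reference, not a CUSPARSE routine.
 * Converts a column-major dense matrix (m x n, leading dimension lda) to CSC.
 * base is 0 or 1 and plays the role of the index base stored in descrA.
 * cscColPtrA has n+1 entries; cscValA and cscRowIndA need sum(nnzPerCol). */
void dense2csc_ref(int m, int n, const float *A, int lda,
                   const int *nnzPerCol, int base,
                   float *cscValA, int *cscRowIndA, int *cscColPtrA)
{
    assert(m >= 0 && n >= 0 && lda >= m);
    assert(base == 0 || base == 1);

    /* Column pointers: running sum of the per-column counts, shifted by base. */
    cscColPtrA[0] = base;
    for (int j = 0; j < n; ++j)
        cscColPtrA[j + 1] = cscColPtrA[j] + nnzPerCol[j];

    /* Walk each column top to bottom and append its non-zeros. */
    for (int j = 0; j < n; ++j) {
        int pos = cscColPtrA[j] - base;
        for (int i = 0; i < m; ++i) {
            float v = A[(size_t)i + (size_t)j * (size_t)lda];
            if (v != 0.0f) {
                cscValA[pos] = v;
                cscRowIndA[pos] = i + base;
                ++pos;
            }
        }
        /* The counts supplied by the caller must match the data. */
        assert(pos == cscColPtrA[j + 1] - base);
    }
}
```

<p class="p">Whatever the base, the number of stored entries is the last column pointer minus the first one, which is why the array sizes in the Output table are written as a difference.</p>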
16575 <div class="tablenoborder">
16576 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Input</strong></span><tbody class="tbody">
16578 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">handle</samp></td>
16579 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">handle to the CUSPARSE library context.</td>
16582 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">m</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">number of rows of matrix
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math>.
</td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">n</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">number of columns of matrix
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math>.
</td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">descrA</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the descriptor of matrix
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math>. The supported matrix type is <samp class="ph codeph">CUSPARSE_MATRIX_TYPE_GENERAL</samp>. Also, the supported index bases are <samp class="ph codeph">CUSPARSE_INDEX_BASE_ZERO</samp> and <samp class="ph codeph">CUSPARSE_INDEX_BASE_ONE</samp>.
</td>
16606 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">A</samp></td>
16607 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">array of dimensions <samp class="ph codeph">(lda, n)</samp>.
16611 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">lda</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">leading dimension of dense array <samp class="ph codeph">A</samp>.</td>
16616 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">nnzPerCol</samp></td>
16617 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">array of size <samp class="ph codeph">n</samp> containing the number of non-zero elements per column.
16623 <div class="tablenoborder">
16624 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Output</strong></span><tbody class="tbody">
16626 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">cscValA</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">&lt;type&gt; array of
<samp class="ph codeph">nnz</samp><math xmlns="http://www.w3.org/1998/Math/MathML">
<mo stretchy="false">(</mo>
<mo>=</mo>
</math><samp class="ph codeph">cscColPtrA(n)</samp><math xmlns="http://www.w3.org/1998/Math/MathML">
<mo>-</mo>
</math><samp class="ph codeph">cscColPtrA(0)</samp><math xmlns="http://www.w3.org/1998/Math/MathML">
<mo stretchy="false">)</mo>
</math>
non-zero elements of matrix
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math>.
</td>
16643 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">cscRowIndA</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">integer array of <samp class="ph codeph">nnz</samp><math xmlns="http://www.w3.org/1998/Math/MathML">
<mo stretchy="false">(</mo>
<mo>=</mo>
</math><samp class="ph codeph">cscColPtrA(n)</samp><math xmlns="http://www.w3.org/1998/Math/MathML">
<mo>-</mo>
</math><samp class="ph codeph">cscColPtrA(0)</samp><math xmlns="http://www.w3.org/1998/Math/MathML">
<mo stretchy="false">)</mo>
</math> row indices of the non-zero elements of matrix
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math>.
</td>
16658 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">cscColPtrA</samp></td>
16659 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">integer array of <samp class="ph codeph">n+1</samp> elements that contains the start of every column and the end of the last column plus one.
16665 <div class="tablenoborder">
16666 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Status Returned</strong></span><tbody class="tbody">
16668 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_SUCCESS</samp></td>
16669 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the operation completed successfully.</td>
16672 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_NOT_INITIALIZED</samp></td>
16673 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the library was not initialized.</td>
16676 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_INVALID_VALUE</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">invalid parameters were passed (<samp class="ph codeph">m,n&lt;0</samp>).</td>
16681 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_ARCH_MISMATCH</samp></td>
16682 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the device does not support double precision.</td>
16685 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_EXECUTION_FAILED</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the function failed to launch on the GPU.</td>
16689 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_MATRIX_TYPE_NOT_SUPPORTED</samp></td>
16690 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">
16693 the matrix type is not supported.
16701 <div class="topic concept nested1" id="cusparse-lt-t-gt-dense2csr"><a name="cusparse-lt-t-gt-dense2csr" shape="rect">
<!-- --></a><h3 class="title topictitle2"><a href="#cusparse-lt-t-gt-dense2csr" name="cusparse-lt-t-gt-dense2csr" shape="rect">11.11. cusparse&lt;t&gt;dense2csr</a></h3>
16703 <div class="body conbody"><pre xml:space="preserve">cusparseStatus_t
cusparseSdense2csr(cusparseHandle_t handle, int m, int n,
                   const cusparseMatDescr_t descrA,
                   const float *A,
                   int lda, const int *nnzPerRow,
                   float *csrValA,
                   int *csrRowPtrA, int *csrColIndA)

cusparseStatus_t
cusparseDdense2csr(cusparseHandle_t handle, int m, int n,
                   const cusparseMatDescr_t descrA,
                   const double *A,
                   int lda, const int *nnzPerRow,
                   double *csrValA,
                   int *csrRowPtrA, int *csrColIndA)

cusparseStatus_t
cusparseCdense2csr(cusparseHandle_t handle, int m, int n,
                   const cusparseMatDescr_t descrA,
                   const cuComplex *A,
                   int lda, const int *nnzPerRow,
                   cuComplex *csrValA,
                   int *csrRowPtrA, int *csrColIndA)

cusparseStatus_t
cusparseZdense2csr(cusparseHandle_t handle, int m, int n,
                   const cusparseMatDescr_t descrA,
                   const cuDoubleComplex *A,
                   int lda, const int *nnzPerRow,
                   cuDoubleComplex *csrValA,
                   int *csrRowPtrA, int *csrColIndA)</pre><p class="p">This function converts the matrix <samp class="ph codeph">A</samp> in dense format into a sparse matrix in CSR format. All arrays are assumed to have been pre-allocated by the user,
and the output arrays are filled in based on <samp class="ph codeph">nnzPerRow</samp>, which can be pre-computed with <samp class="ph codeph">cusparse&lt;t&gt;nnz()</samp>.</p>
16733 <p class="p">This function requires no extra storage. It is executed asynchronously with respect to the host and it may return control
16734 to the application on the host before the result is ready.
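<p class="p">The relationship between <samp class="ph codeph">nnzPerRow</samp> and the three CSR output arrays can be sketched on the host as follows. This is an illustrative reference only, not the library's code: <samp class="ph codeph">dense2csr_ref</samp> is a hypothetical name, indices are zero-based, and <samp class="ph codeph">A</samp> is assumed to be column-major with element (i, j) at <samp class="ph codeph">A[i + j*lda]</samp>.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical host-side reference, not a CUSPARSE routine.
 * Converts a column-major dense matrix (m x n, leading dimension lda) to
 * zero-based CSR. nnzPerRow has m entries, csrRowPtrA has m+1 entries, and
 * csrValA / csrColIndA must hold at least sum(nnzPerRow) entries. */
void dense2csr_ref(int m, int n, const float *A, int lda,
                   const int *nnzPerRow,
                   float *csrValA, int *csrRowPtrA, int *csrColIndA)
{
    assert(m >= 0 && n >= 0 && lda >= m);

    /* Row pointers are the exclusive prefix sum of the per-row counts. */
    csrRowPtrA[0] = 0;
    for (int i = 0; i < m; ++i)
        csrRowPtrA[i + 1] = csrRowPtrA[i] + nnzPerRow[i];

    /* Walk each row left to right and append its non-zeros. */
    for (int i = 0; i < m; ++i) {
        int pos = csrRowPtrA[i];
        for (int j = 0; j < n; ++j) {
            float v = A[(size_t)i + (size_t)j * (size_t)lda];
            if (v != 0.0f) {
                csrValA[pos] = v;
                csrColIndA[pos] = j;
                ++pos;
            }
        }
        /* The counts supplied by the caller must match the data. */
        assert(pos == csrRowPtrA[i + 1]);
    }
}
```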
16736 <div class="tablenoborder">
16737 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Input</strong></span><tbody class="tbody">
16739 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">handle</samp></td>
16740 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">handle to the CUSPARSE library context.</td>
16743 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">m</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">number of rows of matrix
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math>.
</td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">n</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">number of columns of matrix
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math>.
</td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">descrA</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the descriptor of matrix
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math>. The supported matrix type is <samp class="ph codeph">CUSPARSE_MATRIX_TYPE_GENERAL</samp>. Also, the supported index bases are <samp class="ph codeph">CUSPARSE_INDEX_BASE_ZERO</samp> and <samp class="ph codeph">CUSPARSE_INDEX_BASE_ONE</samp>.
</td>
16767 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">A</samp></td>
16768 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">array of dimensions <samp class="ph codeph">(lda, n)</samp>.
16772 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">lda</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">leading dimension of dense array <samp class="ph codeph">A</samp>.</td>
16777 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">nnzPerRow</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">array of size <samp class="ph codeph">m</samp> containing the number of non-zero elements per row.</td>
16784 <div class="tablenoborder">
16785 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Output</strong></span><tbody class="tbody">
16787 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">csrValA</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">&lt;type&gt; array of
<samp class="ph codeph">nnz</samp><math xmlns="http://www.w3.org/1998/Math/MathML">
<mo stretchy="false">(</mo>
<mo>=</mo>
</math><samp class="ph codeph">csrRowPtrA(m)</samp><math xmlns="http://www.w3.org/1998/Math/MathML">
<mo>-</mo>
</math><samp class="ph codeph">csrRowPtrA(0)</samp><math xmlns="http://www.w3.org/1998/Math/MathML">
<mo stretchy="false">)</mo>
</math>
non-zero elements of matrix
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math>.
</td>
16804 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">csrRowPtrA</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">integer array of <samp class="ph codeph">m+1</samp> elements that contains the start of every row and the end of the last row plus one.</td>
16809 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">csrColIndA</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">integer array of <samp class="ph codeph">nnz</samp><math xmlns="http://www.w3.org/1998/Math/MathML">
<mo stretchy="false">(</mo>
<mo>=</mo>
</math><samp class="ph codeph">csrRowPtrA(m)</samp><math xmlns="http://www.w3.org/1998/Math/MathML">
<mo>-</mo>
</math><samp class="ph codeph">csrRowPtrA(0)</samp><math xmlns="http://www.w3.org/1998/Math/MathML">
<mo stretchy="false">)</mo>
</math> column indices of the non-zero elements of matrix
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math>.
</td>
16826 <div class="tablenoborder">
16827 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Status Returned</strong></span><tbody class="tbody">
16829 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_SUCCESS</samp></td>
16830 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the operation completed successfully.</td>
16833 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_NOT_INITIALIZED</samp></td>
16834 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the library was not initialized.</td>
16837 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_ALLOC_FAILED</samp></td>
16838 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the resources could not be allocated.</td>
16841 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_INVALID_VALUE</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">invalid parameters were passed (<samp class="ph codeph">m,n&lt;0</samp>).</td>
16846 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_ARCH_MISMATCH</samp></td>
16847 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the device does not support double precision.</td>
16850 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_EXECUTION_FAILED</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the function failed to launch on the GPU.</td>
16854 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_MATRIX_TYPE_NOT_SUPPORTED</samp></td>
16855 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">
16858 the matrix type is not supported.
16866 <div class="topic concept nested1" id="cusparse-lt-t-gt-dense2hyb"><a name="cusparse-lt-t-gt-dense2hyb" shape="rect">
<!-- --></a><h3 class="title topictitle2"><a href="#cusparse-lt-t-gt-dense2hyb" name="cusparse-lt-t-gt-dense2hyb" shape="rect">11.12. cusparse&lt;t&gt;dense2hyb</a></h3>
16868 <div class="body conbody"><pre xml:space="preserve">cusparseStatus_t
cusparseSdense2hyb(cusparseHandle_t handle, int m, int n,
                   const cusparseMatDescr_t descrA,
                   const float *A,
                   int lda, const int *nnzPerRow, cusparseHybMat_t hybA,
                   int userEllWidth,
                   cusparseHybPartition_t partitionType)

cusparseStatus_t
cusparseDdense2hyb(cusparseHandle_t handle, int m, int n,
                   const cusparseMatDescr_t descrA,
                   const double *A,
                   int lda, const int *nnzPerRow, cusparseHybMat_t hybA,
                   int userEllWidth,
                   cusparseHybPartition_t partitionType)

cusparseStatus_t
cusparseCdense2hyb(cusparseHandle_t handle, int m, int n,
                   const cusparseMatDescr_t descrA,
                   const cuComplex *A,
                   int lda, const int *nnzPerRow, cusparseHybMat_t hybA,
                   int userEllWidth,
                   cusparseHybPartition_t partitionType)

cusparseStatus_t
cusparseZdense2hyb(cusparseHandle_t handle, int m, int n,
                   const cusparseMatDescr_t descrA,
                   const cuDoubleComplex *A,
                   int lda, const int *nnzPerRow, cusparseHybMat_t hybA,
                   int userEllWidth,
                   cusparseHybPartition_t partitionType)</pre><p class="p">This function converts the matrix <samp class="ph codeph">A</samp> in dense format into a sparse matrix in HYB format. It assumes that the routine <samp class="ph codeph">cusparseCreateHybMat</samp> was used to initialize the opaque structure <samp class="ph codeph">hybA</samp> and that the array <samp class="ph codeph">nnzPerRow</samp> was pre-computed with <samp class="ph codeph">cusparse&lt;t&gt;nnz()</samp>.</p>
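<p class="p">The two preconditions on the inputs can be illustrated with a short host-side sketch: one helper shows what the <samp class="ph codeph">nnzPerRow</samp> array is expected to contain, the other checks a candidate <samp class="ph codeph">userEllWidth</samp> against the "less than the maximum number of non-zeros per row" guideline from the Input table. Both helper names are hypothetical and not part of CUSPARSE; the column-major layout of <samp class="ph codeph">A</samp> and the lower bound of zero on the width are assumptions of this sketch.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not a CUSPARSE routine.
 * Fills nnzPerRow (m entries) for a column-major dense matrix and returns
 * the total number of non-zeros. */
int count_nnz_per_row(int m, int n, const float *A, int lda, int *nnzPerRow)
{
    assert(m >= 0 && n >= 0 && lda >= m);

    int total = 0;
    for (int i = 0; i < m; ++i) {
        int count = 0;
        for (int j = 0; j < n; ++j) {
            if (A[(size_t)i + (size_t)j * (size_t)lda] != 0.0f)
                ++count;
        }
        nnzPerRow[i] = count;
        total += count;
    }
    return total;
}

/* Hypothetical helper, not a CUSPARSE routine.
 * Returns 1 if userEllWidth is non-negative (assumption) and strictly less
 * than the largest per-row count, 0 otherwise. */
int user_ell_width_in_range(int m, const int *nnzPerRow, int userEllWidth)
{
    int maxNnz = 0;
    for (int i = 0; i < m; ++i) {
        if (nnzPerRow[i] > maxNnz)
            maxNnz = nnzPerRow[i];
    }
    return userEllWidth >= 0 && userEllWidth < maxNnz;
}
```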
<p class="p">This function requires some temporary storage and a significant amount of storage for the matrix in HYB format.
It is executed asynchronously with respect to the host and it may return control to the application on the host before the
result is ready.
</p>
16901 <div class="tablenoborder">
16902 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Input</strong></span><tbody class="tbody">
16904 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">handle</samp></td>
16905 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">handle to the CUSPARSE library context.</td>
16908 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">m</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">number of rows of matrix
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math>.
</td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">n</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">number of columns of matrix
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math>.
</td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">descrA</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the descriptor of matrix
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math>. The supported matrix type is <samp class="ph codeph">CUSPARSE_MATRIX_TYPE_GENERAL</samp>.
</td>
16932 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">A</samp></td>
16933 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">array of dimensions <samp class="ph codeph">(lda, n)</samp>.
16937 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">lda</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">leading dimension of dense array <samp class="ph codeph">A</samp>.</td>
16942 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">nnzPerRow</samp></td>
16943 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">array of size <samp class="ph codeph">m</samp> containing the number of non-zero elements per row.
16947 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">userEllWidth</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">width of the regular (ELL) part of the matrix in HYB format, which should be less than the maximum number of non-zeros per row.
It is only required if <samp class="ph codeph">partitionType</samp> == <samp class="ph codeph">CUSPARSE_HYB_PARTITION_USER</samp>.
</td>
16953 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">partitionType</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">partitioning method to be used in the conversion (refer to <samp class="ph codeph">cusparseHybPartition_t</samp> for details).</td>
16960 <div class="tablenoborder">
16961 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Output</strong></span><tbody class="tbody">
16963 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">hybA</samp></td>
16964 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the matrix <samp class="ph codeph">A</samp> in HYB storage format.
16970 <div class="tablenoborder">
16971 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Status Returned</strong></span><tbody class="tbody">
16973 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_SUCCESS</samp></td>
16974 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the operation completed successfully.</td>
16977 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_NOT_INITIALIZED</samp></td>
16978 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the library was not initialized.</td>
16981 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_ALLOC_FAILED</samp></td>
16982 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the resources could not be allocated.</td>
16985 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_INVALID_VALUE</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">invalid parameters were passed (<samp class="ph codeph">m,n&lt;0</samp>).
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_ARCH_MISMATCH</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the device does not support double precision.</td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_EXECUTION_FAILED</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the function failed to launch on the GPU.</td>
16998 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_INTERNAL_ERROR</samp></td>
16999 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">an internal operation failed.</td>
17002 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_MATRIX_TYPE_NOT_SUPPORTED</samp></td>
17003 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">
17006 the matrix type is not supported.
17014 <div class="topic concept nested1" id="cusparse-lt-t-gt-hyb2csc"><a name="cusparse-lt-t-gt-hyb2csc" shape="rect">
<!-- --></a><h3 class="title topictitle2"><a href="#cusparse-lt-t-gt-hyb2csc" name="cusparse-lt-t-gt-hyb2csc" shape="rect">11.13. cusparse&lt;t&gt;hyb2csc</a></h3>
17016 <div class="body conbody"><pre xml:space="preserve">cusparseStatus_t
17017 cusparseShyb2csc(cusparseHandle_t handle,
17018 const cusparseMatDescr_t descrA,
17019 const cusparseHybMat_t hybA,
17020 float *cscValA, int *cscRowIndA, int *cscColPtrA)
17022 cusparseDhyb2csc(cusparseHandle_t handle,
17023 const cusparseMatDescr_t descrA,
17024 const cusparseHybMat_t hybA,
17025 double *cscValA, int *cscRowIndA, int *cscColPtrA)
17027 cusparseChyb2csc(cusparseHandle_t handle,
17028 const cusparseMatDescr_t descrA,
17029 const cusparseHybMat_t hybA,
17030 cuComplex *cscValA, int *cscRowIndA, int *cscColPtrA)
17032 cusparseZhyb2csc(cusparseHandle_t handle,
17033 const cusparseMatDescr_t descrA,
17034 const cusparseHybMat_t hybA,
17035 cuDoubleComplex *cscValA, int *cscRowIndA, int *cscColPtrA)</pre><p class="p">This function converts a sparse matrix in HYB format into a sparse matrix in CSC format.</p>
17036 <p class="p">This function requires some amount of temporary storage. It is executed asynchronously with respect to the host and it may
17037 return control to the application on the host before the result is ready.
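<p class="p">As an illustration of the CSC output layout, the following host-side C sketch reads the three output arrays after they have been copied back from the device. The helper names <samp class="ph codeph">csc_nnz</samp> and <samp class="ph codeph">csc_at</samp> are hypothetical and are not part of the CUSPARSE API. The sketch assumes a matrix with <samp class="ph codeph">n</samp> columns and an index base (0 or 1) passed explicitly as <samp class="ph codeph">base</samp>.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: total number of stored entries in a CSC matrix with
   n columns. The index base (0 or 1) cancels in the difference. */
int csc_nnz(int n, const int *cscColPtr)
{
    assert(n >= 0 && cscColPtr != NULL);
    return cscColPtr[n] - cscColPtr[0];
}

/* Hypothetical helper: return entry (row, col) of a CSC matrix held in host
   memory. `base` is the index base used by cscColPtr and cscRowInd;
   `row` and `col` are always zero-based here. */
double csc_at(const double *cscVal, const int *cscRowInd,
              const int *cscColPtr, int base, int row, int col)
{
    assert(cscVal != NULL && cscRowInd != NULL && cscColPtr != NULL);
    /* Column `col` occupies positions [cscColPtr[col], cscColPtr[col + 1]). */
    for (int k = cscColPtr[col] - base; k < cscColPtr[col + 1] - base; ++k) {
        if (cscRowInd[k] - base == row)
            return cscVal[k];
    }
    return 0.0; /* entry is not stored, so it is an implicit zero */
}
```

<p class="p">Because the conversion is asynchronous with respect to the host, a helper like this should only be applied to the arrays after the device work has completed and the data has been copied to host memory.</p>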
17039 <div class="tablenoborder">
17040 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Input</strong></span><tbody class="tbody">
17042 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">handle</samp></td>
17043 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">handle to the CUSPARSE library context.</td>
17046 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">descrA</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the descriptor of matrix
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math> in HYB format. The supported matrix type is <samp class="ph codeph">CUSPARSE_MATRIX_TYPE_GENERAL</samp>.
17054 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">hybA</samp></td>
17055 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the matrix <samp class="ph codeph">A</samp> in HYB storage format.
17061 <div class="tablenoborder">
17062 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Output</strong></span><tbody class="tbody">
17064 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">cscValA</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">&lt;type&gt; array of
<samp class="ph codeph">nnz</samp> <math xmlns="http://www.w3.org/1998/Math/MathML"><mo stretchy="false">(</mo><mo>=</mo></math> <samp class="ph codeph">cscColPtrA(n)</samp> <math xmlns="http://www.w3.org/1998/Math/MathML"><mo>-</mo></math> <samp class="ph codeph">cscColPtrA(0)</samp><math xmlns="http://www.w3.org/1998/Math/MathML"><mo stretchy="false">)</mo></math>
non-zero elements of matrix
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math>.
17081 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">cscRowIndA</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">integer array of <samp class="ph codeph">nnz</samp> <math xmlns="http://www.w3.org/1998/Math/MathML"><mo stretchy="false">(</mo><mo>=</mo></math> <samp class="ph codeph">cscColPtrA(n)</samp> <math xmlns="http://www.w3.org/1998/Math/MathML"><mo>-</mo></math> <samp class="ph codeph">cscColPtrA(0)</samp><math xmlns="http://www.w3.org/1998/Math/MathML"><mo stretchy="false">)</mo></math>
row indices of the non-zero elements of matrix
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math>.
17096 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">cscColPtrA</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">integer array of <samp class="ph codeph">n+1</samp> elements that contains the start of every column and the end of the last column plus one.
17103 <div class="tablenoborder">
17104 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Status Returned</strong></span><tbody class="tbody">
17106 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_SUCCESS</samp></td>
17107 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the operation completed successfully.</td>
17110 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_NOT_INITIALIZED</samp></td>
17111 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the library was not initialized.</td>
17114 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_ALLOC_FAILED</samp></td>
17115 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the resources could not be allocated.</td>
17118 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_INVALID_VALUE</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">invalid parameters were passed (<samp class="ph codeph">m,n&lt;0</samp>).
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_ARCH_MISMATCH</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the device does not support double precision.</td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_EXECUTION_FAILED</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the function failed to launch on the GPU.</td>
17131 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_INTERNAL_ERROR</samp></td>
17132 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">an internal operation failed.</td>
17135 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_MATRIX_TYPE_NOT_SUPPORTED</samp></td>
17136 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">
17139 the matrix type is not supported.
17147 <div class="topic concept nested1" id="cusparse-lt-t-gt-hyb2csr"><a name="cusparse-lt-t-gt-hyb2csr" shape="rect">
<!-- --></a><h3 class="title topictitle2"><a href="#cusparse-lt-t-gt-hyb2csr" name="cusparse-lt-t-gt-hyb2csr" shape="rect">11.14. cusparse&lt;t&gt;hyb2csr</a></h3>
17149 <div class="body conbody"><pre xml:space="preserve">cusparseStatus_t
17150 cusparseShyb2csr(cusparseHandle_t handle,
17151 const cusparseMatDescr_t descrA,
17152 const cusparseHybMat_t hybA,
17153 float *csrValA, int *csrRowPtrA, int *csrColIndA)
17155 cusparseDhyb2csr(cusparseHandle_t handle,
17156 const cusparseMatDescr_t descrA,
17157 const cusparseHybMat_t hybA,
17158 double *csrValA, int *csrRowPtrA, int *csrColIndA)
17160 cusparseChyb2csr(cusparseHandle_t handle,
17161 const cusparseMatDescr_t descrA,
17162 const cusparseHybMat_t hybA,
17163 cuComplex *csrValA, int *csrRowPtrA, int *csrColIndA)
17165 cusparseZhyb2csr(cusparseHandle_t handle,
17166 const cusparseMatDescr_t descrA,
17167 const cusparseHybMat_t hybA,
17168 cuDoubleComplex *csrValA, int *csrRowPtrA, int *csrColIndA)</pre><p class="p">This function converts a sparse matrix in HYB format into a sparse matrix in CSR format.</p>
17169 <p class="p">This function requires some amount of temporary storage. It is executed asynchronously with respect to the host and it may
17170 return control to the application on the host before the result is ready.
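<p class="p">To make the conversion concrete, the following host-side C sketch merges an ELL part and a COO part into zero-based CSR arrays. It is only a conceptual model: <samp class="ph codeph">cusparseHybMat_t</samp> is opaque, so the <samp class="ph codeph">HostHyb</samp> structure, its field names, the column-major ELL layout, the use of <samp class="ph codeph">-1</samp> as a padding marker, and the row-sorted COO part are all assumptions made for illustration, not the library's internal representation.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical host-side HYB container (assumed layout, see lead-in). */
typedef struct {
    int m;                 /* number of rows */
    int ellWidth;          /* entries per row kept in the regular (ELL) part */
    const int *ellColInd;  /* m * ellWidth column indices, column-major, -1 = padding */
    const double *ellVal;  /* m * ellWidth values, column-major */
    int cooNnz;            /* entries that overflowed into the COO part */
    const int *cooRowInd;  /* COO row indices, assumed sorted by row */
    const int *cooColInd;  /* COO column indices */
    const double *cooVal;  /* COO values */
} HostHyb;

/* Write the matrix as zero-based CSR. csrRowPtr needs m + 1 entries;
   csrVal and csrColInd need room for every stored entry.
   Returns the number of non-zero elements written. */
int host_hyb2csr(const HostHyb *h, double *csrVal, int *csrRowPtr, int *csrColInd)
{
    assert(h != NULL && csrVal != NULL && csrRowPtr != NULL && csrColInd != NULL);
    int pos = 0; /* next free slot in csrVal / csrColInd */
    int e = 0;   /* cursor into the row-sorted COO part */
    for (int i = 0; i < h->m; ++i) {
        csrRowPtr[i] = pos; /* row i starts here */
        /* Regular part: slot k of row i lives at index k * m + i. */
        for (int k = 0; k < h->ellWidth; ++k) {
            int c = h->ellColInd[k * h->m + i];
            if (c >= 0) {
                csrColInd[pos] = c;
                csrVal[pos] = h->ellVal[k * h->m + i];
                ++pos;
            }
        }
        /* Irregular part: append the overflow entries of row i. */
        while (e < h->cooNnz && h->cooRowInd[e] == i) {
            csrColInd[pos] = h->cooColInd[e];
            csrVal[pos] = h->cooVal[e];
            ++pos;
            ++e;
        }
    }
    csrRowPtr[h->m] = pos; /* end of the last row plus one */
    return pos;
}
```

<p class="p">In this model the last entry of the row pointer array equals the total number of non-zero elements, which is the size the caller needs for the value and column index arrays.</p>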
17172 <div class="tablenoborder">
17173 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Input</strong></span><tbody class="tbody">
17175 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">handle</samp></td>
17176 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">handle to the CUSPARSE library context.</td>
17179 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">descrA</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the descriptor of matrix
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math> in HYB format. The supported matrix type is <samp class="ph codeph">CUSPARSE_MATRIX_TYPE_GENERAL</samp>.
17187 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">hybA</samp></td>
17188 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the matrix <samp class="ph codeph">A</samp> in HYB storage format.
17194 <div class="tablenoborder">
17195 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Output</strong></span><tbody class="tbody">
17197 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">csrValA</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">&lt;type&gt; array of
<samp class="ph codeph">nnz</samp> <math xmlns="http://www.w3.org/1998/Math/MathML"><mo stretchy="false">(</mo><mo>=</mo></math> <samp class="ph codeph">csrRowPtrA(m)</samp> <math xmlns="http://www.w3.org/1998/Math/MathML"><mo>-</mo></math> <samp class="ph codeph">csrRowPtrA(0)</samp><math xmlns="http://www.w3.org/1998/Math/MathML"><mo stretchy="false">)</mo></math>
non-zero elements of matrix
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math>.
17214 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">csrRowPtrA</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">integer array of <samp class="ph codeph">m+1</samp> elements that contains the start of every row and the end of the last row plus one.
17219 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">csrColIndA</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">integer array of <samp class="ph codeph">nnz</samp> <math xmlns="http://www.w3.org/1998/Math/MathML"><mo stretchy="false">(</mo><mo>=</mo></math> <samp class="ph codeph">csrRowPtrA(m)</samp> <math xmlns="http://www.w3.org/1998/Math/MathML"><mo>-</mo></math> <samp class="ph codeph">csrRowPtrA(0)</samp><math xmlns="http://www.w3.org/1998/Math/MathML"><mo stretchy="false">)</mo></math>
column indices of the non-zero elements of matrix
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math>.
17236 <div class="tablenoborder">
17237 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Status Returned</strong></span><tbody class="tbody">
17239 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_SUCCESS</samp></td>
17240 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the operation completed successfully.</td>
17243 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_NOT_INITIALIZED</samp></td>
17244 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the library was not initialized.</td>
17247 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_ALLOC_FAILED</samp></td>
17248 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the resources could not be allocated.</td>
17251 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_INVALID_VALUE</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">invalid parameters were passed (<samp class="ph codeph">m,n&lt;0</samp>).
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_ARCH_MISMATCH</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the device does not support double precision.</td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_EXECUTION_FAILED</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the function failed to launch on the GPU.</td>
17264 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_INTERNAL_ERROR</samp></td>
17265 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">an internal operation failed.</td>
17268 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_MATRIX_TYPE_NOT_SUPPORTED</samp></td>
17269 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">
17272 the matrix type is not supported.
17280 <div class="topic concept nested1" id="cusparse-lt-t-gt-hyb2dense"><a name="cusparse-lt-t-gt-hyb2dense" shape="rect">
<!-- --></a><h3 class="title topictitle2"><a href="#cusparse-lt-t-gt-hyb2dense" name="cusparse-lt-t-gt-hyb2dense" shape="rect">11.15. cusparse&lt;t&gt;hyb2dense</a></h3>
17282 <div class="body conbody"><pre xml:space="preserve">cusparseStatus_t
cusparseShyb2dense(cusparseHandle_t handle,
                   const cusparseMatDescr_t descrA,
                   const cusparseHybMat_t hybA,
                   float *A, int lda)

cusparseDhyb2dense(cusparseHandle_t handle,
                   const cusparseMatDescr_t descrA,
                   const cusparseHybMat_t hybA,
                   double *A, int lda)

cusparseChyb2dense(cusparseHandle_t handle,
                   const cusparseMatDescr_t descrA,
                   const cusparseHybMat_t hybA,
                   cuComplex *A, int lda)

cusparseZhyb2dense(cusparseHandle_t handle,
                   const cusparseMatDescr_t descrA,
                   const cusparseHybMat_t hybA,
                   cuDoubleComplex *A, int lda)</pre><p class="p">This function converts a sparse matrix in HYB format (contained in the opaque structure <samp class="ph codeph">hybA</samp>) into a matrix <samp class="ph codeph">A</samp> in dense format. The dense matrix <samp class="ph codeph">A</samp> is filled in with the values of the sparse matrix and with zeros elsewhere.
17303 <p class="p">This function requires no extra storage. It is executed asynchronously with respect to the host and it may return control
17304 to the application on the host before the result is ready.
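<p class="p">The role of <samp class="ph codeph">lda</samp> in the dense output can be sketched on the host as follows. Since the HYB structure is opaque, this sketch starts from zero-based CSR arrays instead, which is a substitution made for illustration only. The function name <samp class="ph codeph">host_csr2dense</samp> is hypothetical, and the dense array is assumed to be stored in column-major order, so that entry (i, j) is found at <samp class="ph codeph">A[i + j * lda]</samp>.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: expand a zero-based CSR matrix with m rows and n
   columns into a column-major dense array A with leading dimension lda >= m.
   Rows m..lda-1 of each column are padding and are left untouched. */
void host_csr2dense(int m, int n, const double *csrVal, const int *csrRowPtr,
                    const int *csrColInd, double *A, int lda)
{
    assert(m >= 0 && n >= 0 && lda >= m);
    assert(csrVal != NULL && csrRowPtr != NULL && csrColInd != NULL && A != NULL);
    /* Fill the m x n logical matrix with zeros first. */
    for (int j = 0; j < n; ++j)
        for (int i = 0; i < m; ++i)
            A[i + j * lda] = 0.0;
    /* Scatter the stored entries of each row into their dense positions. */
    for (int i = 0; i < m; ++i)
        for (int k = csrRowPtr[i]; k < csrRowPtr[i + 1]; ++k)
            A[i + csrColInd[k] * lda] = csrVal[k];
}
```

<p class="p">Choosing <samp class="ph codeph">lda</samp> larger than the number of rows leaves unused padding at the end of each column, which this sketch does not overwrite.</p>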
17306 <div class="tablenoborder">
17307 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Input</strong></span><tbody class="tbody">
17309 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">handle</samp></td>
17310 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">handle to the CUSPARSE library context.</td>
17313 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">descrA</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the descriptor of matrix
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math> in HYB format. The supported matrix type is <samp class="ph codeph">CUSPARSE_MATRIX_TYPE_GENERAL</samp>.
17321 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">hybA</samp></td>
17322 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the matrix <samp class="ph codeph">A</samp> in HYB storage format.
17326 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">lda</samp></td>
17327 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">leading dimension of dense array <samp class="ph codeph">A</samp>.
17333 <div class="tablenoborder">
17334 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Output</strong></span><tbody class="tbody">
17336 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">A</samp></td>
17337 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">array of dimensions <samp class="ph codeph">(lda, n)</samp> that is filled in with the values of the sparse matrix.
17343 <div class="tablenoborder">
17344 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Status Returned</strong></span><tbody class="tbody">
17346 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_SUCCESS</samp></td>
17347 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the operation completed successfully.</td>
17350 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_NOT_INITIALIZED</samp></td>
17351 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the library was not initialized.</td>
17354 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_INVALID_VALUE</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the internally stored HYB format parameters are invalid.</td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_ARCH_MISMATCH</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the device does not support double precision.</td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_EXECUTION_FAILED</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the function failed to launch on the GPU.</td>
17366 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_MATRIX_TYPE_NOT_SUPPORTED</samp></td>
17367 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">
17370 the matrix type is not supported.
17378 <div class="topic concept nested1" id="cusparse-lt-t-gt-nnz"><a name="cusparse-lt-t-gt-nnz" shape="rect">
<!-- --></a><h3 class="title topictitle2"><a href="#cusparse-lt-t-gt-nnz" name="cusparse-lt-t-gt-nnz" shape="rect">11.16. cusparse&lt;t&gt;nnz</a></h3>
17380 <div class="body conbody"><pre xml:space="preserve">cusparseStatus_t
cusparseSnnz(cusparseHandle_t handle, cusparseDirection_t dirA, int m,
             int n, const cusparseMatDescr_t descrA,
             const float *A,
             int lda, int *nnzPerRowColumn, int *nnzTotalDevHostPtr)

cusparseDnnz(cusparseHandle_t handle, cusparseDirection_t dirA, int m,
             int n, const cusparseMatDescr_t descrA,
             const double *A,
             int lda, int *nnzPerRowColumn, int *nnzTotalDevHostPtr)

cusparseCnnz(cusparseHandle_t handle, cusparseDirection_t dirA, int m,
             int n, const cusparseMatDescr_t descrA,
             const cuComplex *A,
             int lda, int *nnzPerRowColumn, int *nnzTotalDevHostPtr)

cusparseZnnz(cusparseHandle_t handle, cusparseDirection_t dirA, int m,
             int n, const cusparseMatDescr_t descrA,
             const cuDoubleComplex *A,
             int lda, int *nnzPerRowColumn, int *nnzTotalDevHostPtr)</pre><p class="p">This function computes the number of non-zero elements per row or column and the total number of non-zero elements in a dense matrix.
17402 <p class="p">This function requires no extra storage. It is executed asynchronously with respect to the host and it may return control
17403 to the application on the host before the result is ready.
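<p class="p">As a point of comparison, the same counts can be computed on the host with a short reference routine. The sketch below is not the library's implementation. The function name <samp class="ph codeph">host_dense_nnz</samp> and the <samp class="ph codeph">by_row</samp> flag (standing in for the direction argument) are hypothetical, and the dense matrix is assumed to be stored in column-major order with leading dimension <samp class="ph codeph">lda</samp>.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: count the non-zero elements of an m x n column-major
   dense matrix A with leading dimension lda >= m.
   by_row != 0: counts[i] receives the count for row i (m entries).
   by_row == 0: counts[j] receives the count for column j (n entries).
   Returns the total number of non-zero elements. */
int host_dense_nnz(int by_row, int m, int n, const double *A, int lda, int *counts)
{
    assert(m >= 0 && n >= 0 && lda >= m);
    assert(A != NULL && counts != NULL);
    int len = by_row ? m : n;
    int total = 0;
    for (int k = 0; k < len; ++k)
        counts[k] = 0;
    /* Walk the matrix column by column, matching the storage order. */
    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < m; ++i) {
            if (A[i + j * lda] != 0.0) {
                counts[by_row ? i : j]++;
                total++;
            }
        }
    }
    return total;
}
```

<p class="p">Per-row counts of this kind are what a subsequent dense-to-sparse conversion needs in order to size its output arrays.</p>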
17405 <div class="tablenoborder">
17406 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Input</strong></span><tbody class="tbody">
17408 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">handle</samp></td>
17409 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">handle to the CUSPARSE library context.</td>
17412 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">dirA</samp></td>
17413 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">direction that specifies whether to count non-zero elements by <samp class="ph codeph">CUSPARSE_DIRECTION_ROW</samp> or <samp class="ph codeph">CUSPARSE_DIRECTION_COLUMN</samp>.
17417 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">m</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">number of rows of matrix
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math>.
17425 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">n</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">number of columns of matrix
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math>.
17433 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">descrA</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the descriptor of matrix
<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi></math>. The supported matrix type is <samp class="ph codeph">CUSPARSE_MATRIX_TYPE_GENERAL</samp>. Also, the supported index bases are <samp class="ph codeph">CUSPARSE_INDEX_BASE_ZERO</samp> and <samp class="ph codeph">CUSPARSE_INDEX_BASE_ONE</samp>.
17441 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">A</samp></td>
17442 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">array of dimensions <samp class="ph codeph">(lda, n)</samp>.
17446 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">lda</samp></td>
17447 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">leading dimension of dense array <samp class="ph codeph">A</samp>.
17453 <div class="tablenoborder">
17454 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Output</strong></span><tbody class="tbody">
17456 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">nnzPerRowColumn</samp></td>
17457 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">array of size <samp class="ph codeph">m</samp> or <samp class="ph codeph">n</samp> containing the number of non-zero elements per row or column, respectively.
17461 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">nnzTotalDevHostPtr</samp></td>
17462 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">total number of non-zero elements in device or host memory.</td>
17467 <div class="tablenoborder">
17468 <table cellpadding="4" cellspacing="0" summary="" class="table" width="100%" frame="border" border="1" rules="all"><span class="desc tabledesc"><strong class="ph b">Status Returned</strong></span><tbody class="tbody">
17470 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_SUCCESS</samp></td>
17471 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the operation completed successfully.</td>
17474 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_NOT_INITIALIZED</samp></td>
17475 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the library was not initialized.</td>
17478 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_ALLOC_FAILED</samp></td>
17479 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the resources could not be allocated.</td>
17482 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_INVALID_VALUE</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">invalid parameters were passed (<samp class="ph codeph">m,n&lt;0</samp>).
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_ARCH_MISMATCH</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the device does not support double precision.</td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_EXECUTION_FAILED</samp></td>
<td class="entry" valign="top" width="50%" rowspan="1" colspan="1">the function failed to launch on the GPU.</td>
17495 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_INTERNAL_ERROR</samp></td>
17496 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">an internal operation failed.</td>
17499 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1"><samp class="ph codeph">CUSPARSE_STATUS_MATRIX_TYPE_NOT_SUPPORTED</samp></td>
17500 <td class="entry" valign="top" width="50%" rowspan="1" colspan="1">
17503 the matrix type is not supported.
17512 <div class="topic concept nested0" id="appendix-b-cusparse-library-c---example"><a name="appendix-b-cusparse-library-c---example" shape="rect">
17513 <!-- --></a><h2 class="title topictitle1"><a href="#appendix-b-cusparse-library-c---example" name="appendix-b-cusparse-library-c---example" shape="rect">12. Appendix A: CUSPARSE Library C++ Example</a></h2>
17514 <div class="body conbody">
<p class="p">The example code below shows an application written in C++ that uses the CUSPARSE library API.
The code performs the following actions:
17518 <p class="p">1. Creates a sparse test matrix in COO format.</p>
<p class="p">2. Creates a sparse vector and a dense vector.</p>
17520 <p class="p">3. Allocates GPU memory and copies the matrix and vectors into it.</p>
17521 <p class="p">4. Initializes the CUSPARSE library.</p>
17522 <p class="p">5. Creates and sets up the matrix descriptor.</p>
17523 <p class="p">6. Converts the matrix from COO to CSR format.</p>
17524 <p class="p">7. Exercises Level 1 routines.</p>
17525 <p class="p">8. Exercises Level 2 routines.</p>
17526 <p class="p">9. Exercises Level 3 routines.</p>
17527 <p class="p">10. Destroys the matrix descriptor.</p>
17528 <p class="p">11. Releases resources allocated for the CUSPARSE library.</p><pre xml:space="preserve"><span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-comment">//Example: Application using C++ and the CUSPARSE library </span>
17529 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-comment">//-------------------------------------------------------</span>
<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-directive">#include &lt;stdio.h&gt;</span>
<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-directive">#include &lt;stdlib.h&gt;</span>
<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-directive">#include &lt;cuda_runtime.h&gt;</span>
17533 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-directive">#include "cusparse_v2.h"</span>
<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-comment">/* Print the message s, release all host and device allocations, then reset the device. */</span>
<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-directive">#define CLEANUP(s)                                   \
do {                                                 \
    printf ("%s\n", s);                              \
    if (yHostPtr)           free(yHostPtr);          \
    if (zHostPtr)           free(zHostPtr);          \
    if (xIndHostPtr)        free(xIndHostPtr);       \
    if (xValHostPtr)        free(xValHostPtr);       \
    if (cooRowIndexHostPtr) free(cooRowIndexHostPtr);\
    if (cooColIndexHostPtr) free(cooColIndexHostPtr);\
    if (cooValHostPtr)      free(cooValHostPtr);     \
    if (y)                  cudaFree(y);             \
    if (z)                  cudaFree(z);             \
    if (xInd)               cudaFree(xInd);          \
    if (xVal)               cudaFree(xVal);          \
    if (csrRowPtr)          cudaFree(csrRowPtr);     \
    if (cooRowIndex)        cudaFree(cooRowIndex);   \
    if (cooColIndex)        cudaFree(cooColIndex);   \
    if (cooVal)             cudaFree(cooVal);        \
    if (descr)              cusparseDestroyMatDescr(descr);\
    if (handle)             cusparseDestroy(handle); \
    cudaDeviceReset();                               \
    fflush (stdout);                                 \
} while (0)</span>
17559 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">int</span> main(){
17560 cudaError_t cudaStat1,cudaStat2,cudaStat3,cudaStat4,cudaStat5,cudaStat6;
17561 cusparseStatus_t status;
17562 cusparseHandle_t handle=0;
17563 cusparseMatDescr_t descr=0;
17564 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">int</span> * cooRowIndexHostPtr=0;
17565 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">int</span> * cooColIndexHostPtr=0;
17566 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">double</span> * cooValHostPtr=0;
17567 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">int</span> * cooRowIndex=0;
17568 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">int</span> * cooColIndex=0;
17569 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">double</span> * cooVal=0;
17570 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">int</span> * xIndHostPtr=0;
17571 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">double</span> * xValHostPtr=0;
17572 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">double</span> * yHostPtr=0;
17573 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">int</span> * xInd=0;
17574 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">double</span> * xVal=0;
17575 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">double</span> * y=0;
17576 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">int</span> * csrRowPtr=0;
17577 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">double</span> * zHostPtr=0;
17578 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">double</span> * z=0;
17579 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">int</span> n, nnz, nnz_vector;
17580 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">double</span> dzero =0.0;
17581 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">double</span> dtwo =2.0;
17582 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">double</span> dthree=3.0;
17583 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">double</span> dfive =5.0;
17585 printf(<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-string">"testing example\n"</span>);
17586 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-comment">/* create the following sparse test matrix in COO format */</span>
    <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-comment">/* |1.0     2.0 3.0|
       |    4.0        |
       |5.0     6.0 7.0|
       |    8.0     9.0| */</span>
    n=4; nnz=9;
17592 cooRowIndexHostPtr = (<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">int</span> *) malloc(nnz*<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">sizeof</span>(cooRowIndexHostPtr[0]));
17593 cooColIndexHostPtr = (<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">int</span> *) malloc(nnz*<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">sizeof</span>(cooColIndexHostPtr[0]));
17594 cooValHostPtr = (<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">double</span> *)malloc(nnz*<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">sizeof</span>(cooValHostPtr[0]));
17595 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">if</span> ((!cooRowIndexHostPtr) || (!cooColIndexHostPtr) || (!cooValHostPtr)){
17596 CLEANUP(<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-string">"Host malloc failed (matrix)"</span>);
17597 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">return</span> 1;
    }
17599 cooRowIndexHostPtr[0]=0; cooColIndexHostPtr[0]=0; cooValHostPtr[0]=1.0;
17600 cooRowIndexHostPtr[1]=0; cooColIndexHostPtr[1]=2; cooValHostPtr[1]=2.0;
17601 cooRowIndexHostPtr[2]=0; cooColIndexHostPtr[2]=3; cooValHostPtr[2]=3.0;
17602 cooRowIndexHostPtr[3]=1; cooColIndexHostPtr[3]=1; cooValHostPtr[3]=4.0;
17603 cooRowIndexHostPtr[4]=2; cooColIndexHostPtr[4]=0; cooValHostPtr[4]=5.0;
17604 cooRowIndexHostPtr[5]=2; cooColIndexHostPtr[5]=2; cooValHostPtr[5]=6.0;
17605 cooRowIndexHostPtr[6]=2; cooColIndexHostPtr[6]=3; cooValHostPtr[6]=7.0;
17606 cooRowIndexHostPtr[7]=3; cooColIndexHostPtr[7]=1; cooValHostPtr[7]=8.0;
17607 cooRowIndexHostPtr[8]=3; cooColIndexHostPtr[8]=3; cooValHostPtr[8]=9.0;
    <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-comment">/*
    printf("Input data:\n");
    for (int i=0; i<nnz; i++){
        printf("cooRowIndexHostPtr[%d]=%d ",i,cooRowIndexHostPtr[i]);
        printf("cooColIndexHostPtr[%d]=%d ",i,cooColIndexHostPtr[i]);
        printf("cooValHostPtr[%d]=%f \n",i,cooValHostPtr[i]);
    }
    */</span>
17618 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-comment">/* create a sparse and dense vector */</span>
    <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-comment">/* xVal= [100.0 200.0 400.0]   (sparse)
       xInd= [0     1     3    ]
       y   = [10.0 20.0 30.0 40.0 | 50.0 60.0 70.0 80.0] (dense) */</span>
    nnz_vector = 3;
17623 xIndHostPtr = (<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">int</span> *) malloc(nnz_vector*<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">sizeof</span>(xIndHostPtr[0]));
17624 xValHostPtr = (<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">double</span> *)malloc(nnz_vector*<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">sizeof</span>(xValHostPtr[0]));
17625 yHostPtr = (<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">double</span> *)malloc(2*n *<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">sizeof</span>(yHostPtr[0]));
17626 zHostPtr = (<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">double</span> *)malloc(2*(n+1) *<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">sizeof</span>(zHostPtr[0]));
17627 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">if</span>((!xIndHostPtr) || (!xValHostPtr) || (!yHostPtr) || (!zHostPtr)){
17628 CLEANUP(<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-string">"Host malloc failed (vectors)"</span>);
17629 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">return</span> 1;
    }
17631 yHostPtr[0] = 10.0; xIndHostPtr[0]=0; xValHostPtr[0]=100.0;
17632 yHostPtr[1] = 20.0; xIndHostPtr[1]=1; xValHostPtr[1]=200.0;
17633 yHostPtr[2] = 30.0;
17634 yHostPtr[3] = 40.0; xIndHostPtr[2]=3; xValHostPtr[2]=400.0;
17635 yHostPtr[4] = 50.0;
17636 yHostPtr[5] = 60.0;
17637 yHostPtr[6] = 70.0;
17638 yHostPtr[7] = 80.0;
    <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-comment">/*
    //print the vectors
    for (int j=0; j<2; j++){
        for (int i=0; i<n; i++){
            printf("yHostPtr[%d,%d]=%f\n",i,j,yHostPtr[i+n*j]);
        }
    }
    for (int i=0; i<nnz_vector; i++){
        printf("xIndHostPtr[%d]=%d ",i,xIndHostPtr[i]);
        printf("xValHostPtr[%d]=%f\n",i,xValHostPtr[i]);
    }
    */</span>
17652 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-comment">/* allocate GPU memory and copy the matrix and vectors into it */</span>
17653 cudaStat1 = cudaMalloc((<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">void</span>**)&cooRowIndex,nnz*<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">sizeof</span>(cooRowIndex[0]));
17654 cudaStat2 = cudaMalloc((<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">void</span>**)&cooColIndex,nnz*<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">sizeof</span>(cooColIndex[0]));
17655 cudaStat3 = cudaMalloc((<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">void</span>**)&cooVal, nnz*<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">sizeof</span>(cooVal[0]));
17656 cudaStat4 = cudaMalloc((<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">void</span>**)&y, 2*n*<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">sizeof</span>(y[0]));
17657 cudaStat5 = cudaMalloc((<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">void</span>**)&xInd,nnz_vector*<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">sizeof</span>(xInd[0]));
17658 cudaStat6 = cudaMalloc((<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">void</span>**)&xVal,nnz_vector*<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">sizeof</span>(xVal[0]));
17659 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">if</span> ((cudaStat1 != cudaSuccess) ||
17660 (cudaStat2 != cudaSuccess) ||
17661 (cudaStat3 != cudaSuccess) ||
17662 (cudaStat4 != cudaSuccess) ||
17663 (cudaStat5 != cudaSuccess) ||
17664 (cudaStat6 != cudaSuccess)) {
17665 CLEANUP(<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-string">"Device malloc failed"</span>);
17666 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">return</span> 1;
    }
17668 cudaStat1 = cudaMemcpy(cooRowIndex, cooRowIndexHostPtr,
17669 (size_t)(nnz*<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">sizeof</span>(cooRowIndex[0])),
17670 cudaMemcpyHostToDevice);
17671 cudaStat2 = cudaMemcpy(cooColIndex, cooColIndexHostPtr,
17672 (size_t)(nnz*<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">sizeof</span>(cooColIndex[0])),
17673 cudaMemcpyHostToDevice);
17674 cudaStat3 = cudaMemcpy(cooVal, cooValHostPtr,
17675 (size_t)(nnz*<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">sizeof</span>(cooVal[0])),
17676 cudaMemcpyHostToDevice);
17677 cudaStat4 = cudaMemcpy(y, yHostPtr,
17678 (size_t)(2*n*<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">sizeof</span>(y[0])),
17679 cudaMemcpyHostToDevice);
17680 cudaStat5 = cudaMemcpy(xInd, xIndHostPtr,
17681 (size_t)(nnz_vector*<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">sizeof</span>(xInd[0])),
17682 cudaMemcpyHostToDevice);
17683 cudaStat6 = cudaMemcpy(xVal, xValHostPtr,
17684 (size_t)(nnz_vector*<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">sizeof</span>(xVal[0])),
17685 cudaMemcpyHostToDevice);
17686 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">if</span> ((cudaStat1 != cudaSuccess) ||
17687 (cudaStat2 != cudaSuccess) ||
17688 (cudaStat3 != cudaSuccess) ||
17689 (cudaStat4 != cudaSuccess) ||
17690 (cudaStat5 != cudaSuccess) ||
17691 (cudaStat6 != cudaSuccess)) {
17692 CLEANUP(<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-string">"Memcpy from Host to Device failed"</span>);
17693 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">return</span> 1;
    }
17696 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-comment">/* initialize cusparse library */</span>
17697 status= cusparseCreate(&handle);
17698 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">if</span> (status != CUSPARSE_STATUS_SUCCESS) {
17699 CLEANUP(<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-string">"CUSPARSE Library initialization failed"</span>);
17700 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">return</span> 1;
    }
17703 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-comment">/* create and setup matrix descriptor */</span>
17704 status= cusparseCreateMatDescr(&descr);
17705 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">if</span> (status != CUSPARSE_STATUS_SUCCESS) {
17706 CLEANUP(<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-string">"Matrix descriptor initialization failed"</span>);
17707 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">return</span> 1;
    }
17709 cusparseSetMatType(descr,CUSPARSE_MATRIX_TYPE_GENERAL);
17710 cusparseSetMatIndexBase(descr,CUSPARSE_INDEX_BASE_ZERO);
17712 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-comment">/* exercise conversion routines (convert matrix from COO 2 CSR format) */</span>
17713 cudaStat1 = cudaMalloc((<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">void</span>**)&csrRowPtr,(n+1)*<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">sizeof</span>(csrRowPtr[0]));
17714 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">if</span> (cudaStat1 != cudaSuccess) {
17715 CLEANUP(<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-string">"Device malloc failed (csrRowPtr)"</span>);
17716 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">return</span> 1;
    }
17718 status= cusparseXcoo2csr(handle,cooRowIndex,nnz,n,
17719 csrRowPtr,CUSPARSE_INDEX_BASE_ZERO);
17720 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">if</span> (status != CUSPARSE_STATUS_SUCCESS) {
17721 CLEANUP(<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-string">"Conversion from COO to CSR format failed"</span>);
17722 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">return</span> 1;
    }
17724 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-comment">//csrRowPtr = [0 3 4 7 9] </span>
17726 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-comment">/* exercise Level 1 routines (scatter vector elements) */</span>
17727 status= cusparseDsctr(handle, nnz_vector, xVal, xInd,
17728 &y[n], CUSPARSE_INDEX_BASE_ZERO);
17729 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">if</span> (status != CUSPARSE_STATUS_SUCCESS) {
17730 CLEANUP(<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-string">"Scatter from sparse to dense vector failed"</span>);
17731 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">return</span> 1;
    }
17733 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-comment">//y = [10 20 30 40 | 100 200 70 400]</span>
17735 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-comment">/* exercise Level 2 routines (csrmv) */</span>
17736 status= cusparseDcsrmv(handle,CUSPARSE_OPERATION_NON_TRANSPOSE, n, n, nnz,
17737 &dtwo, descr, cooVal, csrRowPtr, cooColIndex,
17738 &y[0], &dthree, &y[n]);
17739 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">if</span> (status != CUSPARSE_STATUS_SUCCESS) {
17740 CLEANUP(<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-string">"Matrix-vector multiplication failed"</span>);
17741 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">return</span> 1;
    }
17743 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-comment">//y = [10 20 30 40 | 680 760 1230 2240]</span>
17744 cudaMemcpy(yHostPtr, y, (size_t)(2*n*<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">sizeof</span>(y[0])), cudaMemcpyDeviceToHost);
    <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-comment">/*
    printf("Intermediate results:\n");
    for (int j=0; j<2; j++){
        for (int i=0; i<n; i++){
            printf("yHostPtr[%d,%d]=%f\n",i,j,yHostPtr[i+n*j]);
        }
    }
    */</span>
17754 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-comment">/* exercise Level 3 routines (csrmm) */</span>
17755 cudaStat1 = cudaMalloc((<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">void</span>**)&z, 2*(n+1)*<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">sizeof</span>(z[0]));
17756 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">if</span> (cudaStat1 != cudaSuccess) {
17757 CLEANUP(<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-string">"Device malloc failed (z)"</span>);
17758 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">return</span> 1;
    }
17760 cudaStat1 = cudaMemset((<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">void</span> *)z,0, 2*(n+1)*<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">sizeof</span>(z[0]));
17761 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">if</span> (cudaStat1 != cudaSuccess) {
17762 CLEANUP(<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-string">"Memset on Device failed"</span>);
17763 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">return</span> 1;
    }
17765 status= cusparseDcsrmm(handle, CUSPARSE_OPERATION_NON_TRANSPOSE, n, 2, n,
17766 nnz, &dfive, descr, cooVal, csrRowPtr, cooColIndex,
17767 y, n, &dzero, z, n+1);
17768 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">if</span> (status != CUSPARSE_STATUS_SUCCESS) {
17769 CLEANUP(<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-string">"Matrix-matrix multiplication failed"</span>);
17770 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">return</span> 1;
    }
17773 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-comment">/* print final results (z) */</span>
17774 cudaStat1 = cudaMemcpy(zHostPtr, z,
17775 (size_t)(2*(n+1)*<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">sizeof</span>(z[0])),
17776 cudaMemcpyDeviceToHost);
17777 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">if</span> (cudaStat1 != cudaSuccess) {
17778 CLEANUP(<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-string">"Memcpy from Device to Host failed"</span>);
17779 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">return</span> 1;
    }
17781 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-comment">//z = [950 400 2550 2600 0 | 49300 15200 132300 131200 0]</span>
    <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-comment">/*
    printf("Final results:\n");
    for (int j=0; j<2; j++){
        for (int i=0; i<n+1; i++){
            printf("z[%d,%d]=%f\n",i,j,zHostPtr[i+(n+1)*j]);
        }
    }
    */</span>
17791 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-comment">/* destroy matrix descriptor */</span>
17792 status = cusparseDestroyMatDescr(descr);
17794 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">if</span> (status != CUSPARSE_STATUS_SUCCESS) {
17795 CLEANUP(<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-string">"Matrix descriptor destruction failed"</span>);
17796 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">return</span> 1;
    }
17799 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-comment">/* destroy handle */</span>
17800 status = cusparseDestroy(handle);
17802 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">if</span> (status != CUSPARSE_STATUS_SUCCESS) {
17803 CLEANUP(<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-string">"CUSPARSE Library release of resources failed"</span>);
17804 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">return</span> 1;
    }
17807 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-comment">/* check the results */</span>
17808 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-comment">/* Notice that CLEANUP() contains a call to cusparseDestroy(handle) */</span>
17809 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">if</span> ((zHostPtr[0] != 950.0) ||
17810 (zHostPtr[1] != 400.0) ||
17811 (zHostPtr[2] != 2550.0) ||
17812 (zHostPtr[3] != 2600.0) ||
17813 (zHostPtr[4] != 0.0) ||
17814 (zHostPtr[5] != 49300.0) ||
17815 (zHostPtr[6] != 15200.0) ||
17816 (zHostPtr[7] != 132300.0) ||
17817 (zHostPtr[8] != 131200.0) ||
17818 (zHostPtr[9] != 0.0) ||
17819 (yHostPtr[0] != 10.0) ||
17820 (yHostPtr[1] != 20.0) ||
17821 (yHostPtr[2] != 30.0) ||
17822 (yHostPtr[3] != 40.0) ||
17823 (yHostPtr[4] != 680.0) ||
17824 (yHostPtr[5] != 760.0) ||
17825 (yHostPtr[6] != 1230.0) ||
17826 (yHostPtr[7] != 2240.0)){
17827 CLEANUP(<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-string">"example test FAILED"</span>);
17828 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">return</span> 1;
    }
17830 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">else</span>{
17831 CLEANUP(<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-string">"example test PASSED"</span>);
17832 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">return</span> 0;
    }
}
<div class="topic concept nested0" id="appendix-c-cusparse-fortran-bindings"><a name="appendix-c-cusparse-fortran-bindings" shape="rect">
<!-- --></a><h2 class="title topictitle1"><a href="#appendix-c-cusparse-fortran-bindings" name="appendix-c-cusparse-fortran-bindings" shape="rect">13. Appendix B: CUSPARSE Fortran Bindings</a></h2>
<div class="body conbody">
<p class="p">The CUSPARSE library is implemented using the C-based CUDA toolchain, so it provides a C-style API that makes interfacing with applications written in C or C++ trivial. Many applications implemented in Fortran would also benefit from using CUSPARSE, and a CUSPARSE Fortran interface has been developed for them.</p>
<p class="p">Unfortunately, Fortran-to-C calling conventions are not standardized and differ by platform and toolchain. In particular, differences may exist in the following areas:</p>
<p class="p">Symbol names (capitalization, name decoration)</p>
<p class="p">Argument passing (by value or reference)</p>
<p class="p">Passing of pointer arguments (size of the pointer)</p>
<p class="p">To provide maximum flexibility in addressing those differences, the CUSPARSE Fortran interface is provided in the form of wrapper functions, which are written in C and are located in the file <samp class="ph codeph">cusparse_fortran.c</samp>. This file also contains a few additional wrapper functions (for <samp class="ph codeph">cudaMalloc()</samp>, <samp class="ph codeph">cudaMemset()</samp>, and so on) that can be used to allocate memory on the GPU.</p>
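<p class="p">As a rough illustration of how a C wrapper can absorb the three differences listed above, the sketch below exposes a small C routine to Fortran. The names <samp class="ph codeph">demo_axpy</samp> and <samp class="ph codeph">FORTRAN_NAME</samp> are hypothetical and are not taken from <samp class="ph codeph">cusparse_fortran.c</samp>. The lowercase, trailing-underscore decoration is an assumption that holds for many Fortran compilers but not all of them.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical C routine standing in for a library entry point:
   computes y = alpha*x + y over n elements and returns a status code. */
static int demo_axpy(int n, double alpha, const double *x, double *y)
{
    for (int i = 0; i < n; i++) {
        y[i] += alpha * x[i];
    }
    return 0;
}

/* Symbol names: ASSUMPTION that the Fortran compiler emits lowercase
   names with one trailing underscore. Other toolchains differ, which is
   why the decoration is isolated in a single macro. */
#define FORTRAN_NAME(lower) lower##_

/* Argument passing: Fortran passes scalars by reference, so n and alpha
   arrive as pointers. Pointer arguments: each address arrives inside an
   integer wide enough to hold it (size_t here). */
int FORTRAN_NAME(demo_axpy)(const int *n, const double *alpha,
                            const size_t *x_addr, const size_t *y_addr)
{
    assert(n && alpha && x_addr && y_addr);
    return demo_axpy(*n, *alpha,
                     (const double *)*x_addr, (double *)*y_addr);
}
```

<p class="p">Under the assumed decoration, a Fortran statement such as <samp class="ph codeph">status = demo_axpy(n, alpha, x_addr, y_addr)</samp> would link against the C symbol <samp class="ph codeph">demo_axpy_</samp>. Porting to a toolchain with a different convention would then only require changing the macro.</p>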
<p class="p">The CUSPARSE Fortran wrapper code is provided as an example only and needs to be compiled into an application for it to call the CUSPARSE API functions. Providing this source code allows users to make any changes necessary for a particular platform and toolchain.</p>
<p class="p">The CUSPARSE Fortran wrapper code has been used to demonstrate interoperability with the compilers g95 0.91 (on 32-bit and 64-bit Linux) and g95 0.92 (on 32-bit and 64-bit Mac OS X). To use other compilers, users must make whatever changes to the wrapper code are required.</p>
<p class="p">The direct wrappers, intended for production code, substitute device pointers for vector and matrix arguments in all CUSPARSE functions. To use these interfaces, existing applications need to be modified slightly to allocate and deallocate data structures in GPU memory space (using <samp class="ph codeph">CUDA_MALLOC()</samp> and <samp class="ph codeph">CUDA_FREE()</samp>) and to copy data between GPU and CPU memory spaces (using the <samp class="ph codeph">CUDA_MEMCPY()</samp> routines). The sample wrappers provided in <samp class="ph codeph">cusparse_fortran.c</samp> map device pointers to the OS-dependent type <samp class="ph codeph">size_t</samp>, which is 32 bits wide on 32-bit platforms and 64 bits wide on 64-bit platforms.</p>
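<p class="p">The idea of carrying a pointer inside a <samp class="ph codeph">size_t</samp> can be sketched as follows. This is not the code in <samp class="ph codeph">cusparse_fortran.c</samp>: the names <samp class="ph codeph">demo_malloc_</samp> and <samp class="ph codeph">demo_free_</samp> are hypothetical, and host <samp class="ph codeph">malloc()</samp> and <samp class="ph codeph">free()</samp> stand in for <samp class="ph codeph">cudaMalloc()</samp> and <samp class="ph codeph">cudaFree()</samp> so that the sketch runs without a GPU.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Compile-time check: size_t must be wide enough to hold an address,
   otherwise the handle scheme below cannot work on this platform. */
typedef char size_t_holds_pointer[(sizeof(size_t) >= sizeof(void *)) ? 1 : -1];

/* Hypothetical allocation wrapper. The caller (Fortran, in the real
   setting) only ever sees an integer handle, never a C pointer.
   ASSUMPTION: host malloc() stands in for cudaMalloc() here. */
int demo_malloc_(size_t *handle, const int *bytes)
{
    void *p;
    assert(handle && bytes && *bytes > 0);
    p = malloc((size_t)*bytes);
    *handle = (size_t)p;          /* store the address as an integer */
    return (p != NULL) ? 0 : 1;   /* 0 mirrors a "success" status   */
}

/* Hypothetical release wrapper: turns the integer back into a pointer.
   ASSUMPTION: host free() stands in for cudaFree() here. */
void demo_free_(size_t *handle)
{
    assert(handle);
    free((void *)*handle);
    *handle = 0;
}
```

<p class="p">On the Fortran side the handle would be declared with a matching width, which is what the <samp class="ph codeph">integer*8</samp> and <samp class="ph codeph">integer*4</samp> declarations in Example B select between.</p>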
<p class="p">One approach to dealing with index arithmetic on device pointers in Fortran code is to use C-style macros and to use the C preprocessor to expand them. On Linux and Mac OS X, preprocessing can be done by using the option <samp class="ph codeph">'-cpp'</samp> with g95 or gfortran. The function <samp class="ph codeph">GET_SHIFTED_ADDRESS()</samp>, provided with the CUSPARSE Fortran wrappers, can also be used, as shown in Example B.</p>
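<p class="p">Because Fortran holds a device pointer as a plain integer, it cannot write the equivalent of the C expression <samp class="ph codeph">&amp;y[n]</samp> directly; the offset has to be added in bytes. The sketch below shows one way such a helper could look. The name <samp class="ph codeph">demo_shifted_address_</samp> and its argument order are assumptions made for illustration, not the actual definition of <samp class="ph codeph">GET_SHIFTED_ADDRESS()</samp>.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: adds a byte offset to an address that is stored
   in a size_t. All arguments are pointers because Fortran passes
   scalars by reference. */
void demo_shifted_address_(const size_t *base, const int *offset_bytes,
                           size_t *result)
{
    assert(base && offset_bytes && result);
    *result = *base + (size_t)*offset_bytes;
}
```

<p class="p">For a vector of <samp class="ph codeph">real*8</samp> values, the address of element <samp class="ph codeph">n+1</samp> (one-based) is the base address shifted by <samp class="ph codeph">n*8</samp> bytes.</p>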
<p class="p">Example B shows the C++ code of Example A implemented in Fortran 77 on the host. This example should be compiled with <samp class="ph codeph">ARCH_64</samp> defined as 1 on a 64-bit OS and left undefined on a 32-bit OS. For example, with g95 or gfortran this can be done directly on the command line using the option <samp class="ph codeph">-cpp -DARCH_64=1</samp>.</p>
17872 <div class="topic concept nested1" id="example-b"><a name="example-b" shape="rect">
17873 <!-- --></a><h3 class="title topictitle2"><a href="#example-b" name="example-b" shape="rect">13.1. Example B, Fortran Application</a></h3>
17874 <div class="body conbody"><pre xml:space="preserve">c #define ARCH_64 0
17875 c #define ARCH_64 1
17877 program cusparse_fortran_example
17879 integer cuda_malloc
17881 integer cuda_memcpy_c2fort_int
17882 integer cuda_memcpy_c2fort_real
17883 integer cuda_memcpy_fort2c_int
17884 integer cuda_memcpy_fort2c_real
17885 integer cuda_memset
17886 integer cusparse_create
17887 external cusparse_destroy
17888 integer cusparse_get_version
17889 integer cusparse_create_mat_descr
17890 external cusparse_destroy_mat_descr
17891 integer cusparse_set_mat_type
17892 integer cusparse_get_mat_type
17893 integer cusparse_get_mat_fill_mode
17894 integer cusparse_get_mat_diag_type
17895 integer cusparse_set_mat_index_base
17896 integer cusparse_get_mat_index_base
17897 integer cusparse_xcoo2csr
17898 integer cusparse_dsctr
17899 integer cusparse_dcsrmv
17900 integer cusparse_dcsrmm
17901 external get_shifted_address
<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-directive">#if ARCH_64 </span>
      integer*8 handle
      integer*8 descrA
      integer*8 cooRowIndex
      integer*8 cooColIndex
      integer*8 cooVal
      integer*8 xInd
      integer*8 xVal
      integer*8 y
      integer*8 z
      integer*8 csrRowPtr
<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-directive">#else</span>
      integer*4 handle
      integer*4 descrA
      integer*4 cooRowIndex
      integer*4 cooColIndex
      integer*4 cooVal
      integer*4 xInd
      integer*4 xVal
      integer*4 y
      integer*4 z
      integer*4 csrRowPtr
<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-directive">#endif </span>
      integer status
      integer cudaStat1,cudaStat2,cudaStat3
      integer cudaStat4,cudaStat5,cudaStat6
17930 integer n, nnz, nnz_vector
17931 parameter (n=4, nnz=9, nnz_vector=3)
17932 integer cooRowIndexHostPtr(nnz)
17933 integer cooColIndexHostPtr(nnz)
17934 real*8 cooValHostPtr(nnz)
17935 integer xIndHostPtr(nnz_vector)
17936 real*8 xValHostPtr(nnz_vector)
17937 real*8 yHostPtr(2*n)
17938 real*8 zHostPtr(2*(n+1))
17940 integer version, mtype, fmode, dtype, ibase
17941 real*8 dzero,dtwo,dthree,dfive
17945 write(*,*) <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-string">"testing fortran example"</span>
17946 c predefined constants (need to be careful with them)
      dzero = 0.0
      dtwo  = 2.0
      dthree= 3.0
      dfive = 5.0
c     create the following sparse test matrix in COO format
c     (notice one-based indexing)
c     |1.0     2.0 3.0|
c     |    4.0        |
c     |5.0     6.0 7.0|
c     |    8.0     9.0|
17957 cooRowIndexHostPtr(1)=1
17958 cooColIndexHostPtr(1)=1
17959 cooValHostPtr(1) =1.0
17960 cooRowIndexHostPtr(2)=1
17961 cooColIndexHostPtr(2)=3
17962 cooValHostPtr(2) =2.0
17963 cooRowIndexHostPtr(3)=1
17964 cooColIndexHostPtr(3)=4
17965 cooValHostPtr(3) =3.0
17966 cooRowIndexHostPtr(4)=2
17967 cooColIndexHostPtr(4)=2
17968 cooValHostPtr(4) =4.0
17969 cooRowIndexHostPtr(5)=3
17970 cooColIndexHostPtr(5)=1
17971 cooValHostPtr(5) =5.0
17972 cooRowIndexHostPtr(6)=3
17973 cooColIndexHostPtr(6)=3
17974 cooValHostPtr(6) =6.0
17975 cooRowIndexHostPtr(7)=3
17976 cooColIndexHostPtr(7)=4
17977 cooValHostPtr(7) =7.0
17978 cooRowIndexHostPtr(8)=4
17979 cooColIndexHostPtr(8)=2
17980 cooValHostPtr(8) =8.0
17981 cooRowIndexHostPtr(9)=4
17982 cooColIndexHostPtr(9)=4
17983 cooValHostPtr(9) =9.0
17985 write(*,*) <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-string">"Input data:"</span>
      <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">do</span> i=1,nnz
         write(*,*) <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-string">"cooRowIndexHostPtr["</span>,i,<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-string">"]="</span>,cooRowIndexHostPtr(i)
         write(*,*) <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-string">"cooColIndexHostPtr["</span>,i,<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-string">"]="</span>,cooColIndexHostPtr(i)
         write(*,*) <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-string">"cooValHostPtr["</span>, i,<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-string">"]="</span>,cooValHostPtr(i)
      enddo
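c     create a sparse and dense vector
c     xVal= [100.0 200.0 400.0]   (sparse)
c     xInd= [1     2     4    ]
c     y   = [10.0 20.0 30.0 40.0 | 50.0 60.0 70.0 80.0] (dense)
c     (notice one-based indexing)
      yHostPtr(1) = 10.0
      yHostPtr(2) = 20.0
      yHostPtr(3) = 30.0
      yHostPtr(4) = 40.0
      yHostPtr(5) = 50.0
      yHostPtr(6) = 60.0
      yHostPtr(7) = 70.0
      yHostPtr(8) = 80.0
      xIndHostPtr(1)=1
      xValHostPtr(1)=100.0
      xIndHostPtr(2)=2
      xValHostPtr(2)=200.0
      xIndHostPtr(3)=4
      xValHostPtr(3)=400.0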
c     print the vectors
      <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">do</span> j=1,2
         <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">do</span> i=1,n
            write(*,*) <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-string">"yHostPtr["</span>,i,<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-string">","</span>,j,<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-string">"]="</span>,yHostPtr(i+n*(j-1))
         enddo
      enddo
      <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">do</span> i=1,nnz_vector
         write(*,*) <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-string">"xIndHostPtr["</span>,i,<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-string">"]="</span>,xIndHostPtr(i)
         write(*,*) <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-string">"xValHostPtr["</span>,i,<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-string">"]="</span>,xValHostPtr(i)
      enddo
18022 c allocate GPU memory and copy the matrix and vectors into it
18024 c cudaMemcpyHostToDevice=1
18025 cudaStat1 = cuda_malloc(cooRowIndex,nnz*4)
18026 cudaStat2 = cuda_malloc(cooColIndex,nnz*4)
18027 cudaStat3 = cuda_malloc(cooVal, nnz*8)
18028 cudaStat4 = cuda_malloc(y, 2*n*8)
18029 cudaStat5 = cuda_malloc(xInd,nnz_vector*4)
18030 cudaStat6 = cuda_malloc(xVal,nnz_vector*8)
18031 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">if</span> ((cudaStat1 /= 0) .OR.
18032 $ (cudaStat2 /= 0) .OR.
18033 $ (cudaStat3 /= 0) .OR.
18034 $ (cudaStat4 /= 0) .OR.
18035 $ (cudaStat5 /= 0) .OR.
18036 $ (cudaStat6 /= 0)) then
18037 write(*,*) <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-string">"Device malloc failed"</span>
18038 write(*,*) <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-string">"cudaStat1="</span>,cudaStat1
18039 write(*,*) <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-string">"cudaStat2="</span>,cudaStat2
18040 write(*,*) <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-string">"cudaStat3="</span>,cudaStat3
18041 write(*,*) <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-string">"cudaStat4="</span>,cudaStat4
18042 write(*,*) <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-string">"cudaStat5="</span>,cudaStat5
18043 write(*,*) <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-string">"cudaStat6="</span>,cudaStat6
         stop
      endif
      cudaStat1 = cuda_memcpy_fort2c_int(cooRowIndex,cooRowIndexHostPtr,
     $                                   nnz*4,1)
      cudaStat2 = cuda_memcpy_fort2c_int(cooColIndex,cooColIndexHostPtr,
     $                                   nnz*4,1)
      cudaStat3 = cuda_memcpy_fort2c_real(cooVal, cooValHostPtr,
     $                                    nnz*8,1)
      cudaStat4 = cuda_memcpy_fort2c_real(y, yHostPtr,
     $                                    2*n*8,1)
      cudaStat5 = cuda_memcpy_fort2c_int(xInd, xIndHostPtr,
     $                                   nnz_vector*4,1)
      cudaStat6 = cuda_memcpy_fort2c_real(xVal, xValHostPtr,
     $                                    nnz_vector*8,1)
18058 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">if</span> ((cudaStat1 /= 0) .OR.
18059 $ (cudaStat2 /= 0) .OR.
18060 $ (cudaStat3 /= 0) .OR.
18061 $ (cudaStat4 /= 0) .OR.
18062 $ (cudaStat5 /= 0) .OR.
18063 $ (cudaStat6 /= 0)) then
18064 write(*,*) <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-string">"Memcpy from Host to Device failed"</span>
18065 write(*,*) <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-string">"cudaStat1="</span>,cudaStat1
18066 write(*,*) <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-string">"cudaStat2="</span>,cudaStat2
18067 write(*,*) <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-string">"cudaStat3="</span>,cudaStat3
18068 write(*,*) <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-string">"cudaStat4="</span>,cudaStat4
18069 write(*,*) <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-string">"cudaStat5="</span>,cudaStat5
18070 write(*,*) <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-string">"cudaStat6="</span>,cudaStat6
18071 call cuda_free(cooRowIndex)
18072 call cuda_free(cooColIndex)
18073 call cuda_free(cooVal)
18074 call cuda_free(xInd)
18075 call cuda_free(xVal)
         call cuda_free(y)
         stop
      endif
18080 c initialize cusparse library
18081 c CUSPARSE_STATUS_SUCCESS=0
18082 status = cusparse_create(handle)
18083 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">if</span> (status /= 0) then
18084 write(*,*) <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-string">"CUSPARSE Library initialization failed"</span>
18085 call cuda_free(cooRowIndex)
18086 call cuda_free(cooColIndex)
18087 call cuda_free(cooVal)
18088 call cuda_free(xInd)
18089 call cuda_free(xVal)
         call cuda_free(y)
         stop
      endif
18094 c CUSPARSE_STATUS_SUCCESS=0
18095 status = cusparse_get_version(handle,version)
18096 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">if</span> (status /= 0) then
18097 write(*,*) <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-string">"CUSPARSE Library initialization failed"</span>
18098 call cuda_free(cooRowIndex)
18099 call cuda_free(cooColIndex)
18100 call cuda_free(cooVal)
18101 call cuda_free(xInd)
         call cuda_free(xVal)
         call cuda_free(y)
         call cusparse_destroy(handle)
         stop
      endif
18107 write(*,*) <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-string">"CUSPARSE Library version"</span>,version
18109 c create and setup the matrix descriptor
18110 c CUSPARSE_STATUS_SUCCESS=0
18111 c CUSPARSE_MATRIX_TYPE_GENERAL=0
18112 c CUSPARSE_INDEX_BASE_ONE=1
18113 status= cusparse_create_mat_descr(descrA)
18114 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">if</span> (status /= 0) then
18115 write(*,*) <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-string">"Creating matrix descriptor failed"</span>
18116 call cuda_free(cooRowIndex)
18117 call cuda_free(cooColIndex)
18118 call cuda_free(cooVal)
18119 call cuda_free(xInd)
18120 call cuda_free(xVal)
18122 call cusparse_destroy(handle)
18125 status = cusparse_set_mat_type(descrA,0)
18126 status = cusparse_set_mat_index_base(descrA,1)
18127 c print the matrix descriptor
18128 mtype = cusparse_get_mat_type(descrA)
18129 fmode = cusparse_get_mat_fill_mode(descrA)
18130 dtype = cusparse_get_mat_diag_type(descrA)
18131 ibase = cusparse_get_mat_index_base(descrA)
18132 write (*,*) <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-string">"matrix descriptor:"</span>
18133 write (*,*) <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-string">"t="</span>,mtype,<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-string">"m="</span>,fmode,<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-string">"d="</span>,dtype,<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-string">"b="</span>,ibase
18135 c exercise conversion routines (convert matrix from COO to CSR format)
18137 c CUSPARSE_STATUS_SUCCESS=0
18138 c CUSPARSE_INDEX_BASE_ONE=1
18139 cudaStat1 = cuda_malloc(csrRowPtr,(n+1)*4)
18140 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">if</span> (cudaStat1 /= 0) then
18141 call cuda_free(cooRowIndex)
18142 call cuda_free(cooColIndex)
18143 call cuda_free(cooVal)
18144 call cuda_free(xInd)
18145 call cuda_free(xVal)
18147 call cusparse_destroy_mat_descr(descrA)
18148 call cusparse_destroy(handle)
18149 write(*,*) <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-string">"Device malloc failed (csrRowPtr)"</span>
18152 status= cusparse_xcoo2csr(handle,cooRowIndex,nnz,n,
18154 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">if</span> (status /= 0) then
18155 call cuda_free(cooRowIndex)
18156 call cuda_free(cooColIndex)
18157 call cuda_free(cooVal)
18158 call cuda_free(xInd)
18159 call cuda_free(xVal)
18161 call cuda_free(csrRowPtr)
18162 call cusparse_destroy_mat_descr(descrA)
18163 call cusparse_destroy(handle)
18164 write(*,*) <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-string">"Conversion from COO to CSR format failed"</span>
18167 c csrRowPtr = [1 4 5 8 10] (one-based, CUSPARSE_INDEX_BASE_ONE)
18169 c exercise Level 1 routines (scatter vector elements)
18170 c CUSPARSE_STATUS_SUCCESS=0
18171 c CUSPARSE_INDEX_BASE_ONE=1
18172 call get_shifted_address(y,n*8,ynp1)
18173 status= cusparse_dsctr(handle, nnz_vector, xVal, xInd,
18175 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">if</span> (status /= 0) then
18176 call cuda_free(cooRowIndex)
18177 call cuda_free(cooColIndex)
18178 call cuda_free(cooVal)
18179 call cuda_free(xInd)
18180 call cuda_free(xVal)
18182 call cuda_free(csrRowPtr)
18183 call cusparse_destroy_mat_descr(descrA)
18184 call cusparse_destroy(handle)
18185 write(*,*) <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-string">"Scatter from sparse to dense vector failed"</span>
18188 c y = [10 20 30 40 | 100 200 70 400]
18190 c exercise Level 2 routines (csrmv)
18191 c CUSPARSE_STATUS_SUCCESS=0
18192 c CUSPARSE_OPERATION_NON_TRANSPOSE=0
18193 status= cusparse_dcsrmv(handle, 0, n, n, nnz, dtwo,
18194 $ descrA, cooVal, csrRowPtr, cooColIndex,
18196 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">if</span> (status /= 0) then
18197 call cuda_free(cooRowIndex)
18198 call cuda_free(cooColIndex)
18199 call cuda_free(cooVal)
18200 call cuda_free(xInd)
18201 call cuda_free(xVal)
18203 call cuda_free(csrRowPtr)
18204 call cusparse_destroy_mat_descr(descrA)
18205 call cusparse_destroy(handle)
18206 write(*,*) <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-string">"Matrix-vector multiplication failed"</span>
18210 c print intermediate results (y)
18211 c y = [10 20 30 40 | 680 760 1230 2240]
18213 c cudaMemcpyDeviceToHost=2
18214 cudaStat1 = cuda_memcpy_c2fort_real(yHostPtr, y, 2*n*8, 2)
18215 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">if</span> (cudaStat1 /= 0) then
18216 call cuda_free(cooRowIndex)
18217 call cuda_free(cooColIndex)
18218 call cuda_free(cooVal)
18219 call cuda_free(xInd)
18220 call cuda_free(xVal)
18222 call cuda_free(csrRowPtr)
18223 call cusparse_destroy_mat_descr(descrA)
18224 call cusparse_destroy(handle)
18225 write(*,*) <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-string">"Memcpy from Device to Host failed"</span>
18228 write(*,*) <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-string">"Intermediate results:"</span>
18229 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">do</span> j=1,2
18230 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">do</span> i=1,n
18231 write(*,*) <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-string">"yHostPtr["</span>,i,<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-string">","</span>,j,<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-string">"]="</span>,yHostPtr(i+n*(j-1))
18235 c exercise Level 3 routines (csrmm)
18237 c CUSPARSE_STATUS_SUCCESS=0
18238 c CUSPARSE_OPERATION_NON_TRANSPOSE=0
18239 cudaStat1 = cuda_malloc(z, 2*(n+1)*8)
18240 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">if</span> (cudaStat1 /= 0) then
18241 call cuda_free(cooRowIndex)
18242 call cuda_free(cooColIndex)
18243 call cuda_free(cooVal)
18244 call cuda_free(xInd)
18245 call cuda_free(xVal)
18247 call cuda_free(csrRowPtr)
18248 call cusparse_destroy_mat_descr(descrA)
18249 call cusparse_destroy(handle)
18250 write(*,*) <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-string">"Device malloc failed (z)"</span>
18253 cudaStat1 = cuda_memset(z, 0, 2*(n+1)*8)
18254 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">if</span> (cudaStat1 /= 0) then
18255 call cuda_free(cooRowIndex)
18256 call cuda_free(cooColIndex)
18257 call cuda_free(cooVal)
18258 call cuda_free(xInd)
18259 call cuda_free(xVal)
18262 call cuda_free(csrRowPtr)
18263 call cusparse_destroy_mat_descr(descrA)
18264 call cusparse_destroy(handle)
18265 write(*,*) <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-string">"Memset on Device failed"</span>
18268 status= cusparse_dcsrmm(handle, 0, n, 2, n, nnz, dfive,
18269 $ descrA, cooVal, csrRowPtr, cooColIndex,
18270 $ y, n, dzero, z, n+1)
18271 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">if</span> (status /= 0) then
18272 call cuda_free(cooRowIndex)
18273 call cuda_free(cooColIndex)
18274 call cuda_free(cooVal)
18275 call cuda_free(xInd)
18276 call cuda_free(xVal)
18279 call cuda_free(csrRowPtr)
18280 call cusparse_destroy_mat_descr(descrA)
18281 call cusparse_destroy(handle)
18282 write(*,*) <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-string">"Matrix-matrix multiplication failed"</span>
18286 c print final results (z)
18288 c cudaMemcpyDeviceToHost=2
18289 cudaStat1 = cuda_memcpy_c2fort_real(zHostPtr, z, 2*(n+1)*8, 2)
18290 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">if</span> (cudaStat1 /= 0) then
18291 call cuda_free(cooRowIndex)
18292 call cuda_free(cooColIndex)
18293 call cuda_free(cooVal)
18294 call cuda_free(xInd)
18295 call cuda_free(xVal)
18298 call cuda_free(csrRowPtr)
18299 call cusparse_destroy_mat_descr(descrA)
18300 call cusparse_destroy(handle)
18301 write(*,*) <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-string">"Memcpy from Device to Host failed"</span>
18304 c z = [950 400 2550 2600 0 | 49300 15200 132300 131200 0]
18305 write(*,*) <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-string">"Final results:"</span>
18306 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">do</span> j=1,2
18307 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">do</span> i=1,n+1
18308 write(*,*) <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-string">"z["</span>,i,<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-string">","</span>,j,<span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-string">"]="</span>,zHostPtr(i+(n+1)*(j-1))
18312 c check the results
18313 epsilon = 1.0d-14
18314 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">if</span> ((DABS(zHostPtr(1) - 950.0) .GT. epsilon) .OR.
18315 $ (DABS(zHostPtr(2) - 400.0) .GT. epsilon) .OR.
18316 $ (DABS(zHostPtr(3) - 2550.0) .GT. epsilon) .OR.
18317 $ (DABS(zHostPtr(4) - 2600.0) .GT. epsilon) .OR.
18318 $ (DABS(zHostPtr(5) - 0.0) .GT. epsilon) .OR.
18319 $ (DABS(zHostPtr(6) - 49300.0) .GT. epsilon) .OR.
18320 $ (DABS(zHostPtr(7) - 15200.0) .GT. epsilon) .OR.
18321 $ (DABS(zHostPtr(8) - 132300.0).GT. epsilon) .OR.
18322 $ (DABS(zHostPtr(9) - 131200.0).GT. epsilon) .OR.
18323 $ (DABS(zHostPtr(10) - 0.0) .GT. epsilon) .OR.
18324 $ (DABS(yHostPtr(1) - 10.0) .GT. epsilon) .OR.
18325 $ (DABS(yHostPtr(2) - 20.0) .GT. epsilon) .OR.
18326 $ (DABS(yHostPtr(3) - 30.0) .GT. epsilon) .OR.
18327 $ (DABS(yHostPtr(4) - 40.0) .GT. epsilon) .OR.
18328 $ (DABS(yHostPtr(5) - 680.0) .GT. epsilon) .OR.
18329 $ (DABS(yHostPtr(6) - 760.0) .GT. epsilon) .OR.
18330 $ (DABS(yHostPtr(7) - 1230.0) .GT. epsilon) .OR.
18331 $ (DABS(yHostPtr(8) - 2240.0) .GT. epsilon)) then
18332 write(*,*) <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-string">"fortran example test FAILED"</span>
18333 <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-keyword">else</span>
18334 write(*,*) <span xmlns:xslthl="http://xslthl.sf.net" class="xslthl-string">"fortran example test PASSED"</span>
18337 c deallocate GPU memory and exit
18338 call cuda_free(cooRowIndex)
18339 call cuda_free(cooColIndex)
18340 call cuda_free(cooVal)
18341 call cuda_free(xInd)
18342 call cuda_free(xVal)
18345 call cuda_free(csrRowPtr)
18346 call cusparse_destroy_mat_descr(descrA)
18347 call cusparse_destroy(handle)
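<p class="p">The expected values quoted in the comments of the Fortran example can be cross-checked on the host without a GPU. The following is a minimal C sketch and not part of the CUSPARSE API: the helper names (<samp class="ph codeph">coo2csr_ref</samp>, <samp class="ph codeph">csrmv_ref</samp>, <samp class="ph codeph">csrmm_ref</samp>, <samp class="ph codeph">run_reference</samp>) are hypothetical, and the 4x4 test matrix, the sparse vector, the initial contents of <samp class="ph codeph">y</samp>, and the csrmv scalars 2 and 3 are assumptions taken to match the setup earlier in this example.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Assumed test data: 4x4 matrix with 9 nonzeros in row-sorted COO form,
   one-based indices (CUSPARSE_INDEX_BASE_ONE). */
enum { N = 4, NNZ = 9, NNZ_VECTOR = 3 };

static const int    coo_row[NNZ] = {1, 1, 1, 2, 3, 3, 3, 4, 4};
static const int    coo_col[NNZ] = {1, 3, 4, 2, 1, 3, 4, 2, 4};
static const double coo_val[NNZ] = {1, 2, 3, 4, 5, 6, 7, 8, 9};

/* Compress sorted COO row indices into a CSR row pointer of length n+1.
   The result uses the same index base as the input. */
void coo2csr_ref(const int *row, int nnz, int n, int *row_ptr, int base)
{
    for (int i = 0; i <= n; ++i) row_ptr[i] = 0;
    for (int k = 0; k < nnz; ++k) {
        int r = row[k] - base;
        assert(r >= 0 && r < n);
        row_ptr[r + 1] += 1;                 /* count entries per row */
    }
    row_ptr[0] = base;
    for (int i = 0; i < n; ++i)
        row_ptr[i + 1] += row_ptr[i];        /* running sum gives row starts */
}

/* y = alpha * A * x + beta * y for a CSR matrix A with the given index base. */
void csrmv_ref(int n, double alpha, const double *val, const int *row_ptr,
               const int *col, int base, const double *x, double beta,
               double *y)
{
    for (int i = 0; i < n; ++i) {
        double sum = 0.0;
        for (int k = row_ptr[i] - base; k < row_ptr[i + 1] - base; ++k)
            sum += val[k] * x[col[k] - base];
        y[i] = alpha * sum + beta * y[i];
    }
}

/* C = alpha * A * B + beta * C, column by column, with B and C stored
   column-major using leading dimensions ldb and ldc. */
void csrmm_ref(int n, int ncols, double alpha, const double *val,
               const int *row_ptr, const int *col, int base,
               const double *B, int ldb, double beta, double *C, int ldc)
{
    for (int j = 0; j < ncols; ++j)
        csrmv_ref(n, alpha, val, row_ptr, col, base,
                  B + (size_t)j * ldb, beta, C + (size_t)j * ldc);
}

/* Replays the steps of the example on the host: conversion, scatter,
   matrix-vector product, matrix-matrix product. */
void run_reference(int row_ptr[N + 1], double y[2 * N], double z[2 * (N + 1)])
{
    /* Assumed sparse vector and initial dense vector. */
    const int    x_ind[NNZ_VECTOR] = {1, 2, 4};
    const double x_val[NNZ_VECTOR] = {100.0, 200.0, 400.0};
    const double y0[2 * N] = {10, 20, 30, 40, 50, 60, 70, 80};

    for (int i = 0; i < 2 * N; ++i) y[i] = y0[i];
    for (int i = 0; i < 2 * (N + 1); ++i) z[i] = 0.0;

    coo2csr_ref(coo_row, NNZ, N, row_ptr, 1);

    /* Scatter the sparse vector into the second half of y. */
    for (int k = 0; k < NNZ_VECTOR; ++k)
        y[N + x_ind[k] - 1] = x_val[k];

    /* Second half of y = 2 * A * (first half of y) + 3 * (second half of y). */
    csrmv_ref(N, 2.0, coo_val, row_ptr, coo_col, 1, y, 3.0, y + N);

    /* z = 5 * A * y + 0 * z, with ldb = n and ldc = n + 1. */
    csrmm_ref(N, 2, 5.0, coo_val, row_ptr, coo_col, 1, y, N, 0.0, z, N + 1);
}
```

<p class="p">Under these assumptions, <samp class="ph codeph">run_reference</samp> fills <samp class="ph codeph">y</samp> and <samp class="ph codeph">z</samp> with the intermediate and final values listed in the comments of the example, including the zero padding that results from the leading dimension <samp class="ph codeph">n+1</samp> of <samp class="ph codeph">z</samp>. The row pointer comes out as 1 4 5 8 10 for one-based indexing, which is the zero-based sequence 0 3 4 7 9 shifted by one.</p>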
18354 <div class="topic concept nested0" id="appendix-acknowledgements"><a name="appendix-acknowledgements" shape="rect">
18355 <!-- --></a><h2 class="title topictitle1"><a href="#appendix-acknowledgements" name="appendix-acknowledgements" shape="rect">Appendix C: Acknowledgements</a></h2>
18356 <div class="body conbody">
18358 NVIDIA would like to thank the following individuals and
18359 institutions for their contributions:
18364 The cusparse&lt;t&gt;gtsv implementation is derived from a version developed by Li-Wen Chang from the University of Illinois.
18370 <div class="topic concept nested0" id="bibliography"><a name="bibliography" shape="rect">
18371 <!-- --></a><h2 class="title topictitle1"><a href="#bibliography" name="bibliography" shape="rect">15. Bibliography</a></h2>
18372 <div class="body conbody">
18373 <p class="p">[1] N. Bell and M. Garland, <a class="xref" href="http://research.nvidia.com/content/implementing-sparse-matrix-vector-multiplication-throughput-oriented-processors" target="_blank" shape="rect">“Implementing Sparse Matrix-Vector Multiplication on Throughput-Oriented Processors”</a>, Supercomputing, 2009.
18375 <p class="p">[2] R. Grimes, D. Kincaid, and D. Young, “ITPACK 2.0 User’s Guide”, Technical Report CNA-150, Center for Numerical Analysis,
18376 University of Texas, 1979.
18378 <p class="p">[3] M. Naumov, <a class="xref" href="http://developer.nvidia.com/content/accelerated-solution-sparse-linear-systems" target="_blank" shape="rect">“Incomplete-LU and Cholesky Preconditioned Iterative Methods Using CUSPARSE and CUBLAS”</a>, Technical Report and White Paper, 2011.
18382 <div class="topic concept nested0" id="notices-header"><a name="notices-header" shape="rect">
18383 <!-- --></a><h2 class="title topictitle1"><a href="#notices-header" name="notices-header" shape="rect">Notices</a></h2>
18384 <div class="topic reference nested1" id="notice"><a name="notice" shape="rect">
18385 <!-- --></a><h3 class="title topictitle2"><a href="#notice" name="notice" shape="rect"></a></h3>
18386 <div class="body refbody">
18387 <div class="section">
18388 <h3 class="title sectiontitle">Notice</h3>
18389 <p class="p">ALL NVIDIA DESIGN SPECIFICATIONS, REFERENCE BOARDS, FILES, DRAWINGS, DIAGNOSTICS, LISTS, AND OTHER DOCUMENTS (TOGETHER AND
18390 SEPARATELY, "MATERIALS") ARE BEING PROVIDED "AS IS." NVIDIA MAKES NO WARRANTIES, EXPRESSED, IMPLIED, STATUTORY, OR OTHERWISE
18391 WITH RESPECT TO THE MATERIALS, AND EXPRESSLY DISCLAIMS ALL IMPLIED WARRANTIES OF NONINFRINGEMENT, MERCHANTABILITY, AND FITNESS
18392 FOR A PARTICULAR PURPOSE.
18394 <p class="p">Information furnished is believed to be accurate and reliable. However, NVIDIA Corporation assumes no responsibility for the
18395 consequences of use of such information or for any infringement of patents or other rights of third parties that may result
18396 from its use. No license is granted by implication or otherwise under any patent rights of NVIDIA Corporation. Specifications
18397 mentioned in this publication are subject to change without notice. This publication supersedes and replaces all other information
18398 previously supplied. NVIDIA Corporation products are not authorized as critical components in life support devices or systems
18399 without express written approval of NVIDIA Corporation.
18404 <div class="topic reference nested1" id="trademarks"><a name="trademarks" shape="rect">
18405 <!-- --></a><h3 class="title topictitle2"><a href="#trademarks" name="trademarks" shape="rect"></a></h3>
18406 <div class="body refbody">
18407 <div class="section">
18408 <h3 class="title sectiontitle">Trademarks</h3>
18409 <p class="p">NVIDIA and the NVIDIA logo are trademarks or registered trademarks of NVIDIA Corporation
18410 in the U.S. and other countries. Other company and product names may be trademarks of
18411 the respective companies with which they are associated.
18416 <div class="topic reference nested1" id="copyright-past-to-present"><a name="copyright-past-to-present" shape="rect">
18417 <!-- --></a><h3 class="title topictitle2"><a href="#copyright-past-to-present" name="copyright-past-to-present" shape="rect"></a></h3>
18418 <div class="body refbody">
18419 <div class="section">
18420 <h3 class="title sectiontitle">Copyright</h3>
18421 <p class="p">© <span class="ph">2007</span>-<span class="ph">2013</span> NVIDIA
18422 Corporation. All rights reserved.
18429 <hr id="contents-end"></hr>
18443 <header id="header"><span id="company">NVIDIA</span><span id="site-title">CUDA Toolkit Documentation</span><form id="search" method="get" action="search">
18444 <input type="text" name="search-text"></input><fieldset id="search-location">
18445 <legend>Search In:</legend>
18446 <label><input type="radio" name="search-type" value="site"></input>Entire Site</label>
18447 <label><input type="radio" name="search-type" value="document"></input>Just This Document</label></fieldset>
18448 <button type="reset">clear search</button>
18449 <button id="submit" type="submit">search</button></form>
18451 <nav id="site-nav">
18452 <div class="category closed"><span class="twiddle">▷</span><a href="../index.html" title="The root of the site.">CUDA Toolkit</a></div>
18453 <ul class="closed">
18493 <div class="category"><span class="twiddle">▼</span><a href="index.html" title="CUSPARSE">CUSPARSE</a></div>
18495 <li><a href="#introduction">1. Introduction</a><ul>
18496 <li><a href="#naming-convention">1.1. Naming Conventions</a></li>
18497 <li><a href="#asynchronous-execution">1.2. Asynchronous Execution</a></li>
18500 <li><a href="#using-the-cusparse-api">2. Using the CUSPARSE API</a><ul>
18501 <li><a href="#thread-safety">2.1. Thread Safety</a></li>
18502 <li><a href="#scalar-parameters2">2.2. Scalar Parameters</a></li>
18503 <li><a href="#parallelism-with-streams">2.3. Parallelism with Streams</a></li>
18506 <li><a href="#cusparse-indexing-and-data-formats">3. CUSPARSE Indexing and Data Formats</a><ul>
18507 <li><a href="#index-base-format">3.1. Index Base Format</a></li>
18508 <li><a href="#vector-formats">3.2. Vector Formats</a><ul>
18509 <li><a href="#dense-format">3.2.1. Dense Format</a></li>
18510 <li><a href="#sparse-format">3.2.2. Sparse Format</a></li>
18513 <li><a href="#matrix-formats">3.3. Matrix Formats</a><ul>
18514 <li><a href="#dense-format2">3.3.1. Dense Format</a></li>
18515 <li><a href="#coordinate-format-coo">3.3.2. Coordinate Format (COO)</a></li>
18516 <li><a href="#compressed-sparse-row-format-csr">3.3.3. Compressed Sparse Row Format (CSR)</a></li>
18517 <li><a href="#compressed-sparse-column-format-csc">3.3.4. Compressed Sparse Column Format (CSC)</a></li>
18518 <li><a href="#ellpack-itpack-format-ell">3.3.5. Ellpack-Itpack Format (ELL)</a></li>
18519 <li><a href="#hybrid-format-hyb">3.3.6. Hybrid Format (HYB)</a></li>
18520 <li><a href="#block-compressed-sparse-row-format-bsr">3.3.7. Block Compressed Sparse Row Format (BSR)</a></li>
18521 <li><a href="#extended-bsr-format-bsrx">3.3.8. Extended BSR Format (BSRX)</a></li>
18526 <li><a href="#cusparse-types-reference">4. CUSPARSE Types Reference</a><ul>
18527 <li><a href="#data-types">4.1. Data types</a></li>
18528 <li><a href="#cusparseactiont">4.2. cusparseAction_t</a></li>
18529 <li><a href="#cusparsedirectiont">4.3. cusparseDirection_t</a></li>
18530 <li><a href="#cusparsehandlet">4.4. cusparseHandle_t</a></li>
18531 <li><a href="#cusparsehybmatt">4.5. cusparseHybMat_t</a><ul>
18532 <li><a href="#cusparsehybpartitiont">4.5.1. cusparseHybPartition_t</a></li>
18535 <li><a href="#cusparsematdescrt">4.6. cusparseMatDescr_t</a><ul>
18536 <li><a href="#cusparsediagtypet">4.6.1. cusparseDiagType_t</a></li>
18537 <li><a href="#cusparsefillmodet">4.6.2. cusparseFillMode_t</a></li>
18538 <li><a href="#cusparseindexbaset">4.6.3. cusparseIndexBase_t</a></li>
18539 <li><a href="#cusparsematrixtypet">4.6.4. cusparseMatrixType_t</a></li>
18542 <li><a href="#cusparseoperationt">4.7. cusparseOperation_t</a></li>
18543 <li><a href="#cusparsepointermode_t">4.8. cusparsePointerMode_t</a></li>
18544 <li><a href="#cusparsesolveanalysisinfot">4.9. cusparseSolveAnalysisInfo_t</a></li>
18545 <li><a href="#cusparsestatust">4.10. cusparseStatus_t</a></li>
18548 <li><a href="#cusparse-helper-function-reference">5. CUSPARSE Helper Function Reference</a><ul>
18549 <li><a href="#cusparsecreate">5.1. cusparseCreate()</a></li>
18550 <li><a href="#cusparsecreatehybmat">5.2. cusparseCreateHybMat()</a></li>
18551 <li><a href="#cusparsecreatematdescr">5.3. cusparseCreateMatDescr()</a></li>
18552 <li><a href="#cusparsecreatesolveanalysisinfo">5.4. cusparseCreateSolveAnalysisInfo()</a></li>
18553 <li><a href="#cusparsedestroy">5.5. cusparseDestroy()</a></li>
18554 <li><a href="#cusparsedestroyhybmat">5.6. cusparseDestroyHybMat()</a></li>
18555 <li><a href="#cusparsedestroymatdescr">5.7. cusparseDestroyMatDescr()</a></li>
18556 <li><a href="#cusparsedestroysolveanalysisinfo">5.8. cusparseDestroySolveAnalysisInfo()</a></li>
18557 <li><a href="#cusparsegetlevelinfo">5.9. cusparseGetLevelInfo()</a></li>
18558 <li><a href="#cusparsegetmatdiagtype">5.10. cusparseGetMatDiagType()</a></li>
18559 <li><a href="#cusparsegetmatfillmode">5.11. cusparseGetMatFillMode()</a></li>
18560 <li><a href="#cusparsegetmatindexbase">5.12. cusparseGetMatIndexBase()</a></li>
18561 <li><a href="#cusparsegetmattype">5.13. cusparseGetMatType()</a></li>
18562 <li><a href="#cusparsegetpointermode">5.14. cusparseGetPointerMode()</a></li>
18563 <li><a href="#cusparsegetversion">5.15. cusparseGetVersion()</a></li>
18564 <li><a href="#cusparsesetmatdiagtype">5.16. cusparseSetMatDiagType()</a></li>
18565 <li><a href="#cusparsesetmatfillmode">5.17. cusparseSetMatFillMode()</a></li>
18566 <li><a href="#cusparsesetmatindexbase">5.18. cusparseSetMatIndexBase()</a></li>
18567 <li><a href="#cusparsesetmattype">5.19. cusparseSetMatType()</a></li>
18568 <li><a href="#cusparsesetpointermode">5.20. cusparseSetPointerMode()</a></li>
18569 <li><a href="#cusparsesetstream">5.21. cusparseSetStream()</a></li>
18572 <li><a href="#cusparse-level-1-function-reference">6. CUSPARSE Level 1 Function Reference</a><ul>
18573 <li><a href="#cusparse-lt-t-gt-axpyi">6.1. cusparse&lt;t&gt;axpyi</a></li>
18574 <li><a href="#cusparse-lt-t-gt-doti">6.2. cusparse&lt;t&gt;doti</a></li>
18575 <li><a href="#cusparse-lt-t-gt-dotci">6.3. cusparse&lt;t&gt;dotci</a></li>
18576 <li><a href="#cusparse-lt-t-gt-gthr">6.4. cusparse&lt;t&gt;gthr</a></li>
18577 <li><a href="#cusparse-lt-t-gt-gthrz">6.5. cusparse&lt;t&gt;gthrz</a></li>
18578 <li><a href="#cusparse-lt-t-gt-roti">6.6. cusparse&lt;t&gt;roti</a></li>
18579 <li><a href="#cusparse-lt-t-gt-sctr">6.7. cusparse&lt;t&gt;sctr</a></li>
18582 <li><a href="#cusparse-level-2-function-reference">7. CUSPARSE Level 2 Function Reference</a><ul>
18583 <li><a href="#cusparse-lt-t-gt-bsrmv">7.1. cusparse&lt;t&gt;bsrmv</a></li>
18584 <li><a href="#cusparse-lt-t-gt-bsrxmv">7.2. cusparse&lt;t&gt;bsrxmv</a></li>
18585 <li><a href="#cusparse-lt-t-gt-csrmv">7.3. cusparse&lt;t&gt;csrmv</a></li>
18586 <li><a href="#cusparse-lt-t-gt-csrsvanalysis">7.4. cusparse&lt;t&gt;csrsv_analysis</a></li>
18587 <li><a href="#cusparse-lt-t-gt-csrsvsolve">7.5. cusparse&lt;t&gt;csrsv_solve</a></li>
18588 <li><a href="#cusparse-lt-t-gt-hybmv">7.6. cusparse&lt;t&gt;hybmv</a></li>
18589 <li><a href="#cusparse-lt-t-gt-hybsvanalysis">7.7. cusparse&lt;t&gt;hybsv_analysis</a></li>
18590 <li><a href="#cusparse-lt-t-gt-hybsvsolve">7.8. cusparse&lt;t&gt;hybsv_solve</a></li>
18593 <li><a href="#cusparse-level-3-function-reference">8. CUSPARSE Level 3 Function Reference</a><ul>
18594 <li><a href="#cusparse-lt-t-gt-csrmm">8.1. cusparse&lt;t&gt;csrmm</a></li>
18595 <li><a href="#cusparse-lt-t-gt-csrmm2">8.2. cusparse&lt;t&gt;csrmm2</a></li>
18596 <li><a href="#cusparse-lt-t-gt-csrsmanalysis">8.3. cusparse&lt;t&gt;csrsm_analysis</a></li>
18597 <li><a href="#cusparse-lt-t-gt-csrsmsolve">8.4. cusparse&lt;t&gt;csrsm_solve</a></li>
<li><a href="#cusparse-extra-function-reference">9. CUSPARSE Extra Function Reference</a><ul>
<li><a href="#cusparse-lt-t-gt-csrgeam">9.1. cusparse&lt;t&gt;csrgeam</a></li>
<li><a href="#cusparse-lt-t-gt-csrgemm">9.2. cusparse&lt;t&gt;csrgemm</a></li>
<li><a href="#cusparse-preconditioners-reference">10. CUSPARSE Preconditioners Reference</a><ul>
<li><a href="#cusparse-lt-t-gt-csric0">10.1. cusparse&lt;t&gt;csric0</a></li>
<li><a href="#cusparse-lt-t-gt-csrilu0">10.2. cusparse&lt;t&gt;csrilu0</a></li>
<li><a href="#cusparse-lt-t-gt-gtsv">10.3. cusparse&lt;t&gt;gtsv</a></li>
<li><a href="#cusparse-lt-t-gt-gtsv_nopivot">10.4. cusparse&lt;t&gt;gtsv_nopivot</a></li>
<li><a href="#cusparse-lt-t-gt-gtsvstridedbatch">10.5. cusparse&lt;t&gt;gtsvStridedBatch</a></li>
<li><a href="#cusparse-format-conversion-reference">11. CUSPARSE Format Conversion Reference</a><ul>
<li><a href="#cusparse-lt-t-gt-bsr2csr">11.1. cusparse&lt;t&gt;bsr2csr</a></li>
<li><a href="#cusparse-lt-t-gt-coo2csr">11.2. cusparse&lt;t&gt;coo2csr</a></li>
<li><a href="#cusparse-lt-t-gt-csc2dense">11.3. cusparse&lt;t&gt;csc2dense</a></li>
<li><a href="#cusparse-lt-t-gt-csc2hyb">11.4. cusparse&lt;t&gt;csc2hyb</a></li>
<li><a href="#cusparse-lt-t-gt-csr2bsr">11.5. cusparse&lt;t&gt;csr2bsr</a></li>
<li><a href="#cusparse-lt-t-gt-csr2coo">11.6. cusparse&lt;t&gt;csr2coo</a></li>
<li><a href="#cusparse-lt-t-gt-csr2csc">11.7. cusparse&lt;t&gt;csr2csc</a></li>
<li><a href="#cusparse-lt-t-gt-csr2dense">11.8. cusparse&lt;t&gt;csr2dense</a></li>
<li><a href="#cusparse-lt-t-gt-csr2hyb">11.9. cusparse&lt;t&gt;csr2hyb</a></li>
<li><a href="#cusparse-lt-t-gt-dense2csc">11.10. cusparse&lt;t&gt;dense2csc</a></li>
<li><a href="#cusparse-lt-t-gt-dense2csr">11.11. cusparse&lt;t&gt;dense2csr</a></li>
<li><a href="#cusparse-lt-t-gt-dense2hyb">11.12. cusparse&lt;t&gt;dense2hyb</a></li>
<li><a href="#cusparse-lt-t-gt-hyb2csc">11.13. cusparse&lt;t&gt;hyb2csc</a></li>
<li><a href="#cusparse-lt-t-gt-hyb2csr">11.14. cusparse&lt;t&gt;hyb2csr</a></li>
<li><a href="#cusparse-lt-t-gt-hyb2dense">11.15. cusparse&lt;t&gt;hyb2dense</a></li>
<li><a href="#cusparse-lt-t-gt-nnz">11.16. cusparse&lt;t&gt;nnz</a></li>
<li><a href="#appendix-b-cusparse-library-c---example">12. Appendix A: CUSPARSE Library C++ Example</a></li>
<li><a href="#appendix-c-cusparse-fortran-bindings">13. Appendix B: CUSPARSE Fortran Bindings</a><ul>
<li><a href="#example-b">13.1. Example B, Fortran Application</a></li>
<li><a href="#appendix-acknowledgements">14. Appendix C: Acknowledgements</a></li>
<li><a href="#bibliography">15. Bibliography</a></li>
<nav id="search-results">
<h2>Search Results</h2>
<script language="JavaScript" type="text/javascript" charset="utf-8" src="../common/formatting/common.min.js"></script>