MAGMA  2.4.0
Matrix Algebra for GPU and Multicore Architectures

Functions

magma_int_t magma_svalinit_gpu (magma_int_t num_el, magmaFloat_ptr dval, magma_queue_t queue)
 Initializes a device array with zero. More...
 
magma_int_t magma_sindexinit_gpu (magma_int_t num_el, magmaIndex_ptr dind, magma_queue_t queue)
 Initializes a device array with zero. More...
 
magma_int_t magma_sbajac_csr (magma_int_t localiters, magma_s_matrix D, magma_s_matrix R, magma_s_matrix b, magma_s_matrix *x, magma_queue_t queue)
 This routine is a block-asynchronous Jacobi iteration performing s local Jacobi-updates within the block. More...
 
magma_int_t magma_sbajac_csr_overlap (magma_int_t localiters, magma_int_t matrices, magma_int_t overlap, magma_s_matrix *D, magma_s_matrix *R, magma_s_matrix b, magma_s_matrix *x, magma_queue_t queue)
 This routine is a block-asynchronous Jacobi iteration with directed restricted additive Schwarz overlap (top-down) performing s local Jacobi-updates within the block. More...
 
magma_int_t magma_scompact (magma_int_t m, magma_int_t n, magmaFloat_ptr dA, magma_int_t ldda, magmaFloat_ptr dnorms, float tol, magmaInt_ptr active, magmaInt_ptr cBlock, magma_queue_t queue)
 ZCOMPACT takes a set of n vectors of size m (in dA) and their norms and compacts them into the cBlock size<=n vectors that have norms > tol. More...
 
magma_int_t magma_sjacobisetup_vector_gpu (magma_int_t num_rows, magma_s_matrix b, magma_s_matrix d, magma_s_matrix c, magma_s_matrix *x, magma_queue_t queue)
 Prepares the Jacobi Iteration according to x^(k+1) = D^(-1) * b - D^(-1) * (L+U) * x^k x^(k+1) = c - M * x^k. More...
 
magma_int_t magma_slobpcg_maxpy (magma_int_t num_rows, magma_int_t num_vecs, magmaFloat_ptr X, magmaFloat_ptr Y, magma_queue_t queue)
 This routine computes a axpy for a mxn matrix: More...
 
magma_int_t magma_sbicgstab_1 (magma_int_t num_rows, magma_int_t num_cols, float beta, float omega, magmaFloat_ptr r, magmaFloat_ptr v, magmaFloat_ptr p, magma_queue_t queue)
 Mergels multiple operations into one kernel: More...
 
magma_int_t magma_sbicgstab_2 (magma_int_t num_rows, magma_int_t num_cols, float alpha, magmaFloat_ptr r, magmaFloat_ptr v, magmaFloat_ptr s, magma_queue_t queue)
 Mergels multiple operations into one kernel: More...
 
magma_int_t magma_sbicgstab_3 (magma_int_t num_rows, magma_int_t num_cols, float alpha, float omega, magmaFloat_ptr p, magmaFloat_ptr s, magmaFloat_ptr t, magmaFloat_ptr x, magmaFloat_ptr r, magma_queue_t queue)
 Mergels multiple operations into one kernel: More...
 
magma_int_t magma_sbicgstab_4 (magma_int_t num_rows, magma_int_t num_cols, float alpha, float omega, magmaFloat_ptr y, magmaFloat_ptr z, magmaFloat_ptr s, magmaFloat_ptr t, magmaFloat_ptr x, magmaFloat_ptr r, magma_queue_t queue)
 Mergels multiple operations into one kernel: More...
 
magma_int_t magma_sbicgmerge_spmv1 (magma_s_matrix A, magmaFloat_ptr d1, magmaFloat_ptr d2, magmaFloat_ptr dp, magmaFloat_ptr dr, magmaFloat_ptr dv, magmaFloat_ptr skp, magma_queue_t queue)
 Merges the first SpmV using CSR with the dot product and the computation of alpha. More...
 
magma_int_t magma_sbicgmerge_spmv2 (magma_s_matrix A, magmaFloat_ptr d1, magmaFloat_ptr d2, magmaFloat_ptr ds, magmaFloat_ptr dt, magmaFloat_ptr skp, magma_queue_t queue)
 Merges the second SpmV using CSR with the dot product and the computation of omega. More...
 
magma_int_t magma_sbicgmerge_xrbeta (magma_int_t n, magmaFloat_ptr d1, magmaFloat_ptr d2, magmaFloat_ptr rr, magmaFloat_ptr r, magmaFloat_ptr p, magmaFloat_ptr s, magmaFloat_ptr t, magmaFloat_ptr x, magmaFloat_ptr skp, magma_queue_t queue)
 Merges the second SpmV using CSR with the dot product and the computation of omega. More...
 
magma_int_t magma_sbicgmerge1 (magma_int_t n, magmaFloat_ptr skp, magmaFloat_ptr v, magmaFloat_ptr r, magmaFloat_ptr p, magma_queue_t queue)
 Mergels multiple operations into one kernel: More...
 
magma_int_t magma_sbicgmerge2 (magma_int_t n, magmaFloat_ptr skp, magmaFloat_ptr r, magmaFloat_ptr v, magmaFloat_ptr s, magma_queue_t queue)
 Mergels multiple operations into one kernel: More...
 
magma_int_t magma_sbicgmerge3 (magma_int_t n, magmaFloat_ptr skp, magmaFloat_ptr p, magmaFloat_ptr s, magmaFloat_ptr t, magmaFloat_ptr x, magmaFloat_ptr r, magma_queue_t queue)
 Mergels multiple operations into one kernel: More...
 
magma_int_t magma_sbicgmerge4 (magma_int_t type, magmaFloat_ptr skp, magma_queue_t queue)
 Performs some parameter operations for the BiCGSTAB with scalars on GPU. More...
 
magma_int_t magma_smergeblockkrylov (magma_int_t num_rows, magma_int_t num_cols, magmaFloat_ptr alpha, magmaFloat_ptr p, magmaFloat_ptr x, magma_queue_t queue)
 Mergels multiple operations into one kernel: More...
 
magma_int_t magma_scgmerge_spmv1 (magma_s_matrix A, magmaFloat_ptr d1, magmaFloat_ptr d2, magmaFloat_ptr dd, magmaFloat_ptr dz, magmaFloat_ptr skp, magma_queue_t queue)
 Merges the first SpmV using different formats with the dot product and the computation of rho. More...
 
magma_int_t magma_scgs_1 (magma_int_t num_rows, magma_int_t num_cols, float beta, magmaFloat_ptr r, magmaFloat_ptr q, magmaFloat_ptr u, magmaFloat_ptr p, magma_queue_t queue)
 Mergels multiple operations into one kernel: More...
 
magma_int_t magma_scgs_2 (magma_int_t num_rows, magma_int_t num_cols, magmaFloat_ptr r, magmaFloat_ptr u, magmaFloat_ptr p, magma_queue_t queue)
 Mergels multiple operations into one kernel: More...
 
magma_int_t magma_scgs_3 (magma_int_t num_rows, magma_int_t num_cols, float alpha, magmaFloat_ptr v_hat, magmaFloat_ptr u, magmaFloat_ptr q, magmaFloat_ptr t, magma_queue_t queue)
 Mergels multiple operations into one kernel: More...
 
magma_int_t magma_scgs_4 (magma_int_t num_rows, magma_int_t num_cols, float alpha, magmaFloat_ptr u_hat, magmaFloat_ptr t, magmaFloat_ptr x, magmaFloat_ptr r, magma_queue_t queue)
 Mergels multiple operations into one kernel: More...
 
magma_int_t magma_sidr_smoothing_1 (magma_int_t num_rows, magma_int_t num_cols, magmaFloat_ptr drs, magmaFloat_ptr dr, magmaFloat_ptr dt, magma_queue_t queue)
 Mergels multiple operations into one kernel: More...
 
magma_int_t magma_sidr_smoothing_2 (magma_int_t num_rows, magma_int_t num_cols, float omega, magmaFloat_ptr dx, magmaFloat_ptr dxs, magma_queue_t queue)
 Mergels multiple operations into one kernel: More...
 
magma_int_t magma_sqmr_1 (magma_int_t num_rows, magma_int_t num_cols, float rho, float psi, magmaFloat_ptr y, magmaFloat_ptr z, magmaFloat_ptr v, magmaFloat_ptr w, magma_queue_t queue)
 Mergels multiple operations into one kernel: More...
 
magma_int_t magma_sqmr_2 (magma_int_t num_rows, magma_int_t num_cols, float pde, float rde, magmaFloat_ptr y, magmaFloat_ptr z, magmaFloat_ptr p, magmaFloat_ptr q, magma_queue_t queue)
 Mergels multiple operations into one kernel: More...
 
magma_int_t magma_sqmr_3 (magma_int_t num_rows, magma_int_t num_cols, float beta, magmaFloat_ptr pt, magmaFloat_ptr v, magmaFloat_ptr y, magma_queue_t queue)
 Mergels multiple operations into one kernel: More...
 
magma_int_t magma_sqmr_4 (magma_int_t num_rows, magma_int_t num_cols, float eta, magmaFloat_ptr p, magmaFloat_ptr pt, magmaFloat_ptr d, magmaFloat_ptr s, magmaFloat_ptr x, magmaFloat_ptr r, magma_queue_t queue)
 Mergels multiple operations into one kernel: More...
 
magma_int_t magma_sqmr_5 (magma_int_t num_rows, magma_int_t num_cols, float eta, float pds, magmaFloat_ptr p, magmaFloat_ptr pt, magmaFloat_ptr d, magmaFloat_ptr s, magmaFloat_ptr x, magmaFloat_ptr r, magma_queue_t queue)
 Mergels multiple operations into one kernel: More...
 
magma_int_t magma_sqmr_6 (magma_int_t num_rows, magma_int_t num_cols, float beta, float rho, float psi, magmaFloat_ptr y, magmaFloat_ptr z, magmaFloat_ptr v, magmaFloat_ptr w, magmaFloat_ptr wt, magma_queue_t queue)
 Mergels multiple operations into one kernel: More...
 
magma_int_t magma_sqmr_7 (magma_int_t num_rows, magma_int_t num_cols, float beta, magmaFloat_ptr pt, magmaFloat_ptr v, magmaFloat_ptr vt, magma_queue_t queue)
 Mergels multiple operations into one kernel: More...
 
magma_int_t magma_sqmr_8 (magma_int_t num_rows, magma_int_t num_cols, float rho, float psi, magmaFloat_ptr vt, magmaFloat_ptr wt, magmaFloat_ptr y, magmaFloat_ptr z, magmaFloat_ptr v, magmaFloat_ptr w, magma_queue_t queue)
 Mergels multiple operations into one kernel: More...
 
magma_int_t magma_stfqmr_1 (magma_int_t num_rows, magma_int_t num_cols, float alpha, float sigma, magmaFloat_ptr v, magmaFloat_ptr Au, magmaFloat_ptr u_m, magmaFloat_ptr pu_m, magmaFloat_ptr u_mp1, magmaFloat_ptr w, magmaFloat_ptr d, magmaFloat_ptr Ad, magma_queue_t queue)
 Mergels multiple operations into one kernel: More...
 
magma_int_t magma_stfqmr_2 (magma_int_t num_rows, magma_int_t num_cols, float eta, magmaFloat_ptr d, magmaFloat_ptr Ad, magmaFloat_ptr x, magmaFloat_ptr r, magma_queue_t queue)
 Mergels multiple operations into one kernel: More...
 
magma_int_t magma_stfqmr_3 (magma_int_t num_rows, magma_int_t num_cols, float beta, magmaFloat_ptr w, magmaFloat_ptr u_m, magmaFloat_ptr u_mp1, magma_queue_t queue)
 Mergels multiple operations into one kernel: More...
 
magma_int_t magma_stfqmr_4 (magma_int_t num_rows, magma_int_t num_cols, float beta, magmaFloat_ptr Au_new, magmaFloat_ptr v, magmaFloat_ptr Au, magma_queue_t queue)
 Merges multiple operations into one kernel: More...
 
magma_int_t magma_stfqmr_5 (magma_int_t num_rows, magma_int_t num_cols, float alpha, float sigma, magmaFloat_ptr v, magmaFloat_ptr Au, magmaFloat_ptr u_mp1, magmaFloat_ptr w, magmaFloat_ptr d, magmaFloat_ptr Ad, magma_queue_t queue)
 Mergels multiple operations into one kernel: More...
 
magma_int_t magma_sparic_csr (magma_s_matrix A, magma_s_matrix A_CSR, magma_queue_t queue)
 This routine iteratively computes an incomplete LU factorization. More...
 
magma_int_t magma_sparilu_csr (magma_s_matrix A, magma_s_matrix L, magma_s_matrix U, magma_queue_t queue)
 This routine iteratively computes an incomplete LU factorization. More...
 
magma_int_t magma_sparilut_sweep_gpu (magma_s_matrix *A, magma_s_matrix *L, magma_s_matrix *U, magma_queue_t queue)
 This function does an ParILUT sweep. More...
 
magma_int_t magma_sparilut_residuals_gpu (magma_s_matrix A, magma_s_matrix L, magma_s_matrix U, magma_s_matrix *R, magma_queue_t queue)
 This function computes the ILU residual in the locations included in the sparsity pattern of R. More...
 

Detailed Description

Function Documentation

◆ magma_svalinit_gpu()

magma_int_t magma_svalinit_gpu ( magma_int_t  num_el,
magmaFloat_ptr  dval,
magma_queue_t  queue 
)

Initializes a device array with zero.

Parameters
[in]num_elmagma_int_t size of array
[in,out]dvalmagmaFloat_ptr array to initialize
[in]queuemagma_queue_t Queue to execute in.

◆ magma_sindexinit_gpu()

magma_int_t magma_sindexinit_gpu ( magma_int_t  num_el,
magmaIndex_ptr  dind,
magma_queue_t  queue 
)

Initializes a device array with zero.

Parameters
[in]num_elmagma_int_t size of array
[in,out]dindmagmaIndex_ptr array to initialize
[in]queuemagma_queue_t Queue to execute in.

◆ magma_sbajac_csr()

magma_int_t magma_sbajac_csr ( magma_int_t  localiters,
magma_s_matrix  D,
magma_s_matrix  R,
magma_s_matrix  b,
magma_s_matrix *  x,
magma_queue_t  queue 
)

This routine is a block-asynchronous Jacobi iteration performing s local Jacobi-updates within the block.

Input format is two CSR matrices, one containing the diagonal blocks, one containing the rest.

Parameters
[in]localitersmagma_int_t number of local Jacobi-like updates
[in]Dmagma_s_matrix input matrix with diagonal blocks
[in]Rmagma_s_matrix input matrix with non-diagonal parts
[in]bmagma_s_matrix RHS
[in]xmagma_s_matrix* iterate/solution
[in]queuemagma_queue_t Queue to execute in.

◆ magma_sbajac_csr_overlap()

magma_int_t magma_sbajac_csr_overlap ( magma_int_t  localiters,
magma_int_t  matrices,
magma_int_t  overlap,
magma_s_matrix *  D,
magma_s_matrix *  R,
magma_s_matrix  b,
magma_s_matrix *  x,
magma_queue_t  queue 
)

This routine is a block-asynchronous Jacobi iteration with directed restricted additive Schwarz overlap (top-down) performing s local Jacobi-updates within the block.

Input format is two CSR matrices, one containing the diagonal blocks, one containing the rest.

Parameters
[in]localitersmagma_int_t number of local Jacobi-like updates
[in]matricesmagma_int_t number of sub-matrices
[in]overlapmagma_int_t size of the overlap
[in]Dmagma_s_matrix* set of matrices with diagonal blocks
[in]Rmagma_s_matrix* set of matrices with non-diagonal parts
[in]bmagma_s_matrix RHS
[in]xmagma_s_matrix* iterate/solution
[in]queuemagma_queue_t Queue to execute in.

◆ magma_scompact()

magma_int_t magma_scompact ( magma_int_t  m,
magma_int_t  n,
magmaFloat_ptr  dA,
magma_int_t  ldda,
magmaFloat_ptr  dnorms,
float  tol,
magmaInt_ptr  active,
magmaInt_ptr  cBlock,
magma_queue_t  queue 
)

ZCOMPACT takes a set of n vectors of size m (in dA) and their norms and compacts them into the cBlock size<=n vectors that have norms > tol.

The active mask array has 1 or 0, showing if a vector remained or not in the compacted resulting set of vectors.

Parameters
[in]mINTEGER The number of rows of the matrix dA. M >= 0.
[in]nINTEGER The number of columns of the matrix dA. N >= 0.
[in,out]dACOMPLEX REAL array, dimension (LDDA,N) The m by n matrix dA.
[in]lddaINTEGER The leading dimension of the array dA. LDDA >= max(1,M).
[in]dnormsREAL array, dimension N The norms of the N vectors in dA
[in]tolDOUBLE PRECISON The tolerance value used in the criteria to compact or not.
[in,out]activeINTEGER array, dimension N A mask of 1s and 0s showing if a vector remains or has been removed
[in,out]cBlockmagmaInt_ptr The number of vectors that remain in dA (i.e., with norms > tol).
[in]queuemagma_queue_t Queue to execute in.

◆ magma_sjacobisetup_vector_gpu()

magma_int_t magma_sjacobisetup_vector_gpu ( magma_int_t  num_rows,
magma_s_matrix  b,
magma_s_matrix  d,
magma_s_matrix  c,
magma_s_matrix *  x,
magma_queue_t  queue 
)

Prepares the Jacobi Iteration according to x^(k+1) = D^(-1) * b - D^(-1) * (L+U) * x^k x^(k+1) = c - M * x^k.

Returns the vector c. It calls a GPU kernel

Parameters
[in]num_rowsmagma_int_t number of rows
[in]bmagma_s_matrix RHS b
[in]dmagma_s_matrix vector with diagonal entries
[out]cmagma_s_matrix* c = D^(-1) * b
[out]xmagma_s_matrix* iteration vector
[in]queuemagma_queue_t Queue to execute in.

◆ magma_slobpcg_maxpy()

magma_int_t magma_slobpcg_maxpy ( magma_int_t  num_rows,
magma_int_t  num_vecs,
magmaFloat_ptr  X,
magmaFloat_ptr  Y,
magma_queue_t  queue 
)

This routine computes a axpy for a mxn matrix:

Y = X + Y

It replaces: magma_saxpy(m*n, c_one, Y, 1, X, 1);

/ x1[0] x2[0] x3[0] \
| x1[1] x2[1] x3[1] |

X = | x1[2] x2[2] x3[2] | = x1[0] x1[1] x1[2] x1[3] x1[4] x2[0] x2[1] . | x1[3] x2[3] x3[3] | \ x1[4] x2[4] x3[4] /

Parameters
[in]num_rowsmagma_int_t number of rows
[in]num_vecsmagma_int_t number of vectors
[in]XmagmaFloat_ptr input vector X
[in,out]YmagmaFloat_ptr input/output vector Y
[in]queuemagma_queue_t Queue to execute in.

◆ magma_sbicgstab_1()

magma_int_t magma_sbicgstab_1 ( magma_int_t  num_rows,
magma_int_t  num_cols,
float  beta,
float  omega,
magmaFloat_ptr  r,
magmaFloat_ptr  v,
magmaFloat_ptr  p,
magma_queue_t  queue 
)

Mergels multiple operations into one kernel:

p = r + beta * ( p - omega * v )

Parameters
[in]num_rowsmagma_int_t dimension m
[in]num_colsmagma_int_t dimension n
[in]betafloat scalar
[in]omegafloat scalar
[in]rmagmaFloat_ptr vector
[in]vmagmaFloat_ptr vector
[in,out]pmagmaFloat_ptr vector
[in]queuemagma_queue_t Queue to execute in.

◆ magma_sbicgstab_2()

magma_int_t magma_sbicgstab_2 ( magma_int_t  num_rows,
magma_int_t  num_cols,
float  alpha,
magmaFloat_ptr  r,
magmaFloat_ptr  v,
magmaFloat_ptr  s,
magma_queue_t  queue 
)

Mergels multiple operations into one kernel:

s = r - alpha v

Parameters
[in]num_rowsmagma_int_t dimension m
[in]num_colsmagma_int_t dimension n
[in]alphafloat scalar
[in]rmagmaFloat_ptr vector
[in]vmagmaFloat_ptr vector
[in,out]smagmaFloat_ptr vector
[in]queuemagma_queue_t Queue to execute in.

◆ magma_sbicgstab_3()

magma_int_t magma_sbicgstab_3 ( magma_int_t  num_rows,
magma_int_t  num_cols,
float  alpha,
float  omega,
magmaFloat_ptr  p,
magmaFloat_ptr  s,
magmaFloat_ptr  t,
magmaFloat_ptr  x,
magmaFloat_ptr  r,
magma_queue_t  queue 
)

Mergels multiple operations into one kernel:

x = x + alpha * p + omega * s r = s - omega * t

Parameters
[in]num_rowsmagma_int_t dimension m
[in]num_colsmagma_int_t dimension n
[in]alphafloat scalar
[in]omegafloat scalar
[in]pmagmaFloat_ptr vector
[in]smagmaFloat_ptr vector
[in]tmagmaFloat_ptr vector
[in,out]xmagmaFloat_ptr vector
[in,out]rmagmaFloat_ptr vector
[in]queuemagma_queue_t Queue to execute in.

◆ magma_sbicgstab_4()

magma_int_t magma_sbicgstab_4 ( magma_int_t  num_rows,
magma_int_t  num_cols,
float  alpha,
float  omega,
magmaFloat_ptr  y,
magmaFloat_ptr  z,
magmaFloat_ptr  s,
magmaFloat_ptr  t,
magmaFloat_ptr  x,
magmaFloat_ptr  r,
magma_queue_t  queue 
)

Mergels multiple operations into one kernel:

x = x + alpha * y + omega * z r = s - omega * t

Parameters
[in]num_rowsmagma_int_t dimension m
[in]num_colsmagma_int_t dimension n
[in]alphafloat scalar
[in]omegafloat scalar
[in]ymagmaFloat_ptr vector
[in]zmagmaFloat_ptr vector
[in]smagmaFloat_ptr vector
[in]tmagmaFloat_ptr vector
[in,out]xmagmaFloat_ptr vector
[in,out]rmagmaFloat_ptr vector
[in]queuemagma_queue_t Queue to execute in.

◆ magma_sbicgmerge_spmv1()

magma_int_t magma_sbicgmerge_spmv1 ( magma_s_matrix  A,
magmaFloat_ptr  d1,
magmaFloat_ptr  d2,
magmaFloat_ptr  dp,
magmaFloat_ptr  dr,
magmaFloat_ptr  dv,
magmaFloat_ptr  skp,
magma_queue_t  queue 
)

Merges the first SpmV using CSR with the dot product and the computation of alpha.

Parameters
[in]Amagma_s_matrix system matrix
[in]d1magmaFloat_ptr temporary vector
[in]d2magmaFloat_ptr temporary vector
[in]dpmagmaFloat_ptr input vector p
[in]drmagmaFloat_ptr input vector r
[in]dvmagmaFloat_ptr output vector v
[in,out]skpmagmaFloat_ptr array for parameters ( skp[0]=alpha )
[in]queuemagma_queue_t Queue to execute in.

◆ magma_sbicgmerge_spmv2()

magma_int_t magma_sbicgmerge_spmv2 ( magma_s_matrix  A,
magmaFloat_ptr  d1,
magmaFloat_ptr  d2,
magmaFloat_ptr  ds,
magmaFloat_ptr  dt,
magmaFloat_ptr  skp,
magma_queue_t  queue 
)

Merges the second SpmV using CSR with the dot product and the computation of omega.

Parameters
[in]Amagma_s_matrix input matrix
[in]d1magmaFloat_ptr temporary vector
[in]d2magmaFloat_ptr temporary vector
[in]dsmagmaFloat_ptr input vector s
[in]dtmagmaFloat_ptr output vector t
[in,out]skpmagmaFloat_ptr array for parameters
[in]queuemagma_queue_t Queue to execute in.

◆ magma_sbicgmerge_xrbeta()

magma_int_t magma_sbicgmerge_xrbeta ( magma_int_t  n,
magmaFloat_ptr  d1,
magmaFloat_ptr  d2,
magmaFloat_ptr  rr,
magmaFloat_ptr  r,
magmaFloat_ptr  p,
magmaFloat_ptr  s,
magmaFloat_ptr  t,
magmaFloat_ptr  x,
magmaFloat_ptr  skp,
magma_queue_t  queue 
)

Merges the second SpmV using CSR with the dot product and the computation of omega.

Parameters
[in]nint dimension n
[in]d1magmaFloat_ptr temporary vector
[in]d2magmaFloat_ptr temporary vector
[in]rrmagmaFloat_ptr input vector rr
[in]rmagmaFloat_ptr input/output vector r
[in]pmagmaFloat_ptr input vector p
[in]smagmaFloat_ptr input vector s
[in]tmagmaFloat_ptr input vector t
[out]xmagmaFloat_ptr output vector x
[in]skpmagmaFloat_ptr array for parameters
[in]queuemagma_queue_t Queue to execute in.

◆ magma_sbicgmerge1()

magma_int_t magma_sbicgmerge1 ( magma_int_t  n,
magmaFloat_ptr  skp,
magmaFloat_ptr  v,
magmaFloat_ptr  r,
magmaFloat_ptr  p,
magma_queue_t  queue 
)

Mergels multiple operations into one kernel:

p = beta*p p = p-omega*beta*v p = p+r

-> p = r + beta * ( p - omega * v )

Parameters
[in]nint dimension n
[in]skpmagmaFloat_ptr set of scalar parameters
[in]vmagmaFloat_ptr input vector v
[in]rmagmaFloat_ptr input vector r
[in,out]pmagmaFloat_ptr input/output vector p
[in]queuemagma_queue_t queue to execute in.

◆ magma_sbicgmerge2()

magma_int_t magma_sbicgmerge2 ( magma_int_t  n,
magmaFloat_ptr  skp,
magmaFloat_ptr  r,
magmaFloat_ptr  v,
magmaFloat_ptr  s,
magma_queue_t  queue 
)

Mergels multiple operations into one kernel:

s=r s=s-alpha*v

-> s = r - alpha * v

Parameters
[in]nint dimension n
[in]skpmagmaFloat_ptr set of scalar parameters
[in]rmagmaFloat_ptr input vector r
[in]vmagmaFloat_ptr input vector v
[out]smagmaFloat_ptr output vector s
[in]queuemagma_queue_t queue to execute in.

◆ magma_sbicgmerge3()

magma_int_t magma_sbicgmerge3 ( magma_int_t  n,
magmaFloat_ptr  skp,
magmaFloat_ptr  p,
magmaFloat_ptr  s,
magmaFloat_ptr  t,
magmaFloat_ptr  x,
magmaFloat_ptr  r,
magma_queue_t  queue 
)

Mergels multiple operations into one kernel:

x=x+alpha*p x=x+omega*s r=s r=r-omega*t

-> x = x + alpha * p + omega * s -> r = s - omega * t

Parameters
[in]nint dimension n
[in]skpmagmaFloat_ptr set of scalar parameters
[in]pmagmaFloat_ptr input p
[in]smagmaFloat_ptr input s
[in]tmagmaFloat_ptr input t
[in,out]xmagmaFloat_ptr input/output x
[in,out]rmagmaFloat_ptr input/output r
[in]queuemagma_queue_t Queue to execute in.

◆ magma_sbicgmerge4()

magma_int_t magma_sbicgmerge4 ( magma_int_t  type,
magmaFloat_ptr  skp,
magma_queue_t  queue 
)

Performs some parameter operations for the BiCGSTAB with scalars on GPU.

Parameters
[in]typeint kernel type
[in,out]skpmagmaFloat_ptr vector with parameters
[in]queuemagma_queue_t Queue to execute in.

◆ magma_smergeblockkrylov()

magma_int_t magma_smergeblockkrylov ( magma_int_t  num_rows,
magma_int_t  num_cols,
magmaFloat_ptr  alpha,
magmaFloat_ptr  p,
magmaFloat_ptr  x,
magma_queue_t  queue 
)

Mergels multiple operations into one kernel:

v = y / rho y = y / rho w = wt / psi z = z / psi

Parameters
[in]num_rowsmagma_int_t dimension m
[in]num_colsmagma_int_t dimension n
[in]alphamagmaFloat_ptr matrix containing all SKP
[in]pmagmaFloat_ptr search directions
[in,out]xmagmaFloat_ptr approximation vector
[in]queuemagma_queue_t Queue to execute in.

◆ magma_scgmerge_spmv1()

magma_int_t magma_scgmerge_spmv1 ( magma_s_matrix  A,
magmaFloat_ptr  d1,
magmaFloat_ptr  d2,
magmaFloat_ptr  dd,
magmaFloat_ptr  dz,
magmaFloat_ptr  skp,
magma_queue_t  queue 
)

Merges the first SpmV using different formats with the dot product and the computation of rho.

Parameters
[in]Amagma_s_matrix input matrix
[in]d1magmaFloat_ptr temporary vector
[in]d2magmaFloat_ptr temporary vector
[in]ddmagmaFloat_ptr input vector d
[out]dzmagmaFloat_ptr input vector z
[out]skpmagmaFloat_ptr array for parameters ( skp[3]=rho )
[in]queuemagma_queue_t Queue to execute in.

◆ magma_scgs_1()

magma_int_t magma_scgs_1 ( magma_int_t  num_rows,
magma_int_t  num_cols,
float  beta,
magmaFloat_ptr  r,
magmaFloat_ptr  q,
magmaFloat_ptr  u,
magmaFloat_ptr  p,
magma_queue_t  queue 
)

Mergels multiple operations into one kernel:

u = r + beta q p = u + beta*(q + beta*p)

Parameters
[in]num_rowsmagma_int_t dimension m
[in]num_colsmagma_int_t dimension n
[in]betafloat scalar
[in]rmagmaFloat_ptr vector
[in]qmagmaFloat_ptr vector
[in,out]umagmaFloat_ptr vector
[in,out]pmagmaFloat_ptr vector
[in]queuemagma_queue_t Queue to execute in.

◆ magma_scgs_2()

magma_int_t magma_scgs_2 ( magma_int_t  num_rows,
magma_int_t  num_cols,
magmaFloat_ptr  r,
magmaFloat_ptr  u,
magmaFloat_ptr  p,
magma_queue_t  queue 
)

Mergels multiple operations into one kernel:

u = r p = r

Parameters
[in]num_rowsmagma_int_t dimension m
[in]num_colsmagma_int_t dimension n
[in]rmagmaFloat_ptr vector
[in,out]umagmaFloat_ptr vector
[in,out]pmagmaFloat_ptr vector
[in]queuemagma_queue_t Queue to execute in.

◆ magma_scgs_3()

magma_int_t magma_scgs_3 ( magma_int_t  num_rows,
magma_int_t  num_cols,
float  alpha,
magmaFloat_ptr  v_hat,
magmaFloat_ptr  u,
magmaFloat_ptr  q,
magmaFloat_ptr  t,
magma_queue_t  queue 
)

Mergels multiple operations into one kernel:

q = u - alpha v_hat t = u + q

Parameters
[in]num_rowsmagma_int_t dimension m
[in]num_colsmagma_int_t dimension n
[in]alphafloat scalar
[in]v_hatmagmaFloat_ptr vector
[in]umagmaFloat_ptr vector
[in,out]qmagmaFloat_ptr vector
[in,out]tmagmaFloat_ptr vector
[in]queuemagma_queue_t Queue to execute in.

◆ magma_scgs_4()

magma_int_t magma_scgs_4 ( magma_int_t  num_rows,
magma_int_t  num_cols,
float  alpha,
magmaFloat_ptr  u_hat,
magmaFloat_ptr  t,
magmaFloat_ptr  x,
magmaFloat_ptr  r,
magma_queue_t  queue 
)

Mergels multiple operations into one kernel:

x = x + alpha u_hat r = r -alpha*A u_hat = r -alpha*t

Parameters
[in]num_rowsmagma_int_t dimension m
[in]num_colsmagma_int_t dimension n
[in]alphafloat scalar
[in]u_hatmagmaFloat_ptr vector
[in]tmagmaFloat_ptr vector
[in,out]xmagmaFloat_ptr vector
[in,out]rmagmaFloat_ptr vector
[in]queuemagma_queue_t Queue to execute in.

◆ magma_sidr_smoothing_1()

magma_int_t magma_sidr_smoothing_1 ( magma_int_t  num_rows,
magma_int_t  num_cols,
magmaFloat_ptr  drs,
magmaFloat_ptr  dr,
magmaFloat_ptr  dt,
magma_queue_t  queue 
)

Mergels multiple operations into one kernel:

dt = drs - dr

Parameters
[in]num_rowsmagma_int_t dimension m
[in]num_colsmagma_int_t dimension n
[in]drsmagmaFloat_ptr vector
[in]drmagmaFloat_ptr vector
[in,out]dtmagmaFloat_ptr vector
[in]queuemagma_queue_t Queue to execute in.

◆ magma_sidr_smoothing_2()

magma_int_t magma_sidr_smoothing_2 ( magma_int_t  num_rows,
magma_int_t  num_cols,
float  omega,
magmaFloat_ptr  dx,
magmaFloat_ptr  dxs,
magma_queue_t  queue 
)

Mergels multiple operations into one kernel:

dxs = dxs - gamma*(dxs-dx)

Parameters
[in]num_rowsmagma_int_t dimension m
[in]num_colsmagma_int_t dimension n
[in]omegafloat scalar
[in]dxmagmaFloat_ptr vector
[in,out]dxsmagmaFloat_ptr vector
[in]queuemagma_queue_t Queue to execute in.

◆ magma_sqmr_1()

magma_int_t magma_sqmr_1 ( magma_int_t  num_rows,
magma_int_t  num_cols,
float  rho,
float  psi,
magmaFloat_ptr  y,
magmaFloat_ptr  z,
magmaFloat_ptr  v,
magmaFloat_ptr  w,
magma_queue_t  queue 
)

Mergels multiple operations into one kernel:

v = y / rho y = y / rho w = wt / psi z = z / psi

Parameters
[in]num_rowsmagma_int_t dimension m
[in]num_colsmagma_int_t dimension n
[in]rhofloat scalar
[in]psifloat scalar
[in,out]ymagmaFloat_ptr vector
[in,out]zmagmaFloat_ptr vector
[in,out]vmagmaFloat_ptr vector
[in,out]wmagmaFloat_ptr vector
[in]queuemagma_queue_t Queue to execute in.

◆ magma_sqmr_2()

magma_int_t magma_sqmr_2 ( magma_int_t  num_rows,
magma_int_t  num_cols,
float  pde,
float  rde,
magmaFloat_ptr  y,
magmaFloat_ptr  z,
magmaFloat_ptr  p,
magmaFloat_ptr  q,
magma_queue_t  queue 
)

Mergels multiple operations into one kernel:

p = y - pde * p q = z - rde * q

Parameters
[in]num_rowsmagma_int_t dimension m
[in]num_colsmagma_int_t dimension n
[in]pdefloat scalar
[in]rdefloat scalar
[in]ymagmaFloat_ptr vector
[in]zmagmaFloat_ptr vector
[in,out]pmagmaFloat_ptr vector
[in,out]qmagmaFloat_ptr vector
[in]queuemagma_queue_t Queue to execute in.

◆ magma_sqmr_3()

magma_int_t magma_sqmr_3 ( magma_int_t  num_rows,
magma_int_t  num_cols,
float  beta,
magmaFloat_ptr  pt,
magmaFloat_ptr  v,
magmaFloat_ptr  y,
magma_queue_t  queue 
)

Mergels multiple operations into one kernel:

v = pt - beta * v y = v

Parameters
[in]num_rowsmagma_int_t dimension m
[in]num_colsmagma_int_t dimension n
[in]betafloat scalar
[in]ptmagmaFloat_ptr vector
[in,out]vmagmaFloat_ptr vector
[in,out]ymagmaFloat_ptr vector
[in]queuemagma_queue_t Queue to execute in.

◆ magma_sqmr_4()

magma_int_t magma_sqmr_4 ( magma_int_t  num_rows,
magma_int_t  num_cols,
float  eta,
magmaFloat_ptr  p,
magmaFloat_ptr  pt,
magmaFloat_ptr  d,
magmaFloat_ptr  s,
magmaFloat_ptr  x,
magmaFloat_ptr  r,
magma_queue_t  queue 
)

Mergels multiple operations into one kernel:

d = eta * p; s = eta * pt; x = x + d; r = r - s;

Parameters
[in]num_rowsmagma_int_t dimension m
[in]num_colsmagma_int_t dimension n
[in]etafloat scalar
[in]pmagmaFloat_ptr vector
[in]ptmagmaFloat_ptr vector
[in,out]dmagmaFloat_ptr vector
[in,out]smagmaFloat_ptr vector
[in,out]xmagmaFloat_ptr vector
[in,out]rmagmaFloat_ptr vector
[in]queuemagma_queue_t Queue to execute in.

◆ magma_sqmr_5()

magma_int_t magma_sqmr_5 ( magma_int_t  num_rows,
magma_int_t  num_cols,
float  eta,
float  pds,
magmaFloat_ptr  p,
magmaFloat_ptr  pt,
magmaFloat_ptr  d,
magmaFloat_ptr  s,
magmaFloat_ptr  x,
magmaFloat_ptr  r,
magma_queue_t  queue 
)

Mergels multiple operations into one kernel:

d = eta * p + pds * d; s = eta * pt + pds * s; x = x + d; r = r - s;

Parameters
[in]num_rowsmagma_int_t dimension m
[in]num_colsmagma_int_t dimension n
[in]etafloat scalar
[in]pdsfloat scalar
[in]pmagmaFloat_ptr vector
[in]ptmagmaFloat_ptr vector
[in,out]dmagmaFloat_ptr vector
[in,out]smagmaFloat_ptr vector
[in,out]xmagmaFloat_ptr vector
[in,out]rmagmaFloat_ptr vector
[in]queuemagma_queue_t Queue to execute in.

◆ magma_sqmr_6()

magma_int_t magma_sqmr_6 ( magma_int_t  num_rows,
magma_int_t  num_cols,
float  beta,
float  rho,
float  psi,
magmaFloat_ptr  y,
magmaFloat_ptr  z,
magmaFloat_ptr  v,
magmaFloat_ptr  w,
magmaFloat_ptr  wt,
magma_queue_t  queue 
)

Mergels multiple operations into one kernel:

wt = wt - conj(beta) * w v = y / rho y = y / rho w = wt / psi z = wt / psi

Parameters
[in]num_rowsmagma_int_t dimension m
[in]num_colsmagma_int_t dimension n
[in]betafloat scalar
[in]rhofloat scalar
[in]psifloat scalar
[in,out]ymagmaFloat_ptr vector
[in,out]zmagmaFloat_ptr vector
[in,out]vmagmaFloat_ptr vector
[in,out]wmagmaFloat_ptr vector
[in,out]wtmagmaFloat_ptr vector
[in]queuemagma_queue_t Queue to execute in.

◆ magma_sqmr_7()

magma_int_t magma_sqmr_7 ( magma_int_t  num_rows,
magma_int_t  num_cols,
float  beta,
magmaFloat_ptr  pt,
magmaFloat_ptr  v,
magmaFloat_ptr  vt,
magma_queue_t  queue 
)

Mergels multiple operations into one kernel:

vt = pt - beta * v

Parameters
[in]num_rowsmagma_int_t dimension m
[in]num_colsmagma_int_t dimension n
[in]betafloat scalar
[in]ptmagmaFloat_ptr vector
[in,out]vmagmaFloat_ptr vector
[in,out]vtmagmaFloat_ptr vector
[in]queuemagma_queue_t Queue to execute in.

◆ magma_sqmr_8()

magma_int_t magma_sqmr_8 ( magma_int_t  num_rows,
magma_int_t  num_cols,
float  rho,
float  psi,
magmaFloat_ptr  vt,
magmaFloat_ptr  wt,
magmaFloat_ptr  y,
magmaFloat_ptr  z,
magmaFloat_ptr  v,
magmaFloat_ptr  w,
magma_queue_t  queue 
)

Mergels multiple operations into one kernel:

v = y / rho y = y / rho w = wt / psi z = z / psi

Parameters
[in]num_rowsmagma_int_t dimension m
[in]num_colsmagma_int_t dimension n
[in]rhofloat scalar
[in]psifloat scalar
[in]vtmagmaFloat_ptr vector
[in]wtmagmaFloat_ptr vector
[in,out]ymagmaFloat_ptr vector
[in,out]zmagmaFloat_ptr vector
[in,out]vmagmaFloat_ptr vector
[in,out]wmagmaFloat_ptr vector
[in]queuemagma_queue_t Queue to execute in.

◆ magma_stfqmr_1()

magma_int_t magma_stfqmr_1 ( magma_int_t  num_rows,
magma_int_t  num_cols,
float  alpha,
float  sigma,
magmaFloat_ptr  v,
magmaFloat_ptr  Au,
magmaFloat_ptr  u_m,
magmaFloat_ptr  pu_m,
magmaFloat_ptr  u_mp1,
magmaFloat_ptr  w,
magmaFloat_ptr  d,
magmaFloat_ptr  Ad,
magma_queue_t  queue 
)

Mergels multiple operations into one kernel:

u_mp1 = u_mp1 - alpha*v; w = w - alpha*Au; d = pu_m + sigma*d; Ad = Au + sigma*Ad;

Parameters
[in]num_rowsmagma_int_t dimension m
[in]num_colsmagma_int_t dimension n
[in]alphafloat scalar
[in]sigmafloat scalar
[in]vmagmaFloat_ptr vector
[in]AumagmaFloat_ptr vector
[in,out]u_mmagmaFloat_ptr vector
[in,out]pu_mmagmaFloat_ptr vector
[in,out]u_mp1magmaFloat_ptr vector
[in,out]wmagmaFloat_ptr vector
[in,out]dmagmaFloat_ptr vector
[in,out]AdmagmaFloat_ptr vector
[in]queuemagma_queue_t Queue to execute in.

◆ magma_stfqmr_2()

magma_int_t magma_stfqmr_2 ( magma_int_t  num_rows,
magma_int_t  num_cols,
float  eta,
magmaFloat_ptr  d,
magmaFloat_ptr  Ad,
magmaFloat_ptr  x,
magmaFloat_ptr  r,
magma_queue_t  queue 
)

Mergels multiple operations into one kernel:

x = x + eta * d r = r - eta * Ad

Parameters
[in]num_rowsmagma_int_t dimension m
[in]num_colsmagma_int_t dimension n
[in]etafloat scalar
[in]dmagmaFloat_ptr vector
[in]AdmagmaFloat_ptr vector
[in,out]xmagmaFloat_ptr vector
[in,out]rmagmaFloat_ptr vector
[in]queuemagma_queue_t Queue to execute in.

◆ magma_stfqmr_3()

magma_int_t magma_stfqmr_3 ( magma_int_t  num_rows,
magma_int_t  num_cols,
float  beta,
magmaFloat_ptr  w,
magmaFloat_ptr  u_m,
magmaFloat_ptr  u_mp1,
magma_queue_t  queue 
)

Mergels multiple operations into one kernel:

u_mp1 = w + beta*u_mp1

Parameters
[in]num_rowsmagma_int_t dimension m
[in]num_colsmagma_int_t dimension n
[in]betafloat scalar
[in]wmagmaFloat_ptr vector
[in]u_mmagmaFloat_ptr vector
[in,out]u_mp1magmaFloat_ptr vector
[in]queuemagma_queue_t Queue to execute in.

◆ magma_stfqmr_4()

magma_int_t magma_stfqmr_4 ( magma_int_t  num_rows,
magma_int_t  num_cols,
float  beta,
magmaFloat_ptr  Au_new,
magmaFloat_ptr  v,
magmaFloat_ptr  Au,
magma_queue_t  queue 
)

Merges multiple operations into one kernel:

v = Au_new + beta*(Au+beta*v); Au = Au_new

Parameters
[in]num_rowsmagma_int_t dimension m
[in]num_colsmagma_int_t dimension n
[in]betafloat scalar
[in]Au_newmagmaFloat_ptr vector
[in,out]vmagmaFloat_ptr vector
[in,out]AumagmaFloat_ptr vector
[in]queuemagma_queue_t Queue to execute in.

◆ magma_stfqmr_5()

magma_int_t magma_stfqmr_5 ( magma_int_t  num_rows,
magma_int_t  num_cols,
float  alpha,
float  sigma,
magmaFloat_ptr  v,
magmaFloat_ptr  Au,
magmaFloat_ptr  u_mp1,
magmaFloat_ptr  w,
magmaFloat_ptr  d,
magmaFloat_ptr  Ad,
magma_queue_t  queue 
)

Mergels multiple operations into one kernel:

w = w - alpha*Au; d = pu_m + sigma*d; Ad = Au + sigma*Ad;

Parameters
[in]num_rowsmagma_int_t dimension m
[in]num_colsmagma_int_t dimension n
[in]alphafloat scalar
[in]sigmafloat scalar
[in]vmagmaFloat_ptr vector
[in]AumagmaFloat_ptr vector
[in,out]u_mp1magmaFloat_ptr vector
[in,out]wmagmaFloat_ptr vector
[in,out]dmagmaFloat_ptr vector
[in,out]AdmagmaFloat_ptr vector
[in]queuemagma_queue_t Queue to execute in.

◆ magma_sparic_csr()

magma_int_t magma_sparic_csr ( magma_s_matrix  A,
magma_s_matrix  A_CSR,
magma_queue_t  queue 
)

This routine iteratively computes an incomplete LU factorization.

For reference, see: E. Chow and A. Patel: "Fine-grained Parallel Incomplete LU Factorization", SIAM Journal on Scientific Computing, 37, C169-C193 (2015). This routine was used in the ISC 2015 paper: E. Chow et al.: "Asynchronous Iterative Algorithm for Computing Incomplete Factorizations on GPUs", ISC High Performance 2015, LNCS 9137, pp. 1Ð16, 2015.

The input format of the initial guess matrix A is Magma_CSRCOO, A_CSR is CSR or CSRCOO format.

Parameters
[in]Amagma_s_matrix input matrix A - initial guess (lower triangular)
[in,out]A_CSRmagma_s_matrix input/output matrix containing the IC approximation
[in]queuemagma_queue_t Queue to execute in.

◆ magma_sparilu_csr()

magma_int_t magma_sparilu_csr ( magma_s_matrix  A,
magma_s_matrix  L,
magma_s_matrix  U,
magma_queue_t  queue 
)

This routine iteratively computes an incomplete LU factorization.

For reference, see: E. Chow and A. Patel: "Fine-grained Parallel Incomplete LU Factorization", SIAM Journal on Scientific Computing, 37, C169-C193 (2015). This routine was used in the ISC 2015 paper: E. Chow et al.: "Asynchronous Iterative Algorithm for Computing Incomplete Factorizations on GPUs", ISC High Performance 2015, LNCS 9137, pp. 1Ð16, 2015.

The input format of the system matrix is COO, the lower triangular factor L is stored in CSR, the upper triangular factor U is transposed, then also stored in CSR (equivalent to CSC format for the non-transposed U). Every component of L and U is handled by one thread.

Parameters
[in]Amagma_s_matrix input matrix A determing initial guess & processing order
[in,out]Lmagma_s_matrix input/output matrix L containing the lower triangular factor
[in,out]Umagma_s_matrix input/output matrix U containing the upper triangular factor
[in]queuemagma_queue_t Queue to execute in.

◆ magma_sparilut_sweep_gpu()

magma_int_t magma_sparilut_sweep_gpu ( magma_s_matrix *  A,
magma_s_matrix *  L,
magma_s_matrix *  U,
magma_queue_t  queue 
)

This function does an ParILUT sweep.

The difference to the ParILU sweep is that the nonzero pattern of A and the incomplete factors L and U can be different. The pattern determing which elements are iterated are hence the pattern of L and U, not A. L has a unit diagonal.

This is the GPU version of the asynchronous ParILUT sweep.

Parameters
[in]Amagma_s_matrix* System matrix. The format is sorted CSR.
[in,out]Lmagma_s_matrix* Current approximation for the lower triangular factor The format is MAGMA_CSRCOO. This is sorted CSR plus the rowindexes being stored.
[in,out]Umagma_s_matrix* Current approximation for the lower triangular factor The format is MAGMA_CSRCOO. This is sorted CSR plus the rowindexes being stored.
[in]queuemagma_queue_t Queue to execute in.

◆ magma_sparilut_residuals_gpu()

magma_int_t magma_sparilut_residuals_gpu ( magma_s_matrix  A,
magma_s_matrix  L,
magma_s_matrix  U,
magma_s_matrix *  R,
magma_queue_t  queue 
)

This function computes the ILU residual in the locations included in the sparsity pattern of R.

Parameters
[in]Amagma_s_matrix System matrix. The format is sorted CSR.
[in]Lmagma_s_matrix Current approximation for the lower triangular factor The format is MAGMA_CSRCOO. This is sorted CSR plus the rowindexes being stored.
[in]Umagma_s_matrix Current approximation for the lower triangular factor The format is MAGMA_CSRCOO. This is sorted CSR plus the rowindexes being stored.
[in,out]Rmagma_s_matrix* Sparsity pattern on which the ILU residual is computed. R is in COO format. On output, R contains the ILU residual.
[in]queuemagma_queue_t Queue to execute in.