MAGMA  2.4.0
Matrix Algebra for GPU and Multicore Architectures

Functions

magma_int_t magma_cgetrf_batched (magma_int_t m, magma_int_t n, magmaFloatComplex **dA_array, magma_int_t ldda, magma_int_t **ipiv_array, magma_int_t *info_array, magma_int_t batchCount, magma_queue_t queue)
 CGETRF computes an LU factorization of a general M-by-N matrix A using partial pivoting with row interchanges. More...
 
magma_int_t magma_cgetrf_recpanel_batched (magma_int_t m, magma_int_t n, magma_int_t min_recpnb, magmaFloatComplex **dA_array, magma_int_t ai, magma_int_t aj, magma_int_t ldda, magma_int_t **dipiv_array, magma_int_t **dpivinfo_array, magma_int_t *info_array, magma_int_t gbstep, magma_int_t batchCount, magma_queue_t queue)
 This is an internal routine that might have many assumption. More...
 
magma_int_t magma_dgetrf_batched (magma_int_t m, magma_int_t n, double **dA_array, magma_int_t ldda, magma_int_t **ipiv_array, magma_int_t *info_array, magma_int_t batchCount, magma_queue_t queue)
 DGETRF computes an LU factorization of a general M-by-N matrix A using partial pivoting with row interchanges. More...
 
magma_int_t magma_dgetrf_recpanel_batched (magma_int_t m, magma_int_t n, magma_int_t min_recpnb, double **dA_array, magma_int_t ai, magma_int_t aj, magma_int_t ldda, magma_int_t **dipiv_array, magma_int_t **dpivinfo_array, magma_int_t *info_array, magma_int_t gbstep, magma_int_t batchCount, magma_queue_t queue)
 This is an internal routine that might have many assumption. More...
 
magma_int_t magma_sgetrf_batched (magma_int_t m, magma_int_t n, float **dA_array, magma_int_t ldda, magma_int_t **ipiv_array, magma_int_t *info_array, magma_int_t batchCount, magma_queue_t queue)
 SGETRF computes an LU factorization of a general M-by-N matrix A using partial pivoting with row interchanges. More...
 
magma_int_t magma_sgetrf_recpanel_batched (magma_int_t m, magma_int_t n, magma_int_t min_recpnb, float **dA_array, magma_int_t ai, magma_int_t aj, magma_int_t ldda, magma_int_t **dipiv_array, magma_int_t **dpivinfo_array, magma_int_t *info_array, magma_int_t gbstep, magma_int_t batchCount, magma_queue_t queue)
 This is an internal routine that might have many assumption. More...
 
magma_int_t magma_zgetrf_batched (magma_int_t m, magma_int_t n, magmaDoubleComplex **dA_array, magma_int_t ldda, magma_int_t **ipiv_array, magma_int_t *info_array, magma_int_t batchCount, magma_queue_t queue)
 ZGETRF computes an LU factorization of a general M-by-N matrix A using partial pivoting with row interchanges. More...
 
magma_int_t magma_zgetrf_recpanel_batched (magma_int_t m, magma_int_t n, magma_int_t min_recpnb, magmaDoubleComplex **dA_array, magma_int_t ai, magma_int_t aj, magma_int_t ldda, magma_int_t **dipiv_array, magma_int_t **dpivinfo_array, magma_int_t *info_array, magma_int_t gbstep, magma_int_t batchCount, magma_queue_t queue)
 This is an internal routine that might have many assumption. More...
 
magma_int_t magma_cgetf2_nopiv_internal_batched (magma_int_t m, magma_int_t n, magmaFloatComplex **dA_array, magma_int_t ai, magma_int_t aj, magma_int_t ldda, magma_int_t *info_array, magma_int_t gbstep, magma_int_t batchCount, magma_queue_t queue)
 cgetrf_batched_smallsq_noshfl computes the LU factorization of a square N-by-N matrix A using partial pivoting with row interchanges. More...
 
magma_int_t magma_cgetrf_batched_smallsq_noshfl (magma_int_t n, magmaFloatComplex **dA_array, magma_int_t ldda, magma_int_t **ipiv_array, magma_int_t *info_array, magma_int_t batchCount, magma_queue_t queue)
 cgetrf_batched_smallsq_noshfl computes the LU factorization of a square N-by-N matrix A using partial pivoting with row interchanges. More...
 
magma_int_t magma_cgetrf_batched_smallsq_shfl (magma_int_t n, magmaFloatComplex **dA_array, magma_int_t ldda, magma_int_t **ipiv_array, magma_int_t *info_array, magma_int_t batchCount, magma_queue_t queue)
 cgetrf_batched_smallsq_noshfl computes the LU factorization of a square N-by-N matrix A using partial pivoting with row interchanges. More...
 
magma_int_t magma_dgetf2_nopiv_internal_batched (magma_int_t m, magma_int_t n, double **dA_array, magma_int_t ai, magma_int_t aj, magma_int_t ldda, magma_int_t *info_array, magma_int_t gbstep, magma_int_t batchCount, magma_queue_t queue)
 dgetrf_batched_smallsq_noshfl computes the LU factorization of a square N-by-N matrix A using partial pivoting with row interchanges. More...
 
magma_int_t magma_dgetrf_batched_smallsq_noshfl (magma_int_t n, double **dA_array, magma_int_t ldda, magma_int_t **ipiv_array, magma_int_t *info_array, magma_int_t batchCount, magma_queue_t queue)
 dgetrf_batched_smallsq_noshfl computes the LU factorization of a square N-by-N matrix A using partial pivoting with row interchanges. More...
 
magma_int_t magma_dgetrf_batched_smallsq_shfl (magma_int_t n, double **dA_array, magma_int_t ldda, magma_int_t **ipiv_array, magma_int_t *info_array, magma_int_t batchCount, magma_queue_t queue)
 dgetrf_batched_smallsq_noshfl computes the LU factorization of a square N-by-N matrix A using partial pivoting with row interchanges. More...
 
magma_int_t magma_sgetf2_nopiv_internal_batched (magma_int_t m, magma_int_t n, float **dA_array, magma_int_t ai, magma_int_t aj, magma_int_t ldda, magma_int_t *info_array, magma_int_t gbstep, magma_int_t batchCount, magma_queue_t queue)
 sgetrf_batched_smallsq_noshfl computes the LU factorization of a square N-by-N matrix A using partial pivoting with row interchanges. More...
 
magma_int_t magma_sgetrf_batched_smallsq_noshfl (magma_int_t n, float **dA_array, magma_int_t ldda, magma_int_t **ipiv_array, magma_int_t *info_array, magma_int_t batchCount, magma_queue_t queue)
 sgetrf_batched_smallsq_noshfl computes the LU factorization of a square N-by-N matrix A using partial pivoting with row interchanges. More...
 
magma_int_t magma_sgetrf_batched_smallsq_shfl (magma_int_t n, float **dA_array, magma_int_t ldda, magma_int_t **ipiv_array, magma_int_t *info_array, magma_int_t batchCount, magma_queue_t queue)
 sgetrf_batched_smallsq_noshfl computes the LU factorization of a square N-by-N matrix A using partial pivoting with row interchanges. More...
 
magma_int_t magma_zgetf2_nopiv_internal_batched (magma_int_t m, magma_int_t n, magmaDoubleComplex **dA_array, magma_int_t ai, magma_int_t aj, magma_int_t ldda, magma_int_t *info_array, magma_int_t gbstep, magma_int_t batchCount, magma_queue_t queue)
 zgetrf_batched_smallsq_noshfl computes the LU factorization of a square N-by-N matrix A using partial pivoting with row interchanges. More...
 
magma_int_t magma_zgetrf_batched_smallsq_noshfl (magma_int_t n, magmaDoubleComplex **dA_array, magma_int_t ldda, magma_int_t **ipiv_array, magma_int_t *info_array, magma_int_t batchCount, magma_queue_t queue)
 zgetrf_batched_smallsq_noshfl computes the LU factorization of a square N-by-N matrix A using partial pivoting with row interchanges. More...
 
magma_int_t magma_zgetrf_batched_smallsq_shfl (magma_int_t n, magmaDoubleComplex **dA_array, magma_int_t ldda, magma_int_t **ipiv_array, magma_int_t *info_array, magma_int_t batchCount, magma_queue_t queue)
 zgetrf_batched_smallsq_noshfl computes the LU factorization of a square N-by-N matrix A using partial pivoting with row interchanges. More...
 

Detailed Description

Function Documentation

◆ magma_cgetrf_batched()

magma_int_t magma_cgetrf_batched ( magma_int_t  m,
magma_int_t  n,
magmaFloatComplex **  dA_array,
magma_int_t  ldda,
magma_int_t **  ipiv_array,
magma_int_t *  info_array,
magma_int_t  batchCount,
magma_queue_t  queue 
)

CGETRF computes an LU factorization of a general M-by-N matrix A using partial pivoting with row interchanges.

The factorization has the form A = P * L * U where P is a permutation matrix, L is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U is upper triangular (upper trapezoidal if m < n).

This is the right-looking Level 3 BLAS version of the algorithm.

This is a batched version that factors batchCount M-by-N matrices in parallel. dA, ipiv, and info become arrays with one entry per matrix.

Parameters
[in]mINTEGER The number of rows of each matrix A. M >= 0.
[in]nINTEGER The number of columns of each matrix A. N >= 0.
[in,out]dA_arrayArray of pointers, dimension (batchCount). Each is a COMPLEX array on the GPU, dimension (LDDA,N). On entry, each pointer is an M-by-N matrix to be factored. On exit, the factors L and U from the factorization A = P*L*U; the unit diagonal elements of L are not stored.
[in]lddaINTEGER The leading dimension of each array A. LDDA >= max(1,M).
[out]ipiv_arrayArray of pointers, dimension (batchCount), for corresponding matrices. Each is an INTEGER array, dimension (min(M,N)) The pivot indices; for 1 <= i <= min(M,N), row i of the matrix was interchanged with row IPIV(i).
[out]info_arrayArray of INTEGERs, dimension (batchCount), for corresponding matrices.
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value or another error occured, such as memory allocation failed.
  • > 0: if INFO = i, U(i,i) is exactly zero. The factorization has been completed, but the factor U is exactly singular, and division by zero will occur if it is used to solve a system of equations.
[in]batchCountINTEGER The number of matrices to operate on.
[in]queuemagma_queue_t Queue to execute in.

◆ magma_cgetrf_recpanel_batched()

magma_int_t magma_cgetrf_recpanel_batched ( magma_int_t  m,
magma_int_t  n,
magma_int_t  min_recpnb,
magmaFloatComplex **  dA_array,
magma_int_t  ai,
magma_int_t  aj,
magma_int_t  ldda,
magma_int_t **  dipiv_array,
magma_int_t **  dpivinfo_array,
magma_int_t *  info_array,
magma_int_t  gbstep,
magma_int_t  batchCount,
magma_queue_t  queue 
)

This is an internal routine that might have many assumption.

Documentation is not fully completed

CGETRF_PANEL computes an LU factorization of a general M-by-N matrix A using partial pivoting with row interchanges.

The factorization has the form A = P * L * U where P is a permutation matrix, L is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U is upper triangular (upper trapezoidal if m < n).

This is the right-looking Level 3 BLAS version of the algorithm.

This is a batched version that factors batchCount M-by-N matrices in parallel. dA, ipiv, and info become arrays with one entry per matrix.

Parameters
[in]mINTEGER The number of rows of each matrix A. M >= 0.
[in]nINTEGER The number of columns of each matrix A. N >= 0.
[in]min_recpnbINTEGER. Internal use. The recursive nb
[in,out]dA_arrayArray of pointers, dimension (batchCount). Each is a COMPLEX array on the GPU, dimension (LDDA,N). On entry, each pointer is an M-by-N matrix to be factored. On exit, the factors L and U from the factorization A = P*L*U; the unit diagonal elements of L are not stored.
[in]lddaINTEGER The leading dimension of each array A. LDDA >= max(1,M).
[out]dipiv_arrayArray of pointers, dimension (batchCount), for corresponding matrices. Each is an INTEGER array, dimension (min(M,N)) The pivot indices; for 1 <= i <= min(M,N), row i of the matrix was interchanged with row IPIV(i).
[out]dpivinfo_arrayArray of pointers, dimension (batchCount), for internal use.
[out]info_arrayArray of INTEGERs, dimension (batchCount), for corresponding matrices.
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value or another error occured, such as memory allocation failed.
  • > 0: if INFO = i, U(i,i) is exactly zero. The factorization has been completed, but the factor U is exactly singular, and division by zero will occur if it is used to solve a system of equations.
[in]gbstepINTEGER internal use.
[in]batchCountINTEGER The number of matrices to operate on.
[in]queuemagma_queue_t Queue to execute in.

◆ magma_dgetrf_batched()

magma_int_t magma_dgetrf_batched ( magma_int_t  m,
magma_int_t  n,
double **  dA_array,
magma_int_t  ldda,
magma_int_t **  ipiv_array,
magma_int_t *  info_array,
magma_int_t  batchCount,
magma_queue_t  queue 
)

DGETRF computes an LU factorization of a general M-by-N matrix A using partial pivoting with row interchanges.

The factorization has the form A = P * L * U where P is a permutation matrix, L is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U is upper triangular (upper trapezoidal if m < n).

This is the right-looking Level 3 BLAS version of the algorithm.

This is a batched version that factors batchCount M-by-N matrices in parallel. dA, ipiv, and info become arrays with one entry per matrix.

Parameters
[in]mINTEGER The number of rows of each matrix A. M >= 0.
[in]nINTEGER The number of columns of each matrix A. N >= 0.
[in,out]dA_arrayArray of pointers, dimension (batchCount). Each is a DOUBLE PRECISION array on the GPU, dimension (LDDA,N). On entry, each pointer is an M-by-N matrix to be factored. On exit, the factors L and U from the factorization A = P*L*U; the unit diagonal elements of L are not stored.
[in]lddaINTEGER The leading dimension of each array A. LDDA >= max(1,M).
[out]ipiv_arrayArray of pointers, dimension (batchCount), for corresponding matrices. Each is an INTEGER array, dimension (min(M,N)) The pivot indices; for 1 <= i <= min(M,N), row i of the matrix was interchanged with row IPIV(i).
[out]info_arrayArray of INTEGERs, dimension (batchCount), for corresponding matrices.
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value or another error occured, such as memory allocation failed.
  • > 0: if INFO = i, U(i,i) is exactly zero. The factorization has been completed, but the factor U is exactly singular, and division by zero will occur if it is used to solve a system of equations.
[in]batchCountINTEGER The number of matrices to operate on.
[in]queuemagma_queue_t Queue to execute in.

◆ magma_dgetrf_recpanel_batched()

magma_int_t magma_dgetrf_recpanel_batched ( magma_int_t  m,
magma_int_t  n,
magma_int_t  min_recpnb,
double **  dA_array,
magma_int_t  ai,
magma_int_t  aj,
magma_int_t  ldda,
magma_int_t **  dipiv_array,
magma_int_t **  dpivinfo_array,
magma_int_t *  info_array,
magma_int_t  gbstep,
magma_int_t  batchCount,
magma_queue_t  queue 
)

This is an internal routine that might have many assumption.

Documentation is not fully completed

DGETRF_PANEL computes an LU factorization of a general M-by-N matrix A using partial pivoting with row interchanges.

The factorization has the form A = P * L * U where P is a permutation matrix, L is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U is upper triangular (upper trapezoidal if m < n).

This is the right-looking Level 3 BLAS version of the algorithm.

This is a batched version that factors batchCount M-by-N matrices in parallel. dA, ipiv, and info become arrays with one entry per matrix.

Parameters
[in]mINTEGER The number of rows of each matrix A. M >= 0.
[in]nINTEGER The number of columns of each matrix A. N >= 0.
[in]min_recpnbINTEGER. Internal use. The recursive nb
[in,out]dA_arrayArray of pointers, dimension (batchCount). Each is a DOUBLE PRECISION array on the GPU, dimension (LDDA,N). On entry, each pointer is an M-by-N matrix to be factored. On exit, the factors L and U from the factorization A = P*L*U; the unit diagonal elements of L are not stored.
[in]lddaINTEGER The leading dimension of each array A. LDDA >= max(1,M).
[out]dipiv_arrayArray of pointers, dimension (batchCount), for corresponding matrices. Each is an INTEGER array, dimension (min(M,N)) The pivot indices; for 1 <= i <= min(M,N), row i of the matrix was interchanged with row IPIV(i).
[out]dpivinfo_arrayArray of pointers, dimension (batchCount), for internal use.
[out]info_arrayArray of INTEGERs, dimension (batchCount), for corresponding matrices.
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value or another error occured, such as memory allocation failed.
  • > 0: if INFO = i, U(i,i) is exactly zero. The factorization has been completed, but the factor U is exactly singular, and division by zero will occur if it is used to solve a system of equations.
[in]gbstepINTEGER internal use.
[in]batchCountINTEGER The number of matrices to operate on.
[in]queuemagma_queue_t Queue to execute in.

◆ magma_sgetrf_batched()

magma_int_t magma_sgetrf_batched ( magma_int_t  m,
magma_int_t  n,
float **  dA_array,
magma_int_t  ldda,
magma_int_t **  ipiv_array,
magma_int_t *  info_array,
magma_int_t  batchCount,
magma_queue_t  queue 
)

SGETRF computes an LU factorization of a general M-by-N matrix A using partial pivoting with row interchanges.

The factorization has the form A = P * L * U where P is a permutation matrix, L is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U is upper triangular (upper trapezoidal if m < n).

This is the right-looking Level 3 BLAS version of the algorithm.

This is a batched version that factors batchCount M-by-N matrices in parallel. dA, ipiv, and info become arrays with one entry per matrix.

Parameters
[in]mINTEGER The number of rows of each matrix A. M >= 0.
[in]nINTEGER The number of columns of each matrix A. N >= 0.
[in,out]dA_arrayArray of pointers, dimension (batchCount). Each is a REAL array on the GPU, dimension (LDDA,N). On entry, each pointer is an M-by-N matrix to be factored. On exit, the factors L and U from the factorization A = P*L*U; the unit diagonal elements of L are not stored.
[in]lddaINTEGER The leading dimension of each array A. LDDA >= max(1,M).
[out]ipiv_arrayArray of pointers, dimension (batchCount), for corresponding matrices. Each is an INTEGER array, dimension (min(M,N)) The pivot indices; for 1 <= i <= min(M,N), row i of the matrix was interchanged with row IPIV(i).
[out]info_arrayArray of INTEGERs, dimension (batchCount), for corresponding matrices.
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value or another error occured, such as memory allocation failed.
  • > 0: if INFO = i, U(i,i) is exactly zero. The factorization has been completed, but the factor U is exactly singular, and division by zero will occur if it is used to solve a system of equations.
[in]batchCountINTEGER The number of matrices to operate on.
[in]queuemagma_queue_t Queue to execute in.

◆ magma_sgetrf_recpanel_batched()

magma_int_t magma_sgetrf_recpanel_batched ( magma_int_t  m,
magma_int_t  n,
magma_int_t  min_recpnb,
float **  dA_array,
magma_int_t  ai,
magma_int_t  aj,
magma_int_t  ldda,
magma_int_t **  dipiv_array,
magma_int_t **  dpivinfo_array,
magma_int_t *  info_array,
magma_int_t  gbstep,
magma_int_t  batchCount,
magma_queue_t  queue 
)

This is an internal routine that might have many assumption.

Documentation is not fully completed

SGETRF_PANEL computes an LU factorization of a general M-by-N matrix A using partial pivoting with row interchanges.

The factorization has the form A = P * L * U where P is a permutation matrix, L is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U is upper triangular (upper trapezoidal if m < n).

This is the right-looking Level 3 BLAS version of the algorithm.

This is a batched version that factors batchCount M-by-N matrices in parallel. dA, ipiv, and info become arrays with one entry per matrix.

Parameters
[in]mINTEGER The number of rows of each matrix A. M >= 0.
[in]nINTEGER The number of columns of each matrix A. N >= 0.
[in]min_recpnbINTEGER. Internal use. The recursive nb
[in,out]dA_arrayArray of pointers, dimension (batchCount). Each is a REAL array on the GPU, dimension (LDDA,N). On entry, each pointer is an M-by-N matrix to be factored. On exit, the factors L and U from the factorization A = P*L*U; the unit diagonal elements of L are not stored.
[in]lddaINTEGER The leading dimension of each array A. LDDA >= max(1,M).
[out]dipiv_arrayArray of pointers, dimension (batchCount), for corresponding matrices. Each is an INTEGER array, dimension (min(M,N)) The pivot indices; for 1 <= i <= min(M,N), row i of the matrix was interchanged with row IPIV(i).
[out]dpivinfo_arrayArray of pointers, dimension (batchCount), for internal use.
[out]info_arrayArray of INTEGERs, dimension (batchCount), for corresponding matrices.
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value or another error occured, such as memory allocation failed.
  • > 0: if INFO = i, U(i,i) is exactly zero. The factorization has been completed, but the factor U is exactly singular, and division by zero will occur if it is used to solve a system of equations.
[in]gbstepINTEGER internal use.
[in]batchCountINTEGER The number of matrices to operate on.
[in]queuemagma_queue_t Queue to execute in.

◆ magma_zgetrf_batched()

magma_int_t magma_zgetrf_batched ( magma_int_t  m,
magma_int_t  n,
magmaDoubleComplex **  dA_array,
magma_int_t  ldda,
magma_int_t **  ipiv_array,
magma_int_t *  info_array,
magma_int_t  batchCount,
magma_queue_t  queue 
)

ZGETRF computes an LU factorization of a general M-by-N matrix A using partial pivoting with row interchanges.

The factorization has the form A = P * L * U where P is a permutation matrix, L is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U is upper triangular (upper trapezoidal if m < n).

This is the right-looking Level 3 BLAS version of the algorithm.

This is a batched version that factors batchCount M-by-N matrices in parallel. dA, ipiv, and info become arrays with one entry per matrix.

Parameters
[in]mINTEGER The number of rows of each matrix A. M >= 0.
[in]nINTEGER The number of columns of each matrix A. N >= 0.
[in,out]dA_arrayArray of pointers, dimension (batchCount). Each is a COMPLEX_16 array on the GPU, dimension (LDDA,N). On entry, each pointer is an M-by-N matrix to be factored. On exit, the factors L and U from the factorization A = P*L*U; the unit diagonal elements of L are not stored.
[in]lddaINTEGER The leading dimension of each array A. LDDA >= max(1,M).
[out]ipiv_arrayArray of pointers, dimension (batchCount), for corresponding matrices. Each is an INTEGER array, dimension (min(M,N)) The pivot indices; for 1 <= i <= min(M,N), row i of the matrix was interchanged with row IPIV(i).
[out]info_arrayArray of INTEGERs, dimension (batchCount), for corresponding matrices.
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value or another error occured, such as memory allocation failed.
  • > 0: if INFO = i, U(i,i) is exactly zero. The factorization has been completed, but the factor U is exactly singular, and division by zero will occur if it is used to solve a system of equations.
[in]batchCountINTEGER The number of matrices to operate on.
[in]queuemagma_queue_t Queue to execute in.

◆ magma_zgetrf_recpanel_batched()

magma_int_t magma_zgetrf_recpanel_batched ( magma_int_t  m,
magma_int_t  n,
magma_int_t  min_recpnb,
magmaDoubleComplex **  dA_array,
magma_int_t  ai,
magma_int_t  aj,
magma_int_t  ldda,
magma_int_t **  dipiv_array,
magma_int_t **  dpivinfo_array,
magma_int_t *  info_array,
magma_int_t  gbstep,
magma_int_t  batchCount,
magma_queue_t  queue 
)

This is an internal routine that might have many assumption.

Documentation is not fully completed

ZGETRF_PANEL computes an LU factorization of a general M-by-N matrix A using partial pivoting with row interchanges.

The factorization has the form A = P * L * U where P is a permutation matrix, L is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U is upper triangular (upper trapezoidal if m < n).

This is the right-looking Level 3 BLAS version of the algorithm.

This is a batched version that factors batchCount M-by-N matrices in parallel. dA, ipiv, and info become arrays with one entry per matrix.

Parameters
[in]mINTEGER The number of rows of each matrix A. M >= 0.
[in]nINTEGER The number of columns of each matrix A. N >= 0.
[in]min_recpnbINTEGER. Internal use. The recursive nb
[in,out]dA_arrayArray of pointers, dimension (batchCount). Each is a COMPLEX_16 array on the GPU, dimension (LDDA,N). On entry, each pointer is an M-by-N matrix to be factored. On exit, the factors L and U from the factorization A = P*L*U; the unit diagonal elements of L are not stored.
[in]lddaINTEGER The leading dimension of each array A. LDDA >= max(1,M).
[out]dipiv_arrayArray of pointers, dimension (batchCount), for corresponding matrices. Each is an INTEGER array, dimension (min(M,N)) The pivot indices; for 1 <= i <= min(M,N), row i of the matrix was interchanged with row IPIV(i).
[out]dpivinfo_arrayArray of pointers, dimension (batchCount), for internal use.
[out]info_arrayArray of INTEGERs, dimension (batchCount), for corresponding matrices.
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value or another error occured, such as memory allocation failed.
  • > 0: if INFO = i, U(i,i) is exactly zero. The factorization has been completed, but the factor U is exactly singular, and division by zero will occur if it is used to solve a system of equations.
[in]gbstepINTEGER internal use.
[in]batchCountINTEGER The number of matrices to operate on.
[in]queuemagma_queue_t Queue to execute in.

◆ magma_cgetf2_nopiv_internal_batched()

magma_int_t magma_cgetf2_nopiv_internal_batched ( magma_int_t  m,
magma_int_t  n,
magmaFloatComplex **  dA_array,
magma_int_t  ai,
magma_int_t  aj,
magma_int_t  ldda,
magma_int_t *  info_array,
magma_int_t  gbstep,
magma_int_t  batchCount,
magma_queue_t  queue 
)

cgetrf_batched_smallsq_noshfl computes the LU factorization of a square N-by-N matrix A using partial pivoting with row interchanges.

This routine can deal only with square matrices of size up to 32

The factorization has the form A = P * L * U where P is a permutation matrix, L is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U is upper triangular (upper trapezoidal if m < n).

This is the right-looking Level 3 BLAS version of the algorithm.

This is a batched version that factors batchCount M-by-N matrices in parallel. dA, ipiv, and info become arrays with one entry per matrix.

Parameters
[in]nINTEGER The size of each matrix A. N >= 0.
[in,out]dA_arrayArray of pointers, dimension (batchCount). Each is a COMPLEX array on the GPU, dimension (LDDA,N). On entry, each pointer is an M-by-N matrix to be factored. On exit, the factors L and U from the factorization A = P*L*U; the unit diagonal elements of L are not stored.
[in]lddaINTEGER The leading dimension of each array A. LDDA >= max(1,M).
[out]info_arrayArray of INTEGERs, dimension (batchCount), for corresponding matrices.
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value or another error occured, such as memory allocation failed.
  • > 0: if INFO = i, U(i,i) is exactly zero. The factorization has been completed, but the factor U is exactly singular, and division by zero will occur if it is used to solve a system of equations.
[in]batchCountINTEGER The number of matrices to operate on.
[in]queuemagma_queue_t Queue to execute in.

◆ magma_cgetrf_batched_smallsq_noshfl()

magma_int_t magma_cgetrf_batched_smallsq_noshfl ( magma_int_t  n,
magmaFloatComplex **  dA_array,
magma_int_t  ldda,
magma_int_t **  ipiv_array,
magma_int_t *  info_array,
magma_int_t  batchCount,
magma_queue_t  queue 
)

cgetrf_batched_smallsq_noshfl computes the LU factorization of a square N-by-N matrix A using partial pivoting with row interchanges.

This routine can deal only with square matrices of size up to 32

The factorization has the form A = P * L * U where P is a permutation matrix, L is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U is upper triangular (upper trapezoidal if m < n).

This is the right-looking Level 3 BLAS version of the algorithm.

This is a batched version that factors batchCount M-by-N matrices in parallel. dA, ipiv, and info become arrays with one entry per matrix.

Parameters
[in]nINTEGER The size of each matrix A. N >= 0.
[in,out]dA_arrayArray of pointers, dimension (batchCount). Each is a COMPLEX array on the GPU, dimension (LDDA,N). On entry, each pointer is an M-by-N matrix to be factored. On exit, the factors L and U from the factorization A = P*L*U; the unit diagonal elements of L are not stored.
[in]lddaINTEGER The leading dimension of each array A. LDDA >= max(1,M).
[out]ipiv_arrayArray of pointers, dimension (batchCount), for corresponding matrices. Each is an INTEGER array, dimension (min(M,N)) The pivot indices; for 1 <= i <= min(M,N), row i of the matrix was interchanged with row IPIV(i).
[out]info_arrayArray of INTEGERs, dimension (batchCount), for corresponding matrices.
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value or another error occured, such as memory allocation failed.
  • > 0: if INFO = i, U(i,i) is exactly zero. The factorization has been completed, but the factor U is exactly singular, and division by zero will occur if it is used to solve a system of equations.
[in]batchCountINTEGER The number of matrices to operate on.
[in]queuemagma_queue_t Queue to execute in.

◆ magma_cgetrf_batched_smallsq_shfl()

magma_int_t magma_cgetrf_batched_smallsq_shfl ( magma_int_t  n,
magmaFloatComplex **  dA_array,
magma_int_t  ldda,
magma_int_t **  ipiv_array,
magma_int_t *  info_array,
magma_int_t  batchCount,
magma_queue_t  queue 
)

cgetrf_batched_smallsq_noshfl computes the LU factorization of a square N-by-N matrix A using partial pivoting with row interchanges.

This routine can deal only with square matrices of size up to 32

The factorization has the form A = P * L * U where P is a permutation matrix, L is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U is upper triangular (upper trapezoidal if m < n).

This is the right-looking Level 3 BLAS version of the algorithm.

This is a batched version that factors batchCount M-by-N matrices in parallel. dA, ipiv, and info become arrays with one entry per matrix.

Parameters
[in]nINTEGER The size of each matrix A. N >= 0.
[in,out]dA_arrayArray of pointers, dimension (batchCount). Each is a COMPLEX array on the GPU, dimension (LDDA,N). On entry, each pointer is an M-by-N matrix to be factored. On exit, the factors L and U from the factorization A = P*L*U; the unit diagonal elements of L are not stored.
[in]lddaINTEGER The leading dimension of each array A. LDDA >= max(1,M).
[out]ipiv_arrayArray of pointers, dimension (batchCount), for corresponding matrices. Each is an INTEGER array, dimension (min(M,N)) The pivot indices; for 1 <= i <= min(M,N), row i of the matrix was interchanged with row IPIV(i).
[out]info_arrayArray of INTEGERs, dimension (batchCount), for corresponding matrices.
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value or another error occured, such as memory allocation failed.
  • > 0: if INFO = i, U(i,i) is exactly zero. The factorization has been completed, but the factor U is exactly singular, and division by zero will occur if it is used to solve a system of equations.
[in]batchCountINTEGER The number of matrices to operate on.
[in]queuemagma_queue_t Queue to execute in.

◆ magma_dgetf2_nopiv_internal_batched()

magma_int_t magma_dgetf2_nopiv_internal_batched ( magma_int_t  m,
magma_int_t  n,
double **  dA_array,
magma_int_t  ai,
magma_int_t  aj,
magma_int_t  ldda,
magma_int_t *  info_array,
magma_int_t  gbstep,
magma_int_t  batchCount,
magma_queue_t  queue 
)

dgetrf_batched_smallsq_noshfl computes the LU factorization of a square N-by-N matrix A using partial pivoting with row interchanges.

This routine can deal only with square matrices of size up to 32

The factorization has the form A = P * L * U where P is a permutation matrix, L is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U is upper triangular (upper trapezoidal if m < n).

This is the right-looking Level 3 BLAS version of the algorithm.

This is a batched version that factors batchCount M-by-N matrices in parallel. dA, ipiv, and info become arrays with one entry per matrix.

Parameters
[in]nINTEGER The size of each matrix A. N >= 0.
[in,out]dA_arrayArray of pointers, dimension (batchCount). Each is a DOUBLE PRECISION array on the GPU, dimension (LDDA,N). On entry, each pointer is an M-by-N matrix to be factored. On exit, the factors L and U from the factorization A = P*L*U; the unit diagonal elements of L are not stored.
[in]lddaINTEGER The leading dimension of each array A. LDDA >= max(1,M).
[out]info_arrayArray of INTEGERs, dimension (batchCount), for corresponding matrices.
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value or another error occured, such as memory allocation failed.
  • > 0: if INFO = i, U(i,i) is exactly zero. The factorization has been completed, but the factor U is exactly singular, and division by zero will occur if it is used to solve a system of equations.
[in]batchCountINTEGER The number of matrices to operate on.
[in]queuemagma_queue_t Queue to execute in.

◆ magma_dgetrf_batched_smallsq_noshfl()

magma_int_t magma_dgetrf_batched_smallsq_noshfl ( magma_int_t  n,
double **  dA_array,
magma_int_t  ldda,
magma_int_t **  ipiv_array,
magma_int_t *  info_array,
magma_int_t  batchCount,
magma_queue_t  queue 
)

dgetrf_batched_smallsq_noshfl computes the LU factorization of a square N-by-N matrix A using partial pivoting with row interchanges.

This routine can deal only with square matrices of size up to 32

The factorization has the form A = P * L * U where P is a permutation matrix, L is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U is upper triangular (upper trapezoidal if m < n).

This is the right-looking Level 3 BLAS version of the algorithm.

This is a batched version that factors batchCount M-by-N matrices in parallel. dA, ipiv, and info become arrays with one entry per matrix.

Parameters
[in]nINTEGER The size of each matrix A. N >= 0.
[in,out]dA_arrayArray of pointers, dimension (batchCount). Each is a DOUBLE PRECISION array on the GPU, dimension (LDDA,N). On entry, each pointer is an M-by-N matrix to be factored. On exit, the factors L and U from the factorization A = P*L*U; the unit diagonal elements of L are not stored.
[in]lddaINTEGER The leading dimension of each array A. LDDA >= max(1,M).
[out]ipiv_arrayArray of pointers, dimension (batchCount), for corresponding matrices. Each is an INTEGER array, dimension (min(M,N)) The pivot indices; for 1 <= i <= min(M,N), row i of the matrix was interchanged with row IPIV(i).
[out]info_arrayArray of INTEGERs, dimension (batchCount), for corresponding matrices.
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value or another error occured, such as memory allocation failed.
  • > 0: if INFO = i, U(i,i) is exactly zero. The factorization has been completed, but the factor U is exactly singular, and division by zero will occur if it is used to solve a system of equations.
[in]batchCountINTEGER The number of matrices to operate on.
[in]queuemagma_queue_t Queue to execute in.

◆ magma_dgetrf_batched_smallsq_shfl()

magma_int_t magma_dgetrf_batched_smallsq_shfl ( magma_int_t  n,
double **  dA_array,
magma_int_t  ldda,
magma_int_t **  ipiv_array,
magma_int_t *  info_array,
magma_int_t  batchCount,
magma_queue_t  queue 
)

dgetrf_batched_smallsq_noshfl computes the LU factorization of a square N-by-N matrix A using partial pivoting with row interchanges.

This routine can deal only with square matrices of size up to 32

The factorization has the form A = P * L * U where P is a permutation matrix, L is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U is upper triangular (upper trapezoidal if m < n).

This is the right-looking Level 3 BLAS version of the algorithm.

This is a batched version that factors batchCount M-by-N matrices in parallel. dA, ipiv, and info become arrays with one entry per matrix.

Parameters
[in]nINTEGER The size of each matrix A. N >= 0.
[in,out]dA_arrayArray of pointers, dimension (batchCount). Each is a DOUBLE PRECISION array on the GPU, dimension (LDDA,N). On entry, each pointer is an M-by-N matrix to be factored. On exit, the factors L and U from the factorization A = P*L*U; the unit diagonal elements of L are not stored.
[in]lddaINTEGER The leading dimension of each array A. LDDA >= max(1,M).
[out]ipiv_arrayArray of pointers, dimension (batchCount), for corresponding matrices. Each is an INTEGER array, dimension (min(M,N)) The pivot indices; for 1 <= i <= min(M,N), row i of the matrix was interchanged with row IPIV(i).
[out]info_arrayArray of INTEGERs, dimension (batchCount), for corresponding matrices.
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value or another error occured, such as memory allocation failed.
  • > 0: if INFO = i, U(i,i) is exactly zero. The factorization has been completed, but the factor U is exactly singular, and division by zero will occur if it is used to solve a system of equations.
[in]batchCountINTEGER The number of matrices to operate on.
[in]queuemagma_queue_t Queue to execute in.

◆ magma_sgetf2_nopiv_internal_batched()

magma_int_t magma_sgetf2_nopiv_internal_batched ( magma_int_t  m,
magma_int_t  n,
float **  dA_array,
magma_int_t  ai,
magma_int_t  aj,
magma_int_t  ldda,
magma_int_t *  info_array,
magma_int_t  gbstep,
magma_int_t  batchCount,
magma_queue_t  queue 
)

sgetrf_batched_smallsq_noshfl computes the LU factorization of a square N-by-N matrix A using partial pivoting with row interchanges.

This routine can deal only with square matrices of size up to 32

The factorization has the form A = P * L * U where P is a permutation matrix, L is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U is upper triangular (upper trapezoidal if m < n).

This is the right-looking Level 3 BLAS version of the algorithm.

This is a batched version that factors batchCount M-by-N matrices in parallel. dA, ipiv, and info become arrays with one entry per matrix.

Parameters
[in]nINTEGER The size of each matrix A. N >= 0.
[in,out]dA_arrayArray of pointers, dimension (batchCount). Each is a REAL array on the GPU, dimension (LDDA,N). On entry, each pointer is an M-by-N matrix to be factored. On exit, the factors L and U from the factorization A = P*L*U; the unit diagonal elements of L are not stored.
[in]lddaINTEGER The leading dimension of each array A. LDDA >= max(1,M).
[out]info_arrayArray of INTEGERs, dimension (batchCount), for corresponding matrices.
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value or another error occured, such as memory allocation failed.
  • > 0: if INFO = i, U(i,i) is exactly zero. The factorization has been completed, but the factor U is exactly singular, and division by zero will occur if it is used to solve a system of equations.
[in]batchCountINTEGER The number of matrices to operate on.
[in]queuemagma_queue_t Queue to execute in.

◆ magma_sgetrf_batched_smallsq_noshfl()

magma_int_t magma_sgetrf_batched_smallsq_noshfl ( magma_int_t  n,
float **  dA_array,
magma_int_t  ldda,
magma_int_t **  ipiv_array,
magma_int_t *  info_array,
magma_int_t  batchCount,
magma_queue_t  queue 
)

sgetrf_batched_smallsq_noshfl computes the LU factorization of a square N-by-N matrix A using partial pivoting with row interchanges.

This routine can deal only with square matrices of size up to 32

The factorization has the form A = P * L * U where P is a permutation matrix, L is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U is upper triangular (upper trapezoidal if m < n).

This is the right-looking Level 3 BLAS version of the algorithm.

This is a batched version that factors batchCount M-by-N matrices in parallel. dA, ipiv, and info become arrays with one entry per matrix.

Parameters
[in]nINTEGER The size of each matrix A. N >= 0.
[in,out]dA_arrayArray of pointers, dimension (batchCount). Each is a REAL array on the GPU, dimension (LDDA,N). On entry, each pointer is an M-by-N matrix to be factored. On exit, the factors L and U from the factorization A = P*L*U; the unit diagonal elements of L are not stored.
[in]lddaINTEGER The leading dimension of each array A. LDDA >= max(1,M).
[out]ipiv_arrayArray of pointers, dimension (batchCount), for corresponding matrices. Each is an INTEGER array, dimension (min(M,N)) The pivot indices; for 1 <= i <= min(M,N), row i of the matrix was interchanged with row IPIV(i).
[out]info_arrayArray of INTEGERs, dimension (batchCount), for corresponding matrices.
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value or another error occured, such as memory allocation failed.
  • > 0: if INFO = i, U(i,i) is exactly zero. The factorization has been completed, but the factor U is exactly singular, and division by zero will occur if it is used to solve a system of equations.
[in]batchCountINTEGER The number of matrices to operate on.
[in]queuemagma_queue_t Queue to execute in.

◆ magma_sgetrf_batched_smallsq_shfl()

magma_int_t magma_sgetrf_batched_smallsq_shfl ( magma_int_t  n,
float **  dA_array,
magma_int_t  ldda,
magma_int_t **  ipiv_array,
magma_int_t *  info_array,
magma_int_t  batchCount,
magma_queue_t  queue 
)

sgetrf_batched_smallsq_noshfl computes the LU factorization of a square N-by-N matrix A using partial pivoting with row interchanges.

This routine can deal only with square matrices of size up to 32

The factorization has the form A = P * L * U where P is a permutation matrix, L is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U is upper triangular (upper trapezoidal if m < n).

This is the right-looking Level 3 BLAS version of the algorithm.

This is a batched version that factors batchCount M-by-N matrices in parallel. dA, ipiv, and info become arrays with one entry per matrix.

Parameters
[in]nINTEGER The size of each matrix A. N >= 0.
[in,out]dA_arrayArray of pointers, dimension (batchCount). Each is a REAL array on the GPU, dimension (LDDA,N). On entry, each pointer is an M-by-N matrix to be factored. On exit, the factors L and U from the factorization A = P*L*U; the unit diagonal elements of L are not stored.
[in]lddaINTEGER The leading dimension of each array A. LDDA >= max(1,M).
[out]ipiv_arrayArray of pointers, dimension (batchCount), for corresponding matrices. Each is an INTEGER array, dimension (min(M,N)) The pivot indices; for 1 <= i <= min(M,N), row i of the matrix was interchanged with row IPIV(i).
[out]info_arrayArray of INTEGERs, dimension (batchCount), for corresponding matrices.
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value or another error occured, such as memory allocation failed.
  • > 0: if INFO = i, U(i,i) is exactly zero. The factorization has been completed, but the factor U is exactly singular, and division by zero will occur if it is used to solve a system of equations.
[in]batchCountINTEGER The number of matrices to operate on.
[in]queuemagma_queue_t Queue to execute in.

◆ magma_zgetf2_nopiv_internal_batched()

magma_int_t magma_zgetf2_nopiv_internal_batched ( magma_int_t  m,
magma_int_t  n,
magmaDoubleComplex **  dA_array,
magma_int_t  ai,
magma_int_t  aj,
magma_int_t  ldda,
magma_int_t *  info_array,
magma_int_t  gbstep,
magma_int_t  batchCount,
magma_queue_t  queue 
)

zgetrf_batched_smallsq_noshfl computes the LU factorization of a square N-by-N matrix A using partial pivoting with row interchanges.

This routine can deal only with square matrices of size up to 32

The factorization has the form A = P * L * U where P is a permutation matrix, L is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U is upper triangular (upper trapezoidal if m < n).

This is the right-looking Level 3 BLAS version of the algorithm.

This is a batched version that factors batchCount M-by-N matrices in parallel. dA, ipiv, and info become arrays with one entry per matrix.

Parameters
[in]nINTEGER The size of each matrix A. N >= 0.
[in,out]dA_arrayArray of pointers, dimension (batchCount). Each is a COMPLEX_16 array on the GPU, dimension (LDDA,N). On entry, each pointer is an M-by-N matrix to be factored. On exit, the factors L and U from the factorization A = P*L*U; the unit diagonal elements of L are not stored.
[in]lddaINTEGER The leading dimension of each array A. LDDA >= max(1,M).
[out]info_arrayArray of INTEGERs, dimension (batchCount), for corresponding matrices.
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value or another error occured, such as memory allocation failed.
  • > 0: if INFO = i, U(i,i) is exactly zero. The factorization has been completed, but the factor U is exactly singular, and division by zero will occur if it is used to solve a system of equations.
[in]batchCountINTEGER The number of matrices to operate on.
[in]queuemagma_queue_t Queue to execute in.

◆ magma_zgetrf_batched_smallsq_noshfl()

magma_int_t magma_zgetrf_batched_smallsq_noshfl ( magma_int_t  n,
magmaDoubleComplex **  dA_array,
magma_int_t  ldda,
magma_int_t **  ipiv_array,
magma_int_t *  info_array,
magma_int_t  batchCount,
magma_queue_t  queue 
)

zgetrf_batched_smallsq_noshfl computes the LU factorization of a square N-by-N matrix A using partial pivoting with row interchanges.

This routine can deal only with square matrices of size up to 32

The factorization has the form A = P * L * U where P is a permutation matrix, L is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U is upper triangular (upper trapezoidal if m < n).

This is the right-looking Level 3 BLAS version of the algorithm.

This is a batched version that factors batchCount M-by-N matrices in parallel. dA, ipiv, and info become arrays with one entry per matrix.

Parameters
[in]nINTEGER The size of each matrix A. N >= 0.
[in,out]dA_arrayArray of pointers, dimension (batchCount). Each is a COMPLEX_16 array on the GPU, dimension (LDDA,N). On entry, each pointer is an M-by-N matrix to be factored. On exit, the factors L and U from the factorization A = P*L*U; the unit diagonal elements of L are not stored.
[in]lddaINTEGER The leading dimension of each array A. LDDA >= max(1,M).
[out]ipiv_arrayArray of pointers, dimension (batchCount), for corresponding matrices. Each is an INTEGER array, dimension (min(M,N)) The pivot indices; for 1 <= i <= min(M,N), row i of the matrix was interchanged with row IPIV(i).
[out]info_arrayArray of INTEGERs, dimension (batchCount), for corresponding matrices.
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value or another error occured, such as memory allocation failed.
  • > 0: if INFO = i, U(i,i) is exactly zero. The factorization has been completed, but the factor U is exactly singular, and division by zero will occur if it is used to solve a system of equations.
[in]batchCountINTEGER The number of matrices to operate on.
[in]queuemagma_queue_t Queue to execute in.

◆ magma_zgetrf_batched_smallsq_shfl()

magma_int_t magma_zgetrf_batched_smallsq_shfl ( magma_int_t  n,
magmaDoubleComplex **  dA_array,
magma_int_t  ldda,
magma_int_t **  ipiv_array,
magma_int_t *  info_array,
magma_int_t  batchCount,
magma_queue_t  queue 
)

zgetrf_batched_smallsq_noshfl computes the LU factorization of a square N-by-N matrix A using partial pivoting with row interchanges.

This routine can deal only with square matrices of size up to 32

The factorization has the form A = P * L * U where P is a permutation matrix, L is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U is upper triangular (upper trapezoidal if m < n).

This is the right-looking Level 3 BLAS version of the algorithm.

This is a batched version that factors batchCount M-by-N matrices in parallel. dA, ipiv, and info become arrays with one entry per matrix.

Parameters
[in]nINTEGER The size of each matrix A. N >= 0.
[in,out]dA_arrayArray of pointers, dimension (batchCount). Each is a COMPLEX_16 array on the GPU, dimension (LDDA,N). On entry, each pointer is an M-by-N matrix to be factored. On exit, the factors L and U from the factorization A = P*L*U; the unit diagonal elements of L are not stored.
[in]lddaINTEGER The leading dimension of each array A. LDDA >= max(1,M).
[out]ipiv_arrayArray of pointers, dimension (batchCount), for corresponding matrices. Each is an INTEGER array, dimension (min(M,N)) The pivot indices; for 1 <= i <= min(M,N), row i of the matrix was interchanged with row IPIV(i).
[out]info_arrayArray of INTEGERs, dimension (batchCount), for corresponding matrices.
  • = 0: successful exit
  • < 0: if INFO = -i, the i-th argument had an illegal value or another error occured, such as memory allocation failed.
  • > 0: if INFO = i, U(i,i) is exactly zero. The factorization has been completed, but the factor U is exactly singular, and division by zero will occur if it is used to solve a system of equations.
[in]batchCountINTEGER The number of matrices to operate on.
[in]queuemagma_queue_t Queue to execute in.