Kokkos Node API and Local Linear Algebra Kernels Version of the Day
TSQR::KokkosNodeTsqr< LocalOrdinal, Scalar, NodeType > Class Template Reference

Parallel intranode TSQR implemented using the Kokkos Node API. More...

#include <Tsqr_KokkosNodeTsqr.hpp>


Public Types

typedef NodeTsqr< LocalOrdinal, Scalar, KokkosNodeTsqrFactorOutput< LocalOrdinal, Scalar > >::factor_output_type FactorOutput
 Part of the implicit Q representation returned by factor().

Public Member Functions

 KokkosNodeTsqr (const Teuchos::RCP< const node_type > &node, const Teuchos::RCP< Teuchos::ParameterList > &params)
 Constructor.
std::string description () const
 One-line description of this object.
void setParameterList (const Teuchos::RCP< Teuchos::ParameterList > &paramList)
 Validate and read in parameters.
Teuchos::RCP< const Teuchos::ParameterList > getValidParameters () const
 Default valid parameter list.
FactorOutput factor (const LocalOrdinal numRows, const LocalOrdinal numCols, Scalar A[], const LocalOrdinal lda, Scalar R[], const LocalOrdinal ldr, const bool contiguousCacheBlocks) const
 Factor the matrix A (see NodeTsqr documentation).
void apply (const ApplyType &applyType, const LocalOrdinal nrows, const LocalOrdinal ncols_Q, const Scalar Q[], const LocalOrdinal ldq, const FactorOutput &factorOutput, const LocalOrdinal ncols_C, Scalar C[], const LocalOrdinal ldc, const bool contiguousCacheBlocks) const
 Apply the Q factor to C (see NodeTsqr documentation).
void explicit_Q (const LocalOrdinal nrows, const LocalOrdinal ncols_Q, const Scalar Q[], const LocalOrdinal ldq, const FactorOutput &factorOutput, const LocalOrdinal ncols_C, Scalar C[], const LocalOrdinal ldc, const bool contiguousCacheBlocks) const
 Compute the explicit Q factor (see NodeTsqr documentation).
bool QR_produces_R_factor_with_nonnegative_diagonal () const
 Whether the R factor always has a nonnegative diagonal.
size_t TEUCHOS_DEPRECATED cache_block_size () const
 Cache size hint in bytes.
size_t cache_size_hint () const
 Cache size hint in bytes.
void fill_with_zeros (const LocalOrdinal nrows, const LocalOrdinal ncols, Scalar A[], const LocalOrdinal lda, const bool contiguousCacheBlocks) const
 Fill A with zeros (see NodeTsqr documentation).
void cache_block (const LocalOrdinal nrows, const LocalOrdinal ncols, Scalar A_out[], const Scalar A_in[], const LocalOrdinal lda_in) const
 Cache block A (see NodeTsqr documentation).
void un_cache_block (const LocalOrdinal nrows, const LocalOrdinal ncols, Scalar A_out[], const LocalOrdinal lda_out, const Scalar A_in[]) const
 Un-cache-block A (see NodeTsqr documentation).
void Q_times_B (const LocalOrdinal nrows, const LocalOrdinal ncols, Scalar Q[], const LocalOrdinal ldq, const Scalar B[], const LocalOrdinal ldb, const bool contiguousCacheBlocks) const
 Compute Q := Q*B in place (see NodeTsqr documentation).
virtual void apply (const ApplyType &applyType, const LocalOrdinal nrows, const LocalOrdinal ncols_Q, const Scalar Q[], const LocalOrdinal ldq, const KokkosNodeTsqrFactorOutput< LocalOrdinal, Scalar > &factorOutput, const LocalOrdinal ncols_C, Scalar C[], const LocalOrdinal ldc, const bool contiguousCacheBlocks) const =0
 Apply the implicit Q factor from factor() to C.
virtual void explicit_Q (const LocalOrdinal nrows, const LocalOrdinal ncols_Q, const Scalar Q[], const LocalOrdinal ldq, const factor_output_type &factorOutput, const LocalOrdinal ncols_C, Scalar C[], const LocalOrdinal ldc, const bool contiguousCacheBlocks) const =0
 Compute the explicit Q factor from the result of factor().
MatrixViewType top_block (const MatrixViewType &C, const bool contiguous_cache_blocks) const
 Return view of topmost cache block of C.
LocalOrdinal reveal_R_rank (const LocalOrdinal ncols, Scalar R[], const LocalOrdinal ldr, Scalar U[], const LocalOrdinal ldu, const typename Teuchos::ScalarTraits< Scalar >::magnitudeType tol) const
 Reveal rank of TSQR's R factor.
LocalOrdinal reveal_rank (const LocalOrdinal nrows, const LocalOrdinal ncols, Scalar Q[], const LocalOrdinal ldq, Scalar R[], const LocalOrdinal ldr, const typename Teuchos::ScalarTraits< Scalar >::magnitudeType tol, const bool contiguousCacheBlocks) const
 Compute rank-revealing decomposition.

Protected Member Functions

ConstMatView< LocalOrdinal, Scalar > const_top_block (const ConstMatView< LocalOrdinal, Scalar > &C, const bool contiguous_cache_blocks) const
 Return the topmost cache block of the matrix C.

Detailed Description

template<class LocalOrdinal, class Scalar, class NodeType>
class TSQR::KokkosNodeTsqr< LocalOrdinal, Scalar, NodeType >

Parallel intranode TSQR implemented using the Kokkos Node API.

Author:
Mark Hoemmen

This implementation of the intranode part of TSQR factors the matrix in two passes. The first pass parallelizes over partitions, doing Sequential TSQR over each partition. The second pass combines the R factors from the partitions, and is not currently parallel. Thus, the overall algorithm is similar, though not identical, to that of TbbTsqr.
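
The two-pass structure described above can be illustrated with a small standalone sketch. It is not the library's implementation: the `Matrix` type, `r_factor()`, and `two_pass_r_factor()` are hypothetical names, the per-partition factorization uses modified Gram-Schmidt as a simple stand-in for Sequential TSQR, the partitions are processed in an ordinary loop rather than in parallel, and every partition is assumed to have full column rank and at least as many rows as columns.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical dense column-major matrix; entry (i, j) is data[i + j * nrows].
struct Matrix {
  std::size_t nrows, ncols;
  std::vector<double> data;
  Matrix(std::size_t m, std::size_t n) : nrows(m), ncols(n), data(m * n, 0.0) {}
  double& operator()(std::size_t i, std::size_t j) { return data[i + j * nrows]; }
  double operator()(std::size_t i, std::size_t j) const { return data[i + j * nrows]; }
};

// Upper triangular R factor of A via modified Gram-Schmidt.
// Stand-in for the per-partition factorization; assumes full column rank.
Matrix r_factor(Matrix A) {
  const std::size_t m = A.nrows, n = A.ncols;
  assert(m >= n);
  Matrix R(n, n);
  for (std::size_t j = 0; j < n; ++j) {
    double norm = 0.0;
    for (std::size_t i = 0; i < m; ++i) norm += A(i, j) * A(i, j);
    norm = std::sqrt(norm);
    R(j, j) = norm;
    for (std::size_t i = 0; i < m; ++i) A(i, j) /= norm;
    for (std::size_t k = j + 1; k < n; ++k) {
      double dot = 0.0;
      for (std::size_t i = 0; i < m; ++i) dot += A(i, j) * A(i, k);
      R(j, k) = dot;
      for (std::size_t i = 0; i < m; ++i) A(i, k) -= dot * A(i, j);
    }
  }
  return R;
}

// Two-pass R factor: factor each row partition, then factor the stacked R factors.
Matrix two_pass_r_factor(const Matrix& A, std::size_t numPartitions) {
  const std::size_t n = A.ncols;
  assert(numPartitions >= 1 && A.nrows >= numPartitions * n);
  Matrix stacked(numPartitions * n, n);
  const std::size_t base = A.nrows / numPartitions;
  const std::size_t extra = A.nrows % numPartitions;
  std::size_t start = 0;
  // First pass: the partitions are independent, so this loop could run in parallel.
  for (std::size_t p = 0; p < numPartitions; ++p) {
    const std::size_t len = base + (p < extra ? 1 : 0);
    Matrix block(len, n);
    for (std::size_t j = 0; j < n; ++j)
      for (std::size_t i = 0; i < len; ++i) block(i, j) = A(start + i, j);
    const Matrix Rp = r_factor(block);
    for (std::size_t j = 0; j < n; ++j)
      for (std::size_t i = 0; i < n; ++i) stacked(p * n + i, j) = Rp(i, j);
    start += len;
  }
  // Second pass: combine the per-partition R factors sequentially.
  return r_factor(stacked);
}
```

For a full-rank input, the R factor with positive diagonal is unique, so `two_pass_r_factor(A, p)` should agree with `r_factor(A)` up to rounding error for any valid partition count `p`.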

Definition at line 1176 of file Tsqr_KokkosNodeTsqr.hpp.


Member Typedef Documentation

template<class LocalOrdinal , class Scalar , class NodeType >
TSQR::KokkosNodeTsqr< LocalOrdinal, Scalar, NodeType >::FactorOutput

Part of the implicit Q representation returned by factor().

Definition at line 1187 of file Tsqr_KokkosNodeTsqr.hpp.


Constructor & Destructor Documentation

template<class LocalOrdinal , class Scalar , class NodeType >
TSQR::KokkosNodeTsqr< LocalOrdinal, Scalar, NodeType >::KokkosNodeTsqr ( const Teuchos::RCP< const node_type > &  node,
const Teuchos::RCP< Teuchos::ParameterList > &  params 
) [inline]

Constructor.

Parameters:
node[in] Pointer to a Kokkos Node instance.
params[in/out] List of parameters. Missing parameters will be filled in with default values.

Definition at line 1238 of file Tsqr_KokkosNodeTsqr.hpp.


Member Function Documentation

template<class LocalOrdinal , class Scalar , class NodeType >
std::string TSQR::KokkosNodeTsqr< LocalOrdinal, Scalar, NodeType >::description ( ) const [inline, virtual]

One-line description of this object.

This implements Teuchos::Describable::description().

Reimplemented from TSQR::NodeTsqr< LocalOrdinal, Scalar, KokkosNodeTsqrFactorOutput< LocalOrdinal, Scalar > >.

Definition at line 1248 of file Tsqr_KokkosNodeTsqr.hpp.

template<class LocalOrdinal , class Scalar , class NodeType >
void TSQR::KokkosNodeTsqr< LocalOrdinal, Scalar, NodeType >::setParameterList ( const Teuchos::RCP< Teuchos::ParameterList > &  paramList) [inline, virtual]

Validate and read in parameters.

Parameters:
paramList[in/out] On input: non-null parameter list containing zero or more of the parameters in getValidParameters(). On output: missing parameters (i.e., parameters in getValidParameters() but not in the input list) are filled in with default values.
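
The in/out behavior of paramList can be pictured with a standalone analogy that uses `std::map` instead of Teuchos::ParameterList. Everything here is hypothetical: the parameter names and default values are placeholders rather than the list returned by getValidParameters(), and rejecting unknown names is only one plausible reading of "validate".

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>

using ParamMap = std::map<std::string, std::size_t>;

// Hypothetical valid parameters with placeholder defaults.
ParamMap valid_parameters() {
  return ParamMap{{"Cache Size Hint", 0}, {"Num Tasks", 1}};
}

// Validate the input names, then fill in defaults for anything missing.
void fill_in_defaults(ParamMap& params) {
  const ParamMap valid = valid_parameters();
  for (const auto& entry : params) {
    if (valid.count(entry.first) == 0) {
      throw std::invalid_argument("unknown parameter: " + entry.first);
    }
  }
  for (const auto& entry : valid) {
    params.insert(entry);  // insert() leaves user-supplied values untouched
  }
}
```

After the call, the caller's map holds every valid parameter: the values the caller set are preserved, and the rest carry defaults.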

Implements Teuchos::ParameterListAcceptor.

Definition at line 1271 of file Tsqr_KokkosNodeTsqr.hpp.

template<class LocalOrdinal , class Scalar , class NodeType >
Teuchos::RCP<const Teuchos::ParameterList> TSQR::KokkosNodeTsqr< LocalOrdinal, Scalar, NodeType >::getValidParameters ( ) const [inline, virtual]

Default valid parameter list.

The returned list contains all parameters accepted by KokkosNodeTsqr, with their default values and documentation. This method is reentrant and should be thread safe.

Note:
This method creates a new parameter list each time it is called. This saves storage for the common case of setParameterList() being called infrequently (since setParameterList() calls this method once). If you find yourself calling setParameterList() often, you might want to change the implementation of getValidParameters() to store the valid parameter list as member data. Calling setParameterList() often would be unusual for a class like this one, whose configuration options are parameters related to hardware that are unlikely to change at run time.

Reimplemented from Teuchos::ParameterListAcceptor.

Definition at line 1331 of file Tsqr_KokkosNodeTsqr.hpp.

template<class LocalOrdinal , class Scalar , class NodeType >
FactorOutput TSQR::KokkosNodeTsqr< LocalOrdinal, Scalar, NodeType >::factor ( const LocalOrdinal  numRows,
const LocalOrdinal  numCols,
Scalar  A[],
const LocalOrdinal  lda,
Scalar  R[],
const LocalOrdinal  ldr,
const bool  contiguousCacheBlocks 
) const [inline, virtual]

Factor the matrix A (see NodeTsqr documentation).

Implements TSQR::NodeTsqr< LocalOrdinal, Scalar, KokkosNodeTsqrFactorOutput< LocalOrdinal, Scalar > >.

Definition at line 1367 of file Tsqr_KokkosNodeTsqr.hpp.

template<class LocalOrdinal , class Scalar , class NodeType >
void TSQR::KokkosNodeTsqr< LocalOrdinal, Scalar, NodeType >::apply ( const ApplyType &  applyType,
const LocalOrdinal  nrows,
const LocalOrdinal  ncols_Q,
const Scalar  Q[],
const LocalOrdinal  ldq,
const FactorOutput &  factorOutput,
const LocalOrdinal  ncols_C,
Scalar  C[],
const LocalOrdinal  ldc,
const bool  contiguousCacheBlocks 
) const [inline]

Apply the Q factor to C (see NodeTsqr documentation).

Definition at line 1383 of file Tsqr_KokkosNodeTsqr.hpp.

template<class LocalOrdinal , class Scalar , class NodeType >
void TSQR::KokkosNodeTsqr< LocalOrdinal, Scalar, NodeType >::explicit_Q ( const LocalOrdinal  nrows,
const LocalOrdinal  ncols_Q,
const Scalar  Q[],
const LocalOrdinal  ldq,
const FactorOutput &  factorOutput,
const LocalOrdinal  ncols_C,
Scalar  C[],
const LocalOrdinal  ldc,
const bool  contiguousCacheBlocks 
) const [inline]

Compute the explicit Q factor (see NodeTsqr documentation).

Definition at line 1405 of file Tsqr_KokkosNodeTsqr.hpp.

template<class LocalOrdinal , class Scalar , class NodeType >
bool TSQR::KokkosNodeTsqr< LocalOrdinal, Scalar, NodeType >::QR_produces_R_factor_with_nonnegative_diagonal ( ) const [inline, virtual]

Whether the R factor always has a nonnegative diagonal.

See the NodeTsqr documentation.

Implements TSQR::NodeTsqr< LocalOrdinal, Scalar, KokkosNodeTsqrFactorOutput< LocalOrdinal, Scalar > >.

Definition at line 1427 of file Tsqr_KokkosNodeTsqr.hpp.

template<class LocalOrdinal , class Scalar , class NodeType >
size_t TEUCHOS_DEPRECATED TSQR::KokkosNodeTsqr< LocalOrdinal, Scalar, NodeType >::cache_block_size ( ) const [inline, virtual]

Cache size hint in bytes.

This method is deprecated, because the name is misleading (the return value is not the size of one cache block, even though it is used to pick the "typical" cache block size). Please use cache_size_hint() instead (which returns the same value).

See the NodeTsqr documentation for details.

Implements TSQR::NodeTsqr< LocalOrdinal, Scalar, KokkosNodeTsqrFactorOutput< LocalOrdinal, Scalar > >.

Definition at line 1440 of file Tsqr_KokkosNodeTsqr.hpp.

template<class LocalOrdinal , class Scalar , class NodeType >
size_t TSQR::KokkosNodeTsqr< LocalOrdinal, Scalar, NodeType >::cache_size_hint ( ) const [inline, virtual]

Cache size hint in bytes.

See the NodeTsqr documentation for details.

Implements TSQR::NodeTsqr< LocalOrdinal, Scalar, KokkosNodeTsqrFactorOutput< LocalOrdinal, Scalar > >.

Definition at line 1447 of file Tsqr_KokkosNodeTsqr.hpp.

template<class LocalOrdinal , class Scalar , class NodeType >
void TSQR::KokkosNodeTsqr< LocalOrdinal, Scalar, NodeType >::fill_with_zeros ( const LocalOrdinal  nrows,
const LocalOrdinal  ncols,
Scalar  A[],
const LocalOrdinal  lda,
const bool  contiguousCacheBlocks 
) const [inline, virtual]

Fill A with zeros (see NodeTsqr documentation).

Implements TSQR::NodeTsqr< LocalOrdinal, Scalar, KokkosNodeTsqrFactorOutput< LocalOrdinal, Scalar > >.

Definition at line 1453 of file Tsqr_KokkosNodeTsqr.hpp.

template<class LocalOrdinal , class Scalar , class NodeType >
void TSQR::KokkosNodeTsqr< LocalOrdinal, Scalar, NodeType >::cache_block ( const LocalOrdinal  nrows,
const LocalOrdinal  ncols,
Scalar  A_out[],
const Scalar  A_in[],
const LocalOrdinal  lda_in 
) const [inline, virtual]

Cache block A (see NodeTsqr documentation).

template<class LocalOrdinal , class Scalar , class NodeType >
void TSQR::KokkosNodeTsqr< LocalOrdinal, Scalar, NodeType >::un_cache_block ( const LocalOrdinal  nrows,
const LocalOrdinal  ncols,
Scalar  A_out[],
const LocalOrdinal  lda_out,
const Scalar  A_in[] 
) const [inline, virtual]

Un-cache-block A (see NodeTsqr documentation).

Implements TSQR::NodeTsqr< LocalOrdinal, Scalar, KokkosNodeTsqrFactorOutput< LocalOrdinal, Scalar > >.

Definition at line 1494 of file Tsqr_KokkosNodeTsqr.hpp.

template<class LocalOrdinal , class Scalar , class NodeType >
void TSQR::KokkosNodeTsqr< LocalOrdinal, Scalar, NodeType >::Q_times_B ( const LocalOrdinal  nrows,
const LocalOrdinal  ncols,
Scalar  Q[],
const LocalOrdinal  ldq,
const Scalar  B[],
const LocalOrdinal  ldb,
const bool  contiguousCacheBlocks 
) const [inline, virtual]

Compute Q := Q*B in place (see NodeTsqr documentation).

Implements TSQR::NodeTsqr< LocalOrdinal, Scalar, KokkosNodeTsqrFactorOutput< LocalOrdinal, Scalar > >.

Definition at line 1517 of file Tsqr_KokkosNodeTsqr.hpp.

template<class LocalOrdinal , class Scalar , class NodeType >
ConstMatView<LocalOrdinal, Scalar> TSQR::KokkosNodeTsqr< LocalOrdinal, Scalar, NodeType >::const_top_block ( const ConstMatView< LocalOrdinal, Scalar > &  C,
const bool  contiguous_cache_blocks 
) const [inline, protected, virtual]

Return the topmost cache block of the matrix C.

NodeTsqr's top_block() method must be implemented using its subclasses' const_top_block() method. This is because top_block() is a template method, and template methods cannot be virtual.
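
The pattern described above (a non-virtual template method that forwards to a virtual hook) can be sketched as follows. This is an illustration, not the NodeTsqr source: `SketchConstView`, `SketchView`, `SketchNodeTsqr`, and `FixedRowsTsqr` are hypothetical names, the contiguous_cache_blocks argument is omitted for brevity, and the use of `const_cast` to rebuild the caller's view type is an assumption about how such a forwarding method might be written.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical read-only view: (nrows, ncols, pointer, lda).
class SketchConstView {
 public:
  SketchConstView(std::size_t nrows, std::size_t ncols, const double* A, std::size_t lda)
      : nrows_(nrows), ncols_(ncols), A_(A), lda_(lda) {}
  std::size_t nrows() const { return nrows_; }
  std::size_t ncols() const { return ncols_; }
  const double* get() const { return A_; }
  std::size_t lda() const { return lda_; }
 private:
  std::size_t nrows_, ncols_;
  const double* A_;
  std::size_t lda_;
};

// Hypothetical mutable view with the same interface.
class SketchView {
 public:
  SketchView(std::size_t nrows, std::size_t ncols, double* A, std::size_t lda)
      : nrows_(nrows), ncols_(ncols), A_(A), lda_(lda) {}
  std::size_t nrows() const { return nrows_; }
  std::size_t ncols() const { return ncols_; }
  double* get() const { return A_; }
  std::size_t lda() const { return lda_; }
 private:
  std::size_t nrows_, ncols_;
  double* A_;
  std::size_t lda_;
};

class SketchNodeTsqr {
 public:
  virtual ~SketchNodeTsqr() = default;

  // Template method: cannot be virtual, so it delegates to const_top_block().
  template <class MatrixViewType>
  MatrixViewType top_block(const MatrixViewType& C) const {
    const SketchConstView in(C.nrows(), C.ncols(), C.get(), C.lda());
    const SketchConstView top = const_top_block(in);
    using pointer = decltype(C.get());
    // Restore the caller's pointer type (mutable or const).
    return MatrixViewType(top.nrows(), top.ncols(),
                          const_cast<pointer>(top.get()), top.lda());
  }

 protected:
  // Virtual hook that each subclass implements once, for the const view only.
  virtual SketchConstView const_top_block(const SketchConstView& C) const = 0;
};

// Hypothetical subclass whose top block has a fixed number of rows,
// but never fewer rows than columns and never more rows than C has.
class FixedRowsTsqr : public SketchNodeTsqr {
 public:
  explicit FixedRowsTsqr(std::size_t blockRows) : blockRows_(blockRows) {}
 protected:
  SketchConstView const_top_block(const SketchConstView& C) const override {
    const std::size_t rows = std::min(C.nrows(), std::max(blockRows_, C.ncols()));
    return SketchConstView(rows, C.ncols(), C.get(), C.lda());
  }
 private:
  std::size_t blockRows_;
};
```

The design choice is that subclasses override only one virtual function, while callers can still pass either view type to `top_block()` and get the same view type back.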

Parameters:
C[in] View of a matrix, with at least as many rows as columns.
contiguous_cache_blocks[in] Whether the cache blocks of C are stored contiguously.
Returns:
View of the topmost cache block of the matrix C.

Implements TSQR::NodeTsqr< LocalOrdinal, Scalar, KokkosNodeTsqrFactorOutput< LocalOrdinal, Scalar > >.

Definition at line 1848 of file Tsqr_KokkosNodeTsqr.hpp.

virtual void TSQR::details::ApplyFirstPass< LocalOrdinal, Scalar >::apply ( const ApplyType &  applyType,
const LocalOrdinal  nrows,
const LocalOrdinal  ncols_Q,
const Scalar  Q[],
const LocalOrdinal  ldq,
const KokkosNodeTsqrFactorOutput< LocalOrdinal, Scalar > &  factorOutput,
const LocalOrdinal  ncols_C,
Scalar  C[],
const LocalOrdinal  ldc,
const bool  contiguousCacheBlocks 
) const [pure virtual, inherited]

Apply the implicit Q factor from factor() to C.

Parameters:
applyType[in] Whether to apply Q, Q^T, or Q^H to C.
nrows[in] Number of rows in Q and C.
ncols_Q[in] Number of columns in Q.
Q[in] Part of the implicit representation of the Q factor; the A matrix output of factor(). See the factor() documentation for details.
ldq[in] Leading dimension (a.k.a. stride) of Q, if Q is stored in column-major order (not contiguously cache blocked).
factorOutput[in] Return value of factor(), corresponding to Q.
ncols_C[in] Number of columns in the matrix C. This may be different than the number of columns in Q. There is no restriction on this value, but we optimize performance for the case ncols_C == ncols_Q.
C[in/out] On input: Matrix to which to apply the Q factor. On output: Result of applying the Q factor (or Q^T, or Q^H, depending on applyType) to C.
ldc[in] Leading dimension (a.k.a. stride) of C, if C is stored in column-major order (not contiguously cache blocked).
contiguousCacheBlocks[in] Whether the cache blocks of Q and C are stored contiguously. If you don't know what this means, put "false" here.
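
The role of the leading dimension arguments (ldq, ldc) in column-major storage can be shown with a minimal sketch. The helper names `entry()` and `scale_in_place()` are hypothetical and are not part of TSQR. The sketch assumes the non-cache-blocked layout, in which consecutive columns start a fixed stride apart.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Entry (i, j) of a column-major matrix with leading dimension ld.
// Consecutive columns start ld elements apart, so ld must be >= the row count.
inline double& entry(double A[], std::size_t ld, std::size_t i, std::size_t j) {
  return A[i + j * ld];
}

// Scale the nrows x ncols matrix C in place; rows nrows..ldc-1 of each
// column are padding and are left untouched.
void scale_in_place(std::size_t nrows, std::size_t ncols, double C[],
                    std::size_t ldc, double alpha) {
  assert(ldc >= nrows);
  for (std::size_t j = 0; j < ncols; ++j) {
    for (std::size_t i = 0; i < nrows; ++i) {
      entry(C, ldc, i, j) *= alpha;
    }
  }
}
```

With a buffer of 5 rows per column, a call such as `scale_in_place(3, 2, buf.data(), 5, 2.0)` touches only the top 3 rows of each of the 2 columns.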

Definition at line 514 of file TbbTsqr_TbbRecursiveTsqr_Def.hpp.

void TSQR::TBB::TbbRecursiveTsqr< LocalOrdinal, Scalar >::explicit_Q ( const LocalOrdinal  nrows,
const LocalOrdinal  ncols_Q,
const Scalar  Q[],
const LocalOrdinal  ldq,
const factor_output_type &  factorOutput,
const LocalOrdinal  ncols_C,
Scalar  C[],
const LocalOrdinal  ldc,
const bool  contiguousCacheBlocks 
) const [pure virtual, inherited]

Compute the explicit Q factor from the result of factor().

This is equivalent to calling apply() on the first ncols_C columns of the identity matrix (suitably cache-blocked, if applicable).
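
As a sketch of the first half of that equivalence, the following helper overwrites C with the leading columns of the identity matrix in plain column-major storage. The name `fill_with_identity_columns()` is hypothetical, and the sketch assumes the cache blocks are not stored contiguously. A caller would then pass the result to apply() to obtain the explicit Q factor.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Overwrite the nrows x ncols_C column-major matrix C (leading dimension ldc)
// with the first ncols_C columns of the nrows x nrows identity matrix.
void fill_with_identity_columns(std::size_t nrows, std::size_t ncols_C,
                                double C[], std::size_t ldc) {
  assert(ldc >= nrows && nrows >= ncols_C);
  for (std::size_t j = 0; j < ncols_C; ++j) {
    for (std::size_t i = 0; i < nrows; ++i) {
      C[i + j * ldc] = (i == j) ? 1.0 : 0.0;
    }
  }
}
```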

Parameters:
nrows[in] Number of rows in Q and C.
ncols_Q[in] Number of columns in Q.
Q[in] Part of the implicit representation of the Q factor; the A matrix output of factor(). See the factor() documentation for details.
ldq[in] Leading dimension (a.k.a. stride) of Q, if Q is stored in column-major order (not contiguously cache blocked).
factorOutput[in] Return value of factor(), corresponding to Q.
ncols_C[in] Number of columns in the matrix C. This may be different than the number of columns in Q, in which case that number of columns of the Q factor will be computed. There is no restriction on this value, but we optimize performance for the case ncols_C == ncols_Q.
C[out] The first ncols_C columns of the Q factor.
ldc[in] Leading dimension (a.k.a. stride) of C, if C is stored in column-major order (not contiguously cache blocked).
contiguousCacheBlocks[in] Whether the cache blocks of Q and C are stored contiguously. If you don't know what this means, put "false" here.

Definition at line 550 of file TbbTsqr_TbbRecursiveTsqr_Def.hpp.

MatrixViewType TSQR::NodeTsqr< LocalOrdinal , Scalar, KokkosNodeTsqrFactorOutput< LocalOrdinal, Scalar > >::top_block ( const MatrixViewType &  C,
const bool  contiguous_cache_blocks 
) const [inline, inherited]

Return view of topmost cache block of C.

Parameters:
C[in] View of a matrix C.
contiguous_cache_blocks[in] Whether the cache blocks in C are stored contiguously.

Return a view of the topmost cache block (on this node) of the given matrix C. This is not necessarily square, though it must have at least as many rows as columns. For a view of the first C.ncols() rows of that block, which methods like Tsqr::apply() need, do the following:

 MatrixViewType top = this->top_block (C, contiguous_cache_blocks);
 MatView<LocalOrdinal, Scalar> square (C.ncols(), C.ncols(), top.get(), top.lda());

Models for MatrixViewType are MatView and ConstMatView. MatrixViewType must have member functions nrows(), ncols(), get(), and lda(), and its constructor must take the same four arguments as the constructor of ConstMatView.

Definition at line 356 of file Tsqr_NodeTsqr.hpp.

LocalOrdinal TSQR::NodeTsqr< LocalOrdinal , Scalar, KokkosNodeTsqrFactorOutput< LocalOrdinal, Scalar > >::reveal_R_rank ( const LocalOrdinal  ncols,
Scalar  R[],
const LocalOrdinal  ldr,
Scalar  U[],
const LocalOrdinal  ldu,
const typename Teuchos::ScalarTraits< Scalar >::magnitudeType  tol 
) const [inherited]

Reveal rank of TSQR's R factor.

Compute the singular value decomposition (SVD) $R = U \Sigma V^*$. This is done not in place, so that the original R is not affected. Use the resulting singular values to compute the numerical rank of R, with respect to the relative tolerance tol. If R is full rank, return without modifying R. If R is not full rank, overwrite R with $\Sigma \cdot V^*$.
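
The rank decision described above can be sketched independently of the SVD itself. The function name `numerical_rank()` is hypothetical, and the strict comparison against `tol` times the largest singular value is an assumption. The library's exact test may differ, for example in how it treats equality.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Count the singular values that exceed tol times the largest singular value.
// sigma holds the (nonnegative) singular values of R in any order.
std::size_t numerical_rank(const std::vector<double>& sigma, double tol) {
  if (sigma.empty()) {
    return 0;
  }
  const double sigmaMax = *std::max_element(sigma.begin(), sigma.end());
  std::size_t rank = 0;
  for (const double s : sigma) {
    if (s > tol * sigmaMax) {
      ++rank;
    }
  }
  return rank;
}
```

Because the tolerance is relative, scaling R by a nonzero constant does not change the reported rank, and an all-zero R yields rank 0.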

Parameters:
ncols[in] Number of (rows and) columns in R.
R[in/out] ncols x ncols upper triangular matrix, stored in column-major order with leading dimension ldr.
ldr[in] Leading dimension of the matrix R.
U[out] Left singular vectors of the matrix R; an ncols x ncols matrix with leading dimension ldu.
ldu[in] Leading dimension of the matrix U.
tol[in] Numerical rank tolerance; relative to the largest nonzero singular value of R.
Returns:
Numerical rank of R: 0 <= rank <= ncols.

LocalOrdinal TSQR::NodeTsqr< LocalOrdinal , Scalar, KokkosNodeTsqrFactorOutput< LocalOrdinal, Scalar > >::reveal_rank ( const LocalOrdinal  nrows,
const LocalOrdinal  ncols,
Scalar  Q[],
const LocalOrdinal  ldq,
Scalar  R[],
const LocalOrdinal  ldr,
const typename Teuchos::ScalarTraits< Scalar >::magnitudeType  tol,
const bool  contiguousCacheBlocks 
) const [inherited]

Compute rank-revealing decomposition.

Using the R factor from factor() and the explicit Q factor from explicit_Q(), compute the SVD of R ( $R = U \Sigma V^*$). If R is full rank (with respect to the given relative tolerance tol), don't change Q or R. Otherwise, compute $Q := Q \cdot U$ and $R := \Sigma V^*$ in place (the latter may no longer be upper triangular).
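
The update can be checked with a short derivation, assuming $A = QR$ is the factorization given by factor() and explicit_Q(), that $Q$ has orthonormal columns, and that $U$ is square and unitary:

$$ A = Q R = Q \, (U \Sigma V^*) = (Q U) \, (\Sigma V^*), \qquad (Q U)^* (Q U) = U^* Q^* Q \, U = U^* U = I. $$

So the updated Q still has orthonormal columns, and the product of the updated Q and R still equals $A$. Row $i$ of $\Sigma V^*$ has 2-norm $\sigma_i$, because the rows of $V^*$ have unit norm. If the singular values are sorted in decreasing order (the usual SVD convention, assumed here), then for rank $r$ the last ncols - r rows of the updated R are small relative to $\sigma_1$.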

Returns:
Rank $r$ of R: $0 \leq r \leq \mathrm{ncols}$.

The documentation for this class was generated from the following file:
Tsqr_KokkosNodeTsqr.hpp