Kokkos Node API and Local Linear Algebra Kernels Version of the Day

First pass of KokkosNodeTsqr's factorization.
#include <Tsqr_KokkosNodeTsqr.hpp>
Public Member Functions  
FactorFirstPass (const MatView< LocalOrdinal, Scalar > &A, std::vector< std::vector< Scalar > > &tauArrays, std::vector< MatView< LocalOrdinal, Scalar > > &topBlocks, const CacheBlockingStrategy< LocalOrdinal, Scalar > &strategy, const int numPartitions, const bool contiguousCacheBlocks=false)  
Constructor.  
void  execute (const int partitionIndex) 
First pass of intranode TSQR factorization. 
First pass of KokkosNodeTsqr's factorization.
Definition at line 184 of file Tsqr_KokkosNodeTsqr.hpp.
TSQR::details::FactorFirstPass< LocalOrdinal, Scalar >::FactorFirstPass (const MatView< LocalOrdinal, Scalar > &A, std::vector< std::vector< Scalar > > &tauArrays, std::vector< MatView< LocalOrdinal, Scalar > > &topBlocks, const CacheBlockingStrategy< LocalOrdinal, Scalar > &strategy, const int numPartitions, const bool contiguousCacheBlocks=false) [inline]
Constructor.
A  [in/out] On input: View of the matrix to factor. On output: (Part of) the implicitly stored Q factor. (The other part is tauArrays.) 
tauArrays  [out] Where to write the "TAU" arrays (implicit factorization results) for each cache block. (TAU is what LAPACK's QR factorization routines call this array; see the LAPACK documentation for an explanation.) Indexed by the cache block index; one TAU array per cache block. 
topBlocks  [out] Where execute() saves a view of the top block of each partition, for use by the next factorization pass. 
strategy  [in] Cache blocking strategy to use. 
numPartitions  [in] Number of partitions (positive integer), and therefore the maximum parallelism available to the algorithm. Oversubscribing processors is OK, but should not be done to excess. This is an int, and not a LocalOrdinal, because it is the argument to Kokkos' parallel_for. 
contiguousCacheBlocks  [in] Whether the cache blocks of A are stored contiguously. 
Definition at line 339 of file Tsqr_KokkosNodeTsqr.hpp.
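The relationship between numPartitions and the per-cache-block TAU arrays can be illustrated with a minimal sketch. The helper cacheBlockRange below is hypothetical and is not part of TSQR; it only shows one plausible way to deal whole cache blocks out to contiguous partitions, assuming the blocks are numbered from zero and that any leftover blocks go to the lowest-numbered partitions.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>

// Hypothetical helper (not part of TSQR): returns the half-open range
// [first, last) of cache block indices assigned to one partition.
// Cache blocks are never split, and each partition gets a contiguous
// run of them. The first (numCacheBlocks % numPartitions) partitions
// receive one extra block. This distribution rule is an assumption of
// the sketch, not something the TSQR documentation specifies.
inline std::pair<int, int>
cacheBlockRange (const int numCacheBlocks,
                 const int numPartitions,
                 const int partitionIndex)
{
  assert (numCacheBlocks >= 0 && numPartitions > 0);

  // An out-of-range partition owns no cache blocks.
  if (partitionIndex < 0 || partitionIndex >= numPartitions) {
    return std::make_pair (0, 0);
  }
  const int quotient = numCacheBlocks / numPartitions;
  const int remainder = numCacheBlocks % numPartitions;

  // Partitions before this one hold 'quotient' blocks each, plus one
  // extra block for each of the first 'remainder' partitions.
  const int first = partitionIndex * quotient
    + std::min (partitionIndex, remainder);
  const int count = quotient + (partitionIndex < remainder ? 1 : 0);
  return std::make_pair (first, first + count);
}
```

With 10 cache blocks and 4 partitions, this sketch yields the ranges [0, 3), [3, 6), [6, 8), and [8, 10). Under this scheme, a caller would size tauArrays to the total number of cache blocks (10 here), since there is one TAU array per cache block rather than one per partition.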
void TSQR::details::FactorFirstPass< LocalOrdinal, Scalar >::execute (const int partitionIndex) [inline] 
First pass of intranode TSQR factorization.
Invoked by Kokkos' parallel_for template method. This routine parallelizes over contiguous partitions of the matrix. Each partition in turn contains cache blocks. Partitions do not break up cache blocks. (This ensures that the cache blocking scheme is the same as that used by SequentialTsqr, as long as the cache blocking strategies are the same. However, the implicit Q factor is not compatible with that of SequentialTsqr.)
This method also saves a view of the top block of the partition in the topBlocks_ array. This is useful for the next factorization pass.
partitionIndex  [in] Zero-based index of the partition. If greater than or equal to the number of partitions, this routine does nothing. 
Definition at line 393 of file Tsqr_KokkosNodeTsqr.hpp.
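The calling pattern described for execute() can be sketched with a small stand-in. Both RecordingFirstPass and serialParallelFor below are hypothetical names invented for illustration; they do not reproduce FactorFirstPass or Kokkos' parallel_for, and the loop runs serially. The sketch only shows a functor whose execute(int) method is called once per index and silently ignores indices at or beyond the number of partitions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a first-pass functor. Instead of factoring
// cache blocks, it counts how many times each partition was visited.
class RecordingFirstPass {
public:
  RecordingFirstPass (std::vector<int>& visited, const int numPartitions)
    : visited_ (visited), numPartitions_ (numPartitions)
  {
    assert (numPartitions > 0);
    assert (visited.size () >= static_cast<std::size_t> (numPartitions));
  }

  // Called once per index by the parallel loop. Indices greater than or
  // equal to the number of partitions are ignored, as the documentation
  // describes. Ignoring negative indices is an extra assumption here.
  void execute (const int partitionIndex) {
    if (partitionIndex < 0 || partitionIndex >= numPartitions_) {
      return;
    }
    visited_[partitionIndex] += 1;
  }

private:
  std::vector<int>& visited_; // shared output, one slot per partition
  int numPartitions_;
};

// Hypothetical serial stand-in for a parallel_for over [begin, end).
// A real node would distribute the indices across threads.
template <class Functor>
void serialParallelFor (const int begin, const int end, Functor functor) {
  for (int i = begin; i < end; ++i) {
    functor.execute (i);
  }
}
```

Because each call writes only to the slot for its own partition, the calls are independent of one another, which is what makes this pattern safe to run in parallel. Calling serialParallelFor (0, 5, op) on a functor built with 3 partitions visits partitions 0 through 2 once each and leaves the two surplus indices as no-ops.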