Kokkos Node API and Local Linear Algebra Kernels Version of the Day
TSQR::details::ApplyFirstPass< LocalOrdinal, Scalar > Class Template Reference

"First" pass of applying KokkosNodeTsqr's implicit Q factor.

#include <Tsqr_KokkosNodeTsqr.hpp>


Public Member Functions

 ApplyFirstPass (const ApplyType &applyType, const ConstMatView< LocalOrdinal, Scalar > &Q, const std::vector< std::vector< Scalar > > &tauArrays, const std::vector< MatView< LocalOrdinal, Scalar > > &topBlocks, const MatView< LocalOrdinal, Scalar > &C, const CacheBlockingStrategy< LocalOrdinal, Scalar > &strategy, const int numPartitions, const bool explicitQ=false, const bool contiguousCacheBlocks=false)
 Constructor.
void execute (const int partitionIndex)
 First pass of applying intranode TSQR's implicit Q factor.

Detailed Description

template<class LocalOrdinal, class Scalar>
class TSQR::details::ApplyFirstPass< LocalOrdinal, Scalar >

"First" pass of applying KokkosNodeTsqr's implicit Q factor.

Author:
Mark Hoemmen

We call this ApplyFirstPass as a reminder that this algorithm has the same form as FactorFirstPass and uses the results of the latter, even though ApplyFirstPass is really the last pass of applying the implicit Q factor.

Definition at line 424 of file Tsqr_KokkosNodeTsqr.hpp.


Constructor & Destructor Documentation

template<class LocalOrdinal , class Scalar >
TSQR::details::ApplyFirstPass< LocalOrdinal, Scalar >::ApplyFirstPass ( const ApplyType applyType,
const ConstMatView< LocalOrdinal, Scalar > &  Q,
const std::vector< std::vector< Scalar > > &  tauArrays,
const std::vector< MatView< LocalOrdinal, Scalar > > &  topBlocks,
const MatView< LocalOrdinal, Scalar > &  C,
const CacheBlockingStrategy< LocalOrdinal, Scalar > &  strategy,
const int  numPartitions,
const bool  explicitQ = false,
const bool  contiguousCacheBlocks = false 
) [inline]

Constructor.

Parameters:
applyType [in] Whether we are applying Q, Q^T, or Q^H.
Q [in] View of (part of) the implicitly stored Q factor, as computed by FactorFirstPass. (The other part is tauArrays.)
tauArrays [in] The "TAU" arrays (implicit factorization results) for each cache block, as computed by FactorFirstPass. (TAU is what LAPACK's QR factorization routines call this array; see the LAPACK documentation for an explanation.) Indexed by the cache block index; one TAU array per cache block.
strategy [in] Cache blocking strategy to use.
numPartitions [in] Number of partitions (positive integer), and therefore the maximum parallelism available to the algorithm. Oversubscribing processors is OK, but should not be done to excess. This is an int, and not a LocalOrdinal, because it is the argument to Kokkos' parallel_for.
contiguousCacheBlocks [in] Whether the cache blocks of Q and C are stored contiguously.

Definition at line 689 of file Tsqr_KokkosNodeTsqr.hpp.
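
As background for the tauArrays parameter, the following is a minimal sketch of what a single TAU entry means in LAPACK's convention: each elementary reflector has the form H = I - tau * v * v^T (real case), where the first entry of v is implicitly 1 and only the remaining entries are stored. The helper name applyReflector is hypothetical and is not part of TSQR or LAPACK; it only illustrates the arithmetic for one reflector applied to one column.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper for illustration only (not a TSQR or LAPACK routine).
// Applies H = I - tau * v * v^T to the vector c in place, in real arithmetic.
// As in LAPACK's storage convention, v[0] is implicitly 1, and vTail holds
// the stored entries v[1], v[2], ...
void applyReflector(const double tau,
                    const std::vector<double>& vTail,
                    std::vector<double>& c)
{
  assert(c.size() == vTail.size() + 1);

  // w = v^T c, using the implicit leading 1 of v.
  double w = c[0];
  for (std::size_t i = 0; i < vTail.size(); ++i) {
    w += vTail[i] * c[i + 1];
  }

  // c := c - tau * w * v
  c[0] -= tau * w;
  for (std::size_t i = 0; i < vTail.size(); ++i) {
    c[i + 1] -= tau * w * vTail[i];
  }
}
```

With tau = 0 the reflector is the identity, so c is left unchanged. Applying an implicit Q factor amounts to applying a sequence of such reflectors; production code would call the corresponding LAPACK routines rather than a loop like this one.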


Member Function Documentation

template<class LocalOrdinal , class Scalar >
void TSQR::details::ApplyFirstPass< LocalOrdinal, Scalar >::execute ( const int  partitionIndex) [inline]

First pass of applying intranode TSQR's implicit Q factor.

Invoked by Kokkos' parallel_for template method. This routine parallelizes over contiguous partitions of the C matrix. Each partition in turn contains cache blocks. We take care not to break up the cache blocks among partitions; this ensures that the cache blocking scheme is the same as the one SequentialTsqr uses. (However, the implicit Q factor is not compatible with that of SequentialTsqr.)

Parameters:
partitionIndex[in] Zero-based index of the partition which this instance of ApplyFirstPass is currently processing. If greater than or equal to the number of partitions, this routine does nothing.
Warning:
This routine almost certainly won't work in CUDA. If it does, it won't be efficient. If you are interested in a GPU TSQR routine, please contact the author (Mark Hoemmen <mhoemme@sandia.gov>) of this code to discuss the possibilities.
Note:
Unlike typical Kokkos work-data pairs (WDPs) passed into parallel_for, this one is not declared inline. This method is heavyweight enough that an inline declaration is unlikely to improve performance.

Definition at line 735 of file Tsqr_KokkosNodeTsqr.hpp.
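
The partitioning described for execute() can be sketched as follows. This is a toy illustration under stated assumptions, not the TSQR implementation: cacheBlockRange, ToyFirstPass, and serialFor are hypothetical names, the even split of cache blocks is one possible scheme, and serialFor is a serial stand-in for a parallel_for-style dispatcher rather than the Kokkos API. It shows whole cache blocks being assigned to contiguous partitions, and an execute() method that does nothing for an out-of-range partition index.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical helper: half-open range [first, second) of cache block
// indices owned by one partition. Only whole cache blocks are assigned,
// so no cache block is split between partitions.
std::pair<int, int>
cacheBlockRange(const int numCacheBlocks,
                const int numPartitions,
                const int partitionIndex)
{
  assert(numCacheBlocks >= 0 && numPartitions > 0);
  assert(partitionIndex >= 0 && partitionIndex < numPartitions);
  const int quotient = numCacheBlocks / numPartitions;
  const int remainder = numCacheBlocks % numPartitions;
  // The first `remainder` partitions each get one extra cache block.
  const int begin = partitionIndex * quotient + std::min(partitionIndex, remainder);
  const int count = quotient + (partitionIndex < remainder ? 1 : 0);
  return std::make_pair(begin, begin + count);
}

// Hypothetical work-data pair: records which partition touched each block.
class ToyFirstPass {
public:
  ToyFirstPass(std::vector<int>& owner, const int numPartitions)
    : owner_(owner), numPartitions_(numPartitions) {}

  void execute(const int partitionIndex) const {
    // Out-of-range partition index: do nothing.
    if (partitionIndex < 0 || partitionIndex >= numPartitions_) {
      return;
    }
    const std::pair<int, int> range =
      cacheBlockRange(static_cast<int>(owner_.size()), numPartitions_, partitionIndex);
    for (int block = range.first; block < range.second; ++block) {
      owner_[block] = partitionIndex; // stand-in for the per-block work
    }
  }

private:
  std::vector<int>& owner_;
  int numPartitions_;
};

// Serial stand-in for a parallel_for-style dispatcher (assumption: the
// real dispatcher calls execute(i) once for each index in [begin, end)).
template<class WDP>
void serialFor(const int begin, const int end, WDP wdp) {
  for (int i = begin; i < end; ++i) {
    wdp.execute(i);
  }
}
```

Because each partition writes only to its own contiguous range of blocks, the calls to execute() are independent of one another, which is what makes this pattern suitable for a parallel dispatcher.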


The documentation for this class was generated from the following file:
 Tsqr_KokkosNodeTsqr.hpp