Kokkos Node API and Local Linear Algebra Kernels Version of the Day
TSQR::TBB::TbbParallelTsqr< LocalOrdinal, Scalar, TimerType > Class Template Reference

Parallel implementation of TbbTsqr.

#include <TbbTsqr_TbbParallelTsqr.hpp>


Public Types

typedef SequentialTsqr
< LocalOrdinal, Scalar >
::FactorOutput 
SeqOutput
 Results of SequentialTsqr for each core.
typedef std::vector
< std::vector< Scalar > > 
ParOutput
 Array of ncores "local tau arrays" from parallel TSQR.
typedef std::pair< std::vector
< SeqOutput >, ParOutput > 
FactorOutput
 Partial representation of the Q factor.

Public Member Functions

 TbbParallelTsqr (const size_t numCores=1, const size_t cacheSizeHint=0)
 Constructor.
size_t ncores () const
 Number of cores that TSQR will use to solve the problem.
size_t TEUCHOS_DEPRECATED cache_block_size () const
 Cache size hint (in bytes) used for the factorization.
size_t cache_size_hint () const
 Cache size hint (in bytes) used for the factorization.
double min_seq_factor_timing () const
 Fastest time over all tasks of the last SequentialTsqr::factor() call.
double max_seq_factor_timing () const
 Slowest time over all tasks of the last SequentialTsqr::factor() call.
double min_seq_apply_timing () const
 Fastest time over all tasks of the last SequentialTsqr::apply() call.
double max_seq_apply_timing () const
 Slowest time over all tasks of the last SequentialTsqr::apply() call.
void Q_times_B (const LocalOrdinal nrows, const LocalOrdinal ncols, Scalar Q[], const LocalOrdinal ldq, const Scalar B[], const LocalOrdinal ldb, const bool contiguous_cache_blocks) const
 Compute Q*B.
LocalOrdinal reveal_R_rank (const LocalOrdinal ncols, Scalar R[], const LocalOrdinal ldr, Scalar U[], const LocalOrdinal ldu, const magnitude_type tol) const
LocalOrdinal reveal_rank (const LocalOrdinal nrows, const LocalOrdinal ncols, Scalar Q[], const LocalOrdinal ldq, Scalar R[], const LocalOrdinal ldr, const magnitude_type tol, const bool contiguous_cache_blocks=false) const
 Rank-revealing decomposition.

Static Public Member Functions

static bool QR_produces_R_factor_with_nonnegative_diagonal ()

Detailed Description

template<class LocalOrdinal, class Scalar, class TimerType>
class TSQR::TBB::TbbParallelTsqr< LocalOrdinal, Scalar, TimerType >

Parallel implementation of TbbTsqr.

Author:
Mark Hoemmen

This class implements the functionality of TbbTsqr. It is an implementation detail and is not meant to be used directly by users of TbbTsqr.

The third template parameter, TimerType, allows different timer implementations. TbbParallelTsqr times each task's invocations of SequentialTsqr::factor() and SequentialTsqr::apply(). TrivialTimer is a "timer" that does nothing, in case you don't want to invoke timers.
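The idea of selecting a timer through a template parameter can be sketched as follows. This is only an illustration under stated assumptions: the names NoOpTimer, SteadyTimer, and time_task, and the start()/stop() interface, are hypothetical and are not taken from TSQR; the interface that TbbParallelTsqr actually requires of TimerType is not specified on this page.

```cpp
#include <chrono>

// Hypothetical no-op timer: start() does nothing and stop() reports 0 seconds.
// (Assumed interface; not TSQR's actual TrivialTimer.)
struct NoOpTimer {
  void start() {}
  double stop() { return 0.0; }
};

// Hypothetical wall-clock timer built on std::chrono::steady_clock.
class SteadyTimer {
public:
  void start() { begin_ = std::chrono::steady_clock::now(); }

  // Returns the seconds elapsed since the last start() call.
  double stop() {
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - begin_;
    return elapsed.count();
  }

private:
  std::chrono::steady_clock::time_point begin_;
};

// Time one task with whichever timer type the caller selects at compile time.
template <class TimerType, class Task>
double time_task(Task task) {
  TimerType timer;
  timer.start();
  task();
  return timer.stop();
}
```

With this sketch, time_task<NoOpTimer>(f) runs f and always reports 0.0, while time_task<SteadyTimer>(f) reports the measured wall-clock time, so the timing cost can be compiled out without changing the calling code.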

Definition at line 66 of file TbbTsqr_TbbParallelTsqr.hpp.


Member Typedef Documentation

template<class LocalOrdinal, class Scalar, class TimerType>
TSQR::TBB::TbbParallelTsqr< LocalOrdinal, Scalar, TimerType >::SeqOutput

Results of SequentialTsqr for each core.

Definition at line 127 of file TbbTsqr_TbbParallelTsqr.hpp.

template<class LocalOrdinal, class Scalar, class TimerType>
TSQR::TBB::TbbParallelTsqr< LocalOrdinal, Scalar, TimerType >::ParOutput

Array of ncores "local tau arrays" from parallel TSQR.

(Local Q factors are stored in place.)

Definition at line 132 of file TbbTsqr_TbbParallelTsqr.hpp.

template<class LocalOrdinal, class Scalar, class TimerType>
TSQR::TBB::TbbParallelTsqr< LocalOrdinal, Scalar, TimerType >::FactorOutput

Partial representation of the Q factor.

The factor() method returns a pair: the results of SequentialTsqr for data on each core, and the results of combining the data on the cores.

Definition at line 139 of file TbbTsqr_TbbParallelTsqr.hpp.


Constructor & Destructor Documentation

template<class LocalOrdinal, class Scalar, class TimerType>
TSQR::TBB::TbbParallelTsqr< LocalOrdinal, Scalar, TimerType >::TbbParallelTsqr ( const size_t  numCores = 1,
const size_t  cacheSizeHint = 0 
) [inline]

Constructor.

Parameters:
numCores[in] Number of parallel cores to use in the factorization. This should be <= the number of cores with which Intel TBB was initialized.
cacheSizeHint[in] Cache size hint in bytes. Zero means that TSQR will pick a reasonable nonzero default.

Definition at line 148 of file TbbTsqr_TbbParallelTsqr.hpp.


Member Function Documentation

template<class LocalOrdinal, class Scalar, class TimerType>
static bool TSQR::TBB::TbbParallelTsqr< LocalOrdinal, Scalar, TimerType >::QR_produces_R_factor_with_nonnegative_diagonal ( ) [inline, static]

Whether or not this QR factorization produces an R factor with all nonnegative diagonal entries.

Definition at line 117 of file TbbTsqr_TbbParallelTsqr.hpp.

template<class LocalOrdinal, class Scalar, class TimerType>
size_t TSQR::TBB::TbbParallelTsqr< LocalOrdinal, Scalar, TimerType >::ncores ( ) const [inline]

Number of cores that TSQR will use to solve the problem.

That is, the number of subproblems into which to divide the main problem, to solve it in parallel.
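As a rough illustration of dividing a problem into ncores subproblems, the sketch below splits the rows of a matrix into contiguous blocks of nearly equal size. The helper split_rows is hypothetical, and this page does not say how TbbParallelTsqr actually partitions the matrix, so the library's scheme may well differ.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: split nrows rows into nparts contiguous blocks whose
// sizes differ by at most one. Returns (first row, row count) for each block.
inline std::vector<std::pair<std::size_t, std::size_t>>
split_rows(const std::size_t nrows, const std::size_t nparts)
{
  std::vector<std::pair<std::size_t, std::size_t>> blocks;
  if (nparts == 0) {
    return blocks;
  }
  const std::size_t base = nrows / nparts;
  const std::size_t extra = nrows % nparts;  // the first `extra` blocks get one more row
  std::size_t start = 0;
  for (std::size_t p = 0; p < nparts; ++p) {
    const std::size_t count = base + (p < extra ? 1 : 0);
    blocks.emplace_back(start, count);
    start += count;
  }
  return blocks;
}
```

For example, split_rows(10, 3) yields blocks starting at rows 0, 4, and 7 with 4, 3, and 3 rows respectively.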

Definition at line 166 of file TbbTsqr_TbbParallelTsqr.hpp.

template<class LocalOrdinal, class Scalar, class TimerType>
size_t TEUCHOS_DEPRECATED TSQR::TBB::TbbParallelTsqr< LocalOrdinal, Scalar, TimerType >::cache_block_size ( ) const [inline]

Cache size hint (in bytes) used for the factorization.

This method is deprecated, because the name is misleading. Please call cache_size_hint() instead.

Definition at line 172 of file TbbTsqr_TbbParallelTsqr.hpp.

template<class LocalOrdinal, class Scalar, class TimerType>
size_t TSQR::TBB::TbbParallelTsqr< LocalOrdinal, Scalar, TimerType >::cache_size_hint ( ) const [inline]

Cache size hint (in bytes) used for the factorization.

This may be different from the corresponding constructor argument, because TSQR may revise unreasonable suggestions into reasonable values.

Definition at line 181 of file TbbTsqr_TbbParallelTsqr.hpp.

template<class LocalOrdinal, class Scalar, class TimerType>
double TSQR::TBB::TbbParallelTsqr< LocalOrdinal, Scalar, TimerType >::min_seq_factor_timing ( ) const [inline]

Fastest time over all tasks of the last SequentialTsqr::factor() call.

Definition at line 185 of file TbbTsqr_TbbParallelTsqr.hpp.

template<class LocalOrdinal, class Scalar, class TimerType>
double TSQR::TBB::TbbParallelTsqr< LocalOrdinal, Scalar, TimerType >::max_seq_factor_timing ( ) const [inline]

Slowest time over all tasks of the last SequentialTsqr::factor() call.

Definition at line 188 of file TbbTsqr_TbbParallelTsqr.hpp.

template<class LocalOrdinal, class Scalar, class TimerType>
double TSQR::TBB::TbbParallelTsqr< LocalOrdinal, Scalar, TimerType >::min_seq_apply_timing ( ) const [inline]

Fastest time over all tasks of the last SequentialTsqr::apply() call.

Definition at line 191 of file TbbTsqr_TbbParallelTsqr.hpp.

template<class LocalOrdinal, class Scalar, class TimerType>
double TSQR::TBB::TbbParallelTsqr< LocalOrdinal, Scalar, TimerType >::max_seq_apply_timing ( ) const [inline]

Slowest time over all tasks of the last SequentialTsqr::apply() call.

Definition at line 194 of file TbbTsqr_TbbParallelTsqr.hpp.

template<class LocalOrdinal, class Scalar, class TimerType>
void TSQR::TBB::TbbParallelTsqr< LocalOrdinal, Scalar, TimerType >::Q_times_B ( const LocalOrdinal  nrows,
const LocalOrdinal  ncols,
Scalar  Q[],
const LocalOrdinal  ldq,
const Scalar  B[],
const LocalOrdinal  ldb,
const bool  contiguous_cache_blocks 
) const [inline]

Compute Q*B.

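Compute the matrix-matrix product Q*B, where Q is nrows by ncols and B is ncols by ncols. Q is the only non-const array argument, so the product is presumably written back into Q. The computation respects the cache blocks of Q.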
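The arithmetic of this operation can be sketched with a plain reference loop. This is not the library's implementation: the function naive_Q_times_B is hypothetical, it assumes column-major storage with leading dimensions ldq and ldb, it assumes the result overwrites Q (as the non-const Q argument suggests), and it ignores cache blocking entirely.

```cpp
#include <cstddef>
#include <vector>

// Reference sketch (hypothetical, not the library's code): overwrite the
// column-major nrows x ncols matrix Q with Q*B, where B is ncols x ncols.
// ldq and ldb are the leading dimensions (column strides) of Q and B.
// Cache blocking is deliberately ignored here.
template <class Ordinal, class Scalar>
void naive_Q_times_B(const Ordinal nrows, const Ordinal ncols,
                     Scalar Q[], const Ordinal ldq,
                     const Scalar B[], const Ordinal ldb)
{
  std::vector<Scalar> row(static_cast<std::size_t>(ncols));
  for (Ordinal i = 0; i < nrows; ++i) {
    // Copy row i of Q so that it can be overwritten safely.
    for (Ordinal k = 0; k < ncols; ++k) {
      row[static_cast<std::size_t>(k)] = Q[i + k * ldq];
    }
    // (Q*B)(i,j) = sum over k of Q(i,k) * B(k,j)
    for (Ordinal j = 0; j < ncols; ++j) {
      Scalar sum = Scalar(0);
      for (Ordinal k = 0; k < ncols; ++k) {
        sum += row[static_cast<std::size_t>(k)] * B[k + j * ldb];
      }
      Q[i + j * ldq] = sum;
    }
  }
}
```

Because each row of Q*B depends only on the same row of Q, one row-sized temporary is enough to update Q in place; the same property is what would allow the rows to be processed block by block.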

Definition at line 362 of file TbbTsqr_TbbParallelTsqr.hpp.

template<class LocalOrdinal, class Scalar, class TimerType>
LocalOrdinal TSQR::TBB::TbbParallelTsqr< LocalOrdinal, Scalar, TimerType >::reveal_R_rank ( const LocalOrdinal  ncols,
Scalar  R[],
const LocalOrdinal  ldr,
Scalar  U[],
const LocalOrdinal  ldu,
const magnitude_type  tol 
) const [inline]

Compute SVD $R = U \Sigma V^*$, not in place. Use the resulting singular values to compute the numerical rank of R, with respect to the relative tolerance tol. If R is full rank, return without modifying R. If R is not full rank, overwrite R with $\Sigma \cdot V^*$.

Returns:
Numerical rank of R: 0 <= rank <= ncols.
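One common way to turn singular values into a numerical rank is to count those larger than tol times the largest singular value. The sketch below assumes that definition; the helper numerical_rank is hypothetical, and this page does not state the exact criterion that reveal_R_rank() applies.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: count the singular values that exceed tol * sigma_max.
// Assumes sigma holds nonnegative singular values in nonincreasing order,
// which is the order LAPACK-style SVD routines return.
template <class Magnitude>
std::size_t numerical_rank(const std::vector<Magnitude>& sigma,
                           const Magnitude tol)
{
  if (sigma.empty()) {
    return 0;
  }
  const Magnitude cutoff = tol * sigma.front();
  std::size_t rank = 0;
  while (rank < sigma.size() && sigma[rank] > cutoff) {
    ++rank;
  }
  return rank;
}
```

Under this criterion the result always lies between 0 and the number of singular values, matching the documented range 0 <= rank <= ncols; a zero matrix has rank 0 because no singular value exceeds the zero cutoff.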

Definition at line 403 of file TbbTsqr_TbbParallelTsqr.hpp.

template<class LocalOrdinal, class Scalar, class TimerType>
LocalOrdinal TSQR::TBB::TbbParallelTsqr< LocalOrdinal, Scalar, TimerType >::reveal_rank ( const LocalOrdinal  nrows,
const LocalOrdinal  ncols,
Scalar  Q[],
const LocalOrdinal  ldq,
Scalar  R[],
const LocalOrdinal  ldr,
const magnitude_type  tol,
const bool  contiguous_cache_blocks = false 
) const [inline]

Rank-revealing decomposition.

Using the R factor from factor() and the explicit Q factor from explicit_Q(), compute the SVD of R ( $R = U \Sigma V^*$). If R is full rank (with respect to the given relative tolerance tol), don't change Q or R. Otherwise, compute $Q := Q \cdot U$ and $R := \Sigma V^*$ in place (the latter may no longer be upper triangular).

Returns:
Rank $r$ of R: $0 \leq r \leq \mathrm{ncols}$.
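The in-place update in the rank-deficient case preserves the factorization, which the following short derivation makes explicit. Here $A$ is introduced only for illustration and denotes the matrix that was factored as $A = Q R$; the derivation assumes that Q has orthonormal columns and that U is the unitary ncols by ncols factor of the SVD.

```latex
\begin{aligned}
A &= Q R = Q \,(U \Sigma V^*) = (Q U)\,(\Sigma V^*), \\
(Q U)^* (Q U) &= U^* (Q^* Q)\, U = U^* U = I .
\end{aligned}
```

So the updated Q still has orthonormal columns and the product of the updated Q and R is unchanged, even though $\Sigma V^*$ is in general not upper triangular.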

Definition at line 425 of file TbbTsqr_TbbParallelTsqr.hpp.


The documentation for this class was generated from the following file:
TbbTsqr_TbbParallelTsqr.hpp