Kokkos Node API and Local Linear Algebra Kernels Version of the Day
TSQR::TBB::TbbParallelTsqr< LocalOrdinal, Scalar, TimerType > Class Template Reference

Parallel implementation of TbbTsqr.

#include <TbbTsqr_TbbParallelTsqr.hpp>

Public Types

typedef SequentialTsqr
< LocalOrdinal, Scalar >
::FactorOutput 
SeqOutput
 Results of SequentialTsqr for each core.
typedef std::vector
< std::vector< Scalar > > 
ParOutput
 Array of numTasks_ "local tau arrays" from parallel TSQR.
typedef std::pair< std::vector
< SeqOutput >, ParOutput > 
FactorOutput
 Partial representation of the Q factor.

Public Member Functions

 TbbParallelTsqr (const size_t numTasks=1, const size_t cacheSizeHint=0)
 Constructor.
 TbbParallelTsqr (const Teuchos::RCP< Teuchos::ParameterList > &plist)
 Constructor (that takes a parameter list).
size_t ntasks () const
 Number of tasks that TSQR will use to solve the problem.
size_t TEUCHOS_DEPRECATED ncores () const
 Number of tasks that TSQR will use to solve the problem.
size_t TEUCHOS_DEPRECATED cache_block_size () const
 Cache size hint (in bytes) used for the factorization.
size_t cache_size_hint () const
 Cache size hint (in bytes) used for the factorization.
double min_seq_factor_timing () const
 Fastest time over all tasks of the last SequentialTsqr::factor() call.
double max_seq_factor_timing () const
 Slowest time over all tasks of the last SequentialTsqr::factor() call.
double min_seq_apply_timing () const
 Fastest time over all tasks of the last SequentialTsqr::apply() call.
double max_seq_apply_timing () const
 Slowest time over all tasks of the last SequentialTsqr::apply() call.
void Q_times_B (const LocalOrdinal nrows, const LocalOrdinal ncols, Scalar Q[], const LocalOrdinal ldq, const Scalar B[], const LocalOrdinal ldb, const bool contiguous_cache_blocks) const
 Compute Q*B.
LocalOrdinal reveal_R_rank (const LocalOrdinal ncols, Scalar R[], const LocalOrdinal ldr, Scalar U[], const LocalOrdinal ldu, const magnitude_type tol) const
LocalOrdinal reveal_rank (const LocalOrdinal nrows, const LocalOrdinal ncols, Scalar Q[], const LocalOrdinal ldq, Scalar R[], const LocalOrdinal ldr, const magnitude_type tol, const bool contiguous_cache_blocks=false) const
 Rank-revealing decomposition.

Static Public Member Functions

static bool QR_produces_R_factor_with_nonnegative_diagonal ()

Detailed Description

template<class LocalOrdinal, class Scalar, class TimerType>
class TSQR::TBB::TbbParallelTsqr< LocalOrdinal, Scalar, TimerType >

Parallel implementation of TbbTsqr.

Author:
Mark Hoemmen

This class implements the functionality of TbbTsqr. It is an implementation detail: users of TbbTsqr are not meant to use it directly.

The third template parameter, TimerType, allows different timer implementations. TbbParallelTsqr times each task's invocations of SequentialTsqr::factor() and SequentialTsqr::apply(). TrivialTimer is a "timer" that does nothing, in case you don't want to invoke timers.
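
The timer-as-template-parameter idea can be illustrated with a minimal sketch. The names NoOpTimer, WallTimer, and timeTask, and the start()/stop() interface, are hypothetical stand-ins chosen for this example; they are not the actual TSQR timer interface.

```cpp
#include <chrono>

// Hypothetical stand-in for a do-nothing timer (not TSQR's TrivialTimer).
struct NoOpTimer {
  void start() {}
  double stop() { return 0.0; }  // always reports zero elapsed seconds
};

// Hypothetical wall-clock timer exposing the same two-method interface.
class WallTimer {
public:
  void start() { begin_ = std::chrono::steady_clock::now(); }
  double stop() {
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - begin_;
    return elapsed.count();  // elapsed seconds since start()
  }

private:
  std::chrono::steady_clock::time_point begin_;
};

// Runs `work` once and returns the elapsed seconds reported by TimerType.
template <class TimerType, class Work>
double timeTask(Work work) {
  TimerType timer;
  timer.start();
  work();
  return timer.stop();
}
```

Because the timer type is fixed at compile time, a do-nothing timer such as NoOpTimer can be inlined away, so a caller who does not want timings pays essentially nothing for them.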

Definition at line 79 of file TbbTsqr_TbbParallelTsqr.hpp.


Member Typedef Documentation

template<class LocalOrdinal, class Scalar, class TimerType>
TSQR::TBB::TbbParallelTsqr< LocalOrdinal, Scalar, TimerType >::SeqOutput

Results of SequentialTsqr for each core.

Definition at line 140 of file TbbTsqr_TbbParallelTsqr.hpp.

template<class LocalOrdinal, class Scalar, class TimerType>
TSQR::TBB::TbbParallelTsqr< LocalOrdinal, Scalar, TimerType >::ParOutput

Array of numTasks_ "local tau arrays" from parallel TSQR.

(Local Q factors are stored in place.)

Definition at line 146 of file TbbTsqr_TbbParallelTsqr.hpp.

template<class LocalOrdinal, class Scalar, class TimerType>
TSQR::TBB::TbbParallelTsqr< LocalOrdinal, Scalar, TimerType >::FactorOutput

Partial representation of the Q factor.

The factor() method returns a pair: the results of SequentialTsqr for data on each core, and the results of combining the data on the cores.

Definition at line 154 of file TbbTsqr_TbbParallelTsqr.hpp.
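
The shape of this pair can be sketched with standard containers. SeqOutputSketch below is a hypothetical placeholder for SequentialTsqr's FactorOutput, whose real definition is not shown on this page; makeEmptyFactorOutput is likewise an illustrative helper and not part of the class.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical placeholder for SequentialTsqr<LocalOrdinal, Scalar>::FactorOutput.
template <class Scalar>
using SeqOutputSketch = std::vector<std::vector<Scalar>>;

// One "local tau array" per task, mirroring the ParOutput typedef.
template <class Scalar>
using ParOutputSketch = std::vector<std::vector<Scalar>>;

// Pair of (per-task sequential results, results of the parallel combine step).
template <class Scalar>
using FactorOutputSketch =
    std::pair<std::vector<SeqOutputSketch<Scalar>>, ParOutputSketch<Scalar>>;

// Builds an empty result with one slot per task in each half of the pair.
template <class Scalar>
FactorOutputSketch<Scalar> makeEmptyFactorOutput(const std::size_t numTasks) {
  FactorOutputSketch<Scalar> out;
  out.first.resize(numTasks);
  out.second.resize(numTasks);
  return out;
}
```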


Constructor & Destructor Documentation

template<class LocalOrdinal, class Scalar, class TimerType>
TSQR::TBB::TbbParallelTsqr< LocalOrdinal, Scalar, TimerType >::TbbParallelTsqr ( const size_t  numTasks = 1,
const size_t  cacheSizeHint = 0 
) [inline]

Constructor.

Parameters:
numTasks[in] Number of parallel tasks to use in the factorization. This should be >= the number of cores with which Intel TBB was initialized.
cacheSizeHint[in] Cache size hint in bytes. Zero means that TSQR will pick a reasonable nonzero default.

Definition at line 163 of file TbbTsqr_TbbParallelTsqr.hpp.

template<class LocalOrdinal, class Scalar, class TimerType>
TSQR::TBB::TbbParallelTsqr< LocalOrdinal, Scalar, TimerType >::TbbParallelTsqr ( const Teuchos::RCP< Teuchos::ParameterList > &  plist) [inline]

Constructor (that takes a parameter list).

Parameters:
plist[in/out] On input: list of parameters. On output: missing parameters are filled in with default values.

For a list of accepted parameters and their documentation, see the parameter list returned by getValidParameters().

Definition at line 185 of file TbbTsqr_TbbParallelTsqr.hpp.


Member Function Documentation

template<class LocalOrdinal, class Scalar, class TimerType>
static bool TSQR::TBB::TbbParallelTsqr< LocalOrdinal, Scalar, TimerType >::QR_produces_R_factor_with_nonnegative_diagonal ( ) [inline, static]

Whether or not this QR factorization produces an R factor with all nonnegative diagonal entries.

Definition at line 130 of file TbbTsqr_TbbParallelTsqr.hpp.

template<class LocalOrdinal, class Scalar, class TimerType>
size_t TSQR::TBB::TbbParallelTsqr< LocalOrdinal, Scalar, TimerType >::ntasks ( ) const [inline]

Number of tasks that TSQR will use to solve the problem.

This is the number of subproblems into which to divide the main problem, in order to solve it in parallel.

Definition at line 259 of file TbbTsqr_TbbParallelTsqr.hpp.

template<class LocalOrdinal, class Scalar, class TimerType>
size_t TEUCHOS_DEPRECATED TSQR::TBB::TbbParallelTsqr< LocalOrdinal, Scalar, TimerType >::ncores ( ) const [inline]

Number of tasks that TSQR will use to solve the problem.

This is the number of subproblems into which to divide the main problem, in order to solve it in parallel.

This method is deprecated, because the name is misleading. Please call ntasks() instead.

Definition at line 268 of file TbbTsqr_TbbParallelTsqr.hpp.

template<class LocalOrdinal, class Scalar, class TimerType>
size_t TEUCHOS_DEPRECATED TSQR::TBB::TbbParallelTsqr< LocalOrdinal, Scalar, TimerType >::cache_block_size ( ) const [inline]

Cache size hint (in bytes) used for the factorization.

This method is deprecated, because the name is misleading. Please call cache_size_hint() instead.

Definition at line 274 of file TbbTsqr_TbbParallelTsqr.hpp.

template<class LocalOrdinal, class Scalar, class TimerType>
size_t TSQR::TBB::TbbParallelTsqr< LocalOrdinal, Scalar, TimerType >::cache_size_hint ( ) const [inline]

Cache size hint (in bytes) used for the factorization.

This may be different from the corresponding constructor argument, because TSQR may revise unreasonable suggestions into reasonable values.

Definition at line 283 of file TbbTsqr_TbbParallelTsqr.hpp.
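
One way a hint might be revised is sketched below. The default and minimum values and the function name resolveCacheSizeHint are assumptions made up for illustration; this page does not state which values TSQR actually uses or how it adjusts them.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical constants, chosen only for illustration.
constexpr std::size_t kDefaultCacheSizeHint = 32 * 1024;  // bytes
constexpr std::size_t kMinCacheSizeHint = 4 * 1024;       // bytes

// Maps a requested hint to the value that would be reported back:
// zero selects the default, and tiny requests are raised to the minimum.
inline std::size_t resolveCacheSizeHint(const std::size_t requested) {
  if (requested == 0) {
    return kDefaultCacheSizeHint;
  }
  return std::max(requested, kMinCacheSizeHint);
}
```

Under these assumed constants, a caller who passes 0 or 100 to the constructor would see a different value from cache_size_hint(), while a request of one megabyte would be returned unchanged.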

template<class LocalOrdinal, class Scalar, class TimerType>
double TSQR::TBB::TbbParallelTsqr< LocalOrdinal, Scalar, TimerType >::min_seq_factor_timing ( ) const [inline]

Fastest time over all tasks of the last SequentialTsqr::factor() call.

Definition at line 287 of file TbbTsqr_TbbParallelTsqr.hpp.

template<class LocalOrdinal, class Scalar, class TimerType>
double TSQR::TBB::TbbParallelTsqr< LocalOrdinal, Scalar, TimerType >::max_seq_factor_timing ( ) const [inline]

Slowest time over all tasks of the last SequentialTsqr::factor() call.

Definition at line 290 of file TbbTsqr_TbbParallelTsqr.hpp.

template<class LocalOrdinal, class Scalar, class TimerType>
double TSQR::TBB::TbbParallelTsqr< LocalOrdinal, Scalar, TimerType >::min_seq_apply_timing ( ) const [inline]

Fastest time over all tasks of the last SequentialTsqr::apply() call.

Definition at line 293 of file TbbTsqr_TbbParallelTsqr.hpp.

template<class LocalOrdinal, class Scalar, class TimerType>
double TSQR::TBB::TbbParallelTsqr< LocalOrdinal, Scalar, TimerType >::max_seq_apply_timing ( ) const [inline]

Slowest time over all tasks of the last SequentialTsqr::apply() call.

Definition at line 296 of file TbbTsqr_TbbParallelTsqr.hpp.
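
The four timing accessors above report a minimum or maximum over per-task measurements. A minimal sketch of that reduction follows, assuming the per-task elapsed times are available as a vector of seconds; the function name minMaxTiming is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Returns (fastest, slowest) over the per-task elapsed times in seconds.
inline std::pair<double, double> minMaxTiming(
    const std::vector<double>& taskSeconds) {
  assert(!taskSeconds.empty());  // at least one task must have been timed
  const auto mm = std::minmax_element(taskSeconds.begin(), taskSeconds.end());
  return std::make_pair(*mm.first, *mm.second);
}
```

A large gap between the two values suggests load imbalance among the tasks.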

template<class LocalOrdinal, class Scalar, class TimerType>
void TSQR::TBB::TbbParallelTsqr< LocalOrdinal, Scalar, TimerType >::Q_times_B ( const LocalOrdinal  nrows,
const LocalOrdinal  ncols,
Scalar  Q[],
const LocalOrdinal  ldq,
const Scalar  B[],
const LocalOrdinal  ldb,
const bool  contiguous_cache_blocks 
) const [inline]

Compute Q*B.

Compute matrix-matrix product Q*B, where Q is nrows by ncols and B is ncols by ncols. Respect cache blocks of Q.

Definition at line 464 of file TbbTsqr_TbbParallelTsqr.hpp.
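
The arithmetic can be sketched without any cache blocking. The version below assumes column-major storage with leading dimensions ldq and ldb, and assumes the product overwrites Q (suggested by Q being the only non-const array argument, but not stated on this page). The name qTimesBSketch is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Overwrites the nrows-by-ncols column-major matrix Q with Q*B, where B is
// ncols-by-ncols and column-major. ldq and ldb are the leading dimensions
// (column strides). Cache blocks are ignored in this sketch.
template <class Ordinal, class Scalar>
void qTimesBSketch(const Ordinal nrows, const Ordinal ncols,
                   Scalar Q[], const Ordinal ldq,
                   const Scalar B[], const Ordinal ldb) {
  assert(ldq >= nrows && ldb >= ncols);
  std::vector<Scalar> row(static_cast<std::size_t>(ncols));
  for (Ordinal i = 0; i < nrows; ++i) {
    // Build row i of the product in a temporary, because Q is overwritten.
    for (Ordinal j = 0; j < ncols; ++j) {
      Scalar sum = Scalar(0);
      for (Ordinal k = 0; k < ncols; ++k) {
        sum += Q[i + k * ldq] * B[k + j * ldb];
      }
      row[static_cast<std::size_t>(j)] = sum;
    }
    // Copy the finished row back into Q.
    for (Ordinal j = 0; j < ncols; ++j) {
      Q[i + j * ldq] = row[static_cast<std::size_t>(j)];
    }
  }
}
```

Each row of the product depends only on the same row of Q, which is why the update can be applied independently to separate row blocks.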

template<class LocalOrdinal, class Scalar, class TimerType>
LocalOrdinal TSQR::TBB::TbbParallelTsqr< LocalOrdinal, Scalar, TimerType >::reveal_R_rank ( const LocalOrdinal  ncols,
Scalar  R[],
const LocalOrdinal  ldr,
Scalar  U[],
const LocalOrdinal  ldu,
const magnitude_type  tol 
) const [inline]

Compute SVD $R = U \Sigma V^*$, not in place. Use the resulting singular values to compute the numerical rank of R, with respect to the relative tolerance tol. If R is full rank, return without modifying R. If R is not full rank, overwrite R with $\Sigma \cdot V^*$.

Returns:
Numerical rank of R: 0 <= rank <= ncols.

Definition at line 505 of file TbbTsqr_TbbParallelTsqr.hpp.
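
The rank decision itself can be sketched as follows, assuming "relative tolerance" means a singular value counts toward the rank when it exceeds tol times the largest singular value. That criterion and the name numericalRank are assumptions for illustration; this page does not spell out the exact test.

```cpp
#include <cstddef>
#include <vector>

// Counts the singular values that exceed tol * sigma_max. Assumes sigma is
// nonnegative and sorted in nonincreasing order, so sigma.front() is the
// largest singular value.
template <class Magnitude>
std::size_t numericalRank(const std::vector<Magnitude>& sigma,
                          const Magnitude tol) {
  if (sigma.empty()) {
    return 0;
  }
  const Magnitude threshold = tol * sigma.front();
  std::size_t rank = 0;
  for (const Magnitude s : sigma) {
    if (s > threshold) {
      ++rank;
    }
  }
  return rank;
}
```

With this criterion a zero matrix has rank 0, and the result always lies between 0 and the number of singular values, matching the documented range 0 <= rank <= ncols.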

template<class LocalOrdinal, class Scalar, class TimerType>
LocalOrdinal TSQR::TBB::TbbParallelTsqr< LocalOrdinal, Scalar, TimerType >::reveal_rank ( const LocalOrdinal  nrows,
const LocalOrdinal  ncols,
Scalar  Q[],
const LocalOrdinal  ldq,
Scalar  R[],
const LocalOrdinal  ldr,
const magnitude_type  tol,
const bool  contiguous_cache_blocks = false 
) const [inline]

Rank-revealing decomposition.

Using the R factor from factor() and the explicit Q factor from explicit_Q(), compute the SVD of R ($R = U \Sigma V^*$). If R is full rank (with respect to the given relative tolerance tol), don't change Q or R. Otherwise, compute $Q := Q \cdot U$ and $R := \Sigma V^*$ in place (the latter may no longer be upper triangular).

Returns:
Rank $r$ of R: $0 \leq r \leq \mathrm{ncols}$.

Definition at line 527 of file TbbTsqr_TbbParallelTsqr.hpp.
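
That the rank-deficient update leaves the product of the two factors unchanged can be checked in one line. Here $A$ is a symbol introduced for this note to denote the originally factored matrix, so that $A = QR$ before the update:

```latex
A = Q R = Q \, (U \Sigma V^*) = (Q U) \, (\Sigma V^*)
```

Since $U$ is unitary, the new $Q U$ still has orthonormal columns. The new $\Sigma V^*$ is in general a full matrix, which is why R may lose its upper triangular structure.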


The documentation for this class was generated from the following file:
TbbTsqr_TbbParallelTsqr.hpp