Kokkos Node API and Local Linear Algebra Kernels Version of the Day
Kokkos::ThrustGPUNode Class Reference

Kokkos node interface to the Thrust library for NVIDIA CUDA-capable GPUs.

#include <Kokkos_ThrustGPUNode.hpp>

[Inheritance diagram for Kokkos::ThrustGPUNode]


Public Member Functions

 ThrustGPUNode (Teuchos::ParameterList &pl)
 Constructor accepting a list of parameters.
 ~ThrustGPUNode ()
 Destructor has no effect.
template<class T >
ArrayRCP< T > allocBuffer (size_t size)
 Allocate a parallel buffer, returning it as a pointer encapsulated in an ArrayRCP.
template<class T >
void copyFromBuffer (size_t size, const ArrayRCP< const T > &buffSrc, const ArrayView< T > &hostDest)
 Copy data to host memory from a parallel buffer.
template<class T >
void copyToBuffer (size_t size, const ArrayView< const T > &hostSrc, const ArrayRCP< T > &buffDest)
 Copy data from host memory to a parallel buffer.
template<class T >
void copyBuffers (size_t size, const ArrayRCP< const T > &buffSrc, const ArrayRCP< T > &buffDest)
 Copy data between buffers.
template<class T >
ArrayRCP< const T > viewBuffer (size_t size, ArrayRCP< const T > buff)
 Return a const view of a buffer for use on the host. This creates a const view of length size, consisting of the first size entries of buff as they exist at the time of view creation. Host memory allocated for the view is automatically deleted when no references to the view remain.
template<class T >
ArrayRCP< T > viewBufferNonConst (ReadWriteOption rw, size_t size, const ArrayRCP< T > &buff)
 Return a non-const view of a buffer for use on the host.
void printStatistics (const RCP< Teuchos::FancyOStream > &os) const
 Print some statistics regarding node allocation and memory transfer.
void clearStatistics ()
 Clear all statistics on memory transfer.

Static Public Attributes

static const bool isHostNode = false
 Indicates that parallel buffers allocated by this node are not available for use on the host thread.
void sync () const
 Block until all node work is complete. Aids in accurate timing of multiple kernels.
template<class WDP >
static void parallel_for (int begin, int end, WDP wdp)
 Parallel-for skeleton, a wrapper around thrust::for_each. See the Kokkos Node API.
template<class WDP >
static WDP::ReductionType parallel_reduce (int begin, int end, WDP wd)
 Parallel reduction skeleton, a wrapper around thrust::transform_reduce. See the Kokkos Node API.

Detailed Description

Kokkos node interface to the Thrust library for NVIDIA CUDA-capable GPUs.

Definition at line 13 of file Kokkos_ThrustGPUNode.hpp.


Constructor & Destructor Documentation

Kokkos::ThrustGPUNode::ThrustGPUNode ( Teuchos::ParameterList &  pl)

Constructor accepting a list of parameters.

This constructor accepts the parameters:

Parameters:
"Device Number" [int]  Specifies the CUDA device to which the node will attach.
"Verbose" [int]  A non-zero value makes the constructor verbose, printing information about the attached device. Default: 0.

The constructor throws std::runtime_error if "Device Number" is outside the range $[0,numDevices)$, where numDevices is the number of CUDA devices reported by cudaGetDeviceCount().
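
The range check described above can be sketched in isolation. This is a minimal, host-only illustration, not the code in Kokkos_ThrustGPUNode.cpp: the function name checkedDeviceNumber is hypothetical, and the numDevices argument stands in for the count that cudaGetDeviceCount() would report.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the constructor's range check; not library code.
// numDevices plays the role of the count reported by cudaGetDeviceCount().
inline int checkedDeviceNumber(int deviceNumber, int numDevices) {
  // Valid device numbers lie in the half-open range [0, numDevices).
  if (deviceNumber < 0 || deviceNumber >= numDevices) {
    throw std::runtime_error("\"Device Number\" " + std::to_string(deviceNumber) +
                             " is outside [0," + std::to_string(numDevices) + ")");
  }
  return deviceNumber;
}
```

In actual use, the value would come from the "Device Number" entry of the Teuchos::ParameterList passed to the constructor.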

Definition at line 8 of file Kokkos_ThrustGPUNode.cpp.

Kokkos::ThrustGPUNode::~ThrustGPUNode ( )

Destructor has no effect.

Definition at line 49 of file Kokkos_ThrustGPUNode.cpp.


Member Function Documentation

template<class WDP >
void Kokkos::ThrustGPUNode::parallel_for ( int  begin,
int  end,
WDP  wdp 
) [static]

Parallel-for skeleton, a wrapper around thrust::for_each. See the Kokkos Node API.

Definition at line 53 of file Kokkos_ThrustGPUNode.cuh.
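
To make the shape of a work-data pair (WDP) concrete, the following is a serial, host-only sketch of the skeleton. It assumes the WDP exposes an execute(int) member, which is the convention we understand the Kokkos Node API to use; the names ScaleOp and parallel_for_sketch are hypothetical, and std::for_each stands in for thrust::for_each.

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

// Hypothetical work-data pair: scales each entry of x by alpha.
// The execute(int) member name is an assumption about the Node API convention.
struct ScaleOp {
  double alpha;
  double* x;
  void execute(int i) const { x[i] *= alpha; }
};

// Serial host stand-in for the skeleton: calls wdp.execute(i) once for
// every index i in the half-open range [begin, end).
template <class WDP>
void parallel_for_sketch(int begin, int end, WDP wdp) {
  std::vector<int> idx(end > begin ? end - begin : 0);
  std::iota(idx.begin(), idx.end(), begin);  // begin, begin+1, ..., end-1
  std::for_each(idx.begin(), idx.end(), [&wdp](int i) { wdp.execute(i); });
}
```

Because the WDP is passed by value, any state that must outlive the call (here, the array x) is held through a pointer rather than stored in the object itself.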

template<class WDP >
WDP::ReductionType Kokkos::ThrustGPUNode::parallel_reduce ( int  begin,
int  end,
WDP  wd 
) [static]

Parallel reduction skeleton, a wrapper around thrust::transform_reduce. See the Kokkos Node API.

Definition at line 77 of file Kokkos_ThrustGPUNode.cuh.
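
A reduction WDP can be sketched the same way. The version below is a serial, host-only stand-in: it assumes the WDP supplies a ReductionType typedef (which the signature above requires) plus identity(), generate(int), and reduce(a, b) members, which is our understanding of the Node API convention rather than something this page states. The names DotOp and parallel_reduce_sketch are hypothetical.

```cpp
#include <vector>

// Hypothetical work-data pair computing a dot product of x and y.
// identity/generate/reduce are assumed member names, not confirmed by this page.
struct DotOp {
  typedef double ReductionType;
  const double* x;
  const double* y;
  static ReductionType identity() { return 0.0; }
  ReductionType generate(int i) const { return x[i] * y[i]; }
  ReductionType reduce(ReductionType a, ReductionType b) const { return a + b; }
};

// Serial host stand-in: transform each index with generate(), then fold the
// results with reduce(), starting from identity().
template <class WDP>
typename WDP::ReductionType parallel_reduce_sketch(int begin, int end, WDP wd) {
  typename WDP::ReductionType result = WDP::identity();
  for (int i = begin; i < end; ++i) {
    result = wd.reduce(result, wd.generate(i));
  }
  return result;
}
```

An empty range returns identity(), which is why the identity must be neutral with respect to reduce().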

void Kokkos::ThrustGPUNode::sync ( ) const

Block until all node work is complete. Aids in accurate timing of multiple kernels.

Definition at line 51 of file Kokkos_ThrustGPUNode.cpp.

template<class T >
ArrayRCP< T > Kokkos::CUDANodeMemoryModel::allocBuffer ( size_t  size) [inline, inherited]

Allocate a parallel buffer, returning it as a pointer encapsulated in an ArrayRCP.

Dereferencing the returned ArrayRCP or its underlying pointer in general results in undefined behavior outside of parallel computations.

The buffer will be automatically freed by the Node when no more references remain.

Template Parameters:
T  The data type of the allocated buffer. This is used to perform alignment and determine the number of bytes to allocate.
Parameters:
[in]  size  The size requested for the parallel buffer, greater than zero.
Postcondition:
The method will return an ArrayRCP encapsulating a pointer. The underlying pointer may be used in parallel computation routines, and is guaranteed to reference storage large enough for size entries of type T.

Definition at line 24 of file Kokkos_CUDANodeMemoryModelImpl.hpp.
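
The "freed when no more references remain" behavior can be illustrated with a reference-counted pointer and a custom deleter. This is a host-only sketch under stated assumptions: std::shared_ptr stands in for ArrayRCP, std::malloc and std::free stand in for the device allocator, and allocBufferSketch and g_freedBuffers are hypothetical names used only to observe the release.

```cpp
#include <cstddef>
#include <cstdlib>
#include <memory>

// Counts releases so the automatic free can be observed; illustration only.
static int g_freedBuffers = 0;

// Hypothetical stand-in for allocBuffer: the custom deleter runs exactly once,
// when the last shared_ptr referring to the allocation goes away.
template <class T>
std::shared_ptr<T> allocBufferSketch(std::size_t size) {
  T* raw = static_cast<T*>(std::malloc(size * sizeof(T)));
  return std::shared_ptr<T>(raw, [](T* p) {
    std::free(p);
    ++g_freedBuffers;
  });
}
```

Copying the returned handle extends the buffer's lifetime; resetting one copy does not free the storage while another copy is still alive.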

template<class T >
void Kokkos::CUDANodeMemoryModel::copyFromBuffer ( size_t  size,
const ArrayRCP< const T > &  buffSrc,
const ArrayView< T > &  hostDest 
) [inline, inherited]

Copy data to host memory from a parallel buffer.

Parameters:
[in]  size  The number of entries to copy from buffSrc to hostDest.
[in]  buffSrc  The parallel buffer from which to copy.
[out]  hostDest  The location in host memory to which the data from buffSrc is copied.
Precondition:
size is non-negative.
buffSrc has length at least size.
hostDest has length equal to size.
Postcondition:
On return, entries in the range [0 , size) of buffSrc have been copied to hostDest entries in the range [0 , size).

Definition at line 46 of file Kokkos_CUDANodeMemoryModelImpl.hpp.
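
The preconditions and postcondition above can be written out as a small host-only sketch. It is not the implementation in Kokkos_CUDANodeMemoryModelImpl.hpp: std::vector stands in for both the parallel buffer and the host array, the name copyFromBufferSketch is hypothetical, and throwing std::invalid_argument on a violated precondition is a choice made for this example only.

```cpp
#include <algorithm>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical host-only stand-in for copyFromBuffer.
template <class T>
void copyFromBufferSketch(std::size_t size, const std::vector<T>& buffSrc,
                          std::vector<T>& hostDest) {
  // Precondition: buffSrc has length at least size.
  if (buffSrc.size() < size) {
    throw std::invalid_argument("buffSrc is shorter than size");
  }
  // Precondition: hostDest has length equal to size.
  if (hostDest.size() != size) {
    throw std::invalid_argument("hostDest length must equal size");
  }
  // Postcondition: entries [0, size) of buffSrc are copied to hostDest.
  std::copy(buffSrc.begin(),
            buffSrc.begin() + static_cast<std::ptrdiff_t>(size),
            hostDest.begin());
}
```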

template<class T >
void Kokkos::CUDANodeMemoryModel::copyToBuffer ( size_t  size,
const ArrayView< const T > &  hostSrc,
const ArrayRCP< T > &  buffDest 
) [inline, inherited]

Copy data from host memory to a parallel buffer.

Parameters:
[in]  size  The number of entries to copy from hostSrc to buffDest.
[in]  hostSrc  The location in host memory from which the data is copied.
[out]  buffDest  The parallel buffer to which the data is copied.
Precondition:
size is non-negative.
hostSrc has length equal to size.
buffDest has length at least size.
Postcondition:
On return, entries in the range [0 , size) of hostSrc are allowed to be written to. The data is guaranteed to be present in buffDest before it is used in a parallel computation.

Definition at line 65 of file Kokkos_CUDANodeMemoryModelImpl.hpp.

template<class T >
void Kokkos::CUDANodeMemoryModel::copyBuffers ( size_t  size,
const ArrayRCP< const T > &  buffSrc,
const ArrayRCP< T > &  buffDest 
) [inline, inherited]

Copy data between buffers.

Parameters:
[in]  size  The size of the copy, greater than zero.
[in]  buffSrc  The source buffer, with length at least as large as size.
[in,out]  buffDest  The destination buffer, with length at least as large as size.
Postcondition:
The data is guaranteed to have been copied before any other usage of buffSrc or buffDest occurs.

Definition at line 84 of file Kokkos_CUDANodeMemoryModelImpl.hpp.

template<class T >
ArrayRCP< const T > Kokkos::CUDANodeMemoryModel::viewBuffer ( size_t  size,
ArrayRCP< const T >  buff 
) [inline, inherited]

Return a const view of a buffer for use on the host. This creates a const view of length size, consisting of the first size entries of buff as they exist at the time of view creation. Host memory allocated for the view is automatically deleted when no references to the view remain.

Precondition:
buff.size() >= size

Definition at line 105 of file Kokkos_CUDANodeMemoryModelImpl.hpp.

template<class T >
ArrayRCP< T > Kokkos::CUDANodeMemoryModel::viewBufferNonConst ( ReadWriteOption  rw,
size_t  size,
const ArrayRCP< T > &  buff 
) [inline, inherited]

Return a non-const view of a buffer for use on the host.

Parameters:
[in]  rw  Specifies Kokkos::ReadWrite or Kokkos::WriteOnly. With Kokkos::WriteOnly, the contents of the view are undefined when it is created and must be initialized on the host; in exchange, this avoids the device-to-host copy that Kokkos::ReadWrite may need to set the view's initial values.

This creates a view of length size, constituting the first size entries of buff, as they exist at the time of view creation.

A non-const view permits changes, which must be copied back to the buffer. This does not occur until all references to the view are deleted. If the buffer is deallocated before the view is deleted, then the copy-back does not occur.

Precondition:
buff.size() >= size

Definition at line 120 of file Kokkos_CUDANodeMemoryModelImpl.hpp.
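
The copy-back rule described above can be modeled on the host with a reference-counted view whose deleter writes the data back. This is a sketch under stated assumptions, not the library's implementation: std::shared_ptr stands in for ArrayRCP, a std::vector stands in for device memory, and the names BufferSketch, ReadWriteOptionSketch, and viewBufferNonConstSketch are hypothetical. One deliberate difference: with WriteOnly the sketch zero-fills the view, whereas the documented contents are undefined.

```cpp
#include <algorithm>
#include <cstddef>
#include <memory>
#include <vector>

enum ReadWriteOptionSketch { ReadWrite, WriteOnly };  // stand-in for ReadWriteOption

typedef std::shared_ptr<std::vector<double> > BufferSketch;  // stand-in "device" buffer

inline std::shared_ptr<std::vector<double> > viewBufferNonConstSketch(
    ReadWriteOptionSketch rw, std::size_t size, const BufferSketch& buff) {
  const std::ptrdiff_t n = static_cast<std::ptrdiff_t>(size);
  std::vector<double>* host = new std::vector<double>(size);
  // ReadWrite pays for an initial copy; WriteOnly skips it.
  if (rw == ReadWrite) {
    std::copy(buff->begin(), buff->begin() + n, host->begin());
  }
  // Hold the buffer weakly: if it is gone when the view dies, skip the copy-back.
  std::weak_ptr<std::vector<double> > weak = buff;
  return std::shared_ptr<std::vector<double> >(
      host, [weak, n](std::vector<double>* h) {
        if (BufferSketch b = weak.lock()) {
          std::copy(h->begin(), h->begin() + n, b->begin());
        }
        delete h;
      });
}
```

Changes made through the view are not visible in the buffer until the last reference to the view is released, so a view should be scoped tightly around the host-side work.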

void Kokkos::CUDANodeMemoryModel::printStatistics ( const RCP< Teuchos::FancyOStream > &  os) const [inherited]

Print some statistics regarding node allocation and memory transfer.

Definition at line 22 of file Kokkos_CUDANodeMemoryModel.cpp.

void Kokkos::CUDANodeMemoryModel::clearStatistics ( ) [inherited]

Clear all statistics on memory transfer.

Definition at line 13 of file Kokkos_CUDANodeMemoryModel.cpp.


Member Data Documentation

const bool Kokkos::CUDANodeMemoryModel::isHostNode = false [static, inherited]

Indicates that parallel buffers allocated by this node are not available for use on the host thread.

Definition at line 32 of file Kokkos_CUDANodeMemoryModel.hpp.


The documentation for this class was generated from the following files: