Kokkos Node API and Local Linear Algebra Kernels Version of the Day
Kokkos node interface to the Thrust library for NVIDIA CUDA-capable GPUs. More...
#include <Kokkos_ThrustGPUNode.hpp>

Public Member Functions | |
| ThrustGPUNode (Teuchos::ParameterList &pl) | |
| Constructor accepting a list of parameters. | |
| ~ThrustGPUNode () | |
| Destructor has no effect. | |
| template<class T > | |
| ArrayRCP< T > | allocBuffer (size_t size) |
| Allocate a parallel buffer, returning it as a pointer encapsulated in an ArrayRCP. | |
| template<class T > | |
| void | copyFromBuffer (size_t size, const ArrayRCP< const T > &buffSrc, const ArrayView< T > &hostDest) |
| Copy data to host memory from a parallel buffer. | |
| template<class T > | |
| void | copyToBuffer (size_t size, const ArrayView< const T > &hostSrc, const ArrayRCP< T > &buffDest) |
| Copy data from host memory to a parallel buffer. | |
| template<class T > | |
| void | copyBuffers (size_t size, const ArrayRCP< const T > &buffSrc, const ArrayRCP< T > &buffDest) |
| Copy data between buffers. | |
| template<class T > | |
| ArrayRCP< const T > | viewBuffer (size_t size, ArrayRCP< const T > buff) |
| Return a const view of a buffer for use on the host. This creates a const view of length size, constituting the first size entries of buff, as they exist at the time of view creation. Host memory allocated for the creation of this view is automatically deleted when no references to the view remain. | |
| template<class T > | |
| ArrayRCP< T > | viewBufferNonConst (ReadWriteOption rw, size_t size, const ArrayRCP< T > &buff) |
| Return a non-const view of a buffer for use on the host. | |
| void | printStatistics (const RCP< Teuchos::FancyOStream > &os) const |
| Print some statistics regarding node allocation and memory transfer. | |
| void | clearStatistics () |
| Clear all statistics on memory transfer. | |
Static Public Attributes | |
| static const bool | isHostNode = false |
| Indicates that parallel buffers allocated by this node are not available for use on the host thread. | |
| void | sync () const |
| Block until all node work is complete. Aids in accurate timing of multiple kernels. | |
| template<class WDP > | |
| static void | parallel_for (int begin, int end, WDP wdp) |
| Parallel for skeleton, a wrapper around thrust::for_each. See Kokkos Node API. | |
| template<class WDP > | |
| static WDP::ReductionType | parallel_reduce (int begin, int end, WDP wd) |
| Parallel reduction skeleton, a wrapper around thrust::transform_reduce. See Kokkos Node API. | |
Kokkos node interface to the Thrust library for NVIDIA CUDA-capable GPUs.
Definition at line 54 of file Kokkos_ThrustGPUNode.hpp.
Kokkos::ThrustGPUNode::ThrustGPUNode (Teuchos::ParameterList &pl)
Constructor accepting a list of parameters.
This constructor accepts the parameters:
| Device Number | [int] Specifies the CUDA device to which the node will attach. |
| Verbose | [int] A non-zero value makes the constructor verbose, printing information about the attached device. Default: 0. |
The constructor throws std::runtime_error if "Device Number" is outside the range [0, numDevices), where numDevices is the number of CUDA devices reported by cudaGetDeviceCount().
Definition at line 49 of file Kokkos_ThrustGPUNode.cpp.
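The range check described for "Device Number" can be illustrated with a small host-only sketch. It assumes the valid range is the half-open interval [0, numDevices). The helper name checkedDeviceNumber is hypothetical and not part of Kokkos, and in real code numDevices would be obtained from cudaGetDeviceCount() rather than passed in.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of Kokkos): mirrors the documented behavior
// of throwing std::runtime_error for an out-of-range device number.
// Assumption: the valid range is [0, numDevices).
inline int checkedDeviceNumber(int deviceNumber, int numDevices) {
  assert(numDevices >= 0);  // a device count is never negative
  if (deviceNumber < 0 || deviceNumber >= numDevices) {
    throw std::runtime_error(
        "\"Device Number\" " + std::to_string(deviceNumber) +
        " is outside the range [0, " + std::to_string(numDevices) + ")");
  }
  return deviceNumber;
}
```

A caller could run a check like this before storing the value under the "Device Number" key of the Teuchos::ParameterList handed to the constructor, so that a bad index is reported before any node is built.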
Kokkos::ThrustGPUNode::~ThrustGPUNode ()
Destructor has no effect.
Definition at line 90 of file Kokkos_ThrustGPUNode.cpp.
void Kokkos::ThrustGPUNode::parallel_for (int begin, int end, WDP wdp) [static]
Parallel for skeleton, a wrapper around thrust::for_each. See Kokkos Node API.
Definition at line 96 of file Kokkos_ThrustGPUNode.cuh.
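As a rough illustration of the parallel for skeleton, the following host-only sketch runs a work-data pair serially. It is not the Thrust-backed implementation: SerialForSketch, ScaleOp and scaledCopy are hypothetical names, and both the execute(int) member and the half-open iteration range [begin, end) are assumptions about the Kokkos Node API that this page does not spell out.

```cpp
#include <cassert>
#include <vector>

// Hypothetical serial stand-in for a node. The real ThrustGPUNode dispatches
// to thrust::for_each on the device; this loop only shows the calling shape.
struct SerialForSketch {
  template <class WDP>
  static void parallel_for(int begin, int end, WDP wdp) {
    assert(begin <= end);
    for (int i = begin; i < end; ++i) {
      wdp.execute(i);  // assumed member name of the work-data pair
    }
  }
};

// Example work-data pair: carries its data (alpha, x) and the per-index work.
struct ScaleOp {
  double alpha;
  double* x;
  void execute(int i) { x[i] *= alpha; }
};

// Scales a copy of v by alpha using the skeleton above.
inline std::vector<double> scaledCopy(std::vector<double> v, double alpha) {
  ScaleOp op{alpha, v.data()};
  SerialForSketch::parallel_for(0, static_cast<int>(v.size()), op);
  return v;
}
```

The work-data pair is passed by value, which matches the WDP wdp parameter in the signature; the pair therefore holds a pointer to the data rather than the data itself.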
WDP::ReductionType Kokkos::ThrustGPUNode::parallel_reduce (int begin, int end, WDP wd) [static]
Parallel reduction skeleton, a wrapper around thrust::transform_reduce. See Kokkos Node API.
Definition at line 120 of file Kokkos_ThrustGPUNode.cuh.
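The reduction skeleton can be sketched in the same host-only fashion. Only the nested ReductionType is confirmed by the signature on this page; the identity(), generate(int) and reduce(a, b) members used here are assumptions about the Kokkos Node API, and SerialReduceSketch, DotOp and dot are hypothetical names. The loop is serial and stands in for the thrust::transform_reduce call made by the real node.

```cpp
#include <cassert>
#include <vector>

// Hypothetical serial stand-in for the reduction skeleton.
struct SerialReduceSketch {
  template <class WDP>
  static typename WDP::ReductionType parallel_reduce(int begin, int end, WDP wd) {
    assert(begin <= end);
    typename WDP::ReductionType result = WDP::identity();  // assumed member
    for (int i = begin; i < end; ++i) {
      // generate(i) produces the i-th contribution, reduce() combines two.
      result = wd.reduce(result, wd.generate(i));
    }
    return result;
  }
};

// Example work-data pair: dot product of x and y.
struct DotOp {
  typedef double ReductionType;
  const double* x;
  const double* y;
  static ReductionType identity() { return 0.0; }
  ReductionType generate(int i) { return x[i] * y[i]; }
  ReductionType reduce(ReductionType a, ReductionType b) { return a + b; }
};

inline double dot(const std::vector<double>& x, const std::vector<double>& y) {
  assert(x.size() == y.size());
  DotOp op{x.data(), y.data()};
  return SerialReduceSketch::parallel_reduce(0, static_cast<int>(x.size()), op);
}
```

Splitting the work into a transform step and a combine step mirrors the two halves of thrust::transform_reduce, which is why the skeleton wraps that algorithm.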
void Kokkos::ThrustGPUNode::sync () const
Block until all node work is complete. Aids in accurate timing of multiple kernels.
Definition at line 92 of file Kokkos_ThrustGPUNode.cpp.
ArrayRCP< T > Kokkos::CUDANodeMemoryModel::allocBuffer (size_t size) [inline, inherited]
Allocate a parallel buffer, returning it as a pointer encapsulated in an ArrayRCP.
Dereferencing the returned ArrayRCP or its underlying pointer in general results in undefined behavior outside of parallel computations.
The buffer will be automatically freed by the Node when no more references remain.
| T | The data type of the allocated buffer. This is used to perform alignment and determine the number of bytes to allocate. |
| [in] | size | The size requested for the parallel buffer, greater than zero. |
The returned buffer holds size entries of type T.
Definition at line 66 of file Kokkos_CUDANodeMemoryModelImpl.hpp.
void Kokkos::CUDANodeMemoryModel::copyFromBuffer (size_t size, const ArrayRCP< const T > &buffSrc, const ArrayView< T > &hostDest) [inline, inherited]
Copy data to host memory from a parallel buffer.
| [in] | size | The number of entries to copy from buffSrc to hostDest. |
| [in] | buffSrc | The parallel buffer from which to copy. |
| [out] | hostDest | The location in host memory where the data from buffSrc is copied to. |
Preconditions: size is non-negative; buffSrc has length at least size; hostDest has length equal to size.
Postcondition: the entries of buffSrc in the range [0, size) have been copied to the entries of hostDest in the range [0, size).
Definition at line 89 of file Kokkos_CUDANodeMemoryModelImpl.hpp.
void Kokkos::CUDANodeMemoryModel::copyToBuffer (size_t size, const ArrayView< const T > &hostSrc, const ArrayRCP< T > &buffDest) [inline, inherited]
Copy data from host memory to a parallel buffer.
| [in] | size | The number of entries to copy from hostSrc to buffDest. |
| [in] | hostSrc | The location in host memory from where the data is copied. |
| [out] | buffDest | The parallel buffer to which the data is copied. |
Preconditions: size is non-negative; hostSrc has length equal to size; buffDest has length at least size.
Postconditions: the entries of hostSrc in the range [0, size) are allowed to be written to. The data is guaranteed to be present in buffDest before it is used in a parallel computation.
Definition at line 117 of file Kokkos_CUDANodeMemoryModelImpl.hpp.
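The length requirements of copyToBuffer and copyFromBuffer can be made concrete with a host-only mock. Everything in it is a stand-in: BufferSketch replaces the ArrayRCP that refers to device memory, std::vector replaces ArrayView, and the function names ending in Sketch are hypothetical. It assumes the destination buffer of copyToBuffer is the one that must have length at least size.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical host-only stand-in for a parallel buffer.
template <class T>
using BufferSketch = std::shared_ptr<std::vector<T>>;

template <class T>
BufferSketch<T> allocBufferSketch(std::size_t size) {
  assert(size > 0);  // documented: requested size is greater than zero
  return std::make_shared<std::vector<T>>(size);
}

// Host to buffer: hostSrc has exactly size entries, buffDest at least size.
template <class T>
void copyToBufferSketch(std::size_t size, const std::vector<T>& hostSrc,
                        const BufferSketch<T>& buffDest) {
  assert(hostSrc.size() == size);
  assert(buffDest->size() >= size);
  std::copy(hostSrc.begin(), hostSrc.end(), buffDest->begin());
}

// Buffer to host: buffSrc has at least size entries, hostDest exactly size.
template <class T>
void copyFromBufferSketch(std::size_t size, const BufferSketch<T>& buffSrc,
                          std::vector<T>& hostDest) {
  assert(buffSrc->size() >= size);
  assert(hostDest.size() == size);
  std::copy(buffSrc->begin(),
            buffSrc->begin() + static_cast<std::ptrdiff_t>(size),
            hostDest.begin());
}

// Round trip through a buffer that is deliberately larger than the data.
inline std::vector<int> roundTrip(const std::vector<int>& host) {
  BufferSketch<int> buf = allocBufferSketch<int>(host.size() + 2);
  copyToBufferSketch<int>(host.size(), host, buf);
  std::vector<int> out(host.size());
  copyFromBufferSketch<int>(host.size(), buf, out);
  return out;
}
```

The buffer in roundTrip is two entries longer than the host data to show that only the first size entries take part in either copy.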
void Kokkos::CUDANodeMemoryModel::copyBuffers (size_t size, const ArrayRCP< const T > &buffSrc, const ArrayRCP< T > &buffDest) [inline, inherited]
Copy data between buffers.
| [in] | size | The size of the copy, greater than zero. |
| [in] | buffSrc | The source buffer, with length at least as large as size. |
| [in,out] | buffDest | The destination buffer, with length at least as large as size. |
Definition at line 145 of file Kokkos_CUDANodeMemoryModelImpl.hpp.
ArrayRCP< const T > Kokkos::CUDANodeMemoryModel::viewBuffer (size_t size, ArrayRCP< const T > buff) [inline, inherited]
Return a const view of a buffer for use on the host. This creates a const view of length size, constituting the first size entries of buff, as they exist at the time of view creation. Host memory allocated for the creation of this view is automatically deleted when no references to the view remain.
Precondition: buff.size() >= size.
Definition at line 177 of file Kokkos_CUDANodeMemoryModelImpl.hpp.
ArrayRCP< T > Kokkos::CUDANodeMemoryModel::viewBufferNonConst (ReadWriteOption rw, size_t size, const ArrayRCP< T > &buff) [inline, inherited]
Return a non-const view of a buffer for use on the host.
| [in] | rw | Specifies Kokkos::ReadWrite or Kokkos::WriteOnly. If Kokkos::WriteOnly, the contents of the view are undefined when it is created and must be initialized on the host. In exchange, this avoids the device-to-host copy that may be needed to fill the view when Kokkos::ReadWrite is specified. |
This creates a view of length size, constituting the first size entries of buff, as they exist at the time of view creation.
A non-const view permits changes, which must be copied back to the buffer. This does not occur until all references to the view are deleted. If the buffer is deallocated before the view is deleted, then the copy-back does not occur.
Precondition: buff.size() >= size.
Definition at line 192 of file Kokkos_CUDANodeMemoryModelImpl.hpp.
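The copy-back rule for non-const views can be sketched with standard smart pointers. This is only a host-side model under stated assumptions, not the CUDANodeMemoryModel implementation: IntBuffer stands in for a device buffer, ReadWriteSketch for Kokkos::ReadWriteOption, and viewNonConstSketch is a hypothetical name. The view is a host copy whose deleter writes the data back when the last reference goes away, and skips the write if the buffer no longer exists.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Stand-ins for Kokkos::ReadWrite and Kokkos::WriteOnly.
enum ReadWriteSketch { ReadWriteS, WriteOnlyS };

// Hypothetical "device" buffer, modeled as shared host storage.
typedef std::shared_ptr<std::vector<int>> IntBuffer;

inline std::shared_ptr<std::vector<int>> viewNonConstSketch(
    ReadWriteSketch rw, std::size_t size, const IntBuffer& buff) {
  assert(buff && buff->size() >= size);  // precondition: buff.size() >= size
  // WriteOnly skips the initial copy. The zeros here are placeholders; the
  // documented contents of a WriteOnly view are undefined.
  std::vector<int>* host = new std::vector<int>(size);
  if (rw == ReadWriteS) {
    host->assign(buff->begin(),
                 buff->begin() + static_cast<std::ptrdiff_t>(size));
  }
  std::weak_ptr<std::vector<int>> weakBuff = buff;
  // The deleter runs when the last reference to the view is released.
  return std::shared_ptr<std::vector<int>>(
      host, [weakBuff, size](std::vector<int>* h) {
        if (IntBuffer b = weakBuff.lock()) {  // buffer still alive: copy back
          std::size_t n = std::min(size, h->size());
          std::copy(h->begin(), h->begin() + static_cast<std::ptrdiff_t>(n),
                    b->begin());
        }
        delete h;  // buffer already freed: no copy-back, just clean up
      });
}
```

Holding the buffer through a weak reference is what lets the sketch model the last documented rule: a view that outlives its buffer is released without any copy-back.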
void Kokkos::CUDANodeMemoryModel::printStatistics (const RCP< Teuchos::FancyOStream > &os) const [inherited]
Print some statistics regarding node allocation and memory transfer.
Definition at line 63 of file Kokkos_CUDANodeMemoryModel.cpp.
void Kokkos::CUDANodeMemoryModel::clearStatistics () [inherited]
Clear all statistics on memory transfer.
Definition at line 54 of file Kokkos_CUDANodeMemoryModel.cpp.
const bool Kokkos::CUDANodeMemoryModel::isHostNode = false [static, inherited]
Indicates that parallel buffers allocated by this node are not available for use on the host thread.
Definition at line 73 of file Kokkos_CUDANodeMemoryModel.hpp.