The Device abstraction represents a hardware device with memory and computing units. All Tensor operations are scheduled for execution by the device on which the tensor resides. Tensor memory is also managed by the device's memory manager. Memory and execution optimizations are therefore implemented in the Device class.
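The division of responsibility described above can be illustrated with a toy model. This is only a sketch and not SINGA's implementation: `ToyDevice`, `ToyTensor`, `malloc`, and `submit` are hypothetical names invented for illustration, and the 8-bytes-per-element figure is an arbitrary assumption.

```python
class ToyDevice:
    """Hypothetical stand-in for a device: owns memory and runs operations."""

    def __init__(self, name):
        self.name = name
        self.allocated = 0   # bytes currently handed out by the memory manager
        self.executed = []   # names of the operations run so far, in order

    def malloc(self, nbytes):
        # All tensor memory is requested from the device.
        self.allocated += nbytes
        return bytearray(nbytes)

    def submit(self, op_name, fn, *args):
        # A real device could defer or reorder work here; this sketch runs it at once.
        self.executed.append(op_name)
        return fn(*args)


class ToyTensor:
    """Hypothetical tensor that delegates memory and execution to its device."""

    def __init__(self, values, device):
        self.device = device
        self.values = list(values)
        self.block = device.malloc(8 * len(self.values))  # assume 8 bytes per element

    def add(self, other):
        assert self.device is other.device, "operands must live on the same device"
        out = self.device.submit(
            "add",
            lambda a, b: [x + y for x, y in zip(a, b)],
            self.values,
            other.values,
        )
        return ToyTensor(out, self.device)


dev = ToyDevice("toy-cpu")
a = ToyTensor([1.0, 2.0], dev)
b = ToyTensor([3.0, 4.0], dev)
c = a.add(b)
```

Because every allocation and every operation passes through the device object, that object is the natural single place to add memory or scheduling optimizations.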
Currently, SINGA has three Device implementations:
- CudaGPU for an NVIDIA GPU card, which runs CUDA code
- CppCPU for a CPU, which runs C++ code
- OpenclGPU for a GPU card, which runs OpenCL code
The following code provides examples of creating devices:
    from singa import device

    cuda = device.create_cuda_gpu_on(0)  # use the GPU card with ID 0
    host = device.get_default_device()  # get the default host device (a CppCPU)
    ary1 = device.create_cuda_gpus(2)  # create 2 devices, starting from ID 0
    ary2 = device.create_cuda_gpus_on([0, 2])  # create 2 devices, on IDs 0 and 2
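When the same script has to run on machines with and without a GPU, one possible pattern is to try the GPU first and fall back to the host device. The helper below is a minimal sketch: `pick_device` is a hypothetical name, and it assumes that `create_cuda_gpu_on` raises an exception when no usable CUDA device is available, which may not match the behaviour of every SINGA build.

```python
def pick_device(device_module, gpu_id=0):
    """Return a CUDA device if one can be created, else the default host device.

    `device_module` is expected to be the `singa.device` module (or any object
    exposing the same two functions); passing it in keeps this sketch importable
    without SINGA installed.
    """
    try:
        return device_module.create_cuda_gpu_on(gpu_id)
    except Exception:
        # Assumption: failure to create the GPU device surfaces as an exception.
        return device_module.get_default_device()
```

With SINGA installed, this would be called as `dev = pick_device(device)` after `from singa import device`.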