TensorFlow™ is an open source software library for numerical computation using data flow graphs. Nodes in the graph represent mathematical operations, while the graph edges represent the multidimensional data arrays (tensors) communicated between them. The flexible architecture allows you to deploy computation to one or more CPUs or GPUs in a desktop, server, or mobile device with a single API. TensorFlow was originally developed by researchers and engineers working on the Google Brain Team within Google's Machine Intelligence research organization for the purposes of conducting machine learning and deep neural networks research, but the system is general enough to be applicable in a wide variety of other domains as well.
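The data flow graph idea can be illustrated with a minimal sketch in plain Python. This is not TensorFlow's API: the `Node` class and the `constant`, `matmul`, and `add` helpers are hypothetical names introduced here only for illustration, and tensors are represented as nested lists. Each node is an operation, and each entry in `inputs` is an edge that carries a tensor from an upstream node.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Node:
    """One operation in the graph; `inputs` are the incoming edges."""
    name: str
    op: Callable[..., list]
    inputs: List["Node"] = field(default_factory=list)

    def evaluate(self, cache=None):
        # Evaluate upstream nodes first; memoise so a shared node runs once.
        if cache is None:
            cache = {}
        if self.name not in cache:
            args = [node.evaluate(cache) for node in self.inputs]
            cache[self.name] = self.op(*args)
        return cache[self.name]


def constant(name, value):
    # A leaf node with no inputs that simply emits a fixed tensor.
    return Node(name, lambda: value)


def matmul(a, b):
    # Matrix product on nested lists (rows of a times columns of b).
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def add(a, b):
    # Element-wise sum of two matrices with the same shape.
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


# Build the graph y = x @ w + b, then run it.
x = constant("x", [[1.0, 2.0]])      # shape 1x2
w = constant("w", [[3.0], [4.0]])    # shape 2x1
b = constant("b", [[0.5]])           # shape 1x1
xw = Node("xw", matmul, [x, w])
y = Node("y", add, [xw, b])

result = y.evaluate()
print(result)  # [[11.5]]
```

Because the graph is built before it is evaluated, a runtime is free to decide where each node executes, which is the property the description above attributes to the library's single API across devices.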
scikit-cuda provides Python interfaces to many of the functions in the CUDA device/runtime, CUBLAS, CUFFT, and CUSOLVER libraries distributed as part of NVIDIA's CUDA Programming Toolkit, as well as interfaces to select functions in the free and standard versions of the CULA Dense Toolkit. It provides both low-level wrapper functions similar to their C counterparts and high-level functions comparable to those in NumPy and SciPy.
CNTK (Computational Network Toolkit), by Microsoft Research, is a unified deep-learning toolkit that describes neural networks as a series of computational steps via a directed graph. In this directed graph, leaf nodes represent input values or network parameters, while other nodes represent matrix operations upon their inputs. CNTK makes it easy to build and combine popular model types such as feed-forward DNNs, convolutional nets (CNNs), and recurrent networks (RNNs/LSTMs). It implements stochastic gradient descent (SGD, error backpropagation) learning with automatic differentiation and parallelization across multiple GPUs and servers.
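As a rough illustration of SGD with automatic differentiation over a directed graph, here is a small scalar sketch in plain Python. It does not use CNTK and is not its API: the `Value` class is a hypothetical helper written for this example, and it works on scalars rather than matrices. Leaf nodes hold inputs or parameters, other nodes record the operation that produced them, and `backward` applies the chain rule in reverse topological order.

```python
class Value:
    """A scalar graph node that remembers its parents and its local derivative."""

    def __init__(self, data, parents=()):
        self.data = data
        self.grad = 0.0
        self.parents = parents
        self.backward_fn = None  # set by the operation that creates this node

    def __add__(self, other):
        out = Value(self.data + other.data, (self, other))

        def backward_fn():
            self.grad += out.grad
            other.grad += out.grad

        out.backward_fn = backward_fn
        return out

    def __mul__(self, other):
        out = Value(self.data * other.data, (self, other))

        def backward_fn():
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad

        out.backward_fn = backward_fn
        return out

    def backward(self):
        # Order nodes so every node comes after its parents, then walk in reverse.
        order, seen = [], set()

        def visit(node):
            if id(node) not in seen:
                seen.add(id(node))
                for parent in node.parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            if node.backward_fn is not None:
                node.backward_fn()


# Parameters are leaf nodes; the toy data follows y = 2x + 1 exactly.
w, b = Value(0.0), Value(0.0)
data = [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
learning_rate = 0.05

for epoch in range(500):
    for x_val, y_val in data:
        w.grad = b.grad = 0.0
        err = w * Value(x_val) + b + Value(-y_val)  # prediction minus target
        loss = err * err                            # squared error
        loss.backward()
        # One SGD step per sample, using the gradients from backpropagation.
        w.data -= learning_rate * w.grad
        b.data -= learning_rate * b.grad

print(round(w.data, 4), round(b.data, 4))
```

The learning rate and epoch count are arbitrary choices for this toy problem; a real toolkit would also batch samples and distribute the work across devices.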
Amazon DSSTNE: Deep Scalable Sparse Tensor Network Engine
DSSTNE (pronounced "Destiny") is a library for training and deploying deep neural networks using GPUs. It is built to solve deep learning problems at Amazon's scale and for production deployment of real-world deep learning applications, emphasizing speed and scale over experimental flexibility.
Multi-GPU Scale: Training and prediction both scale out to use multiple GPUs, spreading out computation and storage in a model-parallel fashion for each layer.
Large Layers: Model-parallel scaling enables larger networks than are possible with a single GPU.
Sparse Data: DSSTNE is optimized for fast performance on sparse datasets. Custom GPU kernels perform sparse computation on the GPU, without filling in lots of zeroes.
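To make the sparse data point concrete, the following is a CPU-only sketch in plain Python of a matrix-vector product that touches only the stored non-zero entries. It is not DSSTNE code and says nothing about how its GPU kernels are written; the compressed sparse row (CSR) layout and the function names `dense_to_csr` and `csr_matvec` are assumptions chosen for illustration.

```python
def dense_to_csr(rows):
    """Convert a dense row-major matrix to CSR arrays (values, col_idx, row_ptr)."""
    values, col_idx, row_ptr = [], [], [0]
    for row in rows:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)
                col_idx.append(j)
        # row_ptr[i + 1] marks where row i ends in `values`.
        row_ptr.append(len(values))
    return values, col_idx, row_ptr


def csr_matvec(values, col_idx, row_ptr, x):
    """Multiply a CSR matrix by a dense vector, skipping all zero entries."""
    out = []
    for i in range(len(row_ptr) - 1):
        start, end = row_ptr[i], row_ptr[i + 1]
        out.append(sum(values[k] * x[col_idx[k]] for k in range(start, end)))
    return out


dense = [
    [0, 0, 3, 0],
    [0, 0, 0, 0],
    [1, 0, 0, 2],
]
values, col_idx, row_ptr = dense_to_csr(dense)
x = [1, 2, 3, 4]
y = csr_matvec(values, col_idx, row_ptr, x)
print(y)  # [9, 0, 9]
```

Only three multiplications are performed here instead of twelve, so the work scales with the number of non-zeros rather than with the full matrix size.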