All posts by dani


GPU Accelerated Deep Learning also Wanted for .NET

We took the chance to do a second Channel 9 recording on our GPU accelerated Machine Learning project in the Microsoft offices at Times Square, New York City. It was a great experience to do the recording with Seth Juarez. Many thanks, Seth!

Several deep learning libraries already exist, but none of them targets .NET. Alea TK is a new open source project that aims to develop a complete GPU accelerated deep learning stack for .NET. It is built on top of Alea GPU and is designed from the ground up to be GPU accelerated, easy to extend and easy to deploy. The project is still in an early phase; contributors and developers are welcome! The recording explains why we started the project and what we plan to do with it in the future.

Check out Alea TK on our project web site and on GitHub.


Radically Simplified GPU Programming with C#

We were very happy to do a Channel 9 recording for our new Alea GPU version 3 in the Microsoft offices at Times Square, New York City. It was a great experience to do the recording with Seth Juarez. Many thanks, Seth!

GPU computing is all about number crunching and performance. Do you have a lot of parallel calculations? Then try running them on the GPU with C#. With the new Alea GPU parallel GPU methods it is as easy as changing a few lines of code to utilize the power of GPUs. No GPU in your box? Don’t worry, you can get one from Azure or other cloud providers. In the recording I explain how easy it is to run C# code on the GPU, with full debugging support in Visual Studio.
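The "change a few lines" claim refers to the parallel-for pattern, which looks roughly like this (a sketch based on Alea GPU v3's `Alea.Parallel` extensions; check the product documentation for the exact signatures):

```csharp
using Alea;            // Alea GPU core
using Alea.Parallel;   // GPU parallel-for extensions

class SaxpyExample
{
    // Computes z = a*x + y elementwise, entirely on the GPU.
    static void Saxpy(float a, float[] x, float[] y, float[] z)
    {
        // Gpu.Default selects the default CUDA device; the lambda
        // body is compiled to a GPU kernel by Alea GPU.
        Gpu.Default.For(0, z.Length, i => z[i] = a * x[i] + y[i]);
    }

    static void Main()
    {
        var x = new float[] { 1, 2, 3, 4 };
        var y = new float[] { 10, 20, 30, 40 };
        var z = new float[4];
        Saxpy(2, x, y, z);  // z = 2*x + y
    }
}
```

The same loop body would run sequentially under `for (int i = 0; ...)`; moving it to the GPU is indeed only a change to the loop construct.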

Check out Alea GPU on our product web site.


A new Deep Learning Stack for .NET


I gave a talk at GTC Europe 2016 in Amsterdam about our new open source project Alea TK.

Alea TK is a library for general purpose numerical computing and Deep Learning based on tensors and tensor expressions, supporting imperative calculations as well as symbolic calculations with auto-differentiation. It is designed from the ground up with CUDA acceleration in mind. It is easy to extend, install and deploy, and is perfectly suited for rapid prototyping and new model development. Alea TK is built entirely in .NET and C#. It relies on the Alea GPU compiler and uses NVIDIA’s cuDNN library to accelerate many standard deep neural network primitives.

Alea TK is still a young project. In the talk I explained the main design principles and presented the framework, with a particular focus on its GPU kernel fusion technology.
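The idea behind kernel fusion can be illustrated without any library code: evaluating a tensor expression operator by operator launches one kernel per operation and streams temporaries through memory, while a fused kernel evaluates the whole expression in a single pass (a plain C# illustration of the concept, not Alea TK's actual API):

```csharp
// Conceptual illustration of kernel fusion for the tensor
// expression r = x * y + x (plain loops stand in for kernels).
static void FusionDemo(float[] x, float[] y)
{
    int n = x.Length;

    // Unfused: two "kernels" with a temporary array in between,
    // costing an extra full read and write of memory.
    var t = new float[n];
    for (int i = 0; i < n; i++) t[i] = x[i] * y[i];   // kernel 1
    var r1 = new float[n];
    for (int i = 0; i < n; i++) r1[i] = t[i] + x[i];  // kernel 2

    // Fused: one pass, no temporary. This is the form a fusing
    // compiler generates for the whole expression.
    var r2 = new float[n];
    for (int i = 0; i < n; i++) r2[i] = x[i] * y[i] + x[i];
}
```

On a GPU, where elementwise kernels are memory-bound, eliminating the temporary's traffic is exactly what makes fusion pay off.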

Check out the slides and our poster.


Deficiencies of .NET CLR JIT Compilers

Another Reason to Use a GPU!

I recently gave a talk at an F# meetup hosted by Jet.com about deficiencies of .NET CLR JIT compilers.

We know that C# or F# code often does not perform at the level of native C++ because the CLR JIT compiler does not optimize the code well enough. In the worst cases we lose a factor of 2 to 4 against native code. To investigate this problem in more depth, you can check how the .NET CLR JIT compilers compile simple loops and nested loops. It is not enough to just look at the MSIL code; we have to dig into the optimized assembly code generated by the CLR JIT compilers. We find that the CLR JIT compilers are not capable of removing array bound checks or optimizing array access patterns of the form a[i*lda + j] in simple nested loops. This is very bad news for performance critical code in .NET.
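The problematic access pattern, and the manual workaround of hoisting the row offset out of the inner loop, can be sketched with a row-major matrix-vector product (illustrative code, not the talk's exact benchmark):

```csharp
// Row-major matrix-vector product over a flattened array a
// with leading dimension lda: y[i] = sum_j a[i*lda + j] * x[j].

// Naive version: the JIT recomputes i*lda + j and keeps a bounds
// check inside the inner loop.
static void MatVecNaive(float[] a, int lda, float[] x, float[] y)
{
    for (int i = 0; i < y.Length; i++)
    {
        float sum = 0;
        for (int j = 0; j < x.Length; j++)
            sum += a[i * lda + j] * x[j];  // a[i*lda + j] every iteration
        y[i] = sum;
    }
}

// Manual workaround: hoist the row offset so the inner loop only
// does an addition, which a native compiler would do automatically.
static void MatVecHoisted(float[] a, int lda, float[] x, float[] y)
{
    for (int i = 0; i < y.Length; i++)
    {
        int row = i * lda;                 // computed once per row
        float sum = 0;
        for (int j = 0; j < x.Length; j++)
            sum += a[row + j] * x[j];
        y[i] = sum;
    }
}
```

Even with the hoisting, the bounds check on `a[row + j]` typically remains, because the JIT cannot prove `row + j` stays within the array.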

Fortunately, you can get around these problems by moving performance critical code to the GPU. The Floyd-Warshall all-pairs shortest path algorithm serves as an example: an advanced GPU implementation fully written in C# and compiled with Alea GPU gives a significant speedup. It runs at the same speed as a native GPU version coded in C++ and 125 times faster than the initial C# version!
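For reference, the sequential C# baseline that such a GPU version is measured against is the classic triple loop (a standard textbook formulation, not the talk's exact code):

```csharp
// Floyd-Warshall all-pairs shortest paths on an n x n distance
// matrix stored row-major in a flat array (dist[i*n + j]).
// Use float.PositiveInfinity for missing edges.
static void FloydWarshall(float[] dist, int n)
{
    for (int k = 0; k < n; k++)
        for (int i = 0; i < n; i++)
        {
            float dik = dist[i * n + k];
            for (int j = 0; j < n; j++)
            {
                float viaK = dik + dist[k * n + j];
                if (viaK < dist[i * n + j])
                    dist[i * n + j] = viaK;  // relax path i -> k -> j
            }
        }
}
```

Note the `dist[i * n + j]` indexing in the inner loop: it is precisely the access pattern the CLR JIT handles poorly, which is why the naive C# version leaves so much performance on the table.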

Developing such efficient algorithms is far from straightforward and requires some experience. We therefore take a step back and show that simpler problems can often be solved efficiently with parallel-for and parallel aggregate patterns running on the GPU, with a dramatic performance increase of a factor of 50 to 100.
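Combining the two patterns, a dot product becomes a parallel-for (elementwise multiply) followed by a parallel aggregate (sum reduction). This is a sketch against the Alea GPU v3 `Alea.Parallel` extensions; verify the exact method names against the documentation:

```csharp
using Alea;
using Alea.Parallel;

class DotProductExample
{
    // Dot product on the GPU: elementwise multiply with a
    // parallel-for, then a sum reduction with a parallel aggregate.
    static float Dot(float[] x, float[] y)
    {
        var gpu = Gpu.Default;
        var products = new float[x.Length];
        gpu.For(0, x.Length, i => products[i] = x[i] * y[i]);
        return gpu.Aggregate(products, (a, b) => a + b);
    }
}
```

No kernel code, no thread indexing, no explicit memory transfers; this is the level at which most users should start before hand-tuning anything.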

Here are the slides.