How to use Alea.cuBase in Python


Python is often used for scripting and rapid prototyping. In this post we illustrate how we can integrate Alea.cuBase and Python so that we can call GPU algorithms coded with Alea.cuBase conveniently in Python.

In this post we rely on Python for .NET, which provides a nearly seamless integration of Python with the .NET Common Language Runtime (CLR). Note that it neither implements Python as a first-class CLR language nor translates Python code to managed IL code. Rather, it integrates the CPython engine with the .NET runtime.

An alternative approach would be to use IronPython, an implementation of the Python programming language targeting the .NET Framework and written entirely in C#. However, because IronPython has limitations when it comes to using very useful Python libraries such as matplotlib, we prefer to work with CPython and Python for .NET.

Setting up the Environment

We suggest that you install the Python Tools for Visual Studio, which turn Visual Studio into a nice Python IDE supporting both CPython and IronPython.


If you are going to use IronPython, all that is needed is to install IronPython.

Python for .NET

Python for .NET consists of two components:

  1. clr.pyd, a Python module interfacing with the .NET world
  2. Python.Runtime.dll, an assembly used by clr.pyd
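
Both files have to sit where the Python interpreter can find them. A minimal sketch of a check that a given directory contains the two components listed above; the function name `missing_components` is purely illustrative and not part of Python for .NET:

```python
import os

# File names taken from the component list above.
PYTHONNET_COMPONENTS = ("clr.pyd", "Python.Runtime.dll")


def missing_components(directory):
    """Return the Python for .NET files that are not present in `directory`."""
    return [name for name in PYTHONNET_COMPONENTS
            if not os.path.isfile(os.path.join(directory, name))]
```

An empty result means both files were found; otherwise the list names what still has to be built or copied.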

We need to compile Python for .NET against the .NET 4.0 framework and the proper Python version. Currently, Python for .NET supports Python versions 2.3 to 2.7. Check out the source of Python for .NET from its project repository.

It contains one solution file for VS 2008. Open it with VS 2010; the conversion succeeds without errors. To compile Python for .NET for Python 2.7 and .NET 4.0, the following steps are required:

Right-click on project “Python.Runtime” and select “Properties”, select “Application” tab and change the “Target framework” to “.NET Framework 4”. Then open the file pythonnet\pythonnet\src\runtime\buildclrmodule.bat and change the following command:


Note that this command appears twice. Next, open the file and change the lines with the version number in the following piece of code:



To change the Python interpreter version, right-click on project “Python.Runtime” and select “Properties”. In the “Build” tab, under “Conditional compilation symbols”, change “PYTHON26” to “PYTHON27” to select the Python 2.7 interpreter.
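
To avoid picking the wrong symbol, it can be derived from the interpreter that will load the module. The sketch below assumes the symbol follows the pattern suggested by the two names above (PYTHON26, PYTHON27); whether a symbol exists for any other version is not confirmed here, and `compilation_symbol` is an illustrative helper, not part of the build.

```python
import sys


def compilation_symbol(version_info=None):
    """Build a symbol such as 'PYTHON27' from a (major, minor, ...) tuple.

    Assumption: the naming pattern is inferred from PYTHON26 and PYTHON27.
    """
    if version_info is None:
        version_info = sys.version_info
    return "PYTHON%d%d" % (version_info[0], version_info[1])
```

Calling `compilation_symbol()` without arguments inside the target interpreter gives the symbol for that interpreter.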

The last step is to patch methodbinder.cs. Replace the method MatchParameters with the following code:

[sourcecode language="csharp"]
// Collects into gts the concrete types that the argument type `it` supplies
// for the generic parameters occurring in the parameter type `pt`.
private static bool _RetrieveGenericArguments(List<Type> gts, Type pt, Type it)
{
    bool ok = true;
    if (pt.GUID == new Guid())
    {
        // Assumption: the body of this branch is missing from the listing.
        // Binding the parameter to `it` is what the gts.Count check in
        // MatchParameters requires.
        gts.Add(it);
    }
    else if (pt.IsGenericType && it.IsGenericType && it.GetGenericTypeDefinition().GUID == pt.GUID)
    {
        // Same generic type definition: recurse into the type arguments.
        var pts = pt.GetGenericArguments();
        var its = it.GetGenericArguments();
        for (int i = 0; i < pts.Length; ++i)
            ok &= _RetrieveGenericArguments(gts, pts[i], its[i]);
    }
    else if (!pt.IsGenericType && !it.IsGenericType && pt.GUID == it.GUID)
    {
        // nothing
    }
    else
    {
        ok = false;
    }
    return ok;
}

internal static MethodInfo MatchParameters(MethodInfo[] mis, Type[] its)
{
    foreach (var mi in mis)
    {
        if (!mi.IsGenericMethodDefinition) continue;

        var pts = (from p in mi.GetParameters() select p.ParameterType).ToArray();
        if (pts.Length != its.Length) continue;

        var n = pts.Length;
        var gts = new List<Type>();
        bool ok = true;
        for (int i = 0; i < n; ++i)
            ok &= _RetrieveGenericArguments(gts, pts[i], its[i]);
        if (!ok) continue;
        if (gts.Count != mi.GetGenericArguments().Length) continue;
        return mi.MakeGenericMethod(gts.ToArray());
    }

    return null;
}
[/sourcecode]

Now recompile the project “Python.Runtime”.
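
The idea behind the patched MatchParameters is structural matching: walk the declared parameter type and the actual argument type in parallel and record what each generic parameter is bound to. The following is a toy model in plain Python, not the Python for .NET implementation. Types are modeled as a `GenericParam` placeholder, a string for a plain type, or a tuple `(generic_name, arg, ...)` for a constructed generic type; all names in it are hypothetical.

```python
class GenericParam(object):
    """Stands in for an open generic parameter such as T."""
    def __init__(self, name):
        self.name = name


def retrieve_generic_arguments(found, param_type, arg_type):
    """Append to `found` the types bound to generic parameters; return success."""
    if isinstance(param_type, GenericParam):
        found.append(arg_type)
        return True
    if isinstance(param_type, tuple) and isinstance(arg_type, tuple):
        # Constructed generic types must share name and number of arguments.
        if param_type[0] != arg_type[0] or len(param_type) != len(arg_type):
            return False
        ok = True
        for p, a in zip(param_type[1:], arg_type[1:]):
            ok &= retrieve_generic_arguments(found, p, a)
        return ok
    if isinstance(param_type, str) and isinstance(arg_type, str):
        return param_type == arg_type
    return False


def match_parameters(methods, arg_types):
    """methods: list of (name, number_of_generic_params, param_types)."""
    for name, n_generic, param_types in methods:
        if n_generic == 0 or len(param_types) != len(arg_types):
            continue
        found = []
        ok = True
        for p, a in zip(param_types, arg_types):
            ok &= retrieve_generic_arguments(found, p, a)
        if ok and len(found) == n_generic:
            return name, found
    return None


T = GenericParam("T")
methods = [
    ("Describe", 1, [("List", T), "Int32"]),   # Describe<T>(List<T>, Int32)
    ("Describe", 1, [("Box", ("List", T))]),   # Describe<T>(Box<List<T>>)
]
result = match_parameters(methods, [("Box", ("List", "Double"))])
```

Here `result` is `("Describe", ["Double"])`: the first overload is skipped because its parameter count differs, and the second binds T to Double. As in the C# code, a parameter that occurs twice is recorded twice, so such an overload fails the final count check.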

After a successful build you can test it with the following simple Python script:

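[sourcecode language="python"]
import sys

# Directory that contains clr.pyd and Python.Runtime.dll
sys.path.append(r"C:\dev\pythonnet\pythonnet\src\runtime\bin\Release")

import clr, System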

print System.Environment.Version  # member name is cut off in the listing; Version is an assumption

# You can also print out sys.path
print '-----'
for p in sys.path:
    print p
print '-----'
[/sourcecode]

Note that the path C:\dev\pythonnet\pythonnet\src\runtime\bin\Release has to point to the location of the module clr.pyd and the assembly Python.Runtime.dll.
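
When several scripts share the same setup, it helps to add the directory only once. A small sketch of such a helper; `ensure_on_path` is an illustrative name, and the optional `path` argument exists only so that the function can be tried without touching the real `sys.path`:

```python
import sys


def ensure_on_path(directory, path=None):
    """Append `directory` to the search path unless it is already listed.

    Returns True if the directory was added, False if it was already there.
    """
    if path is None:
        path = sys.path
    if directory in path:
        return False
    path.append(directory)
    return True
```

With the build location used in this post, the call would be `ensure_on_path(r"C:\dev\pythonnet\pythonnet\src\runtime\bin\Release")`, placed before `import clr`.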

Interfacing Python and .NET

To use a private assembly, call the clr.AddReference() function. For example, to use the assembly “Test.dll”, call clr.AddReference("Test") to load it.
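
Since clr.AddReference() takes the assembly name without directory or extension, a small wrapper can derive it from a file path. This is a sketch under the assumption that the assembly's directory is already on `sys.path`; both function names are illustrative, and the wrapper returns None when the `clr` module cannot be imported.

```python
import ntpath  # Windows path rules, regardless of the host platform


def assembly_reference_name(dll_path):
    """Strip directory and extension, e.g. 'C:\\libs\\Test.dll' -> 'Test'."""
    return ntpath.splitext(ntpath.basename(dll_path))[0]


def add_reference(dll_path):
    """Load the assembly via Python for .NET, if it is installed.

    Returns the name passed to clr.AddReference, or None if the clr
    module is not available in this interpreter.
    """
    try:
        import clr
    except ImportError:
        return None
    name = assembly_reference_name(dll_path)
    clr.AddReference(name)
    return name
```

Only the final extension is removed, so a dotted assembly name such as Alea.CUDA stays intact.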

We refer to the Python for .NET documentation for how to interoperate with .NET from Python.

Preparing a .NET Assembly with GPU Code

We create an F# library project referencing Alea.CUDA. Make sure that you set the “Copy Local” property of the Alea.CUDA assembly reference to true. The example below provides a simple kernel that adds two arrays, and a helper class DeviceWorkerHelper, which exposes the module load functions as ordinary methods to get around some limitations of Python for .NET with class extension methods.

[sourcecode language="fsharp"]
module Lib.Test

open Alea.CUDA

let a = [| 1.0; 2.0 |]

let pfunct = cuda {
    // Kernel: each thread adds one pair of elements
    let! kernel =
        <@ fun (C:DevicePtr<float>) (A:DevicePtr<float>) (B:DevicePtr<float>) ->
            let tid = threadIdx.x
            C.[tid] <- A.[tid] + B.[tid] @>
        |> defineKernelFunc

    // Host function: copy the inputs to the device, launch, copy the result back
    return PFunc(fun (m:Module) (A:float[]) (B:float[]) ->
        let n = A.Length
        use A = m.Worker.Malloc(A)
        use B = m.Worker.Malloc(B)
        use C = m.Worker.Malloc(n)
        let lp = LaunchParam(1, n)
        kernel.Launch m lp C.Ptr A.Ptr B.Ptr
        C.ToHost()) }

// Re-exposes the LoadPModule overloads as instance methods
type DeviceWorkerHelper(worker:DeviceWorker) =
    member this.LoadPModule(f:PFunc<'T>, m:Builder.PTXModule) = worker.LoadPModule(f, m)
    member this.LoadPModule(fm:PFunc<'T> * Builder.PTXModule) = worker.LoadPModule(fm)
    member this.LoadPModule(f:PFunc<'T>, m:Builder.IRModule) = worker.LoadPModule(f, m)
    member this.LoadPModule(fm:PFunc<'T> * Builder.IRModule) = worker.LoadPModule(fm)
    member this.LoadPModule(t:PTemplate<PFunc<'T>>) = worker.LoadPModule(t)
[/sourcecode]
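
For checking results later on, a CPU reference of the kernel is handy. The sketch below mimics the kernel in plain Python: one call of the kernel body per thread index, executed sequentially. It is only a model of the computation and involves neither Alea.cuBase nor a GPU; the function names are illustrative.

```python
def add_kernel(tid, C, A, B):
    """Body of the kernel for a single thread index."""
    C[tid] = A[tid] + B[tid]


def launch(kernel, block_size, *args):
    """Run the kernel body once per thread index, sequentially on the CPU."""
    for tid in range(block_size):
        kernel(tid, *args)


def add_arrays(A, B):
    """CPU counterpart of the host function: allocate, launch, return."""
    n = len(A)
    C = [0.0] * n
    launch(add_kernel, n, C, A, B)
    return C
```

For example, `add_arrays([1.0, 2.0], [1.5, 2.5])` returns `[2.5, 4.5]`.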

Calling a GPU Kernel from Python

The following Python script shows how to call the kernel from the Test assembly:

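[sourcecode language="python"]
import sys
import clr, System

# Before the imports below can succeed, the directory containing the assemblies
# must be on sys.path and the assemblies must be loaded with clr.AddReference().

from Alea.CUDA import Engine, Framework
from Lib import Test

worker = Engine.workers.DefaultWorker
print worker.Name
# Wrap the worker so that LoadPModule can be called as an ordinary method
worker = Test.DeviceWorkerHelper(worker)

A = System.Array[System.Double]([1.0, 2.0, 3.0, 4.0])
B = System.Array[System.Double]([1.5, 2.5, 3.5, 4.5])

def test(pfuncm):
    # Supply the two array arguments one at a time
    C = pfuncm.Invoke.Invoke(A).Invoke(B)
    for x in C: print x,
    print ""

print "Loading into worker"
pfuncm = worker.LoadPModule(Test.pfunct)

print "Invoking GPU kernel"
test(pfuncm)
[/sourcecode]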

Executing the script produces the following output:

Unfortunately this script cannot be executed in the Python Interactive inside Visual Studio, because the Python REPL process exits with a StackOverflowException at the import of Alea.CUDA.
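
The chained call `pfuncm.Invoke.Invoke(A).Invoke(B)` in the script may look odd. Presumably `pfuncm.Invoke` is a property holding a curried function object, and each following `.Invoke(x)` supplies one argument. The toy model below illustrates that calling pattern in plain Python; `CurriedFunc` and `LoadedModule` are hypothetical stand-ins, not Alea.cuBase or Python for .NET types.

```python
class CurriedFunc(object):
    """Toy stand-in for a curried function object exposing Invoke."""
    def __init__(self, func, arity, bound=()):
        self._func = func
        self._arity = arity
        self._bound = bound

    def Invoke(self, arg):
        bound = self._bound + (arg,)
        if len(bound) == self._arity:
            # All arguments supplied: run the underlying function.
            return self._func(*bound)
        # Otherwise return a new function object waiting for the rest.
        return CurriedFunc(self._func, self._arity, bound)


class LoadedModule(object):
    """Hypothetical stand-in for the object returned by LoadPModule."""
    def __init__(self, func, arity):
        self.Invoke = CurriedFunc(func, arity)


pfuncm = LoadedModule(lambda a, b: [x + y for x, y in zip(a, b)], 2)
partial = pfuncm.Invoke.Invoke([1.0, 2.0])   # still waiting for B
C = partial.Invoke([1.5, 2.5])               # all arguments supplied
```

In this model `C` is `[2.5, 4.5]`, and `partial` can be reused with a different second argument.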


We have shown how to use Alea.cuBase in Python with a suitable modification of Python for .NET. If you just want to do rapid prototyping together with some simple plotting and visualization, we suggest that you also take a look at F# Interactive and the FSharpChart library.