Qt, CUDA and Windows Development

So you want to develop a Qt application that takes advantage of CUDA acceleration AND you want to do it on Windows, you say? Well, young naive programmer, welcome to hell. Before I begin, I cannot stress enough how much of a ball ache it was to get this development environment up and running. However, if you are as stubborn as I am, then hopefully this should make things easier for you. You have been warned!

First, there are a few things you need to know about developing CUDA applications on Windows. The CUDA toolkit is only built for certain compilers. Here are your options:

– Visual C++ 12.0 – Visual Studio 2013

– Visual C++ 12.0 – Visual Studio Community 2013 (Only 64bit)

– Visual C++ 11.0 – Visual Studio 2012

– Visual C++ 10.0 – Visual Studio 2010 (Deprecated)

Download one of these. For me it was Visual Studio Community 2013, because it was free!

Note: CUDA has very limited support for 32bit applications, so I would recommend that you stick to 64bit.
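It is easy to pick a 32bit kit by accident, so one cheap safeguard is a compile-time check somewhere in your host code. This is only a minimal sketch in standard C++11; the helper name pointerWidthBits is made up for illustration and is not part of Qt or CUDA.

```cpp
#include <cstddef>

// Fails the build if a 32bit kit was selected by mistake:
// pointers are 8 bytes wide on a 64bit target.
static_assert(sizeof(void*) == 8,
              "This project expects a 64bit build; check the selected Qt kit.");

// Hypothetical helper (not from Qt or CUDA): pointer width in bits.
inline std::size_t pointerWidthBits() { return sizeof(void*) * 8; }
```

Drop this into any .cpp file that is always compiled, such as main.cpp, and a wrong kit shows up as a readable error instead of a pile of linker complaints.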

Now that you have your compiler downloaded, Qt should detect it and set up a kit for you, provided your paths are set up correctly. Make sure that the version of Qt you have installed was built with the compiler that you downloaded. For me this was Qt 5.5.1 MSVC2013 64bit.

Next, make sure you install the right version of the CUDA toolkit. Most likely the most recent version will be suitable. I had some issues because versions before CUDA 7.5 do not support Windows 10.

Now you should be ready to set up your Qt project. If you try to create a CUDA app at this point, however, you will run into a load of conflicts with Windows headers and libraries, so there are a number of definitions you will need to add to your build instructions to resolve these. For ease, here is my .pro file. It is set up to allow dynamic parallelism, which is why the CUDA sources are compiled with -dc, linked in a separate -dlink step and linked against cudadevrt.



# as I want to support 4.8 and 5 this will set a flag for some of the mac stuff
# mainly in the types.h file for the setMacVisual which is native in Qt5
isEqual(QT_MAJOR_VERSION, 5) {
    # flag name assumed here, match it to whatever your types.h checks for
    DEFINES += QT5BUILD
}

QT += gui opengl core
VPATH += ./src
SOURCES += \
    src/main.cpp \
    src/mainwindow.cpp \
    src/OpenGLWidget.cpp \
    #src/SPHEngine.cpp \
    src/GLTexture.cpp \
    src/GLTextureLib.cpp \
    src/FrameBuffer.cpp \
    src/RenderBuffer.cpp \
    src/RenderTargetLib.cpp \
    src/FluidShader.cpp \
    src/FluidPropDockWidget.cpp \
    src/Camera.cpp \
    src/Text.cpp \
    src/ShaderLib.cpp \
    src/Shader.cpp \
    src/ShaderProgram.cpp \
    src/ShaderUtils.cpp

HEADERS += \
    include/mainwindow.h \
    include/OpenGLWidget.h \
    #include/CudaSPHKernals.h \
    #include/SPHEngine.h \
    include/GLTexture.h \
    include/GLTextureLib.h \
    include/FrameBuffer.h \
    include/RenderBuffer.h \
    include/RenderTargetLib.h \
    include/AbstractOpenGLObject.h \
    include/FluidShader.h \
    include/FluidPropDockWidget.h \
    include/Camera.h \
    include/Text.h \
    include/ShaderLib.h \
    include/Shader.h \
    include/ShaderProgram.h \
    include/ShaderUtils.h \
    include/SPHSolverCUDAKernals.h \

OTHER_FILES += shaders/*glsl \
    shaders/fluidShaderFrag.glsl \
    shaders/fluidShaderVert.glsl \
    shaders/bilateralFilterFrag.glsl \
    shaders/bilateralFilterVert.glsl \
    shaders/thicknessFrag.glsl \
    shaders/thicknessVert.glsl \
    shaders/skyBoxFrag.glsl \
    shaders/skyBoxVert.glsl \
    mainpage.dox \
    shaders/cuboidVert.glsl \
    shaders/cuboidGeom.glsl \
    shaders/cuboidFrag.glsl \
    shaders/TextFrag.glsl \

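INCLUDEPATH += ./include
# MacPorts-style paths, only useful on mac/linux
unix {
    INCLUDEPATH += /opt/local/include
    LIBS += -L/opt/local/lib -lGLEW
}

CONFIG += console

# if on mac define DARWIN
macx:DEFINES += DARWIN
# on windows link against OpenGL and the static GLEW build,
# which needs GLEW_STATIC defined
win32 {
    DEFINES += GLEW_STATIC
    LIBS += -lopengl32 -lglew32s
}
# basic compiler flags (gcc/clang only, MSVC does not understand them)
unix:QMAKE_CXXFLAGS += -msse -msse2 -msse3
# use this to suppress some warning from boost
unix*:QMAKE_CXXFLAGS_WARN_ON += "-Wno-unused-parameter"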

#-------------------------Cuda setup-----------------------------

#Enter your gencode here!
GENCODE = arch=compute_52,code=sm_52

#We must define this as we get some conflicts in minwindef.h and helper_math.h
DEFINES += NOMINMAX
#set out cuda sources
CUDA_SOURCES = "$$PWD"/cudaSrc/SPHSolverCUDAKernals.cu

#This is to add our .cu files to our file browser in Qt
OTHER_FILES += $$CUDA_SOURCES
# Path to cuda toolkit install
macx:CUDA_DIR = /Developer/NVIDIA/CUDA-6.5
linux:CUDA_DIR = /usr/local/cuda-6.5
win32:CUDA_DIR = "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5"
# Path to cuda samples install (for the helper headers)
macx:CUDA_SDK = /Developer/NVIDIA/CUDA-6.5/samples
linux:CUDA_SDK = /usr/local/cuda-6.5/samples
win32:CUDA_SDK = "C:\ProgramData\NVIDIA Corporation\CUDA Samples\v7.5"

#Cuda include paths
#INCLUDEPATH += $$CUDA_DIR/common/inc/
#INCLUDEPATH += $$CUDA_DIR/../shared/inc/
#To get some prewritten helper functions from NVIDIA
win32:INCLUDEPATH += $$CUDA_SDK\common\inc

#cuda libs
linux:QMAKE_LIBDIR += $$CUDA_DIR/lib64
win32:QMAKE_LIBDIR += $$CUDA_DIR\lib\x64
linux|macx:QMAKE_LIBDIR += $$CUDA_SDK/common/lib
win32:QMAKE_LIBDIR +=$$CUDA_SDK\common\lib\x64
LIBS += -lcudart -lcudadevrt

# join the includes in a line
CUDA_INC = $$join(INCLUDEPATH,'" -I"','-I"','"')

# nvcc flags (ptxas option verbose is always useful)
NVCCFLAGS = --compiler-options  -fno-strict-aliasing --ptxas-options=-v -maxrregcount 20 --use_fast_math

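#On windows we must define if we are in debug mode or not
CONFIG(debug, debug|release) {
    # MSVCRT link option (static or dynamic, it must be the same with your Qt SDK link option)
    win32:MSVCRT_LINK_FLAG_DEBUG = "/MDd"
    win32:NVCCFLAGS += -D_DEBUG -Xcompiler $$MSVCRT_LINK_FLAG_DEBUG
} else {
    #Release UNTESTED!!!
    win32:MSVCRT_LINK_FLAG_RELEASE = "/MD"
    win32:NVCCFLAGS += -Xcompiler $$MSVCRT_LINK_FLAG_RELEASE
}

#prepare intermediate cuda compiler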
cudaIntr.input = CUDA_SOURCES
cudaIntr.output = ${OBJECTS_DIR}${QMAKE_FILE_BASE}.o
#So in windows object files have to be named with the .obj suffix instead of just .o
#God I hate you windows!!
win32:cudaIntr.output = $$OBJECTS_DIR/${QMAKE_FILE_BASE}.obj

## Tweak arch according to your hw's compute capability
cudaIntr.commands = $$CUDA_DIR/bin/nvcc -m64 -g -gencode $$GENCODE -dc $$NVCCFLAGS $$CUDA_INC $$LIBS ${QMAKE_FILE_NAME} -o ${QMAKE_FILE_OUT}

#Set our variable out. These obj files need to be used to create the link obj file
#and used in our final gcc compilation
cudaIntr.variable_out = CUDA_OBJ
cudaIntr.variable_out += OBJECTS
cudaIntr.clean = cudaIntrObj/*.o
win32:cudaIntr.clean = cudaIntrObj/*.obj


# Prepare the linking compiler step
cuda.input = CUDA_OBJ
cuda.output = ${QMAKE_FILE_BASE}_link.o
win32:cuda.output = ${QMAKE_FILE_BASE}_link.obj

# Tweak arch according to your hw's compute capability
cuda.commands = $$CUDA_DIR/bin/nvcc -m64 -g -gencode $$GENCODE  -dlink    ${QMAKE_FILE_NAME} -o ${QMAKE_FILE_OUT}
cuda.dependency_type = TYPE_C
cuda.depend_command = $$CUDA_DIR/bin/nvcc -g -M $$CUDA_INC $$NVCCFLAGS   ${QMAKE_FILE_NAME}
# Tell Qt that we want add more stuff to the Makefile
QMAKE_EXTRA_COMPILERS += cudaIntr cuda
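The minwindef.h and helper_math.h clash mentioned in the comments above looks like the classic min/max one. windows.h pulls in minwindef.h, which defines min and max as macros unless NOMINMAX is defined first, and those macros then mangle any function with the same name. Here is a minimal sketch of the fix in plain C++. The sketch namespace and its functions are made up for illustration and only stand in for the kind of functions helper_math.h declares.

```cpp
// Define NOMINMAX before windows.h so minwindef.h skips its min/max macros.
// A project-wide alternative is adding NOMINMAX to DEFINES in the .pro file.
#ifdef _WIN32
#  ifndef NOMINMAX
#    define NOMINMAX
#  endif
#  include <windows.h>
#endif

namespace sketch {
// Illustrative stand-ins for free functions named min and max.
// With the Windows macros active these declarations would not compile.
inline float min(float a, float b) { return a < b ? a : b; }
inline float max(float a, float b) { return a > b ? a : b; }

// Clamp v into [lo, hi] using the two functions above.
inline float clamp(float v, float lo, float hi) { return max(lo, min(v, hi)); }
}  // namespace sketch
```

On other platforms the guard compiles away, so the same source builds unchanged on Linux and Mac.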

Now you should be good to go and can develop to your heart’s content! One last note: if you plan to use the CUDA OpenGL interoperability then, for some reason, you must include windows.h before cuda_gl_interop.h, as you seem to get lots of redefinition conflicts if you don’t -.-
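In practice that include order can be sketched like this. The __has_include guard, the SKETCH_HAS_CUDA_GL_INTEROP macro and the helper function are my own additions, there only so the snippet still compiles on a machine without the CUDA toolkit. On a compiler that predates __has_include (MSVC 2013 being one), drop the guard and include the header directly.

```cpp
// Include order sketch for a source file that uses CUDA/OpenGL interop.
#ifdef _WIN32
#  include <windows.h>  // must come before cuda_gl_interop.h on Windows
#endif

// Only pull in the CUDA header when the toolkit include path is visible.
#if defined(__has_include)
#  if __has_include(<cuda_gl_interop.h>)
#    include <cuda_gl_interop.h>
#    define SKETCH_HAS_CUDA_GL_INTEROP 1
#  endif
#endif
#ifndef SKETCH_HAS_CUDA_GL_INTEROP
#  define SKETCH_HAS_CUDA_GL_INTEROP 0
#endif

// Hypothetical helper: true when cuda_gl_interop.h was found and included.
inline bool cudaGlInteropHeaderIncluded() {
    return SKETCH_HAS_CUDA_GL_INTEROP != 0;
}
```

Keeping these includes together at the top of one header makes it harder for a later include to sneak in between them.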

Good luck!

Sources: CUDA Getting Started Guide


Fluid Simulation Improvements

Over the last few weeks (months?) a lot of my research has dragged me back to the mathematical hell of SPH fluid simulations, inevitably leading to me rather reluctantly writing my 3rd fluid simulation from scratch. However, this time around I went about things differently (#NewYearNewMe) and actually learnt a lot about where I had been going wrong in the past. In fact, completing this simulation opened my eyes to my original simulation, which is nothing short of an abomination. After the thought of this had haunted me for a few weeks, I finally got around to rectifying the mistakes to make it something slightly closer to an acceptable release.

As usual, here is a video. Sorry if it drags on a little too much, I got a bit carried away. I’ve put some upbeat music over the top to hopefully keep you entertained (it’s Mr Probz “Waves”, get it, cos it’s a water simulation! XD)



This demo has all pressure, viscosity and surface tension forces implemented, however there are still improvements that can be made. You may notice that the blurring of the normals used to make the implicit fluid surface isn’t perfect, and the surface still looks very spherical. Furthermore, I think shadows would be pretty fun to implement, however at the moment, as you may see, I have no floor to project them onto. Finally, there are always improvements that can be made to the performance of the simulation. The next step is to abuse the shared memory of each block to store our particles in.