October 11, 2018

In this post we discuss rules of thumb for performance limiters when using shared memory in a CUDA compute kernel running on a Titan V, which is coincidentally the topic of today's lecture in my advanced GPU+FEM topics course at VT.

According to the Volta micro-architecture wiki entry,...

September 26, 2018

Simulation: flow constrained to the surface of a sphere, modeled using the Galerkin-Boltzmann equations (Tölke et al '00) discretized with tenth-order discontinuous Galerkin spectral elements in space and adaptive semi-analytic Runge-Kutta time stepping. See th...

September 18, 2018

Added ellipsoids to the set of objects supported by the simple Whitted-style ray tracer [1] we developed for the CMDA 3634 course @Virginia Tech for Fall 2018.

To detect collisions between spheres and ellipsoids we use a Newton-based algorithm for finding the nearest p...

September 2, 2018

Capabilities of the Paranumal Accelerated Ray Tracer:

  • Whitted based ray transport (link).

  • Stack based multiple scattering.

  • GPU acceleration. 

  • Primitives: spheres, cylinders, cones, planes, triangles, disks. 

  • Field of view emulated using Monte Ca...

August 24, 2018

Adding more primitives and learning about the dreaded ray tracing "acne" caused by finite precision issues when computing ray-shape intersections. All rendering in the above movie is done in CUDA on a Titan V. Unfortunately the YouTube compression algorithm is a lit h...

August 22, 2018

This semester the students in CMDA 3634 @ Virginia Tech will be building up ray tracing codes that run threaded with OpenMP, distributed with MPI, and/or accelerated with CUDA on GPUs.

Gearing up the basic ray tracer just to make sure I understand everythi...

libParanumal simulation for 3D flow over a finite fence modeled with the Galerkin-Boltzmann flow equations of gas dynamics (Toelke et al 00). The simulation uses a mix of discretization techniques:

Physical space: a 4th order polynomial discontinuous Galerkin spatial di...

Simulation on 100K quartic triangle elements with a discontinuous Galerkin discretization in space and an adaptive Runge-Kutta time stepper. The flow initially accelerates downwards until the bulk flow is established.

The calculation was performed on six NVIDIA GTX 1080 Ti GPUs.


May 18, 2018

Four VT undergraduates have joined the paranumal team as summer research assistants. From left to right: Nick Polidoro, Dallas Viar, Tulika Chaudhary, and Weichen Li.

They have all taken Computer Science Foundations for CMDA (CMDA 3634) and are using their GPU programmi...

May 18, 2018


Jesse Chan gave a colloquium talk in the Math Department @VT on novel entropy stable flux differencing discontinuous Galerkin formulations as described in his article.

225 Stanger St
Blacksburg, VA 24061