Rough-n-Ready Roofline: NVIDIA V100 edition
In this post we discuss rules of thumb for performance limiters when using shared memory in an NVIDIA V100 CUDA compute kernel. The V100...