CUDA Path Tracer

October 2025
  • CUDA
  • C++
  • Thrust
Hero1
Hero2

Overview

This project is a physically-based path tracer built from the ground up in C++ and CUDA for the University of Pennsylvania's CIS 565: GPU Programming and Architecture course. Path tracing is a rendering technique that simulates light transport by tracing rays backward from the camera. Each ray bounces off surfaces, accumulating color and lighting information until it hits a light source or is terminated. By averaging many randomly sampled paths per pixel, the algorithm converges to a photorealistic image with accurate global illumination, soft shadows, and complex light interactions.

GPU Implementation

This path tracer uses a wavefront architecture optimized for GPU parallelism. Instead of assigning each thread a complete path (which would cause divergence as paths terminate at different times), each thread processes a single path segment—one bounce at a time. This approach maintains high GPU occupancy by keeping threads synchronized at each bounce level, avoiding the warp divergence that would occur if different threads were at different depths in their paths.
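The per-bounce loop can be sketched in plain C++ as follows; `PathSegment`, `traceOneBounce`, and `countLive` are illustrative names, not the project's actual API, and the loop stands in for a sequence of kernel launches:

```cpp
#include <algorithm>
#include <vector>

// Illustrative wavefront sketch: every live path advances exactly one bounce
// per pass, so all threads stay synchronized at the same bounce level.
struct PathSegment {
    int remainingBounces = 0;
    bool alive() const { return remainingBounces > 0; }
};

// Stand-in for one per-bounce kernel launch over all live paths.
void traceOneBounce(std::vector<PathSegment>& paths) {
    for (auto& p : paths) {
        if (p.alive()) --p.remainingBounces;  // each thread does one segment
    }
}

int countLive(const std::vector<PathSegment>& paths) {
    return (int)std::count_if(paths.begin(), paths.end(),
                              [](const PathSegment& p) { return p.alive(); });
}
```

In the real renderer each pass is a kernel launch, and the live count after each bounce determines the next launch's grid size.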

Features

Core Rendering

  • Physically-Based Materials: Diffuse and mirror BSDFs with stochastic roughness-based blending.
  • Stochastic Anti-Aliasing: Randomized subpixel sampling for smooth edges.
  • Environment Mapping: HDR skybox lighting with spherical coordinate sampling.

Advanced Effects

  • Black Hole Gravitational Lensing: Physically accurate light bending with a procedural accretion disk.
  • Depth of Field: Thin lens camera model with configurable focal distance and aperture size.
  • Bloom Post-Processing: Perceptual luminance-based glow for bright light sources.

Performance Optimizations

  • BVH Acceleration: Custom bounding volume hierarchy for fast ray-mesh intersection.
  • Stream Compaction: Automatic culling of terminated ray paths to maintain GPU efficiency.
  • Material Sorting: Coherent BSDF evaluation through dynamic ray reordering.

Pipeline

  • Custom OBJ Loader: Direct .obj mesh import supporting positions and normals.

Black Hole Gravitational Lensing

Path tracing typically assumes light travels in perfectly straight lines. Black holes, however, are a dramatic exception. Their immense mass distorts spacetime itself, bending the paths of light rays. This implementation simulates a ray's trajectory as it curves under gravitational acceleration, much like simulating a particle under Newtonian gravity.

Hero2
Light from the background environment map bending around the black hole's event horizon.

The Physics

The implementation is based on an excellent article by rantonels, which derives a formula for the acceleration experienced by light near a Schwarzschild black hole. Using an RK4 integrator for numerical stability, light rays are marched through the gravitational field.

The acceleration is given by:

\[\mathbf{a} = \left( -\frac{3 M h^2}{\lVert \mathbf{r} \rVert^5} \right)\mathbf{r}\,w\]

Where M is the black hole's mass, h² is the squared angular momentum of the ray, and w is a windowing function.
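A minimal host-side sketch of one RK4 step under this acceleration is shown below. The `Vec3` type, step size, and function names are illustrative (the real version runs in a CUDA kernel), and the windowing function w is omitted for brevity; h² = |r × v|² is computed once at the start of the ray.

```cpp
#include <cmath>

// Plain C++ stand-in for the CUDA lensing integrator (illustrative names).
struct Vec3 { double x, y, z; };
Vec3 operator+(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
Vec3 operator*(double s, Vec3 a) { return {s * a.x, s * a.y, s * a.z}; }
double dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
Vec3 cross(Vec3 a, Vec3 b) {
    return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}

// a = (-3 M h^2 / |r|^5) r   (windowing function w omitted in this sketch)
Vec3 accel(Vec3 r, double M, double h2) {
    double len = std::sqrt(dot(r, r));
    return (-3.0 * M * h2 / std::pow(len, 5.0)) * r;
}

// One RK4 step advancing the ray state (position r, direction v) by dt.
void rk4Step(Vec3& r, Vec3& v, double M, double h2, double dt) {
    Vec3 k1r = v,                  k1v = accel(r, M, h2);
    Vec3 k2r = v + 0.5 * dt * k1v, k2v = accel(r + 0.5 * dt * k1r, M, h2);
    Vec3 k3r = v + 0.5 * dt * k2v, k3v = accel(r + 0.5 * dt * k2r, M, h2);
    Vec3 k4r = v + dt * k3v,       k4v = accel(r + dt * k3r, M, h2);
    r = r + (dt / 6.0) * (k1r + 2.0 * k2r + 2.0 * k3r + k4r);
    v = v + (dt / 6.0) * (k1v + 2.0 * k2v + 2.0 * k3v + k4v);
}
```

Each simulated ray repeats `rk4Step` until it escapes the field, falls past the event horizon, or hits the accretion disk.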

Accretion Disk

To make the lensing visible, a procedural accretion disk was created. When a simulated ray passes through the disk's plane, its position is used to sample a swirled Perlin noise function. This sample stochastically determines if the ray should terminate and emit light or pass through, creating a wispy, turbulent appearance without the overhead of a full volumetric simulation. This technique was adapted from a previous WebGL shader project of mine.
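The disk test can be sketched as below. All names are hypothetical, and a cheap sine-hash noise stands in for the project's swirled Perlin noise; only the structure (plane crossing, annulus check, stochastic termination) mirrors the description above.

```cpp
#include <cmath>

// Deterministic pseudo-noise in [0, 1); a stand-in for swirled Perlin noise.
double hashNoise(double x, double y) {
    double n = std::sin(x * 12.9898 + y * 78.233) * 43758.5453;
    return n - std::floor(n);
}

// A ray segment from p0 to p1 crosses the disk plane (y = 0) if the y signs
// differ; the crossing point's noise density decides emit vs. pass-through.
bool hitsAccretionDisk(const double p0[3], const double p1[3],
                       double rInner, double rOuter, double u /* rng in [0,1) */) {
    if ((p0[1] > 0.0) == (p1[1] > 0.0)) return false;  // no plane crossing
    double t = p0[1] / (p0[1] - p1[1]);                // crossing parameter
    double x = p0[0] + t * (p1[0] - p0[0]);
    double z = p0[2] + t * (p1[2] - p0[2]);
    double rad = std::sqrt(x * x + z * z);
    if (rad < rInner || rad > rOuter) return false;    // outside the annulus
    // Swirl: offset the sample angle by radius so the noise streaks rotate.
    double angle = std::atan2(z, x) + 2.0 * rad;
    double density = hashNoise(rad * std::cos(angle), rad * std::sin(angle));
    return u < density;                                // stochastic terminate
}
```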

Close up of the procedural accretion disk
The procedural accretion disk, showing the swirling noise pattern.

Visual Improvements

Bloom

Bloom is a post-processing effect that adds a soft glow to bright areas of the image, simulating light scattering inside a camera lens or the human eye. After the image is rendered, a brightness filter is applied, and the result is blurred with a 21x21 Gaussian kernel. This blurred layer is then additively blended with the original image.
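The bright-pass and kernel setup can be sketched as follows. The threshold value and function names are illustrative, and the 1D weights assume the common separable implementation (two 1D passes are mathematically equivalent to the full 21x21 Gaussian):

```cpp
#include <cmath>

struct Color { double r, g, b; };

// Rec. 709 luma weights approximate perceived brightness.
double luminance(Color c) {
    return 0.2126 * c.r + 0.7152 * c.g + 0.0722 * c.b;
}

// Bright pass: keep only pixels whose perceptual luminance exceeds the
// threshold; everything else contributes nothing to the glow layer.
Color brightPass(Color c, double threshold) {
    return luminance(c) > threshold ? c : Color{0.0, 0.0, 0.0};
}

// Normalized 1D Gaussian weights; radius 10 gives a 21-tap kernel,
// matching a 21x21 blur when applied horizontally then vertically.
void gaussianWeights(double* w, int radius, double sigma) {
    double sum = 0.0;
    for (int i = -radius; i <= radius; ++i) {
        w[i + radius] = std::exp(-(i * i) / (2.0 * sigma * sigma));
        sum += w[i + radius];
    }
    for (int i = 0; i < 2 * radius + 1; ++i) w[i] /= sum;
}
```

The blurred bright layer is then added back onto the original image pixel by pixel.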

Black hole render without bloom
Without Bloom
Black hole render with bloom
With Bloom

Environment Mapping

To light scenes realistically, I implemented support for HDR environment maps. When a ray fails to intersect any scene geometry, its direction is converted to spherical coordinates, which are used to sample a texture that surrounds the entire scene at an infinite distance. This provides both background imagery and high-quality, image-based lighting.
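The miss-shader lookup reduces to mapping a unit direction to equirectangular texture coordinates. A sketch (assuming a +Y-up convention; the actual axis convention may differ):

```cpp
#include <cmath>

const double PI = 3.14159265358979323846;

// Convert a normalized ray direction to equirectangular (u, v) in [0, 1].
// Azimuth comes from atan2 in the XZ plane; the polar angle from acos(y).
void directionToEquirectUV(double dx, double dy, double dz,
                           double& u, double& v) {
    double phi = std::atan2(dz, dx);   // azimuth in (-pi, pi]
    double theta = std::acos(dy);      // polar angle from +Y in [0, pi]
    u = (phi + PI) / (2.0 * PI);       // wrap azimuth to [0, 1)
    v = theta / PI;                    // map polar angle to [0, 1]
}
```

Sampling the HDR texture at (u, v) then yields the radiance arriving from that direction.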

Scene with basic lighting
Without Environment Map
Scene with HDR environment map lighting
With Environment Map

Thin Lens Depth of Field

A thin lens camera model was implemented to simulate depth of field. Instead of originating from a single point, rays are cast from random points on a virtual lens aperture. These rays are directed to converge at a specific focal plane, causing objects at that distance to appear sharp while foreground and background objects become blurred.
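Ray generation under this model can be sketched as below, assuming a camera at the origin looking down -Z; the names and conventions are illustrative:

```cpp
#include <cmath>

struct Ray { double ox, oy, oz, dx, dy, dz; };

// pinhole(Dx,Dy,Dz): normalized direction the pinhole camera would use.
// (lensU, lensV): point sampled on the aperture disk in the lens plane.
// Every lens sample is re-aimed at the same focal-plane point, so objects
// at focalDistance stay sharp while everything else blurs.
Ray thinLensRay(double pinholeDx, double pinholeDy, double pinholeDz,
                double lensU, double lensV, double focalDistance) {
    // Point on the focal plane that the pinhole ray would hit.
    double t = focalDistance / -pinholeDz;
    double fx = t * pinholeDx, fy = t * pinholeDy, fz = t * pinholeDz;
    // New origin on the lens; new direction toward the focal point.
    double dx = fx - lensU, dy = fy - lensV, dz = fz;
    double len = std::sqrt(dx * dx + dy * dy + dz * dz);
    return {lensU, lensV, 0.0, dx / len, dy / len, dz / len};
}
```

With a zero-radius aperture every sample lands at the origin and the model degenerates to the usual pinhole camera.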

Depth of field with focus on the foreground
Focus on Foreground
Depth of field with focus on the middle ground
Focus on Middle Ground

Performance Improvements

BVH and OBJs

To render complex triangle meshes from .obj files, I implemented a Bounding Volume Hierarchy (BVH). A BVH is a tree structure that spatially organizes geometry into nested bounding boxes. During rendering, this allows the tracer to quickly discard large parts of the scene that a ray cannot possibly intersect, reducing the number of ray-triangle intersection tests from O(N) to O(log N) and making complex scenes render at interactive rates.
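The test performed at every BVH node is a ray-AABB "slab" intersection, which can be sketched as follows (the precomputed `invDir` reciprocal is a common optimization; names are illustrative):

```cpp
#include <algorithm>

// Slab test: intersect the ray against the three pairs of axis-aligned
// planes and check that the resulting t-intervals still overlap.
// invDir holds the per-axis reciprocals of the ray direction.
bool rayIntersectsAABB(const double origin[3], const double invDir[3],
                       const double boxMin[3], const double boxMax[3]) {
    double tMin = 0.0, tMax = 1e30;  // valid segment of the ray
    for (int axis = 0; axis < 3; ++axis) {
        double t0 = (boxMin[axis] - origin[axis]) * invDir[axis];
        double t1 = (boxMax[axis] - origin[axis]) * invDir[axis];
        if (t0 > t1) std::swap(t0, t1);  // handle negative direction signs
        tMin = std::max(tMin, t0);
        tMax = std::min(tMax, t1);
        if (tMin > tMax) return false;   // intervals no longer overlap
    }
    return true;
}
```

If the box is missed, the node's entire subtree (and every triangle inside it) is skipped, which is what turns the linear scan into a logarithmic traversal.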

Simple BVH scene
A simple scene with a few objects.
Complex BVH scene with an OBJ model
The same scene with a 5k+ triangle model.

Stream Compaction & Material Sorting

Two key optimizations for the wavefront architecture are stream compaction and material sorting. Stream compaction uses thrust::partition to remove "dead" rays from the processing queue after each bounce, ensuring that threads are not wasted on paths that have already terminated. This is especially effective in open scenes where many rays miss all geometry. Material sorting uses thrust::sort_by_key to group rays by the material they have hit. This ensures that threads within a GPU warp execute the same shading code, avoiding divergence and maximizing throughput.
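A CPU analogue of the two passes is sketched below, with `std::partition` and `std::stable_sort` standing in for `thrust::partition` and `thrust::sort_by_key`; the `Path` struct and function names are illustrative:

```cpp
#include <algorithm>
#include <vector>

struct Path { int remainingBounces; int materialId; };

// Compaction: move live paths to the front of the buffer. Only the first
// liveCount entries are processed on the next bounce.
int compactPaths(std::vector<Path>& paths) {
    auto mid = std::partition(
        paths.begin(), paths.end(),
        [](const Path& p) { return p.remainingBounces > 0; });
    return (int)(mid - paths.begin());
}

// Sorting: group live paths by material so that threads in a warp run the
// same shading code instead of diverging across BSDFs.
void sortByMaterial(std::vector<Path>& paths, int liveCount) {
    std::stable_sort(paths.begin(), paths.begin() + liveCount,
                     [](const Path& a, const Path& b) {
                         return a.materialId < b.materialId;
                     });
}
```

On the GPU the same two calls operate on device buffers, so no data is copied back to the host between bounces.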

Performance Analysis

Performance was tested on an NVIDIA GeForce RTX 3080. The graphs below show the impact of key optimizations on framerate.

Graph showing performance gain from stream compaction
Stream compaction provides a major FPS boost in open scenes.
Graph showing performance gain from BVH
BVH is essential for rendering scenes with more than a few hundred triangles.
Graph showing performance with multiple black holes
Performance scales well even with a large number of black holes in the scene.

Gallery

Overall, I was very happy with how this project turned out. The gravitational lensing effect is visually compelling and integrates seamlessly into the physically-based rendering pipeline. Future work could include adding more material types like glass and subsurface scattering, as well as improving the scene loading and UI. Below are some additional renders created during development.

Gallery render 2
Gallery render 3
Gallery render 4
A hand reaching for a black hole

Bloopers

Some of the wilder renders produced along the way while implementing these features.

Blooper 1
Blooper 2
Blooper 3
Blooper 4