Evan DeSantola (tabasco)

BVH Query Cost Visualization and Extended Abstract

Visualization of Total Query Points Returned by our Acceleration Data Structure (BVH)

The following video is a 3-D heat map representation of the computation required in our simulation. The video above shows 10 iterations of a store falling from a wing. The visualization is color coded by the value of a hardware agnostic indicator (the number of candidate triangles returned by our acceleration data structure) of the amount of computation required to compute the signed distance from points in each region to the mesh. Thus, blue areas represent regions where signed distance queries are least expensive, while the red region represent areas where signed distance queries are the most expensive.

We see that by the nature of the implementation of our acceleration data structure, signed distance queries for points lying in certain regions of the mesh are much more expensive than queries for points in other regions. Additionally, we see that such regions change positions as the simulation progresses.

We thus see that we must perform some sort of load balancing if we wish to obtain better speedup results.

Abstract:

Signed distance is commonly employed to numerically represent material interfaces with complex boundaries in multi-material numerical simulations. However, the performance of computing the signed distance field is hindered by the complexity and size of the input. Recent trends in HPC architecture consist of multi-core CPUs and accelerators that collectively expose tens to thousands of cores to the application. Harnessing this massive parallelism for computing the signed distance field presents significant challenges. Chief among them, the design and implementation of a performance portable solution that can work across architectures. Addressing these challenges to accelerate signed distance queries is the primary merit of this work. Herein, we employ the RAJA programming model, which provides a loop-level abstraction that decouples the loop-body from the parallel execution and insulates application developers from non-portable compiler and platform-specific directives. Implementation and performance results are discussed in more detail.

Check Out our Extended Abstract Here

Biographies:

Evan DeSantola:

Evan DeSantola is an undergraduate student in Carnegie Mellon University’s School of Computer Science and is pursuing minors in Mathematics and Business Administration. His academic interests include healthcare informatics, HPC, computer vision, machine learning and optimization. In his spare time, Evan enjoys participating in CTFs with Carnegie Mellon’s Plaid Parliament of Pwning as well as software prototyping both independently and at hackathons.

Jordan Backes:

Jordan Backes is an undergraduate student at the University of Missouri Columbia, pursuing a degree in Electrical and Computer Engineering with minors in Mathematics, Physics, and Computer Science. His academic interests include high performance computing, machine learning, medical devices, and additive manufacturing. When not working on independent projects, he collaborates on his school’s electric car team.

Poster


Share this: