Mr. Paul Fox Profile

Paul Fox

Research Fellow at EM Photonics Inc

SPIE Involvement:

Author

Publications (6)

Proceedings Article | 12 May 2016 Paper

Many-core graph analytics using accelerated sparse linear algebra routines

Stephen Kozacik, Aaron Paolini, Paul Fox, Eric Kelmelis

Proceedings Volume 9848, 984808 (2016) https://doi.org/10.1117/12.2228554

KEYWORDS: Analytics, Linear algebra, Computer programming, Algorithm development, Performance modeling, Defense and security, Photonics, Systems modeling, Matrices, C++

Proceedings Article | 12 May 2016 Paper

Enabling power-aware software in embedded systems

James Bonnett, Paul Fox, Aaron Paolini, Adam Markey, Stephen Kozacik, Eric Kelmelis

Proceedings Volume 9848, 984806 (2016) https://doi.org/10.1117/12.2228551

KEYWORDS: Control systems, Embedded systems, Mobile devices, Software development, Signal processing, Clocks, Prototyping, Digital signal processing, Interfaces, Computing systems

Proceedings Article | 22 May 2015 Paper

Adaptive OpenCL libraries for platform portability

Paul Fox, Allyssa Batten, Marcus Hayes, Eric Kelmelis

Proceedings Volume 9478, 947806 (2015) https://doi.org/10.1117/12.2177410

KEYWORDS: Computer programming, Performance modeling, Field programmable gate arrays, Detection and tracking algorithms, Instrument modeling, Computing systems, Linear algebra, Databases, Evolutionary algorithms, Algorithm development

Proceedings Article | 19 May 2015 Paper

Real-time technology for enhancing long-range imagery

Aaron Paolini, Eric Kelmelis, Stephen Kozacik, James Bonnett, Paul Fox

Proceedings Volume 9460, 94600C (2015) https://doi.org/10.1117/12.2177575

KEYWORDS: Image processing, Video, Image enhancement, Video acceleration, Video processing, Intelligence systems, Turbulence, Cameras, Atmospheric turbulence, Distortion

Proceedings Article | 13 June 2014 Paper

Optimization techniques for OpenCL-based linear algebra routines

Stephen Kozacik, Paul Fox, John Humphrey, Aryeh Kuller, Eric Kelmelis, Dennis Prather

Proceedings Volume 9095, 90950D (2014) https://doi.org/10.1117/12.2050673

KEYWORDS: Linear algebra, Field programmable gate arrays, Matrix multiplication, Matrices, Standards development, Algorithm development, Optimization (mathematics), Graphics processing units, Computer programming, Image processing

Proceedings Article | 13 June 2014 Paper

Targeting multiple heterogeneous hardware platforms with OpenCL

Paul Fox, Stephen Kozacik, John Humphrey, Aaron Paolini, Aryeh Kuller, Eric Kelmelis

Proceedings Volume 9095, 90950E (2014) https://doi.org/10.1117/12.2050643

KEYWORDS: Control systems, Computer programming, Switching, Parallel computing, Manufacturing, Standards development, Wavefronts, Detection and tracking algorithms, Algorithm development, Photonics

The OpenCL API allows for the abstract expression of parallel, heterogeneous computing, but hardware implementations differ substantially. The abstractions provided by the OpenCL API are often not high-level enough to conceal differences in hardware architecture. Additionally, implementations often fail to deliver the potential performance gains of certain features because of hardware limitations and other factors. These factors make it challenging to produce code that is portable in practice, so much OpenCL code ends up duplicated for each targeted hardware platform. This duplication of effort offsets the principal advantage of OpenCL: portability.

Certain coding practices can mitigate this problem, allowing a common code base to be adapted to perform well across a wide range of hardware platforms. To this end, we explore some general practices for producing performant code that are effective across platforms, as well as ways of modularizing code to enable optional optimizations that take advantage of hardware-specific characteristics. At a minimum, portability requires avoiding OpenCL features that are optional, not widely implemented, poorly implemented, or missing in major implementations. Exposing multiple levels of parallelism allows hardware to take advantage of the types of parallelism it supports, from the task level down to explicit vector operations. Static optimizations and branch elimination in device code help the platform compiler optimize programs effectively. Modularizing some code allows operations to be chosen for performance on the target hardware. Optional subroutines that exploit explicit memory locality allow different memory hierarchies to be used for maximum performance. The C preprocessor and JIT compilation using the OpenCL runtime can be used to enable some of these techniques, as well as to factor in hardware-specific optimizations as necessary.
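
As a rough illustration of the last point, the sketch below shows one way a single kernel source might be specialized per device through preprocessor definitions supplied at JIT build time. It is not taken from the paper: the `scale` kernel, the `VEC_WIDTH` macro, and the `DeviceProfile`, `build_options`, and `global_size` names are all hypothetical, chosen only for this example. The option string uses the standard `-D name=definition` form accepted by the `options` argument of `clBuildProgram`. The host code here only prepares the source and options; it does not call an OpenCL runtime.

```python
from dataclasses import dataclass

# OpenCL C source kept as one string. The preprocessor picks the code path
# when the runtime compiles it, so no branch remains in the device code.
# (Hypothetical kernel, written for this sketch only.)
KERNEL_SOURCE = r"""
#ifndef VEC_WIDTH
#define VEC_WIDTH 1
#endif

__kernel void scale(__global const float *in,
                    __global float *out,
                    const float alpha)
{
    const size_t i = get_global_id(0);
#if VEC_WIDTH == 4
    /* Explicit vector path: each work-item handles four floats. */
    vstore4(alpha * vload4(i, in), i, out);
#else
    /* Scalar fallback: each work-item handles one float. */
    out[i] = alpha * in[i];
#endif
}
"""


@dataclass(frozen=True)
class DeviceProfile:
    """Hypothetical per-device tuning record; not part of the OpenCL API."""
    name: str
    vec_width: int = 1


def build_options(profile: DeviceProfile) -> str:
    """Return the compiler option string for one device profile."""
    if profile.vec_width not in (1, 4):
        raise ValueError("this sketch's kernel only handles VEC_WIDTH 1 or 4")
    return f"-D VEC_WIDTH={profile.vec_width}"


def global_size(n: int, profile: DeviceProfile) -> int:
    """Number of work-items needed to cover n floats for this profile."""
    if n % profile.vec_width:
        raise ValueError("n must be a multiple of vec_width")
    return n // profile.vec_width
```

Under these assumptions, a host program would pass `KERNEL_SOURCE` to `clCreateProgramWithSource` and the string from `build_options` to `clBuildProgram` once per device, then launch with the work size from `global_size`. The common code base stays in one place, and only the small profile record changes per platform.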
