Further Reading

FFTX and SpectralPACK

1. Franz Franchetti, Daniele G. Spampinato, Anuva Kulkarni, Doru Thom Popovici, Tze Meng Low, Michael Franusich, Andrew Canning, Peter McCorquodale, Brian Van Straalen and Phillip Colella,
“FFTX and SpectralPack: A First Look”,
IEEE International Conference on High Performance Computing, Data, and Analytics,
Bengaluru, India (2018), pp. 18–27, doi: 10.1109/HiPCW.2018.8634111.
2. Franz Franchetti,
“SPIRAL, FFTX, and the path to SpectralPACK”,
talk given in the Nagoya University Booth at Supercomputing 2018, Dallas, TX.
3. Franz Franchetti,
“FFTX and SpectralPack: A First Look”,

SPIRAL

1. Franz Franchetti, Tze Meng Low, Doru Thom Popovici, Richard M. Veras, Daniele G. Spampinato, Jeremy R. Johnson, Markus Püschel, James C. Hoe and José M. F. Moura,
“SPIRAL: Extreme Performance Portability”,
Proceedings of the IEEE, Special Issue on From High Level Specifications to High Performance Code (2018).
2. Markus Püschel, Franz Franchetti and Yevgen Voronenko,
“Spiral”,
Encyclopedia of Parallel Computing (2011), pp. 1920–1933.
3. Markus Püschel, José M. F. Moura, Jeremy R. Johnson, David A. Padua, Manuela M. Veloso, Bryan Singer, Jianxin Xiong, Franz Franchetti, Aca Gacic, Yevgen Voronenko, Kang Chen, Robert W. Johnson and Nicholas Rizzolo,
“SPIRAL: Code Generation for DSP Transforms”,
Proceedings of the IEEE, 93 (2): 232–275 (2005).

FFTs

1. Franz Franchetti, Markus Püschel,
“Fast Fourier Transform”,
Encyclopedia of Parallel Computing (2011), pp. 658–671.
2. F. Franchetti and M. Püschel,
“Generating High-Performance Pruned FFT Implementations”,
Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2009, Taipei, Taiwan.
3. F. Franchetti, Y. Voronenko, S. Chellappa, J. M. F. Moura and M. Püschel,
“Discrete Fourier Transform on Multicores: Algorithms and Automatic Implementation”,
IEEE Signal Processing Magazine, special issue on Signal Processing on Platforms with Multiple Cores, 2009.
4. Yevgen Voronenko and Markus Püschel,
“Algebraic Signal Processing Theory: Cooley-Tukey Type Algorithms for Real DFTs”,
IEEE Transactions on Signal Processing, 57 (1): 205–222 (2009).
5. Yevgen Voronenko, Frédéric de Mesmay and Markus Püschel,
“Computer Generation of General Size Linear Transform Libraries”,
Proc. International Symposium on Code Generation and Optimization (CGO), pp. 102–113 (2009).

Distributed and Multicore FFTs

1. D. T. Popovici, T. M. Low and F. Franchetti,
“Large Bandwidth-Efficient FFTs on Multicore and Multi-Socket Systems”,
IEEE International Parallel & Distributed Processing Symposium (IPDPS), 2018, Vancouver, Canada.
2. A. Bonelli, F. Franchetti, J. Lorenz, M. Püschel and C. W. Ueberhuber,
“Automatic Performance Optimization of the Discrete Fourier Transform on Distributed Memory Computers”,
Proceedings of ISPA 06. Lecture Notes in Computer Science, Volume 4330, pages 818–832 (2006).
Best Paper Award.

Interesting Target Hardware

1. Q. Guo, T. M. Low, N. Alachiotis, B. Akin, L. Pileggi, J. C. Hoe and F. Franchetti,
“Enabling Portable Energy Efficiency with Memory Accelerated Library”,
48th IEEE/ACM International Symposium on Microarchitecture (MICRO-48), 2015, Waikiki, HI.
2. Berkin Akin, Franz Franchetti and James C. Hoe,
“Understanding the Design Space of DRAM-optimized Hardware FFT Accelerators”,
Proc. IEEE International Conference on Application-Specific Systems, Architectures and Processors (ASAP), 2014, Zurich, Switzerland, pp. 248–255.
3. P. A. Milder, F. Franchetti, J. C. Hoe and M. Püschel,
“Computer Generation of Hardware for Linear Digital Signal Processing Transforms”,
ACM Transactions on Design Automation of Electronic Systems, 17 (2), Article 15 (2012).
ACM TODAES Best Paper Award 2014.
4. Franz Franchetti, Yevgen Voronenko and G. Almasi,
“Automatic Generation of the HPC Challenges Global FFT Benchmark for BlueGene/P”,
Proc. High Performance Computing for Computational Science (VECPAR), 2012, Kobe, Japan.
5. S. Chellappa, F. Franchetti and M. Püschel,
“Computer Generation of Fast Fourier Transforms for the Cell Broadband Engine”,
Proceedings of the 23rd International Conference on Supercomputing (ICS), 2009, Yorktown Heights, NY.

Linear Algebra Libraries

1. Daniele G. Spampinato, Diego Fabregat-Traver, Paolo Bientinesi and Markus Püschel,
“Program Generation for Small-Scale Linear Algebra Applications”,
Proc. International Symposium on Code Generation and Optimization (CGO), 2018, Vienna, Austria, pp. 327–339.
2. Daniele G. Spampinato and Markus Püschel,
“A Basic Linear Algebra Compiler for Structured Matrices”,
Proc. International Symposium on Code Generation and Optimization (CGO), 2016, Edinburgh, Scotland, pp. 117–127.
3. Daniele G. Spampinato and Markus Püschel,
“A Basic Linear Algebra Compiler”,
Proc. International Symposium on Code Generation and Optimization (CGO), 2014, Orlando, FL, pp. 23–32.
4. Frédéric de Mesmay, Franz Franchetti, Yevgen Voronenko and Markus Püschel,
“Automatic Generation of Multithreaded Vectorized Adaptive Libraries for Matrix Multiplication”,
Proc. International Workshop on Parallel Matrix Algorithms and Applications (PMAA), 2008, Zurich, Switzerland.

Applications

1. Thom Popovici,
“An Approach to Specifying and Automatically Optimizing Fourier Transform Based Operations”,
PhD thesis, Electrical and Computer Engineering, Carnegie Mellon University, 2018.
2. A. Kulkarni, F. Franchetti and J. Kovacevic,
“Algorithm Design for Large Scale Parallel FFT-Based Simulations on Heterogeneous Platforms”,
IEEE High Performance Extreme Computing Conference (HPEC), 2018, Waltham, MA.
3. Tze-Meng Low, Qi Guo and Franz Franchetti,
“Optimizing Space Time Adaptive Processing Through Accelerating Memory-Bounded Operations”,
Proc. High Performance Extreme Computing (HPEC), 2015, Waltham, MA.
4. D. T. Popovici, F. Russell, K. Wilkinson, C-K. Skylaris, P. H. J. Kelly and F. Franchetti,
“Generating Optimized Fourier Interpolation Routines for Density Functional Theory Using SPIRAL”,
Proceedings of the 29th IEEE International Parallel & Distributed Processing Symposium (IPDPS), 2015, Hyderabad, India.
5. F. Gygi, E. W. Draeger, M. Schulz, B. R. de Supinski, J. A. Gunnels, V. Austel, J. C. Sexton, F. Franchetti, S. Kral, C. W. Ueberhuber and J. Lorenz,
“Large-Scale Electronic Structure Calculations of High-Z Metals on the BlueGene/L Platform”,
Proceedings of the ACM/IEEE conference on Supercomputing, 2006, Tampa, FL.
Gordon Bell Prize Winner 2006 (Peak Performance Award).

3D FFTs and Plane Wave Codes

1. A Canning, J Shalf, NJ Wright, S Anderson and M Gajbe,
“A Hybrid MPI/OpenMP 3D FFT for Plane Wave First-Principles Materials Science Codes”,
Proceedings of the International Conference on Scientific Computing (CSC), 2012, Shanghai, China, p. 1.
2. A Canning, J Shalf, LW Wang, H Wasserman and M Gajbe,
“A Comparison of Different Communication Structures for Scalable Parallel Three Dimensional FFTs in First Principles Codes”,
in Chapman, B., Desprez, F., Joubert, GR, et al. (eds.), Proc. ParCo 09, 2009, Lyon, France, pp. 107–116.
3. M Gajbe, A Canning, LW Wang, J Shalf, H Wasserman and R Vuduc,
“Auto-Tuning Distributed-Memory 3-Dimensional Fast Fourier Transforms on the Cray XT4”,
Proc. Cray User’s Group (CUG) Meeting, 2009, Atlanta, GA.
4. A Canning,
“Scalable Parallel 3D FFTs for Electronic Structure Codes”,
International Conference on High Performance Computing for Computational Science, pp. 280–286, Springer, 2008.
5. M Del Ben, FH da Jornada, A Canning, N Wichmann, K Raman and R Sasanka,
“Large-Scale GW Calculations on Pre-Exascale HPC Systems”,
Computer Physics Communications, 235:187–195 (2018).
6. L Oliker, A Canning, J Carter, C Iancu, M Lijewski, S Kamil and J Shalf,
“Scientific Application Performance on Candidate Petascale Platforms”,
Proceedings of the 21st IEEE International Parallel & Distributed Processing Symposium (IPDPS), 2007, Long Beach, CA.
7. A Canning and D Raczkowski,
“Scaling First-Principles Plane Wave Codes to Thousands of Processors”,
Computer Physics Communications, 169 (1–3):449–453 (2005).
8. A Canning, LW Wang, A Williamson and A Zunger,
“Parallel Empirical Pseudopotential Electronic Structure Calculations for Million Atom Systems”,
Journal of Computational Physics, 160 (1):29–41 (2000).