64-bit floating-point FPGA matrix multiplication

  • Yong Dou*
  • S. Vassiliadis
  • G. K. Kuzmanov
  • G. N. Gaydadjiev
  • *Corresponding author for this work

Research output: Contribution to conference, Paper, Academic

170 Citations (Scopus)

Abstract

We introduce a 64-bit ANSI/IEEE Std 754-1985 floating-point design of a hardware matrix multiplier optimized for FPGA implementations. We propose a general block matrix multiplication algorithm applicable to arbitrary matrix sizes. The algorithm potentially enables optimum performance by exploiting the data locality and reuse inherent in general matrix multiplication while accounting for the limits of I/O bandwidth and local storage capacity. We implement a scalable linear array of processing elements (PEs) supporting the proposed algorithm in the Xilinx Virtex II Pro technology. Synthesis results confirm a superior performance-to-area ratio compared with recent related work. Assuming the same FPGA chip, the same amount of local memory, and the same I/O bandwidth, our design outperforms related proposals by at least 1.7X and up to 18X while consuming the fewest reconfigurable resources. A total of 39 PEs can be integrated into the xc2vp125-7 FPGA, reaching, for example, 15.6 GFLOPS with 1600 KB of local memory and 400 MB/s of external memory bandwidth.
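
The block (tiled) multiplication idea mentioned in the abstract can be illustrated in software. The following is a minimal sketch only, not the paper's algorithm or its PE schedule: the function names `blocked_matmul` and `naive_matmul` and the `block` parameter are hypothetical, and the tile size merely stands in for the amount of local storage. It shows how each loaded tile is reused for several multiply-accumulate steps and how ragged edges let the scheme handle arbitrary matrix sizes.

```python
def blocked_matmul(a, b, block=2):
    """Multiply a (m x k) by b (k x n) tile by tile.

    `block` is a hypothetical tile edge length; in a hardware setting it
    would be chosen to fit the available local memory.
    """
    m, k, n = len(a), len(b), len(b[0])
    if any(len(row) != k for row in a):
        raise ValueError("inner dimensions do not match")

    c = [[0.0] * n for _ in range(m)]
    # The three outer loops walk over tiles of C, and over the shared
    # dimension; min() clips the last tile so sizes need not be
    # multiples of `block`.
    for i0 in range(0, m, block):
        for j0 in range(0, n, block):
            for p0 in range(0, k, block):
                for i in range(i0, min(i0 + block, m)):
                    for p in range(p0, min(p0 + block, k)):
                        a_ip = a[i][p]  # reused across the whole tile row
                        for j in range(j0, min(j0 + block, n)):
                            c[i][j] += a_ip * b[p][j]
    return c


def naive_matmul(a, b):
    """Reference triple-loop product used only to check the tiled version."""
    return [
        [sum(a[i][p] * b[p][j] for p in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]
```

With a 2 x 3 and a 3 x 2 input and `block=2`, the last tile along the shared dimension is only one element wide, which exercises the clipping that makes the scheme work for sizes that are not multiples of the tile edge.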

Original language: English
Pages: 86-95
Number of pages: 10
Publication status: Published - 20-Jun-2005
Externally published: Yes
Event: ACM/SIGDA Thirteenth ACM International Symposium on Field Programmable Gate Arrays - FPGA 2005 - Monterey, CA, United States
Duration: 20-Feb-2005 to 22-Feb-2005

Conference

Conference: ACM/SIGDA Thirteenth ACM International Symposium on Field Programmable Gate Arrays - FPGA 2005
Country/Territory: United States
City: Monterey, CA
Period: 20/02/2005 to 22/02/2005

Keywords

  • Floating-point
  • FPGA
  • Matrix multiplication
