BLAS-3 for the Quadrics parallel computer

T Lippert, N Petkov, K Schilling

    Research output: Chapter in Book/Report/Conference proceeding › Chapter › Academic

    4 Citations (Scopus)

    Abstract

    A scalable parallel algorithm for matrix multiplication on SISAMD computers is presented. Our method enables us to implement an efficient BLAS library on the Italian APE100/Quadrics SISAMD massively parallel computer on which hitherto scalable parallel BLAS-3 were not available. The approach proposed is based on a one-dimensional ring connectivity. The flow of data is hyper-systolic. The communication overhead is competitive with that of established algorithms for SIMD and MIMD machines. Advantages are that (i) the layout of the matrices is preserved during the computation, (ii) BLAS-2 fit well into this layout and (iii) indexed addressing is avoided, which renders the algorithm suitable for SISAMD machines and, in this way, for all other types of parallel computers. On the APE100/Quadrics, a performance of nearly 25 % of the peak performance for multiplications of complex matrices is achieved.
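
    To illustrate the ring-based data flow the abstract refers to, here is a minimal sketch of a plain systolic matrix multiplication on a one-dimensional ring, simulated sequentially. It is not the hyper-systolic algorithm of the paper, whose details the abstract does not give. The function name `ring_matmul`, the row-block layout, and the restriction to a square A with P dividing n are assumptions made for this example only.

```python
def ring_matmul(A, B, P):
    """Simulate C = A * B on a ring of P processors (plain systolic scheme).

    Assumptions for this sketch: A is n x n, B is n x m, and P divides n.
    """
    n = len(A)
    m = len(B[0])
    assert n % P == 0, "sketch assumes P divides n"
    b = n // P  # rows owned by each processor

    # Row-block layout: processor p owns rows p*b .. (p+1)*b - 1 of A, B and C.
    A_loc = [A[p * b:(p + 1) * b] for p in range(P)]
    B_loc = [B[p * b:(p + 1) * b] for p in range(P)]
    C_loc = [[[0] * m for _ in range(b)] for _ in range(P)]

    # owner[p] is the index of the B block that processor p currently holds.
    owner = list(range(P))

    for _ in range(P):
        # Local compute phase: every processor uses the B block it holds now.
        for p in range(P):
            q = owner[p]
            for i in range(b):
                for k in range(b):
                    a = A_loc[p][i][q * b + k]
                    for j in range(m):
                        C_loc[p][i][j] += a * B_loc[p][k][j]
        # Communication phase: each processor receives the block of its
        # right-hand neighbour (one cyclic shift along the ring).
        B_loc = B_loc[1:] + B_loc[:1]
        owner = owner[1:] + owner[:1]

    # After P shifts every B block is back on its original processor,
    # so the matrix layout is the same as before the computation.
    assert owner == list(range(P))
    return [row for block in C_loc for row in block]
```

    Each processor only ever exchanges data with its ring neighbour, and after P shifts the blocks of B are back where they started. In this simple scheme that gives the layout-preservation property listed as advantage (i).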

    Original language: English
    Title of host publication: HIGH-PERFORMANCE COMPUTING AND NETWORKING
    Editors: B Hertzberger, P Sloot
    Place of Publication: BERLIN
    Publisher: Springer
    Pages: 332-341
    Number of pages: 10
    ISBN (Print): 3-540-62898-3
    Publication status: Published - 1997
    Event: High Performance Computing and Networking Europe 1997 Conference, Austria
    Duration: 28-Apr-1997 to 30-Apr-1997

    Publication series

    Name: Lecture Notes in Computer Science
    Publisher: SPRINGER-VERLAG BERLIN
    Volume: 1225
    ISSN (Print): 0302-9743

    Other

    Other: High Performance Computing and Networking Europe 1997 Conference
    Country: Austria
    Period: 28/04/1997 to 30/04/1997

    Keywords

    • APE100/QUADRICS
