DTGAN: Dual Attention Generative Adversarial Networks for Text-to-Image Generation

Zhenxing Zhang, Lambert Schomaker

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review

28 Citations (Scopus)
156 Downloads (Pure)

Abstract

Most existing text-to-image generation methods adopt a multi-stage modular architecture that suffers from three significant problems: 1) training multiple networks increases the run time and affects the convergence and stability of the generative model; 2) these approaches ignore the quality of images produced by the early-stage generators; and 3) many discriminators need to be trained. To address these issues, we propose the Dual Attention Generative Adversarial Network (DTGAN), which can synthesize high-quality and semantically consistent images while employing only a single generator/discriminator pair. The proposed model introduces channel-aware and pixel-aware attention modules that guide the generator to focus on text-relevant channels and pixels based on the global sentence vector, and that fine-tune the original feature maps using the attention weights. In addition, Conditional Adaptive Instance-Layer Normalization (CAdaILN) is presented to help our attention modules flexibly control the amount of change in shape and texture according to the input natural-language description. Furthermore, a new type of visual loss is utilized to enhance the image resolution by ensuring vivid shapes and perceptually uniform color distributions in the generated images. Experimental results on benchmark datasets demonstrate the superiority of our proposed method over state-of-the-art models that use a multi-stage framework.
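The channel-aware attention described in the abstract — per-channel gates derived from the global sentence vector that rescale the generator's feature maps toward text-relevant channels — can be sketched as follows. This is a minimal illustrative sketch, not the paper's exact formulation: the tensor shapes, the single linear projection, and the sigmoid gating are all assumptions made for clarity, and only the Python standard library is used.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def channel_aware_attention(features, sentence_vec, weights):
    """Rescale feature channels by text-conditioned attention gates.

    features: list of C channels, each an H x W grid (list of lists).
    sentence_vec: list of D floats (global sentence embedding).
    weights: C x D projection mapping the sentence vector to one
             attention logit per channel (a stand-in for a learned layer).
    """
    # One gate in (0, 1) per channel, conditioned on the sentence vector.
    gates = [sigmoid(sum(w * s for w, s in zip(row, sentence_vec)))
             for row in weights]
    # Fine-tune the feature maps: scale each channel by its gate.
    return [[[gates[c] * v for v in row] for row in features[c]]
            for c in range(len(features))]

# Tiny example with hypothetical sizes: C channels, D-dim sentence vector.
random.seed(0)
C, D, H, W = 4, 3, 2, 2
features = [[[random.random() for _ in range(W)] for _ in range(H)]
            for _ in range(C)]
sentence_vec = [random.random() for _ in range(D)]
weights = [[random.gauss(0, 1) for _ in range(D)] for _ in range(C)]

attended = channel_aware_attention(features, sentence_vec, weights)
print(len(attended), len(attended[0]), len(attended[0][0]))  # 4 2 2
```

The same gating idea extends to pixel-aware attention by producing one gate per spatial location instead of per channel; in the full model both gates would be learned end-to-end with the generator rather than fixed random projections as here.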
Original language: English
Title of host publication: 2021 International Joint Conference on Neural Networks (IJCNN)
Publisher: IEEE
Number of pages: 8
ISBN (Electronic): 978-1-6654-3900-8
ISBN (Print): 978-1-6654-4597-9
Publication status: Published - 20 Sept 2021
Event: 2021 International Joint Conference on Neural Networks (IJCNN) - Shenzhen, China
Duration: 18 Jul 2021 - 22 Jul 2021

Conference

Conference: 2021 International Joint Conference on Neural Networks (IJCNN)
Period: 18/07/2021 - 22/07/2021

Keywords

  • Generative adversarial network
  • dual-attention-model
  • text-to-image transform
  • Conditional Adaptive Instance-Layer Normalization
  • CAdaILN
  • Channel-Aware Attention
