TiV-ODE: A Neural ODE-based Approach for Controllable Video Generation from Text-Image Pairs

Yucheng Xu*, Nanbo Li, Arushi Goel, Zonghai Yao, Zijian Guo, Hamidreza Kasaei, Mohammadreza Kasaei, Zhibin Li

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review

Abstract

Videos capture the evolution of continuous dynamical systems over time in the form of discrete image sequences. Recently, video generation models have been widely used in robotics research. However, generating controllable videos from image-text pairs is an important yet underexplored research topic in both the robotics and computer vision communities. This paper introduces an innovative and elegant framework named TiV-ODE, formulating this task as modeling the dynamical system in a continuous space. Specifically, our framework leverages the ability of Neural Ordinary Differential Equations (Neural ODEs) to model the complex dynamical system depicted by videos as a nonlinear ordinary differential equation. The resulting framework offers control over the generated videos' dynamics, content, and frame rate, a feature not provided by previous methods. Experiments demonstrate the ability of the proposed method to generate highly controllable and visually consistent videos and its capability of modeling dynamical systems. Overall, this work is a significant step towards developing advanced controllable video generation models that can handle complex and dynamic scenes.
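The frame-rate control mentioned in the abstract follows from treating the video as a continuous-time trajectory: once the dynamics are an ODE, the same initial state can be sampled at any set of timestamps. The sketch below illustrates only that idea and is not the TiV-ODE architecture. The names `make_dynamics`, `rk4_step`, and `rollout` are hypothetical, the dynamics function uses random rather than learned weights, the `control` vector stands in for a text embedding, and decoding latents back to images is omitted.

```python
import numpy as np


def make_dynamics(control, dim=4, seed=0):
    """Return f(z) = dz/dt for a toy latent ODE conditioned on `control`.

    Assumption: a random, untrained stand-in for a learned network.
    """
    rng = np.random.default_rng(seed)
    w_z = rng.normal(scale=0.5, size=(dim, dim))
    w_c = rng.normal(scale=0.5, size=(dim, control.shape[0]))

    def f(z):
        # Bounded nonlinear dynamics driven by the state and the control.
        return np.tanh(w_z @ z + w_c @ control)

    return f


def rk4_step(f, z, dt):
    """Advance the state by one classical Runge-Kutta (RK4) step."""
    k1 = f(z)
    k2 = f(z + 0.5 * dt * k1)
    k3 = f(z + 0.5 * dt * k2)
    k4 = f(z + dt * k3)
    return z + (dt / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)


def rollout(f, z0, duration, fps, substeps=8):
    """Integrate from t=0 to `duration` and return one latent per frame.

    Frames are spaced 1/fps apart; each interval uses `substeps` RK4 steps.
    """
    n_frames = int(round(duration * fps)) + 1
    dt = 1.0 / (fps * substeps)
    z = np.array(z0, dtype=float)
    frames = [z.copy()]
    for _ in range(n_frames - 1):
        for _ in range(substeps):
            z = rk4_step(f, z, dt)
        frames.append(z.copy())
    return np.stack(frames)


# Same initial latent and control, two different frame rates.
z0 = np.zeros(4)
f = make_dynamics(control=np.array([1.0, 0.0]))
slow = rollout(f, z0, duration=1.0, fps=4)   # 5 latent frames
fast = rollout(f, z0, duration=1.0, fps=8)   # 9 latent frames
```

In this toy setting, every second frame of `fast` lands at the same timestamp as a frame of `slow`, so the two rollouts describe the same underlying trajectory at different sampling densities. Changing `control` changes the vector field and therefore the motion, which is a loose analogue of conditioning the dynamics on text.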

Original language: English
Title of host publication: 2024 IEEE International Conference on Robotics and Automation, ICRA 2024
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 14645-14652
Number of pages: 8
ISBN (Electronic): 9798350384574
Publication status: Published - 2024
Event: 2024 IEEE International Conference on Robotics and Automation, ICRA 2024 - Yokohama, Japan
Duration: 13-May-2024 to 17-May-2024

Publication series

Name: Proceedings - IEEE International Conference on Robotics and Automation
ISSN (Print): 1050-4729

Conference

Conference: 2024 IEEE International Conference on Robotics and Automation, ICRA 2024
Country/Territory: Japan
City: Yokohama
Period: 13/05/2024 to 17/05/2024
