Video Generation Models Explosion 2024

Video generation models exploded onto the scene in 2024, sparked by the release of Sora from OpenAI. This blog post is my way of keeping track of the progress of this fascinating field. I will review all the key techniques that are used in building state-of-the-art video generation models (1).

    • A comprehensive review of all text-to-image/text-to-video models is beyond the scope of this blog post. I will focus on research that has been published, productionized, or open-sourced.
    • All of the videos and images are reproduced from the cited projects and papers, and the copyright belongs to the authors or the organizations that published them. I have adapted key figures from each paper below under the fair use provision of copyright law.