Skip to content

Blog

Video Models Explosion 2024

2024 was the year in which video generation models exploded onto the scene, triggered by the impressive release of Sora from OpenAI. This blog post is my way of keeping track of the progress of a fascinating field. I will review all the key techniques that are used in building state-of-the-art video models (1).

    • A comprehensive review of all text-to-image/text-to-video models is beyond the scope of this blog post. I will focus on research that has been published, productionized, or open-sourced.
    • All of the videos and images are reproduced from the cited projects and papers, and the copyright belongs to the authors or the organization that published their papers. Below I adapted key figures for each paper under the fair use clause of copyright law.