Overview

Welcome to the 1st Workshop on AI for Streaming at CVPR! This workshop focuses on unifying new streaming technologies, computer graphics, and computer vision from a modern deep learning point of view. Streaming is a huge industry in which hundreds of millions of users demand high-quality content every day on different platforms. Computer vision and deep learning have emerged as revolutionary forces for content rendering, image and video compression, enhancement, and quality assessment. From neural codecs for efficient compression to deep learning-based video enhancement and quality assessment, these techniques are setting new standards for streaming quality and efficiency. Moreover, novel neural representations pose new challenges and opportunities for rendering streamable content, making it possible to redefine computer graphics pipelines and visual content.

Call for Papers (Closed)

We welcome papers on topics related to VR, streaming, efficient image/video (pre- and post-)processing, and neural compression. Topics include:

  • Efficient Deep Learning
  • Model Optimization and Quantization
  • Image/Video Quality Assessment
  • Image/Video Super-Resolution and Enhancement
  • Compressed Input Enhancement
  • Generative Models (Image & Video)
  • Neural Codecs
  • Real-time Rendering
  • Neural Compression
  • Video Pre/Post-Processing


Challenges 🚀

We are happy to host the following grand challenges focused on realistic image/video applications.
Register for the challenges to receive email updates and news about new challenges.
The workshop challenge prize pool will exceed $10,000 🚀, plus cool stuff like PS5s.


The top-ranked participants will receive awards and be invited to present their solutions at the AIS workshop at CVPR 2024.
The challenge reports (if applicable) will be published at the AIS 2024 workshop and in the CVPR 2024 Workshops proceedings.
Participants can submit papers describing their solutions to the challenges and/or related problems (more info below).

We also invite you to check out the challenges at the New Trends in Image Restoration and Enhancement (NTIRE) workshop.


Keynote Speaker


Professor Alan Bovik (HonFRPS) holds the Cockrell Family Endowed Regents Chair in Engineering in the Chandra Family Department of Electrical and Computer Engineering in the Cockrell School of Engineering at The University of Texas at Austin, where he is Director of the Laboratory for Image and Video Engineering (LIVE). He is a faculty member in the Department of Electrical and Computer Engineering, the Wireless Networking and Communication Group (WNCG), and the Institute for Neuroscience. His research interests include digital television, digital photography, visual perception, social media, and image and video processing.



Invited Speakers

Lucas Theis

Google DeepMind

Saman Zadtootaghaj

Sony PlayStation

Ryan Lei

Meta

Christos Bampis

Netflix

Schedule Details (TBD) - 17th June

All times are local.
  • 09:00 - 09:15: Opening
  • 09:15 - 10:00: Event-based Eye-Tracking Challenge and Results
  • 10:00 - 10:30: "AI and Machine Learning for Video Compression" by Ryan Lei (Video Codec Specialist, Meta)

    Over the past 30 years, significant advances in video compression have made it an indispensable technology that powers today’s Internet. Meanwhile, the past decade has witnessed the great success of machine learning in many areas, especially computer vision and image processing. It is natural to ask whether and how machine learning techniques can be leveraged to improve video compression. In this talk, the speaker will first provide a high-level overview of progress in conventional video coding standard development and the challenges the community is facing. The speaker will then focus on a few trends in how machine learning can be leveraged for video compression. The first topic is how machine learning is used to further optimize the coding tools of traditional video coding, such as intra/inter prediction and loop filtering. The second topic is how machine learning is used to improve the coding efficiency of the overall compression system, for example through pre/post filtering, super-resolution, and layered coding. The last topic is how neural networks are used to develop end-to-end learned video coding frameworks that fully replace the conventional video coding system. Throughout the talk, the speaker will also present examples of techniques that have worked and techniques that may not work in the near future.

  • 10:45 - 11:15: Video Quality Assessment Challenge and Results
  • 11:15 - 12:00: "Deep Video Pre-processing" by Dr. Christos Bampis (Netflix)

  • 12:00 - 13:00: Lunch & Poster Session
  • 13:00 - 14:00: Keynote from Professor Alan Bovik
  • 14:00 - 14:30: On the Edge Real-time Image Super-Resolution Challenge and Results
  • 14:30 - 16:45: Invited Talks on Super-Resolution
  • 16:45 - 17:30: "Deep Neural Compression" by Dr. Lucas Theis (Google DeepMind)
  • 17:30 - 18:00: Closing Remarks & Award Ceremony [Challenge Diplomas]

Organizers

Marcos V. Conde ✉️

University of Würzburg & Sony PlayStation

Radu Timofte ✉️

University of Würzburg

Ioannis Katsavounidis

Meta

Rakesh Ranjan

Meta Reality Labs

Christos Bampis

Netflix

Ryan Lei

Meta

Daniel Motilla

Sony PlayStation

Program Committee

Marcos V. Conde (University of Würzburg & Sony PlayStation)
Radu Timofte (University of Würzburg)
Florin Vasluianu (University of Würzburg)
Zongwei Wu (University of Würzburg)
Ioannis Katsavounidis (Meta)
Ryan Lei (Meta)
Wen Li (Meta)
Cosmin Stejerean (Meta)
Shiranchal Taneja (Meta)
Christos Bampis (Netflix)
Zhi Li (Netflix)
Rakesh Ranjan (Meta Reality Labs)
Andy Bigos (Sony PlayStation)
Michael Stopa (Sony PlayStation)
Daniel Motilla (Sony PlayStation)
Saman Zadtootaghaj (Sony PlayStation)
Chang Gao (Delft University of Technology)
Qinyu Chen (University of Zurich and ETHZ & Leiden University)
Zuowen Wang (University of Zurich and ETHZ)
Shih-Chii Liu (University of Zurich and ETHZ)