Overview
Welcome to the 1st Workshop on AI for Streaming at CVPR! This workshop focuses on unifying new streaming technologies, computer graphics, and computer vision from a modern deep learning point of view. Streaming is a huge industry in which hundreds of millions of users demand high-quality content every day across different platforms. Computer vision and deep learning have emerged as revolutionary forces for content rendering, image and video compression, enhancement, and quality assessment. From neural codecs for efficient compression to deep learning-based video enhancement and quality assessment, these techniques are setting new standards for streaming quality and efficiency. Moreover, novel neural representations pose new challenges and opportunities for rendering streamable content, allowing us to redefine computer graphics pipelines and visual content.
Call for Papers (Closed)
We welcome papers addressing topics related to VR, streaming, efficient image/video (pre- & post-)processing, and neural compression. The topics include:
- Efficient Deep Learning
- Model optimization and Quantization
- Image/video quality assessment
- Image/video super-resolution and enhancement
- Compressed Input Enhancement
- Generative Models (Image & Video)
- Neural Codecs
- Real-time Rendering
- Neural Compression
- Video pre/post processing
Challenges 🚀
We are happy to host the following grand challenges focused on realistic image/video applications.
Register for the challenges to receive email updates and announcements of new challenges.
The workshop challenge prize pool exceeds $10,000 🚀, plus cool prizes like PS5s.
- Real-time Compressed Image Super-Resolution (Finished) A single neural network upscales compressed images (AVIF) to 4K considering different compression factors.
- UGC Video Quality Assessment (Finished) Estimate the quality of user-generated content (UGC) videos using efficient neural networks (24-30 FPS).
- Event-based Eye Tracking (Finished)
- Mobile Real-time Video Super-Resolution (Ongoing, from Feb to May) Upscale videos compressed with AV1 in real time on mobile devices such as the iPhone 14, from 360p to 1080p. Top teams will be invited to present their solutions and posters at the workshop; we will showcase the best models.
- Efficient Video Super-Resolution (Ongoing, from Feb to May) Upscale videos compressed with AV1 in real time at 30-60 FPS on commercial GPUs, from 540p to 4K. Top teams will be invited to present their solutions and posters at the workshop.
- Depth Upsampling and Refinement (Ongoing, from 22nd March to May) Given a low-resolution depth map and a high-resolution RGB image, upscale and refine the depth map. Top teams will be invited to present their solutions and posters at the workshop.
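As a rough reference point for the super-resolution tracks, submitted networks are typically compared against simple interpolation baselines. The sketch below (NumPy assumed; the function name and the bilinear baseline are illustrative, not an official challenge baseline) shows the 3x upscaling geometry of the mobile track, from 360p to 1080p:

```python
import numpy as np

def upscale_bilinear(img: np.ndarray, scale: int) -> np.ndarray:
    """Naive bilinear upscaling of an HxWxC image by an integer factor."""
    h, w, c = img.shape
    out_h, out_w = h * scale, w * scale
    # Continuous sample positions in the source image (align-corners=False).
    ys = (np.arange(out_h) + 0.5) / scale - 0.5
    xs = (np.arange(out_w) + 0.5) / scale - 0.5
    y0 = np.clip(np.floor(ys).astype(int), 0, h - 1)
    x0 = np.clip(np.floor(xs).astype(int), 0, w - 1)
    y1 = np.clip(y0 + 1, 0, h - 1)
    x1 = np.clip(x0 + 1, 0, w - 1)
    # Interpolation weights, broadcast over rows/columns/channels.
    wy = np.clip(ys - y0, 0.0, 1.0)[:, None, None]
    wx = np.clip(xs - x0, 0.0, 1.0)[None, :, None]
    top = img[y0][:, x0] * (1 - wx) + img[y0][:, x1] * wx
    bot = img[y1][:, x0] * (1 - wx) + img[y1][:, x1] * wx
    return top * (1 - wy) + bot * wy

# 360p (640x360) -> 1080p (1920x1080) is a 3x factor, as in the mobile track.
lr = np.random.rand(360, 640, 3)
sr = upscale_bilinear(lr, 3)
```

Challenge entries would replace this interpolation with a learned model, but the input/output geometry is the same.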
The top-ranked participants will receive awards and be invited to present their solutions at the AIS workshop at CVPR 2024.
The challenge reports (if applicable) will be published at AIS 2024 workshop, and in the CVPR 2024 Workshops proceedings.
The participants can submit papers describing their solution to the challenges and/or related problems (more info below).
We also invite you to check out the challenges at the New Trends in Image Restoration and Enhancement (NTIRE) workshop.
Keynote Speaker
Professor Alan Bovik (HonFRPS) holds the Cockrell Family Endowed Regents Chair in Engineering in the Chandra Family Department of Electrical and Computer Engineering in the Cockrell School of Engineering at The University of Texas at Austin, where he is Director of the Laboratory for Image and Video Engineering (LIVE). He is a faculty member in the Department of Electrical and Computer Engineering, the Wireless Networking and Communication Group (WNCG), and the Institute for Neuroscience. His research interests include digital television, digital photography, visual perception, social media, and image and video processing.
Invited Speakers
Schedule Details (TBD) - 17th June
Please click on the title of each presentation to see the abstract. All times are local.
- 09:00 - 09:15: Opening
- 09:15 - 10:00: Event-based Eye-Tracking Challenge and Results
- 10:00 - 10:30:
"AI and Machine Learning for Video Compression" by Ryan Lei (Video Codec Specialist, Meta)
Over the past 30 years, significant advances have been achieved in video compression, making it an indispensable technology that powers today’s Internet. Meanwhile, the past decade has witnessed the great success of machine learning in many areas, especially computer vision and image processing. It is natural to ask if and how machine learning techniques can be leveraged to improve video compression. In this talk, the speaker will first provide a high-level overview of progress in conventional video coding standard development and the challenges the community is facing. The speaker will then focus on a few trends in how machine learning can be leveraged for video compression. The first topic is how machine learning is used to further optimize coding tools of traditional video coding, such as intra/inter prediction and loop filtering. The second is how machine learning is used to improve the coding efficiency of the overall compression system, for example through pre/post filtering, super-resolution, and layered coding. The last is how neural networks are used to develop end-to-end learned video coding frameworks that fully replace the conventional video coding system. Throughout the talk, the speaker will also present examples of techniques that have worked and techniques that may not work in the near future.
- 10:45 - 11:15: Video Quality Assessment Challenge and Results
- 11:15 - 12:00:
"Deep Video Pre-processing" by Dr. Christos Bampis (Netflix)
- 12:00 - 13:00: Lunch & Poster Session
- 13:00 - 14:00: Keynote from Professor Alan Bovik
- 14:00 - 14:30: On the Edge Real-time Image Super-Resolution Challenge and Results
- 14:30 - 16:45: Invited Talks on Super-Resolution
- 16:45 - 17:30: "Deep Neural Compression" by Dr. Lucas Theis (Google DeepMind)
- 17:30 - 18:00: Closing Remarks & Award Ceremony [Challenge Diplomas]
Organizers
Program Committee
Radu Timofte (University of Würzburg)
Florin Vasluianu (University of Würzburg)
Zongwei Wu (University of Würzburg)
Ioannis Katsavounidis (Meta)
Ryan Lei (Meta)
Wen Li (Meta)
Cosmin Stejerean (Meta)
Shiranchal Taneja (Meta)
Christos Bampis (Netflix)
Zhi Li (Netflix)
Andy Bigos (Sony PlayStation)
Michael Stopa (Sony PlayStation)
Daniel Motilla (Sony PlayStation)
Saman Zadtootaghaj (Sony PlayStation)
Chang Gao (Delft University of Technology)
Qinyu Chen (University of Zurich and ETHZ & Leiden University)
Zuowen Wang (University of Zurich and ETHZ)
Shih-Chii Liu (University of Zurich and ETHZ)