Meta: Video Accelerators at Meta
Chair/moderator: Xing Cindy Chen
In this workshop, we will introduce MSVP (Meta Scalable Video Processor), the first-generation server-grade video processing hardware accelerator of its kind, developed at Meta. We will describe the motivation behind it, its architecture, and some of the novel algorithms in the video encoder and other video processing blocks that achieve high quality. We will also describe how the hardware accelerators are used in Meta’s data centers to support processing and transcoding billions of videos every day, providing premium video quality to end users while saving power.
- MSVP: Meta’s First Server-Class Video Processor (Harikrishna Reddy, Yunqing Chen)
- Recipes for Premium Video Quality in Silicon (Junqiang Lan, Xing Cindy Chen)
- Leveraging HW ASICs for Transcoding/Processing Video at Scale (Ryan Lei, Haixiong Wang)
- Xing Cindy Chen is a member of the Infra Silicon Architecture team at Meta working on algorithms and architecture of hardware accelerators for video processing at scale. Before joining Meta, she worked at Nvidia and Intel as a senior ASIC architect. She worked on algorithm investigation, architecture definition and functional/performance simulation for various units in GPUs and SOCs, such as rasterization, texture mapping, stereo vision and image signal processing. Her technical interests lie in video processing, computer vision and computer graphics. She is a senior member of the IEEE and currently serves as the Chair of the IEEE SPS Seasonal Schools Subcommittee. Cindy received her Ph.D. in Electrical Engineering from Stanford University.
- Harikrishna Reddy is a Technical Lead in the Infra Silicon team at Meta, leading all accelerator ASICs. He specializes in video transcoder, computer vision, and AI/ML SoC designs for both mobile and server-class applications. Prior to Meta, Harikrishna led the architecture and design of several generations of key multimedia IPs at Qualcomm and Nvidia. Over the last 20 years, he has delivered scalable video decoder and encoder IPs for the MPEG-2, MPEG-4, H.264, H.265, VP9, and AV1 codecs that have been very successful in the industry. Harikrishna received his M.S. in Computer Engineering from Texas A&M University.
- Yunqing Chen is a technical lead on the Infra Silicon Design team at Meta working on infrastructure ASIC solutions. He is currently working on cloud-based video processing and transcoder hardware architecture and ASIC development, achieving better performance per watt than software transcoders while maintaining comparable quality per bitrate. Before joining Meta, Yunqing worked at Nvidia and Qualcomm, developing image signal processor (ISP), video codec, and computer vision HW IPs and SoC solutions for mobile products. Yunqing received his Ph.D. in Physics from the University of Michigan, Ann Arbor.
- Junqiang Lan is an algorithm/architecture technical lead on the Infra Silicon team at Meta working on infrastructure ASIC solutions. He is currently working on video encoding/processing algorithms, architecture, and firmware for video transcoding ASIC accelerators in Meta data centers. Before Meta, Junqiang worked at Ambarella, CConchip, and Marvell as a multimedia algorithm and system architect. His field of expertise covers hardware algorithms and architecture for video codecs, image/video processing/ISP, computer vision, machine learning, and SoC design. Junqiang received his Ph.D. in CECS from the University of Missouri, Columbia.
- Ryan Lei is a video codec specialist on the Video Infra team at Meta working on algorithms and architecture for cloud-based video processing, transcoding, and delivery at scale for various Facebook and Instagram products. Ryan is also the co-chair of the AOM testing subgroup and is actively contributing to the standardization of AV1 and AV2. Before joining Meta, Ryan worked at Intel as a principal engineer and codec architect, where he worked on algorithm implementation and architecture definition for multiple generations of hardware-based video codecs, such as AVC, VP9, HEVC, and AV1. Ryan received his Ph.D. in Computer Science from the University of Ottawa. His research interests include image/video processing, compression, and parallel computing.
- Haixiong Wang is a technical lead on the Video Infra team at Meta working on video compression efficiency and video quality measurement at scale. He is especially interested in deploying advanced compression algorithms to billions of users efficiently while achieving a great overall quality of experience. He is currently working on improving the accuracy and efficiency of video quality metrics for streaming applications. Before joining Meta, Haixiong worked at Intel designing and validating the video encoding/decoding capabilities of a wide range of products. Haixiong received his M.S. in Computer Science and Engineering from the University of Michigan, Ann Arbor.
Google: Advances in Video Compression and Quality Assessment
- UVQ: Measuring YouTube’s Perceptual Video Quality (Yilin Wang): The Universal Video Quality (UVQ) model is a deep-learning-based video quality metric recently proposed by Google that is now widely applied in production. In this talk, we will discuss recent developments and applications of UVQ, and share our thoughts on challenging open questions in UGC video quality assessment.
- Update on the next-generation AOM codec (Rachel Barker, Keng-Shih (Lester) Lu): AV2 is a next-generation codec currently under development by the Alliance for Open Media (AOM). It aims to achieve a 30% compression gain over its predecessor AV1, making it the most advanced royalty-free video codec. This talk will give a brief overview of the most innovative tools in AV2 and present current coding results, ongoing activities, and future plans.
- Neural Video Compression: State and Future (Fabian Mentzer): In this talk, we will review the latest developments in neural video compression and look at the challenges ahead of us. We will look at recent work that outperforms VTM in PSNR(RGB), the state of the art in YUV420, how transformers have simplified the neural compression setup, and where we are in terms of runtime.
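(For readers unfamiliar with the PSNR metric used to compare codecs in the abstract above, here is a minimal illustrative sketch; the function names and the 8-bit peak value of 255 are assumptions for the example, not part of any talk.)

```python
import math

def psnr(mse: float, max_val: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB for a given mean squared error.

    Higher is better; identical frames (mse == 0) give infinite PSNR.
    """
    if mse == 0:
        return float("inf")
    return 10.0 * math.log10((max_val ** 2) / mse)

def mse(ref: list[float], dist: list[float]) -> float:
    """Mean squared error between a reference and a distorted signal."""
    return sum((r - d) ** 2 for r, d in zip(ref, dist)) / len(ref)
```

PSNR(RGB) averages or pools this over RGB pixel values, whereas YUV420 evaluation weights the luma and subsampled chroma planes separately, which is one reason results in the two color spaces are not directly comparable.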
- Yilin Wang is a staff software engineer on the Media Algorithms team at YouTube/Google. He has spent the last eight years improving YouTube’s video processing and transcoding infrastructure and building video quality metrics. Besides his video engineering work, he is also an active researcher in video quality-related areas and has published papers in CVPR, ICCV, TIP, ICIP, etc. He received his Ph.D. from the University of North Carolina at Chapel Hill in 2014, working on topics in computer vision and image processing.
- Rachel Barker is a software engineer in the Open Codecs team at Google, working on the AV1 and AV2 video codecs. She has a Master’s degree in Mathematics from the University of Cambridge, and has six years’ experience in digital signal processing, video compression, and embedded systems. Recently she was involved with open-sourcing the Argon Streams AV1 video decoder verification tool, which she previously worked on in 2016-2018.
- Keng-Shih (Lester) Lu received B.S. degrees in electrical engineering and in mathematics and an M.S. degree in communication engineering from National Taiwan University, Taipei, Taiwan, in 2012 and 2014, respectively, and a Ph.D. degree in electrical and computer engineering from the University of Southern California, Los Angeles, CA, USA, in 2020. From 2014 to 2015, he was a full-time research assistant at the Institute of Information Science, Academia Sinica, Taiwan. Since 2020, he has been with the Video Coding team at Google. His research interests include signal processing and machine learning with applications in multimedia compression and computer vision.
- Fabian Mentzer received his PhD from ETH Zurich in 2021, with his thesis on Neural Compression. He joined Google as a Research Scientist thereafter. Currently, his focus is on improving neural video compression, both conceptually and in terms of rate-distortion.