Three of the Hottest Topics in Computer Vision Today

CVPR’s paper submissions reflect the evolution of the field

When the engineering community’s leading AI conference puts out a call for papers, the industry responds. This year, CVPR saw a 13% rise in submissions, for a total of 13,008 papers from more than 40,000 unique authors around the world.

“We’ve been on that exponential for a while,” explained Phillip Isola, CVPR 2025 Program Co-Chair and an associate professor at the Massachusetts Institute of Technology (MIT) in Cambridge, Mass., U.S. “AI in general is the big thing, and students are getting degrees in it. So, the community is just getting bigger and bigger.”

But beyond that general trend, a closer look at the program reveals where the community is concentrating its attention. Advances in particular areas have given rise to new streams of research, drawing more papers on new topics. Specifically, three areas are emerging as hot topics for 2025:

1. 3D from Multi-View and Sensors – This topic brought in a large number of submissions to the CVPR technical program, with good reason: image-based research has grown from exploring single images or 2D renderings to the more complex task of reconstructing and evaluating scenes in 3D. The introduction of Neural Radiance Fields (NeRF) in 2020 spurred a new channel of research efforts.

“From 2020, since NeRF was first published, there has been this trend of using a deep network to perform 3D reconstruction. And now we have Gaussian splatting, which furthers this trend,” shared Fuxin Li, CVPR 2025 Program Co-Chair, and an associate professor at Oregon State University in Corvallis, Ore., U.S. “So basically, computer vision and graphics are converging. We have this neural rendering research, and it definitely drives a significant growth of the papers on the 3D side.”

2. Image and Video Synthesis – As research evolves, so does the ability to generate more accurate representations of an environment in image and video formats. Exploration in this area has become a focal point of CVPR 2025 papers, with image and video synthesis landing as one of the largest categories at this year’s conference.

“One of the big trends this year in commercial chatbots is they’ve become multimodal; they now analyze and generate not just text but also images and sometimes videos,” explained Isola. “On the horizon is the ability to generate complete interactive worlds. The image, video, and world synthesis methods being presented at CVPR are paving the way toward this kind of technology.”

3. Multimodal Learning, and Vision, Language, and Reasoning – While these were listed as two separate topics of interest in the call for papers, combined they make up one of the largest categories of submitted papers. The volume of work in these areas may point to new trends to watch at this year’s conference.

CVPR: The great equalizer

With the paper acceptance rate hovering at a low 22%, every paper presented at CVPR has earned its spot on the program. While paper submissions reflect the field’s enthusiasm for particular subject areas, the program chairs emphasize that CVPR serves as the great equalizer for the field, focusing on the research that deserves recognition, not just on the players with the loudest voices.

“CVPR serves a really important purpose in enlarging the voices of the field, not just those from large organizations,” concluded Li. “At CVPR, every paper has the same right. If it’s a poster, it’s a poster. If it’s an oral, it’s an oral. It doesn’t matter who you are. That part is super important to the ecosystem of the computer vision field.”

For more information on CVPR 2025, taking place 11-15 June in Nashville, Tenn., U.S., or to register, visit https://cvpr.thecvf.com/.