



CVPR 2024 Career Website

Here we highlight career opportunities submitted by our Exhibitors, and other top industry, academic, and non-profit leaders. We would like to thank each of our exhibitors for supporting CVPR 2024. Opportunities can be sorted by job category, location, and filtered by any other field using the search box. For information on how to post an opportunity, please visit the help page, linked in the navigation bar above.


Location South Korea Seoul / Pangyo


Description 1) Deep learning compression and optimization - Development of algorithms for compression and optimization of deep learning networks - Perform deep learning network embedding (requires understanding of HW platform)

2) AD vision recognition SW - Development of deep learning recognition technology based on sensors such as cameras - Development of pre- and post-processing algorithms and function output - Optimization of image recognition algorithms

3) AD decision/control SW - Development of information-based map generation technology recognized by many vehicles - Development of learning-based nearby object behavior prediction model - Development of driving mode determination and collision prevention function of Lv 3 autonomous driving system


Apply

Location Multiple Locations


Description

Members of our team are part of a multi-disciplinary core research group within Qualcomm which spans software, hardware, and systems. Our members contribute technology deployed worldwide by partnering with our business teams across mobile, compute, automotive, cloud, and IoT. We also perform and publish state-of-the-art research on a wide range of topics in machine learning, ranging from general theory to techniques that enable deployment on resource-constrained devices. Our research team has demonstrated first-in-the-world research and proofs of concept in areas such as model efficiency, neural video codecs, video semantic segmentation, federated learning, and wireless RF sensing (https://www.qualcomm.com/ai-research), has won major research competitions such as the visual wake word challenge, and converted leading research into best-in-class user-friendly tools such as Qualcomm Innovation Center’s AI Model Efficiency Toolkit (https://github.com/quic/aimet). We recently demonstrated the feasibility of running a foundation model (Stable Diffusion) with >1 billion parameters on an Android phone in under one second after performing our full-stack AI optimizations on the model.

Role responsibilities can include both applied and fundamental research in the field of machine learning, with a development focus in one or more of the following areas:

  • Conducts fundamental machine learning research to create new models or new training methods in various technology areas, e.g. large language models, deep generative models (VAE, Normalizing-Flow, ARM, etc), Bayesian deep learning, equivariant CNNs, adversarial learning, diffusion models, active learning, Bayesian optimizations, unsupervised learning, and ML combinatorial optimization using tools like graph neural networks, learned message-passing heuristics, and reinforcement learning.

  • Drives systems innovations for model efficiency advancement on device as well as in the cloud. This includes auto-ML methods (model-based, sampling based, back-propagation based) for model compression, quantization, architecture search, and kernel/graph compiler/scheduling with or without systems-hardware co-design.

  • Performs advanced platform research to enable new machine learning compute paradigms, e.g., compute in memory, on-device learning/training, edge-cloud distributed/federated learning, causal and language-based reasoning.

  • Creates new machine learning models for advanced use cases that achieve state-of-the-art performance and beyond. The use cases can broadly include computer vision, audio, speech, NLP, image, video, power management, wireless, graphics, and chip design.

  • Design, develop & test software for machine learning frameworks that optimize models to run efficiently on edge devices. The candidate is expected to have a strong interest in, and deep passion for, making leading-edge deep learning algorithms work on mobile/embedded platforms for the benefit of end users.

  • Research, design, develop, enhance, and implement different components of a machine learning compiler for HW accelerators.

  • Design, implement and train DL/RL algorithms in high-level languages/frameworks (PyTorch and TensorFlow).


Apply

Location Seattle, WA


Description To help a growing organization quickly deliver more efficient features to Prime Video customers, Prime Video’s READI organization is innovating on behalf of our global software development team consisting of thousands of engineers. The READI organization is building a team specialized in forecasting and recommendations. This team will apply supervised learning algorithms for forecasting multi-dimensional related time series using recurrent neural networks. The team will develop forecasts on key business dimensions and recommendations on performance and efficiency opportunities across our global software environment.

As a member of the team, you will apply your deep knowledge of machine learning to concrete problems that have broad cross-organizational, global, and technology impact. Your work will focus on retrieving, cleansing and preparing large scale datasets, training and evaluating models and deploying them for customers, where we continuously monitor and evaluate. You will work on large engineering efforts that solve significantly complex problems facing global customers. You will be trusted to operate with complete independence and are often assigned to focus on areas where the business and/or architectural strategy has not yet been defined. You must be equally comfortable digging in to business requirements as you are drilling into designs with development teams and developing ready-to-use learning models. You consistently bring strong, data-driven business and technical judgment to decisions.


Apply

Natick, MA, United States


The Company Cognex is a global leader in the exciting and growing field of machine vision.

The Team: Vision Algorithms, Advanced Vision Technology This position is in the Vision Algorithms Team of Advanced Vision Technology group, which is responsible for designing and developing the most sophisticated machine vision tools in the world. We combine custom hardware, specialized lighting, optics, and world-class vision algorithms to create software systems that are used to analyze imagery (intensity, color, density, Z-data, ID barcodes, etc.), to detect, identify and localize objects, to make measurements, to inspect for defects, and to read encoded data. Technology development is critical to the overall business to expand areas of application, improve performance, discover new algorithms, and to make use of new hardware and processing power. Engineers in this group typically have experience with image analysis, machine vision, or signal processing.

Job Summary: The Vision Algorithms team is looking for a well-rounded, intelligent, creative, and motivated summer or fall intern with a passion for results! You will work with our senior engineers and technical leads on projects that advance our software development infrastructure and enhance our key technologies and customer experience. You will get mentorship on tackling technical challenges and opportunities to build a solid foundation for your career in Software Engineering, or Computer Vision and Artificial Intelligence.

Essential Functions:

  • Prototype and develop Vision (2D and ID) applications on top of Cognex products and technology.

  • Build internal tools or automated tests that can be used in software development or testing.

  • Understand our products and contribute to creating optimal solutions for customer applications in the automation industry.

  • High energy and motivated learner. Creative, motivated, and looking to work hard for a fast-moving company.

  • Strong analytical and problem-solving skills.

  • Strong programming skills in both C/C++ and Python are required.

  • Solid understanding of machine learning (ML) fundamentals and experience with ML frameworks like TensorFlow or PyTorch are required.

  • Demonstrated projects or internships in the AI/ML domain during academic or professional tenure are highly desirable.

  • Experience with embedded systems, Linux systems, vision/image processing, and optics is valued.

  • Background in 2D vision, 3D camera calibration, and multi-camera systems is preferred.

Minimum education and work experience required: Pursuing an MS or Ph.D. from a top engineering school in EE, CS, or equivalent.

If you would like to meet the hiring manager at CVPR to discuss this opportunity, please email ahmed.elbarkouky@cognex.com


Apply

Redmond, Washington, United States


Overview Within AI Platform, the Cognitive Services team empowers developers and data scientists around the world and of all skill levels to easily add AI capabilities to their apps. #aiplatform

We are looking for a Research Scientist with a background in Computer Vision, Natural Language Processing and/or Artificial Intelligence, including topics like layout analysis, chart understanding, multi-page multi-document question answering, novel ways of leveraging large language models for document understanding and solving problems inherent to large language models (grounding, retrieval-based generation, etc.). Familiarity with modern large language models is a plus, but not required.

Microsoft’s mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.

Responsibilities Your responsibilities will include:

Conduct pioneering research to propel the state-of-the-art in various tasks in document understanding. Work closely with fellow Research Scientists and Product Engineering teams to translate research outcomes into practical solutions. Provide expertise and support to the engineering team on various challenges, fostering collaboration between research and practical application. Take charge of the research agenda from problem definition to algorithm and model development.


Apply

Canberra/Australia


We are looking for outstanding new PhD students for the upcoming scholarship round (applications are due on 31st August 2024) at the Australian National University (ANU is ranked #30 in the QS Ranking 2025) or possibly at other Australian universities.

We are looking for new PhD students to work on new problems that may span (but are not limited to) "clever" adaptation of Foundation Models, LLMs, diffusion models (LoRAs, etc.), NeRF, design of Graph Neural Networks, design of new (multi-modal) Self-supervised Learning and Contrastive Learning Models (masked models, images, videos, text, graphs, time series, sequences, etc.), adversarial and/or federated learning, or other contemporary fundamental/applied problems (e.g., learning without backprop, adapting FMs to be less resource hungry, planning and reasoning, hyperbolic geometry, protein property prediction, structured output generative models, visual relation inference, incremental/learning to learn problems, low shot, etc.).

To succeed, you need an outstanding publication record, e.g., one or more first-author papers in venues such as CVPR, ICCV, ECCV, AAAI, ICLR, NeurIPS, ICML, IJCAI, ACM KDD, ACCV, BMVC, ACM MM, IEEE Trans. on Image Processing, CVIU, IEEE TPAMI, or similar (the list is non-exhaustive). Non-first-author papers will also help if they are in the mix. Patents and/or professional experience in Computer Vision, Machine Learning, or AI are a bonus. You also need a good GPA to succeed.

We are open to discussing your interests and topics; if you reach out, we can discuss what is possible. Yes, we have GPUs.

If you are interested, reach out for an informal chat with Dr. Koniusz. I am at CVPR if you want to chat: piotr.koniusz@data61.csiro.au (or piotr.koniusz@anu.edu.au, www.koniusz.com)


Apply

Location San Francisco, CA


Description Amazon Music is an immersive audio entertainment service that deepens connections between fans, artists, and creators. From personalized music playlists to exclusive podcasts, concert livestreams to artist merch, Amazon Music is innovating at some of the most exciting intersections of music and culture. We offer experiences that serve all listeners with our different tiers of service: Prime members get access to all the music in shuffle mode, and top ad-free podcasts, included with their membership; customers can upgrade to Amazon Music Unlimited for unlimited, on-demand access to 100 million songs, including millions in HD, Ultra HD, and spatial audio; and anyone can listen for free by downloading the Amazon Music app or via Alexa-enabled devices. Join us for the opportunity to influence how Amazon Music engages fans, artists, and creators on a global scale.

You will be managing a team within the Music Machine Learning and Personalization organization that is responsible for developing, training, serving and iterating on models used for personalized candidate generation for both Music and Podcasts.


Apply

Location Santa Clara, CA


Description Amazon is looking for passionate, talented, and inventive Applied Scientists with a strong machine learning background to help build industry-leading Speech, Vision and Language technology.

AWS Utility Computing (UC) provides product innovations, from foundational services such as Amazon’s Simple Storage Service (S3) and Amazon Elastic Compute Cloud (EC2), to consistently released new product innovations that continue to set AWS’s services and features apart in the industry. As a member of the UC organization, you’ll support the development and management of Compute, Database, Storage, Internet of Things (IoT), Platform, and Productivity Apps services in AWS. Within AWS UC, Amazon Dedicated Cloud (ADC) roles engage with AWS customers who require specialized security solutions for their cloud services.

Our mission is to provide a delightful experience to Amazon’s customers by pushing the envelope in Automatic Speech Recognition (ASR), Machine Translation (MT), Natural Language Understanding (NLU), Machine Learning (ML) and Computer Vision (CV).

As part of our AI team in Amazon AWS, you will work alongside internationally recognized experts to develop novel algorithms and modeling techniques to advance the state-of-the-art in human language technology. Your work will directly impact millions of our customers in the form of products and services that make use of speech and language technology. You will gain hands-on experience with Amazon’s heterogeneous speech, text, and structured data sources, and large-scale computing resources to accelerate advances in spoken language understanding.

We are hiring in all areas of human language technology: ASR, MT, NLU, text-to-speech (TTS), and Dialog Management, in addition to Computer Vision.


Apply

Geomagical Labs is a 3D R&D lab, in partnership with IKEA. We create magical mixed-reality experiences for hundreds of millions of users, using computer vision, neural networks, graphics, and computational photography. Last year we launched IKEA Kreativ, and we’re excited for what’s next! We have an opening in our lab for a senior computer vision researcher, with 3D Reconstruction and Deep Learning expertise, to develop and improve the underlying algorithms powering our consumer products. We are looking for highly motivated, creative, applied researchers with entrepreneurial drive who are excited about building novel technologies and shipping them all the way to the hands of millions of customers!

Requirements: Ph.D. and 2+ years of experience, or Master's and 6+ years of experience, focused on 3D Computer Vision and Deep Learning. Experience in classical methods for 3D Reconstruction: SfM/SLAM, Multi-view Stereo, RGB-D Fusion. Experience in using Deep Learning for 3D Reconstruction and/or Scene Understanding, having worked in any of: Depth Estimation, Room Layout Estimation, NeRFs, Inverse Rendering, 3D Scene Understanding. Familiarity with Computer Graphics and Computational Photography. Expertise in ML frameworks and libraries, e.g. PyTorch. Highly productive in Python. Ability to architect and implement complex systems at the micro and macro level. Entrepreneurial: Adventurous, self-driven, comfortable under uncertainty, with a desire to make systems work end-to-end. Innovative; with a track record of patents and/or first-authored publications at leading workshops or conferences such as CVPR, ECCV/ICCV, SIGGRAPH, ISMAR, NeurIPS, ICLR etc. Experience in developing technologies that got integrated into products, as well as post-launch performance tracking and shipping improvements. [Bonus] Comfortable with C++.

Benefits: Join a mission-driven R&D lab, strategically backed by an influential global brand. Work in a dynamic team of computer vision, AI, computational photography, AR, graphics, and design professionals, and successful serial entrepreneurs. Opportunity to publish novel and relevant research. Fully remote work available to people living in the USA or Canada. Headquartered in downtown Palo Alto, California, an easy walk from restaurants, coffee shops and Caltrain commuter rail. The USA base salary for this full-time position ranges from $180,000 to $250,000, determined by location, role, skill, and experience level. Geomagical Labs offers a comprehensive set of benefits, and for qualifying roles, substantial incentive grants, vesting annually.


Apply

Location Seattle, WA New York, NY


Description We are looking for an Applied Scientist to join our Seattle team. As an Applied Scientist, you are able to use a range of science methodologies to solve challenging business problems when the solution is unclear. Our team solves a broad range of problems ranging from natural knowledge understanding of third-party shoppable content, product and content recommendation to social media influencers and their audiences, determining optimal compensation for creators, and mitigating fraud. We generate deep semantic understanding of the photos, and videos in shoppable content created by our creators for efficient processing and appropriate placements for the best customer experience. For example, you may lead the development of reinforcement learning models such as MAB to rank content/product to be shown to influencers. To achieve this, a deep understanding of the quality and relevance of content must be established through ML models that provide those contexts for ranking.

In order to be successful in our team, you need a combination of business acumen, broad knowledge of statistics, deep understanding of ML algorithms, and an analytical mindset. You thrive in a collaborative environment, and are passionate about learning. Our team utilizes a variety of AWS tools such as SageMaker, S3, and EC2, along with a variety of skill sets in shallow and deep learning ML models, particularly in NLP and CV. You will bring knowledge in many of these domains along with your own specialties.


Apply

Location Seattle, WA Palo Alto, CA


Description Amazon’s product search engine is one of the most heavily used services in the world, indexes billions of products, and serves hundreds of millions of customers world-wide. We are working on an AI-first initiative to continue to improve the way we do search through the use of large scale next-generation deep learning techniques. Our goal is to make step function improvements in the use of advanced multi-modal deep-learning models on very large scale datasets, specifically through the use of advanced systems engineering and hardware accelerators. This is a rare opportunity to develop cutting edge Computer Vision and Deep Learning technologies and apply them to a problem of this magnitude. Some exciting questions that we expect to answer over the next few years include:

  • How can multi-modal inputs in deep-learning models help us deliver delightful shopping experiences to millions of Amazon customers?

  • Can combining multi-modal data and very large scale deep-learning models help us provide a step-function improvement to the overall model understanding and reasoning capabilities?

We are looking for exceptional scientists who are passionate about innovation and impact, and want to work in a team with a startup culture within a larger organization.


Apply

Location Mountain View, CA


Description Gatik is thrilled to be at CVPR! Come meet our team at booth 1831 to talk about how you could make an impact at the autonomous middle mile logistics company redefining the transportation landscape.

Who we are: Gatik, the leader in autonomous middle mile logistics, delivers goods safely and efficiently using its fleet of light & medium-duty trucks. The company focuses on short-haul, B2B logistics for Fortune 500 customers including Kroger, Walmart, Tyson Foods, Loblaw, Pitney Bowes, Georgia-Pacific, and KBX; enabling them to optimize their hub-and-spoke supply chain operations, enhance service levels and product flow across multiple locations while reducing labor costs and meeting an unprecedented expectation for faster deliveries. Gatik’s Class 3-7 autonomous box trucks are commercially deployed in multiple markets including Texas, Arkansas, and Ontario, Canada.

About the role: We are seeking passionate Senior/Staff Software Engineers who have strong fundamentals in software development practices and are experts in the C++ language in a production-oriented environment. The ideal candidate is a highly experienced C++ developer with a passion for enabling the world's first safe, reliable & efficient network of autonomous vehicles. You will partner with the research and software engineers to design, develop, test and validate AV features for our autonomous fleet.

This role will be onsite at our Mountain View office.

What you'll do:

  • Design, implement, integrate, and support real-time mission-critical software for Gatik’s autonomy stack

  • Work with the research engineers to develop maintainable, testable and robust software designs

  • Architect and implement solutions to complex issues between components partitioned across the large software stack

  • Be at the forefront of guiding & ensuring best SDLC practices while contributing to improving the safety in the core autonomy stack

  • Collaborate with the Infrastructure and DevOps teams for efficient, secure and scalable software delivery to a network of Gatik’s autonomous fleet

  • Guide and mentor autonomy researchers and algorithm developers to make sure their components are running efficiently and with optimal compute and memory usage

  • Review and refine technical requirements and translate them into high-level design & plans to support the development of safe AV technology

  • Conduct code and design reviews and advise on technical matters

Click the apply button below to see the full job description and apply


Apply

Redmond, Washington, United States


Overview Do you want to shape the future of Artificial Intelligence (AI)? Do you have a passion for solving real-world problems with cutting-edge technologies? Do you enjoy working in a diverse and collaborative team?

The Microsoft Research AI Frontiers group is looking for a Principal Research Software Engineer with demonstrated machine learning experience to advance the state-of-the-art in foundational model-based technologies. Areas of focus on our team include, but are not limited to:

  • Human-AI interaction, collaboration, and experiences

  • Applications of foundation models and model-based technologies

  • Multi-agent systems and agent platform technologies

  • Model, agent, and AI systems evaluation

As a Principal Research Software Engineer on our team, you will need:

  • A drive for real world impact, demonstrated by a passion to build and deploy applications, prototypes, or open-source technologies.

  • Demonstrated experience working with large foundation models and state-of-the-art ML frameworks and toolkits.

  • A team player mindset, characterized by effective communication, collaboration, and feedback skills.

Microsoft’s mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.

In alignment with our Microsoft values, we are committed to cultivating an inclusive work environment for all employees to positively impact our culture every day.

Responsibilities Leverage full-stack software engineering skills to build, test, and deploy robust and intuitive AI based technologies. Work closely with researchers and engineers to rapidly develop and test research ideas and drive a high-impact agenda. Collaborate with product partners to integrate and test new ideas within existing frameworks and toolchains. Embody our culture and values.


Apply

Location Multiple Locations


Description

Qualcomm's Multimedia R&D and Standards Group is seeking candidates for Video Compression Research Engineer positions. You will be part of a world-renowned team of video compression experts. The team develops algorithms, hardware architectures, and systems for state-of-the-art applications of classical and machine learning methods in video compression, video processing, point cloud coding and processing, AR/VR and computer vision use cases. The successful candidate for this position will be a highly self-directed individual with strong creative and analytic skills and a passion for video compression technology. You will work on, but not be limited to, developing new applications of classical and machine learning methods in video compression, improving state-of-the-art video codecs.

We are considering candidates with various levels of experience. We are flexible on location and open to hiring anywhere; preferred locations are the USA, Germany, and Taiwan.

Responsibilities:

  • Contribute to the conception, development, implementation, and optimization of new algorithms extending existing techniques and systems allowing improved video compression.

  • Initiate ideas, design and implement algorithms for superior hardware encoder performance, including perceptually based bit allocation.

  • Develop new algorithms for deep learning-based video compression solutions.

  • Represent Qualcomm in the related standardization forums: JVET, MPEG Video, and ITU-T/VCEG.

  • Document and present new algorithms and implementations in various forms, including standards contributions, patent applications, conference and journal publications, presentations, etc.

Ideal candidate would have the skills/experience below:

  • Expert knowledge of the theory, algorithms, and techniques used in video and image coding.

  • Knowledge and experience of video codecs and their test models, such as ECM, VVC, HEVC and AV1.

  • Experience with deep learning structures (CNN, RNN, autoencoder, etc.) and frameworks like TensorFlow/PyTorch.

  • Track record of successful research accomplishments demonstrated through published papers, and/or patent applications in the fields of video coding or video processing.

  • Solid programming and debugging skills in C/C++.

  • Strong written and verbal English communication skills, great work ethic, and ability to work in a team environment to accomplish common goals.

  • PhD or Masters degree in Electrical Engineering, Computer Science, Physics, Mathematics or similar field, or equivalent practical experience.

Qualifications: PhD or Masters degree in Electrical Engineering, Computer Science, Physics, Mathematics, or similar fields. 1+ years of experience with a programming language such as C, C++, MATLAB, etc.


Apply