LLM Engineer Green Card Jobs
LLM Engineer roles sit squarely within the EB-2 and EB-3 sponsorship pathways, where employers must complete PERM labor certification before filing the I-140 immigrant petition. Demand for professionals who design, fine-tune, and deploy large language models has pushed more U.S. tech employers to sponsor green cards for qualified foreign candidates.
INTRODUCTION
NVIDIA is seeking a Principal Engineer to drive the performance of large-scale AI training and post-training workloads across NVIDIA’s full hardware and software stack. This role sits at the intersection of distributed training, GPU architecture, systems software, deep learning frameworks, and performance engineering. You will analyze and optimize frontier-scale LLM workloads running on thousands of GPUs, drive improvements across frameworks such as PyTorch, JAX, NeMo, and NeMo RL, and use insights from real workloads to help shape future NVIDIA GPU, system, and software roadmaps.
We are looking for a deeply technical leader who can operate across abstraction layers: from application-level training behavior to framework/runtime internals, CUDA libraries, communication collectives, memory systems, networking, and GPU architecture. At this level, success means both directly improving performance and setting technical direction, raising the bar for the organization, and influencing cross-functional decisions across NVIDIA.
What you will be doing:
- Lead end-to-end performance analysis and optimization of innovative LLM pre-training and post-training workloads on the latest NVIDIA hardware and software platforms.
- Drive workloads closer to speed-of-light performance by identifying and removing bottlenecks across compute, memory, communication, scheduling, parallelism strategy, kernel efficiency, framework overhead, and system-level scaling.
- Develop production-quality software, tools, models, benchmarks, and analysis infrastructure that improve training performance, efficiency, and developer velocity across NVIDIA’s AI software stack.
- Build and refine performance models, workload characterizations, and simulation methodologies to guide future GPU, networking, system, and software architecture decisions.
- Serve as a technical authority for AI training performance, partnering closely with teams across GPU architecture, systems, CUDA libraries, compilers, networking, frameworks, product management, and applied AI.
- Translate workload insights into concrete hardware and software recommendations, and advocate for changes that improve performance and efficiency across the AI ecosystem.
- Mentor and provide technical leadership to engineers across the organization, helping establish best practices for large-scale AI performance analysis and optimization.
What we need to see:
- An MS or PhD (or equivalent experience) in Computer Science, Electrical Engineering, Computer Engineering, or a related field, with 12+ years of relevant work or research experience.
- Demonstrated principal-level technical impact in one or more of the following areas: large-scale AI training systems, GPU performance optimization, distributed systems, high-performance computing, ML frameworks, compilers/runtimes, or hardware/software co-design.
- Deep hands-on experience analyzing and optimizing performance of large-scale deep learning workloads, especially transformer-based models, LLM pre-training, reinforcement learning, fine-tuning, or other post-training workloads.
- Strong understanding of GPU and AI accelerator architecture from individual accelerators to datacenter-scale systems.
- Experience with distributed training techniques such as data parallelism, tensor parallelism, pipeline parallelism, expert parallelism, sequence parallelism, activation checkpointing, mixed precision training, and communication/computation overlap (a minimal sketch combining a few of these follows this list).
- A strong track record of using profiling, tracing, benchmarking, and performance modeling tools to diagnose complex bottlenecks and drive measurable improvements.
- Excellent communication and technical leadership skills, with the ability to influence architecture and software decisions across multiple teams without relying on direct authority.
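For context, the sketch below shows one minimal way to combine a few of the techniques named above in PyTorch: data parallelism via DistributedDataParallel, fp16 mixed precision, activation checkpointing, and a basic profiler pass. It is an illustrative toy under those assumptions, not NVIDIA's training stack; the model, sizes, and launch command are placeholders.

```python
# Minimal sketch, assuming PyTorch + NCCL: data parallelism (DDP), fp16 mixed
# precision, activation checkpointing, and a short profiling pass. The model,
# dimensions, and launch command are illustrative placeholders.
import os
import torch
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.checkpoint import checkpoint
from torch.profiler import profile, ProfilerActivity

class Block(nn.Module):
    """Toy feed-forward block; real workloads rely on fused attention/GEMM kernels."""
    def __init__(self, dim: int = 1024):
        super().__init__()
        self.ff = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))

    def forward(self, x):
        # Activation checkpointing: recompute activations in backward to trade FLOPs for memory.
        return x + checkpoint(self.ff, x, use_reentrant=False)

def main():
    dist.init_process_group("nccl")                      # one process per GPU via torchrun
    local_rank = int(os.environ.get("LOCAL_RANK", 0))
    torch.cuda.set_device(local_rank)

    model = DDP(nn.Sequential(*[Block() for _ in range(8)]).cuda())
    opt = torch.optim.AdamW(model.parameters(), lr=3e-4)
    scaler = torch.cuda.amp.GradScaler()                 # loss scaling for fp16

    def train_step():
        x = torch.randn(8, 512, 1024, device="cuda")
        with torch.autocast("cuda", dtype=torch.float16):    # mixed-precision compute
            loss = model(x).float().pow(2).mean()
        scaler.scale(loss).backward()                     # DDP overlaps all-reduce with backward
        scaler.step(opt)
        scaler.update()
        opt.zero_grad(set_to_none=True)

    for _ in range(5):                                    # warm-up steps
        train_step()

    # Profile a few steps to see where time goes (kernels, communication, framework overhead).
    with profile(activities=[ProfilerActivity.CPU, ProfilerActivity.CUDA]) as prof:
        for _ in range(3):
            train_step()
    if dist.get_rank() == 0:
        print(prof.key_averages().table(sort_by="cuda_time_total", row_limit=10))

    dist.destroy_process_group()

if __name__ == "__main__":
    main()   # e.g.: torchrun --nproc_per_node=8 train_sketch.py
```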
GPU computing is the most productive and pervasive platform for deep learning and AI. It begins with the most advanced GPUs and the systems and software we build on top of them. We integrate and optimize every deep learning framework. We work with the major systems companies and every major cloud service provider to make GPUs available in data centers and in the cloud. We craft computers and software to bring AI to edge devices, such as self-driving cars and autonomous robots. AI has the potential to spur a wave of social progress unmatched since the industrial revolution.
This opportunity offers you the ability to collaborate with some of the most forward-thinking and hard-working people in the world, shaping the future of AI in a creative and autonomous work environment that encourages innovation. If you're passionate about working across the full hardware & software stack—from GPU architecture to application code—to achieve optimal performance, we want to hear from you!
COMPENSATION
- Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 272,000 USD - 431,250 USD.
- You will also be eligible for equity and benefits.
Applications for this job will be accepted at least until May 2, 2026.
This posting is for an existing vacancy.
NVIDIA uses AI tools in its recruiting processes.
NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.
Tips for Finding Green Card Sponsorship as an LLM Engineer
Document your LLM specialization before applying
Gather evidence that your work is specific to large language model development, not general software engineering. PERM job descriptions must match your actual duties, so your resume, publications, and project records need to reflect fine-tuning, RLHF, or inference optimization specifically.
Target employers with established PERM filing history
Search OFLC disclosure data to identify tech companies that have filed PERM applications for LLM or machine learning engineer roles. Employers already familiar with the PERM recruitment process move faster and are less likely to withdraw sponsorship mid-process.
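As a rough sketch of how this search might look, assuming you have downloaded a PERM disclosure spreadsheet from the OFLC performance-data page, you can filter it with pandas. The file name and column names (JOB_TITLE, EMPLOYER_NAME, CASE_STATUS) are assumptions to check against the headers in the release you actually download.

```python
# Sketch: scan a downloaded OFLC PERM disclosure file for employers that have
# filed for LLM / machine learning engineer roles. The file name and column
# names below are assumptions -- verify them against the actual release.
import pandas as pd

perm = pd.read_excel("PERM_Disclosure_Data_FY2024.xlsx")   # hypothetical file name

titles = perm["JOB_TITLE"].str.upper().fillna("")
mask = titles.str.contains("LLM|LARGE LANGUAGE MODEL|MACHINE LEARNING ENGINEER", regex=True)

# Keep only certified cases for these titles
llm_cases = perm[mask & (perm["CASE_STATUS"].str.upper() == "CERTIFIED")]

# Employers with the most certified filings for these roles
print(
    llm_cases.groupby("EMPLOYER_NAME")
    .size()
    .sort_values(ascending=False)
    .head(25)
)
```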
Use Migrate Mate to filter green card sponsoring roles
Filter your job search on Migrate Mate to surface LLM Engineer positions at employers with active EB-2 and EB-3 sponsorship history. This cuts out roles where sponsorship is uncertain and focuses your applications on companies already set up to file.
Clarify the EB tier before accepting an offer
Ask the employer's immigration counsel whether your role qualifies under EB-2 or EB-3 before signing. EB-2 requires an advanced degree or a bachelor's plus five years of progressive experience; EB-3 covers bachelor's-level professionals. The distinction affects which visa bulletin category applies and how long you wait, especially if you were born in India or China.
Understand how PERM recruitment timelines affect your start date
DOL requires employers to complete a prescribed recruitment process before filing PERM. This typically adds three to six months before the I-140 is even filed. Negotiating your start date with this window in mind prevents gaps if you're transitioning from a visa status with a hard end date.
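If it helps to put rough dates on this, the sketch below is a back-of-the-envelope timeline calculator. The three-month recruitment window and twelve-month PERM processing figure are illustrative assumptions, not official processing times; plug in current estimates when you negotiate a start date.

```python
# Back-of-the-envelope PERM timeline sketch. Durations are rough assumptions,
# not official DOL/USCIS processing times.
from datetime import date, timedelta

def perm_milestones(recruitment_start: date,
                    recruitment_months: int = 3,
                    perm_processing_months: int = 12) -> dict:
    """Very rough milestones, approximating months as 30 days."""
    month = timedelta(days=30)
    perm_filed = recruitment_start + recruitment_months * month
    perm_decision = perm_filed + perm_processing_months * month
    return {
        "PERM filed (after recruitment)": perm_filed,
        "PERM decision (I-140 can be filed)": perm_decision,
    }

for name, when in perm_milestones(date(2025, 7, 1)).items():
    print(f"{name}: {when:%b %Y}")
```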
Request concurrent filing of I-140 and I-485 if your priority date is current
If your priority date is current for your country of birth when PERM is approved, your employer can file the I-140 and your I-485 concurrently. This can shorten your path to permanent residency by months and lets you apply for interim work and travel authorization (EAD and advance parole) while your case is pending.
Employers across the US are hiring LLM Engineers. Find yours.
Find LLM Engineer Jobs
LLM Engineer Green Card Sponsorship: Frequently Asked Questions
Do LLM Engineer roles qualify for EB-2 or EB-3 green card sponsorship?
Most LLM Engineer positions qualify for EB-2 because they require a master's degree or its equivalent in computer science, machine learning, or a related field. Roles that require only a bachelor's degree typically fall under EB-3, while a bachelor's plus five years of progressive experience can still qualify for EB-2. The employer's immigration counsel determines the tier based on the actual job requirements during PERM filing.
How does green card sponsorship differ from H-1B sponsorship for LLM Engineers?
H-1B is a temporary status that must be renewed every three years and offers no path to permanent residency on its own. EB-2 and EB-3 green card sponsorship goes through PERM labor certification and leads directly to lawful permanent residency. Unlike the H-1B, there is no annual lottery for EB-2 or EB-3, though country-of-birth backlogs affect how long the I-485 stage takes for applicants born in India or China.
What does the PERM process look like for an LLM Engineer role?
The employer files a PERM application with DOL after completing a mandatory recruitment process that tests whether qualified U.S. workers are available for the role. For LLM Engineer positions, the job description must specify technical requirements tied to large language model development. DOL processing currently averages several months to over a year, depending on whether the application is audited.
Where can I find LLM Engineer jobs that include green card sponsorship?
Migrate Mate lets you search specifically for LLM Engineer roles at employers with EB-2 and EB-3 sponsorship history, so you're not guessing whether a company will file. Standard job boards don't filter by PERM filing history, which means most listings don't tell you whether the employer has ever sponsored a green card before.
Can my employer start PERM while I'm on H-1B, and does my status stay protected?
Yes. Employers routinely begin PERM while you're on H-1B, and your status is unaffected during the process. Once your I-140 is approved and your I-485 has been pending for 180 days, AC21 portability lets you change employers or roles without losing your priority date, as long as the new position is in the same or similar occupational classification as the original PERM job.
See which LLM Engineer employers are hiring and sponsoring visas right now.
Search LLM Engineer Jobs