Site Reliability Engineer Jobs in USA with Visa Sponsorship
There are 758+ site reliability engineer positions currently offering visa sponsorship in the United States. The most common visa types for these roles are H-1B, Green Card, and E-3. Top hiring companies include MongoDB, Apple, and SS&C Technologies, among others. Salaries for sponsored positions range from $150K to $216K.
About Baseten
Baseten powers mission-critical inference for the world's most dynamic AI companies, like Cursor, Notion, OpenEvidence, Abridge, Clay, Gamma and Writer. By uniting applied AI research, flexible infrastructure, and seamless developer tooling, we enable companies operating at the frontier of AI to bring cutting-edge models into production. We're growing quickly and recently raised our $300M Series E, backed by investors including BOND, IVP, Spark Capital, Greylock, and Conviction. Join us and help build the platform engineers rely on to ship AI products.
THE ROLE
You will serve as the primary post-sales technical owner for our most strategic customers, ensuring smooth deployment, performance, and reliability of ML workloads in production. You’ll own the technical success and long-term outcome of our strategic and enterprise accounts by managing and resolving escalations, maintaining and improving our runbooks, and proactively surfacing patterns that drive product improvements. This role blends hands-on debugging, infrastructure expertise, AI model performance monitoring, and executive-level customer-facing ownership. You’re of course not alone in this endeavor: you’ll partner closely with product, engineering, and forward-deployed teams to remove technical friction, drive adoption, and ensure long-term success for high-value accounts.
Responsibilities
- Diagnose and resolve runtime issues related to latency, memory behavior, GPU utilization, concurrency, and model lifecycle management.
- Debug infrastructure issues across Kubernetes (pods, controllers), networking, observability, and alerting systems.
- Lead incident response during outages or escalations, managing coordination between Product, FDE, Sales, and Engineering.
- Serve as the technical owner for top enterprise accounts with strict SLAs and high responsiveness expectations.
- Identify common failure modes and translate user feedback into roadmap signals, product improvements, and updates to our internal runbooks, knowledge bases, and diagnostic best practices.
- Own project coordination end-to-end (scoping, execution, communication, and stakeholder alignment across technical and non-technical teams) for work ranging from feature requests and new deployments to operational debugging issues.
Requirements
- Deep Kubernetes troubleshooting expertise, including advanced resource debugging, pod/runtime analysis, and log-based diagnostics using observability tooling such as Grafana, Loki, and Prometheus.
- Strong infrastructure debugging ability across container orchestration, networking, and service dependencies, with hands-on experience supporting production-grade clusters.
- Experience managing high-severity incidents with major customers, including SLAs, post-incident reviews, and clear communication throughout escalations.
- Proven project management and organizational skills with an ownership mindset, able to manage multiple complex, multi-stakeholder initiatives in parallel — including issue resolution, root-cause analysis, and feature delivery.
- Ability to translate recurring technical pain points into roadmap-level insights, documentation improvements, or product enhancements.
- Strong communication skills and executive presence during high-visibility situations, ensuring technical clarity and customer confidence.
- 3+ years of experience in a fast-paced, high-growth, or customer-facing engineering environment.
BONUS
- Familiarity with running high-performance AI models and workloads, including troubleshooting ML pipelines from preprocessing through inference and serving.
- Experience implementing or managing ticketing and incident-response systems such as Zendesk or Pylon.
- Familiarity with Helm, Flux, CI/CD tooling, or scripting automations to improve deployment, release, or operational workflows.
Benefits
- Competitive compensation, including meaningful equity.
- 100% coverage of medical, dental, and vision insurance for employee and dependents.
- Generous PTO policy, including a company-wide Winter Break (our offices are closed from Christmas Eve to New Year's Day!).
- Paid parental leave.
- Company-facilitated 401(k).
- Exposure to a variety of ML startups, offering unparalleled learning and networking opportunities.
Compensation Range: $150K - $225K
Apply now to embark on a rewarding journey in shaping the future of AI! If you are a motivated individual with a passion for machine learning and a desire to be part of a collaborative and forward-thinking team, we would love to hear from you.
At Baseten, we are committed to fostering a diverse and inclusive workplace. We provide equal employment opportunities to all employees and applicants without regard to race, color, religion, gender, sexual orientation, gender identity or expression, national origin, age, genetic information, disability, or veteran status.

How to Get Visa Sponsorship as a Site Reliability Engineer
Emphasize software engineering skills in your application materials
SRE is distinguished from traditional operations by its foundation in software engineering. Highlight your ability to write production-grade code in Python, Go, or Java for automation, monitoring, and tooling. This engineering emphasis is exactly what makes SRE qualify as a specialty occupation and what differentiates it from systems administration roles.
Target companies that operate large-scale distributed systems
Companies like Google, Netflix, Uber, and Spotify run services at scales that require dedicated SRE teams to maintain reliability. These organizations have mature SRE practices and understand the specialized skills required, making them more willing to sponsor. Their scale also means they have ongoing SRE hiring needs rather than one-off positions.
Demonstrate experience with observability and incident management
Proficiency with monitoring tools like Prometheus, Grafana, Datadog, or PagerDuty, combined with experience leading incident response and conducting blameless postmortems, is central to SRE work. Employers evaluate SRE candidates heavily on their ability to diagnose production issues under pressure. Describing specific incidents you managed and the systemic improvements you implemented afterward shows the kind of expertise that justifies sponsorship.
Learn to define and manage service level objectives
SLOs, SLIs, and error budgets are core SRE concepts that distinguish the role from general DevOps. Being able to articulate how you defined reliability targets, measured them with service level indicators, and used error budgets to balance feature velocity with system stability demonstrates domain-specific knowledge. This vocabulary and framework are specific to SRE and signal to employers that you understand the discipline.
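To make the relationship between these three terms concrete, here is a minimal sketch of the arithmetic, assuming a simple availability SLO measured either in time or in requests. The function names, the 30-day window, and the sample numbers are illustrative assumptions, not figures from any employer or listing mentioned here.

```python
def availability_sli(good_requests: int, total_requests: int) -> float:
    """SLI: the fraction of requests that were served successfully."""
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    return good_requests / total_requests


def error_budget_minutes(slo_target: float, window_days: int = 30) -> float:
    """Downtime (in minutes) that an availability SLO permits over the window."""
    total_minutes = window_days * 24 * 60
    return (1.0 - slo_target) * total_minutes


def budget_remaining(slo_target: float, total_requests: int, failed_requests: int) -> float:
    """Fraction of the request-based error budget still unspent.

    A negative result means the budget is overspent.
    """
    allowed_failures = (1.0 - slo_target) * total_requests
    if allowed_failures <= 0:
        raise ValueError("a 100% SLO leaves no error budget")
    return 1.0 - failed_requests / allowed_failures


if __name__ == "__main__":
    # Hypothetical service with a 99.9% availability SLO.
    print(availability_sli(999_750, 1_000_000))       # measured SLI
    print(error_budget_minutes(0.999))                # roughly 43.2 minutes per 30 days
    print(budget_remaining(0.999, 1_000_000, 250))    # roughly 0.75 of the budget left
```

Under these assumptions, a team that has spent only a quarter of its budget has room to keep shipping features, while a team near zero would typically slow releases and prioritize stability work. Being able to walk through this trade-off with real numbers from your own experience is what the advice above is pointing at.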
Consider SRE roles at financial services companies
Bloomberg, Goldman Sachs, and Two Sigma employ SREs to maintain the reliability of trading platforms and financial data systems where downtime has immediate monetary consequences. These firms pay competitively with top tech companies and have established H-1B sponsorship processes. The high-stakes nature of financial systems makes SRE expertise particularly valued and well-compensated in this sector.
Frequently Asked Questions
What is a site reliability engineer, and does the role qualify for visa sponsorship?
Site reliability engineering (SRE) is a discipline that applies software engineering principles to IT operations, focusing on system availability, performance, and scalability. SRE roles typically require a bachelor's degree in computer science or software engineering and involve writing code to automate infrastructure management. This combination of software engineering and operational expertise makes SRE a strong fit for the H-1B specialty occupation classification.
Which companies are known for sponsoring SRE positions?
Google, which originated the SRE concept, along with LinkedIn, Dropbox, Twitter, and other large tech companies have well-established SRE teams that regularly sponsor visa holders. Financial institutions like Bloomberg and major banks also hire SREs for their trading and risk platforms. Any company operating large-scale distributed systems is a potential SRE employer.
How is an SRE role classified differently from a DevOps engineer for visa purposes?
While both roles deal with infrastructure and automation, SRE emphasizes software engineering as the primary skill set: writing code to solve operational problems, building monitoring systems, and defining service level objectives. DevOps tends to focus more on CI/CD pipelines and deployment automation. For visa purposes, the SRE role's heavier emphasis on software development can strengthen the specialty occupation argument.
Do SRE roles offer competitive enough salaries for H-1B prevailing wage requirements?
SRE salaries are among the highest in software engineering, typically ranging from $130,000 to $200,000 at major tech companies. These compensation levels exceed prevailing wage thresholds by a significant margin in most metro areas. The combination of software engineering skills and operational expertise commands a premium that comfortably supports H-1B wage requirements. You can look up current prevailing wage rates for any occupation and location using the OFLC Wage Search tool.