Remote Site Reliability Engineer Jobs
Remote Site Reliability Engineer jobs are in active demand at remote-first companies and large distributed teams, with openings from junior to senior level at employers including IT Labs, Cognitive Medical Systems, and Five9. Scan the live roles below and apply to the ones that fit.
Showing 5 of 194+ Remote Site Reliability Engineer jobs
INTRODUCTION
Optum is a global organization that delivers care, aided by technology, to help millions of people live healthier lives. The work you do with our team will directly improve health outcomes by connecting people with the care, pharmacy benefits, and data they need to feel their best. Here, you will find a culture guided by diversity and inclusion, talented peers, comprehensive benefits, and career development opportunities. Come make an impact on the communities we serve as you help us advance health equity on a global scale. Join us to start Caring. Connecting. Growing together.
The Site Reliability Engineering (SRE) team at Optum Financial ensures world-class reliability, scalability, security, compliance, and performance of a scalable infrastructure platform that powers diverse financial products. We exist so our customers, partners, and engineers can trust and innovate financial products without fear and with velocity. As a Senior SRE, you will lead our mission to own the tools, platforms, and processes that enable success. Our team is driving modern observability practices with OpenTelemetry and the adoption of SLOs as reliability measures. You will be instrumental in automating our environment and building AI-enhanced platforms to support the next generation of financial technology.
You will enjoy the flexibility to telecommute* from anywhere within the U.S. as you take on some tough challenges.
PRIMARY RESPONSIBILITIES:
- Design, develop, and deploy AI-powered solutions to address complex infrastructure and reliability challenges with an emphasis on the responsible use of AI
- Implement and support observability and monitoring solutions using tools such as OpenTelemetry, Datadog, Splunk, and Dynatrace to improve system visibility and reliability
- Define, implement, and maintain service level indicators (SLIs), service level objectives (SLOs), and actionable alerting strategies in partnership with engineering teams
- Use and evaluate enterprise-approved AI tools to streamline workflows, automate tasks, and drive continuous improvement across the platform
- Develop and maintain automation to improve operational efficiency, including alerting, incident analysis, and recovery workflows
- Support incident response processes, including troubleshooting, root cause analysis (RCA), and implementation of corrective actions to prevent recurrence
- Support cloud-based infrastructure (Azure or AWS) and containerized environments (Kubernetes, Docker) to enhance scalability, stability, and efficiency
- Evaluate emerging technology trends to inform solution design and strategic innovation for the SRE platform
- Contribute to the development of SRE platform capabilities, including self-healing systems and automated operational processes
- Partner with cross-functional teams to promote adoption of SRE best practices and improve overall system reliability
You'll be rewarded and recognized for your performance in an environment that will challenge you and give you clear directions on what it takes to succeed in your role as well as provide development for other roles you may be interested in.
REQUIRED QUALIFICATIONS:
- 5+ years of experience in software engineering, DevOps, or Site Reliability Engineering (SRE) roles
- 2+ years of experience implementing and supporting observability and monitoring tools (e.g., OpenTelemetry, Datadog, Splunk, Dynatrace)
- 2+ years of experience defining and maintaining SLIs, SLOs, and production alerting strategies
- 2+ years of experience working in cloud environments (Azure or AWS)
- 1+ years of experience supporting containerized applications (e.g., Kubernetes, Docker)
PREFERRED QUALIFICATIONS:
- Bachelor's degree in Computer Science, Information Technology, or a related field
- 2+ years of experience with CI/CD tools (e.g., Jenkins, GitHub Actions, ArgoCD)
- 1+ years of experience with infrastructure as code tools (e.g., Terraform, Pulumi)
- 1+ years of experience participating in incident response and root cause analysis (RCA) processes
- Direct experience developing automation for operational workflows or reliability engineering tasks
- Exposure to AI/ML concepts or practical experience applying automation to improve operational efficiency
*All Telecommuters will be required to adhere to UnitedHealth Group's Telecommuter Policy.
COMPENSATION
- Salary Range: $91,700 - $163,700 annually based on full-time employment
Pay is based on several factors including but not limited to local labor markets, education, work experience, certifications, etc. In addition to your salary, we offer benefits such as a comprehensive benefits package, incentive and recognition programs, equity stock purchase and 401k contribution (all benefits are subject to eligibility requirements). No matter where or when you begin a career with us, you'll find a far-reaching choice of benefits and incentives. We comply with all minimum wage laws as applicable.
Pursuant to the San Francisco Fair Chance Ordinance, we will consider for employment qualified applicants with arrest and conviction records.
Application Deadline: This will be posted for a minimum of 2 business days or until a sufficient candidate pool has been collected. Job posting may come down early due to volume of applicants.
At UnitedHealth Group, our mission is to help people live healthier lives and make the health system work better for everyone. We believe everyone - of every race, gender, sexuality, age, location, and income - deserves the opportunity to live their healthiest life. Today, however, there are still far too many barriers to good health which are disproportionately experienced by people of color, historically marginalized groups, and those with lower incomes. We are committed to mitigating our impact on the environment and enabling and delivering equitable care that addresses health disparities and improves health outcomes - an enterprise priority reflected in our mission.
UnitedHealth Group is an Equal Employment Opportunity employer under applicable law and qualified applicants will receive consideration for employment without regard to race, national origin, religion, age, color, sex, sexual orientation, gender identity, disability, or protected veteran status, or any other characteristic protected by local, state, or federal laws, rules, or regulations.
UnitedHealth Group is a drug-free workplace. Candidates are required to pass a drug test before beginning employment.
See All 194+ Remote Site Reliability Engineer Jobs
Find roles that match your experience and apply in just a few clicks.
Remote Site Reliability Engineer Job Market
Who's Hiring
- IT Labs: 42
- Cognitive Medical Systems: 7
- Five9: 5
- Capgemini: 4
- nSCALE: 3
Top Industries Hiring
- Technology & Software: 63
- Consulting & Professional Services: 11
- Investment & Asset Management: 4
- Insurance: 3
- Sports & Recreation: 3
What Employers Look For
The qualifications that appear most often in remote site reliability engineer jobs.
- Proficiency with container orchestration platforms such as Kubernetes and Docker
- Experience with infrastructure-as-code tools including Terraform or Pulumi
- Hands-on background with cloud platforms such as AWS, GCP, or Azure
- Fluency in at least one scripting or programming language such as Python or Go
- Experience designing and maintaining observability stacks using tools like Prometheus, Grafana, or Datadog
- Familiarity with CI/CD pipelines and deployment automation tooling
Tips for Your Remote Site Reliability Engineer Job Search
Quantify your reliability impact clearly
Recruiters and hiring managers scan for SLO, SLA, and SLI metrics you've owned or improved. Include uptime percentages you maintained, incident response times you reduced, and toil-automation wins. Numbers tied to reliability engineering stand out far more than general infrastructure experience.
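To put a number on an uptime claim, it can help to translate the percentage into the downtime it actually permits. The sketch below is illustrative only: the function name `allowed_downtime_minutes` and the 30-day window are assumptions made for this example, not part of any listing or standard tool.

```python
def allowed_downtime_minutes(availability_pct: float, window_days: int = 30) -> float:
    """Return the minutes of downtime an availability target permits over a window.

    Hypothetical helper for illustration; the 30-day default window is an assumption.
    """
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability_pct must be between 0 and 100")
    total_minutes = window_days * 24 * 60  # minutes in the measurement window
    return total_minutes * (1 - availability_pct / 100)


if __name__ == "__main__":
    # Compare a few common availability targets over a 30-day window.
    for target in (99.0, 99.9, 99.99):
        minutes = allowed_downtime_minutes(target)
        print(f"{target}% over 30 days allows about {minutes:.1f} minutes of downtime")
```

Under these assumptions, a 99.9% target over 30 days leaves roughly 43 minutes of permitted downtime, which is a more concrete resume figure than the percentage alone.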
Tailor your resume to the stack
Site reliability engineer postings vary widely between Kubernetes-heavy shops, Terraform-centric teams, and observability-first orgs. Read each job description for the exact tooling mentioned and mirror that language in your resume. A generic SRE resume loses to one matched to the hiring team's actual stack.
Apply early to roles that fit
Migrate Mate lists site reliability engineer openings from across the United States in one place, so you can find roles that match and apply directly to each listing.
Highlight on-call ownership and postmortems
Many candidates skip on-call history, but hiring teams treat it as proof you've operated systems under real pressure. Note the scale of systems you were on-call for, any blameless postmortem culture you contributed to, and runbooks or playbooks you authored or standardized.
Prepare for system design and failure scenarios
SRE interviews almost always include a distributed systems design question and at least one incident simulation or troubleshooting walkthrough. Practice narrating your diagnostic reasoning out loud, covering how you'd isolate a latency spike or cascading failure across dependent services.
Negotiate scope alongside compensation
When evaluating an offer, ask specifically about on-call rotation frequency, escalation paths, and headcount on the SRE team. Understaffed teams mean heavier rotations. Understanding operational load before you accept is as important as any other term in the offer.
Remote Site Reliability Engineer Jobs: Frequently Asked Questions
How do I get a remote site reliability engineer job?
Target companies that already run distributed teams, since they hire remotely by default and know how to onboard someone they never meet in person. Remote site reliability engineer employers screen hard for self-direction and clear written communication on top of the core skills, so show evidence you can own work without someone over your shoulder. Apply to the openings above that match your experience.
Which companies hire remote site reliability engineers?
Employers currently hiring remote site reliability engineers include IT Labs, Cognitive Medical Systems, and Five9, per current remote listings on Migrate Mate as of June 2026. Remote-first firms and large companies running distributed teams post the most remote site reliability engineer roles.
Can you get a remote site reliability engineer job with no experience?
Yes, but it is harder than landing an on-site role, because remote work expects you to operate independently from the start. Entry-level remote site reliability engineer openings do exist, especially at remote-first companies, and a portfolio of real work helps more than a long resume. Applying broadly to the roles that fit improves your odds.
Do you need a degree for remote site reliability engineer jobs?
Not always. Many employers hire remote site reliability engineers on demonstrated skills and prior work rather than a specific degree, though some larger companies still prefer one. Showing relevant results matters more than a credential for most remote site reliability engineer roles.
Which industries hire the most remote site reliability engineers?
The sectors hiring the most remote site reliability engineers are Technology & Software, Consulting & Professional Services, and Investment & Asset Management, based on current remote listings on Migrate Mate as of June 2026. These sectors run distributed teams and hire site reliability engineers remotely most consistently.