Site Reliability Engineer Jobs for OPT Students
Site Reliability Engineer jobs on OPT are available across tech, finance, and cloud infrastructure companies. SRE roles typically require a computer science or engineering degree, making them a strong fit for the STEM OPT extension. If your degree is STEM-designated and your employer is enrolled in E-Verify, the 24-month extension brings your total work authorization to as much as three years.
Role: SRE DevOps Architect
Duration: Long Term Contract
Location: Chicago, IL
Job Description:
We are seeking a high-caliber Lead SRE & Edge Architect to serve as the Sole Custodian of our Integrated Intelligence Ops Ecosystem. This role is critical for bridging the gap between physical hardware security and high-velocity, cloud-native application delivery within a Google Distributed Cloud (GDC) Connected environment. You will lead the orchestration of hybrid workloads (VMs and Containers) and pioneer a Zero-Trust security posture at the edge. This is a "Full-Stack Infrastructure" role, where your responsibility begins at the physical intrusion sensor on the rack and extends to the AIOps-driven self-healing of application workloads.
Key Responsibilities:
Section 1: Scope of Work (SOW)
- Ecosystem Custodianship: Design, deploy, and maintain the bridge between physical hardware security and GDC-native application delivery.
- Hybrid Orchestration: Lead the deployment of VM Runtime on GDC and Symcloud Storage, enabling the co-existence of legacy VM workloads and modern microservices.
- Zero-Trust Engineering: Establish and maintain a "Zero-Trust" architecture using MASQUE/TLS tunnels, per-machine certificates, and hardware-level encryption.
- Full-Lifecycle Ownership: Own the entire stack, from monitoring Physical Intrusion Sensors in the rack to implementing AIOps for workload self-healing.
- Connectivity Management: Ensure seamless and secure GDC-to-GCP communication while maintaining 99.99% uptime for disconnected/local-first operations.
What are the Mandatory skills and skill proficiencies required for this position?
I. Cloud-Native & DevOps Orchestration
- Orchestration: Expert-level mastery of Kubernetes (K8s) and Docker for large-scale distributed systems.
- Infrastructure as Code (IaC): Advanced automation using Terraform (GCP/GDC resources) and Ansible (Bare Metal OS configuration).
- CI/CD & Packaging: Engineering complex Helm Charts and managing pipelines via GitHub Actions, GitLab CI, and Jenkins.
- Networking: Implementing NGINX Load Balancer as a Service and integrating with the Distributed Cloud Edge Network API.
II. Observability & Intelligence Ops
- The "Gold Standard" Stack: Hands-on implementation of Prometheus (metrics), Grafana (visualization), and OpenTelemetry (OTEL) for distributed tracing.
- AIOps & Self-Healing: Proven ability to build closed-loop remediation using Vertex AI to analyze OTEL traces and trigger automated Google Cloud CLI/API scripts.
III. Storage & Virtualization
- Storage Architecture: Designing software-defined storage using Symcloud Storage and local persistent volumes.
- VM Runtime: Deep experience orchestrating legacy VMs alongside containers within a single GDC control plane.
Section 3: Mandatory Security & Hardware Integrity
- Hardware Defense: Knowledge of Physical Intrusion Sensors, Port Lockdown strategies, and Platform Certificates.
- Root of Trust: Managing Trusted Platform Module (TPM) integrations and LUKS for data-at-rest protection.
- Encryption: Implementation of Self-Encrypting Disks (SED) and Cloud KMS for Customer Managed Encryption Keys (CMEK).
- Secure Tunnels: Engineering MASQUE tunnels or TLS-wrapped connections using per-machine certificates.
Section 4: Compliance & Operational Governance
- IAM Governance: Expert knowledge of Distributed Cloud Edge Container API roles and least-privilege access enforcement.
- Resiliency: Implementation of Availability Best Practices (Region/Zone/Rack awareness) for disconnected state reliability.
- Strategic Liaison: Act as the primary technical point of contact for Google-certified System Integrators (SI) to ensure hardware meets GDC specifications.
Preferred Qualifications
- Google Cloud Professional Cloud Architect or Professional Data Engineer certification.
- Experience in highly regulated sectors (Defense, Telco, Banking, or Healthcare).
- Strong background in Linux Kernel tuning and Bare Metal performance optimization.

How to Get Visa Sponsorship as a Site Reliability Engineer
Lead with your on-call and incident response experience
SRE hiring managers want to see that you've owned production systems under pressure. Quantify your impact: mention uptime percentages you maintained, incidents you resolved, or MTTR improvements you drove. Numbers signal operational maturity.
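If you have access to an incident log, the numbers mentioned above can be derived with a few lines of code. The following is a minimal sketch, not a standard tool: the incident timestamps are made-up illustrative data, the function names are hypothetical, and it assumes incidents do not overlap and fall entirely inside the reporting window.

```python
from datetime import datetime, timedelta

# Hypothetical incident log: (detected, resolved) timestamps. Illustrative data only.
INCIDENTS = [
    (datetime(2024, 1, 3, 9, 0), datetime(2024, 1, 3, 9, 45)),
    (datetime(2024, 1, 17, 22, 10), datetime(2024, 1, 17, 22, 40)),
    (datetime(2024, 2, 8, 14, 0), datetime(2024, 2, 8, 14, 15)),
]


def mean_time_to_recovery(incidents):
    """Average time from detection to resolution across all incidents."""
    durations = [resolved - detected for detected, resolved in incidents]
    return sum(durations, timedelta()) / len(durations)


def availability_percent(incidents, window_start, window_end):
    """Share of the window not spent in an incident, as a percentage.

    Assumes incidents do not overlap and lie inside the window.
    """
    downtime = sum((resolved - detected for detected, resolved in incidents), timedelta())
    return 100.0 * (1 - downtime / (window_end - window_start))


mttr = mean_time_to_recovery(INCIDENTS)
uptime = availability_percent(INCIDENTS, datetime(2024, 1, 1), datetime(2024, 3, 1))
print(f"MTTR: {mttr}, availability: {uptime:.3f}%")
```

On a resume, figures like these read best with their scope attached, for example the service they cover and the period they were measured over.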
Certify in cloud platforms before applying
AWS, GCP, or Azure certifications signal hands-on infrastructure credibility to OPT-sponsoring employers. Many SRE job postings list these as preferred qualifications. Earning one before your job search closes the gap between academic projects and production environments.
Target companies with established OPT STEM extension track records
Large tech companies and cloud-native firms routinely support STEM OPT extensions, which require the employer to be enrolled in E-Verify and to sign a Form I-983 training plan. Prioritize employers who have hired SREs on OPT before. Browse Migrate Mate to filter for companies actively hiring OPT students in SRE and infrastructure roles.
Frame your thesis or research as production infrastructure work
If your graduate research involved distributed systems, containerization, or large-scale data pipelines, reframe it in SRE terms. Employers sponsoring OPT care about scope and ownership, not just academic framing. Show you built and maintained something at scale.
Address your OPT timeline proactively in applications
SRE roles often involve long ramp-up periods. Mentioning that your STEM-eligible degree qualifies you for up to 36 months of work authorization reassures employers that onboarding investment is protected. Include this briefly in your cover letter or application notes.
Build a portfolio of reliability tooling and runbooks
Public GitHub repositories showing monitoring dashboards, alerting configurations, or infrastructure-as-code projects demonstrate practical SRE skills. OPT-sponsoring employers want evidence you can contribute immediately, and a portfolio removes uncertainty about your technical readiness before the interview.
Frequently Asked Questions
Do Site Reliability Engineer jobs qualify for the STEM OPT extension?
Yes, in most cases. The 24-month STEM OPT extension is tied to your degree, and SRE work is directly related to degrees in computer science, computer engineering, and related STEM fields. Eligible F-1 students get up to 36 months of total OPT work authorization, which significantly reduces the urgency for employers to file an H-1B immediately.
How do I find SRE jobs that are open to OPT students?
Migrate Mate is built specifically for F-1 OPT students and filters job listings by sponsorship eligibility. Rather than applying broadly and asking about OPT status mid-process, Migrate Mate lets you browse SRE roles at companies already open to hiring students on work authorization, which saves time and avoids awkward conversations late in the hiring process.
What degree fields support an SRE OPT application?
Computer science, computer engineering, electrical engineering, information systems, and software engineering degrees all commonly support SRE OPT work authorization. Your degree field must appear on the DHS STEM Designated Degree Program List for the STEM extension to apply. If your degree is in a borderline field, confirm with your DSO before applying.
Will SRE employers wait for my OPT EAD card to arrive before my start date?
Most established tech employers are familiar with OPT EAD processing timelines and will usually agree on a start date accordingly. You cannot begin working until your EAD card arrives and your OPT start date is reached. You can file your OPT application up to 90 days before your program end date; filing at the start of that window gives you the maximum buffer before a target start date.
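To turn the 90-day rule into calendar dates, a small date calculation is enough. The sketch below is illustrative only: the function name and the example program end date are hypothetical, and the 60-day deadline after the program end date is not stated in this FAQ, so treat it as an assumption to confirm with your DSO.

```python
from datetime import date, timedelta


def opt_filing_window(program_end):
    """Return (earliest, latest) filing dates around the program end date.

    earliest: 90 days before the program end date (from the answer above).
    latest:   60 days after the program end date (assumption, confirm with your DSO).
    """
    earliest = program_end - timedelta(days=90)
    latest = program_end + timedelta(days=60)
    return earliest, latest


# Hypothetical program end date, used only as an example.
earliest, latest = opt_filing_window(date(2025, 5, 15))
print(f"Earliest filing date: {earliest}")
print(f"Latest filing date:   {latest}")
```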
Do I need H-1B sponsorship immediately after getting an SRE job on OPT?
Not immediately. If you qualify for the STEM extension, you have up to 36 months total on OPT before needing H-1B status. That covers roughly three H-1B lottery cycles, which meaningfully improves your odds of selection over time. Many SRE employers file H-1B petitions for OPT employees during this window without requiring you to secure it upfront.
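The "roughly three lottery cycles" figure can be checked with simple date arithmetic. This is a rough sketch under stated assumptions: it treats March 1 of each year as a stand-in for the H-1B registration period (actual dates are announced yearly and may differ), the helper names and the example OPT start date are hypothetical, and month arithmetic clamps the day to 28 for simplicity.

```python
from datetime import date


def add_months(start, months):
    """Shift a date by whole months, clamping the day to 28 so it stays valid."""
    index = start.year * 12 + (start.month - 1) + months
    return date(index // 12, index % 12 + 1, min(start.day, 28))


def march_windows(opt_start, total_months=36):
    """Years whose March 1 falls inside the OPT period.

    March 1 is an assumed proxy for the H-1B registration period.
    """
    opt_end = add_months(opt_start, total_months)
    return [
        year
        for year in range(opt_start.year, opt_end.year + 1)
        if opt_start <= date(year, 3, 1) < opt_end
    ]


# Hypothetical OPT start date, used only as an example.
with_stem = march_windows(date(2025, 7, 1), total_months=36)
without_stem = march_windows(date(2025, 7, 1), total_months=12)
print(f"With STEM extension: {len(with_stem)} cycles {with_stem}")
print(f"Without extension:   {len(without_stem)} cycle(s) {without_stem}")
```

Under these assumptions, the same start date yields three registration periods with the extension and one without it, which is the difference the answer above describes.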