Location: West Hollywood / Los Angeles, CA
Work Model: On-site (5 days per week)
Employment Type: Full-Time
Compensation: $200,000–$300,000+ USD (depending on experience and seniority), plus a competitive sign-on bonus.
Applicants must be legally authorized to work in the United States. Visa sponsorship is not available for this role.
About the Opportunity
Our client is a well-funded, early-stage AI company building a next-generation intelligence platform for high-stakes, real-world decision making.
The platform ingests and fuses data from satellite feeds, autonomous sensors, logistics networks, enterprise systems, and open-source intelligence (OSINT) to power production AI/ML workloads, knowledge graphs, and intelligent decision-making systems.
This is not a traditional SaaS, DevOps, or chatbot company. The engineering team is building production AI infrastructure where reliability, scalability, security, and developer productivity are mission-critical.
We're looking for a Senior, Lead, or Principal Platform Engineer who enjoys building platforms—not simply maintaining them. You'll own the cloud infrastructure, Kubernetes platform, CI/CD and GitOps workflows, infrastructure automation, and internal developer platform that enables engineering teams to build and deploy production AI systems at scale.
This is a highly collaborative, hands-on engineering role with significant ownership and influence over the platform architecture.
The Role
As a Platform Engineer, you'll design, build, and operate the infrastructure that powers complex AI/ML workloads, while creating the internal tooling and platform capabilities that help software engineers move faster and more reliably.
The ideal candidate has a strong software engineering foundation, deep cloud infrastructure expertise, and experience owning production Kubernetes environments from design through day-to-day operations.
Key Responsibilities
Platform Engineering
- Design, build, and operate scalable cloud infrastructure supporting production AI/ML workloads.
- Own Kubernetes infrastructure, including architecture, networking, security, upgrades, scaling, and operational reliability.
- Build and evolve an internal developer platform that improves engineering productivity and deployment velocity.
- Develop self-service infrastructure and automation that enables engineering teams to ship software quickly and safely.
- Continuously improve developer experience through platform engineering best practices.
Cloud Infrastructure & DevOps
- Design and implement modern CI/CD and GitOps workflows for production environments.
- Build reusable Infrastructure-as-Code solutions using Terraform and related tooling.
- Architect highly available, resilient, and cost-efficient cloud infrastructure.
- Drive adoption of containerization, Kubernetes, and cloud-native infrastructure across engineering teams.
- Support AI-powered development workflows using tools such as Claude Code, Cursor, GitHub Copilot, or similar technologies.
AI Infrastructure
- Build and optimize infrastructure supporting GPU-accelerated machine learning workloads.
- Improve GPU provisioning, scheduling, utilization, and resource management.
- Support scalable infrastructure for model training, inference, and AI services deployed in production.
- Partner closely with AI engineers to optimize platform performance and reliability.
Reliability & Operations
- Lead the investigation and resolution of complex production incidents across cloud infrastructure, Kubernetes, networking, and applications.
- Perform root-cause analysis and implement long-term improvements that increase reliability.
- Build comprehensive monitoring, alerting, logging, and observability solutions.
- Drive platform reliability, performance optimization, and operational excellence.
Collaboration & Architecture
- Partner with software engineers, AI engineers, security teams, and technical leadership on platform architecture decisions.
- Produce technical design documentation for major infrastructure initiatives.
- Champion engineering best practices around automation, scalability, security, testing, and reliability.
- Evaluate emerging technologies that improve infrastructure capabilities and developer productivity.
Required Qualifications
- Bachelor's degree in Computer Science, Software Engineering, Information Technology, or a related technical discipline (Master's preferred).
- 5+ years of experience building and operating production cloud infrastructure in Platform Engineering, DevOps, or Site Reliability Engineering (SRE) roles.
- Strong software engineering foundation with experience building automation, tooling, services, or developer platforms using Python, Go, Bash, or similar languages.
- Demonstrated ownership of production Kubernetes clusters, including architecture, networking, upgrades, scaling, and operational support.
- Hands-on experience designing and building Infrastructure-as-Code solutions using Terraform, including authoring reusable modules.
- Strong experience designing and building CI/CD and GitOps pipelines, not simply maintaining existing pipelines.
- Deep experience with Google Cloud Platform (GCP) and/or AWS.
- Strong understanding of containerization technologies, including Docker and Kubernetes.
- Experience building and operating production-scale distributed systems.
- Strong troubleshooting skills across cloud infrastructure, Kubernetes, networking, and applications.
- Experience with observability platforms such as Prometheus, Grafana, Datadog, ELK, or equivalent.
- Excellent communication and collaboration skills.
Preferred Qualifications
Experience with one or more of the following is highly desirable:
- AI/ML infrastructure and GPU-accelerated workloads.
- NVIDIA GPU infrastructure and CUDA environments.
- Internal developer platforms and self-service infrastructure.
- GitOps methodologies.
- AI-native development tools such as Claude Code, Cursor, GitHub Copilot, or Codex.
- Security-focused environments including DevSecOps practices.
- Air-gapped, sovereign, or highly regulated deployment environments.
- Defense, aerospace, government, or other mission-critical industries.
- FedRAMP, ITAR, CMMC, or similar compliance frameworks.
- Serverless architectures and distributed systems.
What We're Looking For
Successful candidates will demonstrate:
- A platform engineering mindset with experience designing, building, and owning infrastructure, not simply maintaining existing environments.
- A strong software engineering foundation and passion for automation.
- Experience building platforms and internal tooling that improve developer productivity.
- Excellent systems thinking across cloud infrastructure, Kubernetes, networking, security, and distributed systems.
- A high level of ownership and comfort working in fast-moving environments with significant technical responsibility.
- A pragmatic approach to balancing reliability, scalability, security, and developer experience.
Compensation & Benefits
- Base salary: $200,000–$300,000+, depending on experience and seniority.
- Competitive sign-on bonus.
- Comprehensive benefits package.
- Opportunity to join a well-funded, high-growth AI company at an early stage with significant technical ownership.
- Long-term career growth with opportunities to take on broader platform and infrastructure leadership responsibilities as the organization continues to scale.
Why Join?
- Build production infrastructure powering real-world AI systems, not internal IT or traditional enterprise DevOps.
- Own the Kubernetes platform, developer experience, and cloud infrastructure that enables AI engineers to move faster.
- Work alongside a highly technical engineering team solving challenging platform and infrastructure problems.
- Support GPU-accelerated AI/ML workloads deployed in production.
- Help shape the technical foundation of a rapidly growing AI company where engineering quality, ownership, and innovation are highly valued.
If you're passionate about Platform Engineering, cloud infrastructure, Kubernetes, automation, and building the systems that power next-generation AI applications, we'd love to hear from you.
Candidates located anywhere in the U.S. are encouraged to apply. Please note that the role is on-site in West Hollywood / Los Angeles, CA, and relocation assistance is not provided. The company offers a competitive sign-on bonus for successful hires.