Canary Wharfian - Online Investment Banking & Finance Community.

Senior DevOps Engineer (AI + Azure)

Experienced, no visa sponsorship

at Ernst & Young

Big Four

Posted 15 days ago


**Senior DevOps Engineer (AI + Azure):** Build, run, and secure cloud-AI infrastructure at EY wavespace Madrid - AI & Data Hub. Key responsibilities include Azure infrastructure with Terraform, CI/CD with GitHub Actions/Azure DevOps, Kubernetes/AKS management, LLM and RAG runtime support, observability, and Zero Trust security. Required: 4+ years in DevOps, strong Linux and Terraform skills, deep Azure experience, and CI/CD and container/Kubernetes skills. Nice-to-haves include multi-cloud exposure, Azure AI services, and relevant certifications. You will work in a multicultural, innovative team that values everything as code, small batches, and clear runbooks.

Compensation
Not specified

Currency: Not specified

City
Madrid
Country
Spain

Full Job Description

About Us

At EY wavespace Madrid - AI & Data Hub, we are a diverse, multicultural team at the forefront of technological innovation, working with cutting-edge technologies such as Gen AI, data analytics, and robotics. Our center is dedicated to exploring the future of AI and Data.

 

Overview

We're looking for a Senior DevOps Engineer to build and run cloud and AI infrastructure at scale. You'll own IaC with Terraform, CI/CD, Kubernetes, and Linux. You'll also help run LLM workloads both in Azure and locally (Ollama/vLLM/llama.cpp). Your work will enable fast, secure, repeatable delivery.

 

Key responsibilities

  • Build and maintain Azure infrastructure with Terraform (modules, workspaces, pipelines, policies).
  • Design and operate CI/CD with GitHub Actions and/or Azure DevOps (multi-stage, approvals, environments).
  • Run containers and Kubernetes/AKS (Helm, ingress, autoscaling, node pools, storage).
  • Manage AI/LLM runtime: local model runners (Ollama, vLLM, llama.cpp), GPU/CPU configs.
  • Support RAG: embeddings pipelines, vector DBs (Azure AI Search/Cognitive Search, pgvector, Milvus), data sync, retention.
  • Automate platform tasks with Python (tooling, CLI utilities, API glue, ops scripts).
  • Implement observability (Azure Monitor, Prometheus/Grafana, logs/traces/metrics, alerts, runbooks, SLOs).
  • Apply Zero Trust security: enforce least privilege, role-based access control (RBAC), and identity-based segmentation (Azure AD, Conditional Access, MFA).
  • Implement policy-as-code (OPA, Azure Policy) for compliance.
  • Rotate secrets and certificates via Key Vault; integrate with pipelines.
  • Add continuous security scanning (SAST/DAST, container image scanning).
  • Handle reliability: rollout strategies, health probes, incident response, postmortems.
  • Optimize costs: right-sizing, autoscaling, budgets, tags, reporting.

 

Key requirements

  • 4+ years in DevOps/SRE/Platform Engineering.
  • Strong Linux (shell, systemd, networking, performance troubleshooting).
  • Terraform at scale (modules, state backends, CI/CD integration).
  • Deep Azure experience (AKS, VNets, Key Vault, Storage, Monitor, Identity, Networking).
  • CI/CD expertise (GitHub Actions and/or Azure DevOps).
  • Containers and Kubernetes in production.
  • Python or scripting for automation (solid scripting and tooling; not full-time app dev).
  • Hands-on with LLM setups (local runners or Azure OpenAI), embeddings, vector indexes, and RAG basics.

Nice to have

  • Multi-cloud exposure (AWS / GCP).
  • Azure AI services (Azure OpenAI, Cognitive Search).
  • GitOps (Argo CD/Flux), Helm packaging, OCI registries.
  • Eventing/queues (Event Grid, Service Bus, Kafka).
  • Security/compliance in cloud (CIS, NIST, Microsoft CAF).
  • Certifications: AZ-104, AZ-204, AZ-400, AI-900, HashiCorp Terraform Associate, CKA/CKAD.
  • Experience with GPU nodes, drivers, CUDA/ROCm, or CPU-only optimizations for LLMs.

How we work

  • Everything as code. PRs, reviews, and tests.
  • Small batches. Trunk-based or short-lived branches.
  • Clear runbooks and on-call rotation where needed.
  • Measure, alert, fix, and improve.

 

Our commitment to diversity & inclusion

We are genuinely passionate about inclusion and we support individuals of all groups; we do not discriminate on the basis of race, religion, gender, sexual orientation, or disability status. 

 

 
