Canary Wharfian - Online Investment Banking & Finance Community.

Compute Operations Engineer

Experienced | No visa sponsorship

at Qube

Proprietary Trading

Posted 8 days ago

**Compute Operations Engineer** Support day-to-day on-prem compute infrastructure at QRT, collaborating with cross-functional teams. **Responsibilities** include server hardware management, Linux systems troubleshooting, infrastructure monitoring, hardware lifecycle activities, and user-facing issue resolution across Slurm and Kubernetes platforms. **Skills required**: 2-5 years in compute infrastructure, RHEL proficiency, server hardware knowledge, monitoring tools familiarity, HPC/SLURM/Kubernetes experience, automation scripting, and strong communication.

Compensation
Not specified

Currency: Not specified

City
Not specified
Country
Not specified

Full Job Description

Qube Research & Technologies (QRT) is a global quantitative and systematic investment manager, operating in all liquid asset classes across the world. We are a technology- and data-driven group implementing a scientific approach to investing. Combining data, research, technology, and trading expertise has shaped our collaborative mindset, which enables us to solve the most complex challenges. QRT's culture of innovation continuously drives our ambition to deliver high-quality returns for our investors.

The Compute Operations team supports the day-to-day operation of on-prem compute infrastructure, covering HPC server hardware, Linux-based platforms, and user-facing support. You will work closely with the Compute Ops Team Lead, Linux engineers, and other platform groups to maintain reliable, performant compute services across Slurm, Kubernetes, and control-plane environments.

Your Future Role within QRT
You will:

  • Provide hands-on support for HPC server hardware, including diagnostics, issue investigation, and coordination with vendors for repairs
  • Monitor system health and respond to alerts using infrastructure monitoring tools
  • Support hardware lifecycle activities, including provisioning, maintenance, and decommissioning
  • Troubleshoot Linux-based systems across OS, networking, and storage layers
  • Triage and resolve user-facing issues across compute platforms such as Slurm and Kubernetes
  • Coordinate with internal teams and vendors on maintenance and incident resolution
  • Execute scheduled maintenance and change activities
  • Maintain accurate infrastructure records and documentation
  • Contribute to runbooks and continuous improvement of operational processes
  • Participate in on-call rotations and incident response

Your Present Skillset

  • 2-5 years of experience in compute infrastructure, systems engineering, or a related role
  • Strong Linux systems administration experience (e.g. RHEL, Rocky Linux, or similar)
  • Strong understanding of server hardware (e.g. compute, storage, networking components)
  • Familiarity with infrastructure monitoring tools (e.g. OneView, Dell OME, or similar)
  • Exposure to HPC or platform environments such as Slurm or Kubernetes
  • Experience or familiarity with operational tooling (e.g. NetBox, DNS, HashiCorp Vault, Ansible, scripting languages, or similar)
  • Knowledge of automation or scripting (e.g. Bash, Python, Ansible)
  • Strong troubleshooting and problem-solving skills
  • Ability to communicate effectively and work in a collaborative environment
  • Understanding of datacentre operations and safety practices is beneficial

QRT is an equal opportunity employer. We welcome diversity as essential to our success. QRT empowers employees to work openly and respectfully to achieve collective success. In addition to professional achievement, we offer initiatives and programs to enable employees to achieve a healthy work-life balance.
