Canary Wharfian - Online Investment Banking & Finance Community.

Lead Software Engineer - DevOps / Production Support

Experienced
No visa sponsorship

at J.P. Morgan

Bulge Bracket Investment Banks

Posted 5 days ago


**Lead Software Engineer - DevOps / Production Support** executes creative technical solutions, design and development, and daily production support for electronic trading platforms, with a focus on FIX protocol connectivity and Python automation. This senior-level role requires 5+ years of experience in DevOps, SRE, or application support, and proficiency in Python, C++, Linux troubleshooting, and Grafana-based observability. Ideal candidates will have expertise in AWS and Terraform, and proven incident management skills in demanding trading environments. The role leads incident triage, root cause analysis, and operational automation, while collaborating across trading, ops, and engineering teams.

Compensation
Not specified USD

Currency: $ (USD)

City
Houston
Country
United States

Full Job Description

Location: Houston, TX, United States

We have an opportunity to impact your career and provide an adventure where you can push the limits of what's possible.

As a Lead Software Engineer at JPMorgan Chase within the Commercial & Investment Banking - Markets Tech - Trading / Derivatives Execution Tech team, you are an integral part of an agile team that works to enhance, build, and deliver trusted market-leading technology products in a secure, stable, and scalable way. As a core technical contributor, you are responsible for delivering critical technology solutions across multiple technical areas within various business functions in support of the firm's business objectives.

This position will support the reliability, performance, and operational integrity of electronic and equities trading systems, with a specific focus on FIX protocol connectivity. This role is hands-on and operations-oriented, partnering closely with trading, technology, and development teams to ensure stable order flow, rapid incident response, and disciplined change execution. The position emphasizes Python automation, Linux troubleshooting, and Grafana-based observability, with C++ exposure used primarily to investigate issues and collaborate effectively with Application 

 

Job responsibilities

  • Executes creative software solutions, design, development, and technical troubleshooting with ability to think beyond routine or conventional approaches to build solutions or break down technical problems
  • Provide daily production support for electronic trading platforms, including FIX sessions, connectivity health, and order/trade workflow stability
  • Monitor system health and trading-impacting signals using Grafana dashboards and alerting to improve visibility into latency, errors, throughput, and availability
  • Lead incident triage and restoration activities during service degradation, including structured troubleshooting, stakeholder communications, and post-incident follow-up
  • Perform root cause analysis on recurring issues and implement durable remediation, including runbook improvements, alert tuning, and operational automation
  • Develops secure, high-quality production code, and reviews and debugs code, using Python scripts and tools for health checks, operational workflows, reporting, and environment validation
  • Drives team adoption of enterprise-authorized AI-assisted engineering practices within the work environment to improve code quality, delivery speed, and operational outcomes (e.g., AI-assisted code review / refactoring, test strategy acceleration, incident/root-cause analysis support), while establishing consistent validation standards (secure coding, peer review, automated testing) and promoting reuse of effective patterns across the team
  • Applies knowledge of tools within the Software Development Life Cycle toolchain, including enterprise-authorized AI-assisted development and automation capabilities, to improve the value realized by automation
  • Troubleshoot Linux-based systems using logs, process and resource diagnostics, and network-level checks relevant to connectivity and application behavior
  • Partner with development teams to investigate complex issues in trading components by reading logs, traces, and diagnostic output, and interpret and discuss findings in contexts where components are implemented in C++
  • Adds to team culture of diversity, opportunity, inclusion, and respect

 

Required qualifications, capabilities, and skills

  • Formal training or certification in software engineering concepts and 5+ years of applied experience
  • Advanced proficiency in one or more programming languages, frameworks, and tools (e.g., Python, C++, Linux, Grafana)
  • Demonstrated experience in DevOps, production support, SRE, or application support in a mission-critical environment, with accountability for uptime and incident execution
  • Practical understanding of the FIX protocol
  • Strong Linux troubleshooting capability, including log analysis, process/resource diagnostics, and command-line proficiency
  • Hands-on experience with AWS and Terraform (infrastructure as code), and familiarity/experience with Atlas and Copilot as part of the deployment and platform toolchain
  • Ability to collaborate effectively across trading, operations, and engineering teams, including clear incident communications under time pressure
  • Proficiency in automation and continuous delivery methods, with advanced understanding of agile methodologies such as CI/CD, Application Resiliency, and Security
  • Demonstrated experience leading effective use of approved AI-assisted software development tools (e.g., for coding, code review, test acceleration, troubleshooting) with the ability to set team expectations for validating AI outputs for correctness, performance, and security
  • Strong understanding of responsible AI use in engineering workflows, including data sensitivity considerations, secure handling of inputs/outputs, and adherence to resiliency and security expectations; experience coaching engineers on safe, compliant adoption within delivery practices
  • Demonstrated proficiency in software applications and technical processes within a technical discipline (e.g., cloud, artificial intelligence, machine learning, mobile, etc.)

 

Preferred qualifications, capabilities, and skills
 
  • Knowledge of electronic trading or equities trading environments, including familiarity with order lifecycle concepts and trading-impacting incident patterns
  • Exposure to C++ sufficient to assist with investigation (e.g., reading stack traces, understanding logs and component behavior), without being a primary feature developer
  • Demonstrated proficiency in software applications and technical processes within a technical discipline (e.g., cloud, artificial intelligence, machine learning, mobile, etc.)
  • Familiarity with incident management disciplines, including runbooks, post-incident reviews, alert quality management, and operational readiness practices
  • Basic networking knowledge relevant to troubleshooting connectivity and performance (e.g., TCP/IP behavior, port connectivity, latency sensitivity)
Carry out critical tech solutions across multiple technical areas as an integral part of an agile team
