Bulge Bracket Investment Banks

Posted 3 days ago

**Senior Principal Software Engineer - AI Foundation Services** In Plano, TX, lead AI Foundation Services, building secure, high-performance AI/ML infrastructure for JPMorganChase. Collaborate cross-functionally to synthesize business needs into robust designs. Oversee solution co-development, launch, and early operations, optimizing for scale, reliability, and security. Enhance firm-wide adoption via shared architectures, playbooks, and GPU baselines. Set strategy for AI-powered engineering, driving improvements in delivery speed and code quality. Leverage cloud-native architecture and secure-by-design practices. Requires 10+ years' experience and proficiency in AI/ML platforms and performance engineering.

Compensation: Not specified
City: Plano, TX
Country: United States

Full Job Description

Location: Plano, TX, United States

If you are looking for a game-changing career, working for one of the world's leading financial institutions, you've come to the right place.

As a Senior Principal Software Engineer at JPMorganChase within AMDP/CDAO, you will serve as a hands-on thought leader and builder for AI Foundation Services: the scaled, secure, performance-optimized infrastructure that enables large-scale GenAI and traditional AI/ML across Lines of Business. You will partner directly with Lines of Business application teams to synthesize requirements into implementable designs, co-develop solutions through launch and early operations, and de-risk delivery across performance, scale, reliability, and security. You will also drive firmwide reuse through shared reference architectures, playbooks, test harnesses, and GPU training/serving baselines, raising the engineering bar and accelerating adoption across the portfolio.

Job responsibilities

Leads as a hands-on technical thought leader to build, integrate, and optimize AI Foundation Services infrastructure for GenAI and traditional AI/ML platforms
Co-develops with Lines of Business (LOB) application teams to deliver reusable AI/ML foundational services and managed service patterns
Synthesizes Lines of Business (LOB) requirements into implementable designs and drives delivery from design through launch and early operational support
De-risks delivery across performance, scale, reliability, and security by defining non-functional requirements, testing strategies, and operational readiness criteria
Drives reuse and standardization through shared reference architectures, playbooks, test harnesses, and GPU training/serving baselines for model hosting platforms
Sets strategy and operating standards for agentic AI-enabled engineering across a portfolio (using enterprise-authorized tools within the work environment) to drive measurable improvements in delivery speed, reliability, and code quality (e.g., AI-orchestrated SDLC/TLM automation, release readiness gating, incident triage/root-cause acceleration, and large-scale refactoring/test modernization), while defining guardrails for validation, security, resiliency, and reuse across teams and functions.
Applies knowledge of tools within the Software Development Life Cycle toolchain, including enterprise-authorized AI-assisted development and automation capabilities, to improve the value realized by automation at scale
Advises and leads on the strategy and development of multiple products, applications, and technologies across a portfolio by creating novel code solutions and drives the development of new production code capabilities across teams and functions
Translates highly complex technical issues, trends, and approaches to leadership to drive the firm's innovation and enable leaders to make strategic, well-informed decisions about technological advancements
Drives adoption and implementation of technical methods in specialized fields in line with the latest product development methodologies
Influences across business, product, and technology teams and successfully manages senior stakeholder relationships

Required qualifications, capabilities, and skills

Formal training or certification in software engineering concepts and 10+ years of applied experience
Proven hands-on experience designing and operating AI/ML platform capabilities (model training, serving, feature/data access patterns, and multi-tenant controls)
Demonstrated experience designing and scaling agentic AI-enabled development patterns (using enterprise-authorized tools within the work environment) across teams/functions, including establishing governance for human-in-the-loop validation, traceability/auditability, and secure handling of sensitive inputs/outputs.
Strong understanding of responsible AI use and control expectations at scale, including security/resiliency implications, data sensitivity, and risk-based governance; ability to advise senior leaders on safe adoption, reuse, and measurable outcomes.
Demonstrated expertise in performance engineering and production reliability (capacity planning, load testing, Service Level Objectives (SLOs)/Service Level Indicators (SLIs), incident response, and root-cause remediation)
Strong experience with cloud-native architecture (Kubernetes, containers, CI/CD, infrastructure-as-code using Terraform) and secure-by-design engineering practices
Ability to lead end-to-end technical engagements with senior stakeholders, translating requirements into delivered services with clear milestones and acceptance criteria
Practical experience delivering system design, application development, testing, and operational stability
Demonstrated prior experience with influencing across functions and teams and delivering value at scale
Experience applying expertise and new methods to determine solutions for complex technology problems across various technical disciplines
Extensive practical cloud-native experience

Preferred qualifications, capabilities, and skills

Experience building GPU-backed model hosting platforms and optimizing inference/training performance (profiling, batching, caching, parallelism, and cost controls)
Experience implementing reusable reference architectures and developer enablement assets (golden paths, templates, playbooks, and automated test harnesses)
Experience with LLM and model serving stacks (e.g., routing, autoscaling, model gateways, online evaluation, and guardrails) in production environments
Experience operating in regulated environments with strong controls (security reviews, threat modeling, audit readiness, and data governance)

Drive firmwide AI Foundation Services strategy and delivery, partnering with LOB teams to de-risk performance.

SIMILAR OPPORTUNITIES

Principal Software Engineer - AI Foundation Services

Senior AI Engineer - Vice President

Senior GenAI Engineer

Senior Lead AI Engineer

Senior AI Engineer
