Bulge Bracket Investment Banks

Posted 8 days ago

**VP - AI Engineering & Prompt Architecture Lead in Predictive Science**

- Lead a team of 8-15 prompt engineers and AI/ML developers in building KYC/AML solutions
- Architect multi-step AI workflows, debug prompt failures, and design evaluation frameworks for intelligent document processing
- Architect RAG systems for financial documents; manage prompt lifecycle, model selection, and operationalize workflows
- Collaborate with governance, data engineering, and stakeholders to ensure robust, compliant AI systems

Compensation: Not specified
City: Not specified
Country: India

Full Job Description

Location: India

Take a lead role in acquiring, managing and retaining meaningful relationships that deliver outstanding experience to our customers. In this role, you will balance your focus on business results by offering options and finding solutions to help our customers with issues.

As a Vice President AI Engineering & Prompt Architecture Lead in Predictive Science, you will own the technical vision and delivery for a team of prompt engineers and AI/ML model developers building intelligent document processing and data extraction solutions in the KYC/AML domain.

Job Responsibilities:

Lead and develop a team of 8-15 prompt engineers and AI/ML developers; set technical direction, run architecture reviews, and define quality standards for prompts and model artifacts.
Architect agentic, multi-step AI workflows chaining classification → extraction → cross-validation → exception routing, with human-in-the-loop checkpoints.
Debug and remediate complex prompt failures (context-window overflow, instruction drift in long chains, RAG retrieval poisoning, output/format instability).
Design prompt/model evaluation frameworks measuring accuracy plus consistency, robustness, latency, cost-per-call, and hallucination rate.
Operationalize prompt lifecycle management as production code: versioning, CI/CD prompt tests, A/B experiments, rollback, and audited change history.
Guide model selection and optimization (prompting vs fine-tuning vs custom training) balancing accuracy, latency, cost, and data sensitivity.
Design RAG architectures for financial documents: chunking, embeddings, vector store design, re-ranking, and context injection.
Oversee fine-tuning/training workflows: dataset curation, annotation quality, training configurations, and generalization across document variants.
Build and maintain evaluation infrastructure: benchmark/golden datasets, regression suites, and automated scoring to catch regressions pre-production.
Define confidence calibration and escalation logic so systems estimate uncertainty and route low-confidence outputs to human reviewers with the right context.
Partner with governance/model risk and data engineering: produce validator-ready documentation (explainability/auditability) and ensure robust, refreshed data/annotation/eval pipelines; drive AI-native development practices to improve velocity.

Required qualifications, capabilities and skills:

10+ years in NLP/AI/ML or computational linguistics, including 3+ years leading technical teams with direct reports.
Hands-on LLM internals expertise: tokenization impacts, attention limits, context window management, and temperature/sampling trade-offs.
Proven prompt architecture design/debugging: multi-turn chains, few-/many-shot, chain-of-thought, self-consistency, and constitutional AI.
Strong RAG system design experience: embeddings trade-offs (e.g., ada/BGE/Cohere), chunking for semi-structured docs, hybrid retrieval (dense+sparse), and re-ranking.
Fine-tuning experience: LoRA/QLoRA, instruction tuning, RLHF/DPO, dataset curation, and evaluating tuned vs prompted performance.
Python + ML engineering proficiency: PyTorch, Hugging Face, LangChain/LlamaIndex (or equivalents), vector DBs (Pinecone/Weaviate/pgvector), and API development.
AI evaluation systems experience beyond F1: faithfulness, answer relevance, RAG context precision/recall, automated eval pipelines, and LLM-as-judge.
Deep understanding of LLM failure modes: hallucinations, sycophancy, long-context instruction degradation, prompt-format sensitivity, and catastrophic forgetting.
Structured output enforcement in production: JSON mode, function calling, constrained decoding, output parsers, and schema validation.
Build-vs-buy/model selection track record: benchmarking foundation models (GPT-4/Claude/Llama/Mistral) against task requirements.
Leadership under ambiguity: pragmatic trade-offs, rapid iteration as practices evolve, and strong communication with compliance, risk, and business stakeholders.

Preferred qualifications, capabilities and skills:

Domain expertise in KYC/AML or financial document processing: entity extraction from registries, beneficial ownership structures, sanctions screening logic, adverse media classification.
Experience designing autonomous AI agents: tool-use patterns, planning/reasoning loops, memory architectures, and safety guardrails for regulated environments.
Knowledge of AI security/adversarial robustness: prompt injection defense, jailbreak detection, data poisoning awareness, and output monitoring for sensitive financial data.
Experience with model distillation to produce smaller, faster models for cost-effective deployment.
Familiarity with AI observability/monitoring: tracking prompt/model performance, drift detection, alerting, and health dashboards.
Experience with multi-modal AI combining OCR, layout understanding, and LLM-based extraction for complex documents.
Advanced degree (MS/PhD) in CS/NLP/ML or equivalent depth via publications, open-source, or production system design.
Passion for talent development: growing engineers from junior prompt writers into senior AI system designers via structured mentorship.

Promote AI teams to design, build, and refine intelligent document processing and data extraction for complex financial workflows

Predictive Science - AI Engineering & Prompt Architecture Lead - Vice President
