Canary Wharfian - Online Investment Banking & Finance Community.

Data Domain Architect Lead

Experienced · No visa sponsorship

at J.P. Morgan

Bulge Bracket Investment Banks

Posted 8 days ago


**Data Domain Architect Lead**: Coordinates and leads data annotation initiatives, fostering reliable datasets for machine learning models within the Consumer & Community Banking sector. Oversees data cleaning, metrics establishment, and automation improvements to boost quality and throughput. Drives annotation teams' growth through technical mentoring and stakeholder management, driving continuous improvement. Deploys Python, Git, and LLMs, optimizing prompt engineering and scaling data labeling operations. Requires a proven track record (5+ years) in data product delivery and an advanced degree in a relevant field; suited to a senior candidate comfortable managing teams and influencing stakeholders.

Compensation
Not specified

Currency: Not specified

City
Bengaluru
Country
India

Full Job Description

Location: Bengaluru, Karnataka, India

Join us for an exciting opportunity to leverage your advanced data annotation skills in the financial industry and contribute to cutting-edge machine learning models.

As a Data Domain Architect Lead within the Consumer & Community Banking team, you will lead data labeling initiatives that produce reliable, controlled, and actionable datasets for model training and evaluation. You will set product direction, manage delivery, and partner with technology, operations, and data science teams to improve data quality, scalability, and stakeholder outcomes.

Job Responsibilities

  • Translate business requirements and ML objectives into implementable requirements, schemas, guidelines, and quality metrics; define success measures and key results for each labeling effort, and actively manage scope, risks, dependencies, and stakeholder communications

  • Own the annotation operating model, including workflow design, task routing, queue management, and delivery governance

  • Scale labeling capacity across multiple lines of business while maintaining consistency, quality, throughput, and clear documentation

  • Own data cleaning and preparation processes to resolve noise, duplicates, inconsistencies, and labeling defects

  • Establish metrics and annotation reliability standards and a measurable quality framework, including calibration routines, gold datasets, reviews, and feedback loops

  • Leverage prompt engineering to improve task instructions, enable pre-labeling, and support synthetic data generation for LLM-related datasets

  • Develop LLM-as-judge approaches and agentic workflows to automate quality evaluation at scale, flag low-confidence items, and surface disagreements with human oversight

  • Drive annotation innovation by implementing automation across the labeling lifecycle, including ingestion, validation checks, dataset packaging, and audit-ready lineage artifacts

  • Lead benchmarking and executive-ready reporting on delivery performance, quality outcomes, and continuous improvement

  • Collaborate proactively with machine learning engineers and scientists to define evaluation requirements, labeling expectations, and target data volumes as models and use cases evolve in new agentic/LLM initiatives, keeping data deliverables unblocked and on track

  • Grow the team and stay current on AI data trends, publications, and tools; nurture the team's AI and technical capability through training, coaching, and growth opportunities
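The responsibilities above mention LLM-as-judge workflows that flag low-confidence items and surface disagreements for human oversight. As a minimal illustrative sketch (not the firm's actual system; all names and thresholds are hypothetical assumptions), such a routing step might look like:

```python
# Hypothetical sketch: route items to human review when automated
# judge scores disagree or fall below a confidence threshold.
# Function name, data shape, and thresholds are illustrative only.

def flag_for_review(items, low_conf=0.6, max_spread=0.3):
    """Return ids of items needing human review.

    items: list of dicts like
        {"id": "x1", "judge_scores": [0.9, 0.85, 0.4]}
    where judge_scores are normalized 0..1 scores from one or
    more LLM-as-judge passes over the same labeled item.
    """
    flagged = []
    for item in items:
        scores = item["judge_scores"]
        mean = sum(scores) / len(scores)       # average judge confidence
        spread = max(scores) - min(scores)     # disagreement between judges
        # Flag on low mean confidence OR high judge disagreement
        if mean < low_conf or spread > max_spread:
            flagged.append(item["id"])
    return flagged

items = [
    {"id": "a", "judge_scores": [0.95, 0.9]},  # confident, judges agree
    {"id": "b", "judge_scores": [0.9, 0.3]},   # judges disagree
    {"id": "c", "judge_scores": [0.5, 0.55]},  # uniformly low confidence
]
print(flag_for_review(items))  # → ['b', 'c']
```

Only the flagged subset would then enter the human-in-the-loop review queue, keeping reviewer effort focused on genuinely ambiguous items.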

 

Required Qualifications, Capabilities, and Skills

  • Master's or PhD degree in Computational Linguistics, Linguistics, Computer Science, Data Science or a related field.

  • 5+ years of experience delivering data products or machine learning-enabled products across the full product lifecycle

  • Hands-on experience developing annotation metrics and performing annotation reviews

  • Experience running text data labeling programs end-to-end, including guideline and taxonomy design and annotation platform operations

  • Hands-on experience in Python for automation, data analysis, cleaning and validating structured and unstructured datasets; plus experience using Git for version control

  • Hands-on prompt engineering experience for LLM labeling workflows (for example, pre-labeling, synthetic data generation, and instruction clarity)

  • Working knowledge of LLM-as-judge methods, including rubric design and integrating automated signals into human-in-the-loop review

  • Hands-on experience in designing labeling quality measurement (for example, gold datasets, calibration, sampling, and inter-annotator agreement targets)

  • Hands-on experience in benchmarking data quality and evaluation outcomes and translating results into product and process improvements

  • Strong stakeholder management, written and verbal communication, and disciplined execution under deadlines 

  • Experience leading cross-functional delivery across technology, operations, and vendor partners
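The qualifications above reference inter-annotator agreement targets as part of labeling quality measurement. A standard chance-corrected agreement metric for two annotators is Cohen's kappa; a self-contained sketch (illustrative labels only, not data from this role) is:

```python
# Illustrative sketch: Cohen's kappa, a chance-corrected
# inter-annotator agreement metric for two annotators labeling
# the same items. Example labels below are made up.
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """kappa = (p_o - p_e) / (1 - p_e), where p_o is observed
    agreement and p_e is agreement expected by chance."""
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)
    # Observed agreement: fraction of items where the annotators match
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement, from each annotator's label frequencies
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    return (p_o - p_e) / (1 - p_e)

a = ["pos", "pos", "neg", "neg", "pos", "neg"]
b = ["pos", "neg", "neg", "neg", "pos", "pos"]
print(round(cohens_kappa(a, b), 3))  # → 0.333
```

Teams typically set a minimum kappa target per task (higher for simpler label schemes) and use calibration rounds against gold datasets when agreement falls below it.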

Preferred Qualifications, Capabilities, and Skills

  • Experience managing globally distributed annotation teams and third-party vendors
  • Familiarity with metadata management, data cataloging, and dataset lineage practices
  • Experience applying machine learning to data quality monitoring and anomaly detection
  • Track record of influencing senior stakeholders and aligning priorities through measurable OKRs
  • Experience working with privacy, data governance, or model risk controls related to training data

 

 

Lead data annotation workflows for NLU and LLM, improving quality, scalability, automation, and outcome metrics.
