Canary Wharfian - Online Investment Banking & Finance Community.

Data Engineer for Financial AI Pipelines

Experienced, no visa sponsorship

at Danske Bank

Investment Banking

Posted 11 days ago


Data Engineer for Financial AI Pipelines in Copenhagen V, Denmark

Design, own, and optimize the data infrastructure feeding an AI-driven banking platform. Key responsibilities include:

  • Building ELT pipelines on Databricks using Apache Spark and Delta Lake for internal banking data and third-party feeds (LSEG)

  • Implementing data contracts, schemas, and SLAs to ensure consistent interfaces for AI agents

  • Enhancing data lineage, governance, and quality using Databricks Unity Catalog and anomaly detection

  • Optimizing pipelines for low-latency AI retrieval and near-real-time data ingestion with Kafka or Structured Streaming

  • Collaborating with data governance teams and AI engineers to enforce controls and design schemas for RAG retrieval

Required: 6+ years of data engineering, 3+ years on Databricks/Spark, and proficiency in Spark, Delta Lake, and Python.

Nice to have: dbt, banking data model knowledge, vector database experience, AWS data services.

Compensation: Not specified

Currency: Not specified

City: Copenhagen

Country: Denmark

Full Job Description

Location: Copenhagen V, Denmark

Build and own the data infrastructure that feeds the bank's AI agentic platform. You will design robust, low-latency pipelines on Databricks that ingest, transform, and serve internal banking data alongside third-party sources such as LSEG. Data quality, lineage, and governance are paramount: AI agents are only as reliable as the data they reason over.

 

This role sits at the critical intersection of data engineering and AI, directly determining the quality and scope of insights agents can deliver to bankers.

 

The Platform Context
 

This role sits within the bank's enterprise AI Agentic Platform, a strategic initiative to enhance banker productivity using large language models orchestrated via AWS Bedrock and AWS Agent Core, with data served from Databricks. The platform ingests internal banking data (credit, CRM, trade, GL) alongside external sources such as LSEG, enabling AI agents to draft documents, analyse deals, synthesise research, and surface insights on demand. Security, auditability, and regulatory compliance are non-negotiable.

 

Key Responsibilities
 

  • Design and implement ELT pipelines on Databricks using Apache Spark and Delta Lake to ingest internal banking data (GL, CRM, credit, trade data) and external sources including LSEG market data feeds

  • Build and maintain data contracts, schemas, and SLAs for all datasets consumed by AI agents, ensuring agents can rely on consistent, well-defined data interfaces

  • Implement data cataloguing and lineage tracking using Databricks Unity Catalog to enable agent discoverability and data trust verification

  • Optimise Delta Lake tables for low-latency AI retrieval workloads using Z-ordering, liquid clustering, and bloom filters

  • Build streaming pipelines for near-real-time data ingestion using Databricks Structured Streaming or Apache Kafka

  • Implement data quality checks, anomaly detection, and alerting pipelines to prevent agent hallucinations caused by upstream data issues

  • Collaborate with Data Governance and Compliance to enforce data classification, PII masking, and access controls at the pipeline level

  • Partner with AI engineers to design data schemas and embedding pipelines optimised for RAG retrieval and vector search
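
For candidates less familiar with the term, a data contract check of the kind mentioned in the responsibilities above might look roughly like the following. This is a minimal, illustrative sketch in plain Python, not a description of the bank's actual tooling; the `TRADE_CONTRACT` fields and the `validate_record` helper are hypothetical names invented for this example.

```python
from datetime import date

# Hypothetical contract: field name -> (expected type, whether null is allowed).
TRADE_CONTRACT = {
    "trade_id": (str, False),
    "notional": (float, False),
    "currency": (str, False),
    "settle_date": (date, True),
}


def validate_record(record, contract):
    """Return a list of human-readable contract violations (empty if none)."""
    violations = []
    for field, (expected_type, nullable) in contract.items():
        if field not in record:
            violations.append(f"missing field: {field}")
            continue
        value = record[field]
        if value is None:
            if not nullable:
                violations.append(f"null not allowed: {field}")
            continue
        if not isinstance(value, expected_type):
            violations.append(
                f"wrong type for {field}: expected {expected_type.__name__}, "
                f"got {type(value).__name__}"
            )
    # Flag fields that the contract does not declare.
    for field in record:
        if field not in contract:
            violations.append(f"unexpected field: {field}")
    return violations
```

In a real pipeline, a check like this would typically run before data is published to downstream consumers, so that records violating the contract are quarantined rather than served to agents.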

 

What you bring

 

  • 6+ years data engineering, 3+ years on Databricks or Spark-based platforms

  • Deep proficiency in Apache Spark, Delta Lake, and Python/PySpark

  • Data cataloguing and governance tools: Unity Catalog, Apache Atlas, or equivalent

  • Strong data modelling for analytical and AI retrieval workloads

  • Experience integrating financial data vendors (LSEG, Refinitiv, Bloomberg, or equivalent)

  • Streaming architecture familiarity: Kafka, Kinesis, or Databricks Structured Streaming

 

Nice to Have
 

  • dbt for transformation layer management

  • Banking data model knowledge: GL structures, trade lifecycle, credit data

  • Vector database experience: Pinecone, pgvector, Chroma, or similar

  • AWS data services: Glue, Lake Formation, or S3 at scale

 

What We Offer

  • Opportunity to build one of the most ambitious AI platforms in the banking sector from the ground up

  • Direct exposure to senior banking leadership and C-suite stakeholders

  • Competitive compensation with performance-linked bonus and long-term incentive plan

  • Hybrid working with flexibility: we trust our people to deliver

  • Continuous learning budget and access to frontier AI tools and research

  • A culture that values craftsmanship, intellectual honesty, and commercial impact

 

Danske Bank supports a high degree of workplace flexibility. Our team is currently using a hybrid working model, where we work at least 3 days a week in the office.

You will also benefit from a highly attractive benefits package offering health and dental insurance, pension, phone and other benefits. You will also have flexible work hours, with 6 weeks of vacation, and 5 care days to ensure your work-life balance. 

 

Interested?

If you have any questions, feel free to contact me, Nikodem Binienda, at nbin@danskebank.dk, and I will answer them.

 
