Investment Banking

Posted 11 days ago

**Data Engineer for Financial AI Pipelines in Copenhagen V, Denmark**

Design, own, and optimize the data infrastructure feeding an AI-driven banking platform. Key responsibilities include:

- Building ELT pipelines on Databricks using Apache Spark and Delta Lake for internal banking data and third-party feeds (LSEG).
- Implementing data contracts, schemas, and SLAs to ensure consistent interfaces for AI agents.
- Enhancing data lineage, governance, and quality using Databricks Unity Catalog and anomaly detection.
- Optimizing pipelines for low-latency AI retrieval and near-real-time data ingestion with Kafka/Structured Streaming.
- Collaborating with data governance teams and AI engineers to enforce controls and design schemas for RAG retrieval.

Required: 6+ years of data engineering, 3+ years of Databricks/Spark, and proficiency in Spark, Delta Lake, and Python.

Nice to have: dbt, banking data model knowledge, vector database experience, AWS data services.

Compensation: Not specified
City: Copenhagen
Country: Denmark

Full Job Description

Location: Copenhagen V, Denmark

Build and own the data infrastructure that feeds the bank's AI agentic platform. You will design robust, low-latency pipelines on Databricks that ingest, transform, and serve internal banking data alongside third-party sources such as LSEG. Data quality, lineage, and governance are paramount: AI agents are only as reliable as the data they reason over.

This role sits at the critical intersection of data engineering and AI, directly determining the quality and scope of insights agents can deliver to bankers.

The Platform Context

This role sits within the bank's enterprise AI Agentic Platform, a strategic initiative to enhance banker productivity using large language models orchestrated via AWS Bedrock and AWS Agent Core, with data served from Databricks. The platform ingests internal banking data (credit, CRM, trade, GL) alongside external sources such as LSEG, enabling AI agents to draft documents, analyse deals, synthesise research, and surface insights on demand. Security, auditability, and regulatory compliance are non-negotiable.

Key Responsibilities

Design and implement ELT pipelines on Databricks using Apache Spark and Delta Lake to ingest internal banking data (GL, CRM, credit, trade data) and external sources including LSEG market data feeds
Build and maintain data contracts, schemas, and SLAs for all datasets consumed by AI agents, ensuring agents can rely on consistent, well-defined data interfaces
Implement data cataloguing and lineage tracking using Databricks Unity Catalog to enable agent discoverability and data trust verification
Optimise Delta Lake tables for low-latency AI retrieval workloads using Z-ordering, liquid clustering, and bloom filters
Build streaming pipelines for near-real-time data ingestion using Databricks Structured Streaming or Apache Kafka
Implement data quality checks, anomaly detection, and alerting pipelines to prevent agent hallucinations caused by upstream data issues
Collaborate with Data Governance and Compliance to enforce data classification, PII masking, and access controls at the pipeline level
Partner with AI engineers to design data schemas and embedding pipelines optimised for RAG retrieval and vector search

What you bring

6+ years data engineering, 3+ years on Databricks or Spark-based platforms
Deep proficiency in Apache Spark, Delta Lake, and Python/PySpark
Data cataloguing and governance tools: Unity Catalog, Apache Atlas, or equivalent
Strong data modelling for analytical and AI retrieval workloads
Experience integrating financial data vendors (LSEG, Refinitiv, Bloomberg, or equivalent)
Streaming architecture familiarity: Kafka, Kinesis, or Databricks Structured Streaming

Nice to Have

dbt for transformation layer management
Banking data model knowledge: GL structures, trade lifecycle, credit data
Vector database experience: Pinecone, pgvector, Chroma, or similar
AWS data services: Glue, Lake Formation, or S3 at scale

What We Offer

Opportunity to build one of the most ambitious AI platforms in the banking sector from the ground up
Direct exposure to senior banking leadership and C-suite stakeholders
Competitive compensation with performance-linked bonus and long-term incentive plan
Hybrid working with flexibility: we trust our people to deliver
Continuous learning budget and access to frontier AI tools and research
A culture that values craftsmanship, intellectual honesty, and commercial impact

Danske Bank supports a high degree of workplace flexibility. Our team is currently using a hybrid working model, where we work at least 3 days a week in the office.

You will also receive an attractive benefits package that includes health and dental insurance, pension, phone, and other benefits. To support your work-life balance, you will have flexible work hours, 6 weeks of vacation, and 5 care days.

Interested?

If you have any questions, feel free to contact me, Nikodem Binienda, at nbin@danskebank.dk, and I will be happy to answer them.
