Bulge Bracket Investment Banks

Posted 8 days ago

**Software Engineer III - Data Engineer, JPMorgan Chase** Design, build, and maintain batch/streaming data pipelines using **Databricks**. Develop and optimize **ETL/ELT** workflows in **PySpark/Spark SQL**. Implement data modeling, curation, and dataset publishing. Tune and optimize Spark jobs for performance, cost, and scalability. Ensure strong **data quality** through validations and monitoring. Collaborate with stakeholders to translate requirements into data solutions. Follow **CI/CD and SDLC practices**. Support production operations: incident management, root cause analysis, and pipeline reliability improvements. Requires 3+ years of **data engineering** experience, with hands-on expertise in **Databricks**, **Python**, and **SQL**. Strong proficiency in **PySpark/Spark SQL**. Experience in data modeling, ETL/ELT, performance tuning, data quality, and monitoring. Understands data pipeline architecture and dependency management. Familiar with **data lakes/lakehouse** storage patterns and Git-based workflows. Location: Bengaluru, Karnataka, India

Compensation: Not specified
City: Bengaluru
Country: India

Full Job Description

Location: Bengaluru, Karnataka, India

We have an exciting and rewarding opportunity for you to take your software engineering career to the next level.

As a Software Engineer III at JPMorgan Chase within Asset & Wealth Management, you serve as a seasoned member of an agile team to design and deliver trusted market-leading technology products in a secure, stable, and scalable way. You are responsible for carrying out critical technology solutions across multiple technical areas within various business functions in support of the firm's business objectives.

Job responsibilities

Designs, builds, and maintains batch and (as needed) streaming data pipelines using Databricks.
Develops and optimizes ETL/ELT workflows using PySpark / Spark SQL and Databricks workflows/jobs.
Implements data modeling (bronze/silver/gold patterns), curation, and dataset publishing for analytics and consumption.
Tunes and optimizes Spark jobs for performance, cost, and scalability (partitioning, file sizing, caching, joins, etc.).
Ensures strong data quality through validations, reconciliations, monitoring, and alerting.
Works with stakeholders (data analysts, data scientists, product, and engineering teams) to translate requirements into data solutions.
Implements and follows CI/CD and SDLC practices for data engineering code (testing, code reviews, version control).
Supports production operations: incident triage, root-cause analysis, and pipeline reliability improvements.
Contributes to documentation, standards, and reusable frameworks to improve team productivity.

Required qualifications, capabilities, and skills

Formal training or certification in software engineering concepts and 3+ years of applied experience.
Hands-on experience in Data Engineering.
Strong experience with Databricks (jobs/workflows, notebooks, clusters, performance tuning).
Proficiency in Python and SQL; strong hands-on experience with PySpark/Spark SQL.
Experience in data modeling, ETL/ELT, performance tuning, data quality, monitoring, and troubleshooting.
Solid understanding of data pipeline architecture, orchestration concepts, and dependency management.
Experience working with data lakes/lakehouse storage patterns and file formats (e.g., Parquet).
Familiarity with Git-based workflows and engineering best practices.

Preferred qualifications, capabilities, and skills

AI/ML exposure is an added advantage: experience supporting ML workflows by building feature datasets, training/serving data pipelines, or enabling model monitoring and experimentation (e.g., working with data scientists on reproducible data inputs, feature engineering, and ML-ready tables).
Familiarity with the ML ecosystem and its tools (e.g., MLflow, Databricks model registry, notebook-based experimentation) and an understanding of basic ML concepts (training vs. inference, leakage, drift) are a plus.
Experience with Delta Lake features (ACID tables, time travel, optimization).
Exposure to streaming (e.g., Spark Structured Streaming) and event-driven patterns.
Experience with cloud platforms (AWS/Azure/GCP) and cloud storage integrations.
Knowledge of data governance, access controls, and secure handling of sensitive data.
Familiarity with orchestration tools (e.g., Airflow or similar) and supporting production-grade data platforms (monitoring, SLAs, on-call rotations).

Design and deliver market-leading technology products in a secure and scalable way as a seasoned member of an agile team
