
Posted 15 days ago
Data Engineer II at JPMorgan Chase's Connected Commerce Travel Technology team, responsible for designing, building, and maintaining large-scale cloud-based data integration and analytics solutions. You will develop and optimize data models and scalable processing pipelines while ensuring data integrity, quality, and performance. The role involves close collaboration with cross-functional teams to translate business requirements into production-grade data engineering solutions and a focus on innovation and continuous improvement. Required skills include proficiency in Python, distributed processing frameworks (e.g., Spark), cloud data lake/lakehouse technologies, orchestration tools like Airflow, and experience with CI/CD and Agile practices.
- Compensation: Not specified
- City: Pune
- Country: India
- Currency: Not specified
Full Job Description
Location: Pune, Maharashtra, India
Job responsibilities
- Design, develop, and maintain scalable, large-scale data processing pipelines and infrastructure on the cloud, following engineering standards, governance standards, and technology best practices.
- Develop and optimize data models for large-scale datasets, ensuring efficient storage, retrieval, and analytics while maintaining data integrity and quality.
- Collaborate with cross-functional teams to translate business requirements into scalable and effective data engineering solutions.
- Demonstrate a passion for innovation and continuous improvement in data engineering, proactively identifying opportunities to enhance data infrastructure, data processing and analytics capabilities.
Required qualifications, capabilities, and skills
- Strong analytical problem solving and critical thinking skills
- Proficiency in at least one programming language (preferably Python; alternatively Java or Scala)
- Proficiency in at least one distributed data processing framework (Spark or similar)
- Proficiency in at least one cloud data lakehouse platform (AWS data lake services or Databricks; alternatively Hadoop)
- Proficiency in at least one scheduling/orchestration tool (preferably Airflow; alternatively AWS Step Functions or similar)
- Proficiency with relational and NoSQL databases.
- Proficiency in data structures, data serialization formats (JSON, AVRO, Protobuf, or similar), and big-data storage formats (Parquet, Iceberg, or similar)
- Experience working in teams following Agile methodology
- Experience with test-driven development (TDD) or behavior-driven development (BDD) practices, as well as working with continuous integration and continuous deployment (CI/CD) tools.
Preferred qualifications, capabilities, and skills
- Proficiency in Python and PySpark
- Proficiency in IaC (preferably Terraform; alternatively AWS CloudFormation)
- Experience with AWS Glue, AWS S3, AWS Lakehouse, AWS Athena, Airflow, Kinesis and Apache Iceberg
- Experience working with Jenkins
