
at J.P. Morgan
Bulge Bracket Investment Banks
Posted 2 months ago
**Data Engineer III - Python/ Data Lake**: Design and deliver scalable data solutions. Expertise in Python, Spark, AWS data lake, and orchestration tools needed. 3+ years in data engineering required.
- Compensation: Not specified (USD)
- City: New York City
- Country: United States
Currency: $ (USD)
Full Job Description
Location: New York, NY, United States
Be part of a dynamic team where your distinctive skills will contribute to a winning culture and team.
Job responsibilities
- Supports review of controls to ensure sufficient protection of enterprise data
- Advises on and makes custom configuration changes in one to two tools to generate a product at the business's or customer's request
- Updates logical or physical data models based on new use cases
- Frequently uses SQL and understands NoSQL databases and their niche in the marketplace
- Adds to team culture of diversity, opportunity, inclusion, and respect
Required qualifications, capabilities, and skills
- Formal training or certification on data engineering concepts and 3+ years applied experience
- Experience across the data lifecycle
- Expertise in Python programming language for data engineering tasks (secondary alternative: Java)
- Expertise in cluster computing frameworks such as Spark or Flink
- Experience in building data lakehouse platforms (AWS data lake or Databricks or Hadoop)
- Experience in building DAGs/workflows using scheduling/orchestration tools (Airflow or AWS Step Functions or similar)
- Advanced at SQL (e.g., joins and aggregations)
- Working understanding of NoSQL databases
- Significant experience with statistical data analysis and ability to determine appropriate tools and data patterns to perform analysis
- Experience making custom changes in a tool to generate a product
- Proficiency in developing data pipelines using AWS services such as Glue, EMR, MSK, Kinesis, etc.
- Experience in using relational data stores (Postgres or similar) and NoSQL data stores (Cassandra or Dynamo or similar)
- Proficiency in IaC (Terraform)
- Knowledge of data serialization formats (e.g., JSON, Avro, Protobuf), big-data storage formats (e.g., Parquet, Iceberg, Hudi), data processing methodologies (batch, micro-batching, stream), and data modeling techniques (Dimensional, Data Vault, Kimball, Inmon)
