Bulge Bracket Investment Banks

Posted 13 days ago

Senior PySpark Data Engineer: Design, develop, and maintain robust, scalable data pipelines using PySpark. Collaborate with stakeholders, optimize Spark jobs, and ensure data quality and compliance. Requires 6+ years of experience in data engineering, advanced Python skills, hands-on experience with Big Data ecosystems and distributed query engines, proficiency in SQL, and experience with ETL/ELT processes.

Compensation: Not specified
City: Pune
Country: India

Full Job Description

Senior PySpark Data Engineer

Job Req Id:

26955864

Location(s):

Pune, Maharashtra, India

Job Type:

On-Site/Resident

Posted:

Apr. 29, 2026

Discover your future at Citi

Working at Citi is far more than just a job. A career with us means joining a team of more than 230,000 dedicated people from around the globe. At Citi, you'll have the opportunity to grow your career, give back to your community, and make a real impact.

Job Overview

About the Role

We are seeking a highly skilled and experienced Senior PySpark Data Engineer to join our dynamic data engineering team. The ideal candidate will have a strong background in building and managing large-scale data processing systems and a proven track record of working with cutting-edge Big Data technologies. You will be responsible for designing, developing, and maintaining our data pipelines, ensuring they are efficient, reliable, and scalable to meet our growing business needs.

Key Responsibilities

Design, develop, and maintain robust, scalable, and high-performance data pipelines using PySpark.
Develop, schedule, and monitor complex data workflows using orchestration tools like Apache Airflow.
Collaborate with data scientists, analysts, and business stakeholders to understand data requirements and deliver high-quality data solutions.
Optimize and tune Spark jobs for performance and efficiency.
Implement data quality checks and ensure data integrity across all data pipelines.
Design and implement data models for optimal storage and retrieval.
Mentor junior data engineers and promote best practices in data engineering.
Ensure compliance with data governance and security policies.
Troubleshoot and resolve data-related issues in a timely manner.

Required Qualifications

6+ years of relevant professional experience in a data engineering role.
Extensive hands-on experience with PySpark and advanced Python programming skills.
Proven experience with Big Data ecosystems, including Cloudera and/or Databricks.
Hands-on experience with distributed query engines like Starburst (Trino/Presto).
Proficient in designing and managing complex workflows using scheduling tools, particularly Apache Airflow.
Strong expertise in SQL and experience with relational and non-relational databases.
Solid understanding of data warehousing concepts, ETL/ELT processes, and data modeling techniques.
Experience working in a Linux/Unix environment.
Experience with GitHub and CI/CD pipelines.

Education:

Bachelor's degree/University degree or equivalent experience

This job description provides a high-level review of the types of work performed. Other job-related duties may be assigned as required.

------------------------------------------------------

Job Family Group:

Technology

------------------------------------------------------

Job Family:

Applications Development

------------------------------------------------------

Time Type:

Full time

------------------------------------------------------

Most Relevant Skills

Please see the requirements listed above.

------------------------------------------------------

Other Relevant Skills

For complementary skills, please see above and/or contact the recruiter.

------------------------------------------------------

Citi is an equal opportunity employer, and qualified candidates will receive consideration without regard to their race, color, religion, sex, sexual orientation, gender identity, national origin, disability, status as a protected veteran, or any other characteristic protected by law.

If you are a person with a disability and need a reasonable accommodation to use our search tools and/or apply for a career opportunity, review Accessibility at Citi.

View Citi's EEO Policy Statement and the Know Your Rights poster.

at Citi

Experienced

No visa sponsorship
