Canary Wharfian - Online Investment Banking & Finance Community.

Data Engineer - Pyspark

Experienced
No visa sponsorship

at Citi

Bulge Bracket Investment Banks

Posted 2 months ago


Senior data engineering role to build and maintain large-scale Big Data pipelines using PySpark/Spark, Hive, Hadoop and cloud-based data management technologies. Requires 4+ years of hands-on experience with Python/Scala, Unix scripting, SQL, ETL and performance tuning to redesign and optimize compute and data processes for real-time and batch systems. The role involves working with large datasets, data preprocessing, exposure to machine learning techniques and tools (Talend, Cloudera, Pepperdata), and stakeholder management within Credit Cards and Retail Banking in a hybrid onsite/offsite delivery model.

Compensation
Not specified

Currency: Not specified

City
Not specified
Country
India

Full Job Description

Data Engineer - Pyspark

Job Req Id:
26931109
Location(s):
Haryana, India
Job Type:
Hybrid
Posted:
Jan. 28, 2026

Discover your future at Citi

Working at Citi is far more than just a job. A career with us means joining a team of more than 230,000 dedicated people from around the globe. At Citi, you’ll have the opportunity to grow your career, give back to your community and make a real impact.

Job Overview

Responsibilities:

  • Engineering degree with 4+ years of experience in Big Data systems, Hive, Hadoop, Spark (Python/Scala) and cloud-based data management technologies
  • Hands-on experience in Unix scripting, Python and Scala programming, along with strong experience in SQL
  • Comfortable working with completely unstructured, undocumented code and turning it into best-in-class code, redesigning costly compute and data processes and aligning them to best development standards
  • Experience working with multiple large datasets and data warehouses, and the ability to pull data using relevant programs and code
  • Well versed in the necessary data preprocessing and application engineering skills
  • At least 3 years of experience designing software systems with intense computational needs across real-time and batch processes
  • Experience with and understanding of supervised and unsupervised machine learning techniques
  • Exposure to data ingestion, ETL tools such as Talend, modeling tools, performance management tooling such as Pepperdata, and the Cloudera stack will be a plus
  • Knowledge of data management, data governance, data security and regulatory practices
  • Ability to identify, clearly articulate and solve complex business problems and present them to management in a structured, simpler form
  • Experience working in an onsite/offsite delivery model
  • Experience in Credit Cards and Retail Banking
  • Excellent communication and interpersonal skills
  • Strong process/project management skills
  • Management of multiple stakeholders
  • Control orientation and risk awareness


Qualifications:

  • Fast learner with a desire to excel and an attitude to partner and solve problems in complex environments, placing business objectives at the center of all activity
  • Experience in performance tuning and code re-engineering is preferred
  • Experience in broad IT architecture and design across data and channels is preferred
  • Experience in query tuning and automation technologies (Autosys, Jenkins, ServiceNow) is preferred
  • Exposure to container technology and machine learning will be a plus


Education:

  • Bachelor's/University degree or equivalent experience

------------------------------------------------------

Job Family Group:

Decision Management

------------------------------------------------------

Job Family:

Data/Information Management

------------------------------------------------------

Time Type:

Full time

------------------------------------------------------

Most Relevant Skills

Please see the requirements listed above.

------------------------------------------------------

Other Relevant Skills

For complementary skills, please see above and/or contact the recruiter.

------------------------------------------------------

Citi is an equal opportunity employer, and qualified candidates will receive consideration without regard to their race, color, religion, sex, sexual orientation, gender identity, national origin, disability, status as a protected veteran, or any other characteristic protected by law.

If you are a person with a disability and need a reasonable accommodation to use our search tools and/or apply for a career opportunity, review Accessibility at Citi.

View Citi’s EEO Policy Statement and the Know Your Rights poster.

