Canary Wharfian - Online Investment Banking & Finance Community.

Data Engineer - Python, AI

Experienced | No visa sponsorship
at Citi

Bulge Bracket Investment Banks

Posted 10 days ago

**Data Engineer - Python, AI**: Build & optimize data pipelines with PySpark, Pandas, and AI models (BERT, Flair, LLM). Develop APIs, integrate services, and manage CI/CD. 10+ years of Python experience required.

Compensation
Not specified

Currency: Not specified

City
Pune
Country
India

Full Job Description

Data Engineer - Python, AI

Job Req Id:

26965263

Location(s):

Pune, Maharashtra, India

Job Type:

Hybrid

Posted:

May 27, 2026

Discover your future at Citi

Working at Citi is far more than just a job. A career with us means joining a team of more than 230,000 dedicated people from around the globe. At Citi, you'll have the opportunity to grow your career, give back to your community and make a real impact.

Job Overview

Role Summary
We are looking for a mid-level Python Developer with combined experience in Data Engineering and AI/NLP engineering. The candidate will build NLP pipelines using libraries such as Flair, BERT, and LLM frameworks, and will also work on large-scale data processing using PySpark, Pandas, and related data tools. The role includes developing APIs, integrating with platform services, and supporting CI/CD deployments using GitHub and LightSpeed Enterprise.

Key Responsibilities

  • Develop and optimize ETL/data processing jobs using PySpark, Pandas, PyArrow, and related libraries.
  • Build and maintain NLP pipelines using Flair, BERT, and LLM-based models.
  • Develop scalable ingestion and data transformation pipelines for AI and analytics use cases.
  • Build and maintain Flask-based APIs for model inference and service integrations.
  • Use regular expressions for text cleaning, parsing, and NLP preprocessing.
  • Integrate caching and fast lookups using Redis.
  • Manage and deploy ML models using MLflow for tracking and versioning.
  • Support CI/CD workflows using GitHub, LightSpeed Enterprise, and deployment pipelines.
  • Create and maintain Autosys JILs for job scheduling and automation.
  • Use basic Linux commands for troubleshooting, operations, and deployment tasks.
  • Monitor application and system health using ITRS Geneos.
  • Write unit tests and improve automation test coverage (PyTest/unittest).
  • Work with REST APIs, microservices, and basic shell scripting.
  • Work with cloud services (ECS), including boto3.
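
As a rough illustration of the regular-expression text cleaning mentioned in the list above, a minimal sketch using only Python's standard `re` module might look like the following. The function names (`clean_text`, `tokenize`) and the specific patterns are hypothetical choices for illustration and are not taken from the posting or from any Citi codebase.

```python
import re

# Hypothetical patterns; real preprocessing rules depend on the data at hand.
_URL = re.compile(r"https?://\S+")
_NON_WORD = re.compile(r"[^a-z0-9\s]")
_WHITESPACE = re.compile(r"\s+")


def clean_text(text: str) -> str:
    """Lowercase, drop URLs and punctuation, and collapse whitespace."""
    text = text.lower()
    text = _URL.sub(" ", text)       # remove URLs before stripping punctuation
    text = _NON_WORD.sub(" ", text)  # keep only letters, digits, and whitespace
    return _WHITESPACE.sub(" ", text).strip()


def tokenize(text: str) -> list[str]:
    """Split cleaned text into whitespace-delimited tokens."""
    return clean_text(text).split()
```

In practice, output like this would typically feed a model-specific tokenizer rather than replace it; the sketch only covers the regex cleaning step.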

Required Skills

  • 10-12 years of hands-on Python programming experience.
  • Strong fundamentals in Python, OOP, and design patterns.
  • Experience with NLP libraries such as Flair, BERT, HuggingFace Transformers, or similar.
  • Solid experience with PySpark, Pandas, PyArrow, and distributed data pipelines.
  • Experience building APIs using Flask (FastAPI is a plus).
  • Experience with MLflow for model tracking and deployment.
  • Good understanding of CI/CD practices and Git workflows.
  • Experience working with Redis or similar in-memory stores.
  • Experience with Autosys JILs for job scheduling.
  • Comfortable with Linux command line and shell scripting.
  • Strong debugging, problem-solving, and teamwork skills.
  • Exposure to cloud services; AWS boto3 experience is an asset.

Nice-to-Have

  • Experience with Polars or Dask for high-performance data processing.
  • Experience with PyTorch or TensorFlow for model training.
  • Experience with Docker, Kubernetes, or containerized deployments.
  • Experience with monitoring tools such as ITRS Geneos.
  • Experience with FastAPI, Airflow, or Prefect.

------------------------------------------------------

Job Family Group:

Technology

------------------------------------------------------

Job Family:

Applications Development

------------------------------------------------------

Time Type:

Full time

------------------------------------------------------

Most Relevant Skills

Please see the requirements listed above.

------------------------------------------------------

Other Relevant Skills

For complementary skills, please see above and/or contact the recruiter.

------------------------------------------------------

Citi is an equal opportunity employer, and qualified candidates will receive consideration without regard to their race, color, religion, sex, sexual orientation, gender identity, national origin, disability, status as a protected veteran, or any other characteristic protected by law.

If you are a person with a disability and need a reasonable accommodation to use our search tools and/or apply for a career opportunity, review Accessibility at Citi.

View Citi's EEO Policy Statement and the Know Your Rights poster.
