
at Optiver
Proprietary Trading
Posted 8 days ago
**Senior Machine Learning Platform Engineer**: Build and improve PB-scale ML infrastructure, optimize performance, and support ML researchers and data scientists, leveraging existing research clusters.
- **Responsibilities**: Platform design and stability, efficiency optimization, infrastructure profiling, and troubleshooting.
- **Challenges**: Scale the ML platform for PB-scale model training and simulations across diverse ML models.
- **Requirements**: Proven experience with ML infrastructure at scale, Linux computing environments, containerization, distributed training, and data-intensive storage solutions.

Keywords: Senior, Machine Learning, Platform Engineering, Large-Scale Infrastructure, GPU Clusters, Linux, Docker, Distributed Training, Data Storage.
- Compensation: Not specified
- City: Shanghai
- Country: China
Currency: Not specified
Full Job Description
Senior Machine Learning Platform Engineer
Level: Experienced
Location: Shanghai
Department: Technology
Key Responsibilities
- Build the compute platform and machine learning libraries for large-scale machine learning and simulation workloads
- Maintain compute platform stability and efficiency on both CPU and GPU clusters, making the platform observable and scalable
- Use cluster monitoring and profiling tools to identify bottlenecks and optimize both infrastructure and software systems
- Troubleshoot and resolve issues related to OS, storage, network, and GPUs
Challenges You Will Tackle: Design, build, and improve our compute platform for PB-scale model training and simulations across a wide range of machine learning models, leveraging our existing research infrastructure.
Requirements:
- Solid experience running production machine learning infrastructure at large scale
- Experience designing, deploying, profiling, and troubleshooting Linux-based computing environments
- Proficiency in containerization, parallel computing, and distributed training algorithms
- Experience with storage solutions for large-scale, cluster-based, data-intensive workloads
Bonus qualification:
- Experience supporting machine learning researchers or data scientists with production workloads
WHAT YOU CAN EXPECT FROM US:
In return for you joining our elite team, you will be offered a competitive salary package as well as access to a plethora of Optiver-perks. To hear more about what it is like to work here and our great culture, apply now and take the first step towards the best career move you will ever make!
DIVERSITY AND INCLUSION
Optiver is committed to
PRIVACY DISCLAIMER
Personal information protection is of utmost importance to Optiver. Before you provide any personal information to us, we strongly urge you to read our Optiver China Privacy Notice.




