[Remote] Principal ML Infra Engineer, Machine Learning Infrastructure & Data
Note: This is a remote position open to candidates in the USA. Upwork is the world’s largest work marketplace, connecting businesses with highly skilled professionals worldwide. As a Principal ML Infra Engineer, you will design, develop, and maintain scalable ML infrastructure components, collaborating with cross-functional teams to enhance machine learning initiatives.
- Responsibilities
- Own technical workstreams from start to finish, contribute to the team’s product roadmap, and be responsible for major technical decisions and tradeoffs.
- Participate effectively in the team’s planning, code reviews, and design discussions.
- Consider the effects of projects across multiple teams and proactively manage conflicts.
- Work together with partner teams to achieve cross-departmental goals and satisfy broad requirements.
- Design, implement, and optimize distributed systems and infrastructure components to support large-scale machine learning workflows, including data ingestion, feature engineering, model training, and serving.
- Develop and maintain frameworks, libraries, and tools to streamline the end-to-end machine learning lifecycle, from data preparation through model training, evaluation, deployment, and monitoring.
- Architect and implement highly available, fault-tolerant, and secure systems that meet the performance and scalability requirements of production machine learning workloads.
- Collaborate with machine learning researchers and data scientists on novel research, publish the results, and translate that research into scalable and efficient software solutions.
- Stay current with the latest advancements in machine learning infrastructure, distributed computing, and cloud technologies, and integrate them into our platform to drive innovation.
- Mentor teammates, conduct code reviews, and uphold engineering best practices to ensure the delivery of high-quality software solutions.
- Skills
- Passion for ML Infrastructure: We value enthusiasm for advancing ML infrastructure.
- Proven Impact: Show us your track record of delivering impactful solutions.
- Innovative Thinker: Bring creativity and fresh ideas to the table.
- Technical Proficiency: Solid foundation in software engineering and ML concepts.
- Collaborative Mindset: Strong communication and teamwork skills are a must.
- Continuous Learner: Stay updated with the latest advancements in the field of AI.
- Problem-Solving Skills: Ability to tackle complex problems effectively.
- Adaptability: Thrive in a fast-paced, dynamic environment.
- Benefits
- Comprehensive medical insurance coverage for both you and your family
- Unlimited paid time off
- A 401(k) plan with matching contributions
- 12 weeks of paid parental leave
- An Employee Stock Purchase Plan
- Company Overview
- Upwork is an online marketplace that connects businesses and organizations with professionals and independent talent. It was founded in 2014 and is headquartered in Palo Alto, California, USA, with a workforce of 501-1000 employees. Its website is https://www.upwork.com.
- Company H1B Sponsorship
- Upwork has a track record of offering H1B sponsorships, with 6 in 2025, 28 in 2024, 23 in 2023, 35 in 2022, 29 in 2021, and 22 in 2020. Please note that this does not guarantee sponsorship for this specific role.
Apply to this job