KEY ACCOUNTABILITIES
Establish and Implement MLOps practices:
- Develop an end-to-end MLOps framework and machine learning pipeline using GCP, Vertex AI, and software tools.
- Create serving pipelines using multiple Vertex AI and GCP services. Improve ML pipeline documentation and understandability.
- Automate logging of model usage and predictions provided. Improve logging and diagnostic processes.
- Automate monitoring of models both for failures and degradation. Automate monitoring of data sources to identify issues and/or data changes.
- Design and implement dynamic re-training of ML pipelines using event-based or custom logic.
- Configure resource and infrastructure monitoring and develop the supporting pipelines using GCP services.
- Define branching strategies and manage version control using GitHub.
- Orchestrate and configure ML pipelines using Airflow/Kubeflow.
- Refactor code and implement coding best practices in line with industry standards.
Implement MLOps practices on projects and establish MLOps best practices:
- Lead the investigation and resolution of production issues, perform root cause analysis, and recommend changes to reduce or eliminate recurrence of issues.
- Optimize deployment and change control processes for models.
- Create and operationalize quality assurance processes for ML models.
Lead the execution of ML Solutions @Scale:
- Partner with business stakeholders to design and deliver value-added insights and intelligent solutions through ML and AI.
- Collaborate with Data Science Leads, ML System Engineering, and Platform teams to ensure models are deployed in a scaled and optimized way. Additionally, provide post-production support so that model performance degradation is proactively managed.
- Play a lead role in spearheading the development effort of new standards (design patterns, coding practices, orchestration patterns) and drive value and adoption across the Data Science team.
- Serve as an expert in the MLOps and model management space; bring together business knowledge, architecture, resources, people, and technology to create more effective solutions.
Research, Evolve and Publish best practices:
- Research and operationalize the technology and processes necessary to scale MLOps.
- Recommend model changes to optimize cloud spend.
- Research and recommend MLOps best practices for new technologies, platforms, and services.
- Drive ideation, design, and creation of new ML Architecture patterns in discussion with the Enterprise Architecture team.
- Propose improvement plans and suggestions for MLOps pipelines.
Communication and Collaboration:
- Knowledge sharing with the broader analytics team and stakeholders.
- Communicate regularly about ongoing work to support a remote, geographically distributed team culture.
- Communicate accomplishments, failures, and risks in a timely manner.
- Run knowledge-sharing sessions with the team on specific MLOps topics. Coach and mentor junior ML team members.
- Foster a collaborative and innovative team environment. Contribute to the overall effort to educate stakeholders on AI practices.
- Collaborate closely with project stakeholders and data science leaders to ensure practices are developed and enhanced to support accelerated analytic development and maintainability.
Embrace a learning mindset:
- Continually invest in one’s knowledge and skill set through formal training, reading, and attending conferences and meetups.
MINIMUM QUALIFICATIONS
- Full-time graduate from an accredited university.
- Advanced degree in a quantitative field (CS, engineering, statistics, math, data science).
- Proven technical leadership in a large, complex matrixed organization.
- 6+ years of relevant machine learning experience and 12+ years of overall industry experience.
- Experience in supervised ML algorithms, optimization, and performance tuning.
- Track record of producing machine learning models and production infrastructure at scale.
- Strong verbal and written communication skills including the ability to interact effectively with colleagues of varying technical and non-technical abilities.
- Passionate about agile software processes, data-driven development, reliability, and systematic experimentation.
- Passion for learning new technologies and solving challenging problems.
- Good understanding of CI, CD, TDD, and tools such as Jenkins.
- Strong understanding of orchestration frameworks such as Airflow/Kubeflow/MLflow.
- Agile software development experience such as Kanban and Scrum.
- Experience in software version control team practices and tools such as Git and TFS.
- Expertise in data transformation and manipulation through BigQuery/SQL.
- Professional experience with Vertex AI and GCP Services.
- Strong proficiency in Python.
PREFERRED QUALIFICATIONS
- GCP Machine Learning certification
- Understanding of the CPG industry.
- Exposure to Deep Learning/RL/LLMs
- Prior experience in the CPG industry.
- Publications or contributions to the data science and AI community.
- Certifications in AI, machine learning, or related fields.