Key Responsibilities:
As an AI Ops Expert, you have full responsibility for and ownership of the deliverables, meeting the defined quality standards within the defined timeline and budget.
· Design, implement, and manage AIops solutions to automate and optimize AI/ML workflows.
· Collaborate with data scientists, engineers, and other stakeholders to ensure seamless integration of AI/ML models into production.
· Monitor and maintain the health and performance of AI/ML systems.
· Develop and maintain CI/CD pipelines for AI/ML models.
· Implement best practices for model versioning, testing, and deployment.
· Troubleshoot and resolve issues related to AI/ML infrastructure and workflows.
· Stay up to date with the latest AIops, MLOps, and Kubernetes tools and technologies.
Profile required
Requirements and skills
· Bachelor’s or Master’s degree in Computer Science, Software Engineering, or a related field.
· 2-7 years of relevant experience.
· Proven experience in AIops, MLOps, or related fields.
· Strong proficiency in Python and experience with FastAPI.
· Strong hands-on expertise in Kubernetes (or AKS).
· Hands-on experience with MS Azure and its AI/ML services, including Azure ML Flow.
· Proficiency in using Dev Containers for development.
· Knowledge of CI/CD tools such as Jenkins, GitHub Actions, or Azure DevOps.
· Experience with containerization and orchestration tools like Docker and Kubernetes.
· Strong problem-solving skills and the ability to work in a fast-paced environment.
· Excellent communication and collaboration skills.
Preferred Skills:
· Experience with machine learning frameworks such as TensorFlow, PyTorch, or scikit-learn.
· Familiarity with data engineering tools like Apache Kafka, Apache Spark, or similar.
· Knowledge of monitoring and logging tools such as Prometheus, Grafana, or ELK stack.
· Understanding of data and model versioning tools like DVC or MLflow.
· Experience with infrastructure as code (IaC) tools like Terraform or Ansible.
· Proficiency in Azure-specific tools and services, such as:
    · Azure Machine Learning (Azure ML)
    · Azure DevOps
    · Azure Kubernetes Service (AKS)
    · Azure Functions
    · Azure Logic Apps
    · Azure Data Factory
    · Azure Monitor and Application Insights