Principal Data Engineer

GSK

9 months ago

10+ years

Work From Office

Bengaluru, Karnataka, India

    Agile Methodologies

    AI/ML

    RAG (Retrieval-Augmented Generation)

    Large Language Models (LLM)

    MLOps

    Vector Search

    TensorFlow

    PyTorch

    Cloud Platforms

    SQL

    Azure

    Version Control (Git)

    Apache Spark

    Job description & requirements

    Responsibilities:


    • Design, develop, and implement high-quality software solutions and high-performance data platforms that can handle large data sets with both batch and real-time processing.
    • Troubleshoot and address complex technical challenges, and improve existing software to fit the organization's unique needs and ecosystem.
    • Provide leadership and consultation on technical matters to a variety of development functions.
    • Analyze business requirements and translate them into technical requirements, developing proposals that outline how the organization's products and services can meet these needs and be integrated and implemented with the organization's technical infrastructure.
    • Design end-to-end technical solutions that are robust, secure, resilient, scalable, and maintainable, and that align with business requirements, architectural best practices, and standards to facilitate seamless data flow and communication between systems.
    • Bring an innovation mindset: look for opportunities to enhance and improve the GCO products and tools with new technology (AI/ML, LLM/NLP, automation) and help embed it in products.
    • Lead the definition of scope, delivery approach, resourcing plan and cost models to deliver new technology (system, data and services as a product) or amend existing technologies to deliver business needs and objectives.
    • Define technical roadmap and own delivery oversight for all in-scope systems working closely with Technology Product Owners and leadership teams.
    • Identify technical risks, vulnerabilities, and dependencies and develop mitigation strategies to address them. Proactively assess and manage risks throughout the software development lifecycle.
    • Lead the integration of disparate systems, platforms, and data sources to create unified and interoperable solutions.
    • Provide technical leadership, guidance, and mentorship to Technology development teams sourced from GSK’s technology delivery partners including architects, engineers, testers and analysts. Collaborate with cross-functional teams to foster a culture of innovation, continuous improvement, and knowledge sharing.
    • Define and evolve the data architecture strategy, including data modeling, data storage, and data processing frameworks.
    • Optimize the performance of systems and applications through architectural design, performance tuning, and capacity planning. Identify bottlenecks, inefficiencies, and areas for improvement, and implement solutions to enhance system performance and responsiveness.
    • Apply software design principles to complex work in the research, design, and development of new or existing products, tools, and processes required for operation, maintenance, and testing.
    • Liaise with hardware, software, and systems design engineers to ensure that products and services are modified, configured and installed.
    • Maintain familiarity with architectural standards and vision, R&D Technology Roadmap, technical landscape, validation requirements, central service request processes, GSK policies, and other aspects of GSK environment.
    • Experiment with new technologies and support the teams' continuous experimentation and learning to drive innovation in solving complex problems, overcoming the limitations of incumbent technology, and assessing new ways of working to deliver efficiency, risk mitigation, and optimization strategies.


    Why you?

    We are looking for professionals with these required skills to achieve our goals:


    Qualifications:

     

    • Bachelor's degree in Computer Science or Engineering.
    • Over 10 years of experience, with a minimum of 5 years in software development within large organizations.
    • Proven ability to create production-ready technical solutions using innovative technologies.
    • Experience with agile/scrum methodologies.
    • Strong expertise in architecting complex solutions on the Microsoft Azure cloud stack (IaaS, PaaS, and SaaS).
    • Hands-on experience developing scalable, reliable, high-performance data platforms that handle both large and small datasets, along with expertise in implementing advanced AI/ML architectures, including Retrieval-Augmented Generation (RAG), agentic AI, federated learning, and Transformer-based models (e.g., OpenAI, Llama, BERT), using retrieval, embedding, and LLM techniques.
    • Experience developing and maintaining MLOps pipelines for model versioning, monitoring, and retraining.
    • Experience with vector search tools such as Pinecone and Azure Cognitive Services, as well as AutoML, NASNet, GANs, and VAEs for generative tasks and representation learning.
    • Knowledge of ML/DL frameworks such as TensorFlow, PyTorch, Scikit-learn, and Hugging Face for data curation and for predictive and prescriptive modelling.
    • Hands-on experience integrating and deploying AI models using cloud services (AWS, Azure, GCP) in containerized environments (Docker, Kubernetes), and building APIs and microservices.
    • Experience optimizing models for accuracy, efficiency, and scalability.
    • Expertise in programming languages such as Python and PySpark, and strong SQL skills for data manipulation and performance tuning. Knowledge of JavaScript tools is a plus.
    • Strong expertise in Azure services, including ADF, ADLS Gen2, Azure Event Hubs, Azure Functions, Databricks, and Databricks Unity Catalog, as well as version control (Git), CI/CD pipelines, and data warehousing solutions such as Snowflake and Azure Synapse.
    • Experience building batch and real-time data pipelines using tools such as Apache Kafka, Apache Spark, Flink, Airflow, or similar.
    • Strong understanding of architecture patterns including Kappa and Lambda patterns, with best practices for building cloud-ready systems.
    • Knowledge of data virtualization implementation using Denodo, with an understanding of Data Mesh and Data Fabric concepts.


    Personal skills:

    • Able to weigh short-term against long-term goals and make both tactical and strategic decisions.
    • Great communication skills, including the ability to explain complex technical concepts to a non-technical audience.
    • Strong organizational skills, with the ability to work in a fast-paced environment and adapt quickly to changing priorities.
    • Stays on top of the latest trends and develops expertise in emerging cloud technologies.
    • Works well as a technical leader and individual contributor.
    • Build processes supporting data transformation, data structures, metadata, dependency, and workload management.
    • Able to assist the project team in planning and execution to achieve release level goals.
    • Proactive and vigilant in detecting and managing risks, and in communicating them effectively to Product and Project leadership teams.


    Preferred Skills:

    If you have the following characteristics, it would be a plus:

    • Master’s degree in Computer Science or a relevant field.
    • Architecture strategy and complex, end-to-end full-stack solution development.
    • Experience in life sciences, preferably in R&D.
    • Azure cloud experience or certification.
    • Experience in experimentation and prototyping with innovative technologies, primarily AI/ML and GenAI.



    Experience:

    10+ years

    Job Domain/Function:

    Data Engineering

    Job Type:

    Work From Office

    Employment Type:

    Full Time

    Number of Position(s):

    1

    Educational Qualifications:

    Bachelor's Degree

    Location:

    Bengaluru, Karnataka, India

    GSK

    At GSK, we unite science, technology, and talent to get ahead of disease together. Our goal is to improve the lives of billions across the world. By bringing together outstanding people in an inclusive environment, we can make an impact on a global scale.
