Databricks Senior Developer - Data Engineering & Clinical Analytics

Johnson&Johnson

9 months ago

5 - 7 years

Hybrid

Hyderabad, Telangana, India

  • Design and implement data pipelines using Databricks for ETL processes, leveraging Spark for large-scale data processing and transformation.
  • Continuously monitor and optimize the performance of Databricks jobs, ensuring efficient execution of data pipelines.
  • Work closely with clinical programmers, medical reviewers, data managers, central monitors, data scientists, and business customers.
    Skills:

    • Databricks
    • Spark
    • SQL
    • NoSQL
    • ETL Tools
    • Data Warehousing
    • Data Governance
    • Cloud Platforms
    • Software Development Life Cycle (SDLC)
    • Agile Methodologies
    • Scala
    • Data Visualization (Tableau, Power BI)
    • Generative AI

    Job description & requirements

    Key Responsibilities:


    Technical Leadership & Expertise:

    • Define and lead all aspects of the overall data engineering architecture and roadmap using Databricks and modern data engineering practices.
    • Mentor team members in solving technical challenges related to data pipelines and analytics, providing hands-on support.

    Databricks Implementation:

    • Design and implement data pipelines using Databricks for ETL processes, leveraging Spark for large-scale data processing and transformation.
    • Develop Delta Lake tables and optimize data storage and query performance within the Databricks environment.
    • Create and maintain Databricks notebooks for data exploration, analytics, and machine learning model development, ensuring standard methodologies for documentation and collaboration.

    Performance Optimization:

    • Continuously monitor and optimize the performance of Databricks jobs, ensuring efficient execution of data pipelines.
    • Analyze and fine-tune data processing workflows to reduce runtime and enhance performance.
    • Develop strategies for optimizing Spark configurations and cluster resources for cost-effective and high-performance data processing.

    Platform Operations:

    • Ensure the stability and reliability of the Databricks platform, managing cluster configuration, scaling, and resource allocation to meet workload demands.
    • Fix and resolve operational issues within the Databricks environment, collaborating with DevOps teams for seamless integration and deployment.

    Collaboration & Communication:

    • Work closely with clinical programmers, medical reviewers, data managers, central monitors, data scientists, and business customers to understand clinical data needs and collaborate on analytics projects.
    • Facilitate communication between technical and non-technical teams to ensure alignment on data solutions and project goals.

    Development & Problem-Solving:

    • Effectively engage in the development of data pipelines and workflows within Databricks, supporting code reviews and technical discussions.
    • Drive testing and deployment of data solutions, ensuring automated testing measures are in place for quality assurance.
    • Scale proof-of-concept projects into production environments, ensuring performance, reliability, and maintainability.

    Clinical Data Review & Monitoring:

    • Leverage experience with clinical data management and clinical data review, including an understanding of clinical trial patient data (such as CRF data, lab data, and IVRS data), SDTM (Study Data Tabulation Model), and clinical systems, to create efficient data workflows.
    • Supervise and maintain the clinical data review & monitoring system to ensure stability, performance, and availability of data solutions, proactively addressing system issues and bottlenecks.
    • Assist in data migration strategies from legacy clinical systems, ensuring compliance with business rules and data governance standards.
    • Handle technical debt and continuously seek opportunities for improvement in clinical data processes and architecture.


    Qualifications:

    Education:

    • Bachelor’s degree or higher in Computer Science, Engineering, Mathematics, or a related field.


    Experience and Skills:

    Required:

    • At least five (5) years of relevant IT experience, with a strong focus on data engineering and clinical technology.
    • Demonstrable experience with Databricks and Spark in building data pipelines and analytics solutions.
    • In-depth knowledge of clinical data management, including familiarity with clinical data regulations and systems.
    • Proficiency in database technologies (e.g., SQL, NoSQL) and ETL tools.
    • Solid understanding of data warehousing concepts and data governance practices related to clinical data.
    • Familiarity with cloud services such as AWS, Azure, or GCP in relation to data solutions.
    • Experience in implementing AI/ML techniques within data engineering workflows is advantageous.
    • Strong SDLC foundations in Agile methodologies and experience in collaborative development environments.
    • Proficiency in programming languages such as Python, SQL, or Scala, along with experience in code reviews.
    • Excellent analytical and problem-solving skills, with a history of delivering data solutions in enterprise settings.
    • Good interpersonal skills; ability to convey technical information clearly to team members.

    Preferred:

    • Knowledge of data visualization tools like Tableau or Power BI is a plus.
    • Knowledge of AI implementations, especially in the areas of Generative AI and Agentic AI, is a plus.


    Experience: 5 - 7 years

    Job Domain/Function: Data Engineering

    Job Type: Hybrid

    Employment Type: Full Time

    Number of Position(s): 1

    Educational Qualifications: Bachelor's Degree

    Location: Hyderabad, Telangana, India
