ROLE RESPONSIBILITIES
- Develop data solutions to support data scientists and analytics/AI solutions, ensuring data quality, reliability, and efficiency
- Conduct exploratory data analysis and quality checks
- Deliver scalable data pipelines that ingest and integrate data from various information sources
- Contribute to best practices, standards, and documentation to ensure consistency and scalability
- Conduct data engineering research to advance design and development capabilities
- Guide junior developers on concepts such as data modeling, database architecture, data pipeline management, DataOps and automation, tools, and best practices
- Demonstrate a proactive approach to identifying and resolving potential system issues
- Create and maintain robust technical documentation for data solutions to enable knowledge retention and sharing
- Collaborate with data scientists, engineers, and colleagues from across Pfizer to integrate AI and data science models into production solutions
- Partner with the AIDA Data and Platforms teams to enforce best practices for data engineering and data solutions
BASIC QUALIFICATIONS
- Bachelor's degree in Computer Science, Information Technology, Software Engineering, Data Science, Computer Engineering, Information Systems, Engineering, or a related discipline
- 5+ years of hands-on experience building data pipelines and processes with SQL, Python, and object-oriented or scripting languages (e.g., Java, C++). Proficiency in SQL programming, including the ability to create and debug stored procedures, functions, and views
- Knowledge of modern data engineering frameworks and tools such as Snowflake, Redshift, Spark, Airflow, Hadoop, Kafka, and related technologies
- Experience working in a cloud-based analytics ecosystem (AWS, Snowflake, etc.)
- Understanding of the Software Development Life Cycle (SDLC) and the data science development lifecycle (e.g., CRISP-DM)
- Highly self-motivated to deliver both independently and with strong team collaboration
- Ability to take on new challenges creatively and work outside one's comfort zone
- Strong English communication skills (written & verbal)
PREFERRED QUALIFICATIONS
- Advanced degree in Data Science, Computer Engineering, Computer Science, Information Systems, or a related discipline
- Experience with data science enabling technologies, such as Dataiku Data Science Studio, AWS SageMaker, or other data science platforms
- Familiarity with machine learning and AI technologies and their integration with data engineering pipelines
- Familiarity with containerization technologies such as Docker and orchestration platforms such as Kubernetes
- Experience working effectively in a distributed remote team environment
- Hands-on experience with Agile teams, processes, and practices
- Proficiency with version control systems such as Git
- Pharma & Life Science commercial functional knowledge
- Pharma & Life Science commercial data literacy