Job Description:
- Building Scalable Data Pipelines:
- Design, implement, and maintain end-to-end data pipelines to efficiently extract, transform, and load (ETL) data from diverse sources.
- Ensure data pipelines are reliable, scalable, and performant.
- SQL Expertise:
- Write and optimize complex SQL queries for data extraction, transformation, and reporting.
- Collaborate with analysts and data scientists to provide structured data for analysis.
- Cloud Platform Experience:
- Utilize cloud services to enhance data processing and storage capabilities.
- Integrate new tools into the existing data ecosystem.
- Documentation and Collaboration:
- Document data pipelines, procedures, and best practices to facilitate knowledge sharing.
- Collaborate closely with cross-functional teams to understand data requirements and deliver solutions.
Required Skills:
- 3+ years of experience with SQL and Python.
- 1+ years of experience with GCP BigQuery, Dataflow, GCS, and Postgres.
- 2+ years of experience building distributed, fault-tolerant data pipelines from scratch.
- Comfortable with a broad array of relational and non-relational databases.
- Proven track record of building applications in a data-focused role (cloud and traditional data warehouses).
- Experience with Cloud SQL, Cloud Functions, Pub/Sub, Cloud Composer, etc.
- Inquisitive, proactive, and interested in learning new tools and techniques.
- Familiarity with big data and machine learning tools and platforms. Comfortable with open-source technologies including Apache Spark, Hadoop, and Kafka.
- Strong oral, written, and interpersonal communication skills.
- Comfortable working in a dynamic environment where problems are not always well-defined.