Preferred Skills:
- Extensive experience with the Google Cloud stack (BigQuery, GCS, Dataproc, Data Fusion, Dataflow, Cloud Pub/Sub, Composer, Cloud SQL, Google Kubernetes Engine, Cloud Run, Compute Engine, Artifact Registry, Google Data Studio).
- Experience working with Python.
- Knowledge of security, Identity and Access Management (IAM), key management (Google-managed and customer-managed encryption keys, GMEK/CMEK), and cloud IaaS components on GCP.
- Experience with SQL and NoSQL modern data stores.
- Experience in job scheduling using Oozie, Airflow, Cloud Scheduler, or another ETL scheduler.
- Experience designing and building data pipelines from ingestion to consumption within a big data architecture, using Java, Python, or Scala.
- Experience with cloud IaaS and DevOps platforms such as Terraform or equivalent technologies.
- Experience with MDM, metadata management, data quality, and data lineage tools.
- Experience implementing an ETL framework over big data frameworks such as Apache Spark and Hive.
- Scripting competency (Unix shell scripting, Python, JSON, and YAML).
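As a rough illustration of the ingestion-to-consumption pipeline skill listed above, here is a minimal sketch in pure Python; the stage names, record fields, and sample data are all invented for this example and are not tied to any specific GCP service:

```python
# Minimal extract-transform-load sketch. All names and fields are
# hypothetical; a real pipeline would read from and write to managed
# services (e.g. GCS, Pub/Sub, BigQuery) rather than Python lists.

def extract(raw_lines):
    """Ingest: parse raw CSV-like lines into records."""
    for line in raw_lines:
        name, amount = line.strip().split(",")
        yield {"name": name, "amount": float(amount)}

def transform(records):
    """Cleanse: drop non-positive amounts, normalize names."""
    for rec in records:
        if rec["amount"] > 0:
            yield {"name": rec["name"].title(), "amount": rec["amount"]}

def load(records, sink):
    """Consume: append records to an in-memory sink (stand-in for a warehouse table)."""
    for rec in records:
        sink.append(rec)
    return sink

raw = ["alice,10.5", "bob,-3", "carol,7"]
result = load(transform(extract(raw)), [])
# result holds only the cleansed, normalized records
```

The same extract/transform/load structure carries over directly to Spark or Dataflow jobs, where each stage becomes a distributed transformation instead of a generator.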
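The scripting bullet above can be made concrete with a short standard-library example; the configuration keys and values here are invented for illustration (YAML handling would additionally need a third-party parser such as PyYAML):

```python
import json

# Hypothetical job configuration; keys and values are invented for
# this illustration only.
config_text = '{"job": "daily_load", "retries": 3, "tables": ["orders", "users"]}'

config = json.loads(config_text)
retries = config.get("retries", 1)   # fall back to 1 if the key is absent
tables = ", ".join(config["tables"])
summary = f"{config['job']}: {retries} retries over {tables}"
```

This is the kind of lightweight glue scripting (parsing structured config, assembling job parameters) that typically sits alongside shell scripts in an ETL environment.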