We are looking for a skilled Senior Data Engineer with 7+ years of experience designing, building, and maintaining scalable data pipelines. The ideal candidate will have strong expertise in PySpark, SQL, Databricks, Azure, and Azure Data Factory (ADF). You will be responsible for developing efficient data solutions, optimizing workflows, and ensuring data reliability to support business decision-making.
Key Responsibilities:
Design, develop, and deploy high-performance data pipelines using PySpark, Databricks, and Azure Data Factory (ADF).
Optimize and maintain SQL queries, stored procedures, and ETL processes for large-scale data processing.
Work with Azure cloud services (Azure Data Lake, Azure Synapse, Blob Storage, etc.) to build scalable data solutions.
Implement data integration, transformation, and warehousing solutions in Databricks.
Collaborate with data scientists, analysts, and business teams to understand data requirements and deliver actionable insights.
Ensure data quality, governance, and security best practices across all pipelines.
Troubleshoot and resolve data-related issues in production environments.
Automate workflows and improve efficiency using CI/CD pipelines.
Mentor junior engineers and provide technical guidance.
Required Skills & Qualifications:
7+ years of hands-on experience in Data Engineering.
Strong expertise in PySpark for large-scale data processing.
Proficiency in SQL (query optimization, stored procedures, performance tuning).
Hands-on experience with Azure cloud services (ADF, Azure Data Lake, Synapse, Blob Storage, etc.).
Experience with Databricks (Delta Lake, Spark optimization, notebook workflows).
Knowledge of data modeling, ETL/ELT processes, and data warehousing.
Familiarity with Python, Shell scripting, and automation.
Understanding of CI/CD pipelines and DevOps practices (Git, Azure DevOps).
Strong problem-solving skills and the ability to work in a fast-paced environment.