Responsibilities
As a Principal Data Engineer, you will be at the forefront of Oracle’s data initiatives, playing a pivotal role in transforming raw data into actionable insights. Collaborating with data scientists and business stakeholders, you will design scalable data pipelines, optimize data infrastructure, and ensure the availability of high-quality datasets for strategic analysis. This role goes beyond data engineering, requiring hands-on involvement in statistical analysis and predictive modeling. You will use techniques such as regression analysis, trend forecasting, and time-series modeling to extract meaningful insights from data, directly supporting business decision-making.
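The trend-forecasting and time-series techniques mentioned above can be illustrated with a minimal sketch. This example uses NumPy (named later under the preferred qualifications); the monthly revenue figures are purely hypothetical and chosen to follow a clean linear trend.

```python
import numpy as np

# Illustrative monthly revenue figures following an exact linear trend
# (real data would be noisy; these values are invented for the example).
months = np.arange(12)
revenue = 100.0 + 5.0 * months

# Fit a degree-1 polynomial: np.polyfit returns [slope, intercept].
slope, intercept = np.polyfit(months, revenue, 1)

# Forecast the next period by extrapolating the fitted trend.
forecast = slope * 12 + intercept
```

On real data the fit would be evaluated against a held-out window before the forecast is trusted; this sketch only shows the mechanics.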
Basic Qualifications:
- 7+ years of experience in data engineering and analytics, with a strong background in designing scalable database architectures, building and optimizing data pipelines, and applying statistical analysis to deliver strategic insights in complex, high-volume data environments.
- Deep knowledge of big data frameworks and tools such as Apache Spark, Apache Flink, Apache Airflow, Presto, and Kafka, as well as data warehouse solutions.
- Experience working with other cloud platform teams and accommodating requirements from those teams (compute, networking, search, storage).
- Excellent written and verbal communication skills, with the ability to present complex information clearly and concisely to all audiences.
- Experience designing and optimizing database structures for scalability, performance, and reliability within Oracle ADW and OCI environments, including maintaining schema integrity, managing database objects, and implementing efficient table structures that support seamless reporting and analytical needs.
- Experience building and managing data pipelines that automate the flow of data from diverse sources into Oracle databases, using ETL processes to transform data for analysis and reporting.
- Experience conducting data quality assessments, identifying anomalies, and validating the accuracy of ingested data, including partnering with data governance teams to define quality metrics and implement controls that uphold data integrity for stakeholders.
- Experience mentoring junior team members and sharing best practices in data analysis, modeling, and domain expertise.
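As a rough sketch of the data quality assessment work described above, the following uses pandas (listed under the preferred qualifications) to flag duplicates, nulls, and out-of-range values. The table and column names are hypothetical, not taken from any real Oracle schema.

```python
import pandas as pd

# Hypothetical validation pass over a small ingested orders table
# (data and column names are illustrative).
orders = pd.DataFrame({
    "order_id": [1, 2, 2, 4],
    "amount":   [50.0, -10.0, 75.0, None],
})

# Simple data quality metrics: duplicate keys, missing values,
# and values violating a business rule (amounts must be non-negative).
report = {
    "duplicate_ids":    int(orders["order_id"].duplicated().sum()),
    "null_amounts":     int(orders["amount"].isna().sum()),
    "negative_amounts": int((orders["amount"] < 0).sum()),
}
```

In practice a report like this would feed the metrics and controls agreed with the data governance team, rather than being computed ad hoc.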
Preferred Qualifications:
- Solid understanding of statistical methods, hypothesis testing, data distribution, regression analysis, and probability.
- Proficiency in Python for data analysis and statistical modeling. Experience with libraries like pandas, NumPy, and SciPy.
- Knowledge of methods and techniques for data quality assessment, anomaly detection, and validation processes. Skills in defining data quality metrics, creating data validation rules, and implementing controls to monitor and uphold data integrity.
- Familiarity with visualization tools (e.g., Tableau, Power BI, Oracle Analytics Cloud) and libraries (e.g., Matplotlib, Seaborn) to convey insights effectively.
- Strong communication skills for collaborating with stakeholders and translating business goals into technical data requirements.
- Ability to contextualize data insights in business terms and to present findings to non-technical stakeholders in a meaningful way.
- Ability to cleanse, transform, and aggregate data from various sources, ensuring it’s ready for analysis.
- Experience with relational database management and design, specifically in Oracle environments (e.g., Oracle Autonomous Data Warehouse, Oracle Database).
- Skills in designing, maintaining, and optimizing database schemas to ensure efficiency, scalability, and reliability.
- Advanced SQL skills for complex queries, indexing, stored procedures, and performance tuning.
- Experience with ETL tools such as Oracle Data Integrator (ODI) or other data integration frameworks.
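The cleansing, transformation, and aggregation skills listed above might look like the following in pandas; the region/sales data is invented for illustration.

```python
import pandas as pd

# Hypothetical cleanse-and-aggregate step: normalize inconsistent
# region labels, then total sales per region.
raw = pd.DataFrame({
    "region": [" east", "East ", "west", "WEST"],
    "sales":  [10.0, 20.0, 5.0, 15.0],
})

# Cleanse: strip stray whitespace and unify casing so that
# " east" and "East " collapse into a single key.
raw["region"] = raw["region"].str.strip().str.lower()

# Aggregate: total sales per normalized region.
totals = raw.groupby("region")["sales"].sum()
```

The same normalize-then-aggregate pattern applies whether the transform runs in pandas, in an ODI mapping, or in SQL inside the warehouse.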
Required Skills
Distributed Systems
Java (Programming Language)
Kubernetes
Operating Systems
Operational Excellence
Relational Databases
SQL (Structured Query Language)