Data Engineer

Siemens

9 months ago

7 - 10 years

Work From Office

Bengaluru, Karnataka, India

Python (Programming Language)

Machine Learning Tools

MLOps

Model Deployment

PostgreSQL

MySQL

NoSQL

ETL Tools

Continuous Integration and Continuous Delivery (CI/CD)

Apache Flink

Apache Spark Streaming

Cloud Platforms

Data Warehousing

Job description & requirements

Role Overview

We are seeking an experienced Data Engineer with 7-10 years of experience to design, develop, and optimize data pipelines while integrating machine learning (ML) capabilities into production workflows. The ideal candidate will have a strong background in data engineering, big data technologies, cloud platforms, and ML model deployment. This role requires expertise in building scalable data architectures, processing large datasets, and supporting machine learning operations (MLOps) to enable data-driven decision-making.


Key Responsibilities

Data Engineering & Pipeline Development

  • Design, develop, and maintain scalable, robust, and efficient data pipelines for batch and real-time data processing.
  • Build and optimize ETL/ELT workflows to extract, transform, and load structured and unstructured data from multiple sources (see the sketch after this list).
  • Work with distributed data processing frameworks like Apache Spark, Hadoop, or Dask for large-scale data processing.
  • Ensure data integrity, quality, and security across the data pipelines.
  • Implement data governance, cataloging, and lineage tracking using appropriate tools.
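
For illustration only, a minimal sketch of the kind of batch ETL pipeline described above, assuming PySpark; the S3 paths, column names, and application name are hypothetical placeholders rather than part of the role description.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("orders_daily_etl").getOrCreate()

# Extract: read raw JSON events from a hypothetical landing zone.
raw = spark.read.json("s3://example-landing/orders/2024-01-01/")

# Transform: basic cleansing and enrichment.
cleaned = (
    raw.dropDuplicates(["order_id"])
       .filter(F.col("amount") > 0)
       .withColumn("order_date", F.to_date("created_at"))
)

# Load: write partitioned Parquet to a curated zone.
(cleaned.write
        .mode("overwrite")
        .partitionBy("order_date")
        .parquet("s3://example-curated/orders/"))
```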

Machine Learning Integration

  • Collaborate with data scientists to deploy, monitor, and optimize ML models in production.
  • Design and implement feature engineering pipelines to improve model performance.
  • Build and maintain MLOps workflows, including model versioning, retraining, and performance tracking (see the sketch after this list).
  • Optimize ML model inference for low-latency and high-throughput applications.
  • Work with ML frameworks such as TensorFlow, PyTorch, Scikit-learn, and deployment tools like Kubeflow, MLflow, or SageMaker.
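
As one possible illustration of the model-versioning workflow mentioned above, the following sketch assumes MLflow with scikit-learn; the experiment name, registered model name, and toy dataset are hypothetical, and a tracking server with a model registry is assumed to be configured.

```python
import mlflow
import mlflow.sklearn
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Hypothetical toy data standing in for real training features.
X, y = make_classification(n_samples=500, n_features=10, random_state=42)

mlflow.set_experiment("demand-forecast")  # hypothetical experiment name
with mlflow.start_run():
    model = RandomForestClassifier(n_estimators=100).fit(X, y)
    mlflow.log_param("n_estimators", 100)
    mlflow.log_metric("train_accuracy", model.score(X, y))
    # Registering the model creates a new version that can later be
    # promoted, retrained, or compared against previous versions.
    mlflow.sklearn.log_model(model, "model",
                             registered_model_name="demand_forecast_rf")
```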

Cloud & Big Data Technologies

  • Architect and manage cloud-based data solutions using AWS, Azure, or GCP.
  • Utilize serverless computing (AWS Lambda, Azure Functions) and containerization (Docker, Kubernetes) for scalable deployment.
  • Work with data lakehouses (Delta Lake, Iceberg, Hudi) for efficient storage and retrieval.
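
A minimal sketch of writing to one of the lakehouse formats named above (Delta Lake), assuming Spark with the delta-spark package installed; the application name, table path, and sample rows are hypothetical.

```python
from pyspark.sql import SparkSession

# Spark session configured for Delta Lake (requires the delta-spark package).
spark = (
    SparkSession.builder.appName("lakehouse_demo")
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog",
            "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    .getOrCreate()
)

# Hypothetical table: writing as Delta provides ACID transactions and versioning.
df = spark.createDataFrame([(1, "open"), (2, "closed")], ["ticket_id", "status"])
df.write.format("delta").mode("overwrite").save("/tmp/example/tickets_delta")

# Read an earlier version of the same table (time travel).
v0 = (spark.read.format("delta")
      .option("versionAsOf", 0)
      .load("/tmp/example/tickets_delta"))
```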

Database & Storage Management

  • Design and optimize relational (PostgreSQL, MySQL, SQL Server) and NoSQL (MongoDB, Cassandra, DynamoDB) databases.
  • Manage and optimize data warehouses (Snowflake, BigQuery, Redshift, Databricks) for analytical workloads.
  • Implement data partitioning, indexing, and query optimizations for performance improvements.
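
By way of example, range partitioning and indexing of the sort mentioned in the last bullet, shown for PostgreSQL via psycopg2; the table, partition bounds, and connection string are hypothetical placeholders.

```python
import psycopg2

ddl = """
CREATE TABLE IF NOT EXISTS events (
    event_id   BIGINT,
    event_time TIMESTAMPTZ NOT NULL,
    payload    JSONB
) PARTITION BY RANGE (event_time);

CREATE TABLE IF NOT EXISTS events_2024_q1
    PARTITION OF events
    FOR VALUES FROM ('2024-01-01') TO ('2024-04-01');

-- Index the partition on the common query predicate.
CREATE INDEX IF NOT EXISTS idx_events_2024_q1_time
    ON events_2024_q1 (event_time);
"""

# Hypothetical connection string; real credentials would come from a secret store.
with psycopg2.connect("dbname=analytics user=etl") as conn:
    with conn.cursor() as cur:
        cur.execute(ddl)
```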

Collaboration & Best Practices

  • Work closely with data scientists, software engineers, and DevOps teams to develop scalable and reusable data solutions.
  • Implement CI/CD pipelines for automated testing, deployment, and monitoring of data workflows (see the example after this list).
  • Follow best practices in software engineering, data modeling, and documentation.
  • Continuously improve the data infrastructure by researching and adopting new technologies.
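
As a small illustration of the automated-testing half of that CI/CD bullet, a unit test that a CI job (Jenkins, Git-based CI, etc.) could run against a data transformation before deployment; the transformation and expected values are hypothetical.

```python
import pandas as pd


def normalize_orders(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical transformation: drop duplicate orders and reject negative amounts."""
    out = df.drop_duplicates(subset=["order_id"])
    return out[out["amount"] >= 0].reset_index(drop=True)


def test_normalize_orders_removes_duplicates_and_negatives():
    raw = pd.DataFrame(
        {"order_id": [1, 1, 2, 3], "amount": [10.0, 10.0, -5.0, 7.5]}
    )
    result = normalize_orders(raw)
    assert list(result["order_id"]) == [1, 3]
    assert (result["amount"] >= 0).all()
```

Run with pytest as a pipeline stage so a failing data-quality check blocks deployment.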


Required Skills & Qualifications

Technical Skills:

  • Programming Languages: Python, SQL, Scala, Java
  • Big Data Technologies: Apache Spark, Hadoop, Dask, Kafka
  • Cloud Platforms: AWS (Glue, S3, EMR, Lambda), Azure (Data Factory, Synapse), GCP (BigQuery, Dataflow)
  • Data Warehousing: Snowflake, Redshift, BigQuery, Databricks
  • Databases: PostgreSQL, MySQL, MongoDB, Cassandra
  • ETL/ELT Tools: Airflow, dbt, Talend, Informatica
  • Machine Learning Tools: MLflow, Kubeflow, TensorFlow, PyTorch, Scikit-learn
  • MLOps & Model Deployment: Docker, Kubernetes, SageMaker, Vertex AI
  • DevOps & CI/CD: Git, Jenkins, Terraform, CloudFormation

Soft Skills:

  • Strong analytical and problem-solving abilities.
  • Excellent collaboration and communication skills.
  • Ability to work in an agile and cross-functional team environment.
  • Strong documentation and technical writing skills.


Preferred Qualifications

  • Experience with real-time streaming solutions like Apache Flink or Spark Streaming.
  • Hands-on experience with vector databases and embeddings for ML-powered applications.
  • Knowledge of data security, privacy, and compliance frameworks (GDPR, HIPAA).
  • Experience with GraphQL and REST API development for data services.
  • Understanding of LLMs and AI-driven data analytics.

Experience :

7 - 10 years

Job Domain/Function :

Data Engineer

Job Type :

Work From Office

Employment Type :

Full Time

Number Of Position(s) :

1

Educational Qualifications :

Bachelor's Degree

Location :

Bengaluru, Karnataka, India

Siemens

Siemens AG is a global technology powerhouse that combines the digital and physical worlds to benefit customers and society. The company focuses on industry, infrastructure, transport, and healthcare, creating technology with purpose to add real value for customers. Siemens is setting the course for long-term value creation through accelerated growth and stronger profitability with a simplified and leaner company structure.
