Your Profile
- Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field.
- 5+ years of experience in data engineering, with a focus on building data pipelines and preprocessing data.
- Strong proficiency in programming languages such as Python, Java, or Scala.
- Hands-on experience with data processing frameworks and tools like Apache Spark, Hadoop, or similar.
- Proficiency in SQL and experience with relational and NoSQL databases.
- Experience with data visualization and EDA tools such as Pandas, Matplotlib, or Tableau.
- Familiarity with ML and AI concepts, particularly in relation to data preparation and pipelines.
- Experience with text, image, audio, and video data management, including labeling and cleansing.
- Exposure to EdgeAI applications and their unique data processing requirements (preferred).
- Strong problem-solving skills and the ability to work independently and collaboratively.
Your Role
- Design, develop, and maintain scalable data pipelines to support ML models.
- Perform data preprocessing, cleansing, and labeling to ensure high-quality data inputs for ML applications.
- Conduct exploratory data analysis (EDA) to gather insights and identify data patterns.
- Collaborate with data scientists and ML engineers to align data pipeline requirements with model development needs.
- Create and manage datasets comprising text, image, audio, and video data for various ML applications.
- Implement best practices for data management, ensuring data integrity, consistency, and security.
- Optimize data workflows and processing pipelines for efficiency and performance.
- Utilize cloud-based data storage and processing solutions as needed.
- Stay current with industry trends and technologies to continuously improve data engineering processes.
- Provide technical support and guidance to junior data engineers and other team members.