Role Overview
The Data Scientist will be responsible for working on data science projects under the supervision of senior team members and delivering business outcomes. Key responsibilities include:
- Develop data science models.
- Work alongside software developers and software engineers to translate algorithms into viable products and services.
- Work in technical teams on the development, deployment, and application of applied analytics, predictive analytics, and prescriptive analytics.
- Perform exploratory and targeted data analysis using descriptive statistics and other methods.
- Work with data engineers on data quality assessment, data cleansing and data analytics.
- Generate reports, annotated code, and other project artifacts to document, archive, and communicate the work and outcomes.
- Share and discuss findings with team members.
Qualifications / Requirements:
- Master's or PhD degree in Statistics, Machine Learning, Computer Science, or a related field
- Proficiency in Python (mandatory).
- Demonstrated skill at data cleansing, data quality assessment, and using analytics for solving business problems
- Demonstrated skill in the use of applied analytics, descriptive statistics, feature extraction and predictive analytics on datasets
- Demonstrated skill at data visualization and storytelling for an audience of stakeholders
- Strong communication and interpersonal skills
Desired Characteristics:
- Influences within peer group.
- Implements specific component(s) of the roadmap.
- Evaluates features using well-known or prescribed recipes and appropriately down-selects to valuable ones.
- Aware of models such as CART / SVM / RF / Neural Net and the associated sub-models
- Uses Cross Validation and other Verification & Validation techniques to build robust models from large data sets.
- Understands the types of issues that impact data quality
- Performs basic data cleaning operations such as flagging missing and invalid data.
- Fits normal parameters to data and assesses goodness of fit.
- Can use and interpret t-tests, ANOVA, and basic hypothesis testing, with sound use of p-values.
- Codes using modular and object-oriented practices for reuse.
- Effectively visualizes data exploration using box, bubble, and matrix plots.
- Has a basic understanding of the GE Aerospace business and how the tools they are developing create value.