Lead Data Science

Capgemini

7 months ago

10+ years

Work From Office

Gurgaon, Haryana, India

  • Predictive and Prescriptive modelling using Statistical and Machine Learning algorithms
  • Information Extraction, Similarity Matching, Sentiment Analysis, Text Clustering, Semantic Analysis, Document Summarization etc.
  • Building Knowledge Graphs with unstructured data and knowledge graph optimizations like PageRank/TrustRank is expected
  • Python

    Numpy

    SciPy

    Pandas

    RDBMS

    NoSQL

    Scikit-Learn

    TensorFlow

    PyTorch

    Big Data Platforms

    Cloud Platforms

    Agile Methodologies

    Unsupervised Learning

    Natural Language Processing (NLP)

    Job description & requirements

    Your role


    1. Predictive and Prescriptive modelling using Statistical and Machine Learning algorithms including but not limited to Time Series, Regression, Trees, Ensembles, Neural-Nets (Deep & Shallow – CNN, LSTM, Transformers etc.). Experience with open-source OCR engines like Tesseract, Speech recognition, Computer Vision, face recognition, emotion detection etc. is a plus.
    2. Unsupervised learning – Market Basket Analysis, Collaborative Filtering, Dimensionality Reduction, good understanding of common matrix decomposition approaches like SVD. Various Clustering approaches – Hierarchical, Centroid-based, Density-based, Distribution-based, Graph-based clustering like Spectral.
    3. NLP – Information Extraction, Similarity Matching, Sentiment Analysis, Text Clustering, Semantic Analysis, Document Summarization, Context Mapping/Understanding, Intent Classification, Word Embeddings, Vector Space Models; experience with libraries like NLTK, spaCy, Stanford CoreNLP is a plus. Usage of Transformers for NLP, experience with LLMs (ChatGPT, Llama), usage of RAG (vector stores, and frameworks like LangChain & LangGraph), and building Agentic AI applications.
    4. Graph Analytics – Familiarity with Graph Algorithms (Directed & Undirected) – Traversal (BFS, DFS), Cycle Detection (Bellman-Ford, Floyd-Warshall), Shortest Path (Dijkstra, A*) etc. Building Knowledge Graphs with unstructured data and knowledge graph optimizations like PageRank/TrustRank is expected.
    5. Mathematical Optimization – Familiarity with common optimization algorithms, both discrete (Linear, Mixed-Integer, Goal, Dynamic etc.) and continuous (GD and its variants, Newton’s method etc.), is expected. Experience with Simulated Annealing and exposure to ML-inspired evolutionary optimization algorithms like Genetic Algorithm & Genetic Programming for optimization is a plus.
    6. Simulations – Monte Carlo Simulation, Discrete-Event Simulation, Agent-Based Simulation, Hybrid Simulation, System Dynamics, Genetic Algorithm based Simulation.
    7. Model Deployment – ML pipeline formation, data security and scrutiny checks, and ML-Ops for productionizing a built model on-premises and on cloud.


    Your Profile

    1. Programming Languages – Python – NumPy, SciPy, Pandas, Matplotlib, Seaborn
    2. Databases – RDBMS (MySQL, Oracle etc.), NoSQL Stores (HBase, Cassandra etc.)
    3. ML/DL Frameworks – scikit-learn, TensorFlow (Keras), PyTorch
    4. Big Data ML Frameworks – Spark (Spark ML, GraphX), H2O
    5. Cloud – Azure/AWS/GCP
    6. Experience in Agile ways of working, managing team effort and tracking it through JIRA.
    7. Experience in creating proposals, RFP and RFQ responses, and pitches, and delivering them to large forums.
    8. Experience in creating PoCs, MVPs, PoVs and assets with innovative use cases.
    9. Experience working in a consulting environment is highly desirable.


    Experience :

    10+ years

    Job Domain/Function :

    Data Science

    Job Type :

    Work From Office

    Employment Type :

    Full Time

    Number Of Position(s) :

    1

    Educational Qualifications :

    Bachelor's Degree

    Location :

    Gurgaon, Haryana, India
