Research state-of-the-art NLP and multimodal pre-trained models and algorithms, especially in the document understanding / document intelligence domain. Provide scalable technical implementations for real business challenges.
Research, develop, and implement state-of-the-art techniques in the field of Large Language Models (LLMs) and Generative AI.
Conduct Proof-of-Concept (POC) work and prototyping to explore and validate the feasibility of innovative ideas.
Understand business objectives and develop ML models that help to achieve them.
Prepare, clean, and verify data to ensure data quality for model training.
Train models and tune hyperparameters.
Deploy and serve models, integrate them into business applications, and embed them in business processes.
Work together with testing engineers to construct and execute test cases to verify model accuracy.
Contribute to the MLOps toolkits to manage the machine learning lifecycle more effectively.
Qualifications
Doctoral, Master's, or Bachelor's degree in Machine Learning, Natural Language Processing, Computer Science, Data Science, Statistics, or related areas.
Expertise in one or more of NLP, Multimodal Representation Learning, Gen AI / LLM.
Experience in fine-tuning and developing LLMs. Strong familiarity with emerging trends in LLMs and open-source platforms.
Excellent programming skills in Python.
Familiarity with common ML frameworks such as PyTorch, and basic libraries such as sklearn, pandas, etc.
Research experience in document understanding / document intelligence fields is a big plus, e.g. visually rich document classification, information extraction, document layout analysis, question answering, generative AI, LLMs, etc.
Understanding of MLOps or machine learning model lifecycle management concepts.
Strong analytical, problem-solving, communication, and interpersonal skills.
Able to work under pressure and demonstrate initiative, enthusiasm, and the ability to learn quickly.