Business Intelligence (AI / Data Engineer) 商务智能 (人工智能/数据工程师)
Objective of Job
Build comprehensive datasets and maintain the related pipelines that enable advanced analytics, dashboards, and reporting for the captive finance business. The Data Engineer will ensure high-quality, timely, and reliable data access for use cases in sales/marketing, pricing/commissions, and delinquency/collections, and work closely with Data Scientists to operationalize ML and analytics solutions.
Job Description  
• Design, implement, and maintain robust ETL/ELT pipelines to source, clean, and structure data.  
• Partner with Data Scientists to prepare datasets, engineer features, and enable model deployment; implement the vectorization of voice and text data using the encoder of a generative-AI Transformer architecture.
• Ensure performance, scalability, and reliability of data warehouse and analytics systems. Implement data validation and monitoring processes to ensure data quality.  
• Collaborate with business units and IT to integrate new internal/external data sources.  
• Support other data topics as needed, e.g. dashboards, AI use cases, and BI solutions backed by efficient data structures; contribute to data governance and compliance efforts, including metadata management and audit readiness; present technical options and architecture choices in workshops and management meetings.
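The data-validation and monitoring duty above can be sketched in plain Python. This is an illustrative outline only, not an existing company tool; the record schema (`contract_id`, `balance`, `due_date`) and all function names are hypothetical:

```python
# Hypothetical contract-level schema; field names are illustrative only.
REQUIRED_FIELDS = {"contract_id", "balance", "due_date"}

def validate_record(record: dict) -> list[str]:
    """Return a list of data-quality issues found in one record."""
    issues = []
    missing = REQUIRED_FIELDS - record.keys()
    if missing:
        issues.append(f"missing fields: {sorted(missing)}")
    balance = record.get("balance")
    if "balance" in record and not isinstance(balance, (int, float)):
        issues.append("balance is not numeric")
    elif isinstance(balance, (int, float)) and balance < 0:
        issues.append("negative balance")
    return issues

def validate_batch(records: list[dict]) -> dict:
    """Aggregate issue counts so a monitoring job can alert on thresholds."""
    report = {"total": len(records), "bad": 0, "issues": []}
    for rec in records:
        issues = validate_record(rec)
        if issues:
            report["bad"] += 1
            report["issues"].extend(issues)
    return report
```

In practice such checks would run as a step in the ETL/ELT pipeline, with the aggregated report feeding a dashboard or alerting rule.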
Qualifications
• Bachelor’s or Master’s degree in Computer Science, Information Systems, or related field.  
• Sound knowledge of feature engineering and algorithm development, with 7+ years of experience in data engineering, data warehousing, or big data environments.
• Proficiency in SQL, Spark (PySpark is preferred), and modern scheduler tools, with strong Python programming skills and practical experience in data processing and AI model support within big data environments.  
• Experience working with on-premises data lakes and data warehouses.
• Solid understanding of data modeling, pipeline orchestration, and performance optimization.  
• Proven experience supporting analytics and machine learning workflows (data prep, feature stores, deployment).  
• 5+ years of experience with modern FOSS-based data storage technologies, such as Delta Lake, Chroma, and Neo4j.
• 5+ years of experience developing customized feature encoding algorithms (such as dimensionality reduction and word embeddings) and applying orchestration tools (such as Dagster or Prefect) to deliver automated, efficient data pipelines.
• Experienced in prompt engineering, API usage, and the fundamental principles of mainstream Large Language Models (LLMs); familiar with the entire process of model fine-tuning, quantization, and deployment.
• Familiar with the entire RAG (Retrieval-Augmented Generation) process: text chunking, embedding model tuning, and vector database retrieval optimization.
• Experience in end-to-end AI Agent development and familiarity with mainstream Agent frameworks (such as LangChain, Dify, Ollama).
• Proficient in Linux operating systems, with experience in system configuration and troubleshooting.  
• Strong communication skills, with ability to work directly with business stakeholders and present solutions.  
• Fluent in Chinese and English.  
• Self-motivated and self-driven.
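The RAG steps named above (text chunking, embedding, vector retrieval) can be sketched minimally in plain Python. This is a toy illustration, not a production setup: the bag-of-words `embed` stands in for a trained embedding model, and the linear scan stands in for a vector database; all names are hypothetical:

```python
import math
import re
from collections import Counter

def chunk_text(text: str, size: int = 40) -> list[str]:
    """Split text into fixed-size word chunks (real pipelines often overlap chunks)."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding'; a real system uses a trained encoder model."""
    return Counter(re.findall(r"[a-z']+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, chunks: list[str], k: int = 1) -> list[str]:
    """Return the k chunks most similar to the query (a vector DB does this at scale)."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]
```

A production pipeline would swap each stage for its real counterpart (overlapping chunkers, a tuned embedding model, an ANN index in a vector database) while keeping this same chunk-embed-retrieve structure.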