Position: Data Engineering Lead – Vendor
Experience: 9–12 Years
Role Summary
We are looking for an experienced Data Engineering Lead (Vendor) to support our platform modernization program. The resource will work with internal teams to migrate applications from Cloudera CDH to a Kubernetes-based global data platform and ensure timely delivery of high-quality data engineering solutions.
Key Responsibilities  
• Provide technical leadership for migration projects from Cloudera (Spark, Hive, Kafka, Control-M) to a Kubernetes-based stack (Spark 3.5, DBT, Airflow, MinIO/S3, Kafka, Solace).
• Lead a small team, together with internal engineers, to complete project deliverables.
• Participate in design and architecture discussions and in migration planning with internal leads.
• Build and review high-performance, production-ready data pipelines.
• Ensure adherence to standards, compliance, and governance requirements.
• Provide status reporting, escalations, and delivery tracking to stakeholders.
• Design and implement a migration/acceleration framework to automate end-to-end migration.
• Continuously enhance the framework to ensure stability and scalability and to support diverse use cases and scenarios.
• Work with various data applications to enable and support the migration process.
Required Skills  
• 9–12 years of experience in data engineering and big data ecosystems.  
• Strong hands-on expertise in Spark, Hive, Kafka, and Solace.
• Working experience with Kubernetes deployments and containerized data workloads.  
• Proficiency in Python, Scala, and/or Java.  
• Experience with orchestration tools (Airflow, Control-M) and SQL transformation frameworks (DBT preferred).
• Familiarity with object stores (S3, MinIO).  
• Hands-on experience with data lakehouse formats (Iceberg, Delta Lake, Hudi).
• Prior experience leading vendor or distributed teams for enterprise projects.