Location: Remote
Experience Level: 10+ years in Data Architecture and Data Platform Design
Department: Data & AI Engineering

Role Summary:
The Data Architect will design and govern the complete data ecosystem for the ValueX platform, covering data ingestion, processing, modeling, storage, orchestration, and governance. They will define the data architecture blueprint supporting the system's core modules:
1. Customer Segmentation
2. Business Decisioning & Offer Assignment Engine
3. Real-Time Offer Orchestration Module (ROOM)
4. ML/AI Model Management & Simulation Engine

The architect will ensure scalability to handle 50 million+ customer profiles, real-time event streams, and ML-driven decisioning, with a focus on performance, cost optimization, and maintainability.

Key Responsibilities

Data Architecture & Design
- Define the end-to-end data architecture across batch, streaming, and real-time layers.
- Design Customer 360 and Offer 360 data models, including feature store and historical layers.
- Establish logical, physical, and semantic data models to enable segmentation, scoring, and orchestration.
- Define data contracts for Kafka topics, API payloads, and Adobe/CDP integrations (an illustrative contract sketch follows the responsibilities list).
- Set up data versioning and lineage frameworks to track data provenance.

Data Ingestion & Integration
- Architect data ingestion pipelines from multiple telco sources: OCS, CRM, Kenan, Billing, DWH, Adobe AEP, Pricefx, and external APIs.
- Define patterns for real-time event ingestion (recharge, offer purchase, balance check).
- Standardize data access through APIs or data products for downstream modules.
- Design connectors for cloud storage (e.g., S3, Delta Lake) and integration middleware (e.g., n8n, DecisionRules.io, KNIME).

Data Management & Governance
- Define and enforce data quality, lineage, catalog, and access policies.
- Establish metadata management frameworks (e.g., DataHub, Collibra, Amundsen).
- Set up data validation and DQ frameworks such as Great Expectations or Deequ (a minimal validation sketch follows the responsibilities list).
- Govern data partitioning, schema evolution, retention, and archiving strategies.
- Ensure compliance with data privacy and regulatory standards (e.g., PDPA, GDPR, local telecom data policies).

Scalability, Cost & Performance
- Design for high-performance, cost-efficient scalability (25M → 50M → 75M customers).
- Optimize the compute/storage balance across environments (Dev, UAT, Prod).
- Define data lakehouse optimization strategies such as Z-Ordering, Delta caching, and compaction (a maintenance sketch follows the responsibilities list).
- Monitor and manage query performance, cluster sizing, and job orchestration costs.

Collaboration & Governance
- Work closely with Data Engineers, Data Scientists, and Application Developers to ensure architectural alignment.
- Lead architecture review boards and maintain data design documentation (ERDs, flow diagrams, schema registry).
- Serve as a technical liaison between business stakeholders, data teams, and platform vendors (Databricks, Adobe, Pricefx).
- Provide best practices and design patterns for model deployment, retraining, and data lifecycle management.
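The sketches below illustrate the kind of artifacts these responsibilities produce. First, a data contract for a real-time recharge event, expressed as JSON Schema and enforced in Python with the jsonschema library. The topic name (valuex.recharge.v1) and all field names are illustrative assumptions, not the actual ValueX contract.

# A minimal data-contract sketch for a recharge event. Names are hypothetical.
from jsonschema import validate, ValidationError

RECHARGE_EVENT_V1 = {
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "title": "valuex.recharge.v1",
    "type": "object",
    "required": ["event_id", "msisdn", "amount", "currency", "event_ts"],
    # reject unexpected fields so producers cannot drift from the contract silently
    "additionalProperties": False,
    "properties": {
        "event_id": {"type": "string"},
        "msisdn":   {"type": "string", "pattern": "^[0-9]{8,15}$"},
        "amount":   {"type": "number", "minimum": 0},
        "currency": {"type": "string", "minLength": 3, "maxLength": 3},
        "event_ts": {"type": "string", "format": "date-time"},
    },
}

def is_valid(event: dict) -> bool:
    """Validate one event against the contract before it is published."""
    try:
        validate(instance=event, schema=RECHARGE_EVENT_V1)
        return True
    except ValidationError:
        return False

if __name__ == "__main__":
    ok = {"event_id": "e-1", "msisdn": "66812345678",
          "amount": 50.0, "currency": "THB", "event_ts": "2024-01-01T00:00:00Z"}
    print(is_valid(ok))                 # True
    print(is_valid({"msisdn": "abc"}))  # False: missing fields, bad pattern

The same schema can be registered with a schema registry so both Kafka producers and downstream consumers validate against one versioned definition.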
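Second, a minimal sketch of the declarative quality checks that frameworks like Great Expectations or Deequ formalize, hand-rolled here in PySpark so the gating logic is visible. The table shape (msisdn and arpu columns) and the 95% completeness threshold are hypothetical.

# Hand-rolled data-quality gate in the spirit of Great Expectations / Deequ.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("dq-gate-sketch").getOrCreate()

# Toy stand-in for a Customer 360 batch; in practice this is a table read.
df = spark.createDataFrame(
    [("66810000001", 12.5), ("66810000002", None), ("66810000003", 7.0)],
    ["msisdn", "arpu"],
)

total = df.count()
checks = {
    # completeness: the customer key must never be null
    "msisdn_not_null": df.filter(F.col("msisdn").isNull()).count() == 0,
    # validity: ARPU must be non-negative wherever it is present
    "arpu_non_negative": df.filter(F.col("arpu") < 0).count() == 0,
    # completeness threshold: at least 95% of rows must carry an ARPU value
    "arpu_95pct_complete": df.filter(F.col("arpu").isNotNull()).count() / total >= 0.95,
}

failed = [name for name, passed in checks.items() if not passed]
# In a pipeline, a non-empty `failed` would quarantine the batch rather than print.
print("failed checks:", failed or "none")  # the toy data fails the 95% ARPU check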
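Third, the lakehouse optimization strategies above typically reduce to scheduled maintenance statements. This sketch assumes a hypothetical Delta table named offer_360; OPTIMIZE ... ZORDER BY and VACUUM are standard Delta Lake SQL (Databricks, or open-source Delta Lake 2.0+).

# Nightly Delta maintenance sketch. The table name "offer_360" and the
# Z-order key "customer_id" are assumptions for illustration.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("delta-maintenance-sketch").getOrCreate()

# Compact small files and co-locate rows on the dominant filter key so that
# per-customer lookups can skip most files (data skipping + Z-ordering).
spark.sql("OPTIMIZE offer_360 ZORDER BY (customer_id)")

# Reclaim storage from files no longer referenced by the current table
# version, keeping 7 days (168 hours) of history for time travel.
spark.sql("VACUUM offer_360 RETAIN 168 HOURS")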
Required Skills & Expertise
- Data Architecture: data lakehouse design; data modeling (dimensional, normalized, semantic); schema management
- ETL/ELT & Orchestration: Databricks, Snowflake, dbt, Airflow, AWS Glue, Azure Data Factory
- Streaming & Real-Time: Apache Kafka, Spark Streaming, Kinesis, Flink
- Data Modeling: Customer 360, Offer 360, Transaction 360, Feature Store
- Cloud Platforms: AWS (S3, Glue, Lambda, EMR), Azure (ADF, Synapse), or GCP (BigQuery, Dataflow)
- Storage & Compute: Delta Lake, Parquet, Iceberg, Snowflake
- Data Quality & Governance: Great Expectations, Deequ, DataHub, Collibra
- Programming & Scripting: Python, SQL, PySpark, YAML
- API & Integration: REST and GraphQL API design, Kafka Connect, JSON Schema
- Security & Compliance: IAM, encryption (KMS), access control, masking, PDPA compliance

Educational Background
- Bachelor's or Master's degree in Computer Science, Data Engineering, or a related field.
- Certifications such as AWS/Azure Data Architect, Databricks Certified Data Engineer, or Snowflake Architect are preferred.