Data Mesh Architecture: Level 5 Maturity
At Level 5 of the [Data Maturity Lifecycle](DataMaturityLifecycle), organizations move away from centralized "monolithic" data teams. **Data Mesh** is a socio-technical shift that treats data as a product, owned by the domains that generate it (e.g., Checkout, Logistics, Marketing).
1. The Socio-Technical Shift
At Level 4, technology solved the ACID problem. At Level 5, architecture solves the **Scale problem**. When a central data team becomes the bottleneck for 50+ business units, decentralizing ownership to the domains is the way to remove that bottleneck.
The Four Pillars:
1. **Domain Ownership:** The "Checkout" team owns its analytical data just as it owns its microservice.
2. **Data as a Product:** Datasets must be discoverable, trustworthy, and interoperable.
3. **Self-Serve Platform:** A central team provides the "paved path" (S3, Iceberg, dbt) so domains don't reinvent the wheel.
4. **Federated Governance:** Global standards (e.g., "customer_id" must be a UUID) are defined centrally but enforced locally.
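The federated governance pillar can be illustrated with a small sketch: the UUID rule for `customer_id` is defined once as a shared function, and each domain runs it against its own rows before publishing. The function names `is_valid_customer_id` and `check_rows` are hypothetical, and using Python's standard `uuid` module as the validator is an assumption, not something the maturity model prescribes.

```python
import uuid


def is_valid_customer_id(value) -> bool:
    """Global standard (defined centrally): customer_id must parse as a UUID.

    Note: uuid.UUID is lenient and also accepts forms without hyphens
    or with a "urn:uuid:" prefix.
    """
    try:
        uuid.UUID(value)
    except (ValueError, AttributeError, TypeError):
        return False
    return True


def check_rows(rows):
    """Local enforcement (run by the domain): return the indexes of rows
    whose customer_id violates the global standard."""
    return [
        i for i, row in enumerate(rows)
        if not is_valid_customer_id(row.get("customer_id"))
    ]


if __name__ == "__main__":
    sample = [
        {"customer_id": "123e4567-e89b-12d3-a456-426614174000"},
        {"customer_id": "cust-42"},
    ]
    print(check_rows(sample))
```

A domain pipeline could call `check_rows` as a publish-time gate and refuse to release a batch when the returned list is non-empty.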
2. Concrete Example: The Data Product Registry
A domain team publishes its data product to a central registry. The registry entry is more than a link; it is a [Data Contract](ShiftLeftDataEngineering).
**Domain: Logistics**
**Product: shipment_tracking_v2**
```yaml
# Data Product Metadata
id: logistics.shipment_tracking
status: production
owner: logistics_eng_team
upstream_dependencies: [orders.completed_orders]

# Technical Endpoint
endpoint: trino.logistics.shipment_gold
format: iceberg
location: s3://logistics-domain/gold/shipments/

# SLA (Service Level Agreement)
availability: 99.95%
freshness: 30 minutes
```
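To show how a registry might treat this entry as a contract rather than a link, here is a minimal sketch that checks the entry for required fields and tests the freshness SLA. It assumes the YAML has already been parsed into a dictionary; the required-field set and the helpers `missing_fields`, `parse_freshness`, and `is_fresh` are hypothetical and not part of any specific registry product.

```python
from datetime import datetime, timedelta, timezone

# Assumed set of mandatory fields; a real registry would define its own.
REQUIRED_FIELDS = {
    "id", "status", "owner", "endpoint",
    "format", "location", "availability", "freshness",
}

# The registry entry above, as it might look after YAML parsing.
entry = {
    "id": "logistics.shipment_tracking",
    "status": "production",
    "owner": "logistics_eng_team",
    "upstream_dependencies": ["orders.completed_orders"],
    "endpoint": "trino.logistics.shipment_gold",
    "format": "iceberg",
    "location": "s3://logistics-domain/gold/shipments/",
    "availability": "99.95%",
    "freshness": "30 minutes",
}


def missing_fields(entry: dict) -> list:
    """Return the required fields absent from the entry, sorted by name."""
    return sorted(REQUIRED_FIELDS - set(entry))


def parse_freshness(text: str) -> timedelta:
    """Parse a value such as "30 minutes" or "1 hour" into a timedelta."""
    amount, unit = text.split()
    if not unit.endswith("s"):
        unit += "s"  # timedelta expects plural keywords: minutes, hours, days
    return timedelta(**{unit: int(amount)})


def is_fresh(entry: dict, last_updated: datetime, now: datetime) -> bool:
    """True if the product was updated within its declared freshness SLA."""
    return now - last_updated <= parse_freshness(entry["freshness"])


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(missing_fields(entry))
    print(is_fresh(entry, now - timedelta(minutes=10), now))
```

Checking the SLA against the declared value, instead of hard-coding 30 minutes in a monitor, keeps the registry entry as the single source of truth for consumers and for alerting.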
3. Computational Governance
In a Mesh, governance is enforced through code (for example, with Open Policy Agent, or OPA) and metadata tags.
- **Example:** If a domain publishes a field tagged `pii: true`, the central platform automatically applies masking in the federated query engine (e.g., Trino) for unauthorized users.
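The masking rule in the example above can be sketched in plain Python. This is a stand-in for the decision logic only: it does not use OPA's Rego language or Trino's access-control configuration, and the role name `pii_reader`, the mask value, and the `mask_row` helper are all hypothetical.

```python
MASK = "****"  # assumed placeholder shown to unauthorized users


def mask_row(row: dict, column_tags: dict, user_roles: set,
             allowed_role: str = "pii_reader") -> dict:
    """Return a copy of row with every column tagged pii: true masked,
    unless the user holds the role that is allowed to read PII."""
    authorized = allowed_role in user_roles
    masked = {}
    for column, value in row.items():
        is_pii = column_tags.get(column, {}).get("pii", False)
        masked[column] = MASK if (is_pii and not authorized) else value
    return masked


if __name__ == "__main__":
    # Tags as the publishing domain might declare them in its metadata.
    tags = {"customer_email": {"pii": True}, "status": {}}
    row = {"customer_email": "a@example.com", "status": "delivered"}
    print(mask_row(row, tags, {"analyst"}))
    print(mask_row(row, tags, {"pii_reader"}))
```

The point of the design is that the domain only declares the tag; the platform owns the rule that turns the tag into masking, so the policy changes in one place.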
4. When to Mesh?
Mesh is not a silver bullet. It introduces significant overhead.
- **Complexity:** Managing 100+ decentralized pipelines is harder than managing one central pipeline.
- **Decision:** Mesh is only worth adopting once an organization has outgrown what a single central team can handle (as a rule of thumb, more than 100 engineers and more than 10 distinct domains).
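The decision rule above can be written as a one-line check. This is only a sketch of the stated thresholds; the function name `mesh_is_worth_considering` is hypothetical, and the thresholds are parameters because they are typical figures, not hard limits.

```python
def mesh_is_worth_considering(engineers: int, domains: int,
                              min_engineers: int = 100,
                              min_domains: int = 10) -> bool:
    """Apply the rule of thumb: both thresholds must be exceeded."""
    return engineers > min_engineers and domains > min_domains


if __name__ == "__main__":
    print(mesh_is_worth_considering(engineers=250, domains=14))
    print(mesh_is_worth_considering(engineers=40, domains=12))
```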
---
**See Also:**
- [Shift Left Data Engineering](ShiftLeftDataEngineering) — The technical implementation of contracts.
- [Data Lakehouse](DataLakehouse) — The technical substrate for a Mesh.
- [Data Engineering Hub](DataEngineeringHub) — General data engineering principles.
---