Data Development Process
Goals
Introduce the data development process in Watchmen.
Overview
This page describes a role-oriented, high-level flow for data developers, aligned with Watchmen's key features.
Development Workflow
- Environment & Access
  Install services, log in to Admin Workbench, set up data zones and users.
- Register Data Sources
  Admin → Data Source. Configure connections and bind topics to storage.
- Configure Ingestion
  Ingestion → Modules/Models/Tables/Configuration/Monitor. Set keys and triggers, run extraction, visualize execution flow.
- Model Topics & Enumerations
  Admin → Topic and Enumeration. Define kind/type, factors, indexes, encryption; link enums; export scripts if needed.
- Orchestrate Pipelines & External Writers
  Admin → Pipeline and External Writer. Define triggers, stages, units and actions; integrate writers; test with Simulator.
- Validate & Profile
  Admin → Monitor Logs and Topic Profile. Inspect runtime status, distributions and errors.
- Assure Data Quality
  DQC → Monitor Rules/Run Statistics/Consanguinity/Catalog. Configure checks, monitor metrics and lineage.
- Operate & Automate
  Admin → Toolbox. Schedule snapshots, trigger pipelines; version and alert on data freshness and quality.
- Optional Analytics Handoff
  Console → Subject/Report/Dashboard. Publish datasets and visualizations for analysts.
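To make the pipeline step in the workflow more concrete, the nesting of triggers, stages, units and actions can be sketched in plain Python. This is only an illustrative in-memory model, not the Watchmen API: the class names (`Action`, `Unit`, `Stage`, `Pipeline`), the `simulate` helper, and the topic names `raw_policy` and `policy` are all assumptions made up for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Row = Dict[str, object]
Store = Dict[str, List[Row]]  # topic name -> rows; in-memory stand-in for storage


def always(row: Row) -> bool:
    """Default unit condition: run for every row."""
    return True


@dataclass
class Action:
    """Hypothetical action: map the trigger row and write it to a target topic."""
    target_topic: str
    mapper: Callable[[Row], Row]


@dataclass
class Unit:
    """Hypothetical unit: a group of actions guarded by a condition."""
    actions: List[Action]
    condition: Callable[[Row], bool] = always


@dataclass
class Stage:
    """Hypothetical stage: an ordered group of units."""
    name: str
    units: List[Unit]


@dataclass
class Pipeline:
    """Hypothetical pipeline: fires when a row arrives on the trigger topic."""
    trigger_topic: str
    stages: List[Stage]


def simulate(pipeline: Pipeline, topic: str, row: Row, store: Store) -> None:
    """Run one row through the pipeline if it arrived on the trigger topic."""
    if topic != pipeline.trigger_topic:
        return
    for stage in pipeline.stages:
        for unit in stage.units:
            if not unit.condition(row):
                continue  # skip units whose condition does not hold
            for action in unit.actions:
                store.setdefault(action.target_topic, []).append(action.mapper(row))


# Assumed example: copy rows from a raw topic into a cleaned topic,
# skipping rows that have no premium.
pipeline = Pipeline(
    trigger_topic="raw_policy",
    stages=[
        Stage(
            name="normalize",
            units=[
                Unit(
                    actions=[
                        Action(
                            target_topic="policy",
                            mapper=lambda r: {
                                "policy_id": r["id"],
                                "premium": float(r["premium"]),
                            },
                        )
                    ],
                    condition=lambda r: r.get("premium") is not None,
                )
            ],
        )
    ],
)

store: Store = {}
simulate(pipeline, "raw_policy", {"id": "P-1", "premium": "120.50"}, store)
simulate(pipeline, "raw_policy", {"id": "P-2", "premium": None}, store)
simulate(pipeline, "other_topic", {"id": "P-3", "premium": "10"}, store)
```

After these three calls, only the first row reaches the `policy` topic: the second fails the unit condition and the third arrives on a topic that does not trigger the pipeline. Running a handful of rows like this before deployment mirrors the role the workflow gives to the Simulator.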
Data-Modeling Best Practices
- Start with a clear business glossary to ensure consistent naming.
- Prefer star schemas, snowflake schemas, or wide tables for analytics workloads.
- Use surrogate keys to decouple source-system changes from analytical models.
- Store slowly changing dimension (SCD) history explicitly (Type 2 or Type 4) when the business requires change tracking.
- Normalize only where it reduces redundancy without hurting query performance; benchmark before and after.
- Document every entity, column, and relationship in the shared data catalog; treat docs as code.
- Enforce data contracts at the model boundary (nullability, value ranges, referential integrity).
- Version your models; every breaking change should go through a deprecation window and a downstream impact review.
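As one way to picture the data-contract practice above, the three checks it names (nullability, value ranges, referential integrity) can be expressed as a small row validator. This is a minimal sketch under stated assumptions, not a Watchmen feature: `FieldContract`, `check_row`, and the example names `order_id`, `customer_key`, `quantity` and `dim_customer` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set

Row = Dict[str, object]


@dataclass(frozen=True)
class FieldContract:
    """Hypothetical contract for one field at the model boundary."""
    name: str
    nullable: bool = True
    min_value: Optional[float] = None
    max_value: Optional[float] = None
    references: Optional[str] = None  # name of the key set the value must belong to


def check_row(
    row: Row,
    contract: List[FieldContract],
    known_keys: Dict[str, Set[object]],
) -> List[str]:
    """Return a list of human-readable violations; an empty list means the row passes."""
    violations: List[str] = []
    for f in contract:
        value = row.get(f.name)
        if value is None:
            # Nullability check; range and reference checks do not apply to nulls.
            if not f.nullable:
                violations.append(f"{f.name}: null not allowed")
            continue
        if f.min_value is not None and value < f.min_value:
            violations.append(f"{f.name}: {value!r} below minimum {f.min_value}")
        if f.max_value is not None and value > f.max_value:
            violations.append(f"{f.name}: {value!r} above maximum {f.max_value}")
        if f.references is not None and value not in known_keys.get(f.references, set()):
            violations.append(f"{f.name}: {value!r} not found in {f.references}")
    return violations


# Assumed example contract for an order fact row.
contract = [
    FieldContract("order_id", nullable=False),
    FieldContract("customer_key", nullable=False, references="dim_customer"),
    FieldContract("quantity", nullable=False, min_value=1, max_value=1000),
]
known_keys = {"dim_customer": {101, 102}}

good = {"order_id": "O-1", "customer_key": 101, "quantity": 3}
bad = {"order_id": None, "customer_key": 999, "quantity": 0}

good_result = check_row(good, contract, known_keys)
bad_result = check_row(bad, contract, known_keys)
```

Here `good_result` is empty, while `bad_result` reports one violation per field: a disallowed null, an unknown customer key, and a quantity below the minimum. Returning all violations rather than stopping at the first makes the output easier to feed into a quality report.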