data-engineering
Domain inferred from 5 atoms across the corpora.
Atom counts by kind
| Kind | Count |
|---|---|
| anti-pattern | 1 |
| fact | 1 |
| pattern | 1 |
| principle | 1 |
| rule | 1 |
Sample atoms
Batch-Only Pipelines
Designing every data pipeline as a nightly (or hourly) batch job, typically Airflow + dbt + warehouse, even when downstream use cases require sub-minute freshness.…
PII Tokenization
PCI-DSS, HIPAA, and GDPR Article 32 ('appropriate technical measures, such as pseudonymisation') all converge on the same architectural primitive: tokenization.…
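As a rough illustration of the tokenization primitive named above, the following sketch swaps a sensitive value for a random token and keeps the mapping in a separate store. The `TokenVault` class, its method names, and the `tok_` prefix are hypothetical, and the in-memory dictionaries stand in for what would normally be a hardened, access-controlled service.

```python
import secrets
from typing import Dict


class TokenVault:
    """In-memory stand-in for a tokenization vault (hypothetical, illustration only)."""

    def __init__(self) -> None:
        self._token_by_value: Dict[str, str] = {}
        self._value_by_token: Dict[str, str] = {}

    def tokenize(self, value: str) -> str:
        # Reuse the existing token so the same input always maps to the same token.
        existing = self._token_by_value.get(value)
        if existing is not None:
            return existing
        # The token is random, so it carries no information about the original value.
        token = "tok_" + secrets.token_hex(16)
        self._token_by_value[value] = token
        self._value_by_token[token] = value
        return token

    def detokenize(self, token: str) -> str:
        # Only callers with access to the vault can recover the original value.
        # Unknown tokens raise KeyError.
        return self._value_by_token[token]
```

Downstream systems would store and join on the token only; whether tokens should be stable per value, as here, or unique per call is a design choice that depends on the use case.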
CDC (Change Data Capture)
Replicate a source-of-truth OLTP database to one or more downstream systems (data warehouse, search index, cache, replica) by tailing the database's transaction log rather than running periodic SELECT batches.
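The log-tailing idea can be sketched with a toy in-memory model. Everything here is an assumption made for illustration: the `ChangeEvent` shape, the integer `lsn` position, and the `Replica` class are hypothetical and do not correspond to any particular database's log format or CDC tool.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass(frozen=True)
class ChangeEvent:
    lsn: int                               # position of the change in the log
    op: str                                # "insert", "update", or "delete"
    key: str                               # primary key of the affected row
    row: Optional[Dict[str, Any]] = None   # full row image; None for deletes


@dataclass
class Replica:
    rows: Dict[str, Dict[str, Any]] = field(default_factory=dict)
    last_lsn: int = 0                      # highest log position applied so far

    def apply(self, event: ChangeEvent) -> None:
        # Skip events at or below the checkpoint so replays are harmless.
        if event.lsn <= self.last_lsn:
            return
        if event.op == "delete":
            self.rows.pop(event.key, None)
        elif event.op in ("insert", "update"):
            self.rows[event.key] = dict(event.row or {})
        else:
            raise ValueError(f"unknown op: {event.op}")
        self.last_lsn = event.lsn


def tail(log: List[ChangeEvent], after_lsn: int) -> List[ChangeEvent]:
    """Return only the changes recorded after the given log position."""
    return [event for event in log if event.lsn > after_lsn]
```

Unlike a periodic `SELECT`, this model sees every intermediate change, including deletes, and resumes from `last_lsn` instead of rescanning the source table.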
Idempotent Writes
Producers must attach a stable idempotency key (UUID, business-event-id, deterministic hash) to every write.…
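A minimal sketch of the deterministic-hash option, paired with a sink that drops repeated keys. The helper names `idempotency_key` and `IdempotentSink` are hypothetical, and the sink-side deduplication is an assumption about how the key would be used; the source excerpt only states the producer-side requirement.

```python
import hashlib
import json
from typing import Any, Dict, List, Set


def idempotency_key(event: Dict[str, Any]) -> str:
    """Derive a stable key from the event content (deterministic hash)."""
    # Canonical JSON: sorted keys and fixed separators, so that logically
    # equal events always serialize to the same bytes.
    canonical = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class IdempotentSink:
    """Toy sink that ignores writes whose key it has already seen."""

    def __init__(self) -> None:
        self._seen: Set[str] = set()
        self.records: List[Dict[str, Any]] = []

    def write(self, key: str, record: Dict[str, Any]) -> bool:
        # Return False for a duplicate so that retries become no-ops.
        if key in self._seen:
            return False
        self._seen.add(key)
        self.records.append(record)
        return True
```

With this arrangement a producer can retry after a timeout without creating a duplicate record, because the retried write carries the same key as the first attempt.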
Event Schema Versioning
Producer payloads must embed `schema_id` and `schema_version` (or use Confluent Schema Registry's wire-format magic byte + 4-byte schema id).…
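The Confluent wire format mentioned above prefixes each serialized payload with a magic byte of 0 followed by the schema id as a 4-byte big-endian integer. The sketch below only frames and unframes raw bytes; the function names are hypothetical, and it does not contact a Schema Registry or serialize the payload itself.

```python
import struct
from typing import Tuple

MAGIC_BYTE = 0
# ">BI" = big-endian, one unsigned byte, then one unsigned 32-bit int (5 bytes total).
HEADER = struct.Struct(">BI")


def encode_frame(schema_id: int, payload: bytes) -> bytes:
    """Prefix an already-serialized payload with the magic byte and schema id."""
    return HEADER.pack(MAGIC_BYTE, schema_id) + payload


def decode_frame(frame: bytes) -> Tuple[int, bytes]:
    """Split a frame into (schema_id, payload), validating the header."""
    if len(frame) < HEADER.size:
        raise ValueError("frame is shorter than the 5-byte header")
    magic, schema_id = HEADER.unpack_from(frame)
    if magic != MAGIC_BYTE:
        raise ValueError(f"unexpected magic byte: {magic}")
    return schema_id, frame[HEADER.size:]
```

A consumer would use the returned schema id to look up the writer's schema before deserializing the remaining bytes.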