Glossary

Definitions of terms used across the documentation.

A

ACR (Azure Container Registry) — Azure-hosted container image registry. Used to store and pull container images for AKS deployments.

AI Agent — Autonomous software component that performs data engineering tasks (governance, pipelines, operations, analytics) using AI and natural language. BigHammer offers multiple agents: Data Governance, NLP Pipelines, Automated DataOps, and Explore Data.

AI Data Engineer — BigHammer's flagship AI-powered system that designs, deploys, governs, and operates data systems end-to-end without manual coordination.

AKS (Azure Kubernetes Service) — Managed Kubernetes service on Azure. Used to run the platform and Airflow.

API Gateway — Entry point for API requests. Handles authentication, rate limiting, and routing to backend services.

Audit Worker — Background worker that processes audit events asynchronously. Consumes messages from RabbitMQ.

C

Catalog API — Service that manages the data catalog, including metadata and schema information.

Catalog Worker — Background worker that processes catalog-related tasks (e.g., profiling, lineage). Uses Celery and RabbitMQ.

Celery — Distributed task queue framework. Used for async job processing (audit, catalog, AI agent workers).

Celery Beat — Scheduler component that triggers periodic Celery tasks at defined intervals.

ClusterSecretStore — External Secrets Operator resource that provides secrets to multiple namespaces. Scoped cluster-wide.

E

ECR (Elastic Container Registry) — AWS container image registry. Used as source for images synced to ACR.

ESO (External Secrets Operator) — Kubernetes operator that syncs secrets from external stores (e.g., Azure Key Vault) into Kubernetes Secrets.

ExternalSecret — ESO resource that defines which external secret to fetch and how to create the Kubernetes Secret.
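As an illustration of what an ExternalSecret declares, the sketch below builds one as a plain Python dictionary and prints it as JSON, which kubectl accepts in place of YAML. The field layout follows the ESO resource schema, but every concrete value is an assumption made for this example: the `apiVersion` depends on the installed ESO release, and the resource name, namespace, store name, and Key Vault secret name are hypothetical.

```python
import json


def build_external_secret(name, namespace, store_name, remote_key, secret_key):
    """Return an ExternalSecret manifest as a plain dict (illustrative only)."""
    return {
        # Assumption: the served API version depends on the installed ESO release.
        "apiVersion": "external-secrets.io/v1beta1",
        "kind": "ExternalSecret",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # How often ESO re-reads the value from the external store.
            "refreshInterval": "1h",
            # Which store to read from; here a cluster-wide store is assumed.
            "secretStoreRef": {"name": store_name, "kind": "ClusterSecretStore"},
            # Name of the Kubernetes Secret that ESO creates.
            "target": {"name": name},
            # Map one external secret onto one key of the Kubernetes Secret.
            "data": [
                {"secretKey": secret_key, "remoteRef": {"key": remote_key}}
            ],
        },
    }


# All names below are hypothetical placeholders.
manifest = build_external_secret(
    name="catalog-db-credentials",
    namespace="platform",
    store_name="azure-key-vault",
    remote_key="catalog-db-password",
    secret_key="password",
)
print(json.dumps(manifest, indent=2))
```

Setting `kind` under `secretStoreRef` to `SecretStore` instead would point the same resource at a namespace-scoped store.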

K

Key Vault — Azure Key Vault. Stores secrets, certificates, and keys. ESO fetches secrets from Key Vault into the cluster.

Keycloak — Identity and access management (IAM) service. Handles authentication, SSO, and user management.

M

Message Queue — Broker for asynchronous messaging. RabbitMQ is used to decouple APIs from workers.

R

RabbitMQ — Message broker used for task queues. APIs publish tasks; workers consume and process them.
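The publish/consume pattern can be sketched with Python's standard library alone. This is a minimal in-process illustration, not the platform's implementation: a `queue.Queue` stands in for RabbitMQ, a thread stands in for a worker, and the task fields and the `handle_task` function are hypothetical.

```python
import queue
import threading

# In-process stand-in for the broker; a real deployment would use RabbitMQ.
task_queue = queue.Queue()
results = []


def handle_task(task):
    """Hypothetical task handler: returns a description of the work done."""
    return f"processed {task['type']} for {task['target']}"


def worker():
    """Worker side: consume tasks until a None sentinel arrives."""
    while True:
        task = task_queue.get()
        if task is None:  # sentinel: stop the worker
            task_queue.task_done()
            break
        results.append(handle_task(task))
        task_queue.task_done()


def publish(task):
    """API side: enqueue the task and return without waiting for it to run."""
    task_queue.put(task)


worker_thread = threading.Thread(target=worker)
worker_thread.start()

publish({"type": "audit", "target": "dataset-a"})
publish({"type": "profiling", "target": "dataset-b"})
task_queue.put(None)  # ask the worker to shut down once the queue drains

worker_thread.join()
print(results)
```

The publisher never calls `handle_task` directly, which is the decoupling the broker provides: the API returns as soon as the task is queued, and the worker processes it on its own schedule.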

Redis — In-memory data store. Used for caching and as a Celery result backend (when configured).

S

SecretStore — ESO resource scoped to a single namespace. Fetches secrets from an external provider.

W

Worker — Background process that consumes tasks from a queue and executes them. Examples: Audit Worker, Catalog Worker, AI Agent Worker.

Workload Identity — Azure feature that allows pods to authenticate to Azure services (e.g., Key Vault) without storing credentials. Uses federated identity and service account tokens.