Bringing Alloy and Ember to Databricks in DataForge 10.0
Last week, we introduced the Alloy Architecture: a structured refinement model designed to eliminate hidden layers and create predictable, repeatable workflows. We also unveiled Ember, the metadata catalog that defines Alloy’s behavior using clear, declarative configuration.
Today, we’re announcing how Alloy and Ember come to life inside DataForge 10.0 for Databricks. This release reflects a major reinvestment in our Databricks architecture, delivering Unity Catalog integration, table-native refinement, federated metadata access, and a modernized compute layer.
Table-Native Refinement Inside Databricks
Older versions of DataForge relied on file-based refinement stages. While functional, this approach made it harder for teams to understand how data moved through the system or to align with Databricks-native governance tools.
In DataForge 10.0, each stage of the Alloy Architecture (ORE, MINERAL, ALLOY, INGOT, and PRODUCT) is now represented as a Delta table or view. This change makes refinement far more transparent and intuitive. Teams can inspect intermediate results directly, query refinement stages using Databricks SQL, and leverage Unity Catalog controls without needing to interpret abstract internal flows.
The end result is an Alloy execution model that feels native on Databricks, with refinement exposed through standard lakehouse primitives rather than internal file structures.
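As a rough illustration of what table-native refinement enables, suppose each Alloy stage materializes under a `catalog.schema.stage` naming convention in Unity Catalog (the convention here is invented for illustration, not DataForge's documented layout). A small helper could then resolve the table name for any stage, ready to be queried with ordinary Databricks SQL:

```python
# Sketch only: the catalog.schema.<stage> layout below is an assumption,
# not DataForge's actual naming scheme.
STAGES = ("ore", "mineral", "alloy", "ingot", "product")

def stage_table(catalog: str, domain: str, stage: str) -> str:
    """Return the fully qualified Unity Catalog name of a refinement stage."""
    if stage not in STAGES:
        raise ValueError(f"unknown Alloy stage: {stage!r}")
    return f"{catalog}.{domain}.{stage}"

# With a name in hand, intermediate results can be inspected directly, e.g.
#   spark.sql(f"SELECT * FROM {stage_table('main', 'orders', 'mineral')} LIMIT 10")
print(stage_table("main", "orders", "mineral"))
```

The point is that every intermediate stage is addressable the same way any other lakehouse table is, so inspection requires no product-specific tooling.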
Ember: The Structured Definition Layer Behind Alloy
Alloy’s consistency is enabled by Ember, the relational metadata repository that defines how refinement should behave across domains. Ember stores explicit definitions for:
Source interpretation and normalization
Change detection logic
Attribute relationships and enrichment rules
Merge behavior in INGOT
Output shaping and delivery patterns
This metadata is stored in a Postgres-backed relational schema designed specifically for Alloy’s five-layer model. Rather than inferring logic from code, Ember provides an authoritative, declarative description of how data should be transformed.
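To make "declarative description" concrete, here is a minimal sketch of the kind of record Ember might hold for a single attribute. The field names and the `AttributeRule` type are invented for illustration; Ember's actual Postgres schema is not reproduced here.

```python
from dataclasses import dataclass
from typing import Optional

# Illustrative only: these field names are assumptions, not Ember's real schema.
@dataclass(frozen=True)
class AttributeRule:
    source_column: str          # how the raw input column is interpreted
    change_key: bool            # whether this attribute participates in change detection
    enrich_expr: Optional[str]  # enrichment expression applied at the ALLOY layer
    merge_strategy: str         # how INGOT reconciles updates ("upsert", "append", ...)

rule = AttributeRule(
    source_column="order_total",
    change_key=False,
    enrich_expr="order_total * exchange_rate",
    merge_strategy="upsert",
)
print(rule.merge_strategy)
```

The value of this shape is that behavior lives in data rather than code: the engine reads rules like this at run time, so changing refinement behavior means updating a row, not redeploying a pipeline.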
Ember Available Directly in Unity Catalog via Lakehouse Federation
One of the most meaningful enhancements in DataForge 10.0 is that Ember’s metadata can now be queried directly in Databricks. Through Lakehouse Federation, Ember’s catalog appears as Unity Catalog tables without requiring extra connectors or APIs. Teams can join metadata to operational data, build governance dashboards, or analyze domain relationships using standard SQL, all inside Databricks.
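Because Ember surfaces as ordinary Unity Catalog tables, joining metadata to operational data is just a SQL join. The sketch below mimics that join in plain Python with invented table and column names (Ember's real federated schema is not shown here):

```python
# Invented shapes for illustration: a federated Ember sources table and a
# set of operational hub tables tagged with the source they came from.
ember_sources = [
    {"source_id": 1, "domain": "sales",   "owner": "data-eng"},
    {"source_id": 2, "domain": "finance", "owner": "fin-ops"},
]
hub_tables = [
    {"table": "orders",   "source_id": 1, "row_count": 120_000},
    {"table": "invoices", "source_id": 2, "row_count": 45_000},
]

# Equivalent in spirit to:
#   SELECT h.table, e.domain, e.owner, h.row_count
#   FROM hub_tables h JOIN ember_sources e USING (source_id)
by_id = {e["source_id"]: e for e in ember_sources}
joined = [
    {"table": h["table"],
     "domain": by_id[h["source_id"]]["domain"],
     "owner": by_id[h["source_id"]]["owner"],
     "row_count": h["row_count"]}
    for h in hub_tables
]
print(joined[0]["domain"])
```

A governance dashboard built this way needs no extra connector: the metadata side of the join is simply another catalog in the workspace.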
Incremental Processing Built Into the Architecture
Incremental refinement is traditionally one of the most complex aspects of building pipelines. Alloy makes it a native part of the design. MINERAL isolates new and changed records; ALLOY enriches this reduced dataset before it scales; and INGOT merges updates back into the full table. Because Ember defines attribute behavior across these layers, Alloy can push heavy transformations earlier in the flow and reduce the amount of data processed at each stage.
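The MINERAL, ALLOY, INGOT flow described above can be sketched in plain Python. The record shapes and functions below are invented for illustration; the real engine operates on Delta tables, not dicts.

```python
# MINERAL: isolate records that are new or changed relative to the full table.
def mineral(full, incoming):
    current = {r["id"]: r for r in full}
    return [r for r in incoming
            if r["id"] not in current or current[r["id"]]["value"] != r["value"]]

# ALLOY: enrich only the reduced changeset, before it rejoins the full table.
def alloy(changed):
    return [{**r, "value_doubled": r["value"] * 2} for r in changed]

# INGOT: merge enriched updates back into the full table (upsert by id).
def ingot(full, enriched):
    merged = {r["id"]: r for r in full}
    for r in enriched:
        merged[r["id"]] = r
    return sorted(merged.values(), key=lambda r: r["id"])

full = [{"id": 1, "value": 10, "value_doubled": 20}]
incoming = [{"id": 1, "value": 10}, {"id": 2, "value": 5}]  # id 1 is unchanged

changed = mineral(full, incoming)        # only id 2 survives the diff
table = ingot(full, alloy(changed))      # enrichment ran on 1 row, not the whole table
print(len(changed), len(table))
```

Note where the expensive work happens: enrichment runs on the one changed row, and only the merge touches the full table. That ordering is the cost-saving lever the paragraph above describes.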
This built-in incremental pattern reduces compute costs, speeds up refresh cycles, and eliminates the custom branching logic that often accumulates in traditional pipelines.
Performance and Developer Experience Improvements
DataForge 10.0 includes a series of enhancements designed to improve reliability and performance on Databricks:
Faster hub table checks using metadata-driven pruning
Smarter Delta merge strategies with reduced conflicts
More responsive connection tests to cut down on debugging time
Quicker Data Profile introspection across new sources
Improved job orchestration with clearer stage boundaries
These improvements shorten iteration loops and deliver a smoother developer experience.
Deep Alignment with Unity Catalog
Unity Catalog has become the governance backbone of Databricks deployments, and DataForge 10.0 aligns tightly with that model. New workspaces build hub tables natively under UC, enabling consistent lineage, access control, and oversight. Existing Hive metastore deployments can be migrated smoothly, bringing older environments under the new governance model without major disruption.
DataForge now fits naturally into UC-driven lakehouse architectures, offering:
Unified governance across both data (Delta) and metadata (Ember)
Easier auditing and exploration
Consistent behavior across environments
Modernized Compute Layer
We’ve also refreshed the compute experience:
Support for DBR 16.4 LTS
Updated instance type recommendations
Ability to restart Talos directly from the UI
Terminology aligned with Databricks (“Compute” replacing “Clusters”)
These updates simplify administration and ensure a modern operational foundation.
A Unified Databricks Experience
Alloy provides the structure. Ember provides the metadata. Databricks provides the runtime and governance environment. Together, they create a version of DataForge that is more transparent, more predictable, and easier to operate than ever before.
Every stage is inspectable.
Every definition is queryable.
Every refinement behaves consistently across domains.
And this is only the first half of the 10.0 rollout.
What’s Coming Next
Later this week, we’ll extend Alloy and Ember to a new ecosystem as part of the second major platform announcement for DataForge 10.0.
Thursday’s announcement completes the release. Stay tuned!