What is a Golden Record and how can you create one on DataHub?
02 March 2022 06:04 PM

A golden record is a single, well-defined version of all the data entities in an organizational ecosystem.

In Data Hub Central, once you have ingested, mapped, and mastered your data, the documents in the sm-<EntityType>-mastered collection are considered golden records.

  • For instance, if your EntityType is Customer, then every document in the collection sm-Customer-mastered is a golden record.

So, you would:

  • Ingest the data using an Ingest step (or any other MarkLogic ingestion mechanism)
  • Create Entity Models
  • Map your data using Map steps
  • Create match/merge rules in the match/merge step or master step
  • Run the Master step
  • The outcome of the merge step is the golden record, which you can find in the collection specified above

How to find these golden records?

In Query Console, select the data-hub-FINAL database and run the following query. The resulting documents are the golden records: fn:collection("sm-Customer-mastered")

Note: The above example assumes the EntityType is "Customer". Replace it with your own EntityType when you run the query.
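
If you work with several entity types, the naming convention above can be captured in a small helper. The following is a minimal Python sketch, not part of Data Hub itself: the function names mastered_collection and golden_record_query are hypothetical, and the name check is an assumption added only so the value can be embedded safely in a query string. It builds the query text only; you would still paste the result into Query Console against the data-hub-FINAL database.

```python
import re


def mastered_collection(entity_type: str) -> str:
    """Return the collection name sm-<EntityType>-mastered for an entity type."""
    # Assumption: entity type names are simple identifiers. Anything else is
    # rejected so that quotes or spaces cannot end up inside the query string.
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_.-]*", entity_type):
        raise ValueError(f"unexpected entity type name: {entity_type!r}")
    # The name is used as given: "Customer" and "customer" produce
    # different collection names.
    return f"sm-{entity_type}-mastered"


def golden_record_query(entity_type: str) -> str:
    """Return the XQuery expression that lists the golden records."""
    return f'fn:collection("{mastered_collection(entity_type)}")'


if __name__ == "__main__":
    # Print the query for the Customer entity type used in this article.
    print(golden_record_query("Customer"))
```

Note that the helper does not change the casing of the entity type, so pass the name exactly as it is defined in your entity model.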
