Sunday, 25 June 2023

Fabric : Naming conventions for Microsoft Fabric


This is a first draft of a naming convention for Microsoft Fabric. I think it's good to define one, because the list of components in Fabric can grow enormously. After only a couple of experiments I already have quite a few components, and that is nothing compared to a real-life situation.

Why should you use a naming convention anyway? Working in an unambiguous way brings several advantages:
  • Less dependence on the implicit knowledge of internal and external employees
  • Interchangeability of employees when building, debugging and troubleshooting projects
  • Better overview (making adjustments easier)
  • Faster insight into (possible) problems
  • Faster troubleshooting
  • Auditability
  • Simpler impact analyses
  • Higher availability

You can look at naming conventions in two ways: prefix the items or postfix the items. Prefixing means that all items of the same type are organized together. Postfixing means that all items whose names start with the same characters belong together functionally. So the choice is between a technical and a functional organization!
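The difference is easiest to see in an alphabetically sorted list. Below is a minimal Python sketch, assuming a handful of made-up item names and the WH, LH and NB abbreviations from the component list in the next section; it is only an illustration, not something Fabric does for you.

```python
# Hypothetical items, invented for illustration: (type abbreviation, name).
items = [("WH", "Sales"), ("LH", "Sales"), ("NB", "Finance"), ("WH", "Finance")]

# Prefixing: the type abbreviation comes first, so sorting groups by type.
prefixed = sorted(f"{kind}_{name}" for kind, name in items)

# Postfixing: the functional name comes first, so sorting groups by subject.
postfixed = sorted(f"{name}_{kind}" for kind, name in items)

print(prefixed)   # ['LH_Sales', 'NB_Finance', 'WH_Finance', 'WH_Sales']
print(postfixed)  # ['Finance_NB', 'Finance_WH', 'Sales_LH', 'Sales_WH']
```

With prefixes the two warehouses end up next to each other; with postfixes everything about Sales ends up next to each other.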

Microsoft Fabric Components

These are the components of Microsoft Fabric (so far), each with a proposed prefix:

  • Capacity : CP_<CapacityName>
  • Workspace : WS_<WorkspaceName>
  • Data pipeline : DP_<DataPipelineName>
  • Dataflow Gen2 : DF_<DataFlowName>
  • Eventstream : ES_<EventstreamName>
  • Experiment : EX_<ExperimentName>
  • KQL Database : KD_<KQLDatabaseName>
  • KQL Queryset : KQ_<KQLQuerysetName>
  • Lakehouse : LH_<LakehouseName>
  • Model : MD_<ModelName>
  • Notebook : NB_<NotebookName>
  • Report : RP_<ReportName>
  • Spark Job Definition : SD_<SparkJobName>
  • Warehouse : WH_<WarehouseName>

As said before, you can also postfix the items, like <WarehouseName>_WH.
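If you want to apply the convention consistently, for example in a script that creates or checks items, the list above can be captured in a small lookup table. The following Python sketch is only a suggestion: the functions fabric_name and follows_convention are hypothetical helpers of my own, not part of any Fabric API.

```python
# Abbreviations taken from the component list above.
PREFIXES = {
    "Capacity": "CP",
    "Workspace": "WS",
    "Data pipeline": "DP",
    "Dataflow Gen2": "DF",
    "Eventstream": "ES",
    "Experiment": "EX",
    "KQL Database": "KD",
    "KQL Queryset": "KQ",
    "Lakehouse": "LH",
    "Model": "MD",
    "Notebook": "NB",
    "Report": "RP",
    "Spark Job Definition": "SD",
    "Warehouse": "WH",
}


def fabric_name(item_type: str, name: str, postfix: bool = False) -> str:
    """Build an item name such as WH_Sales, or Sales_WH when postfix=True."""
    abbreviation = PREFIXES[item_type]
    return f"{name}_{abbreviation}" if postfix else f"{abbreviation}_{name}"


def follows_convention(item_name: str, postfix: bool = False) -> bool:
    """Check that a name has a known abbreviation, an underscore and a non-empty name part."""
    if postfix:
        rest, sep, abbreviation = item_name.rpartition("_")
    else:
        abbreviation, sep, rest = item_name.partition("_")
    return bool(sep) and bool(rest) and abbreviation in PREFIXES.values()


print(fabric_name("Warehouse", "Sales"))                # WH_Sales
print(fabric_name("Warehouse", "Sales", postfix=True))  # Sales_WH
print(follows_convention("LH_Bronze"))                  # True
print(follows_convention("Bronze"))                     # False
```

Keeping the abbreviations in one table means that switching from prefixing to postfixing later is a matter of changing one argument rather than retyping every name.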

Naming convention Warehouse

The Warehouse is the part of Fabric I'm most familiar with. A Warehouse currently has four kinds of objects:
  • Procedures : sp<ProcedureName>
  • Functions : fn<FunctionName>
  • Tables : tbl<TableName>
  • Views : vw<ViewName>
I know that in SQL Server there were issues with giving stored procedures an sp prefix: SQL Server would first search the system stored procedures before searching the stored procedures that users created, and that costs performance. As far as I know this lookup is tied to the sp_ prefix (with an underscore), which SQL Server uses for system procedures, so sp<ProcedureName> without the underscore should not be affected. I'm not sure whether any of this applies to Microsoft Fabric.

Another discussion point is prefixing views with vw and tables with tbl: if you later decide to materialize a view into a table, you have to rename the object, and that could break the data pipeline.

The Warehouse also has data types. When you use them for variables in procedures and functions, the following prefixes let you quickly see the data type of a variable. This can be handy.
  • Bigint : big<VariableName>, example bigPatientId
  • Binary : bin<VariableName>, example binMessage
  • Bit : bit<VariableName>, example bitIsok
  • Char : chr<VariableName>, example chrPatientName
  • Uniqueidentifier : guid<VariableName>, example guidkey
  • Varbinary : vab<VariableName>, example vabMessage
  • Varchar : chv<VariableName>, example chvPatientName
  • Date : dt<VariableName>
  • Time : tm<VariableName>
  • Datetime2 : dtm<VariableName>, example dtmAppointmentdate
  • Float : flt<VariableName>, example fltmeasurement
  • Integer : int<VariableName>, example intPatientID
  • Numeric or decimal : dec<VariableName>, example decAmount
  • Smallint : sin<VariableName>, example sinSubcategoryID
  • Real : rea<VariableName>, example reaBedrag
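The prefix list above can also be written down as a lookup table, so that variable names are generated instead of typed by hand. The Python sketch below is a rough idea under a few assumptions: the helper names variable_name and declare are my own invention, the type names are the T-SQL spellings of the list above (int for Integer), and the DECLARE text it produces is only a string that you would still have to paste into a procedure.

```python
# Prefix per data type, following the list above (T-SQL type names in lower case).
TYPE_PREFIXES = {
    "bigint": "big",
    "binary": "bin",
    "bit": "bit",
    "char": "chr",
    "uniqueidentifier": "guid",
    "varbinary": "vab",
    "varchar": "chv",
    "date": "dt",
    "time": "tm",
    "datetime2": "dtm",
    "float": "flt",
    "int": "int",        # "Integer" in the list above
    "numeric": "dec",
    "decimal": "dec",
    "smallint": "sin",
    "real": "rea",
}


def variable_name(sql_type: str, name: str) -> str:
    """Return the prefixed variable name, ignoring any length or precision such as (50)."""
    base_type = sql_type.split("(")[0].strip().lower()
    return f"{TYPE_PREFIXES[base_type]}{name}"


def declare(sql_type: str, name: str) -> str:
    """Return a T-SQL DECLARE statement as text, using the prefixed variable name."""
    return f"DECLARE @{variable_name(sql_type, name)} {sql_type};"


print(variable_name("int", "PatientID"))            # intPatientID
print(variable_name("varchar(50)", "PatientName"))  # chvPatientName
print(declare("datetime2(6)", "AppointmentDate"))   # DECLARE @dtmAppointmentDate datetime2(6);
```

An unknown type raises a KeyError, which is deliberate in this sketch: a type without an agreed prefix should be added to the table first.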

Naming convention Data Engineering

To be continued

Naming convention Data Pipeline

To be continued

Naming convention Power BI

To be continued

Final thoughts

This is an ongoing blog post to which I will add naming conventions for the other Microsoft Fabric experiences.
