MSBI : BI # 19 : Business Intelligence – Tools & Theory # 11 : Architecting the Data #1 : Enterprise Data Model & Granularity of the Data

Hi Folks,

This post is part of Series Business Intelligence – Tools & Theory

Currently running topic for this series is listed as below :

Series Business Intelligence – Tools & Theory

>>Chapter 1 : Business Intelligence an Introduction

>>Chapter 2 : Business Intelligence Essentials

>>Chapter 3 : Business Intelligence Types

>>Chapter 4 : Architecting the Data<You are here>

This post continues from my previous post in this series. If you have missed any post, please visit the link below.

We are going to cover the following points in this article:

  • Types of Data
  • Enterprise Data Model
  • Enterprise Conceptual Model
  • Granularity of the Data

Types of Data

By now you must be familiar with business intelligence types. This unit introduces the need to architect data and the different types of data in business intelligence. It also discusses the enterprise data model, the granularity of data, and various techniques of data reporting and querying. It familiarises you with the concepts of metadata and data partitioning, and describes the various aspects of total data quality management (TDQM).

Data is the most important asset of any organisation. It describes attributes of the real world and is obtained through measurement, test or examination. It is recorded either on paper or in a computer system. An organisation's data is collected, aggregated, stored and manipulated by various systems. Finally, stored data is extracted and turned into information in some form of report or statistic. It is therefore very important to ensure the quality of the data, because it affects the overall performance of these processes. An organisation's data quality can be monitored and controlled through a total data quality management process.

Business-oriented organisations use different types of data. Each organisation selects the types of data it uses based on its needs and requirements.

Data modelling is a method for defining and analysing the data requirements needed to support the business processes of an organisation. Various data modelling techniques are available. The enterprise data model is an architectural structure that integrates the data of the organisation and gives a single definition of the data irrespective of applications. It gives a blueprint of the data used in an organisation, which offers various advantages for controlling and monitoring that data.

Data used in a business organisation can be classified in several ways.

The first type of classification is mentioned below:

· Qualitative data: This data is a categorical measurement expressed not in numbers but by means of a natural-language description. In statistics it is referred to as categorical data. When the categories have a natural ordering, the variable is referred to as an ordinal category variable; such variables can be judged in terms of size, like small, medium and large. When the categories have no natural ordering, the variable is referred to as a nominal category variable, for example gender, race or sport.

· Quantitative data: This data is measured in terms of numbers rather than a natural-language description. Not every number is a measurement, however. For example, a social security number is an identifier: nothing can meaningfully be added to or subtracted from it. Quantitative data is associated with a scale of measurement. The ratio scale is the most common type of scale used. More general quantitative data is measured on an interval scale, which also has equidistant points but on which the doubling principle breaks down. For example, a temperature of 50 degrees Celsius is not half as hot as a temperature of 100 degrees Celsius, but a difference of 10 degrees indicates the same difference in temperature anywhere along the scale.
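The temperature example can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function names are made up for this post, and Kelvin is used as an example of a ratio scale because it has a true zero.

```python
def celsius_to_kelvin(celsius: float) -> float:
    """Convert Celsius (interval scale) to Kelvin (ratio scale, true zero)."""
    return celsius + 273.15


def naive_ratio(a_celsius: float, b_celsius: float) -> float:
    """Divide two Celsius readings directly, which is not physically meaningful."""
    return a_celsius / b_celsius


def absolute_ratio(a_celsius: float, b_celsius: float) -> float:
    """Compare the two temperatures as multiples on the Kelvin scale."""
    return celsius_to_kelvin(a_celsius) / celsius_to_kelvin(b_celsius)


def difference(a_celsius: float, b_celsius: float) -> float:
    """Differences are meaningful anywhere along an interval scale."""
    return a_celsius - b_celsius


# Dividing the Celsius readings suggests "twice as hot" ...
print(naive_ratio(100.0, 50.0))
# ... but on a scale with a true zero the ratio is only about 1.15.
print(round(absolute_ratio(100.0, 50.0), 2))
# A 10-degree gap is the same size wherever it sits on the scale.
print(difference(60.0, 50.0) == difference(20.0, 10.0))
```

The point of the sketch is that subtraction behaves the same on both scales, while division only carries meaning on the scale with a true zero.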

The second classification further divides the quantitative data discussed above into continuous data and discrete data.

· Continuous data: This data is measured along a continuous scale that can be divided into fractions. The scale can be subdivided indefinitely, which means that if the data can be measured accurately enough, any two items can be compared and the difference between them determined.

· Discrete data: This data is measured across a set of fixed values; for example, age is usually measured in whole years, not in microseconds. The units used to measure discrete data are chosen arbitrarily.

The third classification has four types of data used in business research. In this classification, each type builds on the one before it.

· Nominal data: Nominal is derived from the Latin word “nomen”, which means name. Nominal data consists of items that are differentiated by a simple naming system and fall into definable categories. This type of data is widely used in business organisations, where every item is given a different name in order to avoid confusion. Numbers can be assigned to nominal data items, but even if the numbers appear ordered they can be used only for referencing; they cannot be used for calculation.

· Ordinal data: This data is represented on an ordinal scale, where items are positioned by their order. Positions on the scale indicate, for example, temporal position or superiority. The order of items is normally defined by assigning numbers to them to show their relative position. Letters or other symbols can also be used to represent an item appropriately. This kind of data is useful in business organisations for categorising the various data involved in the business and assigning priorities according to importance. Arithmetic operations cannot be performed on ordinal data; the values only show the sequence.

· Interval data: This data is sometimes also called integer data. It is measured on a scale on which each position is equidistant from the next. In business applications it is used to measure attributes along an arbitrary scale between two extreme possibilities. Interval data can be added and subtracted, but it cannot meaningfully be divided or multiplied.

· Ratio data: This data is represented on a ratio scale, and values are compared as multiples of one another. In business applications, this data is used to compare the impact of one item with that of another. The impact may be positive, like an improvement in the business, or negative, like the effect of a problem existing in the organisation. This data can be divided and multiplied.
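One way to picture how each type builds on the previous one is to list the operation that each scale adds. The sketch below is an assumption-laden illustration, not a standard API: the `Scale` enum, the operation labels and the `summarise` helper are hypothetical names, and the choice of mode, median and mean as summaries follows common statistical practice rather than anything stated in this post.

```python
import statistics
from enum import IntEnum
from typing import List, Sequence


class Scale(IntEnum):
    """The four data types, ordered so that each level extends the previous one."""
    NOMINAL = 1
    ORDINAL = 2
    INTERVAL = 3
    RATIO = 4


# The one new operation that becomes meaningful at each level (illustrative labels).
OPERATION_ADDED = {
    Scale.NOMINAL: "naming",
    Scale.ORDINAL: "ordering",
    Scale.INTERVAL: "differences",
    Scale.RATIO: "ratios",
}


def permitted_operations(scale: Scale) -> List[str]:
    """Return every operation available at this level and at the levels below it."""
    return [OPERATION_ADDED[level] for level in Scale if level <= scale]


def summarise(values: Sequence, scale: Scale):
    """Pick a summary that uses only operations the scale allows."""
    if scale == Scale.NOMINAL:
        return statistics.mode(values)        # most frequent name
    if scale == Scale.ORDINAL:
        return statistics.median_low(values)  # middle rank, no arithmetic needed
    return statistics.mean(values)            # needs meaningful differences


print(permitted_operations(Scale.INTERVAL))
print(summarise(["red", "blue", "red"], Scale.NOMINAL))
print(summarise([1, 2, 2, 3, 5], Scale.ORDINAL))
```

Ordinal values are passed as rank numbers here so that they sort correctly; text labels would first need to be mapped to ranks.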

Enterprise Data Model

An Enterprise Data Model (EDM) is a data architectural structure that is used for integration. It represents a single definition of the data of an organisation, irrespective of any system or application. The enterprise data model provides the means to visualise, plan, build and implement the data system. It is independent of how the data is sourced, stored, processed and accessed. It validates and represents the data that is important to an organisation and organises the rules that govern it.

EDM gives the blueprint of the data that is produced and consumed in all the departments of an organisation, which helps to resolve inconsistencies and parochial analysis of the data used. It is also useful for identifying shareable and redundant data across organisational boundaries. Integrating the data in an enterprise data model enables a single version of the truth, which benefits everyone in the organisation. It also minimises data redundancy, errors and disparity. It concentrates on the quality, consistency and accuracy of data. It is considered to be the starting point of all data system design.
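As a rough illustration of spotting shareable or redundant data across departments, the sketch below inverts a map of departments to the data elements they use. The department names, the element names and the `find_shared_elements` helper are all hypothetical; a real enterprise model would work from catalogued definitions rather than bare names.

```python
from collections import defaultdict
from typing import Dict, List, Set


def find_shared_elements(usage: Dict[str, Set[str]]) -> Dict[str, List[str]]:
    """Return each data element used by more than one department, with its users."""
    departments_by_element = defaultdict(set)
    for department, elements in usage.items():
        for element in elements:
            departments_by_element[element].add(department)
    return {
        element: sorted(departments)
        for element, departments in departments_by_element.items()
        if len(departments) > 1
    }


# Hypothetical inventory of which department produces or consumes which data.
usage = {
    "sales": {"customer", "product", "order"},
    "billing": {"customer", "invoice", "order"},
    "support": {"customer", "ticket"},
}

shared = find_shared_elements(usage)
for element in sorted(shared):
    print(element, "is used by", ", ".join(shared[element]))
```

Elements that appear in the result are candidates for a single enterprise-wide definition.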

Enterprise data modelling uses a “top-down, bottom-up” approach to design a data system. Usually the enterprise data model is produced through top-down steps. The bottom-up approach becomes important when a data system is to be designed in an efficient and practical manner using existing data sources.

An enterprise data model can be decomposed into three levels. All the three levels of the enterprise data model are interrelated and each one has its unique identity and purpose.

The three levels of enterprise data model are:

1. Enterprise subject area model

2. Enterprise conceptual model

3. Enterprise conceptual entity model

The figure shows the structure of the enterprise data model and its different layers.

Figure: Structure of Enterprise Data Model

Enterprise Subject Area Model

The enterprise subject area model is the framework of the enterprise data model. Any data that is useful to the business is called enterprise data, and it is retained for additional use.

Designing, developing and maintaining enterprise data is a very complex task, even for a large team, unless it is broken into manageable pieces. The enterprise subject area model therefore uses a divide-and-conquer approach. Data produced and consumed across the entire organisation is represented within the subject area model.

Subject areas

Each subject area of the enterprise subject area model is a high level classification of data related to a group of concepts pertaining to a topic of interest in an organisation.

A subject area can contain both generic business concepts, like customer, product, employee and finance, and industry-specific concepts.

Subject areas can be grouped into three business categories:

· Revenue: This business category focuses on the revenue activities of the organisation, like revenue planning, accounting and reporting.

· Operation: This business category focuses on the main functions of the business, like daily operations.

· Support: This business category focuses on the additional activities of the business rather than its main purpose.

Subject area data taxonomy

Data taxonomy is a hierarchical classification tool applied to data in the enterprise subject area model. It helps in understanding, architecting, designing, building and maintaining the data system. At the first level of the hierarchy, data is classified into one of three classes: foundational, transactional or informational.

· Foundational data can define, support and create other data. It includes reference type data and metadata that are required to perform business transactions.

· Transactional data is the data that is obtained as a result of business transactions.

· Informational data is summarised data, usually created from operational data. It is the primary input for decision support systems.
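The first level of the taxonomy can be sketched as a simple classification rule. Everything in the block below is an assumption made for illustration: the `DataSet` fields, the example data sets and the order in which the rules are checked are not part of any formal taxonomy definition.

```python
from dataclasses import dataclass
from enum import Enum


class DataClass(Enum):
    FOUNDATIONAL = "foundational"
    TRANSACTIONAL = "transactional"
    INFORMATIONAL = "informational"


@dataclass(frozen=True)
class DataSet:
    """A hypothetical catalogue entry for one data set in a subject area."""
    name: str
    subject_area: str
    is_reference_or_metadata: bool  # defines or supports other data
    is_summarised: bool             # derived by summarising operational data


def classify(data_set: DataSet) -> DataClass:
    """Assign the first-level taxonomy class using simplified, assumed rules."""
    if data_set.is_reference_or_metadata:
        return DataClass.FOUNDATIONAL
    if data_set.is_summarised:
        return DataClass.INFORMATIONAL
    # Anything else is treated as the direct result of business transactions.
    return DataClass.TRANSACTIONAL


catalogue = [
    DataSet("currency codes", "finance", True, False),
    DataSet("sales orders", "revenue", False, False),
    DataSet("monthly sales summary", "revenue", False, True),
]

for entry in catalogue:
    print(entry.name, "->", classify(entry).value)
```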

Enterprise Conceptual Model

The Enterprise Conceptual Model (ECM) forms the second level of the enterprise data model. It is created by identifying and defining the important business concepts of each subject area. The enterprise conceptual model contains concepts, their definitions and their relationships. The concepts give much greater business detail than the subject areas.

The ECM confirms the scope of the subject areas and their relationships. When a concept is defined, questions arise about what should be included in the subject area. A subject area can itself become a concept within an enterprise conceptual model.

Enterprise data ownership is assigned to a business area at subject area level, whereas at the conceptual level business experts with a broad knowledge are assigned enterprise data ownership.

Data Concepts

Concepts describe the information produced and consumed by an organisation, independent of applications. A concept brings in the relationships between the subject areas and defines the subject areas and their scope.

Concepts are found at different levels of granularity depending on their business relevance, so a concept may cover a very large or a very small volume of data. The point is that concepts represent important business ideas, not an amount of data. Concepts give an integrated view of the business rather than a set of stand-alone areas.

The enterprise conceptual model level is ideal for information systems planning activities. By representing future and existing information systems, the subsets of the concepts can be extracted. These subsets can be used as a tool to visualise and understand the existing information systems and also to identify the system overlaps and dependencies.

The enterprise conceptual model lays the foundation for creating the Enterprise Conceptual Entity Model (ECEM), which is the third level of the enterprise data model. It is difficult to create enterprise conceptual entity model without the framework of the enterprise conceptual model.

Enterprise Conceptual Entity Model

An enterprise conceptual entity model is the third and last level in the hierarchy of the enterprise data model. It is the detailed level of the enterprise data model. It expands each concept within every subject area, adding finer details.

At this level of the EDM, the business and its data rules are examined to create the major conceptual entities, their business keys, relationships and important attributes. The framework that the ECEM provides for data system design and development depends on the number of concepts it expands: the more concepts are expanded, the better the framework.

An ECEM provides a data architectural framework for an organisation's data designs and data stores while ensuring data quality, scalability and integration. The business's data requirements and data sources supply the finishing material for data design. This finishing material, which gives the details needed to complete the data designs, is attached to the ECEM framework. An ECEM, acting as an integrated data architectural framework, can also be used as a source of reusable data objects for the construction of the organisation’s data stores.

Enterprise Conceptual Entity Model has four important components as follows:

· Entity Concepts: Conceptual entities are the important things in the business. They are independent of technology and applications. Depending on business and data relevance, entity concepts exist at different levels of granularity. The level of granularity also depends on the information known at the time of creation. An entity concept can also be a subtype within the enterprise data model. Each entity concept ultimately represents multiple logical entities and possibly physical tables.

· Relationships: Relationships between conceptual entities represent the data rules that are important to the business. The relationships component of the enterprise conceptual entity model defines the interdependency of the conceptual entities. Only the relationships that are useful and important to the business data concepts are resolved. Relationships are defined in both directions. Names of relationships may not be displayed on the model, but they are always defined and documented. Names should make sense within an English sentence. Enterprise data integration is generally defined in terms of keys and relationships. If a relationship does not work correctly, this indicates an incorrect assumption about the business rules.

· Name and Definition: The names of the entity concepts are business oriented and are not influenced by systems or applications. Abbreviations and acronyms are not used in the naming structure. The names are simple and appropriately descriptive. Enterprise definitions are created from the intersection of all business definitions and usage. The enterprise definition improves the context of the information. It is as complete and detailed as necessary for clarity, while remaining simple and concise.

· Primary Keys and Significant Attributes: A conceptual entity contains a primary key that represents its unique identity in business terms. A key validates business rules. As entity concepts are related, these keys are inherited, and they must work correctly. Attributes are the additional factors that are included for business significance and enterprise data integration.
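The four components above can be pictured with a small data structure. This is only a sketch under assumptions: the class names, the Customer and Order entities, their keys and the relationship names are invented for illustration and do not come from any particular modelling tool.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ConceptualEntity:
    """An entity concept with a business-oriented name, definition and key."""
    name: str
    definition: str
    primary_key: Tuple[str, ...]
    attributes: List[str] = field(default_factory=list)


@dataclass
class Relationship:
    """A relationship named in both directions between two conceptual entities."""
    parent: ConceptualEntity
    child: ConceptualEntity
    forward_name: str  # reads parent -> child
    reverse_name: str  # reads child -> parent

    def sentences(self) -> Tuple[str, str]:
        """Render both directions so the names can be checked as English sentences."""
        return (
            f"{self.parent.name} {self.forward_name} {self.child.name}",
            f"{self.child.name} {self.reverse_name} {self.parent.name}",
        )

    def inherited_key(self) -> Tuple[str, ...]:
        """The key the child inherits from the parent through this relationship."""
        return self.parent.primary_key


customer = ConceptualEntity(
    name="Customer",
    definition="A party that buys goods or services from the organisation.",
    primary_key=("customer number",),
    attributes=["customer name"],
)
order = ConceptualEntity(
    name="Order",
    definition="A request by a customer to purchase goods or services.",
    primary_key=("order number",),
    attributes=["order date"],
)
places = Relationship(customer, order, "places", "is placed by")

for sentence in places.sentences():
    print(sentence)
print("Order inherits key:", places.inherited_key())
```

Reading the two generated sentences aloud is a quick check that the relationship names make sense in both directions.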

Granularity of the Data

Granularity of the data is the measure of the degree of detail or precision contained in the data. The business domain requires maximum granularity, which corresponds to perfect precision. Data that has the maximum granularity is called atomic data. Granularity has various dimensions, among which time and space are considered the critical ones.

Depending on the application, the granularity of the data yields either raw data or cooked data. Raw data refers to the detailed records of business transactions and market intelligence. Cooked data refers to consolidated or derived reports of business results. Raw data gives finer details than cooked data.

Granularity of data is an important factor to consider in any business that involves data warehousing, and it is one of the main reasons for an organisation to build business intelligence. Granularity is usually mentioned in the context of dimensional data structures, which involve facts and dimensions. It refers to the level of detail in the fact table of the database. The more detail in the fact table, the higher the granularity of the data, and vice versa. A fact table with higher granularity will have more rows.
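The link between the grain of a fact table and its row count can be shown with a handful of made-up transactions. The data, the store names and the `aggregate` helper below are hypothetical; the sketch only demonstrates that a coarser grain collapses detailed rows into fewer summarised rows.

```python
from collections import defaultdict
from datetime import date

# Raw (atomic) data: one row per transaction.
transactions = [
    (date(2024, 1, 5), "store-1", 120.0),
    (date(2024, 1, 5), "store-1", 80.0),
    (date(2024, 1, 5), "store-2", 50.0),
    (date(2024, 1, 20), "store-1", 200.0),
    (date(2024, 2, 3), "store-2", 75.0),
]


def aggregate(rows, grain_key):
    """Sum the amounts of all rows that share the same grain key."""
    totals = defaultdict(float)
    for day, store, amount in rows:
        totals[grain_key(day, store)] += amount
    return dict(totals)


# Grain: one row per day per store.
daily_by_store = aggregate(transactions, lambda day, store: (day, store))
# Grain: one row per month across all stores.
monthly = aggregate(transactions, lambda day, store: (day.year, day.month))

print("transaction grain rows:", len(transactions))
print("daily per store rows:", len(daily_by_store))
print("monthly rows:", len(monthly))
```

The totals are the same at every grain; what is lost at the coarser grain is the ability to drill back down to individual transactions.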

Data to be considered for the granularity characteristics varies from application to application, as follows:

· In an application where entities are sorted into categories, granularity refers to whether there is a large number of narrow categories or a small number of broad categories.

· In an application where a group is divided into subgroups, granularity refers to the number of subgroups.

· In an application where a space, such as a network of requirements or activities, is carved into manageable chunks, granularity refers to the size of the chunks.

· In an application that requires many measurements to be taken, granularity refers to the intervals, in time or space, between the measurements.

In some applications, granularity refers to the depth of the data: the lower the level in the hierarchy, the finer the granularity. For example, consider the date and time dimension. The level of granularity could be year, quarter, month, week, day, hour, minute, second or hundredth of a second. In a data warehouse, granularity at the level of seconds or hundredths of a second is usually not practical.
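The levels of the date and time dimension can be illustrated by truncating one timestamp to each grain. The `truncate` helper is a hypothetical function written for this post, and it assumes that weeks start on Monday and that quarters follow the calendar year.

```python
from datetime import datetime, timedelta


def truncate(ts: datetime, grain: str) -> datetime:
    """Return the start of the period that contains ts at the given grain."""
    if grain == "year":
        return datetime(ts.year, 1, 1)
    if grain == "quarter":
        first_month = 3 * ((ts.month - 1) // 3) + 1  # 1, 4, 7 or 10
        return datetime(ts.year, first_month, 1)
    if grain == "month":
        return datetime(ts.year, ts.month, 1)
    if grain == "week":
        monday = ts - timedelta(days=ts.weekday())  # assumes Monday-based weeks
        return datetime(monday.year, monday.month, monday.day)
    if grain == "day":
        return datetime(ts.year, ts.month, ts.day)
    if grain == "hour":
        return ts.replace(minute=0, second=0, microsecond=0)
    if grain == "minute":
        return ts.replace(second=0, microsecond=0)
    if grain == "second":
        return ts.replace(microsecond=0)
    raise ValueError(f"unknown grain: {grain}")


event = datetime(2024, 8, 17, 14, 35, 52, 123456)
for grain in ["year", "quarter", "month", "week", "day", "hour", "minute", "second"]:
    print(f"{grain:>8}: {truncate(event, grain)}")
```

Two events fall into the same row of a fact table exactly when they truncate to the same value at the chosen grain, which is why a finer grain produces more rows.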

 

Hope you like the Business Intelligence – Tools & Theory series!

Happy Learning and Sharing !!
