Data Model and Data Quality Parameters


Q1. Explain and detail logical and conceptual design phases

Conceptual: The Conceptual Data Model defines WHAT the system contains, i.e., the user requirements from the business. This model is typically created by Business Stakeholders and Data Architects. Its purpose is to organize, scope and define business concepts and rules.

Logical: The Logical Data Model defines HOW the system should be implemented, regardless of the Database Management System (DBMS). This model is typically created by Data Architects and Business Analysts. Its purpose is to develop a technical map of rules and data structures.
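For illustration only (the entities, attributes and keys below are hypothetical, not taken from the referenced article), the conceptual model might simply state that a Customer places Orders, while the logical model spells out attributes, primary keys and relationships without committing to any particular DBMS:

```python
from dataclasses import dataclass
from datetime import date

# Conceptual level (business view): a Customer places Orders.
# Logical level (technical map, DBMS-independent): the same entities
# get attributes, primary keys and foreign-key relationships.

@dataclass
class Customer:
    customer_id: int   # primary key
    name: str
    email: str

@dataclass
class Order:
    order_id: int      # primary key
    customer_id: int   # foreign key -> Customer.customer_id
    order_date: date
    total_amount: float
```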

Reference: https://www.guru99.com/data-modelling-conceptual-logical.html

Q2. Research and read articles on Data Quality parameters used in a variety of day-to-day activities

Data quality is very important to a business's ROI (Return on Investment). So, the question is: how do you measure data quality?

To measure data quality, we need data quality metrics, which also help increase the quality of the information. Among the various techniques of quality management, data quality metrics must be top-notch and clearly defined. These metrics encompass different aspects of quality, which can be summed up with the acronym “ACCIT”, standing for Accuracy, Consistency, Completeness, Integrity, and Timeliness.

While data analysis can be quite complex, there are a few basic measurements that all key DQM (data quality management) stakeholders should be aware of. Data quality metrics are essential to provide the most solid basis you can have for future analyses. These metrics will also help you track the effectiveness of your quality improvement efforts, which is of course needed to make sure you are on the right track. Below are the key components of the data quality parameters:

  • Accuracy

Accuracy refers to business transactions or status changes as they happen in real time. It should be measured through source documentation (i.e., from the business interactions), but if that is not available, then through independent confirmation techniques. It indicates whether data is free of significant errors.

A typical metric to measure accuracy is the ratio of data to errors, which tracks the number of known errors (such as missing, incomplete or redundant entries) relative to the size of the data set. This ratio should of course increase over time, showing that the quality of your data is improving. There is no single target ratio of data to errors, as it very much depends on the size and nature of your data set – but the higher, the better.
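As a minimal sketch of this ratio (the record layout and the definition of an “error” as an empty required field are assumptions, not from the article):

```python
def data_to_error_ratio(records, required_fields):
    """Ratio of total records to records with known errors
    (here: any required field that is missing or empty)."""
    errors = sum(
        1 for r in records
        if any(not r.get(f) for f in required_fields)
    )
    # Avoid division by zero when no errors are found.
    return len(records) / errors if errors else float("inf")

# Example: 1 of 3 records has an error -> ratio of 3.0
records = [
    {"id": 1, "name": "Alice"},
    {"id": 2, "name": ""},        # incomplete entry counts as an error
    {"id": 3, "name": "Carol"},
]
print(data_to_error_ratio(records, ["id", "name"]))  # 3.0
```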

  • Consistency

Strictly speaking, consistency specifies that two data values pulled from separate data sets should not conflict with each other. However, consistency does not automatically imply correctness.

An example of consistency is a rule that verifies that the sum of employees across the departments of a company does not exceed the total number of employees in that organization. Another example is that units of measurement should be the same: if a temperature field uses Fahrenheit, all of the data should use Fahrenheit, not some of it Celsius.
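A minimal sketch of both checks, assuming hypothetical department headcounts and temperature readings:

```python
def headcount_consistent(department_counts, total_employees):
    """Sum of employees across departments must not exceed the
    organization's total number of employees."""
    return sum(department_counts.values()) <= total_employees

def units_consistent(readings, expected_unit="F"):
    """All temperature readings must use the same unit
    (e.g., all Fahrenheit, none in Celsius)."""
    return all(r["unit"] == expected_unit for r in readings)

print(headcount_consistent({"Sales": 40, "IT": 25}, total_employees=70))  # True
print(units_consistent([{"value": 98.6, "unit": "F"},
                        {"value": 37.0, "unit": "C"}]))                   # False
```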

  • Completeness

Completeness indicates whether there is enough information to draw conclusions. It can be measured by determining whether each data entry is a “full” data entry: all available data entry fields must be complete, and sets of data records should not be missing any pertinent information.

For instance, a simple quality metric you can use is the number of empty values within a data set: in an inventory/warehousing context, each line item refers to a product, and each of them must have a product identifier. Until that product identifier is filled in, the line item is not valid. You should then monitor that metric over time with the goal of reducing it.
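A minimal sketch of that empty-value count, assuming a hypothetical product_id field on each line item:

```python
def incomplete_line_items(line_items, key_field="product_id"):
    """Count line items whose product identifier is missing or empty;
    monitor this number over time with the goal of driving it down."""
    return sum(1 for item in line_items if not item.get(key_field))

inventory = [
    {"product_id": "SKU-001", "qty": 12},
    {"product_id": "",        "qty": 5},   # not valid until the identifier is filled in
    {"qty": 3},                            # identifier missing entirely
]
print(incomplete_line_items(inventory))  # 2
```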

  • Integrity

Also known as data validation, integrity refers to the structural testing of data to ensure that the data complies with procedures. This means there are no unintended data errors and that the data corresponds to its appropriate designation (e.g., date, month and year).

Here, it all comes down to the data transformation error rate. The metric you want to use tracks how many data transformation operations fail relative to the total – or, in other words, how often the process of taking data stored in one format and converting it to a different one is not successfully performed.
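A minimal sketch of the transformation error rate, assuming the transformation is parsing raw date strings into a single expected format:

```python
from datetime import datetime

def transformation_error_rate(raw_values, transform):
    """Fraction of transformation operations that fail, e.g. parsing
    raw date strings into date objects during a format conversion."""
    failures = 0
    for value in raw_values:
        try:
            transform(value)
        except (ValueError, TypeError):
            failures += 1
    return failures / len(raw_values) if raw_values else 0.0

raw_dates = ["2023-01-15", "2023-02-30", "15/03/2023"]  # the last two cannot be parsed
rate = transformation_error_rate(
    raw_dates, lambda s: datetime.strptime(s, "%Y-%m-%d")
)
print(rate)  # 0.666...
```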

  • Timeliness

Timeliness corresponds to the expectation for availability and accessibility of information. In other words, it measures the time between when data is expected and the moment when it is readily available for use.

A metric to evaluate timeliness is the data time-to-value. It is essential to measure and optimize this time, as it has many repercussions on the success of a business. The best moment to derive valuable information from data is always now, so the earlier you have access to that information, the better.
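A minimal sketch of data time-to-value, assuming hypothetical expected and actual availability timestamps:

```python
from datetime import datetime

def time_to_value(expected_at, available_at):
    """Delay between when data was expected and when it actually
    became available for use (negative means it arrived early)."""
    return available_at - expected_at

expected = datetime(2024, 1, 1, 8, 0)       # data expected at 08:00
available = datetime(2024, 1, 1, 9, 30)     # actually loaded at 09:30
print(time_to_value(expected, available))   # 1:30:00
```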

Whichever way you choose to improve the quality of your data, you will always need to measure the effectiveness of your efforts. All these data quality metrics examples make a good assessment of your processes and shouldn’t be left out of the picture. The more you assess, the better you can improve, so it is key to keep them under control. SQL queries should also be kept concise, and a query should return its results in no more than two seconds.

Reference: https://www.datapine.com/blog/data-quality-management-and-metrics/
