SQL Server data masking: is it absolutely necessary? Judging by the IBM report, it is. The company writes, “Reaching an all-time high, the cost of a data breach averaged USD 4.35 million in 2022.” IBM also found that 60% of breaches led companies to increase their prices. Sounds quite unpleasant and very troublesome, doesn't it?
The alarming statistics prompt us to raise the question of sensitive data masking more vigorously than ever. To ensure that your secret data stays intact, and potential pitfalls cease to exist after just one glance at your security program, we invite you to explore the SQL Server data masking types.
What Is SQL Server Data Masking, and Why Care about It?
The core goal of data masking in general and fake data in particular is to protect the actual data that organizations collect and store. Considering how much of it we generate daily in almost every sphere of our lives (from online purchases to banking and healthcare), the volumes of what companies have to guard are huge. The sensitive data we're talking about includes:
- Personally identifiable information (PII): a person's legal name, date of birth, social security number, addresses, and more;
- Protected health information (PHI): all kinds of medical data that must be exposed only to privileged users;
- Payment card information (PCI): credit and debit card numbers, expiration dates, security codes, etc.;
- Intellectual property (IP): literary and artistic works, designs, inventions, and others.
To keep all that intact, to prevent fraud and misuses of personal information, companies can hide or restrict access to sensitive data for unprivileged users with no chance to decipher or reverse engineer it. As a result, if hackers knock on your door, the only thing they will get is a handkerchief to wipe away the tears of disappointment: the actual data is stored safely some place else.
Even though direct data protection raised data masking to its glory, this security technique is also irreplaceable when it comes to other high-stake tasks:
- With a functional copy of the original or with the original that imposes user access restrictions, teams can perform their jobs more efficiently: there's no sensitive data exposure, thus, the workflow becomes risk-free. This allows developers and testers to fully shift focus from security matters to their core responsibilities.
- Since actual values aren't put at stake, critical issues like data loss and exfiltration cease to exist. Third-party threats, as well as insider ones, also significantly decrease. By the way, as Techjury reports, businesses in the US encounter about 2,500 internal security breaches daily, while more than 34% of businesses globally suffer from insider threats yearly. Additionally, 66% of organizations believe that malicious insider attacks or accidental breaches are more likely to happen than external attacks. So yes, the fewer people can access sensitive data, the sounder all parties sleep.
- With cloud computing gaining traction worldwide, a number of data sovereignty laws has been released. They aim to protect personal data of citizens from exposure to companies in different countries. To help overcome challenges that these laws create, data masking offers data sanitization (i.e., replacement of old values with masked ones that leaves no traces), which prompts businesses to enable cloud adoption and offshore development, testing, and other operations safely.
Now, when we are aware of how data masking works and why it is vital for organizations, let's turn our attention to the data masking types that SQL Server offers.
Static Data Masking in SQL Server vs. the Dynamic Data Masking Feature: the Basics
SQL Server built-in security features offer static and dynamic data masking types. What unites them is the inability to reverse the process: if you mask data with either of these methods, that'd be permanent. This is the only thing that dynamic data masks and static data masks have in common. Aside from that, the SQL masking techniques are totally different — starting from their benefits and drawbacks and ending with the way of functioning.
To grasp the essence of both SQL data masking types, let's take a look at the comparison table below:
Static Masking in SQL Server
Dynamic Masking in SQL Server
Modifies data at rest rather in active use. Creates clean copies of databases and either hides or eliminates the sensitive information and/or PII on them. The end user simply doesn't have any confidential data.
On-demand masking of sensitive data in-transit. The original data stays intact (it isn't separated, moved, or copied from the original source like it happens with the static type), so it is masked right in the database to prevent unauthorized users from accessing it.
Perfect as a functional alternative of a company's database for sales demos, software testing, and user training. Data from scratch isn't representative of the actual situation, so its inauthentic yet realistic version does the job.
Its primary goal is to apply role-based (object-level) security for databases and legacy applications, i.e., DDM mainly works well with read-only cases. Also, since the original data remains unaltered, it is easier to navigate it, avoid silos, and receive live updates on the state of affairs, which is crucial for business analytics.
Masking sensitive data occurs at the storage level, where all database users have access to the same masked data.
Masking data happens on-the-fly at query time and varies depending on user permission.
Static Data Masking in SQL Server: Benefits and Drawbacks
Like we've mentioned above, static masking in SQL Server aims to remove sensitive data from a database copy and make it safe to deploy the copy in non-production environments. Such an approach to the application of one of these SQL Server security features leads companies to the following benefits:
- Regulatory compliance. General Data Protection Regulation (GDPR), Health Insurance Portability and Accountability Act (HIPAA), Payment Card Industry Data Security Standard (PCI DSS), and other data protection standards set limits to the usage of data that can disclose identities of individuals. The utilization of a numeric data type and string data type with a random value allows organizations to reduce risks associated with the exposure of sensitive customer information and fully comply with regulations. The latter enhances a company's trustworthiness and credibility.
- Ease of use. If you're up against a simple database, there's no need for any extra tools to do the security job: a couple of SQL scripts would be enough (if your database uses a different language, use relevant scripts). Also, you can present a physical copy of the database to your client with no fear to sabotage the information: the sensitive data isn't visible.
- Flexibility and increased data accuracy. Testing, development, and DevOps* all work closely with data, which includes reading and writing it. SQL Server Management Studio lets these teams use their own copy of the database to the fullest and not worry about violating customer data. They can apply it to craft a proper sandboxed copy of a production environment, which enables them to perform more testing activities and significantly increase the product quality. It also grants DevOps teams the chance to apply better sample data for modeling and predictions.
- Enhanced security. There are SQL Server security features that help maintain the column level and row level security. However, if you select the static data masking type, the need for object-level, granular protection measures evaporates. Such an approach leads companies to time- and cost-efficiency because you can take care of everything at once and never return to the task (unless there's a new database copy to make).
*If you're interested in DevOps services, quality assurance, or any type of app development, let us know. We cover a wide spectrum of digital needs globally.
Are There Any Drawbacks?
While the idea to hide data statically is alluring, it has its downsides:
- SQL Server data masking for the static type occurs via batch processing, not in real time. It can take minutes or, more likely, hours.
- Since we're talking about big data that constantly alters, you have to be ready to run the extremely lengthy masking periodically to hold an up-to-date copy on your hands. You also need to prepare a lot of storage to contain alternative databases there.
- If you have to release your database to the public, that's fine: you'll just create one masked copy. However, if you're dealing with clients (that all have varying database requirements), it will turn out to be a daunting, time-consuming, and complex task. You have to be very careful with the preparation and distribution of custom copies, since it's easy to mix up rules.
- After you statically mask data, your original data disappears forever. Thus, this type of SQL Server masking works only for protection of production copies, not the production database.
Dynamic Data Masking in SQL Server: Benefits and Drawbacks
In SQL, dynamic data masking is all about altering things on the go, whenever such a need arises. This approach to security has its upsides:
- Control. You fully control the database and monitor client activities. You can implement masking rules and expand or limit someone's access to certain data at any given moment. After all, the sensitive information is right there — but you decide who sees it.
- Data unity. Since dynamic data masking doesn't affect the database itself, in theory, everyone should be able to use a single database successfully and receive live updates on the state of affairs. This helps avoid the data silos issue (hello, static masking!) and provide all team members with the same insights simultaneously.
- Masking flexibility. As Microsoft defines it, “DDM features full masking and partial masking functions, and a random mask for numeric data.” Also, you do not have to mask data in advance, which saves teams a lot of time and increases workflow efficiency.
Are There Any Drawbacks?
It's not always rainbows and sunshine in the dynamic realm. SQL Server dynamic data masking has a few quite heavy disadvantages:
- Despite its name, this security technique doesn't mask dynamically stored procedures. To do that, you have to rewrite the query results, not the query itself.
- And again — despite its name, it's not the best fit for a dynamic environment, as someone might write masked data back to the database and corrupt it.
- Dynamic data masking doesn't protect the underlying data, it masks information only as it responds to a query. Protection of the underlying data requires you to combine DDM with another SQL Server security feature like encryption, row-level security, or auditing.
- Consequently, it's not a complete solution to fully protect your data, and it will not be enough to comply with regulations like HIPAA.
- Dynamic masking is considered to be less secure than static, because at the end of the day, users are connecting to a database that still contains sensitive information.
#1. What is data masking in SQL?
That is the process of hiding or restricting access to sensitive data to protect it from fraud and misuse.
#2. What is dynamic masking?
As Microsoft writes about it, DDM “helps prevent unauthorized access to sensitive data by enabling customers to specify how much sensitive data to reveal with minimal effect on the application layer. DDM can be configured on designated database fields to hide sensitive data in the result sets of queries.”
#3. What is the difference between static data masking and dynamic data masking?
Static data masking replaces sensitive data in a database copy with structurally similar but inauthentic information at rest. Dynamic data masking alters sensitive data on-demand, which means it doesn't affect the original at-rest data and restricts access to it for unauthorized users.
#4. What is the difference between data masking and encryption in SQL Server?
Data masking protects data through either the elimination of its secret part or its replacement. Such replacement has a different value, but is structurally identical to the original. Encryption alters sensitive data to the point where it's unreadable without a special key.
#5. What is an example of masking in SQL?
There are many types of personal information that you can hide using data masking in SQL Server. The examples in this particular case would be phone number (xxxx-xxx-xx-8976) and email (email@example.com).
#6. Which version of SQL Server has dynamic data masking?
According to Microsoft, dynamic data masking is available in SQL Server 2016 (13. x) and Azure SQL Database, and is configured by using Transact-SQL commands.
#7. What are the limitations of dynamic data masking in SQL Server 2016?
DDM has quite a few serious downsides. One of them is the inability to prevent users from exposing sensitive data, because they can continually query the database. It also doesn't protect underlying data and allow you to comply with security regulations.
#8. Are there any data masking solutions for open source database engines?
Yes, there are. To name a few from that long list, PostgreSQL Anonymizer, MongoDB, and MySQL. Data masking with open source database engines is more than possible.
So Who Wins?
Dynamic and static data masking in SQL both present generous offers to those who want to strengthen their security and avoid drastic consequences of sensitive data misuse. Nevertheless, each of them has serious downsides. How to make a choice? Well, if you're up against massive data, but it's okay with you to copy it and do the masking on the copy, go with the static type. However, if you're dealing with a number of clients with varying requirements, and copying a dataset is too much work, settle for the dynamic mask. Either way, remember that it's okay to seek advice from a reliable third party to ensure that your security shield is SQL — Strong, Quality, and Legendary (sorry about that).
Nevertheless, if you want to learn more about how a SQL Server data masking script or MSSQL.DataMask works, or you are looking for an experienced business partner to develop, maintain, and secure your database, contact us!