1. 43, 50. implement index maintenance tasks
Hello. And in this section, we’re going to be implementing performance-related maintenance tasks. I’m going to start with indexes. So we’re going to be looking at index maintenance tasks and assessing the growth and fragmentation of a particular index. So first of all, we’ve had a look before at how to find missing indexes. We’ve had a look at the Sys. D.M.D.B. missing index details. So that tells me that there is a particular table that could do these indexes with these fields being included in an index. So we’ve had to look at that in an earlier video. Now we’ve also had a look at how indexes can be split. Assume we have this as a page of an index. So we have 12346, and now I want to insert five into that. But this is as many items as you can get on a particular page.
So what might happen is that I move six onto another page; maybe I’ll move four and five there as well; and then I have this page. So now I’ve got two pages in this index. Now let’s suppose that instead of starting with 12345, I start off with eleven, 2233-4455. And now I’m going to be inserting some additional items. Let’s insert 1234. Well, maybe I’ll put them on a later page, and then I’ll be linking backwards from this page to that page. So this is the first page, and this is the second page. This is how fragmentation can work, where you have an overall item that says, “Okay, go to page 2 first and then to page 1.” This is the order. So fragmentation happens when you split pages, but then you get pages in the wrong order in terms of how they can be read. So what is the fragmentation of particular database indexes? Well, we can use this to show us. So why should I care whether there is fragmentation?
Well, it can degrade query performance because there are more input and output requests with smaller numbers of data in each request, and each page can be fragmented up to 100%. So I’ve got this SysDML database index and physical statistics that can provide us with that information. Now, if I have a look at that the other way around, you can see it gives things like average fragmentation in percentage and the object ID. So you can look that up using the object underscore ID, the index ID, but you can also see which object it relates to. So this can let you know what the fragmentation is like. Now, you can also use the DBCC command show contig. However, this has been deprecated, and we don’t recommend that you use it anymore. Incidentally, if you’ve got column store indexes, then you should be using Sys. DmdbColumnStore Raw Group Physical Stats. So they both end with physical stats. And the main thing that you should note—I haven’t got any column store indexes—is the number of deleted rows and the total number of rows, because when you’ve got that information, you can divide one by the other to say, “Okay, this is how fragmented it is.” When you update a row in a column store, what happens is that it deletes or marks the original row for deletion and then creates a new row. So that’s why using deleted rows and total rows is quite useful. So we found that there’s an index you need to update.
What should you use? Well, you should use the Altar Index. Now we have two main options: reorganise and rebuild. And it used to be that a rebuild was great if you could spare time with that index not working. However, nowadays you have a bit that says “online equals on,” so you can do it online. When should you reorganize, and when should you rebuild? There are no hard-and-fast rules. If you look on the internet, you’ll probably see that reorganisation should be done if you’ve got fragmentation between 10 and 30% and reloading above 30%. And if you’ve got column store indexes, then you should reorganise if you’re above 20%, and it will remove the roles marked as deleted. However, there’s no hard and fast rule; those are just guidelines. So here is the reorganised section. So alter the index for all, or you can name a particular index on and then the name of the table reorganize. So you can see it didn’t take that long, but then it wasn’t a particularly big index. Now we can use rebuild, and we’ve got a few options with rebuild.
So, with online equality or online equality off fill factor, we can say. What does that mean? Well, let’s say we have a page that allows us to have five roles on it. In reality, it’s 8192 characters, but let’s say it’s five rolls. The fill factor is a percentage of how much it’s going to be filled up. So if I had this to start with and I only had a 60% fill factor, then if I want to insert two and four, I can do that within the existing page. So the fool factor allows us to have a bit of expansion, but then it does create a bigger index because we are having blank spades at their maximum duration. This is how long it takes to compute the index for rebuilding. It may be that you can only do 30 minutes. So that’s how you can do it. Now, this can be used with SQL Server 2017 or later and with Azure SQL Database, and then after those 30 minutes, well, it pauses. You can now say “resumable,” and we’ll make it so. So you can say “auto index,” “name of index,” “where that index is,” and “pause,” “buy,” or “resume.” So I won’t be needing to do this for this particular rebuild, as you can see, very quickly, but that’s what you can use. So indexes, we’ve previously gone through missing indexes, so it’s well worth revisiting that section.
If this is new to you, you can also look at the fragmentation of an index by using Dmdb index physical stats. So, for a given amount, you could say where the average fragmentation in percentage is, or even order it descending. As a result, the largest is at the top. So it’s given me an ambiguous column. It’s because I’ve got this star at the end, so that should work now. So here you can see the biggest average fragmentation, but there are some where the page count is so small. If it’s a small index, does high fragmentation really matter about high fragmentation? After all, it’s not going to slow down the machine that much. So I’d say multiplied by the page count to include the page count in the calculation. So I’ll be looking here. Okay, here is one particular index that could do with being rebuilt or reorganized. And as a rule of thumb, you may wish to reorganise when you are between 10% and 30% fragmented and rebuild if you are more than 30% fragmented. And if it’s resumeable, you can pause or bore the index if it’s starting to interfere with other users being able to use your database.
2. 44. implement statistics maintenance tasks
In this video, we’re going to talk about statistics maintenance tasks. What are statistics? Well, they’re used to create query plans to improve the speed of queries. The statistics contain information about the distribution of values and tables, or indexed views, in the columns. So maybe it says if this value is one, then there are a million rows. If this value is two, then there are 100 rows.
It uses it to estimate the cardinality, say, the number of roles in a result, which enables the query optimizer to create better plans. So in other words, it could use seek rather than scan if an index would be of use. Now, usually the query optimizer determines when statistics might be out of date and then updates them. However, you might want to manually update them if query execution times are slower, you are inserting ascending or descending key columns such as identity or timestamp columns, and after maintenance operations such as bulk insert, that is to say you’re inserting a million, 2 million, or a billion rows at once. You don’t need to do it after rebuilding or reorganising an index, as you’re not actually changing what roles are there, so the statistics don’t change. So there are two things to know. First and foremost, we must run SP underscore Update Stats.
As a result, this storage procedure updates statistics for all user-defined and internal tables. But if you just want to update one particular index, then you can use “update stats” and the name of the table or the name of the index. So I’m updating this particular one. So there are a few options you can use. First of all, you can use a full scan, so this scans all of the roles, or you can just get the computer to scan a sample. So you could say we sampled 10% or 20% of the roles. You can also say with resampling, which is its most recent sample rate. You can add a comma after the first two, and then persistunderscore sample underscore percentage continues. So whatever options you have chosen will be the default for future statistics updates. However, I’ve not actually been able to get that to work with the Azure SQL database, so I suspect it’s not available in this particular database. But it’s useful to know that it is an option. So this is how you can implement statistical maintenance tasks. You can either have the still procedure SP underscoreUpdate Stats do them all, or you can have one particular index updated with Update Statistics, name a table, name an index, and then with a full scan with a 10% sample, or we’ve resampled.
3. 45, 82. configure database auto-tuning and automate performance tuning
Now in this video, I want to talk about database auto-tuning. Auto tuning is a process that learns about your workload and identifies potential issues and improvements using the philosophy Learn, adapt, verify, and repeat. So configuring data for auto tuning is very easy. We just go down to intelligent performance and automatic tuning. You will now see three options: plan, create index, and drop index. These are the ones in the Azure SQL database. However, in a SQL managed instance, there is only a Force plan. The other two do not currently exist.
So let’s talk about the force plan. So this is also called force. Underscore the last underscore. Good. Underscore plan. Force last. Good planner. Now it is enabled by default. So you can see the current state as inherited from the server. So if we were to go into the server for this Azure SQL database and go down to the automatic tuning, you can see we have On, Off, or Inherit. So it’s currently inherited from the Azure defaults, which are on for this. So what does “force a good plan” mean? It means that if something went wrong with a plan, or if some plan change regression was discovered, So in other words, where something is no longer good, then the Last Good Plan should be used, or forced. That is the force strategy. So that’s roughly when the estimated gain is above 10 seconds or the number of errors in the new plan is above the recommended plan. So that’s Force plan. Also referred to as the “Force Last Good Plan.” Now, for your SQL database, we can automate index maintenance as well.
So we can change, create an index, and drop indexes so we can keep those from inheriting. So it was inherited from the server. The server inherits them from Azure and changes them to “on.” So you can see that the default in Azure is that they are off for both of these. So if I do this in the database, this is overriding the settings from the server. So note that the Drop Index option is not currently compatible with partitioning, switching, or index hints. So if you use those, then you can’t use Drop Index. As a result, indexes will be generated automatically. If the CPU data I/O and log IO are lower than 80% and it thinks it’s a good idea, the performance of queries using the automatically generated indexes will be reviewed, and if it doesn’t improve performance, then they’ll be automatically dropped. So let’s have a look at a few TSQL commands. First and foremost, Sys Indexes displays all of the indexes in a specific database. However, on the right hand side, we have Autocreated. There it is. While none of these are auto-generated, they are indexes that were created using auto-generation. So needless to say, if I run this now, there won’t be any. As a result, you can determine which indexes were created from Sys indexes.
Now, if I wanted to do what I’ve just done in the portal, I could do it like this. So alter the database name so I’ll put DP 300 set automatic tuning equals auto. I can also say that I inherited something or that I follow a tradition. So if I say auto, there we go. If I want something unique, I can put it in brackets. false: the most recent good plan equals The answer is yes. drop index equals But again, you won’t be able to use these last two. In a SQL managed instance, they aren’t there. So now I’ve done that. If I go back into the portal and refresh, you can see that it now shows the desired state. And then finally, what are the tuning recommendations, if any? You can find those in Dmdb tuning recommendations if you want to go underneath the hood and see what is actually happening. So configuring the database requires auto-tuning. So you can do this quite easily in the Azure Portal by going to automatic tuning, and you can also do it in TSQL by using Alter database database nameset automatic underscore tuning. You can also see which indexes are auto-created by going to Assist indexes and looking at the Altor underscore created column.
4. 47. manage storage capacity
In this video, we’re going to take a look at how we can manage storage capacity. So the following may apply just to Azure SQL databases, by the way, and not Azure SQL managed instances. So why might you want more storage capacity? Or you might need to just add more space, or even decrease the maximum capacity of a database elastic pool. Don’t forget: the more you allocate, the higher the potential cost. Or you might want to change to a different service tier. Now, there’s some terminology we need to look at. Data space used generally increases with inserts and decreases with deletes, but it is dependent on fragmentation to determine whether that happens.
Now, in contrast to the wave data space used, that’s the amount actually used. You have a data space allocated, so that is the amount that is made available for your data. Now, this can grow automatically. Let’s say you add a billion rows; then the amount used increases, but the amount allocated also increases. However, if I delete a billion rows, the amount of dataspace allocated doesn’t decrease automatically after these deletes. So there’s a bit in the middle. So the difference between data space allocated and data space used is data space allocated but unused. So this can be reclaimed when data files are shrunk. And then there is a data maximum size, so that’s the maximum that data space allocated could be. So let’s have a look at how we can display the allocated space. So, first of all, I need to be in the master database for this. So let’s change it to Master. So we can see here that we’ve got Sys resource statistics. And if I run that, you can see that I have got 32 megabytes. There are more columns in this particular table than we have looked at before. But as you can see, we’ve got 32 megabytes of allocated storage.
If this was an elastic pool, then I would use Sys elastic pool resource statistics. Again, we’ve had a look at that before. Now, if we want to display the maximum size, So here we’re displaying it for DP 300 using database properties, ex. As a result, the maximum size and database property x are provided. This gives you a property. So in this case, the maximum size is in bytes. So I need to switch from Master. If I do it in Master, the answer is null. So if we go to another database, we get this ridiculously big answer. It just means unlimited. If you want to view the current log size, we can do that using these system database files. Type one is a log file, so you can see that there is one log file. If I run this without the type, you can see one log file. We’ve also got things called roles and file streams. Now, I can shrink the transaction log file, which in this case is file ID 2. So I would run a DBCC shrink file, and in brackets, two DBCC, which, by the way, stands for database console commands. So that will shrink one individual file. So now you can see the result. So this file, this log file, occupies 3072 pages, 8192 characters, or 8 KB.
This is the minimum size of the number of used pages, or the number of eight kilobyte pages, and the estimated size that it could be shrunk down to. Now I can specify how small I want it to go; I could say I want it to go down to five megabytes. So if I want to do that, I’ll put the comma after “five.” I could also write “empty file” if I wish to actually empty it. Other options include no Truncate, which moves allocated pages to the front of the file, and Truncate only, which releases all free space but does not move any pages. But in reality, I would just use the BCC shrink file and then the number. Now, it can take a bit of a while to do it, depending on how big the size is. So a shrink operation is probably most effective when you’ve actually got a lot of unused space. For instance, you’ve dropped an array or truncated a table. That means removing all of the data. I wouldn’t do it that often, to be honest. If you want to shrink an entire database, you can also do that with shrink database and then, in brackets, the name of the database. Note that the name of the database is not in quote marks. It is only in this instance in DP 300. So this will go through all of the files. We know there could be three of them and will reduce their number. However, I wouldn’t do it that often because files need room to grow, and if you’re shrinking everything down, then they’ll just have to grow, and that takes time. So this is how we can manage storage capacity. In the next video, we’ll look at how we can assess the growth of databases and logs and report on database free space.
5. 50, 105. assess growth/fragmentation and report on database free space
In this video, we’re going to take a look at the growth of databases and logs and reports on database free space. So to assess growth in a database, we can use Sys Resource Stats. So again, this is something we keep bumping into. So we can see the start time, database name, and storage size in megabytes here. Now you need to be in the master database for this within your SQL database. If you don’t, then you’ll get an invalid object name. for an elastic pool. You would use Sys elastic pool. Resource stats. SP underlining Space Used can now be used to display the amount of database free space and the amount used. So you can see that this database has a database size of 40 megabytes with eight megabytes of unallocated space of eight megabytes.
If I change it to the DP 300 database, then you can see that there are different statistics. Now, I can display this by file. So if I use the Sys database underscore file, we took a look at that in the previous video. So again, here’s our log file, and you can see the current size. These are again measured in pages, each of which is 8 KB. So you have to multiply this by eight to get the number of kilobytes and a maximum size. So, if you want to see the number of pages that have been used as well as the total free space for a specific database, you can use Sys Dmdb file space usage. So you can see that we’ve allocated 1808 pages. We’ve got 240 that have not been used and are unallocated. Now, you can also use the DBCC command SQL Perf and bracket logspace. However, this only reports on transaction logspace statistics, but you can see the percentage of the logspace that is used there. Now, this is not valid for an Azure SQL database, but if I were to go to a VM, so I’m just going to go to my local machine and have a look at a database and right-click and go to reports, standard reports. You can see that we’ve got disc usage.
So there I could see more information about my database in a graphical form, along with the amount of disc space used by data files. However, that’s not able to be used in an Azure SQL database. If I tried to do that, then that option is simply not there. Now for the Tempt DB database, you can use these two, and I’m going to play both of them. So first of all, we’ve got SysDb space usage. So this shows the number of pages, and we’re using the word “internal” to represent the TempDB database number of pages allocated and de-allocated by each session. And then we’ve got a similar one here for SysDmDb taskspace usage. Again, we’re using internal, whereas the previous one used internal pages for each session. This one uses internal pages allocated and de-allocated for each task. So those are two that can be used: wave, temp, and DB. But look at the internal section on the right side, and you’ll see the columns. So it is these two columns that would be of most interest in looking at tempdb. So in this video, we’ve had a look at how to report on database free space. You can use Exec SP underscore space used. You can use SysDB files to display it by file. You can use SysDmdb FileSpaceUsage, which results in this. So it shows space usage for each datafile in a database on pages 8 and 9. Dbccsql perf brackets and log spaces can also be used. And for TempDB, you can also use SysDmDb session or task space usage to assess growth in the database. You can use our old friend, Sys resource stats, or Sys elastic pool resource stats.