4. Demo – DRS Distributed Power Management (DPM)
And in this video, we’re going to take a look at the power management settings that are available with DRS. So under Power Management, I can choose to enable Distributed Power Management (DPM). And if I enable Distributed Power Management, here’s what the impact is. DRS is good at migrating virtual machines around inside of my cluster for the purposes of load balancing using vMotion. Now, let’s say it’s 3:00 a.m. and my workload is really low on all of my virtual machines, and I normally have ten hosts running. Well, at 3:00 a.m., five hosts might get the job done. I might be able to run all of those workloads on five ESXi hosts.
What DPM will do is consolidate those virtual machines onto those five hosts and then place the other five hosts in standby mode, which means those other five hosts are going to be powered off. So now I’m saving money on power at night, when I really don’t need those hosts. And when my workload starts to ramp back up, Distributed Power Management will start powering ESXi hosts back on to accommodate that increasing workload. Much like DRS, Distributed Power Management has different automation levels. If I choose Manual, then DPM is simply going to give me recommendations. It’s going to tell me when I can power hosts off, and I’ll have to do that manually myself.
Or I can choose Automatic, in which case, when my resource requirements go down, Distributed Power Management will migrate VMs, consolidate them onto fewer hosts, and power off the hosts that it doesn’t need. Much like with DRS, I recommend leaving this slider here at three. For most cases, that’s the ideal scenario. It’s going to start to power off ESXi hosts when my resource utilization becomes very low, when it can afford to shut down those hosts. If I slide it down toward conservative, it’ll only power off those hosts if resource usage is extremely low, so I may not get the same power savings benefit.
And if I bring it all the way down to the most conservative setting, then vCenter will only apply power-on recommendations produced to meet vSphere HA requirements or user-specified capacity requirements like reservations. Now, if I slide it up to more aggressive, power-off recommendations will be applied even if the utilization is only moderately low. So it’s going to be more aggressive about powering off ESXi hosts. And if I bring it all the way up, anytime the host resource utilization becomes lower than the target utilization, it’s going to start shutting down hosts. So again, the ideal here is just to leave this slider at three, leave it in the middle, unless you’ve got a good reason to adjust it. And then finally, I’m going to click on my Advanced Options here.
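To make the slider behavior concrete, here’s a small Python sketch of how an aggressiveness level from 1 (most conservative) to 5 (most aggressive) might gate power-off recommendations. The target utilization and the per-level margins are made-up illustration values, not VMware’s actual DPM algorithm:

```python
# Hypothetical target host utilization, e.g. 63%. Purely illustrative.
TARGET_UTILIZATION = 0.63

# Hypothetical margins: conservative levels demand utilization far below
# the target before recommending a power-off; aggressive levels act sooner.
POWER_OFF_MARGIN = {1: None, 2: 0.40, 3: 0.25, 4: 0.10, 5: 0.0}

def recommend_power_off(level: int, cluster_utilization: float) -> bool:
    """Return True if DPM-like logic would recommend powering off a host."""
    margin = POWER_OFF_MARGIN[level]
    if margin is None:
        # Level 1: power-off recommendations are never applied; only
        # power-on recommendations for HA or capacity requirements are.
        return False
    return cluster_utilization < TARGET_UTILIZATION - margin
```

At level 3, a cluster at 50% utilization is left alone, but at level 5 any utilization below the target triggers a power-off recommendation, which mirrors the slider descriptions above.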
If you watched my video on Advanced Options for Virtual Machines, this should look familiar. Here you can see a couple of the things that I’ve configured, like the maximum number of virtual CPUs per core, and the advanced configuration parameter to try to balance the number of VMs equally across hosts. So the advanced options that I configured under Additional Options are reflected here in the Advanced Options section. Now, I’m actually going to disable DPM. The other thing that we have to bear in mind when it comes to DPM is that my hosts have to be capable of being woken up. We have to have some way for vCenter to actually wake up these ESXi hosts when they’re needed.
So here we are in the VMware documentation at docs.vmware.com, and here you can see how to configure the IPMI or iLO settings for Distributed Power Management. In order for our hosts to get powered back on, we have to configure IPMI or iLO so that it can be used to dynamically power these ESXi hosts back on across the network when they’re needed again. And there’s also a document here that will walk you through how to test Wake-on-LAN for DPM. So a very important prerequisite to configuring Distributed Power Management is to ensure that all the hosts can be effectively woken up using Wake-on-LAN.
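The vSphere Client handles the Wake-on-LAN test for you, but the magic packet itself is a standard format: six 0xFF bytes followed by the target NIC’s MAC address repeated 16 times, typically broadcast over UDP. A minimal Python sketch (the MAC and broadcast address below are placeholders, not values from this lab):

```python
import socket

def make_magic_packet(mac: str) -> bytes:
    """Build a Wake-on-LAN magic packet: 6 x 0xFF, then the target
    MAC address repeated 16 times (102 bytes total)."""
    mac_bytes = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    if len(mac_bytes) != 6:
        raise ValueError("MAC address must be 6 bytes")
    return b"\xff" * 6 + mac_bytes * 16

def send_wol(mac: str, broadcast: str = "255.255.255.255", port: int = 9) -> None:
    """Broadcast the magic packet over UDP (port 9 is a common choice)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        s.sendto(make_magic_packet(mac), (broadcast, port))
```

This is only to illustrate what is on the wire when a host is woken; in practice you let vCenter send the packet and use the documented test procedure to verify each host responds.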
5. Demo – Migrating Hosts and Resource Pools to a DRS Cluster
And so here under Hosts and Clusters, you can see in the last few videos I’ve gone through the process of building this DRS cluster, and I’ve got a couple of standalone ESXi hosts here. So let’s focus on this first ESXi host for a moment. You can see that this ESXi host has one virtual machine called Database. It’s got another virtual machine called App Server. And that App Server virtual machine is running inside of a resource pool called High. This resource pool called High is utilizing the resources of its parent object, which in this case is this ESXi host. So I have high CPU and memory shares configured for this resource pool. Those shares are only relevant on this ESXi host. So just bear that in mind.
Then I’ve got another ESXi host here with another virtual machine on it as well. At the moment, all these virtual machines are powered off, but we could power these VMs on, because migrating the hosts into my cluster isn’t going to impact the virtual machines running on them. So let’s take this first host. I’m just going to grab it, click, and drag it right into my cluster. And now it’s asking me a question: what would I like to do with the virtual machines and resource pools for host 192.168.199.11? One thing I could say is: put all of this host’s virtual machines in the cluster’s root resource pool.
So what that’s going to do is take the App Server and Database virtual machines and put them right in the root resource pool of the cluster, and it is going to eliminate this High resource pool. The other option that I can choose here is to create a new resource pool and preserve the existing resource pool hierarchy. So if there were other resource pools within this High resource pool, it would preserve those as well. I’m going to go ahead and choose that option. It’ll name the resource pool whatever is shown here, but I’m just going to change the name of the resource pool to High and go ahead and hit OK.
And so when the host is migrated into the cluster, everything should look pretty similar to the way it was, except a new resource pool was generated to contain all of the resource pools that were pulled in. And if I look at the automatically generated resource pool, it is not configured with high shares for memory or CPU. So the desired effect of this resource pool isn’t really there anymore. My High pool is getting a high number of shares, but it’s inside of a pool that’s getting a normal number of shares. So I may not want to maintain this resource pool as is. Maybe I take my App Server, drag it into this new auto-generated resource pool, and then eliminate that old resource pool and revisit the resource settings that I’ve configured here. So dragging standalone hosts into a cluster is a good time to reevaluate the configurations of your resource pools. And just be careful with this, because it may really mess up the resource structure that you’ve configured with resource pools on standalone hosts. Personally, what I’ll typically do if I’m dragging standalone hosts into a DRS cluster is revisit my entire resource pool structure at that time, build some new resource pools, and migrate the VMs over to them as appropriate, just to make sure everything is right.
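A quick numeric sketch of why the nested pool loses its effect. Shares only matter relative to siblings under the same parent, so wrapping a High pool inside a Normal-shares pool dilutes it. The share values below follow the usual Low/Normal/High 1x/2x/4x ratio, but the sibling pools are hypothetical:

```python
def fraction(shares: dict, member: str) -> float:
    """Fraction of the parent's resources a member receives under contention."""
    return shares[member] / sum(shares.values())

# On the standalone host: the High pool competes against a hypothetical
# Normal-shares sibling and wins about two-thirds of the resources.
standalone = fraction({"High": 4, "NormalSibling": 2}, "High")  # 4/6 ~ 0.67

# After migration: the High pool sits inside an auto-generated pool that
# itself has only Normal shares at the cluster root, next to a
# hypothetical Normal-shares sibling pool.
outer = fraction({"AutoGenerated": 2, "NormalSibling": 2}, "AutoGenerated")  # 0.5
nested = outer * fraction({"High": 4}, "High")  # 0.5 * 1.0 = 0.5
```

Even though the inner pool still says "High," its effective slice under contention dropped from roughly 67% to 50%, which is exactly why it’s worth rebuilding the resource pool structure after dragging hosts in.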
And then I’ll just grab my second host, drag it into the DRS cluster, and I’m just going to put all of these virtual machines right into the root resource pool of the cluster. Now, if I wanted to, I could create a resource pool for all the VMs from this host and drop those VMs into a resource pool underneath this cluster. That’s an option that’s available to me, but I don’t want to do that. I’m just going to drag this to the root of the cluster and delete this automatically generated resource pool. So there are a couple of different ways you can do it. Really, the only time that a lot of complexity gets introduced when you’re dragging a host into a DRS cluster is if you have pre-existing resource pools. If you don’t have those, then most of the time you’ll just drag the host in, you won’t create a resource pool, and you’ll just let the virtual machines go to the root resource pool, which is the DRS cluster itself. So now if I look at my DRS cluster, I can click on it.
I can see over here the hosts that are running inside of that DRS cluster, and notice the appearance of my vSphere Client here: I can’t tell which VM is running on which host. Who cares? They’re going to be migrated around anyhow. So if I want to know which VM is running on which host, there are a couple of different ways I can find that. I could click on a VM, and on the Summary screen I can see the host that the VM is running on. Or I can click on a host, go to the VMs tab, and see all the virtual machines that are running on that particular host. Just know that this is not static.
Now, I have a DRS cluster in fully automated mode, so at any time, Database might move to this host, or App Server might move to this host. DRS is going to keep this cluster balanced, and so these virtual machines may move around from host to host.
6. Demo – Monitor a DRS Cluster
So here I am in my vSphere environment. I’m logged into my vSphere Client, and you can see here under Hosts and Clusters, I currently have three ESXi hosts, all of them running ESXi 7.0. And I’ve created a cluster here. At the Summary screen, we can see there’s a little area for DRS. So I do have DRS running on this cluster, and you can see here the cluster DRS score. This is the average DRS score of all of the virtual machines in the cluster. Remember, if you’re used to vSphere 6.7 or 6.5 or any prior version of vSphere, what DRS used to do was manage the balancing of workload across all of the hosts in the cluster.
So it really took kind of a host-centric view and tried to equalize the workload on all of the hosts. What it does now is specifically look at individual VMs and try to figure out which host each VM is going to perform the best on. And we can see our cluster DRS score is 89%. That means that out of the VMs running on all of these hosts, the average DRS score is 89%. Now I’m just going to take a moment to boot up a couple more virtual machines. And when I boot them up, notice what it’s doing: it’s giving me an initial placement recommendation. It’s telling me which host it recommends that I run each virtual machine on. And the reason it’s doing that is because this particular DRS cluster is not configured in fully automated mode. Otherwise, it would just pick where to run each VM.
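Since the cluster score is just the average of the per-VM scores, the arithmetic is worth seeing once. The VM names and score values here are hypothetical, chosen to land on the 89% from the demo:

```python
def cluster_drs_score(vm_scores: dict) -> float:
    """Cluster DRS score: the average of the per-VM DRS scores (each VM's
    score reflects how well its current host is serving that VM)."""
    return sum(vm_scores.values()) / len(vm_scores)

# Hypothetical per-VM scores, as percentages.
scores = {"Database": 95.0, "AppServer": 92.0, "Web01": 80.0}
print(cluster_drs_score(scores))  # prints 89.0
```

One low-scoring VM drags the cluster average down, which is why booting a poorly placed VM visibly moves the Summary-screen number.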
So if I go to my cluster and I go to Configure, you can see here the DRS automation level is set to Manual. And so now let’s take a look at the cluster DRS score. As I boot up more virtual machines, this is changing, and it looks like there’s one VM that has a really poor DRS score. DRS is going to continue to analyze the performance of these VMs, and it’s going to determine the DRS score for each individual virtual machine. So this little screen here, and you can see it just updated again, is giving me a nice baseline as to the overall average DRS score of all of my virtual machines. For my next step here, I just want to show you this imbalance.
I’m going to click on the cluster, go to Configure, and change the DRS automation level to Fully Automated so that it can move virtual machines around however it sees fit. And I’m going to set my migration threshold right in the middle, which is where it should usually be unless you have a reason to set it either more or less aggressive. So now, if DRS observes that it has the ability to improve performance, it can start migrating virtual machines around to improve the overall performance, and we’ll be able to see that here in our Recent Tasks. And look what happened: I reconfigured my cluster, and immediately it migrated a virtual machine, and now all of my VM DRS scores have improved drastically.
So that’s one of the first places you can monitor things. I can go to Recent Tasks here and see what’s going on. I can also go to the cluster, and under Monitor, I can go to Tasks and Events and see what’s been going on inside of my DRS cluster: which virtual machines have been moved around, and what initiated it. And I can see this was system-initiated. So this migration of the Database VM from my first host to my second host happened automatically. So, yes, I can monitor exactly what’s going on with DRS there under Tasks and see what it’s doing. I can also go to the DRS history and see all of the actions that DRS has specifically taken. If I don’t want to sort through everything in Tasks, the DRS history gives me a great consolidated view of all the actions DRS has taken. And if there are any recommendations, I can see them here.
If I want to refresh those recommendations, I can run DRS now to determine if any moves should be made. I can see any DRS faults listed here. And then I’ve got this VM DRS Score view, where I can see all of my virtual machines and sort them by which one has the highest DRS score and which has the worst. And then, from a general cluster perspective, I can see the overall CPU utilization per host on all of the hosts in my cluster, the memory utilization per host, and the network utilization as well. So I can observe some of those critical metrics on a per-host basis by looking at the DRS monitoring within the cluster. Now let’s try to make something happen here. I’m going to go to my cluster, under Configure, change it back to Manual automation, and click OK. Then I’m going to take some of the virtual machines that are running on my second host and move them to the first host. What I’m trying to do here is start to overwhelm that first host a little bit. So now I’ve migrated my Database server back to the first ESXi host. Let’s go back to the cluster, back to Monitor, and under DRS I’m going to go to Recommendations. It’s not recommending anything right now.
Let me run DRS now and see if it has any recommended changes. And there is one: migrate Database from the first host to the second host. So now I can go ahead and apply those recommendations and get those VMs moved around as DRS sees fit. Now, if there are any faults with DRS, those will be displayed here. At the moment, we can see that there are currently no DRS faults. So let’s go ahead and try to create one. Here on host 192.168.199.14, I’ve got the App Server and the Database server, and what I’ve done is place both of those virtual machines on local storage. So I should not be able to migrate App Server to the second host, because it’s on local storage. And so I can’t even see that second host listed here. It’s not even available.
So I can’t migrate this virtual machine to the other host; it’s stuck on 192.168.199.14. And the same is true for both App Server and Database. So now that I know that both of those VMs are stuck on this first host, let me go to my cluster, go to VM/Host Rules, and create a new rule. I’m going to create a rule that keeps certain virtual machines apart; I want these virtual machines running on different ESXi hosts, and I’m going to pick App Server and Database. Now, I know that they can’t run on different hosts, because they’re both on local storage. But I’m going to create the rule anyway, so DRS is going to try to keep these virtual machines separated, on different ESXi hosts. And so now let’s go to DRS, take a look at the recommendations, run DRS now, and see if it gives me any recommendations.
Nothing there. Let’s go to Faults. It could not fix that anti-affinity rule violation, again because both of these VMs are on the local storage of this first host, and DRS doesn’t do Storage vMotion. It only does compute vMotion; it only moves VMs from one host to another. So that’s an example of a fault that could exist inside of my DRS cluster. Now, another problem that could potentially exist in my cluster is if I go to my ESXi hosts and take a look at whether they have a VMkernel port that is marked for vMotion traffic. I can see my first host here has a VMkernel port marked for vMotion, and my second host also has a VMkernel port for vMotion.
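The fault above boils down to a placement problem: an anti-affinity rule is only satisfiable if every VM in the rule can land on a distinct host, and a VM on local storage has exactly one candidate host. A small Python sketch of that check (the VM names and candidate-host sets are hypothetical, and this is not how DRS is implemented internally):

```python
from itertools import permutations

def anti_affinity_satisfiable(candidates: dict) -> bool:
    """True if every VM in the rule can be assigned a *different* host
    drawn from that VM's own candidate-host set."""
    vms = list(candidates)
    all_hosts = sorted({h for hosts in candidates.values() for h in hosts})
    return any(
        all(host in candidates[vm] for vm, host in zip(vms, assignment))
        for assignment in permutations(all_hosts, len(vms))
    )

# Both VMs pinned to host1 by local storage: the rule cannot be satisfied,
# so a DRS-like engine would raise a fault instead of migrating anything.
stuck = anti_affinity_satisfiable({"AppServer": {"host1"}, "Database": {"host1"}})

# On shared storage, both VMs could run on either host, so separating
# them is possible.
ok = anti_affinity_satisfiable({"AppServer": {"host1", "host2"},
                                "Database": {"host1", "host2"}})
```

This is why the fix for the fault is a storage change (move a VM to shared storage), not a rule change.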
What about my third host? My third host does not. So my third host is part of this DRS cluster but is not really doing anything. The way that DRS works is it leverages vMotion to migrate virtual machines around, and if there’s a host in the cluster where vMotion is not properly configured, then DRS can’t leverage that ESXi host. DRS cannot move anything to or from that ESXi host as long as vMotion is broken. And again, if I’m trying to monitor this cluster as a whole, I can, of course, go take a look at the overview performance charts. I can also delve down into individual resources here and take a closer look. I can bring up some advanced performance charts, and so I can create an advanced performance chart to look at all sorts of things, like, for example, my DRS scores and how that average DRS score has been changing over time.
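The vMotion prerequisite amounts to a simple filter over the cluster: only hosts with a vMotion-enabled VMkernel port can participate in DRS migrations. A tiny sketch with hypothetical inventory data (the host names and the flag are placeholders, not pulled from the vSphere API):

```python
# Hypothetical cluster inventory: which hosts have a VMkernel port
# tagged for vMotion traffic.
hosts = {
    "esxi-host-1": {"vmotion_vmk": True},
    "esxi-host-2": {"vmotion_vmk": True},
    "esxi-host-3": {"vmotion_vmk": False},  # in the cluster, but unusable by DRS
}

# DRS can only move VMs to or from hosts where vMotion is configured.
migration_capable = [name for name, cfg in hosts.items() if cfg["vmotion_vmk"]]
```

In the demo, the third host is exactly the `vmotion_vmk: False` case: still a cluster member, but invisible to DRS as a migration source or destination until a vMotion VMkernel port is added.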