101. Queue Types in SQS
Hey everyone, and welcome back. Up until now, we have been discussing a lot of configuration-related parameters for SQS queues, and in today’s lecture we will be speaking about the two types of SQS queues that are available. One is the standard queue, and the other is the FIFO queue. The FIFO queue is quite recent; it was not available earlier. So let’s go ahead and understand the difference between the two. When you click on “Create a new queue” here, you will see there are two queue types available: standard and FIFO. Each of them comes with its own benefits, and a high-level comparison is already present over here. So let’s look at the second difference. It says that a message is delivered at least once, but occasionally more than one copy of a message is delivered. This basically tells us that, occasionally, the same message can be delivered more than once. This is a property of the standard queue, although it does not happen often. When you talk about the FIFO queue, it says a message is guaranteed to be delivered at least once, but all duplicates of the message are removed.
So this is a very important property that helps in a lot of use cases. The second important difference between a standard and a FIFO queue is that standard queues have nearly unlimited transactions per second (TPS). This is very important, because a standard queue supports a very high amount of throughput. FIFO queues, however, currently support 300 transactions per second.
The third and most important distinction is that FIFO is based on “first in, first out” delivery. What that means is that if a message enters the queue first, then that will be the first message to go out when a consumer polls. This is very important, because the standard queue works on “best effort” ordering, which states that occasionally messages are delivered in an order different from the one in which they were sent. So if message one is sent first to the SQS queue, it might happen that the SQS queue sends message two first. The order of messages coming out of the queue can be a little different from the order in which the producer sent them. So this is quite important. What we’ll do today is create a FIFO queue and name this queue kplabs.
Let me show you: if you choose a FIFO queue, the queue name must end with the .fifo suffix. Just remember that. Now I’ll do a quick create queue, and you can see the FIFO queue is being created. So, whenever a use case arises in which you cannot afford to have duplicates, or when you actually require messages to arrive in a specific order (the order in which they were sent by the producer), that is when you use the FIFO queue type. So there are two types of queues you’ll see over here: one is the standard queue, and the second is the FIFO queue. You have to understand and, I would say, remember the difference between both of these queues. Please also keep in mind that the throughput of FIFO queues is currently limited, so they cannot support very high throughput. Depending upon the use cases in your organization, the queue type that you select will really differ. So this is about it for this lecture. I hope this short lecture has been informative for you, and I’d like to see you in the next lecture.
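As a side note, the same FIFO queue can be sketched from the AWS CLI. This is a minimal example, assuming the CLI is installed and configured with credentials; the queue name is just the one used in the demo:

```shell
# Create a FIFO queue (the queue name MUST end with the .fifo suffix).
# ContentBasedDeduplication is optional: it removes duplicates by hashing
# the message body instead of requiring an explicit deduplication ID.
aws sqs create-queue \
  --queue-name kplabs.fifo \
  --attributes FifoQueue=true,ContentBasedDeduplication=true
```

The command prints the queue URL on success, which is what producers and consumers then use to send and receive messages.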
102. Scaling to Traffic Patterns
Hey everyone, and welcome back to the Knowledge Portal video series. We already discussed scalability and the different types of scalability that can be performed, which are mainly horizontal and vertical. All of that was theory; today we will be understanding scalability from a practical point of view. So let’s go ahead and understand how we can design scalable systems in our environment. Now, just to review, scalability is the ability of a system to change in size depending upon its needs. If there is a huge amount of traffic, the servers should scale up, and if there is a small amount of traffic, the servers should scale down. So this is a very simple example of what scalability is. Remember the rubber band example that we had discussed?
Now, whenever you design your infrastructure, it should be designed in such a way that it supports scalability depending on the traffic patterns. This is a very basic graph that you can see, and it basically shows the traffic pattern during the night and during the day. During the day, the traffic is quite dense; that means the website is receiving quite a good amount of traffic during the daytime. However, traffic is much lighter at night; I would say it is less than, say, 30% of the daytime traffic. Now, in order to handle this traffic, let’s assume you have five servers running. Those five servers are quite capable of handling the traffic during the day, and you are spending quite a good amount of money on them. However, since the traffic at night is quite low and your five servers are running twenty-four hours a day, you are actually over-provisioning your resources when the traffic is low. As a result, this is not an ideal architecture.
So the ideal architecture should be that when traffic is low, fewer servers are used, and when traffic is high, more servers are used. And that is what an ideal architecture should be. That not only helps you from a cost perspective, but it also helps you during traffic spikes. So this is where the amazing feature of auto scaling comes in. Auto scaling is one of the features of AWS, and it allows us to scale EC2 instances up and down depending on conditions defined by the system administrator or the solutions architect. So let’s understand this with a use case. Medium Corp. is an e-commerce organization in India, and it is hosted completely on AWS. In the past six months, the AWS bills have skyrocketed, and the CEO has asked you to reconsider the design of your infrastructure. It should not only support the operations, but it should do so at a low cost.
Now, the second important point is that since it is an e-commerce company targeting Indian consumers, the traffic pattern varies drastically between daytime and nighttime. So, similar to what we discussed with the graph, this is a very simple use case. What is really happening is that the AWS bills have skyrocketed, and one thing that can be seen from the graph is that daytime traffic is quite heavy, but nighttime traffic is quite low. So the question is what you, as a solutions architect, can do, and the answer is to explore the auto-scaling service. Let’s see how we can achieve this. It can be accomplished based on the instances’ average load; this is one of the criteria for the auto-scaling group. So let’s assume you have a base of two servers. At any given time, during the day as well as during the night, you will have two servers running all the time.
Now, if the CPU utilization, which is directly related to traffic, exceeds 70%, the condition is to add two more instances. So if you see over here, I have two servers, and those two servers are capable of handling the average load, which is around 25%. But suddenly there is a huge spike that reaches 75%. If you only have two servers, the servers will definitely begin to time out. And thus, you create the condition that when CPU utilization exceeds 70%, you add two more servers, so you will have four servers at that specific time. And after that, the traffic spike goes down. So you have one more condition over here, which says that if average CPU utilization is less than 25%, then remove the two instances that were created during the high-load period. These are the policies that you can configure as part of the auto-scaling configuration. I’ll just show you in the AWS Management Console.
So if you see over here, I have auto-scaling groups that are configured. Let me show you the settings that are part of this page. If you have a minimum of one server, it means that the minimum number of servers that should be running at any time of day or night is one. The maximum number of instances is three, so the maximum it can scale to horizontally is three instances. And if you look into the scaling policies, I have two scaling policies that depend upon CloudWatch alarms. If I go to CloudWatch, I’ll show you that I have two alarms here: one for high CPU utilization and one for low CPU utilization. High CPU utilization means that if the CPU utilization is greater than 70%, this alarm will be triggered. And there is one more alarm that says if CPU utilization is less than or equal to 25%, then that alarm will be triggered. So now we can use these alarms in conjunction with the auto-scaling policies. I have two policies over here: increase group size and decrease group size.
Now, in the increase group size policy, you see in the action column that it says to add two instances when CPU utilization is greater than 70%. And there is one more policy called “decrease group size,” which basically says to remove two instances when CPU utilization has decreased back to 25%. So this is a very simple policy: when your traffic load increases, your fleet grows horizontally, and when your traffic load decreases, your fleet shrinks horizontally. Perfect. So before we close this session, let me show you an interesting thing. If you see over here, the minimum instance count is one, which means one instance will be present all the time. And if you see over here, I have one instance that is running. Let me go over here, and it will direct me to the instance. Now let me terminate the instance. Let’s see what happens when I terminate this specific instance.
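The scale-up half of this setup can also be sketched with the AWS CLI. This is a sketch under stated assumptions: the group, policy, and alarm names below are made up for illustration, and the alarm action comes from the policy ARN printed by the first command:

```shell
# 1. A simple scaling policy on the group that adds two instances.
#    (Group and policy names here are illustrative.)
aws autoscaling put-scaling-policy \
  --auto-scaling-group-name kplabs-asg \
  --policy-name increase-group-size \
  --scaling-adjustment 2 \
  --adjustment-type ChangeInCapacity

# 2. A CloudWatch alarm that fires when average CPU across the group
#    exceeds 70% for one 60-second period, and triggers the policy.
#    Replace <policy-arn> with the PolicyARN printed by step 1.
aws cloudwatch put-metric-alarm \
  --alarm-name kplabs-high-cpu \
  --metric-name CPUUtilization --namespace AWS/EC2 \
  --statistic Average --period 60 \
  --threshold 70 --comparison-operator GreaterThanThreshold \
  --evaluation-periods 1 \
  --dimensions Name=AutoScalingGroupName,Value=kplabs-asg \
  --alarm-actions <policy-arn>
```

The decrease-group-size policy is the mirror image: a scaling adjustment of -2, triggered by an alarm on CPU utilization less than or equal to 25%.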
Now, according to the auto-scaling configuration, there should be at least one instance running at all times, and we have actually terminated that one instance. Let it shut down. In the meantime, I’ll open up the auto-scaling group. And now you see the health status has become unhealthy for this specific instance. So what will happen is that once auto scaling detects that the instance is in an unhealthy state, it will actually start one more instance. Let’s just confirm and wait for a minute or two for the instance to terminate. Okay, so the instance has been terminated, and auto scaling should now verify the minimum number of instances that should be running at any given time. Because the minimum number of instances is one and that instance’s status is unhealthy, auto scaling will create one more instance as part of the auto-scaling group policies. So let’s wait a minute and see what happens.
The auto-scaling group will launch one more instance. And now, if you see, the health status has become healthy, and there is one more instance that has been created; its lifecycle state is still pending. If I just check the EC2 console, you’ll see that another instance has started automatically. That is the beauty of auto scaling. So this is just a demo that I wanted to show you for this lecture. In the upcoming lectures, we will go into much more detail on how to configure auto scaling and what the best practices are. I hope you now have a basic understanding of what auto scaling is, and I look forward to seeing you in the next lecture.
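The terminate-and-replace behavior from this demo can also be watched from the CLI. A minimal sketch, assuming configured credentials; the instance ID and group name are placeholders:

```shell
# Terminate the only instance in the group (placeholder instance ID).
aws ec2 terminate-instances --instance-ids i-0123456789abcdef0

# Poll the group: once the terminated instance is marked unhealthy,
# auto scaling launches a replacement to get back to min-size.
aws autoscaling describe-auto-scaling-groups \
  --auto-scaling-group-names kplabs-asg \
  --query "AutoScalingGroups[0].Instances[].[InstanceId,HealthStatus,LifecycleState]"
```

Running the second command a few times shows the old instance disappear and a new one move from Pending to InService, mirroring what the console showed above.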
103. Introduction to Auto Scaling
Hey everyone, and welcome back to the Knowledge Portal video series. We are now continuing our journey with the scalability aspect. Today we will be understanding more details related to auto-scaling-based configuration. We already looked at auto scaling, which basically helps us achieve horizontal scalability of EC2 instances depending on the conditions defined by the user. In the previous use case, we had two conditions. The first condition was that if CPU utilization is greater than 70%, add two servers. The second condition was to scale down by two servers if CPU utilization fell below 25%. So those were the conditions we imposed. We can definitely have multiple conditions, and it is not necessary to use CPU utilization; there are various other resource metrics that you can use as the conditional factor.
Now, we were discussing the use case of Medium Corp., and the solution to the use case will be with the help of auto scaling. We’ve already seen that the amount of traffic decreases significantly at night. As a result, because there is less traffic at night, the overall CPU utilization will be low as well, and in such a case the servers can scale down depending on the condition we added on CPU utilization. Again, in the morning, when the traffic starts to increase, the overall CPU utilization will go high, and then our policy related to scaling up will be enforced, so the servers will be scaled up. So this is a very simple way to achieve the use case related to Medium Corp. that we had discussed. And this use case is very common throughout most organizations; even as a solutions architect working in an organization, you might come across implementing such use cases. Going forward, let’s understand the auto-scaling components. There are two major components of auto scaling: the first important component is the launch configuration, and the second is the auto-scaling group. This is the launch configuration on the left-hand side, and on the right-hand side we have the auto-scaling group.
So let’s go in and understand the difference between a launch configuration and an auto-scaling group. Whenever you want to create auto-scaling policies, the first thing that you need to do is define the launch configuration. After the launch configuration is defined, the next step is to configure the auto-scaling group. So what really comes under “launch configuration”? If you remember from the earlier practical session, let me just show it to you. If you go to the auto-scaling section, there are two components that you can find: one is the launch configuration, and the other is the auto-scaling group. If you remember, we had terminated a specific instance, and auto scaling automatically came up with one more instance. That one is now terminated too, because I terminated it to stay within the free tier. But how will auto scaling know what kind of instance needs to be launched? For example, this instance is of type t2.micro, with its own AMI ID; it is an Amazon Linux-based instance, and the key pair associated with it is Kplabs. So, how does auto scaling know what type of instance is required, which operating system should be chosen, and which security groups, IAM role, and so on, should be attached?
All of this information is included in the launch configuration, and it is used whenever a new instance is launched by auto scaling. So the important settings are: which AMI the new instance will be launched from (whether it will be Ubuntu, Amazon Linux, or CentOS), what the instance type will be (whether it is a t2.micro, an m4 instance, and so on), what security group will be associated with that instance, what the IAM role will be, and similar configuration. All of these are part of the launch configuration. Once this is defined, then comes the auto-scaling group, in which we define the minimum and maximum capacity. As we have already discussed, if the minimum is one and the maximum is three, then by default one instance will be launched, and depending upon the scaling policies (for example, average CPU utilization being higher than 70%), new instances will be launched using the launch-configuration settings.
The auto-scaling group also has settings for where these instances will be launched (in which VPC and subnets), as well as whether the newly launched instances should be directly attached to a load balancer. It also examines data from health checks. So, let’s return to auto scaling and look at both of these configuration sets. If I open the launch configuration over here, I have created a launch configuration, and the first aspect is the AMI ID. This AMI ID corresponds to the latest Amazon Linux operating system, which is available in the marketplace. Then you have the instance type, which is t2.micro, and the key name, which is KPLabs. You have the security group, which is defined, and you also have the block devices, which are part of the launch configuration.
So whenever a new instance is launched by auto scaling, that instance will be launched based on this AMI, with this instance type, with the key pair KPLabs, with the security group that is mentioned over here, and with the specific block devices. So this is the launch configuration. Definitely, whenever we want to create an auto-scaling environment, the first thing that needs to be created is a launch configuration. Once the launch configuration is created, the second aspect is the auto-scaling group. Within the auto-scaling group, we basically put details related to the minimum instances, the maximum instances, the health check type, and also details related to the VPC and subnets in which the instances will be launched. The second tab, which is part of the auto-scaling group, is the activity history, which basically tells you when a new instance is launched or when an instance is terminated.
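A launch configuration like the one described can also be sketched from the AWS CLI. This is a sketch, not the exact console settings: the AMI ID, security group ID, and user-data file name below are placeholders, while the instance type and key name mirror the lecture:

```shell
# Create a launch configuration that tells auto scaling what to launch:
# which AMI, instance type, key pair, security group, and user data.
aws autoscaling create-launch-configuration \
  --launch-configuration-name kplabs-lc \
  --image-id ami-0123456789abcdef0 \
  --instance-type t2.micro \
  --key-name KPLabs \
  --security-groups sg-0123456789abcdef0 \
  --user-data file://userdata.sh
```

Every instance the group launches afterwards is stamped from this template, which is exactly why the replacement instance in the earlier demo came up identical to the one we terminated.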
So all the activity history will be part of the second tab. You also have the scaling policies that we previously discussed. In our case, if the CPU utilization is greater than or equal to 70%, the scaling policy we have defined will automatically add two more instances. So this is part of the scaling policy. You also have the instances tab: if there are any instances running, you will see them there. And you have other options, such as notifications, which will send you an email whenever a new instance is launched or terminated, depending on your notification settings. So that is the fundamental information about the auto-scaling components. Hopefully, the fundamentals of auto scaling are now clear. In the next lecture, we will be creating our own launch configuration and auto-scaling group so that we can cover the practical scenario as well. I hope this has been informative for you, and I look forward to seeing you in the next lecture.
104. Auto Scaling – Practical
Hey everyone, and welcome back to the Knowledge Portal video series. In today’s practical session, we will be creating our first launch configuration as well as an auto-scaling group, and we’ll look into the various configuration parameters that need to be set. So let’s start. The first thing, as we already discussed, is to create a launch configuration. I’ll click on “Create Launch Configuration.” Now you have to select the AMI. It can either be from the marketplace or, if you have some custom web application already created, you can create an AMI and select that as well. But for our sample demo session, I will use Amazon Linux as the operating system. Now you have to select the instance type that auto scaling will launch. We’ll go with t2.micro because it’s under the free tier. I’ll choose Configure Details.
I need to give the name of the launch configuration, so I’ll say “kplabs-identity.” This is the identity of the instances that will be launched. You also have the setting for whether you want to use spot instances. Along with that, you can select the IAM role as well; I’ll just select none. You also have the option to enable detailed monitoring. Remember, detailed monitoring comes with an additional cost. Now, if you go into the advanced details, this is where you can fill in the user data. What I’ll do is: whenever a new instance is launched, I want NGINX to be installed on that instance automatically. So within the user data, I’ll write a simple bash script: yum -y install epel-release, then yum -y install nginx.
Then I’ll do an echo: “KP Labs - this is the auto-scaling group practical.” Remember that nginx’s default index.html is not in /var/www, but in /usr/share/nginx/html, so I have to use the entire path; hopefully this is correct. And once the index.html has been created, I’ll restart the nginx service. Perfect. So we now have our user data. Next, I’ll configure the storage; for the time being, 8 GB of storage is quite good enough. Let’s go to the configure security group step. I’ll add one more rule over here: since we are working with NGINX, I’ll allow it to be accessible to the world, so port 80 is open. I’ll select Review, and I’ll click on Create Launch Configuration. I also have to select the key pair so that I can log into the instance, and we click on Create Launch Configuration. Perfect. Once the launch configuration is created, it automatically gives you a button that says “create an auto-scaling group.” I’ll just click close over here, and we can create it manually. So just to verify: this is the launch configuration, and it took the AMI of the Amazon Linux machine. The instance type is t2.micro, along with the block devices that we have assigned and the security-group-related IDs. Perfect.
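Putting the pieces together, the user-data script described above would look roughly like this. The echo text is paraphrased from the lecture, and this assumes an Amazon Linux AMI where nginx comes from the EPEL repository:

```shell
#!/bin/bash
# Runs once at first boot of each instance that auto scaling launches.
yum -y install epel-release    # EPEL repo provides the nginx package
yum -y install nginx
# nginx serves its default page from /usr/share/nginx/html, not /var/www
echo "KP Labs - Auto Scaling Group Practical" > /usr/share/nginx/html/index.html
service nginx restart
```

Because this runs from user data, every instance the group launches comes up already serving the custom page on port 80, with no manual setup.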
So now we’ll go to the auto-scaling groups, and I’ll create an auto-scaling group over here. Whenever you create an auto-scaling group, you have to select the launch configuration as the first step. So we’ll select the launch configuration kplabs-identity; this is the one that we created. Now you need to give the name of this auto-scaling group, which I’ll call kplabs-asg. For the group size, it will start with one instance. It is asking me which VPC and subnets I want to launch into. I’ll give three subnets: one in availability zone a, one in zone b, and one in zone c. So, basically, the first instance launched will be part of zone a; when it scales up and creates the second instance, that will be part of zone b.
When it creates the third instance, it will be part of zone c. Perfect. Now we’ll configure a scaling policy; I’ll select the second option. It asks me the range to scale between. Depending on your configuration, you could even put five, so that whenever the load increases it can horizontally scale up to five instances, but I’ll just select three for now and click on the specific option. Now you have two policies over here. One is to increase the group size, and for it you need to create a new alarm. I’ll deselect “send notification to SNS,” and for this CloudWatch alarm you define the condition: let’s say if the average CPU utilization is greater than or equal to 70% for at least one consecutive period of 1 minute. You also have to give the name of the alarm, so I’ll say kplabs-scale-up. So if the CPU utilization is greater than or equal to 70%, this alarm will be triggered. I click on “create alarm,” and then we have to define what should be done if this alarm is triggered. I’ll say: add two instances. Perfect.
Now we’ll have to create the second policy as well, which decreases the group size: add one more alarm and deselect the SNS topic. Here the condition is: whenever average CPU utilization is less than or equal to, say, 20% for at least one consecutive period of, say, 1 minute. The name of this alarm will be kplabs-scale-down, and I’ll create the alarm over here. Perfect. So now I have two policies. I’ll go ahead and configure notifications now. This is where we can add a notification: if a new instance gets launched, if an instance gets terminated, or on some other lifecycle activity, and we want to get notified via email, you can set that up over here. I’ll leave this as the default; let me just cancel it. I’ll then click Review and create the auto-scaling group. So the auto-scaling group is created. Now, since our minimum instance count is one, auto scaling will automatically start one instance, and if you see, the first instance has already been started.
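The same group creation can be sketched from the AWS CLI. The subnet IDs below are placeholders; the group name, launch configuration name, and sizes mirror what was configured in the console:

```shell
# Create the auto-scaling group: min 1, max 3, starting with 1 instance,
# spread across three subnets (one per availability zone).
aws autoscaling create-auto-scaling-group \
  --auto-scaling-group-name kplabs-asg \
  --launch-configuration-name kplabs-identity \
  --min-size 1 --max-size 3 --desired-capacity 1 \
  --vpc-zone-identifier "subnet-aaaa1111,subnet-bbbb2222,subnet-cccc3333"
```

As soon as the group exists, auto scaling launches one instance (the desired capacity) without any further action, which matches what we just saw in the console.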
Now, if you go to the instances tab, let me just quickly verify. Notice how the first instance starts automatically. Perfect. It has started, but the status checks are still initializing. Let me go ahead and try a telnet over here so that we can know whether the instance is up or still initializing. Perfect, it seems to be up. Now let’s go back, and I’ll copy the public IP address. I’ll paste it in the browser, and you can see that NGINX was automatically installed, and the page we configured in the user data is now visible over here. So this is the basic information about how to create a simple auto-scaling group. We’ll be going into more detail in the upcoming sections, but I would really encourage you to practice this once. And once you complete the practical on your side, we can go ahead with the next session. I hope this has been informative for you, and I look forward to seeing you in the next lecture.
105. Auto Scaling – Scaling Up Operations
Hey everyone, and welcome back to the Knowledge Portal video series. In the previous lecture, we configured our first auto-scaling group. We also defined two CloudWatch-based policies: one launches two instances if the CPU load is greater than 70%, and the other terminates those two instances once the CPU load is less than 20%. Now, we have just configured the policies, but we have never really checked whether they actually work.
So in today’s lecture, we will actually increase the CPU load on this specific instance and see whether our auto-scaling group policies are working. Let me log into the instance that got created. Perfect. Now, if we just look at the CPU load, it’s not a lot: 0.5% in user space and 0.2% in system space. So let’s increase the CPU load so that we can determine whether the auto-scaling policies are working. I’ll log into the instance in two separate tabs. Let me just confirm. Perfect. Great, so we have two tabs over here. What we’ll do is increase the load manually. The easiest way to increase the load is to run the dd command: the input file will be /dev/zero, and the output file will be /dev/null.
Reading from /dev/zero and writing to /dev/null keeps the CPU busy. So, let’s run the top command to see whether CPU usage is increasing. If you see, it is actually almost 100% idle right now. Perfect. So let’s fire off this command. And now, if you see, the CPU usage has started to increase: it is at 30% user space, and let’s wait. Perfect, it has actually reached 99.9%. So as far as our scaling policy is concerned, within a minute the two new servers should be launched. Let’s wait for a minute and see the beauty of auto-scaling groups. Along with that, if you had enabled monitoring, specifically detailed monitoring, you would be able to see the spike. Let me actually enable it right now; do note that you will be charged for this. So let me just refresh, and let’s wait for auto scaling to pick it up. I’ll just click on “auto-scaling groups.” Okay, so we have one instance right now; let’s just wait for the group to scale up.
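For reference, this is the load-generation command from the demo, with a note on stopping it:

```shell
# Busy-loop: read zeros and immediately discard them, pegging one CPU
# core at ~100%. Run one copy per core you want to saturate.
# Stop it with Ctrl+C when the test is done.
dd if=/dev/zero of=/dev/null
```

Since each dd process saturates only one core, on a multi-core instance you would run it once per core (here, once in each of the two SSH tabs) to push the average CPU utilization past the 70% alarm threshold.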
Auto scaling is something that has really improved a lot. And trust me, I would really encourage you to practice auto scaling as much as you can, because if you go out for interviews as a solutions architect, you will be asked these questions, specifically about high availability and how auto scaling works. So this is something that you should remember. This is also important when you go ahead and do the higher-level certifications, like the AWS Certified Solutions Architect Professional, so it will be useful there as well. Perfect.
So the instance remains healthy. I’m hoping that the CPU utilization has risen above 70%; let me quickly open the monitoring section and select five minutes. And now you will see that the CPU utilization has actually reached 100%. So let’s wait. And now, if you see, the group has started to scale: two more instances have been launched. As these instances launch, NGINX will automatically be installed on them, and our custom message will automatically be included, because of the user data we had configured in the launch configuration. Perfect. So this is the fundamental concept of scaling up. I hope this has been useful for you, and I’d like to thank you for viewing.