Microsoft Azure Solutions Architect Expert AZ-305 Topic: Design a Data Archiving Strategy
December 14, 2022

1. Storage Account – Data Archiving and Access Tiers

So continuing on talking about business continuity, in this video, we’re going to talk about something a little different, and that’s the concept of data archiving. Now, data archiving is not the same as data backups, although you might want to keep your data backups for long periods of time. You might want to keep database backups for a year, two years, or five years. But data archiving could be anything, not just database backups. Now, we’ve just been talking about Azure Site Recovery and the ability to get your site recovered from a disaster in a very short period of time. So for the company to be up and running again within 20 or 30 minutes, that obviously has huge value. We’re not taking away from the value of that. But data archiving is a little different.

So this is the concept. Whatever type of file you have, you have to keep it somewhere accessible for a longer period of time. If we’re talking about backups, it could be your virtual machine or your physical machine backups. And you have a backup strategy that means that you keep 30-day-old backups, 60-day-old backups, and 180-day backups, because you never know what you’re going to need them for.

But you’ll be thankful that you can go back in time and compare what the database was like back then to what it’s like today as a way of trying to recover from some other type of disaster, maybe corruption or something. So you will need to store these files, but you don’t need them immediately. I once worked in a company where we received a large number of files from external sites every day. Files were constantly arriving via FTP. We would process them, ingest them into a database, and that would go into our systems, and then finally into an application that we’re running. But we used to keep those raw data files that were sent in by the external party, just in case.

You never know when an import might fail, or when imports might fail for a couple of days. If you’ve got the files there, you don’t have to contact the other company and ask them to send the files again. Having the files ready is beneficial in terms of being able to recover from a process failure. It also could be a legal requirement, where you want to look back at the files that were sent six months ago because of some dispute that you’re having. Being able to call up those files and say, “Yeah, we never deleted them, they’re just sitting right there,” could be helpful for many other processes. There’s also accounting, governance, tax, and lots of other reasons why you have to keep files. GDPR and other data privacy regulations may require you to keep a record of the permissions you obtained when users agreed to be messaged by you. You have to keep track of that, and it has to be kept in a reliable spot. So this is archival data: these are files that you must have but will not use right away.

They’re not transactional, and they’re not immediate. Now, within the Azure storage account, there are four levels of performance, right? The Premium Performance tier is a special tier above the regular tier with much faster response times, under ten milliseconds. There’s the Hot tier, which is the default. That is the tier where you both write and read the file regularly. It’s just a normal level of performance. And then there are the tiers that we might not know much about, which are the Cool and Archive tiers. In this video, we’re going to talk about everything below Hot, which is Cool and Archive. Now, when you go into Cool and then finally into Archive, there are significant cost savings for the storage of the files. When we’re looking at Hot storage, that costs about two cents per gigabyte per month to store. Cool storage is about one cent per gigabyte, and Archive storage is about one fifth of a cent. You really go down in price quite significantly when you go from Hot to Cool to Archive. So the Archive tier, like I said, is something like 90% cheaper than the Hot tier. Now, that comes at a price. Like the last slide said, it’s cheaper to store, but it’s more expensive and more difficult to access.
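To make those savings concrete, here is a minimal Python sketch that uses only the rough per-gigabyte figures quoted above (two cents, one cent, one fifth of a cent). These are the speaker's approximations, not current Azure list prices, which vary by region and redundancy option. The function names are made up for illustration.

```python
from decimal import Decimal

# Approximate USD price per GB per month, as quoted in the video.
# ASSUMPTION: placeholders only, not current Azure list prices.
PRICE_PER_GB = {
    "hot": Decimal("0.02"),
    "cool": Decimal("0.01"),
    "archive": Decimal("0.002"),
}


def monthly_storage_cost(tier: str, gigabytes: int) -> Decimal:
    """Storage-only cost for one month (ignores read/write charges)."""
    return PRICE_PER_GB[tier] * gigabytes


def savings_vs_hot(tier: str) -> Decimal:
    """Percentage saved on storage compared with the Hot tier."""
    hot = PRICE_PER_GB["hot"]
    return (hot - PRICE_PER_GB[tier]) / hot * 100


if __name__ == "__main__":
    for tier in PRICE_PER_GB:
        cost = monthly_storage_cost(tier, 1000)
        print(f"{tier}: ${cost} per month for 1000 GB, "
              f"{savings_vs_hot(tier)}% cheaper than hot")
```

With these figures, Cool works out to 50% cheaper than Hot and Archive to 90% cheaper, which matches the numbers mentioned in the video. Access charges are deliberately left out here.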

2. Access Tier Requirements

So continuing on with talking about storage accounts and data archiving, let’s look at the reasons why you would want to use it. What are some of the prerequisites? What do you have to have in place before you would consider using the data archiving tier? Now, if you think back to the older days, if you’re young, you might not even know this, but back in the day, companies used to back up their servers to a tape drive, right?

So there would be big spinning reels of tape. Maybe you saw this in an old movie. And they would have to remove that tape from the tape drive and store it somewhere. And if they ever needed those files, they would have to go search through the closet, find the tape it’s on, load the tape up, and then restore the data from that tape. So you can look at the Archive storage tier as something similar to that, where it’s not easily accessible but is accessible if you really need it.

So these are for files that, for whatever reason, you need to keep but would never need at a moment’s notice. You will always have some advance notice before you need to access them. That’s for the Archive tier specifically. Now, the tiering (Hot, Cool, and Archive) is only available in newer storage accounts, and that is primarily general-purpose v2. It’s also available in a Blob storage account. It is not available in general-purpose v1. It’s pretty easy to migrate from general-purpose v1 to v2.

So there are compelling reasons to do so, particularly to get new features like this one. Microsoft has stated that they will be adding new features to general-purpose v2 accounts rather than Blob storage accounts. So I think that’s eventually going away, and all storage accounts will be version two and above. Now, by default, when you add a new file to a storage account, it goes into the Hot access tier. The Hot access tier is just the normal tier, the regular tier. You pay the regular rate to store it, and you pay a pretty decent rate to retrieve it, and every access, write, and read costs you something as well. Now, as we said in the last video, when you go down to the Cool tier, you’re actually saving about 50% on storage.

So there’s quite a bit of savings just to go from Hot to Cool. But access does cost you more. It’s about five times more expensive to read those files. And so you wouldn’t want to store files in the Cool tier that you need to access frequently, like several times a day. But if you have files like old family photos, and you’re storing them in the cloud, you might want to put them in the Cool access tier, because you don’t look at your old family photos every day.

But hey, you wouldn’t mind paying a couple of cents more when you finally get around to wanting to see them, because you’ve saved so much by lowering the storage cost. You can actually, and I meant to say this in the last slide, set your account to default to Cool. So what does that mean? That means that any new files added to the storage account come in at the Cool level, so that just becomes the default. Hot and Cool are the only two options for the default setting.
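The Hot versus Cool trade-off described above (about half the storage cost, about five times the read cost) can be sketched as a break-even calculation. The storage prices are the rough figures from the video. The read price is a purely hypothetical placeholder, since the video gives no read price, so the output only illustrates the shape of the trade-off.

```python
from decimal import Decimal

HOT_STORAGE = Decimal("0.02")    # USD per GB-month, rough figure from the video
COOL_STORAGE = Decimal("0.01")   # USD per GB-month, rough figure from the video
# HYPOTHETICAL: price per 10,000 reads in the Hot tier. Not from the video.
HOT_READ_PER_10K = Decimal("0.005")
# "About five times more expensive to read", per the video.
COOL_READ_MULTIPLIER = 5
COOL_READ_PER_10K = HOT_READ_PER_10K * COOL_READ_MULTIPLIER


def monthly_cost(gigabytes, reads, storage_price, read_price_per_10k):
    """Storage cost plus read cost for one month."""
    return gigabytes * storage_price + Decimal(reads) / 10000 * read_price_per_10k


def break_even_reads(gigabytes):
    """Monthly reads at which Cool stops being cheaper than Hot."""
    storage_saving = gigabytes * (HOT_STORAGE - COOL_STORAGE)
    extra_cost_per_10k = COOL_READ_PER_10K - HOT_READ_PER_10K
    return int(storage_saving / extra_cost_per_10k * 10000)


if __name__ == "__main__":
    print(break_even_reads(100))
```

Below the break-even number of reads, Cool is the cheaper choice under these assumptions. Above it, the extra read charges outweigh the storage savings.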

The Archive tier can never be a default. You always have to individually select a file and move it to Archive. That’s why Archive is set at the object level only. So, to reiterate, this is an object-level setting, and once a file is set to Archive and you want to download it, it could take several hours to retrieve. So that’s the disadvantage of saving a file in the Archive tier: the time it takes.

The more files you want to retrieve, the longer it takes. There’s also the requirement that files remain in the Archive tier for at least 180 days. So don’t put these old family photos into Archive unless you’re prepared to keep them there for about six months, because you might be saving 90% on the storage cost, but they want you to keep them there for six months and not for a day. Once in the Archive tier, those files are effectively offline. You can’t read them, you can’t copy them, you can’t overwrite them, and you can’t really do anything to them. You have access to the metadata, but the files themselves are completely inaccessible to you. Rehydration is the process of returning a file from Archive to the Cool or Hot tier. And again, you go into a queue, and depending on how many files are ahead of you and how busy they are, it could take minutes or it could take hours. So to get stuff out of Archive, it’s a bit of a process.

Returning to the magnetic tape: they have to go into the closet and find the tape. They have to have a free machine. There’s no free machine? Okay, let’s wait for one to finish and put the tape onto the drive. I’m not sure they’re actually using tape right now, but it does take them a long time to get those files out of the archive. And there’s a new preview feature called High Priority. So if you really, really need those files and you can’t wait 6 hours and you want to get it done within an hour, there is a preview mode called High Priority, and of course that’s going to cost you more.

If we go into the Azure portal, we can see on the left, within a storage account, there’s a Blob service section, and there’s Lifecycle Management in there. Now, this is pretty cool: I can go and set up a rule that is basically an automatic method of getting files from Hot to Cool to Archive. And here’s a screenshot of a rule I was setting up that would, after 120 days, move a Hot blob into Cool storage. And after a year, 365 days, it would move that Cool file into Archive storage. You can also configure this lifecycle to delete files that are no longer needed, which can save you money.
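A rule like the one in that screenshot can also be written as a lifecycle management policy document. The sketch below builds one in Python and prints it as JSON, using the 120-day and 365-day thresholds from the video. The rule name and the delete threshold are hypothetical values chosen for illustration, and the generated document should be checked against the current Azure lifecycle policy schema before it is applied to a real account.

```python
import json


def build_lifecycle_policy(cool_after_days=120, archive_after_days=365,
                           delete_after_days=None,
                           rule_name="hot-to-cool-to-archive"):
    """Build a lifecycle policy dict; rule_name is a hypothetical label."""
    base_blob_actions = {
        # Move blobs to Cool once they have been unmodified this long.
        "tierToCool": {"daysAfterModificationGreaterThan": cool_after_days},
        # Move blobs to Archive once they have been unmodified this long.
        "tierToArchive": {"daysAfterModificationGreaterThan": archive_after_days},
    }
    if delete_after_days is not None:
        # Optional clean-up step for files that are no longer needed.
        base_blob_actions["delete"] = {
            "daysAfterModificationGreaterThan": delete_after_days
        }
    return {
        "rules": [
            {
                "name": rule_name,
                "enabled": True,
                "type": "Lifecycle",
                "definition": {
                    "filters": {"blobTypes": ["blockBlob"]},
                    "actions": {"baseBlob": base_blob_actions},
                },
            }
        ]
    }


if __name__ == "__main__":
    # 2555 days (about seven years) is a hypothetical retention period.
    print(json.dumps(build_lifecycle_policy(delete_after_days=2555), indent=2))
```

Leaving `delete_after_days` unset produces a policy that only moves blobs between tiers and never deletes them.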

3. Access Tier Service Level Agreements (SLAs)

Now, we can also mention that Microsoft makes certain promises when it comes to the accessibility of these files. These are called service level agreements, or SLAs. Now, as mentioned at the beginning of the section, there are actually four tiers, and we haven’t talked about the Premium tier that much. The Premium tier is essentially a very high performance tier. Microsoft backs this type of storage account with SSDs. It’s hyper-fast SSD storage that runs on fast hardware and a fast network. And you can see here that they’re promising less than ten milliseconds for the first byte. So if you have files where it’s very important that we get that first byte out to the end user in a snap of a finger, in less than ten milliseconds, then Premium Performance is available to you.

Premium Performance cannot be used in conjunction with the geo-redundancy feature, and there is no minimum duration for remaining in Premium. Microsoft offers a 99.9% availability guarantee for Premium Performance files. When we get to the regular tier, the Hot tier, which is the default unless you change it, that’s still 99.9% availability. But if you get into geo-redundancy, then you can basically get ten times the availability with 99.99%, and that’s because the files are also stored in a different region of the world. There’s no minimum duration. You’ll be charged for the gigabytes used in a month, and you can delete them at any time. And response times are well under 1 second. So you get millisecond latency, but it’s not promised to be under ten milliseconds. So, slightly slower than the Premium tier, but still quite fast.
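To get a feel for what availability percentages like these mean in practice, here is a small Python sketch that converts a percentage into the number of requests that could fail while still meeting it. This is a simplification for illustration: it treats availability as a plain success ratio over a batch of requests, which is not necessarily how the actual SLA is measured. The function name is made up.

```python
from decimal import Decimal


def max_failed_requests(availability_percent: str, total_requests: int) -> int:
    """Failed requests allowed out of total_requests at a given availability.

    availability_percent is passed as a string (for example "99.9") so the
    arithmetic stays exact instead of picking up floating-point error.
    """
    failure_fraction = (Decimal(100) - Decimal(availability_percent)) / Decimal(100)
    return int(failure_fraction * total_requests)


if __name__ == "__main__":
    for percent in ("99", "99.9", "99.99"):
        allowed = max_failed_requests(percent, 1_000_000)
        print(f"{percent}% availability: up to {allowed} failures per million requests")
```

Each extra nine cuts the allowed failures by a factor of ten, which is the sense in which 99.99% is ten times the availability of 99.9%.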

So you can get a file from storage within a second. Moving down to the Cool tier, we’re actually going to lose a bit. The availability is only 99%. That means that one in every 100 requests to read a Cool tier file might fail. At least, Microsoft’s not going to back it up more than that. You can use the redundancy service, geo-redundancy, and get 99.9% availability for that. However, the Cool tier requires a minimum storage duration of 30 days. So even if you delete the file within those 30 days, you’re still going to be charged for the full 30 days. And the response time is comparable to that of the Hot tier. Finally, the Archive tier we were discussing has these files offline. You have to rehydrate the file, and then you have to wait 60 minutes, 3 hours, or 6 hours to get your file back. So there’s no availability SLA, and geo-redundancy doesn’t change that for the Archive tier. You do have that 180-day minimum, and it takes hours from the time that you decide you want the file to the file being in your possession. Having said that, there are 90% savings off the Hot tier. So if you really, really want to put something away in a deep freeze, then consider Archive.
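The minimum storage durations mentioned in this section (none for Hot, 30 days for Cool, 180 days for Archive) can be sketched as a simple billing rule: you pay for at least the minimum number of days, even if you delete the file sooner. The function and dictionary names below are made up for illustration.

```python
# Minimum storage duration in days per tier, as stated in the video.
MINIMUM_DAYS = {"hot": 0, "cool": 30, "archive": 180}


def billable_days(tier: str, days_stored: int) -> int:
    """Days you are charged for, including any early-deletion minimum."""
    return max(days_stored, MINIMUM_DAYS[tier])


if __name__ == "__main__":
    # A Cool file deleted after 10 days is still billed for 30 days.
    print(billable_days("cool", 10))
    # An Archive file deleted after 45 days is still billed for 180 days.
    print(billable_days("archive", 45))
    # A Hot file has no minimum, so 5 days stored means 5 days billed.
    print(billable_days("hot", 5))
```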
