Amazon AWS Certified Developer Associate Topic: Databases Overview & Concepts
December 20, 2022

1. Database Essentials

So let’s start with: what is a relational database? Relational databases have been around for a long time because they are what most of us are used to. So if you think of a traditional spreadsheet, an Excel spreadsheet, for example, the database itself is going to be the file name of that spreadsheet. Then there are different tables or workbooks depending on the technology you’re using. And then inside those workbooks, you’ve got different rows and different fields. So a relational database looks just like this. So you’ve got your ID column, you’ve got your first name column, you’ve got your surname column, and you’ve got your gender column. And so these are the different fields that make up your database. And then each row is essentially a record.

So row number one is a record about me. It says my first name, my surname, my gender, et cetera. Row number two is another record about my colleague Anthony Stanley, et cetera. Another thing to notice, and we’ll come back to this in the data warehousing section, is that each field always contains the same type of data. So the ID field is always going to contain a number. The first name field is always going to contain text, and the gender field is always going to contain a single character. and that’s either going to be male or female. And we’ll find out why this is important in the database warehousing part of this course. There are various types of relational databases. We’ve got Microsoft SQL Server, we’ve got Oracle, we’ve got My SQL Server, we’ve got PostgreSQL, we’ve got Aurora, and now we have MariaDB.

And Maria DB has just come out with Reinvent 2015. So there are currently six relational databases that are available on RDS, and it’s important to remember those going into the exam. And by the way, guys, RDS and Dynamo DB do come up a lot in the exam, so it’s really important to go through and read the FAQs as a part of your study for the exam. It’s really important to understand the differences between those two and when you would use one over the other. And we’ll go into that in a bit more detail as we progress throughout this course.

So non-relational databases are fairly new; they’ve only been around since roughly 2004. It can be a bit difficult to get your head around. Examples of non-relational databases out there are things like CouchDB, MongoDB, Cloudant, et cetera. But in the Amazon world, we use something called Dynamo DBS, and these are document-oriented databases. There are various types of NoSQL databases available, such as tabular, key-value pairs, and so on. But in terms of the ones that we use with Amazon, they’re document orientated. So what do I mean by “documents”? Well, you’ve got your database, and then you’ve got a collection, a document, and then you’ve got key-value pairs. And that sounds a bit confusing at first, but when you relate it back to relational databases, it makes a lot of sense. So you’ve got your database, and then inside you’ve got a collection, and a collection is basically just a table. Your tables consist of different documents, and a document is basically your record set or row in your table.

And then you’ve got your key-value pairs, which are essentially just different fields within that table. So if we have a look here, here is a document, and it consists of different key-value pairs. So we can see the first key-value pair is ourID, and basically its value is five one two. Then we got our key value pair of firstname, which has the value John, and so on. So with no SQL databases, you always open the document with curly brackets, and then you basically split out your key-value pairs. So you will have different key-value pairs going through, and you can essentially fill this data with anything. You can also have embedded data structures, so you can have a table and then an embedded key-value pair. So your address consists of the street, the suburb, et cetera.

So there you have it: a high-level overview of no SQL databases. Now let’s move on to database warehousing. And again, this is a fairly new or emerging technology that’s basically come out of the woodwork in the last ten years or so. It was first supported by Microsoft SQL Server in 2008, and it was then reinforced in SQL Server 2012. And basically, data warehousing is used for business intelligence. So things like Cognos, Jaspersoft, SQL Server Reporting Services, Oracle, Hyperion, SAP, NetWeaver, et cetera use data warehousing. And basically, data warehousing is used to pull in very large and complex data sets, and it’s usually used by management to do queries on data such as current performance versus targets, et cetera. So let’s look at some examples of where we would use data warehouses versus RDS. So there are two different types of transactions that are commonly referred to in It.

There’s online transaction processing and online analytics processing, and basically they differ in terms of the types of queries that are being run. So with online transaction processing, it’s a very transactional process (pardon the pun), but essentially you might be using a MySQL database and you might be saying, “Okay, I want to see order 212121.” So you say “select star ID equals 212121,” and that will basically pull up a record set. So it will pull up things like the name, the date, the address that the order is going to be delivered to, the delivery status, etc. So it’s a very transactional transaction; it will only happen once, and it will hit the database very lightly. So let’s now compare this to an online analytics processing transaction. Online analytics processing is simply where someone is trying to analyse a large volume of information.

So let’s use the example of a manager. A manager is being asked to figure out the net profit for a particular product. So this might be a digital radio product, and they need to understand the net profit over a given amount of time, let’s say over twelve months, for this digital radio in both the stores in EMEA as well as the stores in the Pacific region. So if you think about what is needed to calculate that, we’re going to need to know the number of radios that have been sold in EMEA and the number of radios that have been sold in the Pacific. We’re going to need to know the unit cost of the radio in each region. We’re going to need to know the sales price of each radio in the region. And the sale prices might be different in France compared to the UK. because you’ve got different currencies. Likewise, in the Pacific region, it might be the Japanese yen versus the Australian dollar. So your sales prices are all going to be different. And then you need to figure out the sales price minus the unit cost.

If you think about the amount of data that’s going to be involved in this sort of transaction, it’s going to be huge. And you’re going to be hitting the database really, really hard with lots of different queries, and it’s going to have a big effect on the database. Now, with relational databases, traditionally, they’re very difficult to scale out. Essentially, you need to have a production database. And if you’re running these queries on your production database while also trying to do online transaction processing, where you’re trying to allow your customers to place new orders, it’s going to hurt the database and really affect your IO. So basically, database warehousing has come into existence where you more or less have a copy of your production database, and this copy of the database is where you’re going to run your OLAP transactions on.And the way data warehousing is actually designed, it actually uses a completely different architecture, both from a database perspective and from an infrastructure layer, in order to maximise those types of query times.

And an example of this is using things like columnar storage. And we’ll go over that in more detail in the course’s Redshift section. But, as I previously stated, the database type is always the same in terms of the fields. So if your database field is “name,” it’s always going to be characters. If your database field is your order number, that’s always going to be a number. When we start using columnar storage, we can actually improve performance times for online analytics processing, but we’ll cover that in the Redshift section of the course and make it a lot clearer. So let’s move on to ElastiCache. ElastiCache is a web service that makes it easy to deploy, operate, and scale in-memory caching in the cloud. The service improves the performance of Web applications by allowing you to retrieve information from fast-managed in-memory caches instead of relying entirely on slower disc-based databases. Currently, Elastic Cache supports two open-source memory caching engines:

Me cache, D, and Radis. Essentially, you’ll be tested on a variety of scenarios, and you’ll be asked to improve your database performance speeds. And with elastic cache, what’s really cool is that it’s going to cache the most consistently queried aspects of your database. So if your web application is constantly requesting what the top ten deals are for today, you might want to cache that information in ElastiCache.

And then, when your Web application queries for the top ten deals, it will get that from ElastiCache rather than your database. And it just takes a load off your database and allows you to basically improve the performance of your underlying database. And finally, let’s move onto database migration services. Now, DMs were announced at Reinvent in 2015. It’s a very, very new technology. It’s still in preview, as you saw from the console earlier, but it might be an exam topic sometime in 2016. So I’m just going to explain what it is. So it stands for Database Migration Services, and it’s pretty amazing stuff. As a result, you can automatically migrate your production database to AWS. And once the migration has started, AWS manages all the complexities of the migration process.

Things like data type, transformation compression, and parallel transfer for faster data transfers And it also ensures that data changes to your production database that occurred during the migration process are automatically replicated to the target. Now, the really cool thing about DMs is that it gives you this schema conversion tool that basically allows you to convert your source database to a different target database.

So you can have an application that’s using a legacy Oracle database. And Oracle is extremely expensive when it comes to licensing, but a lot of legacy applications still use Oracle. So what can you do? Do you just constantly pay these prices to Oracle? Well, now that you’ve got DMs, you can actually take your Oracle database and convert it to a free MySQL database, so you could convert it to Aurora. And with the schema conversion tool, it actually takes things like custom code, including views, stored procedures, and functions, and all these different types of technology. We’ll put it into a format that is compatible with MySQL. And so when they announced this, I don’t think Oracle was very happy.

The media just said: “Is this the death knell for Oracle?” We’ll see what happens there. But it does mean now that you can move your legacy databases over to free open-source databases such as MySQL, okay? So that’s really it for this lecture and section, guys. We really don’t cover the Developer Associate Exams in a lot of detail simply because it’s not covered in a lot of detail in the exam.

However, DynamoDB is on the exam a lot. So we’ve got an entire section on DynamoDB, which is up next in terms of RDS. Just remember that it’s online transaction processing. There are six different engines. So there’s SQL Server, MySQL, PostgreSQL, Oracle, Aurora, and Maria DB; which is the newest engine that was added at Reinvent 2015? We then have DynamoDB, which is our NoSQL service.

We have Redshift for online analytics processing. And then we’ve got Elastic ache, which is us in-memory caching service in the cloud. And it consists of two different engines: The MemcacheD or Me cached engine and the Radis engine. And then, finally, we have database migration services. So that’s it. It really is a database. 101.

If you do want to get more hands-on with RDS, I recommend taking the Certified Solutions Architect Associate Course. We do get really hands-on, so it’s probably worth having a read of the RDS FAQ section on Amazon’s website. But really, you don’t need to know much about RDS in order to pass the developer exam. So that’s it for this section of the course, guys. Let’s move on to DynamoDB. And we’re going to get really hands-on and technical, and you’re going to learn everything there is to know about DynamoDB in the next section of the course. If you have any questions, please let me know. If not, feel free to move on to the next section. Thank you.

Leave a Reply

How It Works

img
Step 1. Choose Exam
on ExamLabs
Download IT Exams Questions & Answers
img
Step 2. Open Exam with
Avanset Exam Simulator
Press here to download VCE Exam Simulator that simulates real exam environment
img
Step 3. Study
& Pass
IT Exams Anywhere, Anytime!