1 00:00:00,670 --> 00:00:01,690 In this video, 2 00:00:01,690 --> 00:00:05,950 I am going to talk to you about designing analytical stores. 3 00:00:05,950 --> 00:00:07,580 So what we're going to be taking a look at 4 00:00:07,580 --> 00:00:12,320 is how do we choose an analytical data store in Azure? 5 00:00:12,320 --> 00:00:13,153 And with that, 6 00:00:13,153 --> 00:00:15,070 we're going to be looking at some key criteria, 7 00:00:15,070 --> 00:00:16,260 exploring options, 8 00:00:16,260 --> 00:00:19,390 and looking at the capability matrix to help us 9 00:00:19,390 --> 00:00:22,090 to figure out how we choose analytical stores. 10 00:00:22,090 --> 00:00:24,523 So with that, let's dive in and get started. 11 00:00:26,540 --> 00:00:28,440 So the first thing is exploring the options. 12 00:00:28,440 --> 00:00:31,780 And let me tell you, there are a lot of options. 13 00:00:31,780 --> 00:00:34,760 So what you need to do is think 14 00:00:34,760 --> 00:00:36,270 through questions that can help 15 00:00:36,270 --> 00:00:40,000 you to narrow down these options into the one 16 00:00:40,000 --> 00:00:42,640 or two that would make the most sense for you. 17 00:00:42,640 --> 00:00:45,680 I do want you to be aware of this list, though, 18 00:00:45,680 --> 00:00:48,820 and also keep the DP-203 in mind. 19 00:00:48,820 --> 00:00:51,440 There's more on this list than is on the DP-203. 20 00:00:51,440 --> 00:00:55,490 For instance, we don't cover Cosmos DB on the DP-203. 21 00:00:55,490 --> 00:00:59,740 And we don't cover Azure SQL Database on the DP-203, 22 00:00:59,740 --> 00:01:02,460 but you need to be aware of these types of services 23 00:01:02,460 --> 00:01:04,660 because that's going to help you in your career. 24 00:01:04,660 --> 00:01:07,420 And that's also going to help you to narrow down options 25 00:01:07,420 --> 00:01:09,803 when you're looking at questions on the exam.
26 00:01:11,220 --> 00:01:12,053 Alright, so with that, 27 00:01:12,053 --> 00:01:13,510 what are the questions that we need to know 28 00:01:13,510 --> 00:01:14,860 to help us to choose? 29 00:01:14,860 --> 00:01:18,490 Well, the first one is relational versus non-relational. 30 00:01:18,490 --> 00:01:20,230 This is a huge question. 31 00:01:20,230 --> 00:01:23,360 And if we're looking at a relational database, 32 00:01:23,360 --> 00:01:25,470 we're looking at things that are SQL, right? 33 00:01:25,470 --> 00:01:28,750 So Azure SQL Database or things like that, right? 34 00:01:28,750 --> 00:01:31,130 If we are looking for non-relational, well, 35 00:01:31,130 --> 00:01:33,810 now we're looking down the path of Cosmos DB 36 00:01:33,810 --> 00:01:35,780 or something similar to that 37 00:01:35,780 --> 00:01:39,380 that also helps us in our storage options. 38 00:01:39,380 --> 00:01:41,950 So relational versus non-relational is definitely 39 00:01:41,950 --> 00:01:43,500 the place that you should start. 40 00:01:43,500 --> 00:01:44,700 And just a pro tip. 41 00:01:44,700 --> 00:01:48,160 As you are thinking about how to build the database, 42 00:01:48,160 --> 00:01:50,480 don't just think about the project that you're on. 43 00:01:50,480 --> 00:01:51,740 Think about how that project 44 00:01:51,740 --> 00:01:53,890 is going to look in 3 or 5 years. 45 00:01:53,890 --> 00:01:56,170 So what kinds of data are going to go in there? 46 00:01:56,170 --> 00:01:57,570 What kinds of groups 47 00:01:57,570 --> 00:01:59,530 are going to be interacting with that data? 48 00:01:59,530 --> 00:02:01,590 How are they going to be interacting with the data? 49 00:02:01,590 --> 00:02:05,000 Is it going to be marketing experts that have 50 00:02:05,000 --> 00:02:07,530 no technical background trying to do something 51 00:02:07,530 --> 00:02:10,830 or is it only going to be database architects?
52 00:02:10,830 --> 00:02:13,040 So that's going to help you to kind of narrow down 53 00:02:13,040 --> 00:02:14,930 that relational versus non-relational 54 00:02:14,930 --> 00:02:17,293 and where your database is going to be heading. 55 00:02:19,780 --> 00:02:23,130 Next, what type of database model do you need? 56 00:02:23,130 --> 00:02:26,020 So we could do key-value or document, 57 00:02:26,020 --> 00:02:28,270 or column, or graph, right? 58 00:02:28,270 --> 00:02:30,560 There are quite a few different database models. 59 00:02:30,560 --> 00:02:32,760 So think about what you're going to have. 60 00:02:32,760 --> 00:02:34,720 And a lot of that is going to be based upon 61 00:02:34,720 --> 00:02:36,250 what you currently have. 62 00:02:36,250 --> 00:02:37,760 So if you have a legacy system, 63 00:02:37,760 --> 00:02:41,410 trying to change that system may not be the correct route. 64 00:02:41,410 --> 00:02:43,610 A better approach might be looking at what you have 65 00:02:43,610 --> 00:02:46,710 and figuring out how you can carry that forward 66 00:02:46,710 --> 00:02:48,510 or expand upon it. 67 00:02:48,510 --> 00:02:50,640 Or you may need to scrap the entire thing, 68 00:02:50,640 --> 00:02:52,260 but understanding the database model 69 00:02:52,260 --> 00:02:55,190 that you're going to want is also a critical factor 70 00:02:55,190 --> 00:02:57,173 in deciding your store. 71 00:02:58,530 --> 00:02:59,520 So what other services 72 00:02:59,520 --> 00:03:00,790 or activities are going to be 73 00:03:00,790 --> 00:03:03,580 in your data engineering projects and pipelines? 74 00:03:03,580 --> 00:03:07,850 So are we doing large-scale transformations in Databricks? 75 00:03:07,850 --> 00:03:11,210 Are we tying in a bunch of MongoDB databases? 76 00:03:11,210 --> 00:03:13,470 If we are doing things like that, 77 00:03:13,470 --> 00:03:16,290 that may skew us more towards, "hey, 78 00:03:16,290 --> 00:03:18,550 Cosmos DB because you can use its MongoDB API".
79 00:03:18,550 --> 00:03:22,200 Or if we're using Databricks, okay, 80 00:03:22,200 --> 00:03:23,320 that's going to help us to know 81 00:03:23,320 --> 00:03:25,120 as well what kind of size we have. 82 00:03:25,120 --> 00:03:28,930 Or Azure Synapse might be a good solution as well, right? 83 00:03:28,930 --> 00:03:30,500 So that kind of helps us to figure out 84 00:03:30,500 --> 00:03:32,610 what we need as we look at the other services 85 00:03:32,610 --> 00:03:33,763 in the pipeline. 86 00:03:34,700 --> 00:03:35,533 And then finally, 87 00:03:35,533 --> 00:03:38,150 what kind of scalability needs do you have? 88 00:03:38,150 --> 00:03:39,170 So again, 89 00:03:39,170 --> 00:03:40,750 how is this database 90 00:03:40,750 --> 00:03:44,080 or this store going to grow over time? 91 00:03:44,080 --> 00:03:46,410 And that's going to be in processing, 92 00:03:46,410 --> 00:03:48,500 or queries coming into it. 93 00:03:48,500 --> 00:03:51,050 That's also going to be in size as well. 94 00:03:51,050 --> 00:03:53,033 So think about both of those things. 95 00:03:55,380 --> 00:03:56,830 So there's one more thing that we need 96 00:03:56,830 --> 00:03:59,120 to look at before we wrap up this lesson. 97 00:03:59,120 --> 00:04:01,190 And that is the capability matrix, 98 00:04:01,190 --> 00:04:03,490 which if anything is going to be the most likely 99 00:04:03,490 --> 00:04:05,840 thing that you would see on the DP-203. 100 00:04:05,840 --> 00:04:08,630 So let's jump over and take a look at that. 101 00:04:08,630 --> 00:04:11,930 And I'm going to throw a link in the video description. 102 00:04:11,930 --> 00:04:13,540 So make sure you take a look at that. 103 00:04:13,540 --> 00:04:16,400 This is under choosing an analytical data store. 104 00:04:16,400 --> 00:04:18,290 There's a capability matrix here 105 00:04:18,290 --> 00:04:19,720 that I would highly suggest 106 00:04:19,720 --> 00:04:22,110 that you take some time to read through.
107 00:04:22,110 --> 00:04:24,190 So plan on taking the next 10 minutes 108 00:04:24,190 --> 00:04:25,510 or so after this video 109 00:04:25,510 --> 00:04:27,180 and just kind of look through this. 110 00:04:27,180 --> 00:04:28,610 This is going to give you a list 111 00:04:28,610 --> 00:04:32,060 of all of those different resources or options 112 00:04:32,060 --> 00:04:34,260 and it's going to break it down by category. 113 00:04:34,260 --> 00:04:35,700 So you can see there's general, 114 00:04:35,700 --> 00:04:38,200 there's scalability capabilities, 115 00:04:38,200 --> 00:04:41,180 and then security capabilities as well. 116 00:04:41,180 --> 00:04:44,310 So I don't think you should memorize this. 117 00:04:44,310 --> 00:04:47,470 In fact, I'm going to say don't memorize this for the DP-203 118 00:04:47,470 --> 00:04:48,570 or even for your job. 119 00:04:48,570 --> 00:04:50,050 You can always go to the link. 120 00:04:50,050 --> 00:04:51,890 But what I would suggest you do is, again, 121 00:04:51,890 --> 00:04:54,690 take about 10 minutes or so, look through this, 122 00:04:54,690 --> 00:04:55,760 and spend a little bit of time 123 00:04:55,760 --> 00:04:58,290 just kind of getting familiar with what's on here, 124 00:04:58,290 --> 00:05:00,480 and especially a few of the mainstays 125 00:05:00,480 --> 00:05:05,357 such as Synapse here for the DP-203. 126 00:05:09,470 --> 00:05:12,160 Lastly, let's talk about some key points to remember. 127 00:05:12,160 --> 00:05:15,040 One, don't skip the pre-implementation stages. 128 00:05:15,040 --> 00:05:17,220 I've said this a couple of times in this course, 129 00:05:17,220 --> 00:05:19,730 but don't skip the planning stages. 130 00:05:19,730 --> 00:05:22,240 That's really going to hurt you down the road. 131 00:05:22,240 --> 00:05:26,540 And be thinking about ways to justify your position. 132 00:05:26,540 --> 00:05:28,470 So you need to spend a little bit of time 133 00:05:28,470 --> 00:05:29,820 doing research on your own.
134 00:05:29,820 --> 00:05:30,930 And then present that. 135 00:05:30,930 --> 00:05:33,670 And present why you think you should do it that way. 136 00:05:33,670 --> 00:05:35,480 Again, in your role, 137 00:05:35,480 --> 00:05:37,870 you're going to be the expert in a lot of these types 138 00:05:37,870 --> 00:05:42,840 of matters, figuring out how to architect your solution. 139 00:05:42,840 --> 00:05:45,010 Second, think through key questions. 140 00:05:45,010 --> 00:05:47,630 Again, be familiar with that capability matrix. 141 00:05:47,630 --> 00:05:49,950 Think about how your database is going to grow, 142 00:05:49,950 --> 00:05:52,710 think about the business needs for your database, 143 00:05:52,710 --> 00:05:54,780 and the types of queries that are going to be run, 144 00:05:54,780 --> 00:05:58,440 and where that data is going to flow out of and flow into. 145 00:05:58,440 --> 00:06:00,890 So think about those things as well. 146 00:06:00,890 --> 00:06:03,310 And with that, we are actually done with this lesson. 147 00:06:03,310 --> 00:06:05,590 Short but sweet, but very important. 148 00:06:05,590 --> 00:06:08,560 So make sure that you don't skip the pre-planning. 149 00:06:08,560 --> 00:06:10,160 I'll see you in the next lesson.