So let's have a look at DynamoDB Streams. Streams are an ordered list of item-level modifications, such as create, update, and delete, happening within a table. Whenever you insert, modify, or delete an item, that modification becomes visible in the stream, and the stream represents the list of all the modifications in your table over time. Stream records can be sent to multiple destinations, such as Kinesis Data Streams, so you can send a DynamoDB stream into Kinesis and then do whatever you want with it. You can also use a Lambda function to read directly from your DynamoDB stream, or you can use Kinesis Client Library applications to read directly from it as well. Data retention within a DynamoDB stream is up to 24 hours, so you need to make sure to either persist it somewhere like a Kinesis Data Stream, where you can have a longer retention, or use a Lambda function or a KCL application to persist it somewhere more durable.
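As a quick sketch, enabling a stream on an existing table can be done with the AWS CLI. The table name `Users` here is just a placeholder for illustration:

```shell
# Enable a stream on an existing table (hypothetical table name "Users").
# NEW_AND_OLD_IMAGES controls what each stream record contains;
# the other view types are covered later in the lecture.
aws dynamodb update-table \
  --table-name Users \
  --stream-specification StreamEnabled=true,StreamViewType=NEW_AND_OLD_IMAGES
```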
The use cases for DynamoDB Streams are to react in real time to changes happening in your DynamoDB tables. For example, you could have a flow that sends a welcome email to new users, do analytics, transform the stream and create derivative tables in DynamoDB, or send data into OpenSearch for indexing, giving you search capabilities on top of DynamoDB. And if you wanted to implement global tables and cross-region replication, you would need to have streams in the first place. So if we look at the architecture of DynamoDB Streams: we have our application, which performs create, update, and delete operations on our table, and any of these changes is going to appear in a DynamoDB stream. From there, Kinesis Data Streams can be a receiver of your DynamoDB stream.
And because we're using KDS, Kinesis Data Streams, we can then have Kinesis Data Firehose as a destination, and maybe send the data to Amazon Redshift to perform analytics queries on top of your DynamoDB data, or to Amazon S3 for archival of all these changes in case we need them, or to the OpenSearch Service to index it and create search capabilities on top of your DynamoDB table. The cool thing about this architecture is that pretty much everything is managed by AWS. If you wanted to add your own custom logic, you could use a processing layer in which you would create either a Kinesis Client Library app, maybe running on EC2, or a Lambda function that reads from DynamoDB Streams. From there, you can implement any sort of logic you want. For example, you can do messaging or send notifications using Amazon SNS, do some filtering and transformation and then reinsert the data into a DynamoDB table, or use Lambda to send data into OpenSearch if you wanted to.
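As a sketch of that custom-logic processing layer, here is roughly what a Lambda handler reading from a DynamoDB stream could look like. The event shape (`Records`, `eventName`, `dynamodb.NewImage`) is the standard DynamoDB stream event format; the `email` attribute and the welcome-email idea are just assumptions for illustration:

```python
def handler(event, context):
    """Hypothetical handler: collect welcome emails for newly inserted users.

    `event` is the batch of stream records that Lambda receives; each record
    carries the item in DynamoDB's attribute-value format
    (e.g. {"S": "alice@example.com"} for a string attribute).
    """
    welcome_emails = []
    for record in event["Records"]:
        if record["eventName"] != "INSERT":
            continue  # only react to newly created items
        new_image = record["dynamodb"]["NewImage"]
        # "email" is an assumed attribute on our items
        welcome_emails.append(new_image["email"]["S"])
    # A real function might publish these to Amazon SNS;
    # here we simply return them.
    return welcome_emails
```

Modified or deleted items are skipped, so only `INSERT` events trigger the welcome flow.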
So this shows the different kinds of architectures and all the possibilities that open up by using DynamoDB Streams. Now, if we consider the stream itself, what do we have? In the stream, we have the ability to choose the information that will appear in it. We can, for example, have KEYS_ONLY, which will show just the key attributes of the modified item; NEW_IMAGE, which represents the entire item as it appears after it was modified; or OLD_IMAGE, which represents the entire item as it appeared before it was modified. And if you want the full information, you can choose NEW_AND_OLD_IMAGES, which gives you both the new and the old image of the item, so you can see exactly what changes have happened. Now, DynamoDB Streams are made of shards, just like Kinesis Data Streams, so they're very, very similar, and this is why the Kinesis Client Library works against both DynamoDB Streams and Kinesis Data Streams. The cool thing about DynamoDB Streams, though, is that we don't have to provision any shards.
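To make the view types concrete, here is a small illustrative helper (not an AWS API, just a sketch) that diffs a record captured with NEW_AND_OLD_IMAGES to see which attributes changed:

```python
def changed_attributes(record):
    """Return {attribute: (old_value, new_value)} for a MODIFY stream record.

    Assumes the stream was configured with NEW_AND_OLD_IMAGES, so the record
    carries both images in DynamoDB's attribute-value format. With KEYS_ONLY,
    NEW_IMAGE, or OLD_IMAGE alone, one side would be missing and a diff like
    this would not be possible.
    """
    old = record["dynamodb"].get("OldImage", {})
    new = record["dynamodb"].get("NewImage", {})
    diff = {}
    for attr in set(old) | set(new):
        if old.get(attr) != new.get(attr):
            diff[attr] = (old.get(attr), new.get(attr))
    return diff
```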
This is done automatically by AWS, so it is really a hands-off approach. Now, if you enable a DynamoDB stream, just so you know, records are not going to be retroactively populated in the stream after enabling it. This is an exam trick: only once you enable the stream will you receive updates, based on the changes happening in your DynamoDB table from that point on. Finally, let's have a look at how DynamoDB Streams and Lambda work together. For this, we need to define an Event Source Mapping to read from the DynamoDB stream, and we need to ensure that the Lambda function has the appropriate permissions to poll the DynamoDB stream. The Lambda function will then be invoked synchronously. So let's take an example: the table's changes go into a DynamoDB stream, and the Lambda function has an Event Source Mapping, which is an internal process that polls the DynamoDB stream and retrieves records from it in batches.
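As a sketch, creating that Event Source Mapping with the AWS CLI could look like this. The function name, account ID, and stream ARN are placeholders:

```shell
# Wire a Lambda function to a DynamoDB stream (all identifiers hypothetical).
# The mapping polls the stream and invokes the function with batches of records.
aws lambda create-event-source-mapping \
  --function-name ProcessStream \
  --event-source-arn arn:aws:dynamodb:us-east-1:123456789012:table/Users/stream/2024-01-01T00:00:00.000 \
  --starting-position LATEST \
  --batch-size 100
```

The Lambda function's execution role also needs permissions to read from the stream (the `AWSLambdaDynamoDBExecutionRole` managed policy covers this).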
And once some records are retrieved, the Event Source Mapping will invoke your Lambda function synchronously with a batch of records from your DynamoDB stream, okay? So that's it for this lecture. I hope you liked it, and I will see you in the next lecture for some hands-on.