1 00:00:00,360 --> 00:00:02,040 Hello, you beautiful people. 2 00:00:02,040 --> 00:00:07,140 And welcome to this video where you're going to start learning all about file archiving and compression. 3 00:00:07,140 --> 00:00:12,330 And I'm sure you've heard millions of times about the benefits of keeping backups of your data, and 4 00:00:12,330 --> 00:00:15,000 that's exactly what we're going to be covering in the next few videos. 5 00:00:15,000 --> 00:00:19,440 In the next few videos, you're going to learn how to archive and compress your files in Linux to create 6 00:00:19,440 --> 00:00:20,070 backups. 7 00:00:20,070 --> 00:00:22,080 Now just creating the backups isn't good enough. 8 00:00:22,080 --> 00:00:27,030 You're also going to learn how you can restore data from those backups in the future. 9 00:00:27,030 --> 00:00:30,450 So when you need to uncompress it, your data will come back out perfectly. 10 00:00:30,450 --> 00:00:30,990 By the end, 11 00:00:30,990 --> 00:00:34,290 you're basically going to be able to create and restore your own backups to make sure you don't lose 12 00:00:34,290 --> 00:00:35,670 any important data. 13 00:00:35,670 --> 00:00:40,200 And you're also going to learn about various compression algorithms so you can save the maximum amount 14 00:00:40,200 --> 00:00:42,420 of space on your file system. 15 00:00:42,420 --> 00:00:44,760 So let's go ahead and jump right into it. 16 00:00:46,030 --> 00:00:46,420 Okay. 17 00:00:46,420 --> 00:00:50,920 So when you're compressing and archiving files in Linux, the first thing you need to know about is 18 00:00:50,920 --> 00:00:52,780 something called a tarball. 19 00:00:52,810 --> 00:00:57,730 Now I want you to imagine that you're at a shopping center or your local store and you're buying some 20 00:00:57,730 --> 00:00:58,390 apples. 21 00:00:58,960 --> 00:01:00,460 Now, let me ask you a question. 22 00:01:00,730 --> 00:01:03,190 How would you make those apples, after picking them up,
23 00:01:03,190 --> 00:01:04,209 easier to carry? 24 00:01:04,840 --> 00:01:05,890 Well, you'd put them in a bag, 25 00:01:05,890 --> 00:01:06,380 right? 26 00:01:06,400 --> 00:01:09,960 And that's what creating a tarball is basically doing. 27 00:01:10,270 --> 00:01:15,910 So creating a tarball is basically a way of putting your files in a bag to make them easier to compress 28 00:01:15,910 --> 00:01:17,110 and make them easier to store. 29 00:01:17,350 --> 00:01:23,380 And when you add files to a tarball, you make it possible to store all of your files in one place. 30 00:01:23,740 --> 00:01:31,000 The tarball itself doesn't do any compression, but tarballs can be compressed using compression algorithms. 31 00:01:31,270 --> 00:01:34,360 So archiving files is basically a two-step process. 32 00:01:34,390 --> 00:01:36,620 First, you make a tarball. 33 00:01:36,970 --> 00:01:37,990 You make a tarball. 34 00:01:37,990 --> 00:01:43,480 And then secondly, you compress the tarball using some kind of compression algorithm. 35 00:01:43,870 --> 00:01:47,050 So let's go ahead and create our very first tarball. 36 00:01:47,260 --> 00:01:52,720 So we're on our desktop and we can see that we've got file1.txt, file2.txt and 37 00:01:52,720 --> 00:01:56,080 file3.txt, and we're going to put those inside a tarball. 38 00:01:56,170 --> 00:02:01,990 Now if I do an ls on these files and just have a look at them, you can see that they're all about 39 00:02:01,990 --> 00:02:07,810 ten kilobytes in size: 9.8 kilobytes, or about ten kilobytes. 40 00:02:07,810 --> 00:02:10,840 And we're going to put all of these into a tarball. 41 00:02:11,410 --> 00:02:14,470 So to do so, we're going to use the tar command. 42 00:02:14,470 --> 00:02:15,880 The tar command. 43 00:02:16,480 --> 00:02:22,720 Now, the tar command takes quite a few different options in order for this to work.
44 00:02:22,750 --> 00:02:26,140 So first we type tar, then we need to give it three options. 45 00:02:26,410 --> 00:02:27,250 So let's type a dash. 46 00:02:27,250 --> 00:02:29,420 And the first option is the c option. 47 00:02:29,440 --> 00:02:35,110 Now the c option lets the tar command know that we want to create a new archive. 48 00:02:35,740 --> 00:02:40,720 The next is v, which lets the tar command know that 49 00:02:40,960 --> 00:02:42,280 we want it to speak to us. 50 00:02:42,280 --> 00:02:45,010 We want it to give us some feedback so we know how it's doing. Now, 51 00:02:45,040 --> 00:02:49,210 v stands for verbose, and it basically says: hey, tar command, don't do this 52 00:02:49,210 --> 00:02:51,760 silently; talk to me and let me know what's up. 53 00:02:51,970 --> 00:02:54,520 Now the v option is entirely optional. 54 00:02:54,520 --> 00:02:57,720 You don't need it to make an archive, but it's good practice to keep it there. 55 00:02:58,210 --> 00:03:00,720 And the final option is the f option. 56 00:03:00,730 --> 00:03:05,500 Now, the f option tells the tar command that the next thing we type is the name of the archive file. 57 00:03:06,730 --> 00:03:11,710 So now we need to tell the tar command what we want to call our tarball. 58 00:03:12,250 --> 00:03:16,390 Now, by convention, we end tarball files with the .tar file extension. 59 00:03:16,390 --> 00:03:19,000 This is just a convention, so we know what they are later. 60 00:03:19,000 --> 00:03:23,740 So let's call it ourarchive.tar. 61 00:03:24,940 --> 00:03:30,310 And finally, what we need to do now is tell the tar command what's going to go inside 62 00:03:30,310 --> 00:03:31,420 our tarball. 63 00:03:31,930 --> 00:03:37,240 So we want file1.txt, file2.txt and file3.txt. 64 00:03:37,270 --> 00:03:42,880 So let's just use a wildcard to save some typing and put file[123].txt. 65 00:03:43,270 --> 00:03:45,270 And with that we're ready to go.
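The command built up in the steps above can be sketched as follows. This is only a minimal sketch: the throwaway working directory, the placeholder file contents, and the exact spelling `ourarchive.tar` are assumptions made so the example can be run anywhere, not details taken from the video.

```shell
# Work in a throwaway directory so nothing on a real desktop is touched (assumption).
workdir=$(mktemp -d)
cd "$workdir"

# Create three small sample text files; the contents are placeholders.
for n in 1 2 3; do
    printf 'sample contents of file %s\n' "$n" > "file$n.txt"
done

# c = create a new archive
# v = verbose, report each file as it is added
# f = the next argument is the name of the archive file
tar -cvf ourarchive.tar file[123].txt
```

The shell expands `file[123].txt` to the three matching file names before tar runs, so tar receives them as three separate arguments.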
66 00:03:45,280 --> 00:03:49,480 So what we're saying is, hey, tar command, we want you to create a new archive. 67 00:03:49,480 --> 00:03:50,710 So we give it the c option. 68 00:03:50,710 --> 00:03:52,990 We're saying speak to us so we know what's going on. 69 00:03:52,990 --> 00:03:57,760 So that's the v option, and we're going to give it the f option so that we can name the archive file it should 70 00:03:57,760 --> 00:03:58,290 create. 71 00:03:58,300 --> 00:04:03,370 So let's just go ahead and press enter and we see that it told us what's going on and now we've got 72 00:04:03,370 --> 00:04:06,970 this new tar file that's actually been created. 73 00:04:06,970 --> 00:04:10,840 And you can see that it looks like a little box, like a moving box. 74 00:04:11,110 --> 00:04:12,940 So you know that it's an archive. 75 00:04:12,940 --> 00:04:19,000 And if we do an ls, we can actually see that the tarball file has been highlighted in red 76 00:04:19,000 --> 00:04:21,339 just to make it stand out that bit more, which is quite nice. 77 00:04:21,339 --> 00:04:25,240 So hooray, we've actually managed to create our first tarball. 78 00:04:25,720 --> 00:04:29,290 Now that command, this tar command, had quite a few different options in it. 79 00:04:29,290 --> 00:04:32,620 So what I've done, for all the commands dealing with archiving, 80 00:04:32,620 --> 00:04:36,670 so in this video and the next video, is make a cheat sheet for you to download that details how 81 00:04:36,670 --> 00:04:38,650 to do different archiving tasks. 82 00:04:38,650 --> 00:04:40,180 So don't worry about memorizing them. 83 00:04:40,180 --> 00:04:45,550 Now, you can find that in the resources section for the next video, when we finish dealing with archiving. 84 00:04:45,550 --> 00:04:50,530 Okay, so let's take a look at how large the file is by using ls with the -l option.
85 00:04:50,530 --> 00:04:54,790 And actually what we're going to do is we're going to look for the ones that have .tar in them using 86 00:04:54,940 --> 00:04:55,750 grep. 87 00:04:55,750 --> 00:04:59,770 So we just see what has got tar in it, to focus our results a little bit. 88 00:04:59,770 --> 00:05:06,070 And we see that actually it's 40,960 bytes. 89 00:05:06,400 --> 00:05:16,990 Well, that's strange, because we had three files of about 10,000 bytes each, and now we've got 40,960 bytes for 90 00:05:16,990 --> 00:05:17,920 the tarball. 91 00:05:18,310 --> 00:05:21,040 Shouldn't it be 30,000 bytes or somewhere near there? 92 00:05:21,280 --> 00:05:22,330 Well, not exactly. 93 00:05:22,660 --> 00:05:24,490 Well, think of the apples example. 94 00:05:24,490 --> 00:05:30,580 If each apple weighed 100 grams and you have three apples and then you put them in a bag, would the whole 95 00:05:30,580 --> 00:05:32,290 bag weigh 300 grams? 96 00:05:33,070 --> 00:05:35,220 No, because the bag has some weight as well. 97 00:05:35,230 --> 00:05:35,680 Right? 98 00:05:35,680 --> 00:05:40,810 So what we're seeing here is just that in order to create a tarball, tar needs to add some extra data, such as a header for each file and some padding. 99 00:05:40,810 --> 00:05:45,430 So just think of the extra size being like the extra weight from a bag when you put apples in it. 100 00:05:45,430 --> 00:05:47,980 It's just part of the convenience of building a tarball. 101 00:05:47,980 --> 00:05:53,350 But don't worry, when we compress the tarball, it will typically be much, much smaller than the files would 102 00:05:53,350 --> 00:05:54,310 have been on their own. 103 00:05:54,310 --> 00:05:59,620 Okay, now we can tell that this is a tarball by looking at the .tar file extension, 104 00:05:59,620 --> 00:06:00,130 right? 105 00:06:00,130 --> 00:06:03,640 But remember that in Linux, file extensions don't really mean anything.
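The size puzzle from a moment ago (three files of about 10,000 bytes producing a 40,960-byte tarball) can be checked with a small experiment. This is a sketch under stated assumptions: the files here are exactly 10,000 bytes of zeros rather than the text files from the video, and the working directory and the name `ourarchive.tar` are illustrative.

```shell
# Work in a throwaway directory (assumption, not from the video).
workdir=$(mktemp -d)
cd "$workdir"

# Make three files of exactly 10,000 bytes each.
for n in 1 2 3; do
    head -c 10000 /dev/zero > "file$n.txt"
done

# Bundle them without compression.
tar -cf ourarchive.tar file[123].txt

# wc -c counts bytes; compare the files with the tarball.
files_total=$(cat file[123].txt | wc -c)
archive_size=$(wc -c < ourarchive.tar)
echo "files together: $((files_total)) bytes"
echo "tarball:        $((archive_size)) bytes"
echo "overhead:       $((archive_size - files_total)) bytes"
```

Assuming GNU tar with its default settings, each file gets a 512-byte header and its contents are padded to a multiple of 512 bytes (10,000 becomes 10,240), the archive ends with 1,024 bytes of zeros, and the total is rounded up to a multiple of 10,240 bytes. That gives 3 × 10,752 + 1,024 = 33,280 bytes, rounded up to 40,960, which matches the figure in the video. Other tar implementations may pad differently, so the exact overhead can vary.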
106 00:06:03,640 --> 00:06:08,530 They're usually just a convenience so that we, as humans, can see what a file is, but they don't mean anything 107 00:06:08,530 --> 00:06:09,820 necessarily to the system. 108 00:06:09,820 --> 00:06:14,260 So to double check that it is a tarball, we can use the file command that you learned about earlier 109 00:06:14,260 --> 00:06:14,710 in the course. 110 00:06:14,710 --> 00:06:21,100 So if we run clear to clear the screen and then file ourarchive.tar, you'll see that it tells us that 111 00:06:21,100 --> 00:06:23,860 it is indeed a tar archive. 112 00:06:23,860 --> 00:06:24,290 Okay. 113 00:06:24,430 --> 00:06:31,460 Now if we rename this tarball, so we rename it to be ourarchive.blam, 114 00:06:31,720 --> 00:06:42,070 let's say, and now we do the file command on that, so file ourarchive.blam, you see that the 115 00:06:42,070 --> 00:06:48,580 Linux system wasn't tricked, and it still knows that it's a tar archive even though it doesn't have 116 00:06:48,580 --> 00:06:50,290 the .tar file extension. 117 00:06:50,290 --> 00:06:55,030 So let's change that back because we don't need any more confusion than is necessary. 118 00:06:55,030 --> 00:06:59,200 But that's just to let you know that the .tar file extension isn't anything particularly special 119 00:06:59,200 --> 00:07:00,130 to the system. 120 00:07:00,130 --> 00:07:05,770 But again, it might be to other programs that you have installed on it, but you can always check 121 00:07:05,770 --> 00:07:09,070 what the true nature of something is using the file command. 122 00:07:09,490 --> 00:07:15,130 The .tar file extension is just something that allows you and potentially other users to know what 123 00:07:15,160 --> 00:07:16,870 type of file it is at a glance. 124 00:07:17,350 --> 00:07:21,670 Now you can actually take a look at what is in a tarball without having to extract it.
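The rename experiment described above might look like this as a runnable sketch. The working directory, the single placeholder file, and the names `ourarchive.tar` and `ourarchive.blam` are assumptions for illustration, and the exact wording that `file` prints can differ between systems.

```shell
# Work in a throwaway directory (assumption, not from the video).
workdir=$(mktemp -d)
cd "$workdir"

# Build a small tarball to inspect.
printf 'hello\n' > file1.txt
tar -cf ourarchive.tar file1.txt

# file looks at the contents, not the name, and reports a tar archive.
file ourarchive.tar

# Rename the tarball to an extension that means nothing.
mv ourarchive.tar ourarchive.blam

# file still identifies it as a tar archive.
file ourarchive.blam

# Change the name back to avoid confusion later.
mv ourarchive.blam ourarchive.tar
```

The design point is that `file` identifies a file by inspecting its bytes, so the result does not depend on what the file is called.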
125 00:07:21,670 --> 00:07:27,820 So let's say you downloaded this archive from the interwebs and you wanted to take a look 126 00:07:27,820 --> 00:07:28,870 at what's inside. 127 00:07:28,870 --> 00:07:29,290 Okay. 128 00:07:29,380 --> 00:07:31,840 Well, you can do that using the tar command. 129 00:07:31,840 --> 00:07:34,240 And what you do is you give it two options. 130 00:07:34,480 --> 00:07:38,710 You do the t option and the f option, and then you tell it ourarchive 131 00:07:38,710 --> 00:07:45,580 .tar. Okay, so the t option means list, and it basically lets you check what's inside the tar file, 132 00:07:45,580 --> 00:07:46,990 and the f option 133 00:07:47,020 --> 00:07:50,710 is just necessary in order to pass the archive file to the tar command. 134 00:07:51,160 --> 00:07:56,620 So when we do that, it tells us that file1 and file2 and file3 are inside the tarball. 135 00:07:56,620 --> 00:07:58,000 So that's pretty cool, right? Now, 136 00:07:58,000 --> 00:08:02,200 we're going to come on to compression in a little bit, but first, let's take a look at how we can 137 00:08:02,200 --> 00:08:05,230 get these files out of our tarball again. 138 00:08:05,470 --> 00:08:06,730 Well, it's actually very simple. 139 00:08:06,730 --> 00:08:11,170 First, let's delete the file*.txt files. 140 00:08:11,410 --> 00:08:15,880 So they've all disappeared and we've got just our tarball here and we're going to extract the files 141 00:08:15,880 --> 00:08:16,960 out of that. 142 00:08:16,960 --> 00:08:17,330 Okay. 143 00:08:17,500 --> 00:08:20,260 So let's try and do this now. 144 00:08:21,190 --> 00:08:25,080 The way we'll do it is actually very simple, right? 145 00:08:25,090 --> 00:08:28,630 So you have the tar command, and remember, to create an archive 146 00:08:28,630 --> 00:08:34,929 you had -cvf, and then you wrote ourarchive.tar, and then we gave it some files.
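Listing a tarball's contents and then removing the original files, as described above, can be sketched like this. The working directory, the placeholder file contents, and the name `ourarchive.tar` are assumptions for illustration.

```shell
# Work in a throwaway directory (assumption, not from the video).
workdir=$(mktemp -d)
cd "$workdir"

# Recreate the three sample files and the tarball.
for n in 1 2 3; do
    printf 'sample contents of file %s\n' "$n" > "file$n.txt"
done
tar -cf ourarchive.tar file[123].txt

# t = list the contents of the archive without extracting anything
# f = the next argument is the name of the archive file
tar -tf ourarchive.tar

# Delete the originals; only the tarball is left in the directory.
rm file[123].txt
ls
```

Listing with `-t` only reads the archive, so it is a safe way to inspect a tarball that came from somewhere else before extracting it.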
147 00:08:34,929 --> 00:08:37,659 Okay, now to extract, it's very simple. 148 00:08:38,020 --> 00:08:43,330 Instead of c for create, you use x for extract, and everything else is the same. 149 00:08:43,330 --> 00:08:47,590 But you don't even have to give it any files because you're not putting any files into the archive. 150 00:08:47,590 --> 00:08:51,760 It's just tar, x to extract, v for verbose, 151 00:08:51,760 --> 00:08:54,370 so it tells us what it's actually doing, and f is necessary 152 00:08:54,370 --> 00:09:00,100 so you can pass the archive file to the tar command. And when we press enter, we see that it has indeed 153 00:09:00,100 --> 00:09:01,000 extracted out for us 154 00:09:01,000 --> 00:09:03,730 file1, file2 and file3 from the archive. 155 00:09:03,730 --> 00:09:09,160 And we have those back in the directory that we were in when we ran the command. 156 00:09:09,160 --> 00:09:09,880 So there we go. 157 00:09:09,880 --> 00:09:10,360 Pretty easy, 158 00:09:10,360 --> 00:09:10,690 right? 159 00:09:10,690 --> 00:09:13,660 And again, all of this is in the cheat sheet for us. 160 00:09:13,930 --> 00:09:18,010 So now you might be thinking, okay, we've just extracted these files from the tar, 161 00:09:18,040 --> 00:09:18,910 from the tarball. 162 00:09:18,910 --> 00:09:20,530 Is the tarball now empty? 163 00:09:20,530 --> 00:09:21,760 Well, actually, it's not. 164 00:09:21,760 --> 00:09:26,500 You can actually check that the files are still in there, just like you 165 00:09:26,500 --> 00:09:27,040 saw before. 166 00:09:27,070 --> 00:09:32,440 Checking inside with the t and f options, you can see that file1, file2 and file3 are still 167 00:09:32,440 --> 00:09:33,880 inside the tarball. 168 00:09:33,880 --> 00:09:39,490 So no matter how many times you extract from a tarball, the tarball still contains its files.
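Extraction, and the check that the tarball is not emptied by it, can be sketched as follows. As before, the working directory, the placeholder file contents, and the name `ourarchive.tar` are assumptions for illustration.

```shell
# Work in a throwaway directory (assumption, not from the video).
workdir=$(mktemp -d)
cd "$workdir"

# Recreate the tarball, then delete the original files.
for n in 1 2 3; do
    printf 'sample contents of file %s\n' "$n" > "file$n.txt"
done
tar -cf ourarchive.tar file[123].txt
rm file[123].txt

# x = extract, v = verbose, f = the next argument is the archive file name.
# No file names are needed after the archive: everything inside is extracted
# into the current directory.
tar -xvf ourarchive.tar

# The files are back.
ls

# The tarball still lists the same three files after extraction.
tar -tf ourarchive.tar
```

Extraction copies files out of the archive rather than moving them, which is why the same tarball can be extracted any number of times.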
169 00:09:39,490 --> 00:09:44,650 So it's like a reusable bag that maybe you get from shops. 170 00:09:44,650 --> 00:09:45,880 Okay, so that's pretty awesome, 171 00:09:45,880 --> 00:09:46,330 right? Now, 172 00:09:46,330 --> 00:09:48,850 I'm going to break the video here in the interest of time. 173 00:09:48,850 --> 00:09:53,170 And in the next video, we're going to be talking about how we can compress our tarballs to make them 174 00:09:53,170 --> 00:09:53,740 smaller. 175 00:09:53,740 --> 00:09:56,920 So for all that goodness, I'll see you in the next video.