1 00:00:00,510 --> 00:00:01,330 Welcome back. 2 00:00:01,770 --> 00:00:06,560 We have walked through the FAT16 image. Let’s add a small file module to the loader. 3 00:00:07,520 --> 00:00:10,130 We first take a look at file.h. 4 00:00:12,430 --> 00:00:14,560 The structures are defined in file.h. 5 00:00:15,400 --> 00:00:18,220 As you can see, we have the BIOS parameter block 6 00:00:19,370 --> 00:00:23,880 and the directory entry structure, which we discussed in detail in the last video. 7 00:00:24,790 --> 00:00:27,940 Remember we loaded the FAT16 image to this address, 8 00:00:28,940 --> 00:00:34,520 so we define a constant fs base to represent the base address of the file system image. 9 00:00:35,670 --> 00:00:39,810 OK, let’s move to file.c and see what we have in the file. 10 00:00:42,160 --> 00:00:50,620 The header files we will use are file.h, lib.h, print.h, debug.h and stdbool.h. 11 00:00:51,840 --> 00:00:57,480 Just as we did with other modules, the first function we will implement is the initialization function. 12 00:00:58,260 --> 00:01:04,650 We name it init fs. It takes no parameters and doesn’t return a value. 13 00:01:05,560 --> 00:01:11,050 In the function, the first thing we do is get the address of the BIOS parameter block, 14 00:01:11,680 --> 00:01:14,650 because all the essential data is stored in this structure. 15 00:01:15,610 --> 00:01:23,980 The function we use in this case is get fs bpb. After we get the address of the BPB, 16 00:01:23,980 --> 00:01:24,820 we perform a simple check. 17 00:01:25,940 --> 00:01:32,300 The last two bytes of the first sector should be 0x55 and 0xAA. If we find that they are not equal to these values, 18 00:01:32,300 --> 00:01:39,290 we print invalid signature and assert false to stop the system, 19 00:01:39,290 --> 00:01:41,810 because if that happens, what we have at this location is not the image. 20 00:01:42,920 --> 00:01:45,440 OK, that's it for the initialization function.
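The signature check described above can be sketched in a few lines of C. This is only an illustration, not the code shown in the video: the names `has_boot_signature`, `check_fs_sector` and `demo_signature_check` are made up here, and the standard `assert` stands in for the assertion macro from the course's debug.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Offset of the first signature byte in a 512-byte sector. */
#define SIGNATURE_OFFSET 510

/* Returns true when the sector ends with the bytes 0x55, 0xAA. */
static bool has_boot_signature(const uint8_t *sector)
{
    return sector[SIGNATURE_OFFSET] == 0x55 &&
           sector[SIGNATURE_OFFSET + 1] == 0xAA;
}

/* Hypothetical counterpart of the check in the initialization function:
 * stop immediately when the signature is missing. */
static void check_fs_sector(const uint8_t *sector)
{
    assert(has_boot_signature(sector));
}

/* Hypothetical self-check: a blank sector must fail, a signed one must pass. */
static bool demo_signature_check(void)
{
    uint8_t sector[512] = {0};

    if (has_boot_signature(sector))
        return false;

    sector[510] = 0x55;
    sector[511] = 0xAA;
    check_fs_sector(sector);
    return has_boot_signature(sector);
}
```

In the loader, the sector passed in would be the one that get fs bpb points to.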
21 00:01:46,710 --> 00:01:50,550 Next we get to the function get fs bpb. 22 00:01:51,870 --> 00:01:58,430 As you can see, this function returns a pointer to the structure BPB. We have parsed the FAT16 image manually, 23 00:01:58,440 --> 00:02:03,410 and the whole image file is loaded at the address fs base, 24 00:02:04,020 --> 00:02:09,550 so we first find the partition entry at offset 0x1BE. 25 00:02:09,570 --> 00:02:12,080 The starting LBA is located at offset 8 within the entry, 26 00:02:12,300 --> 00:02:14,520 so we add 8 to the offset. 27 00:02:15,730 --> 00:02:21,400 The LBA value is 4 bytes long. We define a variable lba to hold the LBA value. 28 00:02:22,600 --> 00:02:28,630 With the starting LBA retrieved, we locate the FAT16 partition by adding the LBA offset 29 00:02:28,630 --> 00:02:29,780 to the base address. 30 00:02:31,200 --> 00:02:38,310 The sector is assumed to be 512 bytes, so here we multiply the LBA by 512. 31 00:02:38,810 --> 00:02:42,500 The result is the base address of the partition and we return this value. 32 00:02:43,780 --> 00:02:46,720 Alright, at this point, the initialization is done. 33 00:02:48,200 --> 00:02:53,930 Next we will find the kernel file. To find a file, we need to locate the root directory section, 34 00:02:54,200 --> 00:02:57,020 where each entry represents a file or folder. 35 00:02:58,440 --> 00:03:00,630 So first off, we write the function 36 00:03:01,840 --> 00:03:03,130 get root directory. 37 00:03:04,200 --> 00:03:09,630 The function returns the base address of the root directory section, which stores the array of directory entries. 38 00:03:09,630 --> 00:03:15,060 So the type of the value is a pointer to the structure directory entry. 39 00:03:16,050 --> 00:03:20,940 In the function, we use the info stored in the BPB to locate the directory section. 40 00:03:21,900 --> 00:03:27,900 So we define a variable bpb and get the base address with the function get fs bpb. 41 00:03:29,230 --> 00:03:34,900 Next we calculate the offset.
As we learned in the last lecture, the root directory section follows the FAT tables. 42 00:03:34,900 --> 00:03:40,560 So we multiply the table size by the count of FAT tables. 43 00:03:41,260 --> 00:03:43,690 And don’t forget to add the reserved sectors. 44 00:03:44,820 --> 00:03:48,600 Then we multiply by the bytes per sector. The result is the offset. 45 00:03:49,730 --> 00:03:51,890 We add it to the base of the partition 46 00:03:52,540 --> 00:03:55,220 and this is the base address of the root directory section. 47 00:03:56,310 --> 00:04:02,340 The next function we will implement is get root directory count, 48 00:04:02,360 --> 00:04:05,130 which returns the count of entries the root directory can hold. 49 00:04:06,430 --> 00:04:14,080 The count value is also stored in the BIOS parameter block. So we get the BPB, and the field root entry count stores the value 50 00:04:14,080 --> 00:04:14,890 . 51 00:04:15,400 --> 00:04:16,600 So we simply return 52 00:04:16,810 --> 00:04:18,149 the value root entry count. 53 00:04:19,290 --> 00:04:23,640 OK, with the root directory info retrieved, let’s see how to search for a file. 54 00:04:25,200 --> 00:04:28,320 To search for a file, we define a function search file. 55 00:04:30,040 --> 00:04:36,010 It takes one parameter, the path name. The path name includes the file name and the extension name. 56 00:04:36,940 --> 00:04:38,660 The character . separates them. 57 00:04:39,040 --> 00:04:42,730 Therefore, we will split the path name before we search for the file. 58 00:04:44,240 --> 00:04:49,970 As you can see, the file name and extension name are laid out like this: the file name is 8 bytes of data 59 00:04:49,970 --> 00:04:52,830 and the extension name is 3 bytes of data. 60 00:04:53,270 --> 00:04:58,580 If we have fewer characters than that, the remaining bytes are filled with spaces.
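The two address calculations described earlier (finding the partition through the starting LBA, then skipping the reserved sectors and the FAT tables to reach the root directory) can be sketched as below. This is a rough illustration under stated assumptions: `BpbInfo`, its field names, and the two function names are invented for this example, and the sketch reads the LBA byte by byte instead of casting to a packed structure as the lecture's code presumably does.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SIZE      512u    /* sector size assumed in the lecture */
#define PARTITION_ENTRY  0x1BEu  /* first partition entry in the first sector */
#define LBA_FIELD_OFFSET 8u      /* starting LBA inside the partition entry */

/* Subset of the BPB fields needed here; the names are illustrative. */
struct BpbInfo {
    uint16_t bytes_per_sector;
    uint16_t reserved_sector_count;
    uint8_t  fat_count;
    uint16_t sectors_per_fat;
};

/* Byte offset of the FAT16 partition from the start of the image. */
static uint32_t partition_offset(const uint8_t *image)
{
    const uint8_t *p = image + PARTITION_ENTRY + LBA_FIELD_OFFSET;

    /* The 4-byte LBA is stored little-endian. */
    uint32_t lba = (uint32_t)p[0] |
                   (uint32_t)p[1] << 8 |
                   (uint32_t)p[2] << 16 |
                   (uint32_t)p[3] << 24;

    return lba * SECTOR_SIZE;
}

/* Byte offset of the root directory from the start of the partition:
 * (reserved sectors + FAT count * sectors per FAT) * bytes per sector. */
static uint32_t root_directory_offset(const struct BpbInfo *bpb)
{
    uint32_t sectors = (uint32_t)bpb->reserved_sector_count +
                       (uint32_t)bpb->fat_count * bpb->sectors_per_fat;

    return sectors * bpb->bytes_per_sector;
}
```

For example, with 4 reserved sectors, 2 FAT tables of 32 sectors each and 512-byte sectors, the root directory starts (4 + 2 * 32) * 512 = 34816 bytes into the partition.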
61 00:05:00,130 --> 00:05:07,120 Since we search for files in the root directory section, we define the variables root directory count and directory entry pointer 62 00:05:07,120 --> 00:05:07,690 . 63 00:05:09,500 --> 00:05:16,190 Now we split the path by calling the function split path, which takes 3 arguments, that is 64 00:05:16,190 --> 00:05:16,480 , 65 00:05:16,490 --> 00:05:19,790 the path name we want to split, the file name and the extension name. 66 00:05:21,030 --> 00:05:27,420 After we return from it, we check its status. If the status is true, we will search for the file 67 00:05:27,420 --> 00:05:33,810 using the file name and extension name. Otherwise we return the max value to the caller, 68 00:05:33,810 --> 00:05:36,540 indicating that the file is not found. 69 00:05:37,630 --> 00:05:38,680 OK, let's continue. 70 00:05:40,340 --> 00:05:45,710 To find the file in the root directory, we use the directory pointer and the directory count. 71 00:05:47,650 --> 00:05:53,050 In the for loop, we loop through each of the entries to see if the file is there. 72 00:05:53,050 --> 00:06:00,760 If the first byte of the file name is 0, meaning that the entry is not used, or the first byte is 0xE5, 73 00:06:00,760 --> 00:06:02,920 which means the file in the entry is deleted, 74 00:06:03,840 --> 00:06:05,860 we continue the loop. 75 00:06:07,550 --> 00:06:10,760 The next check we will perform is for long file names. 76 00:06:11,420 --> 00:06:17,500 We don't support the long file name feature in the system, so if we find a long file name entry, 77 00:06:17,780 --> 00:06:21,560 we just skip this entry. The attribute of a long file name entry 78 00:06:21,560 --> 00:06:23,160 is 0x0F. 79 00:06:24,240 --> 00:06:28,920 So if we find that the attribute is equal to 0x0F, we continue the loop. 80 00:06:30,580 --> 00:06:36,100 If all the checks pass, we do the comparison, which is done in the function is file name equal.
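The search loop described above might look roughly like the following sketch. It is not the code from the video: `DirEntry`, `search_entries`, `demo_search` and the constant names are assumptions made for this example, the name comparison is written with `memcmp`, and any handling of letter case is left out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ENTRY_EMPTY    0x00u        /* first name byte of an unused entry */
#define ENTRY_DELETED  0xE5u        /* first name byte of a deleted entry */
#define ATTR_LONG_NAME 0x0Fu        /* attribute of a long file name entry */
#define NOT_FOUND      0xFFFFFFFFu  /* "max value" returned on failure */

/* 32-byte directory entry; only the fields used here are named. */
struct DirEntry {
    uint8_t name[8];
    uint8_t ext[3];
    uint8_t attributes;
    uint8_t rest[20];
};

/* Returns the index of the matching entry, or NOT_FOUND.
 * name and ext must already be padded with spaces to 8 and 3 bytes. */
static uint32_t search_entries(const struct DirEntry *dir, uint32_t count,
                               const char *name, const char *ext)
{
    for (uint32_t i = 0; i < count; i++) {
        /* Skip unused and deleted entries. */
        if (dir[i].name[0] == ENTRY_EMPTY || dir[i].name[0] == ENTRY_DELETED)
            continue;

        /* Long file names are not supported, so skip those entries too. */
        if (dir[i].attributes == ATTR_LONG_NAME)
            continue;

        if (memcmp(dir[i].name, name, 8) == 0 &&
            memcmp(dir[i].ext, ext, 3) == 0)
            return i;
    }

    return NOT_FOUND;
}

/* Hypothetical self-check: a deleted entry, a long-name entry, then the file. */
static uint32_t demo_search(void)
{
    struct DirEntry dir[3];

    memset(dir, 0, sizeof dir);
    dir[0].name[0] = ENTRY_DELETED;
    memcpy(dir[1].name, "LFNPART ", 8);
    dir[1].attributes = ATTR_LONG_NAME;
    memcpy(dir[2].name, "KERNEL  ", 8);
    memcpy(dir[2].ext, "BIN", 3);

    return search_entries(dir, 3, "KERNEL  ", "BIN");
}
```

Returning the index rather than a pointer matches the transcript, where the caller later uses the index to locate the entry in the root directory.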
81 00:06:36,940 --> 00:06:43,430 We pass the current root directory entry, the file name and the extension name. If they are equal, 82 00:06:43,450 --> 00:06:47,260 this is the file we are looking for and we return the index to the caller. 83 00:06:48,210 --> 00:06:50,610 OK, that's it for the function search file. 84 00:06:51,360 --> 00:06:54,510 Let’s take a look at the function is file name equal. 85 00:06:55,840 --> 00:07:01,180 This function is simple. All it does is compare the file name as well as the extension name with the directory entry 86 00:07:01,180 --> 00:07:02,140 . 87 00:07:03,100 --> 00:07:06,850 If they are equal, we return true, otherwise we return false. 88 00:07:08,120 --> 00:07:08,850 Let's move on. 89 00:07:09,380 --> 00:07:13,970 We haven’t talked about split path. Let’s see what we have in this function. 90 00:07:15,860 --> 00:07:22,010 As you can see in this function, the first part is for parsing the file name 91 00:07:22,010 --> 00:07:24,590 and the second part is for the extension name if the file has one. 92 00:07:26,230 --> 00:07:31,330 So when we retrieve the file name, we loop through each character of the path name 93 00:07:31,330 --> 00:07:37,000 until we get 8 characters, or we find the character ., which means the following characters are for the extension name, 94 00:07:37,000 --> 00:07:40,840 or we reach the end of the path name, the null character. 95 00:07:42,370 --> 00:07:48,700 For simplicity, we don’t support subfolders in the system, so we don’t allow a forward slash in the path name 96 00:07:48,700 --> 00:07:49,260 . 97 00:07:49,780 --> 00:07:52,030 If we find one, we return false. 98 00:07:53,290 --> 00:07:56,710 As for normal characters, we just copy them into the name buffer. 99 00:07:57,920 --> 00:08:01,550 After we exit the loop, the file name is stored in the buffer.
100 00:08:02,810 --> 00:08:04,850 Then we check if the current character is a dot. 101 00:08:05,920 --> 00:08:08,620 If it is a dot, we will get the extension name. 102 00:08:09,760 --> 00:08:14,920 Here we increment i to make it point to the next character, which is the start of the extension name 103 00:08:14,920 --> 00:08:15,640 . 104 00:08:16,620 --> 00:08:23,390 In the for loop, we retrieve at most 3 characters. We also check for the null character. If it is null, 105 00:08:23,430 --> 00:08:25,080 then we have reached the end of the name. 106 00:08:26,330 --> 00:08:31,970 We don’t allow a forward slash in the extension name either. So if we find one, we return false. 107 00:08:33,470 --> 00:08:36,530 Next we copy the characters to the extension name buffer. 108 00:08:37,549 --> 00:08:44,000 After it is done, we check the current character to see if it is the null character. If it’s not the null character 109 00:08:44,270 --> 00:08:44,870 , 110 00:08:44,910 --> 00:08:48,200 it means that we still have characters left after we retrieve the name, 111 00:08:48,740 --> 00:08:50,420 and this is not a valid name. 112 00:08:50,750 --> 00:08:51,920 So we return false. 113 00:08:52,640 --> 00:08:54,470 Otherwise, we return true to the caller. 114 00:08:56,010 --> 00:08:59,700 What we are going to do next is load the file. 115 00:09:00,570 --> 00:09:02,190 So let's take a look at the function 116 00:09:03,990 --> 00:09:04,740 load file. 117 00:09:06,040 --> 00:09:11,260 It takes two parameters, the path name and the memory address we want to load the file into. 118 00:09:12,430 --> 00:09:20,750 First off, we define a few variables: index, which holds the directory entry index, the file size, 119 00:09:20,770 --> 00:09:28,090 the first cluster index of the file, the directory entry pointer, and the return value, which is initialized with -1 120 00:09:28,090 --> 00:09:28,360 . 121 00:09:29,460 --> 00:09:32,700 To load a file, the first thing we will do is search for the file.
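The path splitting described earlier (up to 8 name characters, an optional dot, up to 3 extension characters, no forward slash, nothing left over) can be sketched like this. The signature and the space pre-fill of both buffers are assumptions for this example, and `splits_to` is a made-up helper used only to exercise the function.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Splits "name.ext" into an 8-byte name and a 3-byte extension,
 * both padded with spaces. Returns false for an invalid path. */
static bool split_path(const char *path, char *name, char *ext)
{
    int i = 0;

    memset(name, ' ', 8);
    memset(ext, ' ', 3);

    /* File name: stop at 8 characters, a dot, or the end of the string. */
    for (; i < 8 && path[i] != '.' && path[i] != '\0'; i++) {
        if (path[i] == '/')
            return false;   /* subfolders are not supported */
        name[i] = path[i];
    }

    /* Extension: skip the dot, then take at most 3 characters. */
    if (path[i] == '.') {
        i++;
        for (int j = 0; j < 3 && path[i] != '\0'; i++, j++) {
            if (path[i] == '/')
                return false;
            ext[j] = path[i];
        }
    }

    /* Anything left over means the name or extension was too long. */
    return path[i] == '\0';
}

/* Hypothetical helper: true when path splits into exactly name8 and ext3. */
static bool splits_to(const char *path, const char *name8, const char *ext3)
{
    char name[8];
    char ext[3];

    return split_path(path, name, ext) &&
           memcmp(name, name8, 8) == 0 &&
           memcmp(ext, ext3, 3) == 0;
}
```

With this sketch, "kernel.bin" splits into "kernel" padded to 8 bytes and "bin", while a path such as "file.text" is rejected because a character remains after the 3-character extension.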
122 00:09:33,640 --> 00:09:39,400 If the file exists, the return value is the index of the root directory entry. So we check the return value 123 00:09:39,400 --> 00:09:39,730 . 124 00:09:40,790 --> 00:09:42,960 If it’s not equal to the maximum value, 125 00:09:42,990 --> 00:09:45,000 it means that the index is valid. 126 00:09:46,000 --> 00:09:52,480 We get the root directory with the function get root directory, then locate the entry with the index. 127 00:09:53,880 --> 00:09:58,260 The data we want from the directory entry is the file size and the cluster index. 128 00:09:59,010 --> 00:10:01,380 They will be passed to the function read file. 129 00:10:02,770 --> 00:10:08,800 The function read file reads the data of the file to the memory address we specify. The return value of read file 130 00:10:08,800 --> 00:10:10,070 , 131 00:10:10,120 --> 00:10:11,950 is the size of the data it actually reads. 132 00:10:13,120 --> 00:10:16,420 If it reads all the data, we return 0 to the caller. 133 00:10:16,960 --> 00:10:21,400 Otherwise we return -1, indicating that the file was not loaded successfully. 134 00:10:22,520 --> 00:10:24,650 OK, let's move to the read file function. 135 00:10:25,710 --> 00:10:32,250 The first argument is the first cluster index, followed by the memory address we want to load the data into. 136 00:10:32,700 --> 00:10:35,930 The size of the data is specified in the last argument. 137 00:10:36,540 --> 00:10:41,880 As you can see, the actual work of reading a file is done in the function read raw data, 138 00:10:42,900 --> 00:10:47,050 and the read file function simply wraps read raw data and returns the size. 139 00:10:48,360 --> 00:10:53,610 Before we get to the function read raw data, we need other functions to retrieve the cluster info. 140 00:10:54,540 --> 00:10:59,990 We saw how to find all the clusters of a file with the file allocation table in the last video. 141 00:11:01,140 --> 00:11:08,310 The first piece of data we need is the address of the FAT table.
We start with the function get fat table. 142 00:11:12,240 --> 00:11:18,870 Just as we did before, we get the BIOS parameter block. Then the offset of the FAT table 143 00:11:18,870 --> 00:11:21,330 is the reserved sector count times the bytes per sector. 144 00:11:22,250 --> 00:11:24,830 Add it to the base of the partition and we are done. 145 00:11:26,470 --> 00:11:33,960 The next function is get cluster value. The parameter it takes is the cluster index. In the function, 146 00:11:33,970 --> 00:11:38,110 we define a variable fat table and get the table with the function get fat table. 147 00:11:39,200 --> 00:11:45,710 Remember each cluster value is 2 bytes, so we define a pointer to a 16-bit value. 148 00:11:47,560 --> 00:11:53,200 Getting the cluster value is simple. All we need to do is locate the item with the index and return the value 149 00:11:53,200 --> 00:11:53,370 . 150 00:11:54,690 --> 00:12:01,410 The next function is get cluster offset. It takes one parameter, which is the cluster index 151 00:12:01,410 --> 00:12:02,010 . 152 00:12:03,340 --> 00:12:09,790 To find the address the cluster value represents, we need to know the reserved sector count, the FAT table size 153 00:12:09,790 --> 00:12:12,450 and the size of the root directory section. 154 00:12:13,090 --> 00:12:18,880 So here we define 3 variables: reserved size, fat size and directory section size. 155 00:12:20,280 --> 00:12:26,950 Remember the cluster value starts from 2. We assert that the cluster index is greater than or equal to 2 156 00:12:26,950 --> 00:12:27,300 . 157 00:12:28,820 --> 00:12:32,270 Now we can get the BPB and calculate the sizes. 158 00:12:33,550 --> 00:12:37,360 The reserved size is the reserved sector count times the sector size. 159 00:12:38,490 --> 00:12:45,150 The FAT table size is the FAT count * FAT table sector count * sector size. 160 00:12:46,560 --> 00:12:51,750 The directory section size is the directory entry count * the size of a directory entry.
161 00:12:53,220 --> 00:12:56,070 With these three variables, we add them together. 162 00:12:57,240 --> 00:13:03,740 The cluster index starts from 2, so we subtract 2 from it, multiply by the cluster size, and add the result to the sum. 163 00:13:04,640 --> 00:13:07,010 OK, this is the offset of the cluster. 164 00:13:08,780 --> 00:13:18,020 The last one is get cluster size. The cluster size is derived from the BIOS parameter block, so we first get the BPB. 165 00:13:18,020 --> 00:13:24,500 Then we multiply the bytes per sector by the sectors per cluster, and this is the cluster size 166 00:13:24,500 --> 00:13:24,950 . 167 00:13:26,670 --> 00:13:31,860 OK, all the functions are prepared. We can implement read raw data. 168 00:13:33,740 --> 00:13:36,710 The function returns the actual size of the data being read. 169 00:13:37,970 --> 00:13:44,540 The variables we need in this function include bpb, the data pointer, which points to 170 00:13:44,540 --> 00:13:47,120 the data we want to copy, 171 00:13:47,120 --> 00:13:49,320 the read size, which indicates how much data we have read, 172 00:13:49,820 --> 00:13:52,430 the cluster size and the cluster index. 173 00:13:54,080 --> 00:13:58,670 First off, we retrieve the BIOS parameter block and the cluster size. 174 00:13:59,950 --> 00:14:03,310 The index here is used to locate the FAT table entries. 175 00:14:04,390 --> 00:14:10,570 So the first index is the starting cluster index. Since we could have multiple clusters in the file, 176 00:14:11,120 --> 00:14:14,920 the while loop is used to read all the clusters until we get to the end, 177 00:14:15,790 --> 00:14:20,890 which in this case is when the read size is greater than or equal to the size. 178 00:14:22,450 --> 00:14:29,350 OK, now we get the address of the data using the cluster index. The function is get cluster offset, and 179 00:14:29,350 --> 00:14:35,560 we add its result to the address of the partition, and this is the address of the data indicated by the cluster index 180 00:14:35,560 --> 00:14:36,070 .
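The cluster size and cluster offset calculations described earlier can be sketched as follows. As before, this is an illustration rather than the lecture's code: `BpbInfo` and its field names are invented, the functions take the BPB as a parameter instead of fetching it themselves, and the standard `assert` replaces the course's own macro. The 32-byte directory entry size is the standard FAT value.

```c
#include <assert.h>
#include <stdint.h>

#define DIR_ENTRY_SIZE 32u  /* size of one FAT directory entry in bytes */

/* Subset of the BPB fields needed here; the names are illustrative. */
struct BpbInfo {
    uint16_t bytes_per_sector;
    uint8_t  sectors_per_cluster;
    uint16_t reserved_sector_count;
    uint8_t  fat_count;
    uint16_t root_entry_count;
    uint16_t sectors_per_fat;
};

/* Cluster size in bytes: bytes per sector * sectors per cluster. */
static uint32_t get_cluster_size(const struct BpbInfo *bpb)
{
    return (uint32_t)bpb->bytes_per_sector * bpb->sectors_per_cluster;
}

/* Byte offset of a data cluster from the start of the partition. */
static uint32_t get_cluster_offset(const struct BpbInfo *bpb, uint32_t index)
{
    assert(index >= 2);     /* cluster numbering starts from 2 */

    uint32_t reserved_size = (uint32_t)bpb->reserved_sector_count *
                             bpb->bytes_per_sector;
    uint32_t fat_size = (uint32_t)bpb->fat_count * bpb->sectors_per_fat *
                        bpb->bytes_per_sector;
    uint32_t dir_size = (uint32_t)bpb->root_entry_count * DIR_ENTRY_SIZE;

    return reserved_size + fat_size + dir_size +
           (index - 2) * get_cluster_size(bpb);
}
```

As a worked example, with 512-byte sectors, 4 sectors per cluster, 4 reserved sectors, 2 FAT tables of 32 sectors and 512 root entries, the data area starts at 2048 + 32768 + 16384 = 51200 bytes, so cluster 2 is at offset 51200 and cluster 3 at 53248.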
181 00:14:36,970 --> 00:14:39,370 Then we go ahead and get the next cluster value 182 00:14:39,370 --> 00:14:43,960 using the index, and the return value is the next cluster index. 183 00:14:45,220 --> 00:14:51,850 If the next cluster index is greater than 0xFFF7, then it means that this is the last cluster, 184 00:14:51,970 --> 00:14:55,240 so we copy the data to the buffer and then break out of the loop. 185 00:14:56,550 --> 00:15:03,120 If the next cluster index is less than 0xFFF7, then it’s a valid cluster index 186 00:15:03,120 --> 00:15:04,700 and the current cluster is not the last one. 187 00:15:05,460 --> 00:15:09,000 So we simply copy one cluster of data to the buffer. 188 00:15:09,000 --> 00:15:10,920 Then we update the buffer and the read size. 189 00:15:11,670 --> 00:15:15,420 Since we copied one cluster of data, we add one cluster size. 190 00:15:16,410 --> 00:15:19,980 After we are done, we return the actual size of the data being read. 191 00:15:21,110 --> 00:15:26,540 And also, here we can perform a simple check. The cluster index should start from 2. 192 00:15:26,930 --> 00:15:32,780 So if it is less than 2, we simply return the max value, indicating that the read operation failed. 193 00:15:34,070 --> 00:15:36,080 OK, we have finished the file module. 194 00:15:38,020 --> 00:15:40,120 Now, let's go to the main function of the loader. 195 00:15:42,120 --> 00:15:42,990 We include 196 00:15:46,320 --> 00:15:51,450 the file header, and in the main function, we get rid of the printk function 197 00:15:52,790 --> 00:15:54,270 and initialize the file system. 198 00:15:58,420 --> 00:16:06,540 After the initialization is done, we can load the kernel file and the user file. So we call load file with 199 00:16:06,550 --> 00:16:07,840 kernel.bin 200 00:16:10,030 --> 00:16:13,030 and the base address is set to 200000. 201 00:16:17,840 --> 00:16:18,700 Then we load 202 00:16:19,830 --> 00:16:20,970 user.bin 203 00:16:22,880 --> 00:16:26,600 and the base address we want to load the file into is 30000.
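The cluster-chain walk described earlier for read raw data can be sketched as below. Several things here are assumptions and not the lecture's code: the `Volume` structure bundles the FAT table, the address of cluster 2 and the cluster size so the example stays self-contained, every copy is clamped to the bytes still wanted, and the loop also stops on a value below 2 or on the bad-cluster marker 0xFFF7.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define CHAIN_STOP_MIN 0xFFF7u      /* 0xFFF7 is bad cluster, above is end of chain */
#define READ_FAILED    0xFFFFFFFFu  /* "max value" returned on failure */

/* Hypothetical in-memory view of a FAT16 volume. */
struct Volume {
    const uint16_t *fat;            /* FAT table, one 16-bit value per cluster */
    const uint8_t  *data;           /* address of cluster 2 */
    uint32_t        cluster_size;   /* bytes per cluster */
};

/* Copies up to size bytes of the file starting at cluster into buffer.
 * Returns the number of bytes actually copied, or READ_FAILED. */
static uint32_t read_chain(const struct Volume *vol, uint32_t cluster,
                           uint8_t *buffer, uint32_t size)
{
    uint32_t read_size = 0;

    if (cluster < 2)
        return READ_FAILED;

    while (read_size < size) {
        const uint8_t *src = vol->data + (cluster - 2) * vol->cluster_size;
        uint32_t next = vol->fat[cluster];
        uint32_t remaining = size - read_size;
        uint32_t chunk = remaining < vol->cluster_size ? remaining
                                                       : vol->cluster_size;

        memcpy(buffer, src, chunk);
        buffer += chunk;
        read_size += chunk;

        /* Stop when the chain ends or the next value is not usable. */
        if (next < 2 || next >= CHAIN_STOP_MIN)
            break;

        cluster = next;
    }

    return read_size;
}

/* Hypothetical self-check: a 6-byte file stored in clusters 2 -> 4. */
static int demo_read_chain(void)
{
    const uint16_t fat[5] = {0xFFF8, 0xFFFF, 4, 0, 0xFFFF};
    const uint8_t data[12] = {'a', 'b', 'c', 'd', 'x', 'x', 'x', 'x',
                              'e', 'f', 'g', 'h'};
    struct Volume vol = {fat, data, 4};
    uint8_t out[6] = {0};

    uint32_t n = read_chain(&vol, 2, out, sizeof out);
    return n == 6 && memcmp(out, "abcdef", 6) == 0;
}
```

A caller in the style of load file would compare the returned size with the file size from the directory entry and report success only when they match.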
204 00:16:28,790 --> 00:16:36,980 Since what we do here is load the kernel, we use an assertion to do the check, so we include debug.h 205 00:16:37,640 --> 00:16:38,480 and assert 206 00:16:41,020 --> 00:16:43,270 that the load file calls succeed. 207 00:16:47,600 --> 00:16:49,160 After the main function is done, 208 00:16:51,120 --> 00:16:53,100 let's go to the entry assembly file. 209 00:16:55,000 --> 00:17:00,550 As you can see, after we return from the main function, we jump to the infinite loop. 210 00:17:00,550 --> 00:17:04,829 Since we have loaded the kernel file, we can jump to the kernel. 211 00:17:05,470 --> 00:17:07,230 So we mov rax, 212 00:17:08,440 --> 00:17:12,849 the virtual address of the kernel, which is the same as we saw in the previous lectures. 213 00:17:13,920 --> 00:17:15,210 Then we jump to rax. 214 00:17:17,589 --> 00:17:20,740 If everything goes well, we should see kernel is running. 215 00:17:22,800 --> 00:17:24,359 Since we added a new module, 216 00:17:25,980 --> 00:17:28,800 we need to add the file to the build script 217 00:17:32,560 --> 00:17:35,620 and add file.o to the linker command. 218 00:17:36,600 --> 00:17:38,360 OK, the loader project is done. 219 00:17:39,310 --> 00:17:40,660 Let's build the loader project. 220 00:17:42,880 --> 00:17:46,930 We open the terminal and navigate to the loader directory. 221 00:17:48,520 --> 00:17:49,750 We run the build script. 222 00:17:52,450 --> 00:17:56,140 OK, the loader.bin file is written into the image. 223 00:17:57,180 --> 00:18:02,730 Another thing I need to mention is that in the last section, the kernel needed 3 user programs 224 00:18:02,730 --> 00:18:04,230 to run properly, 225 00:18:05,030 --> 00:18:08,930 and now we have only loaded one user program in the loader. 226 00:18:09,980 --> 00:18:12,980 So in the main function of the kernel project, 227 00:18:14,080 --> 00:18:21,580 the only thing we are going to do in kmain is print a message, so we comment out all the other functions.
228 00:18:24,130 --> 00:18:28,270 And we print kernel is running. 229 00:18:31,820 --> 00:18:33,050 OK, let's see the result. 230 00:18:36,330 --> 00:18:37,860 So we build the kernel project. 231 00:18:39,630 --> 00:18:42,660 We haven't changed the build script yet, so let's do some editing. 232 00:18:50,860 --> 00:18:56,160 The loader.bin file is not used here, and neither is the boot.bin file. 233 00:19:00,290 --> 00:19:04,070 So we don't write boot.bin and loader.bin to the image. 234 00:19:06,890 --> 00:19:14,330 As we have talked about in this section, we don't write the kernel.bin file and user files into the OS image. 235 00:19:15,260 --> 00:19:17,960 So the commands here are not used either. 236 00:19:19,290 --> 00:19:23,760 The commands we keep here are only for compiling the kernel project. 237 00:19:24,690 --> 00:19:26,040 OK, let's build the project. 238 00:19:32,980 --> 00:19:40,960 Before we run bochs, we have to mount the image so that we can copy user.bin and kernel.bin to this image. 239 00:19:51,420 --> 00:19:57,090 Since we don't have a user.bin file in this lecture, we just rename boot.bin to user.bin 240 00:19:57,240 --> 00:20:02,790 and copy kernel.bin and user.bin to the image. 241 00:20:11,680 --> 00:20:13,810 OK, now we unmount the image. 242 00:20:17,590 --> 00:20:19,810 Let's run bochs to see what we get. 243 00:20:24,650 --> 00:20:28,190 Alright, you can see the message kernel is running printed on the screen. 244 00:20:29,260 --> 00:20:30,610 That's it for this lecture. 245 00:20:30,700 --> 00:20:32,110 See you in the next video.