1
00:00:00,510 --> 00:00:01,330
Welcome back.

2
00:00:01,770 --> 00:00:06,560
We have walked through the FAT16 image. Let’s add a small file module to the loader.

3
00:00:07,520 --> 00:00:10,130
We first take a look at file.h.

4
00:00:12,430 --> 00:00:14,560
The structures are defined in file.h.

5
00:00:15,400 --> 00:00:18,220
As you can see, we have the BIOS parameter block

6
00:00:19,370 --> 00:00:23,880
and the directory entry structure, which we discussed in detail in the last video.

7
00:00:24,790 --> 00:00:27,940
Remember, we loaded the FAT16 image to this address,

8
00:00:28,940 --> 00:00:34,520
so we define a constant fs base to represent the base address of the image file.

9
00:00:35,670 --> 00:00:39,810
OK, let’s move to file.c and see what we have in the file.

10
00:00:42,160 --> 00:00:50,620
The header files we will use include file.h, lib.h, print.h, debug.h and stdbool.h.

11
00:00:51,840 --> 00:00:57,480
Just as we did with other modules, the first function we will implement is the initialization function.

12
00:00:58,260 --> 00:01:04,650
We name it init fs. It takes no parameters and doesn’t return a value.

13
00:01:05,560 --> 00:01:11,050
In the function, the first thing we do is get the address of the BIOS parameter block,

14
00:01:11,680 --> 00:01:14,650
because all the essential data is stored in this structure.

15
00:01:15,610 --> 00:01:23,980
The function we use in this case is get fs bpb. After we get the address of the BPB,

16
00:01:23,980 --> 00:01:24,820
we perform a simple check.

17
00:01:25,940 --> 00:01:32,300
The last two bytes of the first sector should be 0x55 and 0xAA. If they don’t match these values,

18
00:01:32,300 --> 00:01:39,290
we print invalid signature and assert false to stop the system,

19
00:01:39,290 --> 00:01:41,810
because in that case what we have at this location is not the image.

20
00:01:42,920 --> 00:01:45,440
OK, that's it for the initialization function.
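As a rough sketch of that check, assuming the sector is available as a plain byte buffer, it could look like this. The name has_boot_signature is hypothetical and not taken from the lecture code, which prints a message and asserts instead of returning a flag.
```c
#include <stdbool.h>
#include <stdint.h>
/* Hypothetical helper: true if a 512-byte sector ends with the
 * boot signature bytes 0x55 and 0xAA (offsets 510 and 511). */
bool has_boot_signature(const uint8_t *sector)
{
    return sector[510] == 0x55 && sector[511] == 0xAA;
}
```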

21
00:01:46,710 --> 00:01:50,550
Next we get to the function get fs bpb.

22
00:01:51,870 --> 00:01:58,430
As you can see, this function returns a pointer to the structure bpb. As we have parsed the FAT16 image manually

23
00:01:58,440 --> 00:02:03,410
and the whole image file is loaded at the address fs base,

24
00:02:04,020 --> 00:02:09,550
we first find the partition entry at offset 0x1BE.

25
00:02:09,570 --> 00:02:12,080
The starting LBA is located at offset 8 within the entry,

26
00:02:12,300 --> 00:02:14,520
so we add 8 to the offset.

27
00:02:15,730 --> 00:02:21,400
The LBA value is 4 bytes long. We define a variable lba to hold it.

28
00:02:22,600 --> 00:02:28,630
With the starting LBA retrieved, we locate the FAT16 partition by adding the LBA offset

29
00:02:28,630 --> 00:02:29,780
to the base address. 

30
00:02:31,200 --> 00:02:38,310
Each sector is assumed to be 512 bytes, so here we multiply the LBA by 512.

31
00:02:38,810 --> 00:02:42,500
The result is the base address of the partition and we return this value. 
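The steps above can be sketched roughly as follows, assuming the image is passed in as a byte pointer rather than read from the fixed fs base address. The function name find_partition and the macro names are our own placeholders, not the names used in the lecture code.
```c
#include <stddef.h>
#include <stdint.h>
#define PARTITION_ENTRY_OFFSET 0x1BE /* first partition entry in the first sector */
#define LBA_FIELD_OFFSET       8     /* starting LBA within the entry */
#define SECTOR_SIZE            512   /* assumed sector size */
/* Hypothetical helper: returns a pointer to the first sector of the
 * partition, given a pointer to the start of the loaded image. */
const uint8_t *find_partition(const uint8_t *image)
{
    const uint8_t *p = image + PARTITION_ENTRY_OFFSET + LBA_FIELD_OFFSET;
    /* The 4-byte LBA is stored little-endian; assemble it byte by byte. */
    uint32_t lba = (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
                   ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
    return image + (size_t)lba * SECTOR_SIZE;
}
```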

32
00:02:43,780 --> 00:02:46,720
Alright, at this point, the initialization is done.

33
00:02:48,200 --> 00:02:53,930
Next we will find the kernel file. To find a file, we need to locate the root directory section

34
00:02:54,200 --> 00:02:57,020
where each entry represents a file or folder.

35
00:02:58,440 --> 00:03:00,630
So first off, we write function 

36
00:03:01,840 --> 00:03:03,130
get root directory. 

37
00:03:04,200 --> 00:03:09,630
The function returns the base address of the root directory section, which stores the array of directory entries.

38
00:03:09,630 --> 00:03:15,060
So the return type is a pointer to the directory entry structure.

39
00:03:16,050 --> 00:03:20,940
In the function, we use the info stored in the BPB to locate the directory section.

40
00:03:21,900 --> 00:03:27,900
So we define a variable bpb and get the base address with function get fs bpb. 

41
00:03:29,230 --> 00:03:34,900
Next we calculate the offset. As we learned in the last lecture, the root directory section follows the FAT tables,

42
00:03:34,900 --> 00:03:40,560
so we multiply the table size by the count of FAT tables.

43
00:03:41,260 --> 00:03:43,690
And don’t forget to add the reserved sectors.

44
00:03:44,820 --> 00:03:48,600
Then we multiply by the bytes per sector. The result is the offset;

45
00:03:49,730 --> 00:03:51,890
we add it to the base of the partition 

46
00:03:52,540 --> 00:03:55,220
and this is the base address of the root directory section.
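Written out as arithmetic, the calculation might look like the sketch below. The function name and parameter names are hypothetical; in the lecture code the values come from fields of the bpb structure.
```c
#include <stdint.h>
/* Hypothetical helper: byte offset of the root directory section,
 * measured from the start of the partition. */
uint32_t root_dir_offset(uint16_t reserved_sectors, uint8_t fat_count,
                         uint16_t sectors_per_fat, uint16_t bytes_per_sector)
{
    /* Sectors before the root directory: reserved area plus all FAT copies. */
    uint32_t sectors = (uint32_t)reserved_sectors +
                       (uint32_t)fat_count * sectors_per_fat;
    return sectors * bytes_per_sector;
}
```
For example, with 4 reserved sectors, 2 FAT tables of 32 sectors each and 512-byte sectors, the offset is (4 + 2 * 32) * 512 = 34816 bytes.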

47
00:03:56,310 --> 00:04:02,340
The next function we will implement is get root directory count,

48
00:04:02,360 --> 00:04:05,130
which returns the number of entries the root directory can hold.

49
00:04:06,430 --> 00:04:14,080
The count value is also stored in the BIOS parameter block. So we get the bpb, and the field root entry count stores the value

50
00:04:14,080 --> 00:04:14,890
.

51
00:04:15,400 --> 00:04:16,600
So we simply return

52
00:04:16,810 --> 00:04:18,149
the value root entry count.

53
00:04:19,290 --> 00:04:23,640
OK, with the root directory info retrieved, let’s see how to search for a file.

54
00:04:25,200 --> 00:04:28,320
To search for a file, we define a function search file.

55
00:04:30,040 --> 00:04:36,010
It takes one parameter, the path name. The path name includes the file name and the extension name.

56
00:04:36,940 --> 00:04:38,660
A dot character (.) separates them.

57
00:04:39,040 --> 00:04:42,730
Therefore, we will split the path name before we search for the file.

58
00:04:44,240 --> 00:04:49,970
As you can see, the file name and extension name are stored like this: the file name is 8 bytes of data

59
00:04:49,970 --> 00:04:52,830
and the extension name is 3 bytes of data.

60
00:04:53,270 --> 00:04:58,580
If a name has fewer characters than that, the remaining bytes are filled with spaces.

61
00:05:00,130 --> 00:05:07,120
Since we search for files in the root directory section, we define the variables root directory count and directory entry pointer

62
00:05:07,120 --> 00:05:07,690
.

63
00:05:09,500 --> 00:05:16,190
Now we split the path by calling function split path which takes 3 arguments, that is

64
00:05:16,190 --> 00:05:16,480
,

65
00:05:16,490 --> 00:05:19,790
the path name we want to split, the file name and extension name.

66
00:05:21,030 --> 00:05:27,420
After we return from it, we check its status. If the status is true, we search for the file

67
00:05:27,420 --> 00:05:33,810
using the file name and extension name. Otherwise we return the max value to the caller,

68
00:05:33,810 --> 00:05:36,540
indicating that the file is not found.

69
00:05:37,630 --> 00:05:38,680
OK, let's continue.

70
00:05:40,340 --> 00:05:45,710
To find the file in the root directory, we use directory pointer and directory count. 

71
00:05:47,650 --> 00:05:53,050
In the for loop, we loop through each of the entries to see if the file is there.  

72
00:05:53,050 --> 00:06:00,760
If the first byte of the file name is 0, the entry is not used; if it is 0xE5,

73
00:06:00,760 --> 00:06:02,920
the file in the entry has been deleted.

74
00:06:03,840 --> 00:06:05,860
In either case, we continue the loop.

75
00:06:07,550 --> 00:06:10,760
The next check we perform is for long file names.

76
00:06:11,420 --> 00:06:17,500
We don't support the long file name feature in the system, so if we find a long file name entry,

77
00:06:17,780 --> 00:06:21,560
we just skip it. The attribute of a long file name entry

78
00:06:21,560 --> 00:06:23,160
is 0x0F.

79
00:06:24,240 --> 00:06:28,920
So if the attribute is equal to 0x0F, we skip the entry and continue the loop.

80
00:06:30,580 --> 00:06:36,100
If all the checks pass, we do the comparison, which is done in the function is file name equal.

81
00:06:36,940 --> 00:06:43,430
We pass the current root directory entry, the file name and the extension name. If they are equal,

82
00:06:43,450 --> 00:06:47,260
this is the file we are looking for, and we return its index to the caller.

83
00:06:48,210 --> 00:06:50,610
OK, that's it for the function search file.
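Here is a minimal sketch of that search loop. The struct layout, the field names and the NOT_FOUND constant are assumptions made for illustration (the lecture only says "the max value"), and the comparison is inlined with memcmp instead of calling is file name equal.
```c
#include <stdint.h>
#include <string.h>
#define NOT_FOUND 0xFFFFFFFFu /* assumed "max value" meaning no match */
/* Assumed 32-byte directory entry; only the fields used here are named. */
struct DirEntry {
    uint8_t name[8];
    uint8_t ext[3];
    uint8_t attributes;
    uint8_t rest[20];
};
/* Hypothetical search: name and ext must already be space-padded. */
uint32_t search_entries(const struct DirEntry *dir, uint32_t count,
                        const char *name, const char *ext)
{
    for (uint32_t i = 0; i < count; i++) {
        /* Skip unused (0x00) and deleted (0xE5) entries. */
        if (dir[i].name[0] == 0x00 || dir[i].name[0] == 0xE5)
            continue;
        /* Skip long file name entries. */
        if (dir[i].attributes == 0x0F)
            continue;
        if (memcmp(dir[i].name, name, 8) == 0 &&
            memcmp(dir[i].ext, ext, 3) == 0)
            return i;
    }
    return NOT_FOUND;
}
```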

84
00:06:51,360 --> 00:06:54,510
Let’s take a look at the function is file name equal.

85
00:06:55,840 --> 00:07:01,180
This function is simple: all it does is compare the file name and extension name with those in the directory entry

86
00:07:01,180 --> 00:07:02,140
.

87
00:07:03,100 --> 00:07:06,850
If they are equal, we return true; otherwise we return false.

88
00:07:08,120 --> 00:07:08,850
Let's move on.

89
00:07:09,380 --> 00:07:13,970
We haven’t talked about split path. Let’s see what we have in this function.

90
00:07:15,860 --> 00:07:22,010
As you can see in this function, the first part is for parsing the file name 

91
00:07:22,010 --> 00:07:24,590
and the second part is for the extension name if the file has one. 

92
00:07:26,230 --> 00:07:31,330
So when we retrieve the file name, we loop through each character of the path name 

93
00:07:31,330 --> 00:07:37,000
until we get 8 characters, or we find the dot character, which means the following characters are for the extension name,

94
00:07:37,000 --> 00:07:40,840
or we reach the end of the path name, the null character.

95
00:07:42,370 --> 00:07:48,700
For simplicity, we don’t support subfolders in the system, so we don’t allow a forward slash in the path name

96
00:07:48,700 --> 00:07:49,260
.

97
00:07:49,780 --> 00:07:52,030
If we find one, we return false.

98
00:07:53,290 --> 00:07:56,710
As for normal characters, we just copy them into the name buffer.

99
00:07:57,920 --> 00:08:01,550
After we exit the loop, the file name is stored in the buffer.

100
00:08:02,810 --> 00:08:04,850
Then we check whether the current character is a dot.

101
00:08:05,920 --> 00:08:08,620
If it is, we get the extension name.

102
00:08:09,760 --> 00:08:14,920
Here we increment i to make it point to the next character, which is the start of the extension name

103
00:08:14,920 --> 00:08:15,640
.

104
00:08:16,620 --> 00:08:23,390
In the for loop, we retrieve at most 3 characters. We also check for the null character: if we find it,

105
00:08:23,430 --> 00:08:25,080
we have reached the end of the name.

106
00:08:26,330 --> 00:08:31,970
We don’t allow a forward slash in the extension name either, so if we find one, we return false.

107
00:08:33,470 --> 00:08:36,530
Next we copy the characters to the extension name buffer. 

108
00:08:37,549 --> 00:08:44,000
After that is done, we check the character we stopped at to see if it is the null character. If it’s not a null character

109
00:08:44,270 --> 00:08:44,870
,

110
00:08:44,910 --> 00:08:48,200
it means that there are still characters left after we retrieve the name,

111
00:08:48,740 --> 00:08:50,420
and this is not a valid name.

112
00:08:50,750 --> 00:08:51,920
So we return false.

113
00:08:52,640 --> 00:08:54,470
Otherwise, we return true to the caller.
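Putting the two parts together, a sketch of split path could look like this. It follows the steps just described, with one assumption of our own: the sketch fills both buffers with spaces itself, whereas the lecture code may prepare them in the caller.
```c
#include <stdbool.h>
#include <string.h>
/* Sketch: splits "NAME.EXT" into an 8-byte name and a 3-byte extension,
 * both space-padded and not null-terminated. */
bool split_path(const char *path, char name[8], char ext[3])
{
    int i;
    memset(name, ' ', 8); /* assumption: padding is done here */
    memset(ext, ' ', 3);
    /* Part 1: file name, up to 8 characters, stopping at '.' or the end. */
    for (i = 0; i < 8 && path[i] != '.' && path[i] != '\0'; i++) {
        if (path[i] == '/')
            return false; /* subfolders are not supported */
        name[i] = path[i];
    }
    /* Part 2: extension, up to 3 characters, only if a dot follows. */
    if (path[i] == '.') {
        i++;
        for (int j = 0; j < 3 && path[i] != '\0'; i++, j++) {
            if (path[i] == '/')
                return false;
            ext[j] = path[i];
        }
    }
    /* Leftover characters mean the name is too long to be valid. */
    return path[i] == '\0';
}
```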

114
00:08:56,010 --> 00:08:59,700
Next, we are going to load a file.

115
00:09:00,570 --> 00:09:02,190
So let's take a look at the function

116
00:09:03,990 --> 00:09:04,740
load file.

117
00:09:06,040 --> 00:09:11,260
It takes two parameters: the path name and the memory address we want to load the file into.

118
00:09:12,430 --> 00:09:20,750
First off, we define a few variables: index, which holds the directory entry index, the file size,

119
00:09:20,770 --> 00:09:28,090
the first cluster index of the file, the directory entry pointer and the return value, which is initialized with -1

120
00:09:28,090 --> 00:09:28,360
.

121
00:09:29,460 --> 00:09:32,700
To load a file, the first thing we do is search for the file.

122
00:09:33,640 --> 00:09:39,400
If the file exists, the return value is the index of the root directory entry. So we check the return value

123
00:09:39,400 --> 00:09:39,730
.

124
00:09:40,790 --> 00:09:42,960
If it’s not equal to the maximum value, 

125
00:09:42,990 --> 00:09:45,000
it means that the index is valid. 

126
00:09:46,000 --> 00:09:52,480
We get the root directory with the function get root directory, then locate the entry with the index.

127
00:09:53,880 --> 00:09:58,260
The data we want from the directory entry is the file size and the cluster index.

128
00:09:59,010 --> 00:10:01,380
They will be passed to the function read file.

129
00:10:02,770 --> 00:10:08,800
The function read file reads the data of the file to the memory address we specify. The return value of read file

130
00:10:08,800 --> 00:10:10,070
,

131
00:10:10,120 --> 00:10:11,950
is the size of data it actually reads.

132
00:10:13,120 --> 00:10:16,420
If it reads all the data, we return 0 to the caller.

133
00:10:16,960 --> 00:10:21,400
Otherwise we return -1, indicating that the file was not loaded successfully.

134
00:10:22,520 --> 00:10:24,650
OK, let's move to the read file function.

135
00:10:25,710 --> 00:10:32,250
The first argument is the first cluster index and then the memory address we want to load the data into. 

136
00:10:32,700 --> 00:10:35,930
The size of the data is specified in the last argument. 

137
00:10:36,540 --> 00:10:41,880
As you can see, the actual work of reading a file is done in the function read raw data.

138
00:10:42,900 --> 00:10:47,050
The read file function simply wraps read raw data and returns the size.

139
00:10:48,360 --> 00:10:53,610
Before we get to the function read raw data, we need other functions to retrieve the cluster info.

140
00:10:54,540 --> 00:10:59,990
We saw how to find all the clusters of a file with the file allocation table in the last video.

141
00:11:01,140 --> 00:11:08,310
The first thing we need is the address of the FAT table. We start with the function get fat table.

142
00:11:12,240 --> 00:11:18,870
Just as we did before, we get the BIOS parameter block. Then the offset of the FAT table

143
00:11:18,870 --> 00:11:21,330
is the reserved sector count times the bytes per sector. 

144
00:11:22,250 --> 00:11:24,830
Add it to the base of the partition and we are done. 

145
00:11:26,470 --> 00:11:33,960
The next function is get cluster value. The parameter it takes is the cluster index. In the function,

146
00:11:33,970 --> 00:11:38,110
we define a variable fat table and get the table with the function get fat table.

147
00:11:39,200 --> 00:11:45,710
Remember, each cluster value is 2 bytes, so we define a pointer to a 16-bit value.

148
00:11:47,560 --> 00:11:53,200
Getting the cluster value is simple: all we need to do is locate the item with the index and return the value

149
00:11:53,200 --> 00:11:53,370
.
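A small sketch combining get fat table and get cluster value might look like this. It takes the partition pointer and the two bpb values as parameters, which is an assumption for the sake of a self-contained example, and it reads the entry byte by byte instead of through a 16-bit pointer.
```c
#include <stddef.h>
#include <stdint.h>
/* Hypothetical helper: returns the 16-bit FAT entry for a cluster index.
 * The FAT starts right after the reserved sectors of the partition. */
uint16_t cluster_value(const uint8_t *partition, uint16_t reserved_sectors,
                       uint16_t bytes_per_sector, uint32_t index)
{
    const uint8_t *fat = partition + (size_t)reserved_sectors * bytes_per_sector;
    const uint8_t *entry = fat + (size_t)index * 2; /* 2 bytes per entry */
    /* Entries are stored little-endian. */
    return (uint16_t)(entry[0] | (entry[1] << 8));
}
```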

150
00:11:54,690 --> 00:12:01,410
The next function is get cluster offset. It takes one parameter, which is the cluster index

151
00:12:01,410 --> 00:12:02,010
.

152
00:12:03,340 --> 00:12:09,790
To find the address the cluster index represents, we need to know the reserved sector count, the FAT table size

153
00:12:09,790 --> 00:12:12,450
and the size of the root directory section.

154
00:12:13,090 --> 00:12:18,880
So here we define 3 variables: reserved size, fat size and directory section size.

155
00:12:20,280 --> 00:12:26,950
Remember, cluster indexes start from 2. We assert that the cluster index is greater than or equal to 2

156
00:12:26,950 --> 00:12:27,300
.

157
00:12:28,820 --> 00:12:32,270
Now we can get the bpb and calculate the sizes. 

158
00:12:33,550 --> 00:12:37,360
The reserved size is the reserved sector count times the sector size.

159
00:12:38,490 --> 00:12:45,150
The FAT table size is the FAT count times the FAT table sector count times the sector size.

160
00:12:46,560 --> 00:12:51,750
The directory section size is the directory entry count times the size of a directory entry.

161
00:12:53,220 --> 00:12:56,070
We add these three values together.

162
00:12:57,240 --> 00:13:03,740
The cluster index starts from 2, so we subtract 2 from it, multiply by the cluster size, and add the result to the sum.

163
00:13:04,640 --> 00:13:07,010
Ok this is the offset of the cluster.
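As a sketch, the whole calculation fits in a few lines. The FatGeometry struct and its field names are hypothetical stand-ins for the bpb fields, and the directory entry size is taken as 32 bytes.
```c
#include <assert.h>
#include <stdint.h>
/* Hypothetical stand-in for the bpb fields this calculation needs. */
struct FatGeometry {
    uint32_t reserved_sectors;
    uint32_t fat_count;
    uint32_t sectors_per_fat;
    uint32_t root_entry_count;
    uint32_t bytes_per_sector;
    uint32_t sectors_per_cluster;
};
/* Byte offset of a data cluster from the start of the partition. */
uint32_t cluster_offset(const struct FatGeometry *g, uint32_t index)
{
    assert(index >= 2); /* the first data cluster has index 2 */
    uint32_t reserved_size = g->reserved_sectors * g->bytes_per_sector;
    uint32_t fat_size = g->fat_count * g->sectors_per_fat * g->bytes_per_sector;
    uint32_t dir_size = g->root_entry_count * 32; /* 32 bytes per entry */
    uint32_t cluster_size = g->bytes_per_sector * g->sectors_per_cluster;
    return reserved_size + fat_size + dir_size + (index - 2) * cluster_size;
}
```
With 4 reserved sectors, 2 FAT tables of 32 sectors, 512 root entries, 512-byte sectors and 4 sectors per cluster, cluster 2 sits at 2048 + 32768 + 16384 = 51200 bytes.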

164
00:13:08,780 --> 00:13:18,020
The last one is get cluster size. The cluster size is derived from the BIOS parameter block, so we first get the bpb.

165
00:13:18,020 --> 00:13:24,500
Then we multiply the bytes per sector by the sectors per cluster, and this is the cluster size

166
00:13:24,500 --> 00:13:24,950
.

167
00:13:26,670 --> 00:13:31,860
OK, all the helper functions are ready. We can implement read raw data.

168
00:13:33,740 --> 00:13:36,710
The function returns the actual size of data being read. 

169
00:13:37,970 --> 00:13:44,540
The variables we need in this function include bpb, the data pointer, which points to

170
00:13:44,540 --> 00:13:47,120
the data we want to copy,

171
00:13:47,120 --> 00:13:49,320
the read size, which indicates how much data we have read,

172
00:13:49,820 --> 00:13:52,430
and the cluster size and cluster index.

173
00:13:54,080 --> 00:13:58,670
First off, we retrieve the BIOS parameter block and the cluster size.

174
00:13:59,950 --> 00:14:03,310
The index here is used to locate the fat table entries. 

175
00:14:04,390 --> 00:14:10,570
So the first index is the starting cluster index. Since a file can span multiple clusters,

176
00:14:11,120 --> 00:14:14,920
the while loop is used to read all the clusters until we get to the end.

177
00:14:15,790 --> 00:14:20,890
In this case, the loop stops once the read size is greater than or equal to the size.

178
00:14:22,450 --> 00:14:29,350
OK, now we get the address of the data using the cluster index. The function is get cluster offset, and

179
00:14:29,350 --> 00:14:35,560
we add its result to the address of the partition. This is the address of the data indicated by the cluster index

180
00:14:35,560 --> 00:14:36,070
.

181
00:14:36,970 --> 00:14:39,370
Then we go ahead and get the next cluster value 

182
00:14:39,370 --> 00:14:43,960
using the index, and the return value is the next cluster index.

183
00:14:45,220 --> 00:14:51,850
If the next cluster index is greater than 0xFFF7, it means that this is the last cluster,

184
00:14:51,970 --> 00:14:55,240
so we copy the data to the buffer and then break out of the loop.

185
00:14:56,550 --> 00:15:03,120
If the next cluster index is less than 0xFFF7, then it’s a valid cluster index

186
00:15:03,120 --> 00:15:04,700
and the current cluster is not the last one.

187
00:15:05,460 --> 00:15:09,000
So we simply copy one cluster of data to the buffer.

188
00:15:09,000 --> 00:15:10,920
Then we update the buffer and read size.

189
00:15:11,670 --> 00:15:15,420
Since we copied one cluster of data, we add one cluster size.

190
00:15:16,410 --> 00:15:19,980
After we are done, we return the actual size of data being read.

191
00:15:21,110 --> 00:15:26,540
And also, here we can perform a simple check. The cluster index should start from 2.

192
00:15:26,930 --> 00:15:32,780
So if it is less than 2, we simply return the max value, indicating that the read operation failed.
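To illustrate the cluster-chain walk, here is a simplified sketch. It is not the lecture's read raw data: it takes the FAT and the data region as plain arrays, copies the smaller of one cluster and the remaining bytes on every pass, and also stops on the value 0xFFF7 or on an index below 2. READ_FAILED and all parameter names are assumptions.
```c
#include <stdint.h>
#include <string.h>
#define READ_FAILED 0xFFFFFFFFu /* assumed "max value" for a failed read */
/* Follows the FAT chain from first_cluster and copies up to size bytes.
 * data_region points at the bytes of cluster 2. Returns the bytes copied. */
uint32_t read_chain(const uint16_t *fat, const uint8_t *data_region,
                    uint32_t cluster_size, uint32_t first_cluster,
                    uint8_t *buffer, uint32_t size)
{
    uint32_t read_size = 0;
    uint32_t index = first_cluster;
    if (index < 2)
        return READ_FAILED; /* data clusters start from 2 */
    while (read_size < size) {
        const uint8_t *data = data_region + (index - 2) * cluster_size;
        uint32_t left = size - read_size;
        uint32_t chunk = left < cluster_size ? left : cluster_size;
        memcpy(buffer + read_size, data, chunk);
        read_size += chunk;
        index = fat[index]; /* next cluster in the chain */
        if (index < 2 || index >= 0xFFF7)
            break; /* end of chain, bad cluster, or corrupt entry */
    }
    return read_size;
}
```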

193
00:15:34,070 --> 00:15:36,080
OK, we have finished the file module.

194
00:15:38,020 --> 00:15:40,120
Now, let's go to the main function of the loader.

195
00:15:42,120 --> 00:15:42,990
We include

196
00:15:46,320 --> 00:15:51,450
the file header, and in the main function, we get rid of the printk call

197
00:15:52,790 --> 00:15:54,270
and initialize the file system.

198
00:15:58,420 --> 00:16:06,540
After the initialization is done, we can load the kernel file and the user file. So we call load file with

199
00:16:06,550 --> 00:16:07,840
kernel.bin,

200
00:16:10,030 --> 00:16:13,030
and the base address is set to 200000.

201
00:16:17,840 --> 00:16:18,700
Then we load

202
00:16:19,830 --> 00:16:20,970
user.bin

203
00:16:22,880 --> 00:16:26,600
and the base address we want to load the file into is 30000.

204
00:16:28,790 --> 00:16:36,980
Since what we do here is load the kernel, we use assertions to do the check. So we include debug.h

205
00:16:37,640 --> 00:16:38,480
and assert that

206
00:16:41,020 --> 00:16:43,270
the load file calls succeed.

207
00:16:47,600 --> 00:16:49,160
With the main function done,

208
00:16:51,120 --> 00:16:53,100
let's go to the entry assembly file.

209
00:16:55,000 --> 00:17:00,550
As you can see, after we return from the main function, we jump to the infinite loop.

210
00:17:00,550 --> 00:17:04,829
Since we have loaded the kernel file, we can jump to the kernel.

211
00:17:05,470 --> 00:17:07,230
So we move into rax

212
00:17:08,440 --> 00:17:12,849
the virtual address of the kernel, which is the same as we saw in the previous lectures.

213
00:17:13,920 --> 00:17:15,210
Then we jump to rax.

214
00:17:17,589 --> 00:17:20,740
If everything goes well, we should see that the kernel is running.

215
00:17:22,800 --> 00:17:24,359
Since we added a new module,

216
00:17:25,980 --> 00:17:28,800
we need to add the file to the build script

217
00:17:32,560 --> 00:17:35,620
and add file.o to the linker command.

218
00:17:36,600 --> 00:17:38,360
OK, the loader project is done.

219
00:17:39,310 --> 00:17:40,660
Let's build the loader project.

220
00:17:42,880 --> 00:17:46,930
We open the terminal and navigate to the loader directory.

221
00:17:48,520 --> 00:17:49,750
We run the build script.

222
00:17:52,450 --> 00:17:56,140
OK, the loader.bin file is written into the image.

223
00:17:57,180 --> 00:18:02,730
Another thing I need to mention is that, as of the last section, the kernel needs 3 user programs

224
00:18:02,730 --> 00:18:04,230
to run properly,

225
00:18:05,030 --> 00:18:08,930
and now we have loaded only one user program in the loader.

226
00:18:09,980 --> 00:18:12,980
So in the main function of the kernel project,

227
00:18:14,080 --> 00:18:21,580
the only thing we are going to do in kmain is print a message, so we comment out all the other function calls

228
00:18:24,130 --> 00:18:28,270
and print kernel is running.

229
00:18:31,820 --> 00:18:33,050
OK, let's see the result.

230
00:18:36,330 --> 00:18:37,860
So we build the kernel project.

231
00:18:39,630 --> 00:18:42,660
We haven't changed the build script, so let's do some editing.

232
00:18:50,860 --> 00:18:56,160
Neither the loader.bin file nor the boot.bin file is used here,

233
00:19:00,290 --> 00:19:04,070
so we don't write boot.bin and loader.bin to the image.

234
00:19:06,890 --> 00:19:14,330
As we have talked about in this section, we don't write the kernel.bin file and user files into the OS image.

235
00:19:15,260 --> 00:19:17,960
So the commands here are not used either.

236
00:19:19,290 --> 00:19:23,760
The commands used here are only for compiling the kernel project.

237
00:19:24,690 --> 00:19:26,040
OK, let's build the project.

238
00:19:32,980 --> 00:19:40,960
Before we run bochs, we have to mount the image so that we can copy user.bin and kernel.bin to this image.

239
00:19:51,420 --> 00:19:57,090
Since we don't have a user.bin file in this lecture, we just rename boot.bin to user.bin

240
00:19:57,240 --> 00:20:02,790
and copy kernel.bin and user.bin to the image.

241
00:20:11,680 --> 00:20:13,810
OK, now we unmount the image.

242
00:20:17,590 --> 00:20:19,810
Let's run bochs to see what we get.

243
00:20:24,650 --> 00:20:28,190
Alright, you can see the message kernel is running printed on the screen.

244
00:20:29,260 --> 00:20:30,610
That's it for this lecture.

245
00:20:30,700 --> 00:20:32,110
See you in the next video.