1 00:00:00,560 --> 00:00:01,760 In a table. 2 00:00:01,940 --> 00:00:07,640 It's not unusual for a column to contain rows with a duplicate values in the teachers table. 3 00:00:07,640 --> 00:00:17,570 For example, school column lists the same school names multiple times because each school employs many 4 00:00:17,570 --> 00:00:18,320 teachers. 5 00:00:18,320 --> 00:00:24,020 So to understand the range of values in a column, we can use the distinct keyword as a part of the 6 00:00:24,020 --> 00:00:29,300 query that eliminates duplicates and shows only unique values. 7 00:00:29,300 --> 00:00:34,430 So use the distinct immediately after the select here. 8 00:00:34,430 --> 00:00:39,080 So firstly, actually, instead of here, we will use the order by school as well. 9 00:00:39,080 --> 00:00:42,620 And now we will use select. 10 00:00:43,530 --> 00:00:44,250 This thing. 11 00:00:44,250 --> 00:00:47,520 Actually, Slater here, select here. 12 00:00:47,550 --> 00:00:50,360 Distinct this thing. 13 00:00:51,510 --> 00:00:53,100 And school here. 14 00:00:54,930 --> 00:00:55,350 From. 15 00:00:56,380 --> 00:00:58,150 From teachers. 16 00:00:59,150 --> 00:00:59,750 Table. 17 00:01:00,230 --> 00:01:02,090 And order by. 18 00:01:03,460 --> 00:01:05,020 Ordered by school. 19 00:01:06,610 --> 00:01:07,060 That's it. 20 00:01:09,350 --> 00:01:15,320 And here, as you can see here, the result is this just a two output, even though it's six rows are 21 00:01:15,320 --> 00:01:20,630 in the table, the output shows the two unique school names in the school column. 22 00:01:21,020 --> 00:01:27,140 So this is a helpful first step toward assembling and assessing data quality. 23 00:01:27,170 --> 00:01:32,330 For example, if a school name is spelled more than one way, use spellings. 24 00:01:32,360 --> 00:01:37,830 Variations will be easy to spot and correct, especially if you sort the output. 25 00:01:37,850 --> 00:01:46,830 So when you are working with dates or numbers distinct will help highlight inconsistent or broken formatting. 26 00:01:46,850 --> 00:01:54,080 For example, you might inherit a dataset which dates were entered in a column formatted with a text 27 00:01:54,620 --> 00:01:55,160 data type. 28 00:01:55,280 --> 00:01:56,150 So. 29 00:01:57,220 --> 00:02:00,340 And this is the practice that you should avoid, by the way. 30 00:02:00,340 --> 00:02:05,200 And this allows the dates malformed dates to exist. 31 00:02:06,370 --> 00:02:13,240 So and the distinct keywords also works on more than one column at a time. 32 00:02:13,240 --> 00:02:17,260 So if we add a column, so order by. 33 00:02:17,530 --> 00:02:26,020 So select the things school and salary from teachers order by school and salary. 34 00:02:27,530 --> 00:02:29,120 Here and now. 35 00:02:29,120 --> 00:02:36,860 The query returns here and the query returns each unique or distinct salary earned at each school. 36 00:02:36,860 --> 00:02:46,330 So the two teachers at Myers Middle School earns near $43,500. 37 00:02:46,340 --> 00:02:49,490 That pair is listed in just one row. 38 00:02:50,490 --> 00:02:56,220 And the courier returns five rows rather than all six in a table. 39 00:02:57,080 --> 00:03:03,190 And the techniques gives us the ability to ask for each x in the table. 40 00:03:03,200 --> 00:03:05,640 What are the A values? 41 00:03:05,660 --> 00:03:06,230 Right. 42 00:03:06,230 --> 00:03:13,670 So for each factory, what are the all the chemicals it produced for each selection district? 43 00:03:13,700 --> 00:03:19,010 Who are all the candidates running for office for each concert hall? 44 00:03:19,220 --> 00:03:22,610 Who are the artists playing this month? 45 00:03:22,700 --> 00:03:32,810 Also offers more sophisticated techniques with aggregate functions that lets us count sum and find minimum 46 00:03:32,810 --> 00:03:34,460 and maximum values. 47 00:03:34,460 --> 00:03:44,420 So I will cover this in detail of later lectures of our course so we can also filter the rows with where 48 00:03:44,720 --> 00:03:47,030 here sometimes we where keywords. 49 00:03:47,030 --> 00:03:54,560 So sometimes you will want to limit the rows curtains to only those in which one or more columns meet 50 00:03:54,560 --> 00:03:55,970 certain criteria. 51 00:03:55,970 --> 00:04:04,260 So using teachers as an example, you might want to find all teachers hired before a particular year, 52 00:04:04,290 --> 00:04:12,720 or all teachers making more than 75 or $70,000 at elementary schools. 53 00:04:12,720 --> 00:04:16,500 So for these tasks we use the where clause. 54 00:04:16,500 --> 00:04:25,710 So the where clause allows you to find rows that match a specific value, a range of values or multiple 55 00:04:25,710 --> 00:04:28,890 values based on a criteria supplied via an operator. 56 00:04:28,890 --> 00:04:34,470 And this is the keyword that lets us perform math comparison and logical operations. 57 00:04:34,470 --> 00:04:39,270 So you can also use criteria to exclude rows here. 58 00:04:39,360 --> 00:04:42,150 So now we will do some basic example here. 59 00:04:42,150 --> 00:04:44,940 Select the last name. 60 00:04:44,970 --> 00:04:46,140 Last name. 61 00:04:47,080 --> 00:04:48,730 And school. 62 00:04:50,550 --> 00:04:51,120 To. 63 00:04:51,120 --> 00:04:52,500 And higher date. 64 00:04:52,980 --> 00:04:53,640 Higher. 65 00:04:54,740 --> 00:04:55,920 Air date. 66 00:04:56,600 --> 00:04:57,620 From teachers. 67 00:04:57,830 --> 00:04:59,630 From teachers. 68 00:04:59,780 --> 00:05:02,560 So and we will use where? 69 00:05:02,570 --> 00:05:03,170 Where? 70 00:05:03,170 --> 00:05:04,940 Keyword school. 71 00:05:05,120 --> 00:05:10,040 And now we will add the let's use the Myers Middle School. 72 00:05:10,040 --> 00:05:10,520 Right. 73 00:05:10,580 --> 00:05:11,750 Myers. 74 00:05:13,080 --> 00:05:13,260 My. 75 00:05:16,480 --> 00:05:16,960 Medal. 76 00:05:19,640 --> 00:05:20,120 School. 77 00:05:23,690 --> 00:05:24,140 That's it. 78 00:05:24,530 --> 00:05:27,200 So now after that, we will do the semicolon. 79 00:05:27,200 --> 00:05:30,260 And here that's it, as you can see here. 80 00:05:30,500 --> 00:05:37,820 So here I'm using the equals comparison operator to find rows that exactly match a value. 81 00:05:37,850 --> 00:05:44,320 But of course, you can use other operators with where to customize your filter criteria. 82 00:05:44,330 --> 00:05:50,870 So here we have some most commonly used comparison operators. 83 00:05:50,870 --> 00:05:53,930 So firstly, this is the equal to. 84 00:05:54,050 --> 00:05:55,430 So this is. 85 00:05:56,550 --> 00:05:58,140 Actually, let's use the here. 86 00:05:58,470 --> 00:06:00,660 So this is the equal to. 87 00:06:01,950 --> 00:06:11,460 And this is the not equal to greater than less than greater than or equal to less than or equal to. 88 00:06:11,460 --> 00:06:13,950 And we also have between. 89 00:06:14,670 --> 00:06:17,700 This is the between means within a range. 90 00:06:17,700 --> 00:06:21,450 So in here, like. 91 00:06:22,490 --> 00:06:23,510 I like. 92 00:06:24,020 --> 00:06:25,520 We will use the patch. 93 00:06:26,150 --> 00:06:27,560 So this is the case. 94 00:06:27,650 --> 00:06:28,190 Insensitive. 95 00:06:28,520 --> 00:06:29,810 Case Insensitive. 96 00:06:29,810 --> 00:06:37,760 Like I like and like is a case sensitive and not here not negates a condition. 97 00:06:37,760 --> 00:06:41,000 So now let's go here. 98 00:06:41,000 --> 00:06:47,990 You can also, by the way, find that on Internet in Wikipedia, these are pretty popular comparison 99 00:06:48,020 --> 00:06:49,700 operators in programming. 100 00:06:49,850 --> 00:06:54,410 So and I will also share these operators with you here. 101 00:06:54,500 --> 00:06:56,570 So now let's. 102 00:06:58,450 --> 00:07:00,250 Use the equal operator. 103 00:07:01,100 --> 00:07:02,240 Here with that. 104 00:07:02,240 --> 00:07:04,500 So where? 105 00:07:05,910 --> 00:07:06,810 First name. 106 00:07:07,320 --> 00:07:09,060 First name. 107 00:07:13,030 --> 00:07:14,230 Uh, it was the call. 108 00:07:14,230 --> 00:07:14,740 Right? 109 00:07:14,980 --> 00:07:16,090 Or last name. 110 00:07:16,090 --> 00:07:17,410 We can also use last name. 111 00:07:17,410 --> 00:07:17,740 Where? 112 00:07:17,740 --> 00:07:19,390 Last name. 113 00:07:19,810 --> 00:07:22,390 And actually, let's see also first name. 114 00:07:22,840 --> 00:07:24,280 First name. 115 00:07:25,440 --> 00:07:25,800 Here. 116 00:07:25,950 --> 00:07:28,650 So where last name equals Jane. 117 00:07:33,160 --> 00:07:34,300 Last name call. 118 00:07:34,360 --> 00:07:34,870 Right? 119 00:07:36,150 --> 00:07:36,450 Holy. 120 00:07:40,240 --> 00:07:46,480 And here, as you can see, Samuel Cole Myers Middle School and 2005, August 1st. 121 00:07:46,750 --> 00:07:54,460 So next, we can also list our school names in the table, but exclude them if the Roosevelt High School 122 00:07:54,490 --> 00:08:03,130 using the not equal operator remember our not equal operator was this and we can for not equal we can 123 00:08:03,130 --> 00:08:04,930 also use this so you can choose it. 124 00:08:06,370 --> 00:08:08,080 And here we will. 125 00:08:08,090 --> 00:08:15,760 So to do that, to exclude all Roosevelt High School result from we will use where. 126 00:08:16,880 --> 00:08:17,510 School. 127 00:08:20,490 --> 00:08:21,060 Cool. 128 00:08:21,060 --> 00:08:23,460 Not equals to f. 129 00:08:23,460 --> 00:08:24,020 T. 130 00:08:24,540 --> 00:08:25,530 F t. 131 00:08:26,660 --> 00:08:28,520 Whose wealth? 132 00:08:34,330 --> 00:08:35,170 Just about. 133 00:08:37,410 --> 00:08:38,070 High school. 134 00:08:40,830 --> 00:08:41,520 Now. 135 00:08:42,430 --> 00:08:42,730 Here. 136 00:08:43,530 --> 00:08:53,550 So we excluded here we use the less than operator to the list teachers here and we are here that we 137 00:08:53,550 --> 00:08:58,300 are only seeing the Myers Middle School on the output. 138 00:08:58,320 --> 00:09:00,000 So now we will do. 139 00:09:00,480 --> 00:09:08,620 Let's actually use the less than operator to list teachers hired before January 1st, 2002 or January 140 00:09:08,620 --> 00:09:11,490 1st or January 2nd, 2000. 141 00:09:11,520 --> 00:09:15,720 Here, using the date format here, of course, we're going to use date format here. 142 00:09:15,720 --> 00:09:21,570 So where hire date is less than operator. 143 00:09:21,600 --> 00:09:23,400 Remember, this is the less than operator. 144 00:09:25,280 --> 00:09:26,150 And. 145 00:09:27,620 --> 00:09:28,180 Here. 146 00:09:28,190 --> 00:09:29,780 So sorry. 147 00:09:29,780 --> 00:09:30,530 Let's actually. 148 00:09:35,270 --> 00:09:35,960 Display. 149 00:09:37,750 --> 00:09:38,740 This display, we can assume. 150 00:09:39,940 --> 00:09:40,650 100. 151 00:09:41,640 --> 00:09:42,480 I hope we can. 152 00:09:43,230 --> 00:09:44,370 Okay, Perfect. 153 00:09:44,850 --> 00:09:45,690 Just perfect. 154 00:09:46,230 --> 00:09:53,160 So now we will use their higher date is less than 2000. 155 00:09:54,270 --> 00:09:55,650 Uh, January. 156 00:09:59,410 --> 00:09:59,740 Here. 157 00:10:00,310 --> 00:10:05,380 So you remember this is the month of January, and here, let's make it 20. 158 00:10:06,490 --> 00:10:07,870 So that's it. 159 00:10:10,730 --> 00:10:11,210 Run. 160 00:10:11,210 --> 00:10:20,250 And here, as you can see here, we just got one result because there's just only one, which is Lee 161 00:10:20,270 --> 00:10:25,790 Reynolds that hired before 2000 January twins. 162 00:10:25,820 --> 00:10:26,330 Right. 163 00:10:26,330 --> 00:10:36,290 So then we can also find the teachers who earned 43,000 or more using the less than greater than or 164 00:10:36,290 --> 00:10:37,640 equal operator. 165 00:10:37,640 --> 00:10:41,780 So now we will use the salary. 166 00:10:43,370 --> 00:10:44,630 Salary. 167 00:10:46,240 --> 00:10:49,840 Greater than or equal to 43,000. 168 00:10:52,600 --> 00:10:54,100 Here, as you can see here. 169 00:10:54,100 --> 00:10:57,700 And let's also print the salary here. 170 00:10:57,700 --> 00:11:02,860 And we can also, instead of writing each one of here, just select one teachers. 171 00:11:03,890 --> 00:11:06,770 As you can see here, the salary is numeric. 172 00:11:08,770 --> 00:11:09,950 And that's it. 173 00:11:10,000 --> 00:11:13,840 So we can also do, for example, 43,500. 174 00:11:13,870 --> 00:11:14,380 Right. 175 00:11:14,410 --> 00:11:15,820 So if we do this. 176 00:11:16,850 --> 00:11:18,560 We will still get that same result. 177 00:11:18,560 --> 00:11:24,620 But if we just do one more and here, as you can see, we just have one output here. 178 00:11:25,390 --> 00:11:32,650 So the next query uses here, let's say we will use the between operator to find teachers who earn from 179 00:11:34,030 --> 00:11:36,820 earn from 30,000 to 70,000. 180 00:11:36,820 --> 00:11:37,360 Right. 181 00:11:37,540 --> 00:11:44,680 And also keep in mind that between is inclusive, meaning that result will include values matching the 182 00:11:44,680 --> 00:11:47,730 start and end ranges specified. 183 00:11:47,740 --> 00:11:48,700 So. 184 00:11:49,810 --> 00:11:52,870 Now select our. 185 00:11:53,640 --> 00:12:09,450 Everything from teachers and where salary between bit been or actually let's make it 40,000 40,060 5000. 186 00:12:09,720 --> 00:12:10,740 Now that's it. 187 00:12:10,740 --> 00:12:11,850 So now run it. 188 00:12:11,850 --> 00:12:13,080 And here. 189 00:12:13,920 --> 00:12:14,580 That's it. 190 00:12:14,580 --> 00:12:15,690 This is our. 191 00:12:18,770 --> 00:12:22,460 Output that we get with between operator. 192 00:12:22,580 --> 00:12:33,080 So you also keep in mind that use caution with between because it's inclusive nature can lead to inadvertent 193 00:12:33,110 --> 00:12:34,950 double counting values. 194 00:12:34,970 --> 00:12:44,360 So for example, if you filter for values between 10 and 12 and run a second query using a between 20 195 00:12:44,360 --> 00:12:45,170 and. 196 00:12:46,050 --> 00:12:51,120 Uh, 30 a row with the value of 20 will appear in both query results. 197 00:12:51,120 --> 00:12:57,330 So you can avoid this by using the more explicit, greater than and less than operators to define ranges. 198 00:12:57,360 --> 00:13:01,650 For example, they are salary. 199 00:13:03,170 --> 00:13:09,020 Uh, surveyor salary greater than or equal to 40,000. 200 00:13:10,300 --> 00:13:15,970 And salary less than or equals to 65,000. 201 00:13:16,450 --> 00:13:17,320 So. 202 00:13:19,170 --> 00:13:26,760 We will return to these operators throughout this course because they will play a key role in helping 203 00:13:26,760 --> 00:13:31,380 us ferret out the data and answers we want to find. 204 00:13:33,400 --> 00:13:38,050 My name is Stefon, and I'm waiting you in another lecture of Oxley. 205 00:13:38,050 --> 00:13:39,520 So see you later.