1
00:00:01,130 --> 00:00:05,410
In this video, we will learn how to draw a scatterplot and a bubble chart.

2
00:00:07,600 --> 00:00:13,620
Scatterplots are also known as XY charts, and these are also a very common type of chart.

3
00:00:15,680 --> 00:00:24,950
A scatterplot differs from most of the other chart types in that both axes display values in a scatterplot,

4
00:00:25,670 --> 00:00:31,120
that is, the horizontal axis is not a category axis in a scatterplot.

5
00:00:31,790 --> 00:00:33,560
It also has values.

6
00:00:36,220 --> 00:00:40,470
This type of chart is often used to show the relationship between two variables.

7
00:00:42,940 --> 00:00:50,980
In this example, I have taken the monthly marketing emails that the company sent and the corresponding

8
00:00:50,980 --> 00:00:52,570
sales it obtained.

9
00:00:53,710 --> 00:01:01,510
So in the month of January 2008, the company sent out 904 marketing emails and the amount

10
00:01:01,510 --> 00:01:03,680
of sales it made was 89.

11
00:01:04,870 --> 00:01:12,190
So the question that I want to answer is: is there any relationship between marketing emails and sales?

12
00:01:13,780 --> 00:01:15,100
To find that relationship,

13
00:01:15,430 --> 00:01:19,530
I can plot a scatterplot of these two variables.

14
00:01:20,740 --> 00:01:29,440
So on the x-axis, I can take marketing emails, and on the y-axis I can take the sales value, and each

15
00:01:29,440 --> 00:01:32,440
point will be (x, y).

16
00:01:32,650 --> 00:01:34,720
So (904, 89).

17
00:01:35,940 --> 00:01:43,950
When you plot each individual point like this, this whole plot is called a scatterplot, and if you

18
00:01:43,950 --> 00:01:50,700
look at the scatterplot, you can probably see that most of the points are telling you that there

19
00:01:50,700 --> 00:01:55,770
is a linear relationship between the x-axis variable and the y-axis variable.
20
00:01:56,730 --> 00:02:03,940
That is, if you increase the marketing emails, the amount of sales correspondingly increases.

21
00:02:05,730 --> 00:02:11,130
So such relationships can be identified using a scatterplot.

22
00:02:13,900 --> 00:02:18,220
Let us learn how to create the scatterplot and the bubble chart.

23
00:02:21,580 --> 00:02:27,760
Since a scatterplot has two variables, we will select the two variables that we want to plot.

24
00:02:34,320 --> 00:02:39,750
And we will go to the Recommended Charts option and we will select the scatterplot.

25
00:02:46,400 --> 00:02:51,900
Here I have the scatterplot. It looks a little different from the previous one.

26
00:02:52,460 --> 00:02:59,510
The reason is that in the previous scatterplot I was showing you, the y-axis was starting from

27
00:02:59,510 --> 00:03:01,130
a value of 75.

28
00:03:02,360 --> 00:03:03,800
Here it is starting from zero.

29
00:03:06,040 --> 00:03:11,170
If your aim is to show the absolute values, it is better to start from zero.

30
00:03:12,520 --> 00:03:18,580
If your aim is to show the relationship between two variables, you can start from a different value,

31
00:03:18,580 --> 00:03:21,910
such as 75. How do you change the value?

32
00:03:22,030 --> 00:03:31,320
You select the y-axis labels, go to the Format Axis option, and change the bound from zero to 75.

33
00:03:33,980 --> 00:03:37,490
Since I want to show that there is some linear relationship,

34
00:03:39,100 --> 00:03:42,590
I have changed the lower bound of my y-axis to 75.

35
00:03:43,320 --> 00:03:50,440
Now it is like zooming in to that portion of the chart where most of my points are lying.

36
00:03:53,840 --> 00:03:59,090
Now, what are the different types of scatterplot? For that, we will again go to Change Chart Type.

37
00:04:01,150 --> 00:04:07,300
And here you can see that we have created the first one, which is a simple scatterplot. The second

38
00:04:07,300 --> 00:04:10,390
one is a scatterplot with smooth lines.
39
00:04:12,720 --> 00:04:20,580
The scatterplot with smooth lines can be used to show how the values are changing in the series. So

40
00:04:21,120 --> 00:04:28,220
it will connect all the points in the series, from the first point to the next point, then

41
00:04:28,230 --> 00:04:30,800
to the next point, using smooth lines.

42
00:04:30,900 --> 00:04:35,750
That is, if you look at any two points, they are not joined by a straight line.

43
00:04:35,760 --> 00:04:39,840
They are joined by a curved line so that it connects all the points.

44
00:04:43,950 --> 00:04:52,890
The third option is scatterplot with smooth lines but no markers. As you can see, the earlier one

45
00:04:53,040 --> 00:04:57,750
had these data points, these small circles highlighting the data points.

46
00:04:58,480 --> 00:05:02,690
If you do not want these small circles, you can select the third option.

47
00:05:03,150 --> 00:05:06,800
It will have a smooth line, but it will have no markers.

48
00:05:07,410 --> 00:05:11,160
The point of this is if you want to emphasize the relationship only.

49
00:05:14,450 --> 00:05:17,900
The fourth option is scatterplot with straight lines.

50
00:05:20,580 --> 00:05:26,280
So as I told you, the previous one had smooth lines, that is, it had curves

51
00:05:28,340 --> 00:05:33,980
instead of straight lines. If you select this one, it has straight lines connecting the points.

52
00:05:36,630 --> 00:05:43,270
The next option is scatterplot with straight lines but no markers. It is the same as before,

53
00:05:43,290 --> 00:05:46,800
but the small circles will not be present if you select this one.

54
00:06:01,740 --> 00:06:10,230
One additional feature that comes with the scatterplot is the trend line. As I told you, a scatterplot

55
00:06:10,230 --> 00:06:12,930
is used to identify a relationship between two variables.

56
00:06:14,370 --> 00:06:19,830
One method is to visually check what trend there is between the two variables.
57
00:06:20,940 --> 00:06:23,460
The other option is to draw a trend line.

58
00:06:25,050 --> 00:06:29,220
The trend line is another chart element, so you can add it by clicking this symbol.

59
00:06:32,190 --> 00:06:36,420
If you simply click it, by default it will draw a linear trend line.

60
00:06:37,500 --> 00:06:39,210
There are other options for trend lines

61
00:06:39,210 --> 00:06:41,660
also. Let us first draw the linear one.

62
00:06:46,980 --> 00:06:54,870
When I select this trend line, this linear trend line, Excel draws the line such that it minimizes the

63
00:06:54,870 --> 00:07:00,090
difference between each data point and the corresponding value on the line.

64
00:07:02,630 --> 00:07:11,960
Overall, this trend line is suggesting that there is a positive relationship between sales and marketing

65
00:07:11,960 --> 00:07:12,560
emails.

66
00:07:14,680 --> 00:07:22,660
You can also see in this trend line that if I increase the x-axis value, that is, the marketing emails,

67
00:07:22,810 --> 00:07:29,290
from nearly 790 to 890, that is, a 100-unit increase, so if

68
00:07:30,010 --> 00:07:32,940
I send an additional hundred emails to the customers,

69
00:07:34,330 --> 00:07:43,720
I will increase the sales from approximately 79 to approximately 88.

70
00:07:44,840 --> 00:07:53,450
By sending out a hundred additional emails, we will have an increased sale of nine to 10 units. So the

71
00:07:53,450 --> 00:08:00,440
slope of this line is telling you the change in the y-axis with the change in the x-axis.

72
00:08:03,770 --> 00:08:09,920
There are other types of trend lines also. You can change from linear to logarithmic or

73
00:08:10,930 --> 00:08:19,540
polynomial. Although we do not see much difference in this dataset, in your dataset it is

74
00:08:19,540 --> 00:08:27,460
always better to draw all these types of trend lines first and then visually identify which is fitting

75
00:08:27,460 --> 00:08:30,670
the data better, and use that trend line.
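The linear trend line described above can be sketched outside Excel as well. The following is a minimal illustration, assuming the "difference" being minimized is the usual sum of squared vertical distances (an ordinary least-squares fit). Only the point (904, 89) comes from the video; the other data points, the variable names, and the helper `linear_trend` are hypothetical and are there only to make the sketch runnable.

```python
def linear_trend(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical monthly data; only (904, 89) is taken from the video.
emails = [904, 780, 850, 1010, 950, 700]
sales = [89, 78, 85, 99, 93, 72]

slope, intercept = linear_trend(emails, sales)
print(f"slope = {slope:.4f}, intercept = {intercept:.2f}")
# The slope is the change in sales per extra email, so 100 extra emails give:
print(f"100 extra emails -> about {100 * slope:.1f} extra units of sales")
```

Reading the slope as "change in y per unit change in x" is the same interpretation the video gives for the trend line on the chart.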
76
00:08:33,300 --> 00:08:37,500
So using this trend line, you can also forecast the sales.

77
00:08:38,680 --> 00:08:44,500
So let's say that if you know the number of emails you are going to send, you can find out the corresponding

78
00:08:44,500 --> 00:08:47,260
sales value on the basis of this trend line.

79
00:08:51,640 --> 00:08:58,900
And once you have plotted this linear trend line and you want to find out the equation of this

80
00:08:59,020 --> 00:09:08,290
linear line, that is, considering this is the y-axis and this is the x-axis, you want to find out y is equal

81
00:09:08,290 --> 00:09:11,380
to ax plus b, and what the values of a and b are,

82
00:09:12,310 --> 00:09:15,220
you can do that by ticking this option.

83
00:09:18,390 --> 00:09:20,460
This will give you the equation of this line.

84
00:09:22,970 --> 00:09:29,240
Here's the equation: y equals 0.0897 times x plus 8.8.

85
00:09:30,800 --> 00:09:36,770
So what this means is, if you want to find out what your sales will be when you are sending

86
00:09:36,770 --> 00:09:41,290
out 1000 emails, you can just put in the value of x as 1000.

87
00:09:42,080 --> 00:09:48,020
It will come out to 89.7 plus 8.8.

88
00:09:49,970 --> 00:09:56,600
So your total sales will be 89.7 plus 8.8, which will be nearly

89
00:09:56,600 --> 00:09:57,230
98.5.

90
00:10:01,920 --> 00:10:09,330
So this is how this equation can be used. Basically, the point of using a scatterplot is to find out

91
00:10:09,330 --> 00:10:11,600
the relationship between two variables.

92
00:10:12,870 --> 00:10:22,040
If you identify a relationship visually, you can also plot a trend line using this add-element option.
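The forecast from the trend line equation can be written as a one-line function. This is only a sketch: the slope is taken as 0.0897, which is the value implied by the 89.7 figure quoted for 1000 emails, and the intercept as 8.8; the names `SLOPE`, `INTERCEPT` and `forecast_sales` are made up for illustration.

```python
# Coefficients of y = a*x + b as read from the chart in the video
# (slope inferred from the 89.7 figure quoted for x = 1000).
SLOPE = 0.0897
INTERCEPT = 8.8


def forecast_sales(emails):
    """Predicted sales for a given number of marketing emails."""
    return SLOPE * emails + INTERCEPT


# 0.0897 * 1000 = 89.7, and 89.7 + 8.8 = 98.5
print(round(forecast_sales(1000), 1))
```

Such a forecast is only as good as the linear fit, so it is most reasonable for email counts close to the range of the plotted data.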
93
00:10:22,050 --> 00:10:28,290
Then, once you plot a trend line and you are happy with the trend line and you would like to use

94
00:10:28,620 --> 00:10:35,250
the equation of that trend line, you can find that equation by clicking on this line, selecting its

95
00:10:35,250 --> 00:10:37,980
formatting options and ticking this box.

96
00:10:39,240 --> 00:10:46,830
If you are into data analytics and you understand the terms R-squared and intercept, you can

97
00:10:46,830 --> 00:10:48,060
add those options also.

98
00:10:51,020 --> 00:10:56,240
So the R-squared value for this line is point [inaudible].

99
00:11:00,700 --> 00:11:10,660
So in scatterplot, these last two options are remaining. This is a bubble plot and this is a 3-D bubble

100
00:11:10,660 --> 00:11:18,790
plot. When you want to identify a relationship between two variables only, we use these two-dimensional

101
00:11:18,820 --> 00:11:19,570
scatterplots.

102
00:11:20,680 --> 00:11:27,400
But if you have a third dimension also, that is there is a third variable also and you want to see

103
00:11:27,400 --> 00:11:34,150
the relationship between the first, second and third variables, you can use these 2-D and 3-D bubble

104
00:11:34,180 --> 00:11:34,620
plots.

105
00:11:35,320 --> 00:11:36,150
Let me show you how.

106
00:11:40,260 --> 00:11:48,330
In this dataset, I have three data series. I had eight participants in my weight loss program.

107
00:11:50,100 --> 00:11:54,060
These are the original weights of these eight participants.

108
00:11:55,420 --> 00:12:02,800
This is the time spent by these participants in our program, and this is the amount of weight lost

109
00:12:02,800 --> 00:12:04,880
by each individual participant.

110
00:12:06,460 --> 00:12:13,000
I want to find out the effect of these two variables in determining this third variable.
111
00:12:15,360 --> 00:12:23,910
So what I'm going to do is use a bubble chart, which will have on the x-axis the original weight of

112
00:12:23,940 --> 00:12:31,830
each individual participant. On the y-axis, it will have the number of weeks the participant was in the program,

113
00:12:33,120 --> 00:12:36,880
and the radius of each bubble will be the third variable.

114
00:12:37,950 --> 00:12:47,910
The idea behind creating this chart is that if, in this bubble chart, circles with larger areas are coming

115
00:12:47,910 --> 00:12:58,500
in a particular area of this chart, you can infer that maximum weight loss is being achieved by people

116
00:12:58,860 --> 00:13:00,720
belonging to that particular category.

117
00:13:01,420 --> 00:13:06,960
For example, most of the weight loss has been achieved by people in this range.

118
00:13:08,800 --> 00:13:15,790
So people belonging to the weight category of 200 to 320 probably

119
00:13:18,460 --> 00:13:26,850
achieved the maximum weight loss, and you should be in the program for at least two or three weeks.

120
00:13:27,430 --> 00:13:33,490
So this square area contains most of the big circles.

121
00:13:35,730 --> 00:13:41,700
And you can clearly identify the range in which these circles are occurring.

122
00:13:43,950 --> 00:13:51,290
So basically, when you have three data series and you want to find the effect of two of the data series

123
00:13:51,300 --> 00:13:55,050
on the third data series, a bubble chart is used.

124
00:13:56,550 --> 00:14:01,830
So now let us learn how to draw a bubble chart. I will delete this one.

125
00:14:02,770 --> 00:14:13,820
We will select these three series and go to the bubble chart. By default in Excel,

126
00:14:13,980 --> 00:14:20,070
the first series is taken as the radius of the bubbles and the other two series are taken as the x-axis

127
00:14:20,070 --> 00:14:20,850
and y-axis.
128
00:14:21,570 --> 00:14:28,350
But instead, what I want to do is take the first variable as the x-axis, the second as the y-axis,

129
00:14:28,350 --> 00:14:29,700
and the third as the radius.

130
00:14:30,330 --> 00:14:34,950
So I have to go to the Select Data option and I will change this.

131
00:14:43,110 --> 00:14:48,240
So the X value series is the original weight

132
00:14:49,910 --> 00:14:50,450
series.

133
00:14:52,900 --> 00:15:01,600
The Y value series is weeks in program, and the bubble radius will be this one.

134
00:15:03,650 --> 00:15:04,100
OK.

135
00:15:08,040 --> 00:15:12,240
And it will automatically decide what the horizontal axis labels should be.

136
00:15:12,600 --> 00:15:17,820
You can click on OK, and this is the bubble chart that we wanted to create.

137
00:15:20,100 --> 00:15:23,130
The second option in bubble chart is a 3-D bubble chart.

138
00:15:23,520 --> 00:15:25,730
As you can see, this is a 2-D bubble chart.

139
00:15:25,920 --> 00:15:27,030
Here you have circles.

140
00:15:27,360 --> 00:15:30,840
If you create it in 3-D, these will become spheres.

141
00:15:32,660 --> 00:15:35,360
So each circle now looks like a small ball.

142
00:15:41,300 --> 00:15:47,390
So just like a scatterplot, you can use a bubble chart to identify trends and create a trend line.

143
00:15:50,430 --> 00:15:52,500
Once you have created the trend line, you can

144
00:15:55,640 --> 00:15:59,810
format that trend line also. You can change its color, weight, etc.

145
00:16:02,940 --> 00:16:10,340
So scatterplots and bubble charts are basically used to identify a relationship between two or three variables.

146
00:16:11,270 --> 00:16:12,890
And this is how we create them.
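The bubble chart idea from the video (two series give the position, the third gives the bubble size, and you look for the region where the big bubbles cluster) can be sketched in a few lines. All numbers below are invented for illustration and are not the values shown in the video; the helper names `bubble_points` and `region_of_largest_bubbles` are likewise hypothetical.

```python
# Hypothetical data for eight participants:
# (original weight, weeks in program, weight lost).
participants = [
    (150, 1, 2),
    (180, 2, 4),
    (210, 3, 12),
    (240, 2, 10),
    (260, 3, 14),
    (300, 2, 11),
    (330, 1, 3),
    (350, 4, 5),
]


def bubble_points(rows):
    """Map each row to the three roles a bubble chart needs: x, y and size."""
    return [{"x": weight, "y": weeks, "size": lost} for weight, weeks, lost in rows]


def region_of_largest_bubbles(points, top_n=4):
    """Return the x range and y range spanned by the top_n largest bubbles."""
    largest = sorted(points, key=lambda p: p["size"], reverse=True)[:top_n]
    xs = [p["x"] for p in largest]
    ys = [p["y"] for p in largest]
    return (min(xs), max(xs)), (min(ys), max(ys))


points = bubble_points(participants)
x_range, y_range = region_of_largest_bubbles(points)
print("Largest bubbles: weight range", x_range, "weeks range", y_range)
```

This does numerically what the video does by eye: it picks out the biggest bubbles and reports the weight and weeks ranges they occupy.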