Application of statistical concepts in the determination of weight variation in samples

G. E.
Institute of Biology, College of Science
University of the Philippines, Diliman, Quezon City, Philippines
Date Submitted: April 23, 2013

ABSTRACT

Statistics is a mathematical science dealing with the collection, organization, analysis, interpretation, and presentation of data. It provides a more accurate way of expressing data than mere observation.
This experiment applied several statistical concepts: the Q test, mean, standard deviation, relative standard deviation, range, relative range, and confidence limits (confidence intervals). The results of these tests were used to check whether the weights of ten 25-centavo coins, measured on an analytical balance and grouped into two data sets, are acceptable. When the statistical concepts were applied to data set 1 and data set 2, the resulting values did not vary greatly.
However, this cannot be confirmed, since no test was performed to check whether the two sets of results differ significantly.

RESULTS AND DISCUSSION

Ten 25-centavo coins were weighed on an analytical balance. Each weight is treated as a single sample. The samples were grouped into two data sets: the first contains six samples and the second contains ten. Table 1 shows the two data sets and their samples.

Table 1. Weight of samples grouped into two sets

| DATA SET 1 (g) | DATA SET 2 (g) |
| 3.5348         | 3.5348         |
| 3.556          | 3.556          |
| 3.5806         | 3.5875         |
| 3.5902         | 3.5806         |
| 3.6113         | 3.5851         |
| 3.6484         | 3.5875         |
|                | 3.5902         |
|                | 3.6113         |
|                | 3.624          |
|                | 3.6484         |

The Q test was performed on each data set, and other statistical parameters were also calculated. When one or more measured values in a set differ from the rest, the Q test can be used to decide whether the suspected value or values should be retained or rejected (chem.uoa.gr). However, the test is only used for sets with a small number of samples or replicates.
Hence, this test is appropriate for this experiment. Furthermore, the Q test is an example of a significance test, whose outcome is the acceptance or rejection of a null hypothesis. For the purpose of this paper, the null hypothesis is that there is no significant difference between the suspected value or values and the rest of the values obtained. Equation 1 shows how Qexp is obtained, where Xq is the suspected value, Xn is the value closest to the suspected value, and R is the range.

Equation 1. Q test formula

$$Q_{exp} = \frac{|X_q - X_n|}{R}$$
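As a worked illustration of Equation 1, the following Python sketch computes Qexp for the highest and lowest values of data set 1 from Table 1. The function name `q_exp` and its `suspect` argument are hypothetical conveniences, not part of the original procedure; the value closest to each suspected value is found by sorting the data.

```python
# Data set 1 from Table 1, in grams.
DATA_SET_1 = [3.5348, 3.556, 3.5806, 3.5902, 3.6113, 3.6484]


def q_exp(values, suspect="high"):
    """Return Qexp = |Xq - Xn| / R for the highest or lowest value.

    Xq is the suspected value, Xn its nearest neighbour, and R the range.
    The `suspect` argument ("high" or "low") is a hypothetical convenience.
    """
    ordered = sorted(values)
    if len(ordered) < 3:
        raise ValueError("need at least three values")
    data_range = ordered[-1] - ordered[0]
    if data_range == 0:
        raise ValueError("all values are identical; Qexp is undefined")
    if suspect == "high":
        gap = ordered[-1] - ordered[-2]
    elif suspect == "low":
        gap = ordered[1] - ordered[0]
    else:
        raise ValueError("suspect must be 'high' or 'low'")
    return gap / data_range


q_high = q_exp(DATA_SET_1, "high")  # suspect 3.6484 g, neighbour 3.6113 g
q_low = q_exp(DATA_SET_1, "low")    # suspect 3.5348 g, neighbour 3.556 g
print(f"Qexp (H) = {q_high:.4f}, Qexp (L) = {q_low:.4f}")
```

With the Table 1 values, this should print Qexp (H) = 0.3266 and Qexp (L) = 0.1866, the values reported for data set 1 in Table 2.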
When Qexp > Qtab, the suspected value is rejected; when Qexp < Qtab, the suspected value is retained. Table 2 shows that the calculated Qexp values are all less than their corresponding Qtab values at the 95% confidence level. Hence, the suspected values are retained (marked ACCEPT in the table), and the null hypothesis is accepted: there is no significant difference between the suspected values and the rest of the values obtained. Any discrepancies are therefore attributed to random rather than systematic errors.

Table 2. Obtained Qexp versus Qtab
| Data Set | Suspect Values | Qtab  | Qexp   | Conclusion |
| 1        | H: 3.6484 g    | 0.625 | 0.3266 | ACCEPT     |
|          | L: 3.5348 g    | 0.625 | 0.1866 | ACCEPT     |
| 2        | H: 3.6484 g    | 0.466 | 0.2148 | ACCEPT     |
|          | L: 3.5348 g    | 0.466 | 0.1866 | ACCEPT     |

To discuss further, random errors are errors due to unknown or unpredictable changes in the experiment (physics.umd.edu). These errors are beyond the control of the person taking the measurement. Systematic errors, on the other hand, usually come from the instruments used in taking the measurement (citycollegiate.com).
As mentioned, other statistical parameters were also calculated: the mean, standard deviation, relative standard deviation, range, relative range, and confidence limits (at the 95% confidence level).

The first of these parameters is the mean, one of the three measures of central tendency. According to the Australian Bureau of Statistics, “a measure of central tendency is a summary measure that attempts to describe a whole set of data with a single value that represents the middle or center of its distribution”.
Accordingly, in this experiment the mean is the sum of the sample values in a data set divided by the number of samples in that data set. Equation 2 shows how the mean is calculated.

Equation 2. Mean formula

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$

As shown in Table 3, two mean values were obtained, one for each data set.

Table 3. The values obtained for each parameter tested for the two data sets in this experiment

| Parameter             | Data Set 1         | Data Set 2         |
| Mean (x̄)              | 3.5869 g           | 3.5879 g           |
| S.D. (s)              | 0.04024 g          | 0.0355 g           |
| R.S.D. (RSD)          | 11.22 ppt          | 9.50 ppt           |
| Range (R)             | 0.1136 g           | 0.1136 g           |
| Relative Range (RR)   | 31.67 ppt          | 31.66 ppt          |
| Confidence Limit (CL) | 3.5869 ± 0.02876 g | 3.5879 ± 0.02398 g |

As indicated in Table 3, the mean values for data sets 1 and 2 are 3.5869 g and 3.5879 g, respectively. Both are lower than the current official weight of the 25-centavo coin given on the Bangko Sentral ng Pilipinas (BSP) website, which is 3.8 g (bsp.gov.ph).
One possible reason for the difference between the weights obtained in the experiment and the weight given on the BSP website is the year in which the coins were manufactured and their percent material composition. The coins used in the experiment may have been made in a different year and with a different material composition, and so weigh less than the current official 25-centavo coin. However, there is no compelling evidence for this, since the BSP website gives no information on when the coins weighing 3.8 g were made or on whether their material composition differs from that of previously manufactured coins.

The second and third parameters are the standard deviation and the relative standard deviation. Both are measures of precision. Precision refers to how close the results obtained from the samples are to each other (fao.org). The standard deviation can thus be defined as a measure of the precision of a series of repetitive measurements around the mean (files.chem.vt.edu). Furthermore, the standard deviation is the square root of the variance. Equation 3 shows how the standard deviation is calculated.

Equation 3. Standard deviation formula

$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$$

Here x_i is an individual sample value, x̄ is the mean, and n is the number of samples.

The relative standard deviation, on the other hand, is used for convenience, since large sample values in a data set generate a large standard deviation (fao.org). Hence, to easily compare the uncertainty of measurements of varying absolute magnitude, the relative standard deviation is used. It is expressed in parts per thousand (ppt). Equation 4 shows how the relative standard deviation is calculated.

Equation 4. Relative standard deviation formula

$$RSD = \frac{s}{\bar{x}} \times 1000 \text{ ppt}$$

As with the mean, two standard deviation values and two relative standard deviation values were obtained, one for each data set (refer to Table 3).
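The calculations in Equations 2 to 4 can be sketched in Python for data set 1. This is a minimal illustration using only the standard library; the variable names are hypothetical, and `statistics.stdev` is chosen on the assumption that the sample standard deviation (n - 1 in the denominator) is the intended form.

```python
import statistics

# Data set 1 from Table 1, in grams.
data_set_1 = [3.5348, 3.556, 3.5806, 3.5902, 3.6113, 3.6484]

mean = statistics.mean(data_set_1)   # sum of values / number of values
s = statistics.stdev(data_set_1)     # sample standard deviation (n - 1)
rsd_ppt = s / mean * 1000            # relative standard deviation, in ppt

print(f"mean = {mean:.4f} g")
print(f"s    = {s:.5f} g")
print(f"RSD  = {rsd_ppt:.2f} ppt")
```

For data set 1 this gives a mean of 3.5869 g, a standard deviation of 0.04024 g, and a relative standard deviation of 11.22 ppt, in agreement with Table 3.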
For the purpose of this paper, only the standard deviations of data sets 1 and 2, which are 0.04024 g and 0.0355 g, respectively, will be interpreted, since the standard deviation values are not large. Because the standard deviation describes how closely the sample values in a data set cluster around the mean, a high standard deviation means that the sample values are widely scattered, while a low standard deviation means that they are close to the mean (fgse.nova.edu).
In both data sets the standard deviation is small relative to the mean, so the individual sample values lie close to the mean. Hence, neither data set contains deviant values.

The fourth and fifth parameters are the range and the relative range. Like the standard deviation and relative standard deviation, these two are measures of precision. The range is simply the difference between the highest and lowest sample values in a data set and is the simplest measure of spread (statistics.laerd.com). The range is easy to compute and useful, but its use is limited. The relative range, like the relative standard deviation, is expressed in relative terms. Equations 5 and 6 show how the range and relative range are calculated.

Equation 5. Range formula

$$R = X_{highest} - X_{lowest}$$

Equation 6. Relative range formula

$$RR = \frac{R}{\bar{x}} \times 1000 \text{ ppt}$$

Here x̄ is the mean of the data set.

The last parameter is the confidence interval, or confidence limits. It is a measure of accuracy. Accuracy refers to the closeness of the values obtained in a given set (in most cases, the mean) to the true value (eqavet.eu).
The confidence interval is defined as the estimated range of values that is likely to include an unknown population parameter (Easton and McColl). Easton and McColl add that “the width of the confidence interval gives some idea about the uncertainty of the unknown parameter. A very wide interval may indicate that more data should be collected before anything very definite can be said about the parameter”. Equation 7 shows how confidence limits or confidence intervals are calculated.

Equation 7. Confidence limits or confidence interval formula

$$CL = \bar{x} \pm \frac{t \, s}{\sqrt{n}}$$

Here x̄ is the mean, s is the standard deviation, n is the number of samples, and t is the Student's t value at the chosen confidence level.
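Equations 5 to 7 can be sketched in Python for data set 1. This is a minimal sketch under stated assumptions: the confidence limit is taken as the mean plus or minus t·s/√n, and the Student's t value (2.571, the two-tailed 95% value for 5 degrees of freedom) is hard-coded from a standard t table. Neither the function name `confidence_limits` nor the t value is taken from the original report.

```python
import math
import statistics

# Data set 1 from Table 1, in grams.
data_set_1 = [3.5348, 3.556, 3.5806, 3.5902, 3.6113, 3.6484]

# Assumed two-tailed Student's t at 95% confidence for n - 1 = 5 degrees
# of freedom, read from a standard t table (not given in the report).
T_95_DF5 = 2.571


def confidence_limits(values, t_value):
    """Return (mean, half_width) with half_width = t * s / sqrt(n)."""
    n = len(values)
    mean = statistics.mean(values)
    s = statistics.stdev(values)  # sample standard deviation (n - 1)
    return mean, t_value * s / math.sqrt(n)


data_range = max(data_set_1) - min(data_set_1)           # Equation 5
mean, half_width = confidence_limits(data_set_1, T_95_DF5)
relative_range_ppt = data_range / mean * 1000            # Equation 6

print(f"R  = {data_range:.4f} g")
print(f"RR = {relative_range_ppt:.2f} ppt")
print(f"CL = {mean:.4f} ± {half_width:.4f} g")           # Equation 7
```

The range (0.1136 g) and relative range (31.67 ppt) agree with Table 3. The half-width of the confidence limits scales directly with the assumed t value, so it will differ from a tabulated value that was computed with a different t.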
When the parameters obtained for data sets 1 and 2 are compared, the results do not vary greatly: from the Q test to the confidence limits, the values for the two data sets are close to each other. However, this statement is based only on observation, since no test was performed to check whether the values obtained for data sets 1 and 2 differ significantly.

REFERENCES

* 6 Basic Statistical Tools. Food and Agriculture Organization of the United Nations. fao.org. Web. 22 April
* Accuracy of Measurement. Eqavet. eqavet.eu. Web. 22 April 2013
* Coin design and production. Bangko Sentral ng Pilipinas. bsp.gov.ph. Web. 22 April 2013
* Dixon’s Q-test: Detection of a single outlier. University of Athens Department of Chemistry. chem.uoa.gr. Web. 21 April 2013
* Easton, V. J. and McColl, J. H. Confidence Interval. Statistics Glossary. stats.gla.ac.uk. Web. 22 April 2013
* Introduction to Fundamental Concepts of Chemistry. City Collegiate. citycollegiate.com. Web. 22 April 2013
* Measures of Central Tendency. Australian Bureau of Statistics. abs.gov.au. Web. 21 April 2013
* Measures of Precision. The Chemistry Hypermedia Project. files.chem.vt.edu. Web. 21 April 2013
* Measures of Spread. Laerd Statistics. statistics.laerd.com. Web. 22 April 2013
* Random vs Systematic Error. Physics University of Maryland. umdphysics.umd.edu. Web. 21 April 2013
* The Standard Deviation and the Normal Curve. Nova Southeastern University. fgse.nova.edu. Web. 22 April 2013