The only Internet Resource about Statistics Recommended by Encyclopedia Britannica
StatSoft has freely provided the Electronic Statistics Textbook as a public service for more than 17 years now.
This Textbook offers training in the understanding and application of statistics. The material was developed at the StatSoft R&D department based on many years of teaching undergraduate and graduate statistics courses and covers a wide variety of applications, including laboratory research (biomedical, agricultural, etc.), business statistics, credit scoring, forecasting, social science statistics and survey research, data mining, engineering and quality control applications, and many others.
The Electronic Textbook begins with an overview of the relevant elementary (pivotal) concepts and continues with a more in-depth exploration of specific areas of statistics, organized by “modules” and accessible by buttons, representing classes of analytic techniques. A glossary of statistical terms and a list of references for further study are included.
(Electronic Version): StatSoft, Inc. (2012). Electronic Statistics Textbook. Tulsa, OK: StatSoft. WEB: http://www.statsoft.com/textbook/.
(Printed Version): Hill, T. & Lewicki, P. (2007). STATISTICS: Methods and Applications. StatSoft, Tulsa, OK.
Overview of Elementary Concepts in Statistics. In this introduction, we will briefly discuss those elementary statistical concepts that provide the necessary foundations for more specialized expertise in any area of statistical data analysis. The selected topics illustrate the basic assumptions of most statistical methods and/or have been demonstrated in research to be necessary components of one’s general understanding of the “quantitative nature” of reality (Nisbett, et al., 1987). Because of space limitations, we will focus mostly on the functional aspects of the concepts discussed and the presentation will be very short. Further information on each of those concepts can be found in the Introductory Overview and Examples sections of this manual and in statistical textbooks. Recommended introductory textbooks are: Kachigan (1986), and Runyon and Haber (1976); for a more advanced discussion of elementary theory and assumptions of statistics, see the classic books by Hays (1988), and Kendall and Stuart (1979).
How to Interpret Statistical Analysis Results
Written by: STATISTICA News
Statistical tests examine a variety of relationships in data, but they share some common elements. Typically, a statistical test states a null and an alternative hypothesis, calculates a test statistic, and reports an associated p-value; the analyst then draws a conclusion from the result. This process is followed for simple tests as well as complex ones. Once you achieve a basic understanding of the process of statistical hypothesis testing, the concepts can be generalized to all tests.
Stating the Hypothesis
Statistical tests start with a null and alternative hypothesis. These hypotheses are statements about the population from which the sample was drawn. The sample data are used to support either the null or alternative hypothesis. A given test has one or more standard null and alternative hypotheses. For example, a one-sample t-test of the null hypothesis H0 : μ = μ0 has three possible alternative hypotheses:
(1) Ha : μ ≠ μ0
(2) Ha : μ > μ0
(3) Ha : μ < μ0
where μ represents the population mean and μ0 is the hypothesized mean. The first is a two-sided hypothesis where the researcher is looking for a significant difference between the population mean and the hypothesized mean. The second and third are one-sided alternatives where the researcher hypothesizes that the true mean is either greater than (2) or less than (3) the hypothesized mean.
In a test for normality of data, the null hypothesis is: H0 : X ~ N(μ,σ) versus the alternative that the data are not normally distributed.
Calculating the Test Statistic and p-Value
Test statistics are used to decide between the null and alternative hypotheses. Depending on the test, they can follow one of a variety of statistical distributions, so a test statistic is hard to interpret on its own: the critical value for deciding between the null and alternative hypothesis varies by test.
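As an illustration of how a test statistic is computed, the one-sample t-test mentioned earlier uses t = (sample mean − μ0) / (s / √n), where s is the sample standard deviation and n the sample size. The following is a minimal Python sketch of that standard formula. The data values and the hypothesized mean are made up for illustration and do not come from this article, and the function name is hypothetical.

```python
import math
import statistics

def one_sample_t(sample, mu0):
    """Return (t statistic, degrees of freedom) for a one-sample t-test."""
    n = len(sample)
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)        # sample standard deviation (n - 1 denominator)
    standard_error = sd / math.sqrt(n)   # estimated standard error of the mean
    return (mean - mu0) / standard_error, n - 1

# Hypothetical data, chosen only to illustrate the arithmetic.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
t_stat, df = one_sample_t(scores, mu0=4)
print(f"t = {t_stat:.3f} with {df} degrees of freedom")
```

For these made-up values the statistic is about 1.32 on 7 degrees of freedom. Whether that is large enough to reject the null depends on the t distribution with those degrees of freedom, which is why the p-value is usually easier to read than the statistic itself.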
A p-value is the probability of obtaining sample data at least as extreme as the observed data, given that the null hypothesis is true. Although it is not technically accurate, it is often easier to think of the p-value as a measure of support for the null hypothesis. Before the analysis, a threshold is chosen, called alpha or the level of significance. If the calculated p-value is less than the threshold, typically 0.05, then the null hypothesis is rejected in favor of the alternative. Said another way, the test is statistically significant. In STATISTICA, statistically significant p-values are reported in red.
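The decision rule just described can be written as a very small function. This is only a sketch: the function name and the example p-values are illustrative, and the default of 0.05 simply mirrors the typical threshold mentioned above.

```python
def decide(p_value, alpha=0.05):
    """Apply the decision rule: reject H0 only when p is less than alpha."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    return "reject H0" if p_value < alpha else "fail to reject H0"

# Illustrative p-values, one on each side of the threshold.
print(decide(0.012))
print(decide(0.0527))
```

Note that alpha is fixed before looking at the data; changing it after seeing the p-value defeats the purpose of the threshold.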
Conclusion and Interpretation of Results
The p-value computed by the test leads you to reject or fail to reject the null hypothesis. (When the p-value is reported in red, reject the null hypothesis.) This conclusion should then be interpreted in terms of your project. A good interpretation will not mention hypotheses or test statistics. The interpretation will simply state the conclusion in the context of the problem.
Fail to Reject H0
When a test fails to reject the null hypothesis, it means that insufficient evidence exists to support the alternative hypothesis. Some examples of this include:
- A significant difference does not exist between the population means of A and B.
- The correlation between A and B is not significantly different from 0.
- The distribution of the data is not significantly different from Normal.
- The regression parameter does not explain a significant amount of the variability in y. (The regression parameter is not significantly different from 0.)
The conclusion is not to accept the null hypothesis. The nonsignificant result from the test may be because the null is true. It may also be because random chance or too small a sample made it impossible to detect a real effect.
Reject H0
When a test does reject the null hypothesis, it does so in favor of the alternative hypothesis. The reject H0 conclusions for the same tests given above are:
- A significant difference exists between the population means of A and B. (Or, the population mean of A is significantly greater than the population mean of B.)
- The correlation between A and B is significantly different from 0.
- The distribution of the data is significantly different from Normal.
- The regression parameter does explain a significant amount of the variability in y. (The regression parameter is significantly different from 0.)
Example
In a pain relief study, researchers are studying the effects of the pain relief medicine, aspirin, compared to a placebo. Pain relief scores were recorded for two groups of people who were given either aspirin or the placebo. Greater pain relief scores indicate better pain relief. The hypothesis to test is that pain relief will be different for patients given the aspirin compared to the placebo. Let’s write these as statistical hypotheses.
H0 : μaspirin = μplacebo
Ha : μaspirin ≠ μplacebo
The null hypothesis states that average pain relief for patients given aspirin is equal to the average pain relief for patients given a placebo. The alternative hypothesis (which is what the research team believes to be true) states that the average pain relief for patients given aspirin is not equal to relief from the placebo.
Looking at the table of output, the sample mean pain relief for patients given aspirin is 59.1, which is greater than the sample mean of 56.3 for patients given the placebo. These sample statistics are used to compute a test statistic for making inferences about the populations they represent. That test statistic is computed to be 2.09. Under the null hypothesis, it follows a Student's t distribution with N-2=16 degrees of freedom. The two-sided p-value for this test is 0.0527.
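The reported p-value can be checked approximately from the test statistic and the degrees of freedom alone. The sketch below is not how STATISTICA computes it; it is a standard-library approximation that integrates the Student's t density with Simpson's rule, and the function names are hypothetical. Because the article reports t rounded to 2.09, the result should agree with 0.0527 only to about three decimal places.

```python
import math

def t_pdf(x, df):
    """Density of the Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p(t_stat, df, steps=2000):
    """Two-sided p-value: probability of |T| at least |t_stat| under H0."""
    b = abs(t_stat)
    if b == 0:
        return 1.0
    h = b / steps                      # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * total * h / 3   # area between -|t| and +|t| by symmetry
    return 1 - central_area

p = two_sided_p(2.09, 16)
print(f"two-sided p-value is about {p:.3f}")
```

The value comes out at roughly 0.053, slightly above the usual 0.05 threshold, which matches the reported result.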
To make a conclusion, the p-value is compared to the alpha level of significance. In this case, alpha = 0.05. The p-value = 0.0527 > 0.05 = alpha. The test fails to reject the null hypothesis. The test does not show a significant difference between the mean pain relief for patients given aspirin and those given a placebo.
Conclusion: A significant difference does not exist between the population average pain relief for patients given aspirin vs. those given a placebo.
The conclusion is not that the means are equal, but that they are not significantly different. It is possible that a difference in average pain relief does exist between the two groups. One possible reason the test did not detect it is that the experiment did not collect enough data (samples). With additional data points, the statistical power of the test is improved. Another possibility is that random chance led to a sample with greater variability or a different mean than what is typical of the population.
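To make the point about sample size concrete, the sketch below uses the standard equal-variance two-sample statistic, t = difference / (pooled SD × √(2/n)), for two groups of n patients each. It rests on assumptions that are not in the article: that the 18 patients were split evenly into two groups of 9 (the article gives only N-2=16), and that larger groups would show the same mean difference and the same spread, which real data would not guarantee. The pooled standard deviation is backed out from the reported t of 2.09 rather than taken from the output table.

```python
import math

def two_sample_t(diff, pooled_sd, n_per_group):
    """Equal-variance t statistic for two groups of equal size."""
    return diff / (pooled_sd * math.sqrt(2 / n_per_group))

diff = 59.1 - 56.3       # difference in sample means from the example
assumed_n = 9            # assumption: 18 patients split evenly
# Pooled SD implied by the reported t = 2.09 under that assumption.
implied_sd = diff / (2.09 * math.sqrt(2 / assumed_n))

# Hypothetical: same difference and spread, larger groups.
for n in (9, 18, 36):
    print(n, round(two_sample_t(diff, implied_sd, n), 2))
```

Under these assumptions, doubling the group size multiplies the test statistic by √2, from 2.09 to about 2.96, while the degrees of freedom also increase. The same observed difference would then be statistically significant at alpha = 0.05.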