Data Science

Posts

Why can't we just perform multiple ANOVAs?

January 05, 2019

Post Hoc Tests For ANOVA

When the explanatory variable has more than two groups, a significant ANOVA does not tell us which groups differ from the others. To find out, we need to perform a post hoc test, which conducts post hoc paired comparisons. Post hoc means "after the fact". These paired comparisons must be conducted in a particular way to prevent excessive Type 1 error. A Type 1 error, as you'll recall, occurs when you reject the null hypothesis even though it is true.

Why can't we just perform multiple ANOVAs? As you know, we accept significance and reject the null hypothesis at p less than or equal to 0.05. That threshold means a 5% chance of committing a Type 1 error when the null hypothesis is true. There's actually a 5% chance of making a Type 1 error for each analysis of variance ...
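To make the "5% chance for each analysis" point concrete, here is a minimal sketch of how the chance of at least one Type 1 error grows with the number of tests. It assumes the tests are independent and that every null hypothesis is true; the function name `familywise_error_rate` is made up for this illustration.

```python
def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Chance of at least one Type 1 error across n_tests tests.

    Assumes the tests are independent and every null hypothesis is true.
    """
    # Probability that no test gives a false positive is (1 - alpha) ** n_tests.
    return 1.0 - (1.0 - alpha) ** n_tests


if __name__ == "__main__":
    for n_tests in (1, 3, 6, 10):
        rate = familywise_error_rate(0.05, n_tests)
        print(f"{n_tests:>2} tests -> {rate:.3f}")
    #  1 tests -> 0.050
    #  3 tests -> 0.143
    #  6 tests -> 0.265
    # 10 tests -> 0.401
```

Under these assumptions, ten separate tests at alpha 0.05 already carry roughly a 40% chance of at least one false positive, which is why paired comparisons need a correction.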

Scientific Method

December 29, 2018

The scientific method is based on systematic observation and consistent logic. Applying it increases our chances of coming up with valid explanations. It also provides a way to evaluate the plausibility of our scientific claims, or hypotheses, and the strength of the empirical evidence that we provide for them in our empirical study or research. The criteria are: empirically testable, replicable, objective, transparent, falsifiable, and logically consistent.

Empirically testable: it should be possible to collect empirical or physical evidence or observations that will either support or contradict the hypothesis.

Replicable: a study and its findings should be replicable, meaning we should be able to consistently repeat the original study. If the expected result occurs only once, or in very few cases, then the result could just have been coincidental...

Variable Types

December 23, 2018

Quantitative variables are numerical, measurable quantities for which arithmetic operations often make sense. A continuous variable can take on any value within an interval, so it has many possible values. A discrete variable takes countable values, a finite number of them.

Categorical, or qualitative, variables classify items into different groups. In an ordinal variable the groups have an order or ranking. In a nominal variable the groups are merely names, with no ranking.

Examples:

Employees were asked to report their typical daily commute time, in minutes. What type of variable would their response be considered?

Employees were asked to report their typical daily mode of transportation to and from work (i.e. Car, Bike, Bus, etc.). What type of variable would their response be considered?

The company wanted to know how employees perceived the work of upper management. Employees were asked to report their satisfaction with upper management using a 1 to 5 scale (with the following representations: 1 - Extremely Unsat...

The idea behind ANOVA

December 17, 2018

When you're testing a hypothesis with a categorical explanatory variable and a quantitative response variable, the tool that you should use is Analysis of Variance, also called ANOVA. The question we need to answer with the ANOVA F test is: are the differences among the sample means due to true differences among the population means, or merely due to sampling variability? To answer this question using our data, we obviously need to look at the variation among the sample means. But that's not enough. We also need to look at that variation relative to the variation within the groups. So F is the variation among sample means divided by the variation within groups. This quantity measures to what extent the difference among the sample groups' means dominates over the us...
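The ratio described above can be sketched in a few lines of plain Python. This is only an illustration of the standard one-way ANOVA F statistic (between-group mean square over within-group mean square), not code from the post; the function name `f_statistic` and the sample data are made up.

```python
from statistics import mean


def f_statistic(groups):
    """One-way ANOVA F: variation among group means / variation within groups."""
    k = len(groups)                        # number of groups
    n_total = sum(len(g) for g in groups)  # total number of observations
    grand_mean = mean(x for g in groups for x in g)

    # Variation among sample means, weighted by group size.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


if __name__ == "__main__":
    # Hypothetical data: three groups of three observations each.
    sample = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
    print(f_statistic(sample))  # about 13.0
```

A large F means the group means are spread far apart relative to the scatter inside each group. Turning F into a p-value additionally requires the F distribution with (k - 1, n_total - k) degrees of freedom, which this sketch leaves out.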

Choose A Statistical Test

December 17, 2018

You will always be interpreting p-values, regardless of the inferential test that you use. The specific statistical test that you use to evaluate your hypotheses will depend on the type of explanatory and response variables that you have chosen.

What is a p value?

December 17, 2018

The p-value provides an estimate of how often we would get a result at least as extreme as the one obtained, by chance alone, if in fact the null hypothesis is true. In statistics, a result is called statistically significant if it's unlikely to have occurred by chance alone. The most commonly used standard, or cutoff, is 0.05, or 5%. Because this cutoff is so important, it has a special name: the significance level of a test. It is usually denoted by the Greek letter alpha, so alpha equals 0.05. If the p-value is small, less than 0.05, this means that fewer than 5% of repeated samples drawn from the population (that is, of the sampling distribution) would show an association at least this strong if the null hypothesis were true. It does not mean the association is 95% likely to be real. If the p-value is less than alpha, which is usually 0.05, then the data we got are considered rare or surprising enough when the null hypothesis, H0, is true. And we say that the data provide significant evidence against the ...
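As a small worked example of "how often would we see a result this extreme if the null hypothesis were true", the sketch below computes an exact one-sided p-value for a made-up coin experiment: 16 heads in 20 flips, with the null hypothesis that the coin is fair. The scenario and the function name `binomial_p_value` are assumptions for illustration only.

```python
from math import comb


def binomial_p_value(n_flips: int, n_heads: int) -> float:
    """One-sided p-value: P(at least n_heads heads in n_flips fair flips)."""
    # Count the outcomes that are at least as extreme as the observed one.
    extreme = sum(comb(n_flips, k) for k in range(n_heads, n_flips + 1))
    return extreme / 2 ** n_flips


if __name__ == "__main__":
    alpha = 0.05  # significance level
    p = binomial_p_value(20, 16)
    print(f"p = {p:.4f}")  # p = 0.0059
    print("significant" if p < alpha else "not significant")  # significant
```

Here a fair coin would produce 16 or more heads in only about 0.6% of such experiments, so at alpha equal to 0.05 the data count as significant evidence against the null hypothesis of a fair coin.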
