Posts

Confounder Variable

A confounder is a variable that is related to both the independent and the dependent variable and that partially, or even entirely, accounts for the relationship between the two. An important thing to note about confounders is that they are not included in the hypothesis and are generally not measured. This makes it impossible to determine what the actual effect of a confounder was. The only thing to do is to repeat the study and control the confounder by making sure it takes on the same value for all participants. Example: suppose we want to estimate house prices, and our data set has these variables: size, number of bedrooms, address, zip code, and age of the house. We might find that size is strongly correlated with the price of the house, but in our data set size is also correlated with the zip code, and if we look deeper we might find that the size of the hou...
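The house-price example can be illustrated with a small sketch. The six houses below are made-up numbers, not real data, and `pearson_r` is a helper written for this illustration. The point is only to show how a correlation between size and price can come entirely from the zip code: it is strong when all houses are pooled, and it disappears once the zip code is held at the same value.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Made-up houses: (zip_code, size, price). Illustrative values only.
houses = [
    ("A", 50, 110), ("A", 60, 90), ("A", 70, 110),
    ("B", 100, 310), ("B", 110, 290), ("B", 120, 310),
]

# Correlation over all houses, ignoring the zip code.
pooled_r = pearson_r([h[1] for h in houses], [h[2] for h in houses])

# Correlation inside each zip code, i.e. with the confounder held fixed.
within_r = {}
for zip_code in sorted({h[0] for h in houses}):
    group = [h for h in houses if h[0] == zip_code]
    within_r[zip_code] = pearson_r([h[1] for h in group], [h[2] for h in group])

print(f"pooled r = {pooled_r:.2f}")
for zip_code, r in within_r.items():
    print(f"zip {zip_code}: r = {abs(r):.2f}")
```

In this toy data set the bigger houses all sit in the more expensive zip code, so size appears to drive price until the zip code is controlled.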

Why can't we just perform multiple ANOVAs?

Post Hoc Tests for ANOVA. When the explanatory variable represents more than two groups, a significant ANOVA does not tell us which groups differ from the others. To determine that, we need to perform a post hoc test, which conducts post hoc paired comparisons. Post hoc means after the fact. These paired comparisons must be conducted in a particular way in order to prevent excessive type 1 error. Type 1 error, as you'll recall, occurs when you reject the null hypothesis although the null hypothesis is true. Why can't we just perform multiple ANOVAs? As you know, we accept significance and reject the null hypothesis at p less than or equal to 0.05, which means accepting a 5% chance of committing a type 1 error when the null hypothesis is true. There's actually a 5% chance of making a type 1 error for each analysis of variance ...
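To see how quickly the 5% per-test risk adds up, here is a minimal sketch of the family-wise error rate, 1 - (1 - alpha)^n. It assumes the tests are independent, which pairwise comparisons on the same data are not, so treat the numbers as an approximation. The function name is made up for this illustration.

```python
def familywise_error_rate(alpha, n_tests):
    """Chance of at least one type 1 error across n independent tests."""
    return 1 - (1 - alpha) ** n_tests

for n_tests in (1, 3, 6, 10):
    rate = familywise_error_rate(0.05, n_tests)
    print(f"{n_tests:2d} tests -> {rate:.3f}")
```

Under that independence assumption, three comparisons already carry about a 14% chance of at least one false positive, and ten comparisons about 40%.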

Scientific Method

The scientific method is based on systematic observation and consistent logic. Applying the scientific method increases our chances of coming up with valid explanations. It also provides a way to evaluate the plausibility of our scientific claims, or hypotheses, and the strength of the empirical evidence that we provide for these hypotheses in our research. Claims should be empirically testable, replicable, objective, transparent, falsifiable, and logically consistent. Empirically testable: it should be possible to collect empirical or physical evidence or observations that will either support or contradict the hypothesis. Replicable: a study and its findings should be replicable, meaning we should be able to consistently repeat the original study. If the expected result occurs only once, or in very few cases, then the result could just have been coincidental...

Variable Types

Quantitative variables: numerical, measurable quantities for which arithmetic operations often make sense. Continuous: can take on any value within an interval, so there are many possible values. Discrete: countable values, typically a finite number of them. Categorical (qualitative) variables: classify items into different groups. Ordinal: the groups have an order or ranking. Nominal: the groups are merely names, with no ranking. Examples: Employees were asked to report their typical daily commute time, in minutes. What type of variable would their response be considered? Employees were asked to report their typical daily mode of transportation to and from work (e.g. car, bike, bus). What type of variable would their response be considered? The company wanted to know how employees perceived the work of upper management. Employees were asked to report their satisfaction with upper management using a 1 to 5 scale (with the following representations: 1 - Extremely Unsat...
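The four categories can be sketched as a tiny decision rule. The names `VariableType` and `classify` are invented for this illustration, and the three calls reflect one reasonable reading of the survey questions. Treating commute time as continuous is an assumption, since a time reported in whole minutes could also be argued to be discrete.

```python
from enum import Enum

class VariableType(Enum):
    CONTINUOUS = "quantitative, continuous"
    DISCRETE = "quantitative, discrete"
    ORDINAL = "categorical, ordinal"
    NOMINAL = "categorical, nominal"

def classify(is_quantity, is_countable=False, is_ordered=False):
    """Pick a variable type from three yes/no questions about the data."""
    if is_quantity:
        # Measured quantity: countable values are discrete, otherwise continuous.
        return VariableType.DISCRETE if is_countable else VariableType.CONTINUOUS
    # Group labels: ranked groups are ordinal, unranked groups are nominal.
    return VariableType.ORDINAL if is_ordered else VariableType.NOMINAL

commute_time = classify(is_quantity=True)                     # minutes of travel
transport_mode = classify(is_quantity=False)                  # car, bike, bus
satisfaction = classify(is_quantity=False, is_ordered=True)   # 1 to 5 scale

for name, kind in [("commute time", commute_time),
                   ("transport mode", transport_mode),
                   ("satisfaction", satisfaction)]:
    print(f"{name}: {kind.value}")
```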

The idea behind ANOVA

When you're testing a hypothesis with a categorical explanatory variable and a quantitative response variable, the tool that you should use is analysis of variance, also called ANOVA. The question we need to answer with the ANOVA F test is: are the differences among the sample means due to true differences among the population means, or merely due to sampling variability? To answer this question using our data, we obviously need to look at the variation among the sample means. But that's not enough. We also need to look at the variation among the sample means relative to the variation within the groups. So F is the variation among sample means divided by the variation within groups, a quantity which measures to what extent the difference among the sample groups' means dominates over the us...
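The ratio described above can be computed by hand. This is a minimal sketch of the one-way ANOVA F statistic using the usual mean squares (between-group sum of squares over k - 1, within-group sum of squares over N - k). The function name and the three groups are made up for illustration.

```python
def f_statistic(groups):
    """One-way ANOVA F: between-group mean square over within-group mean square."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Variation among the sample means, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within

# Made-up data: three groups whose means (5, 8, 11) are far apart
# compared with the spread inside each group.
groups = [[4, 5, 6], [7, 8, 9], [10, 11, 12]]
print(f_statistic(groups))
```

For these groups the between-group mean square is 27 and the within-group mean square is 1, so F is 27: the differences among the means dominate the variation within groups.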

Choose A Statistical Test

You will always be interpreting p-values, regardless of the inferential test that you use. The specific statistical test that you use to evaluate your hypotheses will depend on the type of explanatory and response variables that you have chosen.
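As a rough sketch, the choice can be written as a lookup on the two variable types. Only the ANOVA row comes from these posts. The chi-square and Pearson correlation rows are common textbook pairings added here as an assumption, and `choose_test` is a hypothetical helper name.

```python
def choose_test(explanatory, response):
    """Suggest a test from variable types: 'categorical' or 'quantitative'."""
    table = {
        ("categorical", "quantitative"): "ANOVA",
        # Assumed textbook pairings, not stated in the posts themselves:
        ("categorical", "categorical"): "chi-square test of independence",
        ("quantitative", "quantitative"): "Pearson correlation",
    }
    key = (explanatory, response)
    if key not in table:
        raise ValueError(f"no suggestion for {explanatory} -> {response}")
    return table[key]

print(choose_test("categorical", "quantitative"))
```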

What is a p value?

The p-value provides an estimate of how often we would get a result at least as extreme as the one obtained by chance alone if, in fact, the null hypothesis is true. In statistics, a result is called statistically significant if it's unlikely to have occurred by chance alone. The most commonly used standard or cutoff is 0.05, or 5%. Because this cutoff is so important, it has a special name: it's called the significance level of a test and is usually denoted by the Greek letter alpha, so alpha equals 0.05. If the p-value is small, less than 0.05, this suggests that a result this extreme would turn up in fewer than 5% of repeated samples drawn from a population in which the null hypothesis is true, that is, in the sampling distribution under the null hypothesis. If the p-value is less than alpha, which is usually 0.05, then the data we got is considered to be rare or surprising enough if the null hypothesis, H0, were true. And we say that the data provides significant evidence against the ...
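A small worked case may make the definition concrete. This sketch uses a hypothetical coin-flip experiment, not anything from the post: the null hypothesis is that the coin is fair, and the one-sided p-value is the probability of seeing at least the observed number of heads under that hypothesis. The function name is invented for the example.

```python
from math import comb

def binomial_p_value(n_flips, n_heads):
    """P(at least n_heads heads in n_flips flips) if the coin is fair (H0)."""
    favourable = sum(comb(n_flips, k) for k in range(n_heads, n_flips + 1))
    return favourable / 2 ** n_flips

alpha = 0.05
p_value = binomial_p_value(10, 9)  # hypothetical result: 9 heads in 10 flips
print(f"p = {p_value:.4f}, significant at alpha = {alpha}: {p_value < alpha}")
```

Here 11 of the 1024 equally likely outcomes have nine or more heads, so the p-value is about 0.011. That is below alpha, so this result would count as significant evidence against a fair coin.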