This topic describes the available options for significance calculations.
In the significance options window, the options are as follows:
Test against: Select what comparison will be made, as follows:
Counts when independent: comparison of cell with total (Chi²). For details, see counts when independent.
All other columns: comparison of the column profile with the percentage obtained in the other columns. For details, see compared to all other columns.
All other rows: comparison of the row profile with the percentage obtained in the other rows. For details, see compared to all other rows.
Count threshold: The minimum count to be taken into account in a cell.
High significance (%): The percentage at which values are to be regarded as highly significant.
Normal significance (%): The percentage at which values are to be regarded as of normal significance.
Low significance (%): The percentage at which values are to be regarded as of low significance.
In the results, significant values will be indicated by the following symbols:
High threshold: +++ or --- (a significance of 99%)
Medium threshold: ++ or -- (a significance of 95%)
Low threshold: + or - (a significance of 90%)
"+" indicates that it is a significant increase
"-" indicates that it is a significant decrease
The test allows comparison of Test Values with threshold values.
N = Total base
EffInd(i,j) = Total(i) * Total(j) / N
PrcInd(i,j) = EffInd(i,j) / N
PrcObs(i,j) = EffObs(i,j) / N
Test value = (PrcObs(i,j) - PrcInd(i,j)) / Square root( PrcInd(i,j) * (1 - PrcInd(i,j)) / N )
=> If the test value is greater than the threshold, the difference is significant: upwards if the coefficient is positive, downwards if it is negative.
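As an illustration, here is a minimal Python sketch of this test value calculation; the function name and the example figures are our own and are not part of the product.

import math

def cell_test_value(eff_obs, total_i, total_j, n):
    # Count expected in cell (i,j) under independence
    eff_ind = total_i * total_j / n
    # Percentages under independence and as observed
    prc_ind = eff_ind / n
    prc_obs = eff_obs / n
    # Standardized difference between observed and independent percentages
    return (prc_obs - prc_ind) / math.sqrt(prc_ind * (1 - prc_ind) / n)

# Hypothetical cell: 50 observed, row total 120, column total 150, base 330
print(round(cell_test_value(50, 120, 150, 330), 2))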
We make the hypothesis that there is no significant difference between the observed and the theoretical counts. We then calculate the value of the Khi²: for each response item, the squared difference between the observed count and the theoretical count is divided by the theoretical count, and these terms are summed over all response items. We then use the Khi² tables to obtain the result, with the number of degrees of freedom (which is the number of response items minus 1).
Khi2 = SUMi (Eoi - Eti)² / Eti
where Eoi is the observed count and Eti is the theoretical count for response item i.
Example:
Sex                  Male    Female
Observed counts      169     161
Theoretical counts   165     165
Khi2 = (169 - 165)² / 165 + (161 - 165)² / 165 ≈ 0.194
Reading the Khi² tables with one degree of freedom, we get a probability of about 66% of being mistaken if we reject the hypothesis. We can therefore accept the hypothesis that the sample is representative of the population.
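The figures of this example can be checked with a short Python sketch (our own illustration, not product code); for one degree of freedom the Khi² tail probability can be written with the complementary error function.

import math

observed = [169, 161]
theoretical = [165, 165]

# Khi2: sum over the response items of the squared difference divided by the theoretical count
khi2 = sum((o - t) ** 2 / t for o, t in zip(observed, theoretical))

# For 1 degree of freedom, P(Khi2 > x) = erfc(sqrt(x / 2))
p_value = math.erfc(math.sqrt(khi2 / 2))

print(round(khi2, 3), round(p_value, 2))  # approximately 0.194 and 0.66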
After having done a cross-tab count, we can ask whether there is a dependency between the two questions. In other words, does knowing the response to one question give us information about the other question? For example, we might ask whether there is a significant difference in the preference for a type of packaging between men and women. 62.5% of men preferred "packaging B" compared to 83.33% of women. This difference may seem significant, but would it still have seemed so if the results had been 80% and 82%, or if we had only surveyed 10 people? To establish whether or not there is a significant dependence between the two closed questions, we therefore use the Khi² test.
Before defining what dependency is, let us first define what independence is. Independence means that having information on one question does not yield any information on the other. This implies that each row profile is equal to the profile obtained in the single counts, and the same applies to each column profile.
There are three available tests:
In the Tools menu, choose the desired method of calculation.
To obtain the counts for independence, we calculate Cntindij:
Fi. : frequency in row i
F.j : frequency in column j
Cij : observed counts
N : size of the population
We then have:
Cntindij = Fi. x F.j x N
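As a sketch of this formula (the function name and table layout are our own assumptions), the independence counts of a whole table could be computed in Python as follows.

def independence_counts(table):
    # table: list of rows of observed counts Cij
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    # Cntind_ij = Fi. x F.j x N, with Fi. = row total / N and F.j = column total / N
    return [[(ri / n) * (cj / n) * n for cj in col_totals] for ri in row_totals]

# Hypothetical 2 x 2 table of observed counts
print(independence_counts([[45, 75], [35, 175]]))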
The Khi² is calculated by summing, over all cells, the squared difference between the observed count and the independence count, divided by the independence count. If the two tables are identical, the value is 0.
Khi2 = SUMij (Cij - Cntindij)² / Cntindij
When we look up this value in the Khi² table at "n" degrees of freedom, we obtain the error percentage incurred if we reject the independence hypothesis.
The number of degrees of freedom is calculated in the following way:
NDL = (number of response items in columns - 1) x (number of response items in rows - 1)
From the table of the squared differences between the observed counts and the independence counts, divided by the independence counts, we calculate the contributions to the Khi². The "-" or "+" signs indicate whether the cell is below or above its count for independence.
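Continuing the sketch above (still our own illustration), the Khi², the number of degrees of freedom NDL and the signed contributions could be computed as follows.

def khi2_contributions(table):
    expected = independence_counts(table)  # counts for independence, from the sketch above
    khi2 = 0.0
    contributions = []
    for obs_row, exp_row in zip(table, expected):
        signed_row = []
        for obs, exp in zip(obs_row, exp_row):
            contrib = (obs - exp) ** 2 / exp
            khi2 += contrib
            # "+" if the cell is above its independence count, "-" if it is below
            signed_row.append(contrib if obs >= exp else -contrib)
        contributions.append(signed_row)
    # NDL = (response items in columns - 1) x (response items in rows - 1)
    ndl = (len(table[0]) - 1) * (len(table) - 1)
    return khi2, ndl, contributions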
N1 = Total(j)
N2 = N - N1
If N1 and N2 are greater than the count threshold:
P1 = Observe(i,j) / N1
P2 = (Total(i) - Observe(i,j) ) / N2
If the standard deviation is known, we calculate D, which follows a normal law with mathematical expectation 0 (P1 - P2 = 0) and standard deviation s'd, where f is an estimate calculated as follows:
f = (P1 * N1 + P2 * N2) / (N1 + N2)
s'd = Square root( f * (1 - f) * (1/N1 + 1/N2) )
D = (P1 - P2) / s'd
If the standard deviation is not known, we calculate D, which follows a normal law with mathematical expectation P1 - P2 and standard deviation sd = Square root( P1*(1-P1)/N1 + P2*(1-P2)/N2 ):
Test value = (P1 - P2) / Square root( P1*(1-P1)/N1 + P2*(1-P2)/N2 )
=> If the test value is greater than the threshold, the difference is significant: upwards if the coefficient is positive, downwards if it is negative.
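For illustration, here is a minimal Python sketch of this column comparison under the formulas above; the function name, the pooled flag and the example figures are our own assumptions.

import math

def column_test_value(observe_ij, total_i, total_j, n, pooled=False):
    # N1: base of the column being tested, N2: base of all the other columns
    n1 = total_j
    n2 = n - n1
    p1 = observe_ij / n1
    p2 = (total_i - observe_ij) / n2
    if pooled:
        # Known standard deviation case: pooled estimate f
        f = (p1 * n1 + p2 * n2) / (n1 + n2)
        sd = math.sqrt(f * (1 - f) * (1 / n1 + 1 / n2))
    else:
        # Unknown standard deviation case
        sd = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p1 - p2) / sd

# Hypothetical cell: 45 answers in a column of 80 respondents, row total 120, base 330
print(round(column_test_value(45, 120, 80, 330), 2))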
N1 = Total(i)
N2 = N - N1
If N1 and N2 are greater than the count threshold:
P1 = Observe(i,j) / N1
P2 = (Total(j) - Observe(i,j) ) / N2
If the standard deviation is known, we calculate D, which follows a normal law with mathematical expectation 0 (P1 - P2 = 0) and standard deviation s'd, where f is an estimate calculated as follows:
f = (P1 * N1 + P2 * N2) / (N1 + N2)
s'd = Square root( f * (1 - f) * (1/N1 + 1/N2) )
D = (P1 - P2) / s'd
If the standard deviation is not known, we calculate D, which follows a normal law with mathematical expectation P1 - P2 and standard deviation sd = Square root( P1*(1-P1)/N1 + P2*(1-P2)/N2 ):
Test value = (P1 - P2) / Square root( P1*(1-P1)/N1 + P2*(1-P2)/N2 )
=> If the test value is greater than the threshold, the difference is significant: upwards if the coefficient is positive, downwards if it is negative.
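The row comparison uses the same formulas with the roles of the row and column totals exchanged, so the column_test_value sketch above can be reused (again, our own illustration).

def row_test_value(observe_ij, total_i, total_j, n, pooled=False):
    # Same formulas with N1 = Total(i) and P2 built from the column total
    return column_test_value(observe_ij, total_j, total_i, n, pooled)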