This topic describes the available options for significance calculations.
In the significance options window, the options are as follows:
Test against: Select what comparison will be made, as follows:
Counts when independent: comparison of cell with total (Chi²). For details, see counts when independent.
All other columns: comparison of the column profile with the percentage obtained in the other columns. For details, see compared to all other columns.
All other rows: comparison of the row profile with the percentage obtained in the other rows. For details, see compared to all other rows.
Count threshold: The minimum count to be taken into account in a cell.
High significance (%): The percentage at which values are to be regarded as highly significant.
Normal significance (%): The percentage at which values are to be regarded as of normal significance.
Low significance (%): The percentage at which values are to be regarded as of low significance.
In the results, significant values will be indicated by the following symbols:
High threshold: +++ or --- (a significance of 99%)
Medium threshold: ++ or -- (a significance of 95%)
Low threshold: + or - (a significance of 90%)
"+" indicates that it is a significant increase
"-" indicates that it is a significant decrease
The test allows comparison of Test Values with threshold values.
N = Total base
EffInd(i,j) = Total(i) * Total(j) / N
PrcInd(i,j) = EffInd(i,j) / N
PrcObs(i,j) = EffObs(i,j) / N
Test value = (PrcObs(i,j) - PrcInd(i,j)) / Square root( PrcInd(i,j) * (1 - PrcInd(i,j)) / N )
=> If the test value is greater than the threshold, the difference is significant: upwards if the coefficient is positive, downwards if it is negative.
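As an illustration, here is a minimal Python sketch of this test value calculation; the function name and the example figures are our own and are not part of the product.

import math

def cell_test_value(eff_obs, total_i, total_j, n):
    # Count expected in cell (i,j) under independence
    eff_ind = total_i * total_j / n
    # Percentages under independence and as observed
    prc_ind = eff_ind / n
    prc_obs = eff_obs / n
    # Standardized difference between observed and independent percentages
    return (prc_obs - prc_ind) / math.sqrt(prc_ind * (1 - prc_ind) / n)

# Hypothetical cell: 50 observed, row total 120, column total 150, base 330
print(round(cell_test_value(50, 120, 150, 330), 2))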
We make the hypothesis that there is no significant difference between the observed and the theoretical counts. We then calculate the value of the Khi²: for each response item, the squared difference between the observed count and the theoretical count is divided by the theoretical count, and these terms are summed over all response items. We then use the Khi² tables to obtain the result, with the number of degrees of freedom (which is the number of response items minus 1).
Khi2 = SUMi (Eoi - Eti)² / Eti
where Eoi is the observed count and Eti is the theoretical count for response item i.
Example:
Sex                  Male    Female
Observed counts      169     161
Theoretical counts   165     165
Khi2 = (169 - 165)² / 165 + (161 - 165)² / 165 ≈ 0.194
Reading the Khi² tables with one degree of freedom, we get a probability of about 66% of being mistaken if we reject the hypothesis. We can therefore accept the hypothesis that the sample is representative of the population.
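The figures of this example can be checked with a short Python sketch (our own illustration, not product code); for one degree of freedom the Khi² tail probability can be written with the complementary error function.

import math

observed = [169, 161]
theoretical = [165, 165]

# Khi2: sum over the response items of the squared difference divided by the theoretical count
khi2 = sum((o - t) ** 2 / t for o, t in zip(observed, theoretical))

# For 1 degree of freedom, P(Khi2 > x) = erfc(sqrt(x / 2))
p_value = math.erfc(math.sqrt(khi2 / 2))

print(round(khi2, 3), round(p_value, 2))  # approximately 0.194 and 0.66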
After having done a cross-tab count, we can ask whether there is a dependency between the two questions. In other words, does knowing the response to one question give us information about the other question? For example, we might ask whether there is a significant difference in the preference for a type of packaging between men and women. 62.5% of men preferred "packaging B" compared to 83.33% of women. This difference may seem significant, but would it still have seemed so if the results had been 80% and 82%, or if we had only surveyed 10 people? To establish whether or not there is a significant dependence between the two closed questions, we therefore use the Khi² test.
Before defining what dependency is, let us first define what independence is. Independence means that having information on one question does not yield any information on the other. This implies that each row profile is equal to the profile obtained in the single counts, and the same applies to each column profile.
There are three available tests:
In the Tools menu, choose the desired method of calculation.
To obtain the counts for independence, we calculate Cntindij:
Fi. : frequency in row i
F.j : frequency in column j
Cij : observed counts
N : size of the population
We then have:
Cntindij = Fi. x F.j x N
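As a sketch of this formula (the function name and table layout are our own assumptions), the independence counts of a whole table could be computed in Python as follows.

def independence_counts(table):
    # table: list of rows of observed counts Cij
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    # Cntind_ij = Fi. x F.j x N, with Fi. = row total / N and F.j = column total / N
    return [[(ri / n) * (cj / n) * n for cj in col_totals] for ri in row_totals]

# Hypothetical 2 x 2 table of observed counts
print(independence_counts([[45, 75], [35, 175]]))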
The Khi² is calculated by summing, over all cells, the squared difference between the observed count and the independence count, divided by the independence count. If the two tables are identical, the value is 0.
Khi2 = SUMij (Cij - Cntindij)² / Cntindij
When we look up this value in the Khi² table at "n" degrees of freedom, we obtain the error percentage incurred if we reject the independence hypothesis.
The number of degrees of freedom is calculated in the following way:
NDL = (number of response items in columns - 1) x (number of response items in rows - 1)
From the table of the squared differences between the observed counts and the independence counts, divided by the independence counts, we calculate the contributions to the Khi². The "-" or "+" signs indicate whether the cell is below or above its count for independence.
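Continuing the sketch above (still our own illustration), the Khi², the number of degrees of freedom NDL and the signed contributions could be computed as follows.

def khi2_contributions(table):
    expected = independence_counts(table)  # counts for independence, from the sketch above
    khi2 = 0.0
    contributions = []
    for obs_row, exp_row in zip(table, expected):
        signed_row = []
        for obs, exp in zip(obs_row, exp_row):
            contrib = (obs - exp) ** 2 / exp
            khi2 += contrib
            # "+" if the cell is above its independence count, "-" if it is below
            signed_row.append(contrib if obs >= exp else -contrib)
        contributions.append(signed_row)
    # NDL = (response items in columns - 1) x (response items in rows - 1)
    ndl = (len(table[0]) - 1) * (len(table) - 1)
    return khi2, ndl, contributions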
N1 = Total(j)
N2 = N - N1
If N1 and N2 are greater than the count threshold:
P1 = Observe(i,j) / N1
P2 = (Total(i) - Observe(i,j) ) / N2
If the standard deviation is known, we calculate D, which follows a normal law with mathematical expectation 0 (P1 - P2 = 0) and standard deviation s'd, where f is an estimate calculated as follows:
f = (P1 * N1 + P2 * N2) / (N1 + N2)
s'd = Square root( f * (1 - f) * (1/N1 + 1/N2) )
D = (P1 - P2) / s'd
If the standard deviation is not known, we calculate D, which follows a normal law with mathematical expectation P1 - P2 and standard deviation sd = Square root( P1*(1-P1)/N1 + P2*(1-P2)/N2 ):
Test value = (P1 - P2) / Square root( P1*(1-P1)/N1 + P2*(1-P2)/N2 )
=> If the test value is greater than the threshold, the difference is significant: upwards if the coefficient is positive, downwards if it is negative.
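For illustration, here is a minimal Python sketch of this column comparison under the formulas above; the function name, the pooled flag and the example figures are our own assumptions.

import math

def column_test_value(observe_ij, total_i, total_j, n, pooled=False):
    # N1: base of the column being tested, N2: base of all the other columns
    n1 = total_j
    n2 = n - n1
    p1 = observe_ij / n1
    p2 = (total_i - observe_ij) / n2
    if pooled:
        # Known standard deviation case: pooled estimate f
        f = (p1 * n1 + p2 * n2) / (n1 + n2)
        sd = math.sqrt(f * (1 - f) * (1 / n1 + 1 / n2))
    else:
        # Unknown standard deviation case
        sd = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p1 - p2) / sd

# Hypothetical cell: 45 answers in a column of 80 respondents, row total 120, base 330
print(round(column_test_value(45, 120, 80, 330), 2))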
N1 = Total(i)
N2 = N - N1
If N1 and N2 are greater than the count threshold:
P1 = Observe(i,j) / N1
P2 = (Total(j) - Observe(i,j) ) / N2
If the standard deviation is known, we calculate D, which follows a normal law with mathematical expectation 0 (P1 - P2 = 0) and standard deviation s'd, where f is an estimate calculated as follows:
f = (P1 * N1 + P2 * N2) / (N1 + N2)
s'd = Square root( f * (1 - f) * (1/N1 + 1/N2) )
D = (P1 - P2) / s'd
If the standard deviation is not known, we calculate D, which follows a normal law with mathematical expectation P1 - P2 and standard deviation sd = Square root( P1*(1-P1)/N1 + P2*(1-P2)/N2 ):
Test value = (P1 - P2) / Square root( P1*(1-P1)/N1 + P2*(1-P2)/N2 )
=> If the test value is greater than the threshold, the difference is significant: upwards if the coefficient is positive, downwards if it is negative.
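The row comparison uses the same formulas with the roles of the row and column totals exchanged, so the column_test_value sketch above can be reused (again, our own illustration).

def row_test_value(observe_ij, total_i, total_j, n, pooled=False):
    # Same formulas with N1 = Total(i) and P2 built from the column total
    return column_test_value(observe_ij, total_j, total_i, n, pooled)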