Cross tab

Cross tabs (or cross tabulations) describe the joint distribution of two or more variables. They are usually presented in a matrix called a contingency table. Whereas a frequency distribution table describes the distribution of one variable, a contingency table describes the distribution of two or more variables simultaneously. It merges two or more frequency distribution tables into one. Each cell gives the number of respondents who gave that combination of responses; that is, each cell contains a single cross tabulation.
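As a minimal sketch of how joint counts relate to the one-variable frequency distributions, the following Python snippet cross-tabulates a handful of made-up survey responses. The response data and the names `responses`, `cells`, and `table` are hypothetical and serve only as illustration.

```python
from collections import Counter

# Hypothetical survey responses: one (usage, group) pair per respondent.
responses = [
    ("heavy", "A"), ("heavy", "A"), ("light", "A"),
    ("light", "B"), ("none", "B"), ("none", "B"), ("heavy", "B"),
]

# Joint frequencies: one count per combination of responses.
cells = Counter(responses)

# Frequency distribution of each variable taken on its own.
row_totals = Counter(usage for usage, _ in responses)
col_totals = Counter(group for _, group in responses)

# Arrange the joint counts as a matrix (rows: usage, columns: group).
rows = sorted(row_totals)
cols = sorted(col_totals)
table = [[cells[(r, c)] for c in cols] for r in rows]

for name, counts in zip(rows, table):
    print(name, counts)
```

Summing a row or a column of `table` recovers the single-variable frequency distribution, which is the sense in which the contingency table merges the two distributions.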

The following is an example of a 3 × 2 contingency table. The variable “Wikipedia usage” has three categories: heavy user, light user, and non-user. These categories are all-inclusive, so each column sums to 100%. The other variable, “intelligence”, has two categories: smart and air head. These categories are not all-inclusive, so the rows need not sum to 100%. Each cell gives the percentage of subjects in that column's intelligence category who fall into that row's usage category.

	smart	air head
heavy Wiki user	70%	5%
light Wiki user	25%	35%
non Wiki user	5%	60%
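The claim about column and row sums can be checked directly. The sketch below copies the percentages from the table; the dictionary layout and variable names are illustrative choices, not part of the original example.

```python
# Column percentages copied from the example table.
table = {
    "heavy Wiki user": {"smart": 70, "air head": 5},
    "light Wiki user": {"smart": 25, "air head": 35},
    "non Wiki user": {"smart": 5, "air head": 60},
}

# Sum each column across the three usage categories.
column_sums = {}
for row in table.values():
    for col, pct in row.items():
        column_sums[col] = column_sums.get(col, 0) + pct

# Sum each row across the two intelligence categories.
row_sums = {name: sum(row.values()) for name, row in table.items()}

print(column_sums)  # {'smart': 100, 'air head': 100}
print(row_sums)     # {'heavy Wiki user': 75, 'light Wiki user': 60, 'non Wiki user': 65}
```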

Cross tabs are frequently used because:

They are easy to understand. They appeal to people who do not understand the more sophisticated measures.
They can be used with any level of data: nominal, ordinal, interval, or ratio. Cross tabs treat all data as if it were nominal.
A table can provide greater insight than single statistics.
It solves the problem of empty or sparse cells

The statistics associated with cross tabs are:

Chi-squared - This tests the statistical significance of the cross tabulations. Chi-squared should not be calculated for percentages. The cross tabs must be converted back to absolute counts (numbers) before calculating chi-squared. Chi-squared is also problematic when any cell has an expected frequency of less than five.
Contingency Coefficient - This tests the strength of association of the cross tabulations. It is a variant of the phi coefficient that adjusts for sample size. Values start at 0 (no association) and approach, but never reach, 1; the attainable maximum depends on the number of rows and columns.
Cramér’s V - This tests the strength of association of the cross tabulations. It is a variant of the phi coefficient that adjusts for the number of rows and columns. Values range from 0 (no association) to 1 (the theoretical maximum possible association).
Lambda Coefficient - This tests the strength of association of the cross tabulations when the variables are measured at the nominal level. Values range from 0 (no association) to 1 (the theoretical maximum possible association). Asymmetric lambda measures the percentage improvement in predicting the dependent variable. Symmetric lambda measures the percentage improvement when prediction is done in both directions.
Tau b - This tests the strength of association of the cross tabulations when both variables are measured at the ordinal level. It makes adjustments for ties and is most suitable for square tables. Values range from -1 (perfect negative association) through 0 (no association) to +1 (perfect positive association).
Tau c - This tests the strength of association of the cross tabulations when both variables are measured at the ordinal level. It makes adjustments for ties and is most suitable for rectangular tables. Values range from -1 (perfect negative association) through 0 (no association) to +1 (perfect positive association).
Gamma - This tests the strength of association of the cross tabulations when both variables are measured at the ordinal level. It makes no adjustment for either table size or ties. Values range from -1 (perfect negative association) through 0 (no association) to +1 (perfect positive association).
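Several of these statistics can be computed by hand from a table of counts. The sketch below uses the standard textbook formulas for chi-squared, the contingency coefficient, Cramér’s V, and gamma. It assumes, purely for illustration, that the example table was built from 100 subjects in each column, so that the counts equal the percentages; the article does not give the real sample sizes. It also treats both variables as ordered, which gamma requires. All function names are hypothetical.

```python
import math

# Assumed counts: 100 hypothetical subjects per column, so counts equal the percentages.
# Rows: heavy, light, non user. Columns: smart, air head.
observed = [
    [70, 5],
    [25, 35],
    [5, 60],
]

def chi_squared(table):
    """Return (chi-squared statistic, total count) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            # Expected count under independence of the two variables.
            expected = row_totals[i] * col_totals[j] / n
            stat += (obs - expected) ** 2 / expected
    return stat, n

def contingency_coefficient(chi2, n):
    return math.sqrt(chi2 / (chi2 + n))

def cramers_v(chi2, n, n_rows, n_cols):
    return math.sqrt(chi2 / (n * min(n_rows - 1, n_cols - 1)))

def gamma(table):
    """Goodman-Kruskal gamma from concordant and discordant pairs; ties are ignored."""
    concordant = discordant = 0
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            for later in table[i + 1:]:
                concordant += count * sum(later[j + 1:])  # lower and to the right
                discordant += count * sum(later[:j])      # lower and to the left
    return (concordant - discordant) / (concordant + discordant)

chi2, n = chi_squared(observed)
c = contingency_coefficient(chi2, n)
v = cramers_v(chi2, n, len(observed), len(observed[0]))
g = gamma(observed)
print(round(chi2, 2), round(c, 3), round(v, 3), round(g, 3))
```

Under these assumed counts every expected frequency is well above five, so the small-cell caveat for chi-squared does not apply. With the rows ordered from heavy to non-user and the columns from smart to air head, gamma comes out positive, meaning lighter use goes with the air head column; reversing either ordering flips its sign.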