
Hunter College, City University of New York, Department of Curriculum &
Teaching
EDSTATS Primer
Review of Statistics
DEFINITION AND KEY TERMS
Definition - Statistics is a body of mathematical techniques or
processes for gathering, organizing, analyzing, and interpreting numerical data.
It is a basic tool of measurement, evaluation, and research.
Key terms to be familiar with are:
- Case, subject or observation - person, place or thing which is the object
of the research. In educational research, it frequently involves students or
teachers.
- Data Element or variable - an item of data which is collected for each
case in the study, and which can vary or have more than one value. Common
variables collected in educational research are sex, ethnicity, year of birth,
test scores, etc.
- Value - each individual piece of information (e.g. answer, score,
response, etc.) for each variable in a study. For example, common values
for the variable "Sex" are "F" or "M" (sometimes also coded
as 1 or 2), which represent female and male respectively.
- Data Record - a collection of data elements or variables.
- Data File - a collection of data records.
- Degrees of Freedom - a mathematical concept which indicates the number of
observations or values in a distribution that are independent of each other or
are free to vary. They are used with various measures such as t-tests,
analysis of variance, Chi-square, etc. to refine the results of treatments of
probability or chance in determining statistical significance. For example,
if you have a distribution of three numbers which could vary but the sum of
which has to equal 100, although one could select three separate numbers, in
reality, one only has to select two numbers because the third number would be
determined by the first two numbers. For instance, if you select 30
and 50, the third number has to be 20. The numbers 30 and 50 are independent
but 20 is dependent on the first two numbers. In this example, there are two
independent values or two degrees of freedom. Calculating the degrees of
freedom for many statistical measures can be time-consuming and complex.
Fortunately, most statistical computer software packages calculate degrees of
freedom automatically. The abbreviation for degrees of freedom is "DF"
and appears routinely on many statistical reports.
- Scales of Measurement - assignment of numbers to data to help categorize,
organize, and interpret them. There are four types of measurement scales:
- Nominal Scale - numbers represent categories or classifications such as sex
codes, ethnicity codes, etc.
- Ordinal Scale - numbers represent rank order such as a ranking of a class
by grade point average.
- Interval Scale - similar to ordinal scale but, in addition, the intervals
between adjacent numbers are equal, as with most standardized test
scores.
- Ratio Scale - similar to ordinal and interval scales, and, in addition,
has an absolute zero so that numbers can be compared by ratios such as one
number being two times or three times larger than another number.
- Statistical Significance - an indication of the probability of a finding
having occurred by chance. It has nothing to do with importance but is
simply an indication of probability. Researchers have adopted a
general standard of statistical significance, referred to as the .05 level of
significance: the probability that the finding occurred by chance must be
5% or less. Analysis of variance, t-tests, regression, and Chi-square make
extensive use of statistical significance.
- Standard Error - an estimate of how much a calculated measure (e.g. mean,
correlation, difference of means) would vary from sample to sample. It is used
to infer that the true value of the measure lies within a stipulated range from
slightly below to slightly above the value actually calculated.
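As a rough illustration of the degrees of freedom and standard error entries above, the following Python sketch works through both with made-up scores. The data and variable names are assumptions for illustration only and are not part of the original primer. The standard error shown is the common formula for the standard error of the mean (sample standard deviation divided by the square root of the number of cases); other measures have their own formulas.

```python
import math
import statistics

# The three-number example from the text: the sum must equal 100, so once
# two numbers are chosen freely, the third is fully determined.
first, second = 30, 50
third = 100 - first - second   # only two values were free to vary

# Hypothetical test scores for a small sample of students (made-up data).
scores = [72, 85, 90, 66, 78, 88, 95, 70]
n = len(scores)
mean = statistics.mean(scores)

# For a sample standard deviation, the deviations from the mean must sum
# to zero, so only n - 1 of them are free to vary.
df = n - 1

# statistics.stdev divides the sum of squared deviations by n - 1.
sd = statistics.stdev(scores)

# Standard error of the mean: how much the sample mean would be expected
# to vary from one sample of this size to another.
sem = sd / math.sqrt(n)

print(f"third number = {third}")
print(f"mean = {mean}, DF = {df}, SD = {sd:.2f}, standard error = {sem:.2f}")
```

A smaller standard error means a narrower range around the calculated mean within which the true mean is likely to lie.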
TYPES OF DATA
In the application of statistical treatments, two types of data are
recognized:
- Parametric Data - data which is measured and which is assumed to be
normally or near normally distributed. Examples include most standardized
tests such as I.Q. tests, S.A.T., G.R.E., etc.
- Non-Parametric Data - data which is distribution-free, and which is
generally counted or ranked. Examples include demographic data such as
sex or ethnicity, and categorized data such as pass/fail results or
yes/no responses.
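The distinction between the two types of data can be sketched in Python as follows. The scores and pass/fail results are invented for illustration, and the choice of summaries (a mean for measured data, counts for categorized data) is one simple way to show the contrast rather than a prescribed procedure.

```python
from collections import Counter
import statistics

# Measured data (hypothetical I.Q. scores): summarized with a mean.
iq_scores = [98, 105, 110, 92, 101, 115, 100]
mean_iq = statistics.mean(iq_scores)

# Counted data (hypothetical pass/fail results): summarized with frequencies.
results = ["pass", "fail", "pass", "pass", "fail", "pass", "pass"]
counts = Counter(results)

print(f"mean I.Q. = {mean_iq}")
print(f"pass = {counts['pass']}, fail = {counts['fail']}")
```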
TYPES OF ANALYSIS
In the application of statistical treatments, two types of analysis are
recognized:
- Descriptive Analysis - limits generalizations or conclusions, based on
statistical analysis, to the particular group of individuals or cases
observed. No attempt is made to extend these generalizations or conclusions
beyond the observed group.
- Inferential Analysis - draws conclusions about a larger population based
on a smaller sample which is assumed to be representative of the larger
population from which it is drawn. An important aspect of inferential analysis
is establishing the representativeness of the sample, which
is usually achieved through random selection.
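A minimal sketch of the idea behind inferential analysis, assuming a simulated population of scores: a random sample is drawn and its mean is used as an estimate of the population mean. The population parameters (mean 500, standard deviation 100), the sample size, and the seed are all arbitrary assumptions chosen for illustration.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch gives the same result each run

# Simulated population of 10,000 scores (assumed, not real data).
population = [random.gauss(500, 100) for _ in range(10_000)]

# Random selection gives every case the same chance of being chosen.
sample = random.sample(population, 100)

# Descriptive analysis would stop at describing the sample itself;
# inferential analysis uses the sample mean to estimate the population mean.
population_mean = statistics.mean(population)
sample_mean = statistics.mean(sample)

print(f"population mean = {population_mean:.1f}")
print(f"sample mean     = {sample_mean:.1f}")
```

In practice the population mean is unknown, which is why the representativeness of the sample matters.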
STATISTICAL MEASURES
- Measures of Central Tendency - are averages or what is typical for a group
of values such as scores, grades, etc. The three major measures of central
tendency are the mean, median and mode.
- Measures of Spread or Dispersion - are statistical measures which show
contrasts or differences in a group of values. The major measures of spread are
the range, deviation, variance, and standard deviation.
- Measures of Relative Position - are conversions of values, usually
standardized test scores, to show where a given value stands in relation to
other values of the same grouping. The most common example is
the conversion of scores on standardized tests to show where a given student
stands in relation to other students of the same age, grade level, etc.
Sigma scores, College Board scores, percentiles, stanines, and standard scores
are examples of converted test scores.
- Measures of Relationship - are statistical measures which show a
relationship between two or more paired variables or two or more sets of
data. The major statistical measure of relationship is the
correlation coefficient.
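The four groups of measures above can be illustrated with a short Python sketch. The scores and study hours are made-up data for five hypothetical students. The percentile rank here is computed as the percentage of scores below a given score, which is only one of several definitions in use. The correlation is the Pearson correlation coefficient, written out by hand so the calculation is visible.

```python
import math
import statistics

# Hypothetical paired data for five students (made-up values).
hours = [1, 2, 3, 4, 5]          # hours of study
scores = [70, 80, 80, 90, 100]   # test scores

# Measures of central tendency
mean = statistics.mean(scores)
median = statistics.median(scores)
mode = statistics.mode(scores)

# Measures of spread or dispersion
score_range = max(scores) - min(scores)
deviations = [s - mean for s in scores]   # distance of each score from the mean
variance = statistics.variance(scores)    # sample variance (divides by n - 1)
sd = statistics.stdev(scores)             # square root of the variance

# Measures of relative position for the top score
z_score = (100 - mean) / sd               # standard (sigma) score
percentile_rank = 100 * sum(s < 90 for s in scores) / len(scores)

# Measure of relationship: Pearson correlation between hours and scores
mean_hours = statistics.mean(hours)
cross = sum((h - mean_hours) * (s - mean) for h, s in zip(hours, scores))
ss_hours = sum((h - mean_hours) ** 2 for h in hours)
ss_scores = sum(d ** 2 for d in deviations)
r = cross / math.sqrt(ss_hours * ss_scores)

print(f"mean={mean}, median={median}, mode={mode}")
print(f"range={score_range}, variance={variance}, SD={sd:.2f}")
print(f"z-score of 100 = {z_score:.2f}, percentile rank of 90 = {percentile_rank}")
print(f"correlation r = {r:.3f}")
```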