Statistics

Unit Review Sheet

These facts and definitions should be mastered throughout this unit. This page can be used for periodic review and study as you are finishing the unit and in the future.

Facts and Definitions

Lesson 1: Data and Statistical Questions

Statistics is a field of mathematics that involves collecting, analyzing, interpreting, and presenting information.
Data is pieces of information.
A statistical question is designed to collect data. It asks about a specific attribute and expects to have variability in the answers.
Numerical (quantitative) data is information given in the form of numbers.
Categorical (qualitative) data is information given in the form of words or labels that describe a category.

Lesson 2: Populations and Samples

A population is the whole group that is being studied.
A sample is a smaller part of a population. A good sample is likely to provide data that is representative of the population as a whole.
Bias is any influence that will result in data that is not accurately representative of the whole population.
Random sampling is a method of selecting a sample so that every member of the population has an equal chance of being selected.
A convenience sample is a sample of a population that is chosen because it provides an easy method of data collection.
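To see the difference between a random sample and a convenience sample, here is a small sketch in Python. The population of 30 numbered students and the sample size of 5 are made up for illustration.

```python
import random

# Hypothetical population: 30 students, identified by number.
population = list(range(1, 31))

# Random sample: every student has an equal chance of being selected.
rng = random.Random(7)  # fixed seed so the example is repeatable
random_sample = rng.sample(population, 5)

# Convenience sample: the first five students on the list,
# chosen only because they are the easiest to reach.
convenience_sample = population[:5]

print("Random sample:     ", random_sample)
print("Convenience sample:", convenience_sample)
```

The convenience sample always contains students 1 through 5, so it may not represent the whole group. That is one way bias can enter a study.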

Lesson 3: Frequency Tables and Dot Plots

A data set is a collection of all the information gathered from a statistical question.
Frequency refers to the number of times a value appears in a data set.
A frequency table is a table that lists each unique value in a data set and how many times that value occurs.
A distribution is a set of data that has been arranged in a specific order.
A dot plot is a graph that shows the frequency of data using dots.
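As a quick illustration, the sketch below builds a frequency table and a sideways dot plot in Python. The data set (number of pets owned by 10 students) is made up for this example.

```python
from collections import Counter

# Hypothetical data set: number of pets owned by 10 students.
pets = [0, 1, 1, 2, 0, 3, 1, 2, 1, 0]

# Frequency table: each unique value and how many times it occurs.
frequency = Counter(pets)

print("Value  Frequency  Dots")
for value in sorted(frequency):
    # One dot per occurrence gives a simple sideways dot plot.
    dots = "•" * frequency[value]
    print(f"{value:<6} {frequency[value]:<10} {dots}")
```

The frequencies add up to 10, the number of data values in the set.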

Lesson 4: Stem-and-Leaf Plots

A stem-and-leaf plot is a graph that organizes numerical data so that the frequency of the data values and the overall distribution of the data are easily seen.
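Here is one possible sketch of a stem-and-leaf plot in Python. The test scores are made up, and the example assumes two-digit values, so the tens digit is the stem and the ones digit is the leaf.

```python
from collections import defaultdict

# Hypothetical data set: ten test scores.
scores = [72, 85, 91, 78, 85, 64, 93, 70, 88, 79]

# The stem is the tens digit; the leaf is the ones digit.
plot = defaultdict(list)
for score in sorted(scores):
    plot[score // 10].append(score % 10)

for stem in sorted(plot):
    leaves = " ".join(str(leaf) for leaf in plot[stem])
    print(f"{stem} | {leaves}")
```

Each row shows one stem, and the number of leaves in the row is the frequency of scores in that range.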

Lesson 5: Histograms

A histogram is a bar graph used to display numerical data that is grouped into intervals. An interval represents a range of data rather than a single data value.
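The sketch below shows how data values can be grouped into intervals for a histogram. The data set (minutes of homework) and the interval width of 10 are assumptions chosen for illustration.

```python
# Hypothetical data set: minutes spent on homework by 12 students.
minutes = [12, 25, 31, 8, 47, 33, 19, 22, 38, 41, 5, 27]
width = 10  # assumed interval width

# Count how many data values fall in each interval (0-9, 10-19, ...).
counts = {}
for value in minutes:
    start = (value // width) * width  # lower end of the interval
    counts[start] = counts.get(start, 0) + 1

# Print one bar per interval; the bar length is the frequency.
for start in sorted(counts):
    label = f"{start}-{start + width - 1}"
    print(f"{label:<6} {'#' * counts[start]}")
```

Notice that each bar stands for a range of values, not a single data value.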

Lesson 6: Measures of Center

Data sets can be described by the shape of their distribution. A graph of the data may show a uniform, skewed, symmetric, or random distribution.
A measure of center is a number that identifies the center value of a distribution. Mean, median, and mode are measures of center.
The mean of a data set is the average of all the data values in the set. The mean is calculated by adding all the data values in a data set together and dividing by the number of data values.
The median is the value that is at the exact center of a data set. To find the median, first arrange the data values in numerical order. For an odd number of data values, the median is the number in the center. To find the median of an even number of data values, add the two center data values and divide the sum by two.
The mode of a data set is the value that occurs the most frequently. To find the mode, arrange the data values in numerical order, then look for the value that is repeated the most. A data set can have more than one mode or no mode at all.
An outlier is a data value that is much higher or lower than the rest of the data values.
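The measures of center above can be checked with Python's built-in statistics module. The data set is made up, and the value 40 is added afterward as a hypothetical outlier to show how it affects the mean and the median.

```python
from statistics import mean, median, multimode

# Hypothetical data set, already in numerical order.
data = [3, 5, 5, 6, 8, 9]

data_mean = mean(data)        # (3 + 5 + 5 + 6 + 8 + 9) / 6 = 6
data_median = median(data)    # even count: (5 + 6) / 2 = 5.5
data_modes = multimode(data)  # list of the most frequent value(s): [5]

# Add a hypothetical outlier and recompute.
with_outlier = data + [40]
outlier_mean = mean(with_outlier)      # 76 / 7, about 10.9
outlier_median = median(with_outlier)  # odd count: the center value, 6

print(data_mean, data_median, data_modes)
print(outlier_mean, outlier_median)
```

In this example the outlier pulls the mean up from 6 to about 10.9, while the median only moves from 5.5 to 6.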

Lesson 7: Measures of Variability

Variability refers to how spread out the data values are and how they relate to one another.
The range is the difference between the maximum data value and the minimum data value.
The interquartile range is the difference between the third quartile value and the first quartile value. It shows the spread of the middle half of the data values in a data set.
The mean absolute deviation is the average distance between each data value and the mean of the data set.
A box plot is a graph that shows how spread out a data set is using a five-number summary. You may also see this graph called a box-and-whisker plot.
The five numbers used to summarize the data on a box plot are: the minimum value, the first quartile, the median, the third quartile, and the maximum value.
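The measures of variability above can be worked out step by step, as in this Python sketch. The data set is made up. The quartiles are found with a common classroom method (the medians of the lower and upper halves of the data), which is an assumption here; some calculators and software use slightly different rules.

```python
from statistics import mean, median

# Hypothetical data set.
data = [2, 4, 5, 7, 8, 10, 12, 16]
ordered = sorted(data)

# Range: maximum value minus minimum value.
data_range = max(ordered) - min(ordered)  # 16 - 2 = 14

# Quartiles (assumed method): medians of the lower and upper halves.
# For an odd count, the overall median is left out of both halves.
half = len(ordered) // 2
q1 = median(ordered[:half])   # median of 2, 4, 5, 7 = 4.5
q3 = median(ordered[-half:])  # median of 8, 10, 12, 16 = 11
iqr = q3 - q1                 # 11 - 4.5 = 6.5

# Mean absolute deviation: average distance from the mean.
center = mean(ordered)                            # 64 / 8 = 8
distances = [abs(x - center) for x in ordered]    # 6, 4, 3, 1, 0, 2, 4, 8
mad = mean(distances)                             # 28 / 8 = 3.5

# Five-number summary used to draw a box plot.
five_number = (min(ordered), q1, median(ordered), q3, max(ordered))

print(data_range, iqr, mad)
print(five_number)
```

For this data set the five-number summary is 2, 4.5, 7.5, 11, and 16.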

Lesson 8: Making Inferences

An inference is a reasonable conclusion about a population that comes from analyzing the data from a sample of the population.
Sampling variability is the difference in a measure, such as the mean, between two or more samples of the same population. In general, the larger the sample size, the smaller the sampling variability.
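The effect of sample size on sampling variability can be explored with a small simulation. This is only a sketch: the population of the numbers 1 to 1,000, the sample sizes of 5 and 200, and the 50 repeated samples are all assumptions chosen for illustration.

```python
import random
from statistics import mean

rng = random.Random(1)             # fixed seed so the run is repeatable
population = list(range(1, 1001))  # hypothetical population of 1,000 values


def spread_of_sample_means(sample_size, trials=50):
    """Return the range (max - min) of the means of repeated random samples."""
    means = [mean(rng.sample(population, sample_size)) for _ in range(trials)]
    return max(means) - min(means)


small_sample_spread = spread_of_sample_means(5)
large_sample_spread = spread_of_sample_means(200)

print("Spread of means, sample size 5:  ", small_sample_spread)
print("Spread of means, sample size 200:", large_sample_spread)
```

The means of the larger samples should stay much closer together than the means of the smaller samples.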

Lesson 9: Comparing Populations

Data from two or more populations can be compared using visual and mathematical observations.
Mathematical computations enable a person to be confident in inferences made using visual observations.
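As an example of backing up a visual observation with computation, the sketch below compares two made-up samples using the mean and the mean absolute deviation. Expressing the gap between the means as a multiple of the mean absolute deviation is one common approach and is an assumption here, not a method stated on this sheet.

```python
from statistics import mean


def mean_absolute_deviation(values):
    """Average distance between each value and the mean of the values."""
    center = mean(values)
    return mean([abs(x - center) for x in values])


# Hypothetical samples: heights in inches from two teams.
team_a = [60, 62, 66, 68]
team_b = [66, 68, 72, 74]

mean_a, mean_b = mean(team_a), mean(team_b)  # 64 and 70
mad_a = mean_absolute_deviation(team_a)      # (4 + 2 + 2 + 4) / 4 = 3
mad_b = mean_absolute_deviation(team_b)      # (4 + 2 + 2 + 4) / 4 = 3

# How many "typical distances" apart are the two centers?
gap_in_mads = (mean_b - mean_a) / mad_a      # 6 / 3 = 2.0

print(mean_a, mean_b, mad_a, mad_b, gap_in_mads)
```

Both samples have the same spread, and the means are 2 mean absolute deviations apart, which supports the visual impression that Team B tends to be taller.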

Lesson 10: Unit 8 Test

[none]

Final Project: Statistical Study

[none]