16 min read • August 18, 2023
A Q
We know that studying for your AP exams can be stressful, but Fiveable has your back! We created a study plan to help you crush your AP Statistics exam. This guide will continue to update with information about the 2024 exams, as well as helpful resources to help you do your best on test day. Unlock Cram Mode for access to our cram events, where students who have successfully passed their AP exams will answer your questions and guide your last-minute studying LIVE! And don't miss out on unlimited access to our database of thousands of practice questions.
Going into test day, this is the format to expect:
Section 1: Multiple Choice - 50% of your score
40 questions in 1 hr 30 mins
Section 2: Free Response - 50% of your score
6 questions in 1 hr 30 mins
Part A: 65 mins
1 multipart question with a focus on collecting data
1 multipart question with a focus on exploring data
1 multipart question with a focus on probability and sampling distributions
1 multipart question with a focus on inference
1 multipart question that combines 2 or more skill categories
Part B: 25 mins
1 investigative task that assesses multiple skill categories and content areas
Check out the 2023 AP Statistics Free-Response Section posted on the College Board site.
First, download the AP Statistics Cheatsheet PDF - a single sheet that covers everything you need to know at a high level. Take note of your strengths and weaknesses!
We've put together the study plan found below to help you study between now and May. This will cover all of the units and question types to prepare you for your exam. Pay special attention to the units where you need the most improvement.
Study, practice, and review for test day with other students during our live cram sessions via Cram Mode. Cram live streams will teach, review, and practice important topics from AP courses, college admission tests, and college admission topics. These streams are hosted by experienced students who know what you need to succeed.
Before you begin studying, take some time to get organized.
Create a study space.
Make sure you have a designated place at home to study: somewhere you can keep all of your materials, focus on learning, and feel comfortable. Spend some time prepping the space with everything you need, and let others in the family know that this is your study space.
Organize your study materials.
Get your notebook, textbook, prep books, or whatever other physical materials you have. Also, create a space for you to keep track of review. Start a new section in your notebook to take notes or start a Google Doc to keep track of your notes. Get yourself set up!
Plan designated times for studying.
The hardest part about studying from home is sticking to a routine. Decide on one hour every day that you can dedicate to studying. This can be any time of the day, whatever works best for you. Set a timer on your phone for that time and really try to stick to it. The routine will help you stay on track.
Decide on an accountability plan.
How will you hold yourself accountable to this study plan? You may or may not have a teacher or rules set up to help you stay on track, so you need to set some for yourself. First, set your goal. This could be studying for x number of hours or getting through a unit. Then, create a reward for yourself. If you reach your goal, then x. This will help you stay focused!
Unit 1 is about creating and analyzing graphs of data. This includes both categorical and quantitative data. For categorical data, we should be able to read and create tables and bar graphs and calculate proportions/percentages. For quantitative data, we should be able to read and create dot plots, stemplots, histograms, and boxplots. We should also be able to describe the shape, center, variability (spread), and any unusual features of a distribution of quantitative data. This includes making calculations such as mean, median, range, interquartile range (IQR), and standard deviation. Our descriptions and calculations can be used to compare data from multiple groups. Finally, Unit 1 ends with describing the position of individuals within a quantitative data set, including using percentiles and z-scores. This leads us to an initial exploration of the Normal Distribution, though we will study that more in-depth in Units 4-5.
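As a quick illustration of the Unit 1 calculations, the sketch below computes a mean, median, sample standard deviation, and z-score using only Python's standard library. The data set `scores` and the helper `z_score` are made up for this example, and the percentile step assumes the distribution is approximately Normal.

```python
import statistics
from statistics import NormalDist

def z_score(value, mean, sd):
    """Number of standard deviations that value lies above (+) or below (-) the mean."""
    return (value - mean) / sd

# Hypothetical quiz scores (illustrative data only)
scores = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(scores)      # measure of center
median = statistics.median(scores)  # resistant measure of center
sd = statistics.stdev(scores)       # sample standard deviation (divides by n - 1)

# Position of the score 9 within this data set
z = z_score(9, mean, sd)

# Only if the distribution is roughly Normal does a z-score convert to a percentile.
percentile = NormalDist().cdf(z)

print(f"mean={mean}, median={median}, sd={sd:.3f}, z={z:.3f}, percentile={percentile:.3f}")
```

A z-score has no units, which is why it can be used to compare individuals from different distributions.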
Read these study guides:
1.0 Unit 1 Overview
1.10 The Normal Distribution
Watch these videos from the Fiveable archives:
Analyzing Categorical Variables: An intro to some key terms and graphs (use first 15 minutes)
Describing Data in a Distribution: A breakdown of percentiles and cumulative graphs
Normal Distributions: A good intro to all things Normal!
Check out these articles:
Relative Dominance: A real-life example of how z-scores can help compare individuals from different distributions, using golfers (source: Grantland)
Practice:
Practice an AP-Style Problem: check out this post and practice your free-response skills!
Check out some online applets:
Mean vs. Median interactive applet: Play with this applet to get a sense of how changing different data values impacts the mean and median
Normal Distribution applet: A visual of the Standard Normal Curve. Update the mean and standard deviation to look at any data set.
Unit 2 is about creating and analyzing graphs of data when two variables are measured about each individual in a data set. For categorical data, we should be able to read and create two-way tables or segmented bar graphs and calculate conditional percentages. These can be used to comment on the association (or lack thereof) between the two variables. For quantitative data, we should be able to read, create, and describe scatterplots, which can also be used to comment on the apparent association between two variables.
The second half of Unit 2 is then focused on linear regression, a process by which we can make predictions about one quantitative variable (a response variable) using another (an explanatory variable). We should be able to use Least-Squares Regression Lines (LSRLs) to make these predictions, and interpret several components of the LSRLs (including slope, intercept, and other calculated values such as s or r²).
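To make the regression quantities concrete, here is a minimal sketch that computes the slope, intercept, correlation r, and the standard deviation of the residuals s from scratch. The function name `least_squares_line` and the five data points are hypothetical, chosen only for illustration.

```python
import math

def least_squares_line(xs, ys):
    """Return (slope, intercept, r, s) for the least-squares line y-hat = intercept + slope * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

    slope = sxy / sxx
    intercept = y_bar - slope * x_bar   # the line always passes through (x-bar, y-bar)
    r = sxy / math.sqrt(sxx * syy)      # correlation coefficient

    # residual = observed y - predicted y
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    s = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))  # typical prediction error
    return slope, intercept, r, s

# Hypothetical data (illustrative only)
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

slope, intercept, r, s = least_squares_line(xs, ys)
predicted_at_3 = intercept + slope * 3
residual_at_3 = 5 - predicted_at_3  # observed y at x = 3 is 5

print(f"y-hat = {intercept:.2f} + {slope:.2f}x, r^2 = {r ** 2:.2f}, s = {s:.3f}")
```

Here r² is the proportion of the variation in y that is accounted for by the linear relationship with x, and s is in the same units as y.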
Read these study guides:
2.0 Unit 2 Overview
2.4 Representing the Relationship Between Two Quantitative Variables
2.5 Correlation
2.7 Residuals
Watch these videos from the Fiveable archives:
Analyzing Categorical Variables: Start at 14:38 for an example of two-way tables and stay for segmented bar graphs
Describing Scatterplots & Association: How to describe the direction, strength, and form of an association, as well as an introduction to the correlation coefficient r
Using Least-Squares Regression Lines: How to make predictions from regression lines and calculate residuals
Advanced Linear Regression: Interpreting "s", "r²", and reading computer outputs of regression data
Practice:
Practice an AP-Style Problem: check out this post and practice your free-response skills!
Check out some online applets:
Least-Squares Regression: Try to guess the least-squares regression line from a scatterplot of data
Just for fun!
Spurious Correlations: Data sets with very high "r" values that... well... you'll see... [Source: Tyler Vigen]
While Units 1-2 were about graphing and analyzing sets of data, Unit 3 is about examining the methods through which we can collect that data. For sample surveys, we should be able to describe various methods of selecting samples, particularly the random methods (simple random, stratified random, cluster, and systematic samples). However, not all samples are collected through a random process, and we should be prepared to discuss possible sources of bias in surveys (including via non-random selection processes).
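The four random sampling methods named above can be sketched with Python's `random` module. Everything here (the function names, the student ID numbers, and the way students are split into grades and homerooms) is a hypothetical setup for illustration.

```python
import random

def simple_random_sample(population, n, rng):
    """Every group of n individuals is equally likely to be chosen."""
    return rng.sample(population, n)

def stratified_random_sample(strata, n_per_stratum, rng):
    """Take a separate simple random sample from every stratum."""
    return [x for group in strata.values() for x in rng.sample(group, n_per_stratum)]

def cluster_sample(clusters, n_clusters, rng):
    """Randomly choose whole clusters and keep everyone in them."""
    chosen = rng.sample(list(clusters), n_clusters)
    return [x for name in chosen for x in clusters[name]]

def systematic_sample(population, k, rng):
    """Pick a random start among the first k individuals, then take every k-th one."""
    start = rng.randrange(k)
    return population[start::k]

rng = random.Random(42)         # fixed seed so the sketch is repeatable
students = list(range(1, 101))  # hypothetical ID numbers 1..100
grades = {"juniors": students[:50], "seniors": students[50:]}                # hypothetical strata
homerooms = {f"room_{i}": students[i * 10:(i + 1) * 10] for i in range(10)}  # hypothetical clusters

srs = simple_random_sample(students, 10, rng)
stratified = stratified_random_sample(grades, 5, rng)
clustered = cluster_sample(homerooms, 2, rng)
systematic = systematic_sample(students, 10, rng)
```

Note the contrast between the two grouped methods: a stratified sample takes some individuals from every group, while a cluster sample takes every individual from some groups.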
We then turn to the differences between observational studies and experiments, and the features of a well-designed experiment. We should be able to define many common terms associated with experiments (many of which you've likely seen in other courses!), and compare and contrast several common experimental designs: completely randomized design, randomized block design, and matched-pairs design.
Read these study guides:
3.0 Unit 3 Overview
3.1 Introducing Statistics: Do the Data We Collected Tell the Truth?
3.3 Random Sampling and Data Collection
Watch these videos from the Fiveable archives:
Sampling Methods and Sources of Bias: A breakdown of the different ways we can take samples, and how to talk about bias on the AP exam.
Experiments and Observational Studies: All things experiments! Includes a discussion of the possible pitfalls of observational studies (confounding)
Practice:
AP-Style Problem #1: a practice question on surveys and sampling methods.
AP-Style Problem #2: a practice question on observational studies/experiments
Unit 4 is where AP Statistics gets "math-y," with lots of calculations and formulas. We are asked to calculate or interpret probabilities in a variety of settings, beginning with the understanding that probability reflects what we should expect to occur over the long run. We should be able to design and execute simulations for a given scenario - and then the calculations begin. We should be able to calculate the probability of multiple events using a variety of strategies (including Two-Way Tables, Tree Diagrams, and/or Venn Diagrams).
We should also be able to categorize different events as "mutually exclusive" or "independent," with justification. Conditional probability [P(A | B)] plays a big role in this part of the unit. Shifting over to random variables, we should be able to calculate the mean (expected value) or standard deviation of a random variable, and combine them using similar rules to Unit 1. We conclude Unit 4 with a look at Binomial and Geometric random variables, which are two special types of variables that arise frequently in applications.
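The random-variable formulas from this unit can be checked numerically. The sketch below is one possible way to do it; the helper names (`conditional_probability`, `binomial_pmf`, `geometric_pmf`, `discrete_mean_sd`) and the example numbers are assumptions made for illustration.

```python
import math

def conditional_probability(p_a_and_b, p_b):
    """P(A | B) = P(A and B) / P(B)."""
    return p_a_and_b / p_b

def binomial_pmf(k, n, p):
    """P(X = k): exactly k successes in n independent trials, each with success probability p."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

def geometric_pmf(k, p):
    """P(X = k): the first success happens on trial k."""
    return (1 - p) ** (k - 1) * p

def discrete_mean_sd(values, probs):
    """Mean (expected value) and standard deviation of a discrete random variable."""
    mu = sum(v * pr for v, pr in zip(values, probs))
    variance = sum((v - mu) ** 2 * pr for v, pr in zip(values, probs))
    return mu, math.sqrt(variance)

# Hypothetical setting: X = number of heads in 4 fair coin flips
n, p = 4, 0.5
values = list(range(n + 1))
probs = [binomial_pmf(k, n, p) for k in values]
mu, sigma = discrete_mean_sd(values, probs)

print(f"P(X = 2) = {binomial_pmf(2, n, p)}, mean = {mu}, sd = {sigma}")
print(f"P(first head on flip 3) = {geometric_pmf(3, p)}")
```

For a binomial random variable the general formulas give the same answers as the shortcuts: mean np and standard deviation sqrt(np(1 - p)). Events A and B are independent exactly when P(A | B) equals P(A).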
Read these study guides:
Watch these videos from the Fiveable archives:
Randomness & Simulation: Explore some definitions (and myths) about probability and randomness
Basic Probability Rules: A breakdown of commonly-tested probability rules, using Two-Way Tables for most scenarios
Random Variables & Binomial/Geometric Distributions: A summary of Random Variable facts & formulas
Check out these articles:
Statistics in Court: Incorrect Probabilities: An exploration of the misuse of probability rules in court cases [source: Significance Magazine]
Practice:
Practice FRQ #1: Some basic probability calculations using a discrete random variable
Practice FRQ #2: Test your knowledge of binomial scenarios and simulations
Practice FRQ #3: A scenario involving a two-way table
Check out some online applets:
Dice & The Law of Large Numbers: Play with this applet to get a sense of how probability works over the "long run"
Coin Flips: A similar applet using coin flips
Unit 5 provides the bridge from descriptive statistics (Units 1-4) to inferential statistics (Units 6-9). After reviewing the Normal Distribution and introducing the idea of using sample statistics (like p-hat or x-bar) to estimate population parameters, we explore the creation of sampling distributions.
We meet the conditions for inference: random samples, large samples (for categorical variables, we need at least 10 expected successes and at least 10 expected failures; for quantitative variables, we need n to be at least 30), and independent observations (which turns into the "10% rule" for sampling without replacement: if the sample size n is less than 10% of the population size N, we can do calculations as if we sampled with replacement).
If these conditions are met, the sampling distribution we build will be approximately Normal and all of our formulas for calculating the mean and standard deviation of sampling distributions on the formula sheet will hold. We then build sampling distributions for sample proportions/sample means and the difference of sample proportions/sample means.
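As a rough sketch of how these conditions and formulas fit together for a sample proportion, the code below checks the large-counts and 10% conditions and then builds the approximately Normal sampling distribution of p-hat. The function name and the example numbers are hypothetical, and the random-sample condition cannot be checked in code; it is assumed to hold.

```python
import math
from statistics import NormalDist

def p_hat_sampling_distribution(p, n, population_size):
    """Approximate sampling distribution of p-hat, assuming a random sample was taken."""
    large_counts = n * p >= 10 and n * (1 - p) >= 10  # at least 10 expected successes and failures
    ten_percent = n < 0.10 * population_size          # 10% rule for sampling without replacement
    if not (large_counts and ten_percent):
        raise ValueError("Conditions not met; the Normal approximation is not justified.")
    sd = math.sqrt(p * (1 - p) / n)  # standard deviation of p-hat
    return NormalDist(mu=p, sigma=sd)  # the mean of p-hat is p

# Hypothetical scenario: true proportion 0.50, sample of 100 from a population of 10,000
dist = p_hat_sampling_distribution(p=0.50, n=100, population_size=10_000)
prob_above_60 = 1 - dist.cdf(0.60)  # P(p-hat > 0.60)

print(f"mean = {dist.mean}, sd = {dist.stdev:.3f}, P(p-hat > 0.60) = {prob_above_60:.4f}")
```

Raising an error when a condition fails mirrors the expectation on free-response questions: the conditions are stated and checked before any Normal calculation is made.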
Read these study guides:
5.0 Unit 5 Overview
5.1 Introducing Statistics: Why is My Sample Not Like Yours?
5.2 The Normal Distribution, Revisited
5.6 Sampling Distributions for Differences in Sample Proportions
Watch these videos from the Fiveable archives:
Sampling Distributions for Proportions: an intro to vocabulary surrounding sampling distributions, and a simulation using a virtual "candy machine"
Sampling Distributions for Means: an intro to the building of a sampling distribution for x-bar and a summary of the Central Limit Theorem
Unit 5 Practice FRQ: describe a sampling distribution and compute an associated probability
Check out some online applets:
The "Candy Machine": Build a sampling distribution for p-hat.
Sampling Distribution for x-bar: See the Central Limit Theorem in action! Definitely try to make a "custom" graph to give the population a unique shape.
Unit 6 is where we meet Confidence Intervals and Hypothesis Tests for the first time, specifically z-intervals and z-tests for population proportions. After learning "the basics" about confidence intervals (what's a confidence level? What's a margin of error?), we construct and interpret 1- and 2-sample z-intervals.
These intervals, built from samples, can be used to justify claims about a population. Then, after exploring the rationale behind hypothesis tests (including how to write null/alternative hypotheses and interpret a p-value in context), we run 1- and 2-sample z-tests. Finally, we meet "Errors": both Type I (rejecting a true H0) and Type II (failing to reject a false H0), and define the "Power" of a test as the probability of correctly rejecting a false H0. This unit is often heavily tested and is well worth your time to review!
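Here is a minimal sketch of the 1-sample z-interval and two-sided 1-sample z-test for a proportion. The function names and the data (60 successes in 100 trials, null value 0.50) are made up for illustration, and the sketch assumes the conditions for inference have already been checked.

```python
import math
from statistics import NormalDist

def one_prop_z_interval(successes, n, confidence=0.95):
    """Confidence interval for p: p-hat plus or minus z* times the standard error."""
    p_hat = successes / n
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # critical value
    margin_of_error = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin_of_error, p_hat + margin_of_error

def one_prop_z_test(successes, n, p0):
    """Two-sided test of H0: p = p0. Returns (z, p_value)."""
    p_hat = successes / n
    sd_null = math.sqrt(p0 * (1 - p0) / n)  # uses p0 because we assume H0 is true
    z = (p_hat - p0) / sd_null
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical data: 60 successes in a random sample of 100
lower, upper = one_prop_z_interval(60, 100, confidence=0.95)
z, p_value = one_prop_z_test(60, 100, p0=0.50)

print(f"95% CI: ({lower:.3f}, {upper:.3f}); z = {z:.2f}, p-value = {p_value:.4f}")
```

Note the different standard deviations: the interval uses p-hat because no value of p is assumed, while the test uses the null value p0.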
Read these study guides:
6.0 Unit 6 Overview
6.2 Constructing a Confidence Interval for a Population Proportion
6.3 Justifying a Claim Based on a Confidence Interval for a Population Proportion
6.8 Confidence Intervals for the Difference of Two Proportions
6.9 Justifying a Claim Based on a Confidence Interval for a Difference of Population Proportions
6.10 Setting Up a Test for the Difference of Two Population Proportions
6.11 Carrying Out a Test for the Difference of Two Population Proportions
Watch these videos from the Fiveable archives:
Confidence intervals for p: An intro to Confidence Intervals and a breakdown of how to construct and interpret 1-sample z-intervals.
Hypothesis Tests for p: An intro to Hypothesis Tests and practice running 1- and 2-sample z-tests.
Errors & Power of a Test: A breakdown of the types of errors in hypothesis testing, and how to increase the power of a test.
Check out these articles:
Understanding Type I and Type II Errors: A breakdown of the Types of Errors with "boy who cried wolf" examples [Source: William Schmarzo]
Practice:
Unit 6 Practice FRQ #1: Test your knowledge about Confidence Intervals!
Check out some online applets:
Confidence Intervals for p: play with the population parameters and see what we mean by "confidence level"
Reasoning of a Hypothesis Test: demonstrates the idea of Hypothesis Testing using basketball free-throws.
Unit 7 is an extension of Unit 6: we basically do everything again, but with t-procedures instead of z-procedures! We build Confidence Intervals and run Hypothesis Tests for a population mean or a difference of population means.
For the difference of population means, we must be able to determine whether we are running a 2-sample procedure or a matched-pairs procedure (which we carry out as a 1-sample procedure on the paired differences).
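The matched-pairs idea can be sketched as follows: compute the difference within each pair, then run a 1-sample t procedure on those differences. The function name and the before/after values are hypothetical. Python's standard library has no t-distribution, so this sketch stops at the test statistic and degrees of freedom; the p-value would come from a t-table or calculator.

```python
import math
import statistics

def paired_t_statistic(before, after, mu0=0.0):
    """t statistic and degrees of freedom for H0: mean difference = mu0 (matched pairs)."""
    diffs = [a - b for a, b in zip(after, before)]  # one difference per pair
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)               # sample SD of the differences
    standard_error = sd_diff / math.sqrt(n)
    t = (mean_diff - mu0) / standard_error
    return t, n - 1

# Hypothetical paired data: the same 5 subjects measured before and after a treatment
before = [10, 12, 9, 11, 13]
after = [12, 14, 10, 12, 15]

t, df = paired_t_statistic(before, after)
print(f"t = {t:.3f} with {df} degrees of freedom")
```

Treating these data as two independent samples would ignore the pairing and use a different standard error, which is why identifying the design comes first.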
Read these study guides:
7.0 Unit 7 Overview
7.2 Constructing a Confidence Interval for a Population Mean
7.3 Justifying a Claim About a Population Mean Based on a Confidence Interval
7.7 Justifying a Claim About the Difference of Two Means Based on a Confidence Interval
7.8 Setting up a Test for the Difference of Two Population Means
7.9 Carrying Out a Test for the Difference of Two Population Means
7.10 Skills Focus: Selecting, Implementing, and Communicating Inference Procedures
Watch these videos from the Fiveable archives:
Hypothesis Tests for Mu: Lots of good FRQ practice
Errors & Power of a Test: A breakdown of the types of errors in hypothesis testing, and how to increase the power of a test. (same video as in Unit 6)
Review of z and t procedures: A (mostly) comprehensive review of Units 6 and 7. Great for last-minute preparations!
Practice:
Unit 7 Practice FRQ #1: Should we shut down the production line?
Check out some online applets:
Confidence Intervals for Mu: play with the population parameters and see what we mean by "confidence level"
Statistical Power: Explore how the "Power" of a test is impacted by various inputs
Unit 8 is where we learn about chi-square tests, which are used for inference about categorical data. We'll learn how to select from the following tests: the chi-square test for goodness of fit (for a distribution of proportions of one categorical variable in a population), the chi-square test for independence (for associations between categorical variables within a single population), or the chi-square test for homogeneity (for comparing distributions of a categorical variable across populations or treatments).
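For the two-way-table tests (independence and homogeneity), the test statistic is computed the same way. The sketch below uses a hypothetical function name and a made-up 2x2 table of counts. It stops at the statistic and degrees of freedom, since the standard library has no chi-square distribution for the p-value.

```python
def chi_square_statistic(observed):
    """Chi-square statistic and degrees of freedom for a two-way table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # expected count if there were no association
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (obs - expected) ** 2 / expected

    df = (len(observed) - 1) * (len(observed[0]) - 1)  # (rows - 1)(columns - 1)
    return statistic, df

# Hypothetical observed counts (illustrative only)
table = [[10, 20],
         [20, 10]]

chi_sq, df = chi_square_statistic(table)
print(f"chi-square = {chi_sq:.3f} with {df} degree(s) of freedom")
```

A goodness-of-fit test uses the same sum of (observed - expected)²/expected, but with expected counts taken from the hypothesized proportions and degrees of freedom equal to the number of categories minus 1.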
Read these study guides:
8.0 Unit 8 Overview
8.5 Setting up a Chi-Square Test for Homogeneity or Independence
8.6 Carrying Out a Chi-Square Test for Homogeneity or Independence
8.7 Skills Focus: Selecting an Appropriate Inference Procedure for Categorical Data
Watch these videos:
Check out these articles:
Practice:
Check out some online applets:
Unit 9 will teach students how to construct confidence intervals for and perform significance tests about the slope of a population regression line when appropriate conditions are met. Surprisingly, there is variability in slope, which differs from students' experience in previous courses: sample slopes vary from sample to sample, following an approximately normal sampling distribution centered at the (true) slope of the population regression line (for example, a line relating spring length to hanging mass).
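To show where the variability in slope enters the calculation, this sketch computes the sample slope, its standard error, and the t statistic for testing H0: the population slope is 0. The function name and data are hypothetical, and the conditions for regression inference are assumed to hold. As with the other t procedures, the p-value would come from a t-table or calculator.

```python
import math

def slope_t_statistic(xs, ys):
    """Return (slope, se_slope, t, df) for testing H0: population slope = 0."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    # s: standard deviation of the residuals, with n - 2 degrees of freedom
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(sse / (n - 2))

    se_slope = s / math.sqrt(sxx)  # standard error of the sample slope
    t = slope / se_slope
    return slope, se_slope, t, n - 2

# Hypothetical data (illustrative only)
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

slope, se_slope, t, df = slope_t_statistic(xs, ys)
print(f"b = {slope:.2f}, SE_b = {se_slope:.4f}, t = {t:.3f}, df = {df}")
```

A confidence interval for the population slope has the familiar form: sample slope plus or minus t* times the standard error of the slope, with n - 2 degrees of freedom.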
Read these study guides:
9.0 Unit 9 Overview
9.2 Confidence Intervals for the Slope of a Regression Model
9.3 Justifying a Claim About the Slope of a Regression Model Based on a Confidence Interval
9.6 Skills Focus: Selecting an Appropriate Inference Procedure
Watch these videos:
Practice:
Check out some online applets:
Bias in Surveys
: Bias in surveys refers to systematic errors that occur when there's an inconsistency between the survey results and the true values in the target population due to flaws in data collection methods or respondent behavior.
Chi-square test for goodness of fit
: The chi-square test for goodness of fit is a statistical test that determines whether an observed frequency distribution differs significantly from an expected frequency distribution. It is commonly used when we want to assess how well observed data fits with theoretical expectations.
Chi-square test for homogeneity
: The chi-square test for homogeneity is a statistical test used to compare the distributions of multiple groups or populations based on categorical data. It determines whether the proportions or frequencies across different groups are significantly different from each other.
Chi-square test for independence
: The chi-square test for independence is a statistical test used to determine if there is a relationship between two categorical variables. It assesses whether the observed frequencies in each category are significantly different from what would be expected if the variables were independent.
Cluster Sampling
: Cluster sampling is a sampling technique where the population is divided into groups or clusters, and a random sample of clusters is selected. Then, all individuals within the selected clusters are included in the sample.
Completely Randomized Design
: Completely randomized design is an experimental design where subjects or items are randomly assigned to different treatment groups. It helps eliminate bias and ensures that any differences observed between groups are due to the treatments applied.
Confidence Interval
: A confidence interval is a range of values that is likely to contain the true value of a population parameter. It provides an estimate along with a level of confidence about how accurate the estimate is.
Confidence intervals for the slope of a regression model
: Confidence intervals for the slope of a regression model provide a range of plausible values for the true slope parameter. They indicate the uncertainty associated with estimating the relationship between two variables in a linear regression analysis.
Correlation Coefficient
: The correlation coefficient is a statistical measure that quantifies the strength and direction of the linear relationship between two variables. It ranges from -1 to 1, where -1 indicates a perfect negative relationship, 0 indicates no relationship, and 1 indicates a perfect positive relationship.
Cram Mode
: Cram mode refers to the last-minute studying strategy where students try to quickly learn a large amount of information right before an exam.
Experiments
: Experiments are research methods where the researcher manipulates one or more variables to observe their effect on another variable, while controlling other factors.
Free Response
: A type of question in which students are required to provide a written response, rather than selecting from multiple choice options.
Hypothesis Testing
: Hypothesis testing is a statistical method used to make inferences about population parameters based on sample data. It involves formulating a null hypothesis and an alternative hypothesis, collecting data, calculating test statistics, and making decisions about rejecting or failing to reject the null hypothesis.
Least-Squares Regression Lines
: Least-squares regression lines are used to model the relationship between two variables by minimizing the sum of the squared differences between observed data points and predicted values. They provide an equation that represents the best-fit line through the data.
Live Cram Sessions
: Live cram sessions are intensive review sessions conducted in real-time, usually before an exam or test. They involve focused studying, quick recall exercises, and active engagement with the material.
Margin of Error
: The margin of error is a measure of the uncertainty or variability in survey results. It represents the range within which the true population parameter is likely to fall.
Matched-Pairs Design
: A matched-pairs design is an experimental design where pairs of subjects who are similar in some important aspect are selected, and one subject from each pair receives one treatment while the other subject receives a different treatment. This helps to reduce variability caused by individual differences.
Normal Distribution
: A normal distribution is a symmetric, bell-shaped probability distribution characterized by its mean and standard deviation. It is also known as the Gaussian distribution.
null/alternative hypotheses
: The null hypothesis assumes there is no significant relationship or difference between variables, while the alternative hypothesis suggests otherwise based on available evidence or theories.
Observational Studies
: Observational studies are research methods where the researcher observes and collects data on individuals without manipulating any variables.
P-value
: The p-value is the probability, computed assuming the null hypothesis is true, of obtaining a result at least as extreme as the one observed. It quantifies how strong the evidence against the null hypothesis is: the smaller the p-value, the stronger the evidence.
Power of a test
: The power of a statistical test is the probability that it correctly rejects the null hypothesis when the alternative hypothesis is true. In other words, it measures the ability of a test to detect an effect or difference if one truly exists.
Probability and Sampling Distributions
: Probability and sampling distributions involve the study of the likelihood of events occurring and how data is distributed. It explores the chances of different outcomes happening and how those outcomes are spread out.
Random Sampling
: Random sampling is a method of selecting individuals from a population in such a way that every individual has an equal chance of being chosen. It helps to ensure that the sample represents the population accurately.
Randomized Block Design
: Randomized block design is a method used in experimental design where subjects or items are divided into homogeneous groups called blocks. Within each block, treatments are randomly assigned to minimize the effect of confounding variables.
Residuals
: Residuals are the differences between observed values and predicted values in a regression analysis. They represent the vertical distances between data points and the least-squares regression line.
Scatterplots
: Scatterplots are graphs that display the relationship between two quantitative variables. Each point on the graph represents a pair of values, one for each variable.
Significance Level
: The significance level, also known as alpha (α), determines how much evidence we need to reject the null hypothesis. It represents the probability of making a Type I error.
Stratified Sampling
: Stratified sampling is a sampling method where the population is divided into distinct subgroups (strata) based on certain characteristics. Then, random samples are taken from each stratum proportionally to its size.
Study Materials
: Study materials refer to resources such as textbooks, notes, practice problems, and online resources that students use to review and prepare for exams.
Study Plan
: A study plan is a structured schedule that outlines specific tasks and goals for studying. It helps students stay organized, manage their time effectively, and ensure they cover all the necessary material.
Study Space
: A study space refers to a dedicated area where students can focus on their academic work without distractions. It is typically organized, comfortable, well-lit, and equipped with necessary materials.
Systematic Samples
: Systematic samples are obtained by selecting every kth individual from the population. The first individual is randomly chosen between 1 and k.
t-procedures for population mean or difference of population means
: T-procedures are statistical methods used to make inferences about population means or differences between population means based on sample data. They rely on t-distributions instead of normal distributions because the population standard deviation is unknown and must be estimated from the sample.
Type I Error
: Type I error refers to rejecting a true null hypothesis. It occurs when we conclude there is a significant difference or relationship between variables when there actually isn't one.
Type II error
: Type II error occurs when we fail to reject a null hypothesis that is actually false. In other words, it's the mistake of not rejecting the null hypothesis when we should have rejected it.
z-intervals and z-tests for population proportions
: Z-intervals and z-tests for population proportions are statistical methods used to make inferences about population proportions based on sample data using normal distribution theory.
A Q
A Q
We know that studying for your AP exams can be stressful, but Fiveable has your back! We created a study plan to help you crush your AP Statistics exam. This guide will continue to update with information about the 2024 exams, as well as helpful resources to help you do your best on test day.ย Unlock Cram Modeย for access to our cram eventsโstudents who have successfully passed their AP exams will answer your questions and guide your last-minute studying LIVE! And don't miss out on unlimited access to our database of thousands of practice questions.
Going into test day, this is the format to expect:
Section 1: Multiple Choice - 50% of your score
40 questions in 1 hr 30 mins
Section 2: Free Response - 50% of your score
6 questions in 1 hr 30 mins
Part A: 65 mins
1 multipart question with a focus on collecting data
1 multipart question with a focus on exploring data
1 multipart question with a focus on probability and sampling distributions
1 multipart question with a focus on inference
1 multipart question that combines 2 or more skill categories
Part B: 25 mins
1 investigative task that assesses multiple skill categories and content areas
๐ Check out the 2023 AP Statistics Free-Response Section posted on the College Board site.
First, download theย AP Statistics Cheatsheet PDFย - a single sheet that covers everything you need to know at a high level. Take note of your strengths and weaknesses!
We've put together the study plan found below to help you study between now and May. This will cover all of the units and essay types to prepare you for your exam. Pay special attention to the units that you need the most improvement in.
Study, practice, and review for test day with other students during our live cram sessions viaย Cram Mode. Cram live streams will teach, review, and practice important topics from AP courses, college admission tests, and college admission topics. These streams are hosted by experienced students who know what you need to succeed.
Before you begin studying, take some time to get organized.
๐ฅ Create a study space.
Make sure you have a designated place at home to study. Somewhere you can keep all of your materials, where you can focus on learning, and where you are comfortable. Spend some time prepping the space with everything you need and you can even let others in the family know that this is your study space.ย
๐ Organize your study materials.
Get your notebook, textbook, prep books, or whatever other physical materials you have. Also, create a space for you to keep track of review. Start a new section in your notebook to take notes or start a Google Doc to keep track of your notes. Get yourself set up!
๐ Plan designated times for studying.
The hardest part about studying from home is sticking to a routine. Decide on one hour every day that you can dedicate to studying. This can be any time of the day, whatever works best for you. Set a timer on your phone for that time and really try to stick to it. The routine will help you stay on track.
๐ Decide on an accountability plan.
How will you hold yourself accountable to this study plan? You may or may not have a teacher or rules set up to help you stay on track, so you need to set some for yourself. First, set your goal. This could be studying for x number of hours or getting through a unit. Then, create a reward for yourself. If you reach your goal, then x. This will help stay focused!
Unit 1 is about creating and analyzing graphs of data. This includes both categorical and quantitative data. For categorical data, we should be able to read and create tables and bar graphs and calculate proportions/percentages. For quantitative data, we should be able to read and create dot plots, stemplots, histograms, and boxplots. We should also be able to describe the shape, center, variability (spread), and any unusual features of a distribution of quantitative data. This includes making calculations such as mean, median, range, interquartile range (IQR), and standard deviation. Our descriptions and calculations can be used to compare data from multiple groups. Finally, Unit 1 ends with describing the position of individuals within a quantitative data set, including using percentiles and z-scores. This leads us to an initial exploration of the Normal Distribution, though we will study that more in-depth in Units 4-5.
๐ Read these study guides:
1.0 Unit 1 Overview
1.10 The Normal Distribution
Watch these videos from the Fiveable archives:
Analyzing Categorical Variables: An intro to some key terms and graphs (use first 15 minutes)
Describing Data in a Distribution: A breakdown of percentiles and cumulative graphs
Normal Distributions: A good intro to all things Normal!
Check out these articles:
Relative Dominance: A real-life example of how z-scores can help compare individuals from different distributions, using golfers (source: Grantland)
Practice:
Practice an AP-Style Problem: check out this post and practice your free-response skills!
Check out some online applets:
Mean vs. Median interactive applet: Play with this applet to get a sense of how changing different data values impacts the mean and median
Normal Distribution applet: A visual of the Standard Normal Curve. Update the mean and standard deviation to look at any data set.
Unit 2 is about creating and analyzing graphs of data when two variables are measured about each individual in a data set. For categorical data, we should be able to read and create two-way tables or segmented bar graphs and calculate conditional percentages. These can be used to comment on the association (or lack thereof) between the two variables. For quantitative data, we should be able to read, create, and describe scatterplots, which can also be used to comment on the apparent association between two variables.
The second half of Unit 2 then focuses on linear regression, a process by which we can make predictions about one quantitative variable (a response variable) using another (an explanatory variable). We should be able to use Least-Squares Regression Lines (LSRLs) to make these predictions and to interpret several components of an LSRL (including the slope, the intercept, and other calculated values such as s or r²).
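As a rough illustration of how the pieces of a least-squares regression line fit together, the sketch below computes r, the slope, the intercept, the residuals, s, and r² by hand. The hours and score values are hypothetical, and the variable names are just labels chosen for this example.

```python
import math
import statistics

# Hypothetical data: hours studied (explanatory) and quiz score (response).
hours = [1, 2, 3, 4, 5]
score = [52, 60, 61, 70, 77]

n = len(hours)
x_bar = statistics.mean(hours)
y_bar = statistics.mean(score)
s_x = statistics.stdev(hours)
s_y = statistics.stdev(score)

# Correlation r: sum of the products of z-scores, divided by n - 1.
r = sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
        for x, y in zip(hours, score)) / (n - 1)

# LSRL: predicted y = intercept + slope * x
slope = r * s_y / s_x
intercept = y_bar - slope * x_bar


def predict(x):
    """Predicted response for a given value of the explanatory variable."""
    return intercept + slope * x


# Residual = observed y - predicted y
residuals = [y - predict(x) for x, y in zip(hours, score)]

# s: typical size of a residual; r squared: share of variation in y explained by the line.
s = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))
r_squared = r ** 2

print(f"predicted score = {intercept:.1f} + {slope:.1f} * hours")
print(f"r = {r:.3f}, r^2 = {r_squared:.3f}, s = {s:.2f}")
```

In this made-up data set the slope says the predicted score rises by about 6 points for each additional hour studied, which is the style of interpretation the exam expects.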
Read these study guides:
2.0 Unit 2 Overview
2.4 Representing the Relationship Between Two Quantitative Variables
2.5 Correlation
2.7 Residuals
Watch these videos from the Fiveable archives:
Analyzing Categorical Variables: Start at 14:38 for an example of two-way tables and stay for segmented bar graphs
Describing Scatterplots & Association: How to describe the direction, strength, and form of an association, as well as an introduction to the correlation coefficient r
Using Least-Squares Regression Lines: How to make predictions from regression lines and calculate residuals
Advanced Linear Regression: Interpreting "s" and "r²", and reading computer outputs of regression data
Practice:
Practice an AP-Style Problem: check out this post and practice your free-response skills!
Check out some online applets:
Least-Squares Regression: Try to guess the least-squares regression line from a scatterplot of data
Just for fun!
Spurious Correlations: Data sets with very high "r" values that... well... you'll see. [Source: Tyler Vigen]
While Units 1-2 were about graphing and analyzing sets of data, Unit 3 is about examining the methods through which we can collect that data. For sample surveys, we should be able to describe various methods of selecting samples, particularly the random methods (simple random, stratified random, cluster, and systematic samples). However, not all samples are collected through a random process, and we should be prepared to discuss possible sources of bias in surveys (including via non-random selection processes).
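The four random sampling methods named above can be sketched in a few lines of Python. The population of 100 students, the grade levels, and the groups of 10 standing in for homerooms are all hypothetical; the point is only to show how the selection step differs from method to method.

```python
import random

rng = random.Random(2024)  # fixed seed so the sketch is reproducible

# Hypothetical population: 100 students, 25 in each of grades 9 through 12.
population = [{"id": i, "grade": 9 + i % 4} for i in range(100)]

# Simple random sample (SRS): every group of 12 students is equally likely.
srs = rng.sample(population, 12)

# Stratified random sample: an SRS of 3 students within each grade (the strata).
stratified = []
for grade in sorted({student["grade"] for student in population}):
    stratum = [student for student in population if student["grade"] == grade]
    stratified.extend(rng.sample(stratum, 3))

# Cluster sample: split into 10 groups of 10 (the clusters), choose 2 clusters
# at random, and include everyone in the chosen clusters.
clusters = [population[i:i + 10] for i in range(0, 100, 10)]
cluster_sample = [student for cluster in rng.sample(clusters, 2) for student in cluster]

# Systematic sample: random starting point among the first k, then every k-th student.
k = 10
start = rng.randrange(k)
systematic = population[start::k]

print(len(srs), len(stratified), len(cluster_sample), len(systematic))
```

Note that the stratified sample is guaranteed to include every grade, while the cluster sample includes only whoever happens to be in the two chosen groups.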
We then turn to the differences between observational studies and experiments, and the features of a well-designed experiment. We should be able to define many common terms associated with experiments (many of which you've likely seen in other courses!), and compare and contrast several common experimental designs: completely randomized design, randomized block design, and matched-pairs design.
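One way to see the difference between a completely randomized design and a randomized block design is to look at where the random assignment happens. The sketch below uses eight hypothetical volunteers and an invented blocking variable (athlete or non-athlete); the helper name `randomly_split` is made up for this example.

```python
import random

rng = random.Random(7)  # fixed seed so the sketch is reproducible

# Hypothetical subjects, each labelled with a blocking variable.
subjects = [
    ("S1", "athlete"), ("S2", "athlete"), ("S3", "athlete"), ("S4", "athlete"),
    ("S5", "non-athlete"), ("S6", "non-athlete"),
    ("S7", "non-athlete"), ("S8", "non-athlete"),
]


def randomly_split(group):
    """Shuffle a copy of the group and split it in half: (treatment, control)."""
    shuffled = list(group)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Completely randomized design: assign all subjects at once, ignoring the blocks.
crd_treatment, crd_control = randomly_split(subjects)

# Randomized block design: randomize separately inside each block, so both
# treatment groups end up with the same mix of athletes and non-athletes.
block_treatment, block_control = [], []
for block in ("athlete", "non-athlete"):
    members = [subject for subject in subjects if subject[1] == block]
    treated, controls = randomly_split(members)
    block_treatment += treated
    block_control += controls

print("Completely randomized:", crd_treatment)
print("Randomized block:     ", block_treatment)
```

A matched-pairs design follows the same idea with blocks of size two: within each pair, chance decides which member receives which treatment.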
Read these study guides:
3.0 Unit 3 Overview
3.1 Introducing Statistics: Do the Data We Collected Tell the Truth?
3.3 Random Sampling and Data Collection
Watch these videos from the Fiveable archives:
Sampling Methods and Sources of Bias: A breakdown of the different ways we can take samples, and how to talk about bias on the AP exam.
Experiments and Observational Studies: All things experiments! Includes a discussion of the possible pitfalls of observational studies (confounding)
Practice:
AP-Style Problem #1: a practice question on surveys and sampling methods.
AP-Style Problem #2: a practice question on observational studies/experiments
Unit 4 is where AP Statistics gets "math-y," with lots of calculations and formulas. We are asked to calculate or interpret probabilities in a variety of settings, beginning with the understanding that probability reflects what we should expect to occur over the long run. We should be able to design and execute simulations for a given scenario - and then the calculations begin. We should be able to calculate the probability of multiple events using a variety of strategies (including Two-Way Tables, Tree Diagrams, and/or Venn Diagrams).
We should also be able to categorize different events as "mutually exclusive" or "independent," with justification. Conditional probability [P(A | B)] plays a big role in this part of the unit. Shifting over to random variables, we should be able to calculate the mean (expected value) or standard deviation of a random variable, and combine them using rules similar to those from Unit 1. We conclude Unit 4 with a look at Binomial and Geometric random variables, which are two special types of variables that arise frequently in applications.
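The Unit 4 calculations can be checked with a short script. The sketch below works through a conditional probability from a two-way table, the mean and standard deviation of a discrete random variable, and binomial and geometric probabilities. All counts and probabilities are invented, and the function names are labels chosen for this example.

```python
import math

# Hypothetical two-way table of counts (class year by whether the student plays a sport).
table = {
    ("junior", "sport"): 30, ("junior", "no sport"): 20,
    ("senior", "sport"): 18, ("senior", "no sport"): 32,
}
total = sum(table.values())

p_sport = (table[("junior", "sport")] + table[("senior", "sport")]) / total
p_junior = (table[("junior", "sport")] + table[("junior", "no sport")]) / total
p_sport_and_junior = table[("junior", "sport")] / total

# Conditional probability: P(sport | junior) = P(sport and junior) / P(junior)
p_sport_given_junior = p_sport_and_junior / p_junior

# Events are independent when conditioning does not change the probability.
independent = math.isclose(p_sport_given_junior, p_sport)

# Discrete random variable X as {value: probability} (hypothetical distribution).
dist = {0: 0.2, 1: 0.5, 2: 0.3}
mu_x = sum(x * p for x, p in dist.items())
sigma_x = math.sqrt(sum((x - mu_x) ** 2 * p for x, p in dist.items()))


def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent trials, each with success probability p)."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)


def geometric_pmf(k, p):
    """P(the first success occurs on trial k)."""
    return (1 - p) ** (k - 1) * p


print(f"P(sport | junior) = {p_sport_given_junior:.2f}, P(sport) = {p_sport:.2f}")
print(f"independent? {independent}")
print(f"E(X) = {mu_x:.2f}, SD(X) = {sigma_x:.2f}")
print(f"P(2 heads in 5 fair flips) = {binomial_pmf(2, 5, 0.5):.4f}")
print(f"P(first success on trial 3, p = 0.2) = {geometric_pmf(3, 0.2):.3f}")
```

Because P(sport | junior) differs from P(sport) in this made-up table, the two events are not independent, which is exactly the justification the exam asks for.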
Watch these videos from the Fiveable archives:
Randomness & Simulation: Explore some definitions (and myths) about probability and randomness
Basic Probability Rules: A breakdown of commonly-tested probability rules, using Two-Way Tables for most scenarios
Random Variables & Binomial/Geometric Distributions: A summary of Random Variable facts & formulas
Check out these articles:
Statistics in Court: Incorrect Probabilities: An exploration of the misuse of probability rules in court cases [source: Significance Magazine]
Practice:
Practice FRQ #1: Some basic probability calculations using a discrete random variable
Practice FRQ #2: Test your knowledge of binomial scenarios and simulations
Practice FRQ #3: A scenario involving a two-way table
Check out some online applets:
Dice & The Law of Large Numbers: Play with this applet to get a sense of how probability works over the "long run"
Coin Flips: A similar applet using coin flips
Unit 5 provides the bridge from descriptive statistics (Units 1-4) to inferential statistics (Units 6-9). After reviewing the Normal Distribution and introducing the idea of using sample statistics (like p-hat or x-bar) to estimate population parameters, we explore the creation of sampling distributions.
We meet the conditions for inference: random samples, large samples (for categorical variables, we need at least 10 expected successes and at least 10 expected failures; for quantitative variables, we need n to be at least 30 unless the population itself is approximately Normal), and independent observations (which turns into the "10% rule" for sampling without replacement: if the sample size n is less than 10% of the population size N, we can do calculations as if we sampled with replacement).
If these conditions are met, the sampling distribution we build will be approximately Normal and all of our formulas for calculating the mean and standard deviation of sampling distributions on the formula sheet will hold. We then build sampling distributions for sample proportions/sample means and the difference of sample proportions/sample means.
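As a hedged illustration of the condition checks and formulas described above, the sketch below builds the sampling distribution of a sample proportion and uses it to find a probability. The population proportion, sample size, and population size are invented, as are the values used for the sample mean at the end.

```python
import math
from statistics import NormalDist

# Hypothetical setting: 40% of a large population prefers brand A,
# and we take a random sample of n = 150 from N = 10,000 people.
p, n, N = 0.40, 150, 10_000

# Large counts condition: at least 10 expected successes and 10 expected failures.
large_counts_ok = n * p >= 10 and n * (1 - p) >= 10

# 10% rule: the sample is less than 10% of the population.
ten_percent_ok = n < 0.10 * N

# Mean and standard deviation of the sampling distribution of p-hat.
mean_p_hat = p
sd_p_hat = math.sqrt(p * (1 - p) / n)

# With the conditions met, p-hat is approximately Normal, so we can find
# P(p-hat > 0.45) from the Normal model.
prob_above = 1 - NormalDist(mean_p_hat, sd_p_hat).cdf(0.45)

# The same idea for a sample mean: SD of x-bar is sigma / sqrt(n).
sigma, n_mean = 12, 36  # hypothetical population SD and sample size
sd_x_bar = sigma / math.sqrt(n_mean)

print(f"conditions met: {large_counts_ok and ten_percent_ok}")
print(f"SD of p-hat = {sd_p_hat:.3f}, P(p-hat > 0.45) = {prob_above:.4f}")
print(f"SD of x-bar = {sd_x_bar:.1f}")
```

Here a sample proportion of 0.45 is 1.25 standard deviations above the true proportion, so it would occur or be exceeded in roughly 1 sample out of 10.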
Read these study guides:
5.0 Unit 5 Overview
5.1 Introducing Statistics: Why is My Sample Not Like Yours?
5.2 The Normal Distribution, Revisited
5.6 Sampling Distributions for Differences in Sample Proportions
Watch these videos from the Fiveable archives:
Sampling Distributions for Proportions: an intro to vocabulary surrounding sampling distributions, and a simulation using a virtual "candy machine"
Sampling Distributions for Means: an intro to the building of a sampling distribution for x-bar and a summary of the Central Limit Theorem
Unit 5 Practice FRQ: describe a sampling distribution and compute an associated probability
Check out some online applets:
The "Candy Machine": Build a sampling distribution for p-hat.
Sampling Distribution for x-bar: See the Central Limit Theorem in action! Definitely try to make a "custom" graph to give the population a unique shape.
Unit 6 is where we meet Confidence Intervals and Hypothesis Tests for the first time, specifically z-intervals and z-tests for population proportions. After learning "the basics" about confidence intervals (what's a confidence level? What's a margin of error?), we construct and interpret 1-sample and 2-sample z-intervals.
These intervals, built from samples, can be used to justify claims about a population. Then, after exploring the rationale behind hypothesis tests (including how to write null/alternative hypotheses and interpret a p-value in context), we run 1-sample and 2-sample z-tests. Finally, we meet "Errors": both Type I (rejecting a true H0) and Type II (failing to reject a false H0), and define the "Power" of a test as the probability of correctly rejecting a false H0. This unit is often heavily tested and is well worth your time to review!
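For a sense of how a 1-sample z-interval and a 1-sample z-test are computed, here is a minimal sketch. The survey counts, the null value of 0.50, and the 0.05 significance level are all assumptions made for the example, and the conditions for inference are taken as already checked.

```python
import math
from statistics import NormalDist

# Hypothetical sample: 220 of 400 randomly selected voters support a measure.
successes, n = 220, 400
p_hat = successes / n

# 95% confidence interval: p-hat plus or minus z* times the standard error.
confidence = 0.95
z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
margin_of_error = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
interval = (p_hat - margin_of_error, p_hat + margin_of_error)

# Test H0: p = 0.50 against Ha: p > 0.50.
# The standard deviation in the test statistic uses the null value p0.
p0 = 0.50
z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)
p_value = 1 - NormalDist().cdf(z)  # one-sided, upper tail

alpha = 0.05
reject_h0 = p_value < alpha

print(f"95% CI for p: ({interval[0]:.3f}, {interval[1]:.3f})")
print(f"z = {z:.2f}, p-value = {p_value:.4f}, reject H0: {reject_h0}")
```

The interval and the test tell a consistent story in this example: the whole interval lies above 0.50, and the small p-value gives convincing evidence that more than half of voters support the measure.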
Read these study guides:
6.0 Unit 6 Overview
6.2 Constructing a Confidence Interval for a Population Proportion
6.3 Justifying a Claim Based on a Confidence Interval for a Population Proportion
6.8 Confidence Intervals for the Difference of Two Proportions
6.9 Justifying a Claim Based on a Confidence Interval for a Difference of Population Proportions
6.10 Setting Up a Test for the Difference of Two Population Proportions
6.11 Carrying Out a Test for the Difference of Two Population Proportions
Watch these videos from the Fiveable archives:
Confidence intervals for p: An intro to Confidence Intervals and a breakdown of how to construct and interpret 1-sample z-intervals.
Hypothesis Tests for p: An intro to Hypothesis Tests and practice running 1 and 2-sample z-tests.
Errors & Power of a Test: A breakdown of the types of errors in hypothesis testing, and how to increase the power of a test.
Check out these articles:
Understanding Type I and Type II Errors: A breakdown of the Types of Errors with "boy who cried wolf" examples [Source: William Schmarzo]
Practice:
Unit 6 Practice FRQ #1: Test your knowledge about Confidence Intervals!
Check out some online applets:
Confidence Intervals for p: play with the population parameters and see what we mean by "confidence level"
Reasoning of a Hypothesis Test: demonstrates the idea of Hypothesis Testing using basketball free-throws.
Unit 7 is an extension of Unit 6: we basically do everything again, but with t-procedures instead of z-procedures! We build Confidence Intervals and run Hypothesis Tests for a population mean or a difference of population means.
For the difference of population means, we must be able to tell whether the data call for a 2-sample procedure or a matched-pairs procedure (in which case we run a 1-sample procedure on the paired differences).
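The matched-pairs idea can be sketched as follows: reduce each pair to a single difference, then treat the differences as one sample. The before and after measurements are hypothetical. Python's standard library has no t-distribution, so this sketch stops at the test statistic and degrees of freedom, leaving the p-value to a t table or calculator.

```python
import math
import statistics

# Hypothetical before/after measurements on the same 6 subjects (matched pairs).
before = [72, 68, 75, 80, 66, 71]
after = [75, 70, 74, 85, 70, 73]

# Step 1: reduce each pair to one number, the difference (after - before).
differences = [a - b for a, b in zip(after, before)]

# Step 2: run a 1-sample t procedure on the differences.
n = len(differences)
d_bar = statistics.mean(differences)
s_d = statistics.stdev(differences)

# Test H0: mean difference = 0.
t_stat = (d_bar - 0) / (s_d / math.sqrt(n))
df = n - 1

print(f"differences = {differences}")
print(f"mean difference = {d_bar:.2f}, SD of differences = {s_d:.2f}")
print(f"t = {t_stat:.2f} with df = {df}")
```

Treating the same data as two independent samples would ignore the pairing and usually produces a larger standard error, which is why identifying the design first matters.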
Read these study guides:
7.0 Unit 7 Overview
7.2 Constructing a Confidence Interval for a Population Mean
7.3 Justifying a Claim About a Population Mean Based on a Confidence Interval
7.7 Justifying a Claim About the Difference of Two Means Based on a Confidence Interval
7.8 Setting up a Test for the Difference of Two Population Means
7.9 Carrying Out a Test for the Difference of Two Population Means
7.10 Skills Focus: Selecting, Implementing, and Communicating Inference Procedures
Watch these videos from the Fiveable archives:
Hypothesis Tests for Mu: Lots of good FRQ practice
Errors & Power of a Test: A breakdown of the types of errors in hypothesis testing, and how to increase the power of a test. (same as from Unit 6)
Review of z and t procedures: A (mostly) comprehensive review of Units 6 and 7. Great for last-minute preparations!
Practice:
Unit 7 Practice FRQ #1: Should we shut down the production line?
Check out some online applets:
Confidence Intervals for Mu: play with the population parameters and see what we mean by "confidence level"
Statistical Power: Explore how the "Power" of a test is impacted by various inputs
Unit 8 is where we learn about chi-square tests, which can be used when there are two or more categorical variables. We'll learn how to select from the following tests: the chi-square test for goodness of fit (for a distribution of proportions of one categorical variable in a population), the chi-square test for independence (for associations between categorical variables within a single population), or the chi-square test for homogeneity (for comparing distributions of a categorical variable across populations or treatments).
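The tests for independence and homogeneity share the same arithmetic, which the sketch below works through for a small two-way table. The observed counts are invented. The p-value is left to a chi-square table or calculator, since the standard library does not include the chi-square distribution.

```python
# Hypothetical observed counts: rows are two groups, columns are three responses.
observed = [
    [30, 20, 10],
    [20, 30, 10],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count for each cell = (row total * column total) / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Condition check: all expected counts should be at least 5.
all_expected_at_least_5 = all(e >= 5 for row in expected for e in row)

# Chi-square statistic: sum of (observed - expected)^2 / expected over all cells.
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# Degrees of freedom for a two-way table: (rows - 1) * (columns - 1).
df = (len(observed) - 1) * (len(observed[0]) - 1)

print(f"expected counts: {expected}")
print(f"chi-square = {chi_square:.2f} with df = {df}")
```

For a goodness-of-fit test the expected counts come from the hypothesized proportions instead, and the degrees of freedom are the number of categories minus 1.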
Read these study guides:
8.0 Unit 8 Overview
8.5 Setting up a Chi-Square Test for Homogeneity or Independence
8.6 Carrying Out a Chi-Square Test for Homogeneity or Independence
8.7 Skills Focus: Selecting an Appropriate Inference Procedure for Categorical Data
Unit 9 teaches us how to construct confidence intervals for, and perform significance tests about, the slope of a population regression line when appropriate conditions are met. Perhaps surprisingly, the slope itself has variability: different random samples from the same population produce different sample slopes, unlike the fixed slopes students have seen in previous courses. Under the right conditions, those sample slopes follow an approximately Normal sampling distribution centered at the true slope of the population regression line (for example, a line relating spring length to hanging mass).
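To show where the numbers in a slope inference problem come from, the sketch below computes the sample slope, its standard error, a t statistic, and a 95% confidence interval. The mass and length measurements are hypothetical, and the critical value 2.776 is read from a standard t table for 4 degrees of freedom rather than computed, because the standard library has no t-distribution.

```python
import math
import statistics

# Hypothetical data: hanging mass (grams) and spring length (cm).
mass = [10, 20, 30, 40, 50, 60]
length = [5.1, 6.0, 7.2, 7.9, 9.1, 9.9]

n = len(mass)
x_bar = statistics.mean(mass)
y_bar = statistics.mean(length)
s_xx = sum((x - x_bar) ** 2 for x in mass)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(mass, length))

# Sample slope b and intercept a of the least-squares line.
b = s_xy / s_xx
a = y_bar - b * x_bar

# s: standard deviation of the residuals, using n - 2 degrees of freedom.
residuals = [y - (a + b * x) for x, y in zip(mass, length)]
s = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))

# Standard error of the slope: s / (s_x * sqrt(n - 1)), which equals s / sqrt(s_xx).
se_b = s / math.sqrt(s_xx)

# Test H0: true slope = 0.
df = n - 2
t_stat = (b - 0) / se_b

# 95% confidence interval for the true slope.
t_star = 2.776  # from a t table: df = 4, 95% confidence
ci = (b - t_star * se_b, b + t_star * se_b)

print(f"b = {b:.4f}, SE_b = {se_b:.4f}, t = {t_stat:.1f}, df = {df}")
print(f"95% CI for the slope: ({ci[0]:.4f}, {ci[1]:.4f})")
```

On the exam these values usually come straight from computer output, so the main skill is recognizing which number in the output is the slope and which is its standard error.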
Read these study guides:
9.0 Unit 9 Overview
9.2 Confidence Intervals for the Slope of a Regression Model
9.3 Justifying a Claim About the Slope of a Regression Model Based on a Confidence Interval
9.6 Skills Focus: Selecting an Appropriate Inference Procedure
Bias in Surveys: Bias in surveys refers to systematic errors that occur when there's an inconsistency between the survey results and the true values in the target population due to flaws in data collection methods or respondent behavior.
Chi-square test for goodness of fit: The chi-square test for goodness of fit is a statistical test that determines whether an observed frequency distribution differs significantly from an expected frequency distribution. It is commonly used when we want to assess how well observed data fits with theoretical expectations.
Chi-square test for homogeneity: The chi-square test for homogeneity is a statistical test used to compare the distributions of a categorical variable across multiple groups or populations. It determines whether the proportions or frequencies across different groups are significantly different from each other.
Chi-square test for independence: The chi-square test for independence is a statistical test used to determine if there is a relationship between two categorical variables. It assesses whether the observed frequencies in each category are significantly different from what would be expected if the variables were independent.
Cluster Sampling: Cluster sampling is a sampling technique where the population is divided into groups or clusters, and a random sample of clusters is selected. Then, all individuals within the selected clusters are included in the sample.
Completely Randomized Design: Completely randomized design is an experimental design where subjects or items are randomly assigned to different treatment groups. It helps reduce bias, so that differences observed between groups can reasonably be attributed to the treatments applied.
Confidence Interval: A confidence interval is a range of plausible values for a population parameter, computed from sample data. It comes with a confidence level that describes how often the method produces an interval that captures the true value.
Confidence intervals for the slope of a regression model: Confidence intervals for the slope of a regression model provide a range of plausible values for the true slope parameter. They indicate the uncertainty associated with estimating the relationship between two variables in a linear regression analysis.
Correlation Coefficient: The correlation coefficient is a statistical measure that quantifies the strength and direction of the linear relationship between two quantitative variables. It ranges from -1 to 1, where -1 indicates a perfect negative linear relationship, 0 indicates no linear relationship, and 1 indicates a perfect positive linear relationship.
Cram Mode: Cram mode refers to the last-minute studying strategy where students try to quickly learn a large amount of information right before an exam.
Experiments: Experiments are research methods where the researcher manipulates one or more variables to observe their effect on another variable, while controlling other factors.
Free Response: A type of question in which students are required to provide a written response, rather than selecting from multiple choice options.
Hypothesis Testing: Hypothesis testing is a statistical method used to make inferences about population parameters based on sample data. It involves formulating a null hypothesis and an alternative hypothesis, collecting data, calculating test statistics, and making decisions about rejecting or failing to reject the null hypothesis.
Least-Squares Regression Lines: Least-squares regression lines are used to model the relationship between two variables by minimizing the sum of the squared differences between observed data points and predicted values. They provide an equation that represents the best-fit line through the data.
Live Cram Sessions: Live cram sessions are intensive review sessions conducted in real-time, usually before an exam or test. They involve focused studying, quick recall exercises, and active engagement with the material.
Margin of Error: The margin of error describes how far an estimate is expected to be, at most, from the true population parameter at a given confidence level. A confidence interval extends one margin of error on either side of the estimate.
Matched-Pairs Design: A matched-pairs design is an experimental design where pairs of subjects who are similar in some important aspect are selected, and one subject from each pair receives one treatment while the other subject receives a different treatment. This helps to reduce variability caused by individual differences.
Normal Distribution: A normal distribution is a symmetric bell-shaped probability distribution characterized by its mean and standard deviation. It is also called the Gaussian distribution.
Null/alternative hypotheses: The null hypothesis states that there is no difference or relationship between variables, while the alternative hypothesis states the claim we are looking for evidence to support.
Observational Studies: Observational studies are research methods where the researcher observes and collects data on individuals without manipulating any variables.
P-value: The p-value is the probability, computed assuming the null hypothesis is true, of getting a result at least as extreme as the one observed. The smaller the p-value, the stronger the evidence against the null hypothesis.
Power of a test: The power of a statistical test is the probability that it correctly rejects the null hypothesis when the alternative hypothesis is true. In other words, it measures the ability of a test to detect an effect or difference if one truly exists.
Probability and Sampling Distributions: Probability and sampling distributions involve the study of the likelihood of events occurring and how the values of a statistic vary from sample to sample. It explores the chances of different outcomes happening and how those outcomes are spread out.
Random Sampling: Random sampling is a method of selecting individuals from a population in such a way that every individual has an equal chance of being chosen. It helps to ensure that the sample represents the population accurately.
Randomized Block Design: Randomized block design is a method used in experimental design where subjects or items are divided into homogeneous groups called blocks. Within each block, treatments are randomly assigned, which reduces the variability caused by the blocking variable.
Residuals: Residuals are the differences between observed values and predicted values in a regression analysis. They represent the vertical distances between data points and the least-squares regression line.
Scatterplots: Scatterplots are graphs that display the relationship between two quantitative variables. Each point on the graph represents a pair of values, one for each variable.
Significance Level: The significance level, also known as alpha (α), is the threshold the p-value must fall below for us to reject the null hypothesis. It equals the probability of making a Type I error when the null hypothesis is true.
Stratified Sampling: Stratified sampling is a sampling method where the population is divided into distinct subgroups (strata) based on certain characteristics. Then, a random sample is taken from each stratum, often in proportion to its size.
Study Materials: Study materials refer to resources such as textbooks, notes, practice problems, and online resources that students use to review and prepare for exams.
Study Plan: A study plan is a structured schedule that outlines specific tasks and goals for studying. It helps students stay organized, manage their time effectively, and ensure they cover all the necessary material.
Study Space: A study space refers to a dedicated area where students can focus on their academic work without distractions. It is typically organized, comfortable, well-lit, and equipped with necessary materials.
Systematic Samples: Systematic samples are obtained by selecting every kth individual from the population. The first individual is randomly chosen from among the first k.
t-procedures for population mean or difference of population means: T-procedures are statistical methods used to make inferences about population means or differences between population means based on sample data. They use t-distributions instead of the Normal distribution because the population standard deviation is unknown and must be estimated from the sample.
Type I Error: Type I error refers to rejecting a true null hypothesis. It occurs when we conclude there is a significant difference or relationship between variables when there actually isn't one.
Type II Error: Type II error occurs when we fail to reject a null hypothesis that is actually false. In other words, it's the mistake of not detecting a difference or relationship that really exists.
z-intervals and z-tests for population proportions: Z-intervals and z-tests for population proportions are statistical methods used to make inferences about population proportions based on sample data using normal distribution theory.
© 2024 Fiveable Inc. All rights reserved.
AP® and SAT® are trademarks registered by the College Board, which is not affiliated with, and does not endorse, this website.