# AP Stats Unit 3 Practice FRQ (Surveys & Sampling)

A large university offers undergraduate courses both in-person on its campus and digitally through an online platform. The administration of the university wishes to gauge their students’ opinions about the quality of teaching at the university; specifically, they want to estimate the proportion of all students who would agree that they are receiving good instruction. They are considering several methods for collecting this data.

a. One method they are considering is conducting a simple random sample. The university has 30,000 students, and the administrators wish to ask a sample of 500 students for their opinion of the quality of teaching. Describe a procedure for how the administrators could conduct a simple random sample in this situation.

b. A faculty member of the statistics department suggests to the administrators that they may want to consider a stratified random sample, using the primary type of instruction a student receives (in-person or digitally) as a variable for stratification. Describe in the context of the scenario why a stratified random sample may provide a more precise estimate of the proportion of all students who would agree that they are receiving good instruction than the simple random sampling method you detailed in part (a).

(a) Write the names of all 30,000 students on equal sized pieces of paper and put in a hat. Shake the hat. Then choose 500 pieces of paper out of the hat without replacement and collect the data they need.
(b) In a stratified random sample, you are guaranteed to get results from students who receive instruction online and in person. In a random sample, you have the ability to possibly get students from only in school or only online instruction. With a stratified random sample, you’d sort the students based on how they receive their instruction content. Then write down the names of each student in each strata on equal sized pieces of paper and place in a hat. Shake the hat. Then choose 250 students from the digital hat and 250 students from the in person hat and study their results. This will ensure that both populations of interest are being examined.

Hello!

~Jerry

Hello again!

In part (a), your use of the “names in a hat” method of conducting an SRS is clear, but it stumbles a bit at the end: “collect the data they need” is a little vague and doesn’t connect to this scenario. It should be clear that the 500 pieces of paper represent students (you do this in your first sentence) and that the selected students will be given the survey about quality of instruction. That second part isn’t clear in your response. On some rubrics, your response would still earn “E” (full credit); on others, it may be in jeopardy of being bumped to a “P” (partial credit).

In part (b), you do a good job of explaining one purpose of a stratified sample (you are guaranteed to get results from each type of student) and how that differs from an SRS (you might only get one group or the other), but then you describe how to implement the stratified sample. That wasn’t what the question asked: they want to know why the stratified sample is a good idea. The beginning of your answer starts down that road, but it should go a little further and explain that having results from the two different groups of students is good because their opinions about the quality of instruction may differ, and those opinions are what the survey is measuring. Therefore, your response would likely get partial credit (“P”).

~Jerry

a) The administrators would randomly number the 30,000 students an integer between 1 and 30,000. Then, the students with the numbers 1-500 would be the sample of 500 students who would be asked about the opinion of the quality of teaching.

b) Since students’ opinions of their quality of teaching might be similar within their primary type of instruction, the primary type of instruction a student receives (in-person or digitally) should be used as a variable for stratification. Stratification by the primary type of instruction should result in more precise estimates of student opinion than a simple random sample of the same size as there will be much less variability.

Hi Brandon!

In part (a), I’m a little torn on whether you would earn “E” or “P”. Typically, the criteria for describing how to implement a process like this are (1) clearly assign numbers to each individual, (2) generate a list of n unique numbers within the boundaries of the assigned numbers, (3) select individuals who correspond with the numbers. It is clear that you have fulfilled components (2) and (3)… what I’m stuck on is (1). Saying “randomly number the 30,000 students [with] an integer between 1 and 30,000” does not clearly indicate that each student is receiving a unique number label; your description being just “randomly” leaves open the possibility of multiple people being assigned the same number, for example. This can be alleviated by using a clear randomization method. For example, you could have said “From a list of the 30,000 students, randomize the order of names and then assign the first name on the list 1, second name 2, and so on until all students are numbered. [rest of your response here]”. Or - and here’s the annoying part - simply add the word “unique” before “integer between 1 and 30,000”, and you’re covered. So with all of that said (sorry for the long-winded response), you’d likely earn “P” for this part.
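If it helps to see those three components side by side, here is a minimal Python sketch. The roster and the `select_srs` helper are made up for illustration (they are not part of the FRQ), and on the exam you would describe the steps in words rather than code.

```python
import random

def select_srs(roster, n, seed=None):
    """Return n distinct students chosen by simple random sampling."""
    rng = random.Random(seed)
    # (1) Give every student a unique label from 1 to N.
    labels = {i + 1: student for i, student in enumerate(roster)}
    # (2) Generate n unique labels within 1..N (sample() never repeats a value).
    chosen = rng.sample(range(1, len(roster) + 1), n)
    # (3) The students matching those labels are the ones who get the survey.
    return [labels[k] for k in chosen]

# Hypothetical roster standing in for the university's list of 30,000 students.
roster = [f"student_{i:05d}" for i in range(30_000)]
sample = select_srs(roster, 500, seed=1)
print(len(sample), len(set(sample)))
```

Notice that each comment lines up with one rubric component: unique labels, unique numbers inside the label range, and a clear statement of what happens to the selected students.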

Much shorter feedback for part (b): you crushed it. You show a clear understanding of why stratification is useful in this context, and gave a reason why (“opinions… might be similar within their primary type of instruction”). “E” for this part!

~Jerry

(a) The administrators could conduct a simple random sample in this situation by randomly assigning the students a number from 1-30,000 (only 1 number per student). They can use a random number generator to select 500 students without replacement who can be asked for their opinion of the quality of teaching at the university.

(b). A stratified random sample may provide a more precise estimate of the proportion of all students who would agree that they are receiving good instruction than the SRS method in part (a). Because the variable for stratification is either if the student receives in-person or digital instruction, the spread of the data will be less (or less variability)

Hello again -

In part (a), your response is strong, but missing one small component: when you use the RNG to select 500 students without replacement (as you should), you must specify that you want the RNG to select 500 numbers within the range of 1-30,000. The way you’ve described it leaves us open to the possibility that 500 numbers will be generated, but not all 500 numbers will match the labels of students. Yes, it’s a small detail, but it’s been a part of rubrics for this type of question in the past. (The 3 things that are usually looked for: (1) give the population unique numbers, (2) generate n unique numbers within the range of the numbers from the population, (3) choose the [individuals] corresponding to those numbers and administer [thing].) Your response clearly does 2 of those 3 things, and would likely earn partial instead of full credit.

In part (b), your response essentially summarizes the question (we’ll provide a more precise estimate of the proportion), without fully explaining why that happens. You mention that there will be “less variability”, but do not mention how stratifying the sample will do that. When explaining why we stratify (or block in experiments), it’s important to connect the stratification/blocking variable to the response variable: in this case, that means explaining that the type of instruction a student receives may impact their opinion of the quality of that instruction (then insert a possible reason for this), which is why stratifying might help reduce variation: we’ll have separated the sample based on a factor that would influence the response, making our estimate more representative of the true population proportion.
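To see the “less variability” idea in action, here is a rough simulation sketch. Everything numeric in it is an assumption made up for illustration: I’m pretending the two strata are the same size (15,000 each) and that 90% of in-person students and 30% of online students would agree. The FRQ gives none of these values.

```python
import random
import statistics

rng = random.Random(42)

# Hypothetical population: 1 = "agrees instruction is good", 0 = "does not".
# The stratum sizes and agreement rates below are invented for this sketch.
in_person = [1] * 13_500 + [0] * 1_500    # 15,000 students, 90% agree
online = [1] * 4_500 + [0] * 10_500       # 15,000 students, 30% agree
everyone = in_person + online             # 30,000 students, 60% agree overall

def srs_estimate(n=500):
    # Sample proportion from one simple random sample of the whole population.
    return sum(rng.sample(everyone, n)) / n

def stratified_estimate(n=500):
    # Each stratum is half the population, so each gets half the sample.
    half = n // 2
    agree = sum(rng.sample(in_person, half)) + sum(rng.sample(online, half))
    return agree / n

# Repeat each method many times and compare how much the estimates vary.
srs_results = [srs_estimate() for _ in range(1000)]
strat_results = [stratified_estimate() for _ in range(1000)]

sd_srs = statistics.stdev(srs_results)
sd_strat = statistics.stdev(strat_results)
print(f"SD of SRS estimates:        {sd_srs:.4f}")
print(f"SD of stratified estimates: {sd_strat:.4f}")
```

Under these made-up numbers, the stratified estimates should cluster more tightly around 0.60 than the SRS estimates do. The gap shrinks as the two groups’ opinions get closer together, which is exactly why your explanation needs to say that the type of instruction may affect the response.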

Hope this helps!
~Jerry

a. For a SRS, the university can assign each student a # from 00000-29999. Then, use a random generator to randomly drawn a #. Each student with the corresponding # will answer the question about the quality of teaching. Repeat is not allowed. Stop until the 500th student is reached. Keep track of their answers to whether the quality of teaching is good or bad.

b. A stratified sample may provide a more precise estimate of the proportion of all students who would agree that they are receiving good instruction than the SRS method because students who receive online lectures might enjoy it more than in-person lectures, therefore, respond differently to the questions (a variable that influences the response). We would not know if the quality is actually good or it’s the type of teaching that makes the student respond good (confounding.) Stratified helps us to give a representative sample of the student from each group(less variation) while SRS makes sure every sample has the same chance(more variation.)

Hello again!

In part (a), your response is strong, but missing one small component: when you use the RNG to select 500 students without repeats (as you should), you must specify that you want the RNG to select numbers within the range of 00000-29999. The way you’ve described it leaves us open to the possibility that 500 numbers will be generated, but not all 500 numbers will match the labels of students. Yes, it’s a small detail, but it’s been a part of rubrics for this type of question in the past. (The 3 things that are usually looked for: (1) give the population unique numbers, (2) generate n unique numbers within the range of the numbers from the population, (3) choose the [individuals] corresponding to those numbers and administer [thing].) Your response clearly does 2 of those 3 things, and would likely earn partial instead of full credit. [FYI, I gave this same feedback to the poster above you on the thread] In your case, you could potentially fix this by saying “Repeats and numbers that do not match a student label are not allowed… stop when the 500th student is reached.”

For part (b), your response is strong. You clearly connect the stratifying variable to the response variable in the scenario, and justify the use of stratification as providing precision in our response. This would likely earn full credit!

~Jerry

A. Label each student with a 2 digit number from 01-3,000, where each 2 digit number represents one student. Starting at a given line of the random digit table, look at 2 digits at a time, reading from left to right, ignoring 00, numbers above 3,000, and any repeats. Stop when you selected 500 distinct students for the sample. Make sure to list the individuals selected.

B. A stratified random sample may provide a more precise estimate of the proportion of all students because there is little variability between each stratum (in-person vs digitally). When stratifying the sample into homogenous groups, you are also reducing the sampling error which is more effective and convenient than a simple random sampling method that includes heterogeneous individuals in the sample.

Welcome!

While I can see where you’re going with your work in part (a), there are some miscommunications along the way. First - it would be impossible to give 30,000 students a unique 2-digit number label (there are only 100 possibilities there!), so we’ll have to use more digits (5 in this case, from 00000-29999 or 00001-30000, something like that). Your description of using a random digit table is accurate given your setup (but would need to be edited to accommodate the 5 digits), and you give clear instructions for how to handle repeats or numbers outside of the appropriate range. However, the initial mistake would be too much to overcome to receive full credit.
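For anyone who wants to see the 5-digit version of the random digit table method spelled out, here is a small Python sketch. The `read_digit_table` helper and the string of digits standing in for a printed table are both made up for illustration, and I’m assuming the labels run from 00001 to 30000.

```python
import random

def read_digit_table(digits, n, lowest=1, highest=30_000, width=5):
    """Read `width` digits at a time, skipping out-of-range labels and repeats."""
    chosen = []
    seen = set()
    for start in range(0, len(digits) - width + 1, width):
        label = int(digits[start:start + width])
        # Ignore 00000, anything above 30000, and labels already selected.
        if lowest <= label <= highest and label not in seen:
            seen.add(label)
            chosen.append(label)
            if len(chosen) == n:
                break
    return chosen

# Tiny hand-checkable example: 99999 is out of range, the second 00001 is a repeat.
print(read_digit_table("0000112345999990000130000", 3))  # [1, 12345, 30000]

# Stand-in for a printed random digit table: a long string of random digits.
rng = random.Random(7)
table = "".join(rng.choice("0123456789") for _ in range(20_000))
labels = read_digit_table(table, 500)
```

Only about 30% of 5-digit blocks land in range, so you burn through a lot of the table, which is one reason a random number generator restricted to 1-30,000 is usually the easier method to describe.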

In part (b), you do a good job of describing why a stratified sample reduces variability, but you should go a little further in describing why the two strata would have little variability and what there would be little variability about - in this case, you could address both by saying something like “students receiving the different types of instruction (in-person vs digitally) might have differing opinions on the quality, as the ability of the professors may differ on the different platforms.” That is - you should make a clear connection to the response being measured.

~Jerry

a.) The administrators should conduct a simple random sample by randomly assigning each student a unique number from 1-30,000. Then the administrators can use a random number generator and choose 500 numbers within the range of 1-30,000. The chosen 500 numbers would then represent the sample of 500 students who would be asked for their opinion of the quality of teaching by the administrators.

b) The stratified random sample may provide a more precise estimate of the proportion of all students who would agree that they are receiving good instruction than the SRS method because students who receive in-person instruction may enjoy the instruction more than the people who receive online lectures. Therefore, the type of instruction received will influence a student’s response to the administrator’s question. Thus, the type of instruction received will be a confounding variable and result in variability in responses. To reduce variability, stratification of the type of instruction received will result in more precise estimates of students who agree they are receiving good instruction.

Very well done! In part (a), you check all of the boxes that we as readers must look for - each individual is given a unique number, the numbers you generate are within the bounds of the numbers you assign, and you clearly indicate what is being done with the individuals selected.

In part (b), you give a clear description of why stratification is done, in the context of the scenario. Nicely done!

~Jerry

A) The administration of the university should conduct a simple random sample by assigning each student that attends the university a unique number between 1 and 30,000. The administration should then use a random number generator to create a list of 500 numbers between 1 and 30,000. The 500 individuals whose numbers are chosen will represent the sample of students asked about the quality of teaching at the university.

B) If the administration uses a stratified random sample, the stratification of in person courses and digital courses help guarantee representation from both course types. For example, a student who receives in person lessons might enjoy the hands on teaching quality as opposed to an online student who may get less of that. By performing a SRS within each strata of in person and digital courses, the administration will receive more precise estimates of all students who agree they are receiving good teaching instruction.

A. These researchers could use a random number generator to conduct a simple random sample. To do this, researchers could assign each student at the university a number from 1-70,000 so that each student has a number. Then, using a random number generator, researchers would use the first 500 numbers generated by this number generator to use within their experiment.
B. A stratified random sample may provide a more precise estimate because the variable being stratified upon (in this case, the type of instruction a student receives) may have an impact on the proportion of all students who would agree they’re receiving good instruction, which is why researchers may want representation from students who primarily receive in-person instruction and students who primarily receive digital instruction.

a)
Have every student assigned a unique integer from 1-30,000. Then use a random integer calculation to select 500 of those integers without replacement and the students with the assigned integers that were chosen will be part of the sample.
b)
By using stratification we can avoid some sampling bias that could be caused by under-coverage of online or in person teaching quality. Thus when we use strata both the opinions of online students and in person students will be equally reflected.

Good work on both parts! Part (a) provides just about all of the information required to execute an SRS. The small tweak is that you should say “500 unique numbers between 1 and 30,000,” as your answer indirectly leaves repeats as an option, which we don’t want. In part (b) you provide an in-context reason for differing opinions between the groups (and therefore a reason for stratifying our sample!).

Hello again!

Be careful in part (a) - you say “1-70,000” when in this scenario there are only 30,000 students. There was a similar CollegeBoard problem with 70,000 students though lol. Then, you should be clear that the 500 numbers you’re generating are unique and within the boundaries of the assigned numbers. You can see an answer submitted by “devangana.rana” a few days back for an example of this.

In part (b), you give a good description of why stratification is used - but you need to go a little further in describing why the type of instruction a student receives would influence their opinion in this particular case. It can really be any reason you want to come up with, as long as you defend it.

~Jerry


*ap® and advanced placement® are registered trademarks of the college board, which was not involved in the production of, and does not endorse, this product.