How to Use the SGP Package to Calculate Student Growth Percentiles (SGP) and Percentile Growth Trajectories

The SGP package provides classes, functions, and data for calculating student growth percentiles (SGPs) and percentile growth projections/trajectories from large-scale, longitudinal education assessment data. Quantile regression is used to estimate the conditional density associated with each student’s achievement history, and the coefficient matrices from those regressions are then used to derive percentile growth projections/trajectories.

A student’s SGP measures the student’s relative progress on a subject-matter test compared with the progress of other students with comparable score histories on that test. It is a number between 1 and 99, with higher numbers representing greater relative progress. An SGP of 70 indicates that, in the most recent testing window, the student’s growth equaled or exceeded that of 70 percent of students with similar score histories on that test.

SGPs estimated from standardized test scores are error-prone measures of the latent achievement attribute that they represent. This is because a standardized test contains only a limited number of items and can be administered in only a limited number of ways, so an observed score is a noisy estimate of a student’s true achievement. Despite these errors, research has reported that SGPs estimated from standardized test scores can provide reliable, valid, and meaningful measures of student progress.

To see where estimation errors enter, it helps to view an SGP analysis as two steps. First, a quantile regression model estimates, for each student, the distribution of current scores expected given the student’s prior scores. Then, the student’s actual current score is located within that estimated distribution. The resulting percentile is a reasonable approximation of the student’s true SGP, but it carries error from two sources: the error in the prior scores used for estimation and the error in the current score used for the comparison.
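The two steps can be sketched in Python. This is only an illustration under simplifying assumptions: instead of quantile regression, the conditional distribution is approximated by the empirical percentiles of peers who share the same prior score. The function names `percentile_cuts` and `sgp_from_cuts` and all scores are hypothetical and are not part of the SGP package.

```python
from bisect import bisect_right
from statistics import quantiles


def percentile_cuts(peer_scores):
    """Step 1: estimate the 99 percentile cuts (1st to 99th) of the
    current-year scores earned by peers with a similar score history."""
    return quantiles(peer_scores, n=100, method="inclusive")


def sgp_from_cuts(current_score, cuts):
    """Step 2: count the cuts at or below the student's current score,
    then clamp the result to the reported 1-99 range."""
    rank = bisect_right(cuts, current_score)
    return min(99, max(1, rank))


# Hypothetical current-year scores of peers with the same prior score.
peers = list(range(300, 400))  # 300, 301, ..., 399
cuts = percentile_cuts(peers)
print(sgp_from_cuts(350, cuts))
```

Because both the cuts and the current score come from observed test scores, any measurement error in either one carries through to the resulting percentile.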

These errors do not simply cancel each other out, but the SGP package includes functions intended to compensate for them. These functions use the same distribution for all analyses, so that the error associated with a particular year’s data is averaged out across subsequent years’ analyses. The aim is to support accurate and valid comparisons of student progress across assessment years.

For example, consider a sixth grader named Simon who is taking a statewide English language arts (ELA) assessment for the fifth time. Suppose he achieved a scale score of 370 on the assessment this year and a scale score of 300 last year, and that the analysis assigns this growth an SGP of 85. This means that Simon’s growth equaled or exceeded that of 85 percent of his academic peers, that is, students with similar score histories.
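As a rough check on the arithmetic, the Python sketch below ranks Simon’s current score against a hypothetical peer group, all of whom are assumed to have scored 300 last year. The 20 peer scores are invented for illustration, and the helper `percent_of_peers_below` is not an SGP package function; the package estimates this ranking with quantile regression rather than by direct counting.

```python
def percent_of_peers_below(current_score, peer_scores):
    """Return the percentage of peers whose current score is below the student's."""
    below = sum(1 for score in peer_scores if score < current_score)
    return round(100 * below / len(peer_scores))


# Hypothetical current-year scores for 20 peers who also scored 300 last year.
peer_scores = [
    310, 315, 320, 325, 330, 335, 338, 340, 345, 350,
    352, 355, 358, 360, 362, 365, 368, 375, 380, 390,
]

simon_sgp = percent_of_peers_below(370, peer_scores)
print(simon_sgp)  # 85, because 17 of the 20 peers scored below 370
```

A different peer group with the same prior score would give a different result, which is why the same 70-point gain does not always map to the same SGP.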

As a note of caution, the decision of whether to format the data in WIDE or LONG format is driven by many factors. In general, for operational SGP analyses performed year after year, LONG format is recommended for its preparation and storage benefits. However, the lower-level functions studentGrowthPercentiles and studentGrowthProjections require WIDE format data, whereas the higher-level wrapper functions built on them work with LONG format data.
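The difference between the two layouts can be illustrated with a small Python sketch. The column names and scores below are hypothetical, and the helper `long_to_wide` is written only for this example; the SGP package itself is an R package, so this is not its API.

```python
from collections import defaultdict

# LONG format: one row per student per year (hypothetical columns and scores).
long_rows = [
    {"ID": 1, "YEAR": 2022, "GRADE": 5, "SCALE_SCORE": 300},
    {"ID": 1, "YEAR": 2023, "GRADE": 6, "SCALE_SCORE": 370},
    {"ID": 2, "YEAR": 2022, "GRADE": 5, "SCALE_SCORE": 310},
    {"ID": 2, "YEAR": 2023, "GRADE": 6, "SCALE_SCORE": 335},
]


def long_to_wide(rows):
    """Pivot LONG rows into WIDE rows: one row per student,
    with a grade column and a score column for each year."""
    wide = defaultdict(dict)
    for row in rows:
        student = wide[row["ID"]]
        student["ID"] = row["ID"]
        student[f"GRADE_{row['YEAR']}"] = row["GRADE"]
        student[f"SCALE_SCORE_{row['YEAR']}"] = row["SCALE_SCORE"]
    return [wide[key] for key in sorted(wide)]


wide_rows = long_to_wide(long_rows)
for row in wide_rows:
    print(row)
```

In LONG format, a new assessment year is appended as new rows, while WIDE format needs new columns for every added year, which is one reason LONG format tends to be easier to maintain over time.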