
The attached essay is the third milestone for my final project. I am looking for someone to revise it as my final project. I need someone who understands R (Rattle) to possibly revise the model I included and then, using the information below, finalize the project.


Milestone Three

The research into which AP course a college-bound student should take, math or science, analyzed national exam participation numbers for 11th and 12th grade high school students. Because most high schools require four years of math and three years of science with two years of lab, AP math was expected to show higher student exam participation. Moreover, college-bound high school students were expected to take an AP math exam before a science exam, because the top U.S. universities offering STEM degrees require a minimum of four years of advanced math as a prerequisite for admission.

Structure

Over the past 20 years, the percentage of high school students completing advanced mathematics and science courses has increased substantially. The percentage of graduates completing advanced math courses such as Precalculus rose from 13% in 1990 to 35% in 2009, and the percentage completing advanced science courses, including Biology, Chemistry, and Physics, rose from 19% in 1990 to 30% in 2009 (Digest of Education Statistics, 2015, Table 225.40). The NCES Digest of Education Statistics (2015) reported more than 41,000 high schools, including private schools, in the United States (Digest of Education Statistics, 2015, Table 214.10).

More and more high schools across the U.S. are using the Advanced Placement (AP) Program to advance their curriculum with rigorous coursework that emphasizes college preparation. For 2015, The College Board reported that 21,953 U.S. high schools participated in the AP program (The College Board, 2016). The Associated Press (2012) reported that 18 percent of U.S. high school graduates passed at least one AP exam, up from 11 percent a decade earlier.

The top-down decision tree depicted below was constructed using the R package Rattle. It is a classification tree model, chosen because its algorithm does the complex work on its own and requires limited tweaking by a novice still learning the craft. The paragraphs following Figure 1 (Decision Tree AP Program Summary) and Figure 2 (Summary of the Decision Tree model for Classification) explain the structure of the tree in detail.

Figure 1: Decision Tree AP Program Summary

Summary of the Decision Tree model for Classification (built using 'rpart'):

n= 72

node), split, n, loss, yval, (yprob)
* denotes terminal node

1) root 72 63 BIOLOGY (0.12 0.097 0.083 0.11 0.12 0.12 0.097 0.11 0.12)
2) X2015.Students.who.took.AP=118,707,152,745,22,789,302,532,52,678 36 28 CHEMISTRY (0 0.19 0.17 0.22 0 0 0.19 0.22 0)
3) X2015.Students.who.took.AP=171,074,195,526,20,533,223,479 36 27 BIOLOGY (0.25 0 0 0 0.25 0.25 0 0 0.25)
4) X2015.Students.who.took.AP=118,707,22,789,302,532 20 13 CALCULUS AB (0 0.35 0.3 0 0 0 0.35 0 0)
5) X2015.Students.who.took.AP=152,745,52,678 16 8 CHEMISTRY (0 0 0 0.5 0 0 0 0.5 0)
6) X2015.Students.who.took.AP=20,533,223,479 18 9 BIOLOGY (0.5 0 0 0 0 0.5 0 0 0)
7) X2015.Students.who.took.AP=171,074,195,526 18 9 PHYSICS 1 (0 0 0 0 0.5 0 0 0 0.5)
8) Mean.Score>=2.935 8 2 CALCULUS BC (0 0.12 0.75 0 0 0 0.12 0 0) *
9) Mean.Score< 2.935 12 6 CALCULUS AB (0 0.5 0 0 0 0 0.5 0 0) *
10) X2015.Students.who.took.AP=152,745 8 0 CHEMISTRY (0 0 0 1 0 0 0 0 0) *
11) X2015.Students.who.took.AP=52,678 8 0 PHYSICS C - MECH (0 0 0 0 0 0 0 1 0) *
12) X2015.Students.who.took.AP=223,479 9 0 BIOLOGY (1 0 0 0 0 0 0 0 0) *
13) X2015.Students.who.took.AP=20,533 9 0 PHYSICS 2 (0 0 0 0 0 1 0 0 0) *
14) X2015.Students.who.took.AP=171,074 9 0 PHYSICS 1 (0 0 0 0 1 0 0 0 0) *
15) X2015.Students.who.took.AP=195,526 9 0 STATISTICS (0 0 0 0 0 0 0 0 1) *

Classification tree:

rpart(formula = AP.Math...Science.Courses ~ ., data = crs$dataset[crs$train, c(crs$input, crs$target)], method = "class", parms = list(split = "information"), control = rpart.control(minsplit = 8, minbucket = 8, usesurrogate = 0, maxsurrogate = 0))

Variables actually used in tree construction:

[1] Mean.Score X2015.Students.who.took.AP

Root node error: 63/72 = 0.875

n= 72

  CP       nsplit rel error xerror  xstd
1 0.138889 0      1.00000   1.14286 0.000000
2 0.119048 4      0.44444   0.82540 0.060327
3 0.079365 6      0.20635   0.63492 0.066927
4 0.010000 7      0.12698   0.42857 0.065205

Time taken: 0.09 secs

Rattle timestamp: 2016-09-24 13:43:27 KEPAS

Figure 2: Summary of the Decision Tree model for Classification
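
As a quick arithmetic check on the complexity table in the summary above, the sketch below rebuilds the rel error column from the CP and nsplit columns and converts the final relative error into a share of all 72 observations. It is written in plain Python (standard library only) as a side calculation, not as part of the Rattle workflow. The row values are copied from the printed summary, and the variable and function names are illustrative assumptions.

```python
# Rows of the complexity table as printed in the summary: (CP, nsplit, rel_error).
cp_table = [
    (0.138889, 0, 1.00000),
    (0.119048, 4, 0.44444),
    (0.079365, 6, 0.20635),
    (0.010000, 7, 0.12698),
]

# Root node error: observations misclassified at the root / all observations.
root_node_error = 63 / 72


def expected_rel_errors(table):
    """Rebuild the rel error column from CP and nsplit alone.

    Between two consecutive rows, the relative error drops by the
    row's CP multiplied by the number of splits added.
    """
    errors = [1.0]
    for (cp, nsplit, _), (_, next_nsplit, _) in zip(table, table[1:]):
        errors.append(errors[-1] - cp * (next_nsplit - nsplit))
    return errors


rebuilt = expected_rel_errors(cp_table)
for (cp, nsplit, printed), value in zip(cp_table, rebuilt):
    print(f"nsplit={nsplit}  printed={printed:.5f}  rebuilt={value:.5f}")

# Training error of the full tree as a share of all 72 observations.
final_error_rate = cp_table[-1][2] * root_node_error
print(f"root node error  = {root_node_error:.3f}")
print(f"final tree error = {final_error_rate:.3f}")
```

The final figure agrees with the node listing, where the only leaves with a nonzero loss are node 8 (2 observations) and node 9 (6 observations), for 8 of 72 in total.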

The model is a fairly large decision tree with seven internal nodes and eight leaf nodes. The root node (node 1) is labeled Biology, which is its majority class (the yval). The 63 is the loss: the number of the 72 observations that would be misclassified if every observation were labeled Biology. In other words, 88% of the observations have a target value (AP math and science course) other than Biology, and 12% are Biology. The algorithm chose the variable 2015 Students who took AP for the first split, which divides the observations 50/50 between node 2 (Chemistry) and node 3 (Biology). Node 2 uses the same variable to split into nodes 4 and 5: node 4 (Calculus AB) holds 28% of the observations and node 5 (Chemistry) holds 22%.

On the right side, node 3 splits into nodes 6 (Biology) and 7 (Physics 1), each holding 25% of the observations. The algorithm then uses the mean score to split node 4 (Calculus AB) into leaf nodes 8 (Calculus BC, 11%) and 9 (Calculus AB, 17%). Node 5 (Chemistry) splits into leaf nodes 10 (Chemistry) and 11 (Physics C: Mechanics), 11% each. Node 6 (Biology) splits into leaf nodes 12 (Biology) and 13 (Physics 2), 12.5% each. Finally, node 7 (Physics 1) splits into leaf nodes 14 (Physics 1) and 15 (Statistics), 12.5% each.
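
The percentages quoted for each node are that node's share of the 72 training observations. The sketch below, in plain Python with the counts copied from the model summary, recomputes those shares and checks that each pair of child nodes adds up to its parent. The dictionary and helper names are illustrative assumptions, not part of the Rattle output.

```python
# Observation counts (n) per node, copied from the model summary.
node_counts = {
    1: 72,
    2: 36, 3: 36,
    4: 20, 5: 16, 6: 18, 7: 18,
    8: 8, 9: 12, 10: 8, 11: 8, 12: 9, 13: 9, 14: 9, 15: 9,
}
TOTAL = node_counts[1]


def share(node):
    """Percentage of all training observations that reach the given node."""
    return 100 * node_counts[node] / TOTAL


def children_match_parent(parent):
    """rpart numbers the children of node k as 2k and 2k + 1."""
    left, right = 2 * parent, 2 * parent + 1
    return node_counts[left] + node_counts[right] == node_counts[parent]


# A node is a leaf when its would-be left child (2k) is absent.
leaves = [node for node in node_counts if 2 * node not in node_counts]
internal = [node for node in node_counts if 2 * node in node_counts]

for node in sorted(node_counts):
    kind = "leaf" if node in leaves else "internal"
    print(f"node {node:2d} ({kind}): n={node_counts[node]:2d}, share={share(node):.1f}%")

print("children sum to parent:", all(children_match_parent(p) for p in internal))
```

Because the shares describe rows of the training data rather than students, they show how the tree partitions the data set and are not exam participation rates.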

This means that Biology was taken more often than Chemistry, Physics 1, Physics 2, and Physics C: Mechanics, and more often than the math courses. Calculus AB was taken more often than Calculus BC and Statistics. Based on the projections of this model, a college-bound high school student given the choice between an AP math course and an AP science course would take a science course.

Process Documentation. The data set for this research, the National Report, was taken from the College Board's AP Program Participation and Performance Data 2015. The National Report is an Excel document with several worksheets comprising the following raw data (The College Board, 2016):

1. Number of AP exams taken by high school students listed by subject

2. Number of exams by subject for all participating high schools

3. Number of exams by subject accepted by colleges

4. Number of exam takers broken down with AP scores and mean by high school

The original data presented 36 subjects with all of the above breakdowns. With all 36 subjects included, the variables chosen by the algorithms made it impossible to get useful results for the research decision, so the data set was narrowed down to only the math and science subjects. Another problem was the gender and race/ethnicity variables, which made analysis difficult when the algorithms chose to split on one of them. Those variables were not removed from the dataset. Finally, after modifying the partition default from 70/15/15 to 80/10/10, the algorithm chose the "number of students completing exams" variable to split on the AP subjects. The data set was also saved from an Excel file to a comma-delimited file.
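
To illustrate what changing the partition from 70/15/15 to 80/10/10 does to the set sizes, here is a minimal sketch in plain Python. It is not Rattle's own partitioning code, which runs in R with its own seed. The function name, the seed, and the total of 90 rows are assumptions: the essay does not state the row count, and 90 is used only because 80% of 90 gives the 72 training observations reported in the model summary.

```python
import random


def partition(n_rows, train_pct, validate_pct, seed=42):
    """Shuffle row indices and cut them into train/validate/test lists.

    Percentages are integers so the sizes are computed without
    floating-point rounding; the test set takes whatever rows remain.
    """
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    n_train = n_rows * train_pct // 100
    n_validate = n_rows * validate_pct // 100
    train = indices[:n_train]
    validate = indices[n_train:n_train + n_validate]
    test = indices[n_train + n_validate:]
    return train, validate, test


N_ROWS = 90  # assumption: the actual row count is not stated in the essay

for train_pct, validate_pct in [(70, 15), (80, 10)]:
    train, validate, test = partition(N_ROWS, train_pct, validate_pct)
    test_pct = 100 - train_pct - validate_pct
    print(f"{train_pct}/{validate_pct}/{test_pct}: "
          f"train={len(train)}, validate={len(validate)}, test={len(test)}")
```

Under these assumptions the larger training share gives the tree more rows to split on, at the cost of smaller validation and test sets.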

Figure 3: The College Board's National Summary

Evaluation of Results. Although this analysis examined data that included a wealth of detailed information on the number of students who took AP math and science courses, we did not have information on the number of high school students enrolled in STEM degree college programs. Additionally, while the top 50 U.S. universities that offer STEM degrees were included among the colleges accepting exams, they were not identified separately in the data set, and specific prerequisite requirements were not obtained. Such information would have allowed more analysis of whether college-bound students interested in pursuing STEM degrees should take an AP math or science exam.

References

The Associated Press. (2012, May 5). More students taking Advanced Placement classes, but test

The College Board. (2016). AP Program Participation and Performance Data 2015 - Research -

https://research.collegeboard.org/programs/ap/data/archived/ap-2015

The College Board. (2016). Number of schools offering AP exams (Rep.). Retrieved

https://research.collegeboard.org/programs/ap/data/participation/ap-2016

U.S. Department of Education, National Center for Education Statistics. (2012, September).

Percentage of public and private high school graduates taking selected mathematics and

science courses in high school, by selected student and school characteristics: Selected

https://nces.ed.gov/programs/digest/d15/tables/dt15_225.40.asp

U.S. Department of Education, National Center for Education Statistics. (2015). Advanced

U.S. Department of Education, National Center for Education Statistics. (2016, January).

Number of public school districts and public and private elementary and secondary

https://nces.ed.gov/programs/digest/index.asp

Williams, G. J. (2013). Data mining with Rattle and R: The art of excavating data for knowledge

discovery. New York: Springer.
