Question Details


The attached essay is the third milestone for my final project. I am looking for someone to revise it into my final project. I need someone who understands R (Rattle) to tweak the decision tree using the attached .csv dataset and then revise the essay according to the instructions document.



Milestone Three


The research into which AP course a college-bound student should take, math or science, analyzed the national exam participation numbers for 11th and 12th grade high school students. Because most high schools require four years of math and three years of science with two years of lab, AP math was expected to have higher student exam participation. Moreover, college-bound high school students were expected to take an AP math exam before a science exam because the top U.S. universities offering STEM degrees require a minimum of four years of advanced math as a prerequisite for admission.




Over the past 20 years, the percentage of high school students completing advanced mathematics and science courses has substantially increased. The percentage of graduates completing advanced math courses such as Precalculus in high school rose from 13% in 1990 to 35% in 2009, and the percentage completing advanced science courses including Biology, Chemistry, and Physics rose from 19% in 1990 to 30% in 2009 (Digest of Education Statistics, 2015, Table 225.40). The NCES Digest of Education Statistics (2015) reported more than 41,000 high schools, including private schools, in the United States (Digest of Education Statistics, 2015, Table 214.10).


More and more high schools across the U.S. are using the Advanced Placement (AP) Program to advance their curriculum with rigorous coursework emphasizing college preparation. For the year 2015, The College Board reported that 21,953 U.S. high schools participated in the AP program (The College Board, 2016). The Associated Press (2012) reported that 18 percent of U.S. high school graduates passed at least one AP exam, up from 11 percent a decade earlier.


The top-down decision tree depicted below was constructed using the R package Rattle. It is a classification tree model, chosen specifically because its algorithm does the complex work on its own and requires limited tweaking by a novice still learning the craft. The paragraphs following Figure 1 (Decision Tree AP Program Summary) and Figure 2 (Summary of the Decision Tree model for Classification) explain the structure of the tree in detail.


Figure 1: Decision Tree AP Program Summary


Summary of the Decision Tree model for Classification (built using 'rpart'):

n= 72

node), split, n, loss, yval, (yprob)
      * denotes terminal node

 1) root 72 63 BIOLOGY (0.12 0.097 0.083 0.11 0.12 0.12 0.097 0.11 0.12)
 2) X2015.Students.who.took.AP=118,707,152,745,22,789,302,532,52,678 36 28 CHEMISTRY (0 0.19 0.17 0.22 0 0 0.19 0.22 0)
 3) X2015.Students.who.took.AP=171,074,195,526,20,533,223,479 36 27 BIOLOGY (0.25 0 0 0 0.25 0.25 0 0 0.25)
 4) X2015.Students.who.took.AP=118,707,22,789,302,532 20 13 CALCULUS AB (0 0.35 0.3 0 0 0 0.35 0 0)
 5) X2015.Students.who.took.AP=152,745,52,678 16 8 CHEMISTRY (0 0 0 0.5 0 0 0 0.5 0)
 6) X2015.Students.who.took.AP=20,533,223,479 18 9 BIOLOGY (0.5 0 0 0 0 0.5 0 0 0)
 7) X2015.Students.who.took.AP=171,074,195,526 18 9 PHYSICS 1 (0 0 0 0 0.5 0 0 0 0.5)
 8) Mean.Score>=2.935 8 2 CALCULUS BC (0 0.12 0.75 0 0 0 0.12 0 0) *
 9) Mean.Score< 2.935 12 6 CALCULUS AB (0 0.5 0 0 0 0 0.5 0 0) *
10) X2015.Students.who.took.AP=152,745 8 0 CHEMISTRY (0 0 0 1 0 0 0 0 0) *
11) X2015.Students.who.took.AP=52,678 8 0 PHYSICS C - MECH (0 0 0 0 0 0 0 1 0) *
12) X2015.Students.who.took.AP=223,479 9 0 BIOLOGY (1 0 0 0 0 0 0 0 0) *
13) X2015.Students.who.took.AP=20,533 9 0 PHYSICS 2 (0 0 0 0 0 1 0 0 0) *
14) X2015.Students.who.took.AP=171,074 9 0 PHYSICS 1 (0 0 0 0 1 0 0 0 0) *
15) X2015.Students.who.took.AP=195,526 9 0 STATISTICS (0 0 0 0 0 0 0 0 1) *

Classification tree:
rpart(formula = AP.Math...Science.Courses ~ ., data = crs$dataset[crs$train, c(crs$input, crs$target)], method = "class", parms = list(split = "information"), control = rpart.control(minsplit = 8, minbucket = 8, usesurrogate = 0, maxsurrogate = 0))

Variables actually used in tree construction:
[1] Mean.Score                 X2015.Students.who.took.AP

Root node error: 63/72 = 0.875

n= 72

        CP nsplit rel error  xerror     xstd
1 0.138889      0   1.00000 1.14286 0.000000
2 0.119048      4   0.44444 0.82540 0.060327
3 0.079365      6   0.20635 0.63492 0.066927
4 0.010000      7   0.12698 0.42857 0.065205

Time taken: 0.09 secs

Rattle timestamp: 2016-09-24 13:43:27 KEPAS

Figure 2: Summary of the Decision Tree model for Classification
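For readers who want to see how a summary like Figure 2 is produced outside the Rattle interface, the following is a minimal sketch of an equivalent rpart call. The AP data set is not reproduced here, so R's built-in iris data is used purely as a stand-in, and the object name fit is an assumption for illustration. Only the method, parms, and control settings are taken from the call printed in Figure 2.

```r
# Sketch only: iris is a stand-in for the AP data set, which is not available here.
library(rpart)

# Settings copied from the call shown in Figure 2:
# entropy-based ("information") splitting, at least 8 observations
# to attempt a split, at least 8 per leaf, and surrogate splits disabled.
fit <- rpart(
  Species ~ .,
  data    = iris,
  method  = "class",
  parms   = list(split = "information"),
  control = rpart.control(minsplit = 8, minbucket = 8,
                          usesurrogate = 0, maxsurrogate = 0)
)

print(fit)    # the "node), split, n, loss, yval, (yprob)" listing
printcp(fit)  # variables used, root node error, and the CP table
```

With the real data, the formula and data arguments would be replaced by the target column and the training rows, as in the Rattle-generated call.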


The model that has been built is a fairly large decision tree with seven decision nodes and eight leaf nodes. The majority class for the root node (the yval) is Biology. The 63 is the loss: the number of the 72 training observations that would be misclassified if every observation were labeled Biology. In other words, 87.5% of the observations have a value of the target variable AP math and science courses other than Biology, and 12.5% (9 observations) are Biology. The algorithm has chosen 2015 Students who took AP for the first split, which divides the observations 50/50 between node 2 (majority class Chemistry) and node 3 (majority class Biology). Node 2 uses the same variable, 2015 Students who took AP, to split into nodes 4 and 5, which hold 28% of the observations (majority class Calculus AB) and 22% (majority class Chemistry).
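The root node loss and the error columns of the CP table in Figure 2 can be checked by hand. The sketch below uses only numbers printed in Figure 2; the variable names are chosen here for illustration. It relies on the fact that rpart reports "rel error" relative to the root node error, so multiplying by the root loss recovers the number of misclassified training observations at each tree size.

```r
# Values copied from Figure 2
n_root    <- 72   # training observations at the root
root_loss <- 63   # observations that are not the root's majority class

root_error <- root_loss / n_root   # root node error reported by printcp

# "rel error" column of the CP table, one entry per tree size
nsplit    <- c(0, 4, 6, 7)
rel_error <- c(1.00000, 0.44444, 0.20635, 0.12698)

# Relative error is scaled by the root loss, so this gives counts
misclassified  <- round(rel_error * root_loss)
training_error <- misclassified / n_root

# Cross-check: losses printed for the eight leaf nodes (8 through 15)
leaf_loss <- c(2, 6, 0, 0, 0, 0, 0, 0)

print(data.frame(nsplit, rel_error, misclassified,
                 training_error = round(training_error, 3)))
```

The count for the full seven-split tree should agree with the sum of the leaf losses, which is a quick way to confirm that the table and the node listing describe the same model.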


On the right side, node 3 splits into nodes 6 and 7, which each hold 25% of the observations and have majority classes Biology and Physics 1. The algorithm then chooses the mean score to split node 4 (Calculus AB) at 2.935 into leaf nodes 8 and 9, which hold 11% (Calculus BC) and 17% (Calculus AB) of the observations. Node 5 (Chemistry) splits into leaf nodes 10 (Chemistry) and 11 (Physics C: Mechanics), 11%/11%. Node 6 (Biology) splits into leaf nodes 12 (Biology) and 13 (Physics 2), 12.5%/12.5%. Finally, node 7 splits into leaf nodes 14 (Physics 1) and 15 (Statistics), 12.5%/12.5%.
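The percentages quoted for each node are the node's share of the 72 training observations. A short base R sketch, using the n column from Figure 2 (the vector names are illustrative only), makes the arithmetic explicit:

```r
n_root <- 72

# Observation counts per node, read from the "n" column in Figure 2
node_n <- c("2" = 36, "3" = 36, "4" = 20, "5" = 16, "6" = 18, "7" = 18,
            "8" = 8, "9" = 12, "10" = 8, "11" = 8,
            "12" = 9, "13" = 9, "14" = 9, "15" = 9)

# Share of the training observations that reach each node, in percent
share <- round(100 * node_n / n_root, 1)
print(share)

# The eight leaf nodes (8 through 15) should account for every observation
leaf_ids <- as.character(8:15)
stopifnot(sum(node_n[leaf_ids]) == n_root)
```

These shares describe how the training rows are distributed across the tree, not how many students sat each exam.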


What this means is that Biology was taken more than Chemistry, Physics 1, Physics 2, and Physics C: Mechanics. Biology was taken more than math. Calculus AB was taken more than Calculus BC and Statistics. Based on the projections of this model, a college-bound high school student given the choice between an AP math course and an AP science course would take a science course.


Process Documentation. The data set for this research, the National Report, was taken from the College Board's AP Program Participation and Performance Data 2015. The National Report is an Excel document with several worksheets comprising the following raw data (The College Board, 2016).

1. Number of AP exams taken by high school students listed by subject

2. Number of exams by subject for all participating high schools

3. Number of exams by subject accepted by colleges

4. Number of exam takers broken down with AP scores and mean by high school grade, gender, race/ethnicity


The original data presented 36 subjects with all of the above breakdowns. With all 36 subjects included, the variables chosen by the algorithms made it impossible to get useful results for making the research decision. The data set was narrowed down to only the math and science subjects and data. Another problem was the gender and race/ethnicity variables, which made analysis difficult when the algorithms chose to split on one of them. Those variables were not removed from the dataset. Finally, after modifying the partition default from 70/15/15 to 80/10/10, the algorithm chose the "number of students completing exams" variable to split on the AP subjects. The data set was also saved from an Excel file to a comma-delimited (.csv) file.
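The 80/10/10 partition can be sketched in base R as follows. This is an illustration of the idea, not Rattle's own code. The row count of 90 is an assumption: it is not stated in the essay, but it would be consistent with the 72 training observations reported by the model if 80% of the rows were used for training. The seed is arbitrary.

```r
set.seed(42)   # arbitrary seed so the split is repeatable
n_obs <- 90    # ASSUMED number of rows; not stated in the essay

idx <- seq_len(n_obs)

# 80% of the rows for training
train <- sample(idx, size = floor(0.8 * n_obs))

# 10% for validation, drawn from the rows not used for training
validate <- sample(setdiff(idx, train), size = floor(0.1 * n_obs))

# Whatever remains is the test set
test <- setdiff(idx, c(train, validate))

c(train = length(train), validate = length(validate), test = length(test))
```

Under the same assumption, the 70/15/15 default would have left 63 rows for training, so moving to 80/10/10 gives the algorithm slightly more data to build the tree.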


Figure 3: The College Board's National Summary


Evaluation of Results. Although this analysis examined data that included a wealth of detailed information on the number of students who took AP math and science courses, we did not have information on the number of high school students enrolled in STEM degree college programs. Additionally, while the top 50 U.S. universities that offer STEM degrees were included in the colleges accepting exams, they were not identified separately in the data set, and specific prerequisite requirements were not obtained. Such information would have allowed more analysis of whether college-bound students interested in pursuing STEM degrees should take an AP math or science exam.




The Associated Press. (2012, May 5). More students taking Advanced Placement classes, but test pass rate remains about the same. Retrieved from

The College Board. (2016). AP Program Participation and Performance Data 2015 - Research - The College Board. Retrieved from

The College Board. (2016). Number of schools offering AP exams (Rep.). Retrieved

U.S. Department of Education, National Center for Education Statistics. (2012, September). Percentage of public and private high school graduates taking selected mathematics and science courses in high school, by selected student and school characteristics: Selected years, 1990 through 2009. Retrieved from

U.S. Department of Education, National Center for Education Statistics. (2015). Advanced mathematics and science courses. Retrieved from

U.S. Department of Education, National Center for Education Statistics. (2016, January). Number of public school districts and public and private elementary and secondary schools: Selected years, 1869-70 through 2013-14. Retrieved from

Williams, G. J. (2013). Data mining with Rattle and R: The art of excavating data for knowledge discovery. New York: Springer.


Solution details:

This question was answered on: Jan 30, 2021
