
MATH 107 Curve-fitting Project - Linear Regression Model (UMUC)

A. Summary
For this assignment you will collect data that exhibit a relatively linear trend, find the line of best fit, plot the data and the line, interpret the slope, and use the linear equation to make a prediction. You will also find r² (the coefficient of determination) and r (the correlation coefficient). Finally, you will write a report discussing your findings. Your topic may be related to sports, your work, a hobby, or something you find interesting. If you choose, you may use the suggestions described below.
There are two assignments for you to complete with respect to this project:
1. A proposal for your project, posted online in the Linear Model Project Proposal discussion group. In addition to describing your topic in a few sentences, your posting must include your data, how/where you obtained your data, and a rough scatterplot.
2. A report in which you document the linear regression you did on your data, what you found, and predictions based on your results.
B. Background
Linear regression is a technique for examining real-world data to determine whether the data follow a linear model. In other words, given some data points, can we reliably use a line to model the points and make predictions? There are tools available that find the line that best approximates a set of data points. These tools also provide a measure of how well the line fits the data values. If a line exists that is a good fit, then we can use the line to make predictions for values we do not have.
There are a variety of reference materials available to help you complete the project. Your textbook has a brief introduction to mathematical models on pages 114-117. The following YouTube video is an introduction to linear regression; it provides background and motivation rather than showing how to actually compute a linear regression. See: Introduction to Linear Regression
Suzanne Sands (a teacher at UMUC) has made two video tutorials that show you how to compute a linear regression using Excel. See: Excel Linear Regression Tutorial #1 and Excel Linear Regression Tutorial #2
Suzanne has also made a video on using a free online tool (www.meta-calculator.com) to do a linear regression. See: Online Linear Regression Tutorial
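To make concrete what such tools compute, here is a minimal sketch of the standard least-squares formulas in Python. The function name `fit_line` and the sample numbers are hypothetical illustrations and are not part of the course materials; Excel or the online calculator should report the same quantities for the same data.

```python
import math

def fit_line(xs, ys):
    """Return (slope, intercept, r) for the least-squares line y = slope*x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and cross-deviations from the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each take at least two distinct values")
    slope = sxy / sxx                      # rise in y per unit of x
    intercept = mean_y - slope * mean_x    # line passes through (mean_x, mean_y)
    r = sxy / math.sqrt(sxx * syy)         # correlation coefficient
    return slope, intercept, r

# Made-up illustrative data (not from any real source).
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1]
slope, intercept, r = fit_line(xs, ys)
print(f"y = {slope:.3f}x + {intercept:.3f}, r = {r:.4f}, r^2 = {r * r:.4f}")
```

For a simple linear regression like this one, r² is just the square of r, which is why the sketch returns only r.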
C. Instructions
For this assignment, collect data exhibiting a relatively linear trend, find the line of best fit, plot the data and the line, interpret the slope, and use the linear equation to make a prediction. Also, find r² (coefficient of determination) and r (correlation coefficient). Discuss your findings. Your topic may be related to sports, your work, a hobby, or something you find interesting. Several suggested topics are provided at the end of these instructions.
1. Describe your topic, provide your data, and cite your source. You must have at least 8 data points for this project. Post this information in the Linear Model Project Proposal (see the discussion group for a detailed list of requirements for this posting). This summary is also the first part of your project report. Each student must use different data. The idea with the discussion posting is two-fold: (1) to share your interesting project idea with your classmates, and (2) to give me a chance to give you a brief thumbs-up or thumbs-down about your proposed topic and data. Sometimes students get off on the wrong foot or misunderstand the intent of the project, and your posting provides an opportunity for some feedback. Remark: Students may choose similar topics, but must have different data sets. For example, several students may be interested in a particular Olympic sport, and that is fine, but they must collect different data, perhaps from different events or a different gender.
2. Plot the points (x, y) to obtain a scatterplot. Use an appropriate scale on the horizontal and vertical axes and be sure to label the axes carefully, including units. Visually judge whether the data points exhibit a relatively linear trend. (If so, proceed. If not, try a different topic or data set.)
3. Find the line of best fit (regression line) and graph it on the scatterplot. The equation of the line must be included on the graph or in the text.
4. State the slope of the line of best fit. Carefully interpret the meaning of the slope in a sentence or two.
5. Find and state the value of r², the coefficient of determination, and r, the correlation coefficient. Discuss your findings in a few sentences. Is r positive or negative? Why? Is a line a good curve to fit to this data? Why or why not? Is the linear relationship very strong, moderately strong, weak, or nonexistent?
6. Choose a value of interest and use the line of best fit to make an estimate or prediction. Show calculation work.
7. Write a brief narrative of a paragraph or two. Summarize your topic (same information that you posted online at the beginning of the project) as well as your findings. Be sure to mention any aspect of the linear model project (topic, data, scatterplot, line, r, or estimate, etc.) that you found particularly important or interesting. Do not just mimic what I have said in my sample project; thoughtfully describe your own project.
Items #1-#7 constitute your project report. You may submit all of your project report in one document or a combination of documents, which may consist of word processing documents or scanned handwritten work, provided it is clearly labeled where each task can be found. If you used Excel or other spreadsheet software to do the graphs, you must copy the resulting graphs into your word processing document. In the past, students have tried to hand in projects with the text portion written in a spreadsheet; this is confusing and poorly presented and will no longer be accepted. Be sure to include your name. Projects are graded on the basis of completeness, correctness, and strength of the narrative portions. While mathematics work can be hand-written, any descriptions, sentences, or paragraphs must be typed!
D. Suggested Topics
You are welcome to use a topic of your own. Several ideas are listed below. If you are using your own topic, it is important to note that your topic cannot involve a physical law that is defined to be linear. For example, an inappropriate choice for a topic would be to relate the time it takes to travel somewhere with the distance travelled. The reason this is an inappropriate choice is that physical laws tell us that distance = speed × time, which is a linear relationship. Further, since we already know the equation of the line, doing a linear regression for this case is not interesting!
Another example of an inappropriate choice for a topic is data that exhibit a linear trend but have no apparent cause to do so. For example, if you graph the divorce rate in Maine vs. the consumption of margarine, you will find these values correlated. This is an inappropriate topic for the project because we have no reason to believe that margarine causes divorces! The goal of this project is to use data that appear to be roughly linear, but where the formula or equation is not known ahead of time, and to show how the data can be modelled with a line found through linear regression.
Choose an Olympic sport and an event that interests you. Go to http://www.databaseolympics.com/ and collect data for winners in the event for at least 8 Olympic games (dating back to at least 1980). (Example: Winning times in Men's 400 m dash.) Make a quick plot for yourself to eyeball whether the data points exhibit a relatively linear trend. (If so, proceed. If not, try a different event.) After you find the line of best fit, use your line to make a prediction for the next Olympics (2014 for a winter event, 2012 or 2016 for a summer event). NOTE: Not all Olympic events lend themselves to this type of analysis. For instance, downhill skiing times from different Olympics cannot be compared because the race courses can be very different, unlike swimming events, where the same swimming pool specifications are used at each Olympics.
Choose a particular type of food. (Examples: Fish sandwich at fast-food chains, cheese pizza, breakfast cereal.) For at least 8 brands, look up the fat content and the associated calorie total per serving. Make a quick plot for yourself to eyeball whether the data exhibit a relatively linear trend. (If so, proceed. If not, try a different type of food.) After you find the line of best fit, use your line to make a prediction corresponding to a fat amount not occurring in your data set. Alternative: Look up carbohydrate content and associated calorie total per serving.
Choose a sport that particularly interests you and find two variables that may exhibit a linear relationship. For instance, for each team for a particular season in baseball, find the total runs scored and the number of wins. Excellent websites: http://www.databasesports.com/ and http://www.baseballreference.com/
