Data Analysis Project

All Excel output should be copied into a single Word document where you must enter all of your responses to the questions below. Format the document professionally so it flows well. Include a table of contents.

Choose a published database from the Bethel library. You may opt to use one of the data files provided by the instructor if applicable. (I chose the BASEBALL file and will upload it.)

Explain each variable in the file that you are analyzing. Be sure your file includes at least 3 scale variables and at least 2 nominal variables. 

Conduct a descriptive analysis on any 2 interval / ratio variables you wish using Descriptive_Statistics.xls and Frequency_Distribution.xls. Explain the output. 
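
As a rough cross-check on what the two Excel templates report, the same kind of summary can be sketched in Python. The win totals below are hypothetical values made up for illustration, not taken from the BASEBALL file, and the class width of 10 is an arbitrary choice.

```python
import statistics
from collections import Counter

# Hypothetical win totals for eight teams (illustrative only, not from the data file)
wins = [68, 76, 78, 79, 84, 88, 92, 98]

# Descriptive statistics
mean_wins = statistics.mean(wins)
median_wins = statistics.median(wins)
stdev_wins = statistics.stdev(wins)      # sample standard deviation (divides by n - 1)
range_wins = max(wins) - min(wins)

# Frequency distribution using classes of width 10 (60-69, 70-79, ...)
def class_lower_bound(value, width=10):
    """Return the lower bound of the class that contains value."""
    return (value // width) * width

frequency = Counter(class_lower_bound(w) for w in wins)

print(f"mean={mean_wins:.3f} median={median_wins} stdev={stdev_wins:.3f} range={range_wins}")
for lower in sorted(frequency):
    print(f"{lower}-{lower + 9}: {frequency[lower]}")
```

When explaining the output, compare the mean with the median (a gap suggests skew) and describe where the frequency counts cluster.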

Conduct 3 different hypothesis tests of your choice using appropriate variables from the file (note: you must use 3 different tests and not run one test on 3 different variables).

In each case, state the variables being tested as well as the hypothesis, decision and conclusion.

Use 3 of the following (1-Sample Test for Means, 1-Sample Test for Proportions, 2-Sample Test for Means – Independent Samples, 2-Sample Test for Means – Paired Samples, 2-Sample Test for Proportions, Analysis of Variance, Chi Square Goodness of Fit Test, Chi Square Test of Independence, Correlation Test). 
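
To illustrate the required write-up (variable, hypothesis, decision, conclusion), here is a minimal Python sketch of a 1-Sample Test for Means. The win totals are hypothetical, the null value of 81 wins assumes a 162-game season, and the critical value 2.365 is the two-tailed t value for alpha = 0.05 with 7 degrees of freedom as read from a standard t table. It only shows the logic; it is not a substitute for the output the assignment asks for.

```python
import math
import statistics

# Variable being tested: wins (hypothetical values, not from the data file)
wins = [68, 76, 78, 79, 84, 88, 92, 98]

# Hypotheses: H0: mu = 81   versus   H1: mu != 81  (two-tailed, alpha = 0.05)
mu_0 = 81
n = len(wins)
sample_mean = statistics.mean(wins)
sample_sd = statistics.stdev(wins)          # sample standard deviation

# Test statistic: t = (x-bar - mu_0) / (s / sqrt(n)), with n - 1 degrees of freedom
standard_error = sample_sd / math.sqrt(n)
t_stat = (sample_mean - mu_0) / standard_error

# Critical value for df = 7, two-tailed alpha = 0.05 (assumed from a t table)
t_critical = 2.365

# Decision and conclusion
if abs(t_stat) > t_critical:
    decision = "reject H0"
    conclusion = "mean wins differ from 81"
else:
    decision = "fail to reject H0"
    conclusion = "no evidence that mean wins differ from 81"

print(f"t = {t_stat:.3f}, df = {n - 1}: {decision}; {conclusion}")
```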

Develop a model to predict an interval / ratio variable using at least 2 other variables.

Use Multiple_Regression.xls and state the regression model and which variables are or are not significant. Also, use the model to make a prediction by making up values for each of the independent variables. 
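
The "state the model, then predict" step can be sketched as follows. This is a minimal least-squares fit that solves the normal equations directly, using contrived data built to follow wins = 60 + 0.5 * salary - 2 * turf exactly. The variable names (salary in $ millions, a 0/1 turf indicator) and all values are assumptions for illustration, not the contents of the BASEBALL file. The sketch does not compute p-values, so significance still has to be read from the Multiple_Regression.xls output.

```python
def fit_least_squares(rows, y):
    """Fit y = b0 + b1*x1 + ... by solving the normal equations (X'X) b = X'y."""
    X = [[1.0] + list(r) for r in rows]          # prepend the intercept column
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(x[i] * x[j] for x in X) for j in range(k)]
         + [sum(x[i] * yi for x, yi in zip(X, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

# Contrived data: (salary in $ millions, turf indicator) -> wins
predictors = [(20, 0), (30, 1), (40, 0), (50, 1), (60, 0), (36, 1)]
wins = [70, 73, 80, 83, 90, 76]

b0, b_salary, b_turf = fit_least_squares(predictors, wins)
print(f"wins = {b0:.2f} + {b_salary:.2f} * salary + {b_turf:.2f} * turf")

# Prediction for made-up values: salary = 45, grass field (turf = 0)
predicted_wins = b0 + b_salary * 45 + b_turf * 0
print(f"predicted wins: {predicted_wins:.1f}")
```

Real data will not fit exactly, so the template's coefficients will come with standard errors and p-values that this sketch omits.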

Write a one to two page summary of your findings. Include the data file in the appendix.

Below is a description of the database I am using. I have uploaded the data.

BASEBALL: This file includes actual team by team data for the 1997 MLB season. The key variable to predict in Multiple Regression Analysis is the number of wins (or possibly the attendance). Lots of interesting analysis possibilities here, including how team salary relates to a team making the playoffs, or whether money buys wins, or how wins relate to attendance, or how performance on the field relates to the field surface, etc. If you know something about baseball, this file should make sense to you.

Note that you should not use a nominal variable with 3 or more values in the Multiple Regression Analysis.
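
One way to respect this restriction is to check how many distinct values a nominal variable has before using it, and to code a two-valued variable as 0/1. The sketch below assumes a hypothetical field-surface column with the labels "Grass" and "Turf"; the actual column name and labels in the file may differ.

```python
def encode_binary(values):
    """Code a two-valued nominal variable as 0/1; refuse anything else."""
    categories = sorted(set(values))
    if len(categories) != 2:
        raise ValueError(
            f"expected exactly 2 categories, found {len(categories)}: {categories}"
        )
    # The alphabetically first category becomes 0, the other becomes 1
    return [categories.index(v) for v in values]

# Hypothetical field-surface values (illustrative only)
surface = ["Grass", "Turf", "Grass", "Grass", "Turf"]
surface_code = encode_binary(surface)
print(surface_code)
```

A variable with three or more labels raises an error here rather than being coded 0, 1, 2, because that coding would impose an order and spacing the categories do not have.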