Using regression analysis to derive a demand curve (also t-stats, R-squared, F-stat, adjusted R-square) - Anya DeVoss
Final Paper:
Regression analysis is defined as a statistical technique used to find relationships between variables in order to predict future values. It is used to determine the line of best fit for a series of data points that graph the relationship between a dependent variable and an explanatory variable. Regression analysis is used regularly in the business world to determine the relationship between a given dependent variable and various explanatory variables. For example, it can be used to determine equitable compensation for workers: the amount of responsibility a worker has, the number of people they supervise, and similar factors are used to evaluate what contributes to the value of a person's job and what they should be paid based on their duties. Regression analysis simply answers the general question "What is the best predictor of _?". Researchers in the education field may want to determine what is the best predictor of high school success in students or the best predictor of teacher effectiveness. Sales employees may use regression analysis to see the impact of a given type of advertising on the number of sales made because of that advertisement. This type of analysis predicts the outcome of a given indicator (the dependent variable) based on the interactions between it and other related drivers (the explanatory variables). Regression analysis is used all the time in the business world, and you can find examples of it in nearly every academic journal that uses statistics in its research. One example is the article by Blair, Millea, and Hammer (2004), which uses regression analysis to determine the impact of the cooperative education experience (the explanatory variable) on GPA (the dependent variable).
Regression analysis grows out of econometrics, which is a broader and much more complicated field. Econometrics is defined as the application of statistical and mathematical methods in economics to describe the relationships between key economic forces. To find a relationship between variables (one dependent, one explanatory), data is collected, and the gathered information is plotted on a graph. The points will rarely fall in a straight line, so an econometrician must choose the line or smooth curve that best demonstrates the relationship between the points plotted. Econometricians use regression software to find the values that minimize the sum of the squared deviations between the actual points graphed and the regression line drawn (the expected relation).
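The least-squares idea described above can be sketched in a few lines of Python. This is only an illustration: the price and quantity figures are made up, and `fit_line` and `sum_squared_deviations` are hypothetical helper names, not functions from Excel or MegaStat.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared x-deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def sum_squared_deviations(xs, ys, intercept, slope):
    """Sum of squared vertical distances between the points and a candidate line."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical data: five prices and the quantity demanded at each price.
prices = [1, 2, 3, 4, 5]
quantities = [9, 8, 6, 5, 2]

intercept, slope = fit_line(prices, quantities)
best = sum_squared_deviations(prices, quantities, intercept, slope)
# Any other line, such as one with a slightly different slope, fits worse.
worse = sum_squared_deviations(prices, quantities, intercept, slope + 0.1)

print("intercept:", round(intercept, 3), "slope:", round(slope, 3))
print("squared deviations, fitted line:", round(best, 3))
print("squared deviations, nudged line:", round(worse, 3))
```

Nudging the slope away from the fitted value makes the sum of squared deviations larger, which is what it means for the fitted line to be the line of best fit.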
Regression software is readily available to the everyday businessman. Microsoft Excel has regression capabilities through its data analysis feature. The University of Missouri-Columbia has posted a document with a step-by-step description of how to use Microsoft Excel’s regression analysis tool.
Also, a common, easy-to-use regression tool is the regression feature in the MegaStat add-in for Microsoft Excel. You can estimate demand by using regression analysis. To do this, first plot your data points on a scatter plot to determine whether a linear relationship exists between the variables. Simple linear regression then fits a line through the center of the points, which will be the line of best fit. The line of best fit is the straight trend line that runs through and represents the majority of the data points on your graph. This line can pass through some of the points, none of the points, or all of the points depending on where your points fall on your graph.
Using regression analysis software (shown in my MegaStat examples later on), you will be given an estimated equation for the line of best fit. This equation is the estimated demand equation for your variables. The y-intercept for your equation can be found in MegaStat's output by looking at where it says the Coefficient Intercept. The slope for each of your variables can then be found by looking at the variable and the coefficient associated with it in the MegaStat output. You then need to evaluate your equation by looking at the standard errors of the estimates, the t-statistics, and the coefficient of determination to see its goodness of fit.
Standard error is the standard deviation of the sampling distribution of a statistic. It is a statistical term that measures how accurately a sample represents a population. You want the standard error of your model to be as low as possible.
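A test statistic (referred to here as a t-stat) is a number that helps determine whether a hypothesis will be rejected. In regression output, the hypothesis being tested is that an explanatory variable's true coefficient is zero, meaning the variable has no effect on the dependent variable. The t-statistic is the estimated coefficient divided by its standard error. If the t-statistic is close to zero, the data are consistent with that hypothesis and it cannot be rejected. If the t-statistic is far from zero (as a rule of thumb, greater than about 2 in absolute value), the hypothesis is rejected and your explanatory variable is deemed statistically significant.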
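As a rough illustration, the t-statistic for a slope coefficient can be computed by hand as the estimated slope divided by its standard error. The sketch below uses made-up price and quantity data, and `slope_t_statistic` is a hypothetical helper name; regression software such as MegaStat reports these figures for you.

```python
import math


def slope_t_statistic(xs, ys):
    """Return (slope, standard error of the slope, t-statistic) for a one-variable regression."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Sum of squared residuals around the fitted line.
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    # Standard error of the regression uses n - 2 degrees of freedom
    # because two coefficients (intercept and slope) were estimated.
    std_error_regression = math.sqrt(sse / (n - 2))
    std_error_slope = std_error_regression / math.sqrt(sxx)
    return slope, std_error_slope, slope / std_error_slope


# Hypothetical data: five prices and the quantity demanded at each price.
prices = [1, 2, 3, 4, 5]
quantities = [9, 8, 6, 5, 2]

slope, se_slope, t_stat = slope_t_statistic(prices, quantities)

# Two-tailed 5% critical value from a standard t table for 3 degrees of freedom.
CRITICAL_T_5PCT_3DF = 3.182
is_significant = abs(t_stat) > CRITICAL_T_5PCT_3DF

print("slope:", round(slope, 3))
print("standard error of slope:", round(se_slope, 4))
print("t-statistic:", round(t_stat, 3))
print("significant at 5%:", is_significant)
```

A t-statistic this far from zero would lead you to reject the hypothesis that price has no effect on quantity in this made-up sample.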
R-squared is another name for the coefficient of determination. It is a statistical measure of how much of the variability in your dependent variable can be explained by your explanatory variable. It is shown as a value between 0 and 1. The higher your value of R-squared is, the better the fit of the model. In simple regression, the coefficient of determination is symbolized by “r^2” because it is the square of the coefficient of correlation, symbolized by “r”. The coefficient of correlation (r) is a statistical measure of the correlation (or linear relationship) between a dependent variable and your explanatory variable. Its value varies between -1 and 1. An “r” of 1 or -1 means there is perfect correlation, 0 means there is no linear correlation, positive values mean the relationship is positive (when one variable increases, so does the other), and negative values mean the relationship is negative (when one increases, the other decreases).
The coefficient of determination (R-squared) measures the degree of linear correlation of your variables, or the goodness of fit. Higher values of R-squared indicate that the model is a good fit to describe the data, meaning that the demand curve comes very close to the majority of your points. Another way of explaining goodness of fit is that if the model is a good fit to describe your data, the portion of the total variance of your Y variable left unexplained by your demand equation will be small. The smaller the R-squared figure is, the poorer the fit; a value near zero means your demand equation does not portray a relationship between your variables.
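To make the link between r and R-squared concrete, here is a small sketch that computes the coefficient of correlation for made-up price and quantity data and then squares it. The helper name `correlation` is hypothetical, and squaring r gives R-squared only when there is a single explanatory variable.

```python
import math


def correlation(xs, ys):
    """Coefficient of correlation (r) between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: five prices and the quantity demanded at each price.
prices = [1, 2, 3, 4, 5]
quantities = [9, 8, 6, 5, 2]

r = correlation(prices, quantities)
# With one explanatory variable, R-squared is simply r squared.
r_squared = r ** 2

print("r:", round(r, 4))
print("R-squared:", round(r_squared, 4))
```

Here r is negative because quantity falls as price rises, while R-squared is close to 1, so a strong negative relationship still counts as a good fit.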
You also want to look at the F-statistic in your model’s regression output. An F-statistic is a value resulting from a standard statistical test used in ANOVA and regression analysis to determine if the variances between the means of two populations are significantly different. In regression output, it tests whether the model as a whole explains a significant share of the variation in the dependent variable. The F-statistic is rarely interpreted on its own; instead, it is used to determine the model’s P-value. A P-value is the probability of observing a relationship at least as strong as the one in your data purely by chance, if no real relationship existed. Generally, you want the P-value of your model to be less than 5%. A P-value of less than 5% shows that your model is statistically significant.
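For a sense of where the F-statistic comes from, the sketch below computes it from the total and unexplained (error) sums of squares of a regression. The figures are invented for illustration, `f_statistic` is a hypothetical helper name, and the critical value is taken from a standard F table rather than computed.

```python
def f_statistic(total_ss, error_ss, n_obs, n_predictors):
    """F = (explained variation per predictor) / (unexplained variation per degree of freedom)."""
    explained_ss = total_ss - error_ss
    mean_square_model = explained_ss / n_predictors
    mean_square_error = error_ss / (n_obs - n_predictors - 1)
    return mean_square_model / mean_square_error


# Hypothetical regression: 5 observations, 1 explanatory variable,
# total sum of squares 30.0, of which 1.1 is left unexplained.
f_stat = f_statistic(total_ss=30.0, error_ss=1.1, n_obs=5, n_predictors=1)

# 5% critical value from a standard F table for 1 and 3 degrees of freedom.
CRITICAL_F_5PCT = 10.13
model_is_significant = f_stat > CRITICAL_F_5PCT

print("F-statistic:", round(f_stat, 3))
print("significant at 5%:", model_is_significant)
```

An F-statistic above the 5% critical value corresponds to a P-value below 5%, which is the comparison regression software makes when it reports the P-value.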
Adjusted R-square is a modification of R-square that adjusts for the number of terms in a model. R-square always increases when a new term is added to a model, but adjusted R-square increases only if the new term improves the model more than would be expected by chance. Adjusted R-square penalizes the model for using useless or non-explanatory predictors. You want your adjusted R-square value to be close to the value you got for R-square. This means that your model is not padded with explanatory variables that add little to the fit.
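The adjustment can be written as a one-line formula: adjusted R-square = 1 - (1 - R-square) * (n - 1) / (n - k - 1), where n is the number of observations and k is the number of explanatory variables. The sketch below applies it to invented figures; `adjusted_r_squared` is a hypothetical helper name.

```python
def adjusted_r_squared(r_squared, n_obs, n_predictors):
    """Adjusted R-square for a model with n_obs observations and n_predictors explanatory variables."""
    return 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)


# Hypothetical model with one explanatory variable and five observations.
base = adjusted_r_squared(0.9633, n_obs=5, n_predictors=1)

# Adding a second, nearly useless variable nudges R-square up only slightly...
with_extra = adjusted_r_squared(0.9640, n_obs=5, n_predictors=2)

print("adjusted R-square, one variable:", round(base, 4))
print("adjusted R-square, two variables:", round(with_extra, 4))
# ...but the penalty for the extra term pulls adjusted R-square down.
print("extra variable helped:", with_extra > base)
```

In this made-up case R-square rises from 0.9633 to 0.9640 while adjusted R-square falls, which is the signal that the added variable is not pulling its weight.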
If your numbers are all good (meaning your standard errors are low, your t-statistics are far enough from zero to show that your coefficients are statistically significant, your coefficient of determination shows a high correlation between your dependent variable and your explanatory variable, your P-value is less than 5%, and your adjusted R-square is close to the value of your R-squared), then your estimated demand equation is a good representation of your data. Once you have confirmed that your estimated demand equation is a good representation, you can insert forecasted values of your explanatory variables into the equation to predict future demand (your dependent variable).
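Forecasting with an estimated demand equation amounts to plugging numbers into it. The sketch below assumes a hypothetical estimated equation, quantity = 11.1 - 1.7 * price; the coefficients and the forecasted price are invented, and `forecast_demand` is an illustrative helper name.

```python
def forecast_demand(intercept, slopes, forecast_values):
    """Evaluate intercept + slope_1 * value_1 + slope_2 * value_2 + ... for forecasted inputs."""
    if len(slopes) != len(forecast_values):
        raise ValueError("need one forecasted value per explanatory variable")
    return intercept + sum(s * v for s, v in zip(slopes, forecast_values))


# Hypothetical estimated demand equation: quantity = 11.1 - 1.7 * price.
intercept = 11.1
slopes = [-1.7]

# Forecasted price for next period (an assumed value inside the range of the sample data).
predicted_quantity = forecast_demand(intercept, slopes, [3.5])

print("predicted quantity demanded:", round(predicted_quantity, 2))
```

The same function handles several explanatory variables by passing one slope and one forecasted value per variable. Forecasts are most trustworthy when the forecasted values stay within the range of the data used to estimate the equation.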
The following is an example of regression analysis that I created. It uses bivariate regression (which uses one explanatory variable and one dependent variable). The regression output is found using the MegaStat add-in in Microsoft Excel. This example analyzes the relationship between the number of copy machines serviced by a worker (the dependent variable) and the number of minutes required to service the machines (the independent, or explanatory, variable). In other words, it looks at the demand for copy machine maintenance as a function of the number of minutes required to service the machines.
The following is an additional example of regression analysis that I created. It uses multiple regression (which here analyzes three explanatory variables and one dependent variable). The regression output is found using the MegaStat add-in in Microsoft Excel. This example analyzes the relationship between the monthly labor hours needed at a hospital (the dependent variable) and the number of monthly x-ray exposures, monthly occupied bed days, and average length of patients’ stay in days (the explanatory variables). In other words, it looks at the demand for labor at a hospital as a function of how many x-ray exposures are taken monthly, how many days beds are occupied during the month, and the average length of patients' stay in days.
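For readers curious about what regression software does with several explanatory variables, here is a minimal sketch that solves the least-squares normal equations in plain Python. It is not MegaStat's method or output. The six observations are invented and were built from a known equation (y = 2 + 1*x1 + 3*x2 - 0.5*x3) so the result can be checked, and `solve_linear_system` and `fit_multiple` are hypothetical helper names.

```python
def solve_linear_system(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix so row operations apply to both sides at once.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Swap in the row with the largest entry in this column for stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back-substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def fit_multiple(rows, ys):
    """Least-squares coefficients [intercept, b1, b2, ...] from the normal equations."""
    design = [[1.0] + list(r) for r in rows]  # leading 1.0 is the intercept column
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Invented observations: each row holds (x1, x2, x3), standing in for
# three explanatory variables; ys holds the dependent variable.
rows = [(1, 2, 1), (2, 1, 0), (3, 4, 0), (4, 3, 1), (5, 6, 1), (6, 5, 0)]
ys = [8.5, 7.0, 17.0, 14.5, 24.5, 23.0]

coefficients = fit_multiple(rows, ys)
print("intercept and slopes:", [round(c, 4) for c in coefficients])
```

Each slope is read the same way as in the one-variable case: it is the estimated change in the dependent variable for a one-unit change in that explanatory variable, holding the others fixed.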
Sample Questions:
1) True or False. Your model depicts a good fit if your R-squared, or coefficient of determination, is closer to 1.0.
2) Regression analysis looks at the relationship between:
A. Dependent variables and Independent variables
B. Independent variables and Explanatory variables
C. Dependent variables and Explanatory variables
D. None of the above
3) True or False. R-square increases only if the new term improves the model more than would be expected by chance.
4) True or False. An F-statistic determines if the variances between the means of two samples are significantly different.
5) Regression analysis:
A. answers the question "What is the best predictor of _?".
B. answers the question "What is the best indicator of supply?"
C. is not often used outside of a statistics course.
D. does not help you estimate demand.
Answers to Sample Questions:
1) True
2) C. Dependent variables and Explanatory variables
3) False - should say Adjusted R-squared to be true
4) False - should say between two populations to be true
5) A. answers the question "What is the best predictor of _?"
Armstrong, J. S. & Green, K. (2011, August 17). Demand forecasting: Evidence-based methods. Oxford Handbook in Managerial Economics, 1-23.
Baye, M. R. (2006). Managerial Economics and Business Strategy. New York: McGraw-Hill Irwin.
Blair, B., Millea, M. & Hammer, J. (2004, October). The impact of cooperative education on academic performance and compensation of engineering majors. Journal of Engineering Education, 333-338.
-------------------------------------------------END OF FINAL REPORT-------------------------------------------------
WikiSpaces Summaries:
#1) Supply and Demand (Also, include private goods vs. public goods, marginal analysis, producer and consumer surplus and how it applies to perfect competition) by Thaddeus Bogardus
This paper explains many of the basic concepts in a discussion of supply and demand through brief but thorough definitions of those concepts. The author then goes on to describe what causes a demand curve to shift (including a description of various types of goods and their relationships with demand) as well as what causes a supply curve to shift (including different economic factors and forces that influence supply). He then goes into specific detail concerning supply and demand by bringing in four different real-world applications. His definitions, descriptions, and explanations make for a very good and easy-to-understand basic essay on the concepts of supply and demand. A reader with little experience with the topic could fairly easily understand it through reading his essay.
#2) Comparative advantage and trade (How do you determine your comparative advantage?) by Regina Gauger
Regina’s paper begins by explaining that comparative advantage exists when two parties have different relative costs for producing the same goods in the absence of trade, so that both gain from trading. She then goes into various examples of comparative advantage in different industries, including the PC industry, the apparel industry, and the drug trafficking industry. She then explains how you know if you have a comparative advantage by using the same examples as before to describe the specific things that prove a comparative advantage is present. Finally, she presents an opposing view, which holds that a comparative advantage is not always present when two parties both gain from trade.
#3) Profit – opportunity costs including implicit costs and explicit costs, the difference between economic and accounting costs, economic profit, economic losses, and zero economic profit by Matthew Gray
Matthew begins his paper by giving brief definitions of different concepts of profits/losses and costs. He shows graphs depicting these concepts, but does not really provide a written description of the graphs. Although I know what they mean, I am not sure someone who has not taken an economics course such as this would understand them. He links a couple of articles that discuss economic loss as well as a written explanation of various economic and opportunity costs. Although his concepts make sense, his paper seemed somewhat scattered.
#4) Monopoly by Amy Kitzman
Amy begins her essay by defining a monopoly as a single firm that serves an entire market, where there are no close substitutes for its products. She then goes into four different sources that help create monopoly power: economies of scale, economies of scope, cost complementarity, and patents and other legal barriers. She discusses what would show on a graph depicting the market of a monopoly as well as its cost and revenue curves. She goes on to explain that the Sherman Antitrust Act tries to prevent the creation of monopolies, and gives a couple of examples of monopolies in the world.
#5) Market Failure – Antitrust, positive and negative externalities, public goods, rent seeking, tariffs, Government failure by Mark Wilk
Mark begins his essay by discussing how a monopoly causes a market in which the firm produces less than the socially efficient level of output. He discusses how Google has been under watch for using monopolistic strategies to get consumers to use its services. He then moves on to a discussion of negative externalities, specifically looking at pollution in the energy industry. He then discusses how public goods can cause markets to fail as well as how rent seeking can negatively impact a market.
-------------------------------------------------END OF SUMMARIES-------------------------------------------------
WEEKLY BLOG POSTS:
Week of October 3rd-7th, 2011:
Regression analysis is defined as a "statistical technique used to find relationships between variables for the purpose of predicting future values".
WebFinance. (2011). Regression analysis. Retrieved from http://www.investorwords.com/4136/regression_analysis.html
Regression analysis is used to determine the best fit line correlated to a series of data points that are graphing the relationship between a dependent variable and an explanatory variable.
A test statistic (AKA: t-stat) is defined as a "sample used to determine whether a hypothesis will be accepted or rejected. If the test statistic is too far off of the original hypothesis, it will be rejected. Conversely, when a test statistic is close to the original hypothesis, it will likely be accepted."
WebFinance. (2011). Test statistic. Retrieved from http://www.businessdictionary.com/definition/test-statistic.html
Coefficient of correlation (r) is a "statistical measure of the linear relationship (correlation) between a dependent-variable and an independent variable. Represented by the lowercase letter 'r', its value varied between -1 and 1: 1 means perfect correlation, 0 means no correlation, positive values means the relationship is positive (when one goes up so does the other), negative values mean the relationship is negative (when one goes up the other goes down)."
WebFinance. (2011). Coefficient of correlation (r). Retrieved from http://www.businessdictionary.com/definition/coefficient-of-correlation-r.html
R-squared is another name for the coefficient of determination. It is defined as "a statistical method that explains how much of the variability of a factor can be caused or explained by its relationship to another factor. It is computed as a value between 0 (0 percent) and 1 (100 percent). The higher the value, the better the fit. Coefficient of determination is symbolized by r^2 because it is square of the coefficient of correlation, symbolized by r. The coefficient of determination is an important tool determining the degree of linear-correlation of variables ('goodness of fit') in regression analysis."
WebFinance. (2011). Coefficient of determination (r2). Retrieved from http://www.businessdictionary.com/definition/coefficient-of-determination-r2.html
An F-statistic is "a value resulting from a standard statistical test used in ANOVA and regression analysis to determine if the variances between the means of two populations are significantly different."
Hennekens, C. H. (1987). Epidemiology in medicine. Retrieved from http://www.tulane.edu/~panda2/Analysis2/two-way/fstat_and_significance.htm
Adjusted R-square is a "modification of R-square that adjusts for the number of terms in a model. R-square always increases when a new term is added to a model, but adjusted R-square increases only if the new term improves the model more than would be expected by chance."
Mt. Rushmore Securities. R-squared. Retrieved from http://www.hedgefund-index.com/d_rsquared.asp
Week of October 10th-14th, 2011:
Econometrics is defined as "the application of statistical and mathematical methods in the field of economics to describe the numerical relationships between key economic forces such as capital, interest rates, and labor".
WebFinance. (2011). Econometrics. Retrieved from http://www.investorwords.com/1638/econometrics.html
Regression analysis grows out of econometrics, which is a broader and much more complicated field. To find a relationship between variables (one dependent, one explanatory), data is collected, and the gathered information is plotted on a graph. The points will rarely fall in a straight line, so an econometrician must choose the line or smooth curve that best demonstrates the relationship between the points plotted. Econometricians use regression software to find the values that minimize the sum of the squared deviations between the actual points graphed and the regression line drawn (the expected relation).
Regression software is readily available to the everyday businessman. Microsoft Excel has regression capabilities through its data analysis feature. Also, a common, and simple-to-use regression tool is the regression feature in MegaStat.
Standard error is defined as "the standard deviation of the sampling distribution of a statistic. It is a statistical term that measures the accuracy with which a sample represents a population. In statistics, a sample mean deviates from the actual mean of a population; this deviation is the standard error".
Investopedia. (2011). Standard error. Retrieved from http://www.investopedia.com/terms/s/standard-error.asp#axzz1b43n8opO
Week of October 17th-21st, 2011:
The following is an example of regression analysis. It uses bivariate regression (one explanatory variable vs. one dependent variable).
The regression output is found using the MegaStat add-in in Microsoft Excel.
Line of Best Fit: The line of best fit is the straight trend line that runs through and represents the majority of the data points on your graph. This line can pass through some of the points, none of the points, or all of the points depending on where your points fall on your graph.
Goodness of Fit: The value R-squared quantifies the goodness of fit in the model. It is given as a number between 0.0 and 1.0. Higher values of R-squared indicate that the model is a good fit to describe the data, meaning that the demand curve comes very close to the majority of your points. Another way of explaining goodness of fit is that if the model is a good fit to describe your data, the portion of the total variance of your Y variable left unexplained by your demand equation will be small. The smaller the R-squared figure is, the poorer the fit; a value near zero means your demand equation does not portray a relationship between your variables.
Week of November 7th-11th, 2011:
You can estimate demand by using regression analysis. To do this, first plot quantity and price values on a scatter plot to determine whether a linear relationship exists between the variables. Simple linear regression then fits a line through the center of the points, which will be the line of best fit. Using regression analysis software, such as MegaStat in my examples, you will be given an estimated equation for the line of best fit. This equation is the estimated demand equation for your variables. The y-intercept for your equation can be found in MegaStat's output by looking at where it says the Coefficient Intercept. The slope for each of your variables can then be found by looking at the variable and the coefficient associated with it in the MegaStat output. Once you have your estimated demand equation, you then want to evaluate its goodness of fit.
You need to evaluate your equation by looking at the standard errors of the estimates, the t-statistics of the estimates of the coefficients, and the coefficient of determination. If your numbers are all good (meaning your standard errors are low, your t-statistics are far enough from zero to show that your coefficients are statistically significant, and your coefficient of determination shows a high correlation between your dependent variable and your explanatory variable), then your estimated demand equation is a good representation of your data.
Week of November 14th-18th, 2011:
Regression analysis is used regularly in the business world to determine the relationship between a given dependent variable and various explanatory variables. For example, it can be used to determine equitable compensation for workers: the amount of responsibility a worker has, the number of people they supervise, and similar factors are used to evaluate what contributes to the value of a person's job and what they should be paid based on their duties. Regression analysis simply answers the general question "What is the best predictor of _?". Researchers in the education field may want to determine what is the best predictor of high school success in students, of teacher effectiveness, and more. Sales employees may use regression analysis to see the impact of a given type of advertising on the number of sales made because of that advertisement. This type of analysis predicts the outcome of a given indicator (the dependent variable) based on the interactions between it and other related drivers (the explanatory variables).
Week of November 21st-25th, 2011:
Questions:
1) True or False. Your model depicts a good fit if your R-squared, or coefficient of determination, is closer to 1.0.
Answer: True
2) Regression analysis looks at the relationship between:
A. Dependent variables and Independent variables
B. Independent variables and Explanatory variables
C. Dependent variables and Explanatory variables
D. None of the Above
Answer: C. Dependent variables and Explanatory variables
3) True or False. R-square increases only if the new term improves the model more than would be expected by chance.
Answer: False. Adjusted R-squared does this.
4) True or False. An F-statistic determines if the variances between the means of two samples are significantly different.
Answer: False. It should say between two populations.
5) Regression analysis:
A. answers the question "What is the best predictor of _?".
B. answers the question "What is the best indicator of supply?"
C. is not often used outside of a statistics course.
D. does not help you estimate demand.
Answer: A. answers the question "What is the best predictor of _?".
Week of November 28th-December 2nd, 2011:
Armstrong, J. S. & Green, K. (2011, August 17). Demand forecasting: Evidence-based methods. Oxford Handbook in Managerial Economics, 1-23.
Baye, M. R. (2006). Managerial Economics and Business Strategy. New York: McGraw-Hill Irwin.
Blair, B., Millea, M. & Hammer, J. (2004, October). The impact of cooperative education on academic performance and compensation of engineering majors. Journal of Engineering Education, 333-338.
- Anya DeVoss
Final Paper:
Regression analysis is defined as a statistical technique used to find relationships between variables in order to predict future values. It is used to determine the best fit line correlated to a series of data points that are graphing the relationship between a dependent variable and an explanatory variable. Regression analysis is used regularly in the business world to determine the relationship between a given dependent variable on various explanatory variables. It can be used to determine equitable compensation for workers by using the amount of responsibility, the number of people they supervise, and more to evaluate what contributes to the value of a person's job and what compensation they should be paid based on their duties. Regression analysis simply answers the general question "what is the best predictor of ?". Researchers in the education field may want to determine what is the best predictor of high school success in students or the best predictor of teacher effectiveness. Sales employees may use regression analysis to see the impact of a given type of advertising on the number of sales made because of that advertisement. This type of analysis predicts the outcome of a given indicator (the dependent variable), based on the interactions between it and other related drivers (or explanatory variables). Regression analysis is used all the time in the business world, and you can find examples of it in nearly every academic journal that uses statistics in their research. The following is a link to an academic journal that uses regression analysis to determine the impact of the cooperative education experience (the explanatory variable) on GPA (the dependent variable).
Econometrics is a much more complicated phenomenon to arrive at statistical analysis, but is where regression analysis is derived from. Econometrics is defined as the application of statistical and mathematical methods in economics to describe the relationships between key economic forces. To find a relationship between variables (one dependent, one explanatory), data is collected, and the gathered information is plotted on a graph. They will not be in a straight line, so to find a smooth curve an econometrician must choose what demonstrates a good relationship between the points plotted. Econometricians use regression software to find values that minimize the sum of the squared deviations between the actual points graphed and the regression line drawn (the expected relation).
Regression software is readily available to the everyday businessman. Microsoft Excel has regression capabilities through its data analysis feature. This is a link to a document posted by the University of Missouri-Columbia. It has a step-by-step description of how to use Microsoft Excel’s regression analysis tool.
Also, a common, easy-to-use regression tool is the regression feature in the MegaStat add-in for Microsoft Excel. You can estimate demand by using regression analysis. To do this, you need to use simple linear regression to plot your data points on a scatter plot to determine if a linear relationship exists between the variables. You then draw a line through the center of the majority of the points, which will be the line of best fit. The line of best fit is the straight trend line that runs through and represents the majority of the data points on your graph. This line can pass through some of the points, none of the points, or all of the points depending on where your points fall on your graph.
Using regression analysis software, (shown in my MegaStat examples later on), you will be given an estimated equation for the line of best fit that is drawn through the center of the majority of the points that were plotted. This equation is the estimated demand equation for your variables. The y-intercept for your equation can be found in MegaStat's output by looking at where it says the Coefficient Intercept. The slope for each of your variables can then be found by looking at the variable and the coefficient associated with it in the MegaStat output. You then need to evaluate your equation by looking at the standard errors of the estimates, the t-statistics, as well as the coefficient of determination to see its goodness of fit.
Standard error is the standard deviation of the sampling distribution of a statistic. It is a statistical term that measures how accurately a sample represents a population. You want the standard error of your model to be as low as possible.
A test statistic (referred to as a t-stat) is a number that helps determine whether a hypothesis will be accepted or rejected. If the test statistic is too far off of the original hypothesis, it will be rejected. Conversely, when a test statistic is close to the original hypothesis, it will likely be accepted. Your explanatory variable is deemed reasonable if the t-statistic is close to zero.
R-squared is another name for the coefficient of determination. It is a statistical method that explains how much variability there is in the relationship that your dependent variable can be explained by your explanatory variable. It is shown as a value between 0 and 1. The higher your value of R-squared is, the better the fit of the model. Coefficient of determination is symbolized by “r2” because it is square of the coefficient of correlation, symbolized by “r”. Coefficient of correlation (r) is a statistical measure of the correlation (or linear relationship) between a dependent variable and your explanatory variable. Represented by the lowercase letter “r”, its value varies between -1 and 1. Having an “r” of 1 means there is perfect correlation, a 0 means there no correlation, all positive values means the relationship is positive (when one variable increases, so does the other), and all negative values mean the relationship is negative (when one increases, the other decreases).
The coefficient of determination (R-squared) determines the degree of linear correlation of your variables, or the goodness of fit. Higher values of R-squared indicate that the model is a good fit to describe the data, meaning that the demand curve comes very close to the majority of your points. Another of way of explaining goodness of fit is that if the model is a good fit to describe your data, the total variance of your Y variables explained by your demand equation will be small. The larger the R-squared figure is, the poorer the fit, meaning your demand equation does not portray a relationship between your variables.
You also want to look at the F-statistic in your model’s regression output. In general, an F-statistic is a value from a statistical test that determines if the variances between the means of two populations are significantly different. In regression output, it compares the variation explained by the model with the variation left unexplained, and so tests whether the explanatory variables, taken together, help explain the dependent variable. The F-statistic is usually not interpreted on its own, but is used to determine the model’s P-value. The P-value is the probability of seeing a relationship at least this strong purely by chance if the explanatory variables actually had no effect. Generally, you want the P-value of your model to be less than 5%. A P-value of less than 5% shows that your model is statistically significant.
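The F-statistic reported in regression output can be sketched as explained variation per explanatory variable divided by unexplained variation per degree of freedom. The Python below assumes made-up data and a made-up fitted line; it does not compute the P-value, which regression software such as Excel reports for you.

```python
def f_statistic(y, fitted, k):
    """F-statistic for a regression with k explanatory variables."""
    n = len(y)
    mean_y = sum(y) / n
    sst = sum((yi - mean_y) ** 2 for yi in y)                # total variation
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))   # unexplained
    ssr = sst - sse                                          # explained
    return (ssr / k) / (sse / (n - k - 1))

# Hypothetical quantities and the values predicted by the
# hypothetical line Q = 11.8 - 1.8 * P at prices 1 through 5
quantity = [10, 8, 7, 4, 3]
fitted = [10.0, 8.2, 6.4, 4.6, 2.8]

f_stat = f_statistic(quantity, fitted, k=1)
print(round(f_stat, 1))
```

With one explanatory variable, the F-statistic equals the square of the slope's t-statistic, so the two tests give the same answer in a bivariate regression.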
Adjusted R-square is a modification of R-square that adjusts for the number of terms in a model. R-square never decreases when a new term is added to a model, but adjusted R-square increases only if the new term improves the model more than would be expected by chance. Adjusted R-square penalizes the model for using useless or non-explanatory predictors. You want your adjusted R-square value to be close to the value you got for R-square. This suggests that your model does not include explanatory variables that add little or nothing to the explanation.
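The adjustment uses the standard formula 1 - (1 - R²)(n - 1)/(n - k - 1), where n is the number of observations and k is the number of explanatory variables. A minimal Python sketch, with invented R-square values chosen only to show the penalty at work:

```python
def adjusted_r_squared(r_squared, n, k):
    """Adjust R-squared for n observations and k explanatory variables."""
    return 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

# Hypothetical model with 20 observations and 3 explanatory variables
before = adjusted_r_squared(0.900, n=20, k=3)

# Adding a 4th variable that barely raises R-squared (0.900 -> 0.901)
after = adjusted_r_squared(0.901, n=20, k=4)

print(round(before, 4), round(after, 4))
```

In this example R-square rises slightly when the fourth variable is added, but adjusted R-square falls, signaling that the new variable is not pulling its weight.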
If your numbers are all good (meaning your standard error is low, your t-statistics are far enough from zero to show that your coefficients are statistically significant, your coefficient of determination shows a high correlation between your dependent variable and your explanatory variable, your P-value is less than 5%, and your adjusted R-square is close to the value of your R-squared), then your estimated demand equation is a good representation of your data. Once you have shown that your estimated demand equation is a good representation, you can then insert forecasted values of your explanatory variables into the equation to predict future values of your dependent variable.
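Plugging forecasted values into an estimated equation is simple arithmetic. The sketch below assumes a hypothetical estimated demand equation, Q = 120 - 4(price) + 0.5(income); the coefficients and variable names are invented for illustration and do not come from any regression output on this page.

```python
def forecast(intercept, coefficients, values):
    """Predict the dependent variable from forecasted explanatory values."""
    return intercept + sum(coefficients[name] * values[name]
                           for name in coefficients)

# Hypothetical estimated demand equation: Q = 120 - 4*price + 0.5*income
intercept = 120.0
coefficients = {"price": -4.0, "income": 0.5}

# Forecasted values of the explanatory variables for next period
next_period = {"price": 10.0, "income": 60.0}

predicted_quantity = forecast(intercept, coefficients, next_period)
print(predicted_quantity)
```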
The following is an example of regression analysis that I created. It uses bivariate regression (which uses one explanatory variable and one dependent variable). The regression output is found using the MegaStat add-in in Microsoft Excel. This example analyzes the relationship between the number of copy machines serviced by a worker (the dependent variable) and the number of minutes required to service the machines (the independent variable). This example looks at the demand for copy machine maintenance in terms of the explanatory variable, the number of minutes required to service the machines.
The following is an additional example of regression analysis that I created. It uses multiple regression (which here analyzes three explanatory variables and one dependent variable). The regression output is found using the MegaStat add-in in Microsoft Excel. This example analyzes the relationship between the monthly labor hours needed at a hospital (the dependent variable) and the number of monthly x-ray exposures, monthly occupied bed days, and average length of patients’ stay in days (the explanatory variables). This example looks at the demand for labor at a hospital in terms of how many x-ray exposures are taken monthly, how many days beds are occupied during the month, and the average length of patients' stay in days.
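For readers without MegaStat, a multiple regression with three explanatory variables can be sketched in plain Python by solving the least-squares normal equations. This is only an illustration: the data are synthetic (generated from a known equation so the recovered coefficients can be checked), they are not the hospital data from my spreadsheet, and the function names are my own placeholders.

```python
def solve_linear_system(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x

def fit_multiple_regression(rows, y):
    """Return [intercept, b1, b2, ...] minimizing squared deviations."""
    design = [[1.0] + list(row) for row in rows]  # add intercept column
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)]
           for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)

# Synthetic observations of three explanatory variables
rows = [(1, 2, 5), (2, 1, 3), (3, 4, 8), (4, 3, 1), (5, 6, 9), (6, 5, 2)]
# Dependent variable generated from y = 2 + 3*x1 - 1*x2 + 0.5*x3
y = [2 + 3 * x1 - x2 + 0.5 * x3 for x1, x2, x3 in rows]

coefficients = fit_multiple_regression(rows, y)
print([round(c, 4) for c in coefficients])
```

Because the synthetic data lie exactly on the generating equation, the fit recovers its intercept and slopes; real data would leave residuals, and the t-statistics, R-squared, and F-statistic would then be needed to judge the fit.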
Sample Questions:
1) True or False. Your model depicts a good fit if your R-squared, or coefficient of determination, is closer to 1.0.
2) Regression analysis looks at the relationship between:
A. Dependent variables and Independent variables
B. Independent variables and Explanatory variables
C. Dependent variables and Explanatory variables
D. None of the above
3) True or False. R-square increases only if the new term improves the model more than would be expected by chance.
4) True or False. An F-statistic determines if the variances between the means of two samples are significantly different.
5) Regression analysis:
A. answers the question "What is the best predictor of _?".
B. answers the question "What is the best indicator of supply?"
C. is not often used outside of a statistics course.
D. does not help you estimate demand.
Answers to Sample Questions:
1) True
2) C. Dependent variables and Explanatory variables
3) False – should say Adjusted R-squared to be true
4) False – should say between two populations to be true
5) A. answers the question “What is the best predictor of _?”
Bibliography:
(2007). Introduction to regression. Retrieved from http://dss.princeton.edu/online_help/analysis/regression_intro.htm
Armstrong, J. S. & Green, K. (2011, August 17). Demand forecasting: Evidence-based methods. Oxford Handbook in Managerial Economics, 1-23.
Baye, M. R. (2006). Managerial Economics and Business Strategy. New York: McGraw-Hill Irwin.
Blair, B., Millea, M. & Hammer, J. (2004, October). The impact of cooperative education on academic performance and compensation of engineering majors. Journal of Engineering Education, 333-338.
Cottrell, A. Regression analysis: Basic concepts. Retrieved from http://www.wfu.edu/~cottrell/ecn215/regress.pdf
Doane, D. P., & Seward, L. E. (2011). Applied Statistics in Business and Economics (3rd edition). Boston, MA: McGraw-Hill, Irwin.
Greenlief, M. (2010). Using Excel for regression analysis. Retrieved from http://www.chem.missouri.edu/Greenlief/courses/234W04/Excel%20Handout.pdf
Hennekens, C. H. (1987). Epidemiology in medicine. Retrieved from http://www.tulane.edu/~panda2/Analysis2/two-way/fstat_and_significance.htm
Investopedia. (2011). Standard error. Retrieved from http://www.investopedia.com/terms/s/standard-error.asp#axzz1b43n8opO
Mt. Rushmore Securities. R-squared. Retrieved from http://www.hedgefund-index.com/d_rsquared.asp
Sykes, A. An introduction to regression analysis. Retrieved from http://www.law.uchicago.edu/files/files/20.Sykes_.Regression.pdf
Tulane University. F-statistic and significance. Retrieved from http://www.tulane.edu/~panda2/Analysis2/two-way/fstat_and_significance.htm
WebFinance. (2011). Retrieved from http://www.businessdictionary.com/
-------------------------------------------------------------------------------------------------END OF FINAL REPORT-------------------------------------------------------------------------------------------------------------
WikiSpaces Summaries:
#1) Supply and Demand (Also, include private goods vs. public goods, marginal analysis, producer and consumer surplus and how it applies to perfect competition) by Thaddeus Bogardus
This paper explained many of the basic concepts in a discussion of supply and demand through brief, but thorough definitions of those concepts. The author then goes on to describe what causes a demand curve to shift (including a description of various types of goods and their relationships with demand) as well as what causes a supply curve to shift (including different economic factors and forces that influence supply). He then goes into specific detail concerning supply and demand by bringing in four different real world applications. His definitions, descriptions, and explanations make for a very good and easy-to-understand basic essay on the concepts of supply and demand. A reader with little experience with the topic could fairly easily understand it through reading his essay.
#2) Comparative advantage and trade (How do you determine your comparative advantage?) by Regina Gauger
Regina’s paper begins by discussing that comparative advantage is when two parties both gain from trade if when in the absence of trade they have different relative costs for producing the same goods. She then goes into various examples of comparative advantage in different industries including the PC industry, the apparel industry, and the drug trafficking industry. She then explains how you know if you have a comparative advantage by using the same examples as before to describe the specific things that prove a comparative advantage is present. She then explains an opposing view which holds that a comparative advantage is not always present when two parties both gain from trade as she had previously explained.
#3) Profit – opportunity costs including implicit costs and explicit costs, the difference between economic and accounting costs, economic profit economic losses, and zero economic profit by Matthew Gray
Matthew begins his paper by giving brief definitions of different concepts of profits/losses and costs. He shows graphs depicting these concepts, but does not really provide a written description of the graphs. Although I know what they mean, I am not sure someone not having taken an economics course such as this would understand them. He links a couple articles that discuss economic loss as well as a written explanation of various economic and opportunity costs. Although his concepts make sense, his paper seemed somewhat scattered.
#4) Monopoly by Amy Kitzman
Amy begins her essay by defining a monopoly as a single firm that serves an entire market, where there are no close substitutes for their products. She then goes into four different sources that help create monopoly power, including economies of scale, economies of scope, cost complementary, and patents and other legal barriers. She discusses what would show on a graph depicting the market of a monopoly as well as its cost and revenue curves. She goes on to explain that the Sherman Antitrust Act tries to prevent the creation of monopolies as well as a couple examples of monopolies in the world.
#5) Market Failure – Antitrust, positive and negative externalities, public goods, rent seeking, tariffs Government failure by Mark Wilk
Mark begins his essay by discussing how a monopoly causes a market in which the firm produces less than the socially efficient level of output. He discusses how Google has been under watch for using monopolistic strategies to get consumers to use their services. He then moves on to a discussion of negative externalities, specifically looking at pollution when it comes to the energy industry. He then discusses how public goods can cause markets to fail as well as how rent seeking can negatively impact a market.
-------------------------------------------------------------------------------------------------END OF SUMMARIES------------------------------------------------------------------------------------------------------------------
WEEKLY BLOG POSTS:
Week of October 3rd-7th, 2011:
Regression analysis is defined as a "statistical technique used to find relationships between variables for the purpose of predicting future values".
WebFinance. (2011). Regression analysis. Retrieved from http://www.investorwords.com/4136/regression_analysis.html
Regression analysis is used to determine the best fit line correlated to a series of data points that are graphing the relationship between a dependent variable and an explanatory variable.
A test statistic (AKA: t-stat) is defined as a "sample used to determine whether a hypothesis will be accepted or rejected. If the test statistic is too far off of the original hypothesis, it will be rejected. Conversely, when a test statistic is close to the original hypothesis, it will likely be accepted."
WebFinance. (2011). Test statistic. Retrieved from http://www.businessdictionary.com/definition/test-statistic.html
Coefficient of correlation (r) is a "statistical measure of the linear relationship (correlation) between a dependent-variable and an independent variable. Represented by the lowercase letter 'r', its value varied between -1 and 1: 1 means perfect correlation, 0 means no correlation, positive values means the relationship is positive (when one goes up so does the other), negative values mean the relationship is negative (when one goes up the other goes down)."
WebFinance. (2011). Coefficient of correlation (r). Retrieved from http://www.businessdictionary.com/definition/coefficient-of-correlation-r.html
R-squared is another name for the coefficient of determination. It is defined as "a statistical method that explains how much of the variability of a factor can be caused or explained by its relationship to another factor. It is computed as a value between 0 (0 percent) and 1 (100 percent). The higher the value, the better the fit. Coefficient of determination is symbolized by r^2 because it is square of the coefficient of correlation, symbolized by r. The coefficient of determination is an important tool determining the degree of linear-correlation of variables ('goodness of fit') in regression analysis."
WebFinance. (2011). Coefficient of determination (r2). Retrieved from http://www.businessdictionary.com/definition/coefficient-of-determination-r2.html
An F-statistic is "a value resulting from a standard statistical test used in ANOVA and regression analysis to determine if the variances between the means of two populations are significantly different."
Hennekens, C. H. (1987). Epidemiology in medicine. Retrieved from http://www.tulane.edu/~panda2/Analysis2/two-way/fstat_and_significance.htm
Adjusted R-square is a "modification of R-square that adjusts for the number of terms in a model. R-square always increases when a new term is added to a model, but adjusted R-square increases only if the new term improves the model more than would be expected by chance."
Mt. Rushmore Securities. R-squared. Retrieved from http://www.hedgefund-index.com/d_rsquared.asp
Week of October 10th-14th, 2011:
Econometrics is defined as "the application of statistical and mathematical methods in the field of economics to describe the numerical relationships between key economic forces such as capital, interest rates, and labor".
WebFinance. (2011). Econometrics. Retrieved from http://www.investorwords.com/1638/econometrics.html
Econometrics is a much more complicated approach to statistical analysis, but it is where regression analysis is derived from. To find a relationship between variables (one dependent, one explanatory), data is collected, and the gathered information is plotted on a graph. The points will not fall on a straight line, so to find a smooth curve, an econometrician must choose one that demonstrates a good relationship between the points plotted. Econometricians use regression software to find values that minimize the sum of the squared deviations between the actual points graphed and the regression line drawn (the expected relation).
Regression software is readily available to the everyday businessman. Microsoft Excel has regression capabilities through its data analysis feature. Also, a common, and simple-to-use regression tool is the regression feature in MegaStat.
Standard error is defined as "the standard deviation of the sampling distribution of a statistic. It is a statistical term that measures the accuracy with which a sample represents a population. In statistics, a sample mean deviates from the actual mean of a population; this deviation is the standard error".
Investopedia. (2011). Standard error. Retrieved from http://www.investopedia.com/terms/s/standard-error.asp#axzz1b43n8opO
Week of October 17th-21st, 2011:
The following is an example of regression analysis. It uses bivariate regression (one explanatory variable vs. one dependent variable).
The regression output is found using the MegaStat add-in in Microsoft Excel.
WikiSpaces Regression Example.xlsx
Week of October 24th-28th, 2011:
The following is an additional example of regression analysis. It uses multiple regression (three explanatory variables vs. one dependent variable).
The regression output is found using the MegaStat add-in in Microsoft Excel.
WikiSpaces Regression Example #2.xlsx
Week of October 31st-November 4th, 2011:
Line of Best Fit: The line of best fit is the straight trend line that runs through and represents the majority of the data points on your graph. This line can pass through some of the points, none of the points, or all of the points depending on where your points fall on your graph.
Goodness of Fit: The value R-squared quantifies the goodness of fit in the model. It is given as a number between 0.0 and 1.0. Higher values of R-squared indicate that the model is a good fit to describe the data, meaning that the demand curve comes very close to the majority of your points. Another way of explaining goodness of fit is that if the model is a good fit to describe your data, the part of the total variance of your Y values left unexplained by your demand equation will be small. The smaller the R-squared figure is, the poorer the fit, meaning your demand equation does not capture much of the relationship between your variables.
Week of November 7th-11th, 2011:
You can estimate demand by using regression analysis. To do this with simple linear regression, you first plot quantity and price values on a scatter plot to determine if a linear relationship exists between the variables. The line of best fit is the line drawn through the center of the majority of the points. Regression analysis software, such as MegaStat in my examples, gives you an estimated equation for this line of best fit. This equation is the estimated demand equation for your variables. The y-intercept for your equation can be found in MegaStat's output in the Coefficients column, on the row labeled Intercept. The slope for each of your variables can then be found by looking at the variable and the coefficient associated with it in the MegaStat output. Once you have your estimated demand equation, you then want to evaluate its goodness of fit.
You need to evaluate your equation by looking at the standard errors of the estimates, the t-statistics of the estimates of the coefficients, as well as the coefficient of determination. If your numbers are all good (meaning your standard error is low, your t-statistics are far enough from zero to show that your coefficients are statistically significant, and your coefficient of determination shows a high correlation between your dependent variable and your explanatory variable), then your estimated demand equation is a good representation of your data.
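The same estimate that regression software produces for a single explanatory variable can be sketched by hand in Python. The price and quantity figures below are invented for illustration, and the function name is a placeholder; the intercept and slope correspond to the Intercept and variable coefficients you would read from the software output.

```python
def estimate_demand(prices, quantities):
    """Least-squares line Q = intercept + slope * P through the points."""
    n = len(prices)
    mean_p = sum(prices) / n
    mean_q = sum(quantities) / n
    sxx = sum((p - mean_p) ** 2 for p in prices)
    sxy = sum((p - mean_p) * (q - mean_q)
              for p, q in zip(prices, quantities))
    slope = sxy / sxx
    intercept = mean_q - slope * mean_p
    return intercept, slope

# Hypothetical observations: price charged and units sold
prices = [2, 4, 6, 8, 10]
quantities = [90, 82, 68, 61, 49]

intercept, slope = estimate_demand(prices, quantities)
print(f"Estimated demand: Q = {intercept:.2f} + ({slope:.2f}) * P")

# Use the estimated equation to predict quantity at a new price
predicted = intercept + slope * 7
print(round(predicted, 2))
```

The negative slope reflects the usual downward-sloping demand curve: in this made-up data, each one-unit increase in price is associated with about five fewer units sold.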
Week of November 14th-18th, 2011:
Regression analysis is used regularly in the business world to determine the relationship between a given dependent variable and various explanatory variables. It can be used to determine equitable compensation for workers by using the amount of responsibility, the number of people they supervise, and more to evaluate what contributes to the value of a person's job and what compensation they should be paid based on their duties. Regression analysis simply answers the general question "what is the best predictor of _?". Researchers in the education field may want to determine what is the best predictor of high school success in students, of teacher effectiveness, and more. Sales employees may use regression analysis to see the impact of a given type of advertising on the number of sales made because of that advertisement. This type of analysis predicts the outcome of a given indicator (the dependent variable), based on the interactions between it and other related drivers (or explanatory variables).
Week of November 21st-25th, 2011:
Questions:
1) True or False. Your model depicts a good fit if your R-squared, or coefficient of determination, is closer to 1.0.
Answer: True
2) Regression analysis looks at the relationship between:
A. Dependent variables and Independent variables
B. Independent variables and Explanatory variables
C. Dependent variables and Explanatory variables
D. None of the Above
Answer: C. Dependent variables and Explanatory variables
3) True or False. R-square increases only if the new term improves the model more than would be expected by chance.
Answer: False, Adjusted R-squared does this.
4) True or False. An F-statistic determines if the variances between the means of two samples are significantly different.
Answer: False. between two populations
5) Regression analysis:
A. answers the question "What is the best predictor of _?".
B. answers the question "What is the best indicator of supply?"
C. is not often used outside of a statistics course.
D. does not help you estimate demand.
Answer: A. answers the question "What is the best predictor of _?".
Week of November 28th-December 2nd, 2011:
How to use Excel's regression analysis tool: