Estimating a demand function for Fruit and Vegetables
In this project I will examine the quarterly data set for FTVG20 from Ruritania between 1981 and 2010. I will find the functional form which best fits the data and then test for insignificant variables, structural breaks, seasonality and homogeneity. I will use Slutsky’s equation to calculate the income and substitution effects and then interpret the model. The social, economic and geographic characteristics of Ruritania are not known.
The data set shows that the quantity demanded of fruit and vegetables (QFTVG20) is dependent on the following variables:
PMTFH: Price of meat and fish
PFTVG: Price of fruit and vegetables
PTEA: Price of tea
PCOFF: Price of coffee
PBEER: Price of beer
PWINE: Price of wine
PLEIS: Price of leisure
PTRAV: Price of travel
PALLOTH: Price of all other goods
Ruel, Minot and Smith used household expenditure surveys from 10 Sub-Saharan African countries and a Working-Leser functional form to find that the main determinants of demand are per capita expenditure, household size, whether the household is headed by a female, education and location (urban vs rural).
A study by Seale considered the effect of price and income on the demand for different food categories. They found that the food budget share of fruit and vegetable consumption is 10-25%, which is much higher than that of Ruritania. They calculated the expenditure elasticity of fruit and vegetables for low income countries (LICs) to be 0.636, middle-income countries (MICs) 0.514 and high-income countries (HICs) 0.281. The Frisch own-price elasticity of demand was -0.514 in LICs, -0.416 in MICs and -0.227 in HICs.
There have been several studies considering non-economic factors that contribute to the demand for fruit and vegetables. A study by Nayga found that demand depends on socio-demographic factors such as location, age, family structure, ethnicity, children and education, whilst Pollard, Kirk and Cade find social desirability, habits, sensory appeal, convenience and advertising to be explanatory variables. Block’s research in Indonesia finds that mothers with nutritional knowledge spend a greater proportion of their food budget on foods rich in nutrients and minerals, such as fruit and vegetables.
Studenmund says that ‘choice of a functional form is a vital part of the specification of that equation.’ He goes on to mention that the use of Ordinary Least Squares means that the equation should be linear in the parameters rather than variables. In determining a demand function for fruit and vegetables I will consider the following functional forms:
Linear: QFTVG20 = b0 + b1PMTFH + b2PFTVG + b3PTEA + b4PCOFF + b5PBEER + b6PWINE + b7PLEIS + b8PTRAV + b9PALLOTH + b10INCOME + et
Log-Log: ln(QFTVG20) = b0 + b1ln(PMTFH) + b2ln(PFTVG) + b3ln(PTEA) + b4ln(PCOFF) + b5ln(PBEER) + b6ln(PWINE) + b7ln(PLEIS) + b8ln(PTRAV) + b9ln(PALLOTH) + b10ln(INCOME) + et
Log-Linear: ln(QFTVG20) = b0 + b1PMTFH + b2PFTVG + b3PTEA + b4PCOFF + b5PBEER + b6PWINE + b7PLEIS + b8PTRAV + b9PALLOTH + b10INCOME + et
Linear-Log: QFTVG20 = b0 + b1ln(PMTFH) + b2ln(PFTVG) + b3ln(PTEA) + b4ln(PCOFF) + b5ln(PBEER) + b6ln(PWINE) + b7ln(PLEIS) + b8ln(PTRAV) + b9ln(PALLOTH) + b10ln(INCOME) + et
In determining which functional form is preferable and which variables are significant, I will use the statistical tests detailed below:
Test | It tests for… | Null hypothesis (H0) | Alternative hypothesis (HA)
F test | Significance of the overall regression, individual and joint parameters | Test statistic < critical value: model is insignificant | Test statistic > critical value: model is significant
R2 | Proportion of variation in the sample data explained by the regression | n/a | n/a
Ramsey RESET (RR) | Misspecification of the model and omitted variables | Test statistic < critical value: model is adequate and there is no misspecification | Test statistic > critical value: model is inadequate and can be improved
Jarque-Bera (JB) | Normality of the error term | Test statistic < critical value: the error term is normally distributed | Test statistic > critical value: the error term is not normally distributed
White’s (WT) | Heteroscedasticity | Test statistic < critical value: there is homoscedasticity | Test statistic > critical value: there is heteroscedasticity
Breusch-Godfrey (BG) | Higher order autocorrelation | Test statistic < critical value: there is no autocorrelation | Test statistic > critical value: there is autocorrelation
Durbin-Watson (DW) | First order autocorrelation | Test statistic > upper bound: there is no autocorrelation | Test statistic < lower bound: there is autocorrelation. Additionally, when lower bound < test statistic < upper bound, the test for autocorrelation is inconclusive
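The Durbin-Watson rule in the last row of the table can be sketched in a few lines of Python. This is only an illustration: the function names and the residuals are hypothetical (they are not the Ruritanian data), and the rule covers only the case described in the table, so it does not handle the upper region of the statistic that would indicate negative autocorrelation.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: the sum of squared first differences of the
    residuals divided by the sum of squared residuals."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den


def dw_decision(dw, lower, upper):
    """Apply the decision rule from the table above."""
    if dw > upper:
        return "no autocorrelation"
    if dw < lower:
        return "autocorrelation"
    return "inconclusive"


# Hypothetical residuals, invented purely for illustration.
example_residuals = [1, -1, -1, 1, 1, -1, -1, 1]
dw_example = durbin_watson(example_residuals)

# Bounds used in this project: lower 1.462, upper 1.898.
print(dw_example, dw_decision(dw_example, 1.462, 1.898))
```

A statistic close to 2 is what the rule treats as evidence of no first order autocorrelation.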
Changes In Demand
Roberta Cook’s research has shown that per capita fruit and vegetables consumption (pounds) in the United States increased by 12.4% between 1976 and 2006. Interestingly, in the same period there was a 28% reduction in the amount of citrus fruits consumed, but growth was boosted by non-citrus fruits and vegetables. Cook suggests that the increase in demand is due to changes in lifestyle such as the large increase in the number of two-income households. This has led to a focus on cooking quickly and therefore on using more fresh produce.
The scatter plot below shows the change in quantity demanded for fruit and vegetables in Ruritania over the time period 1981 to 2010. Quantity demanded was constant between 1981 and 1991 before increasing exponentially. This pattern does not follow the results of Cook’s research, but the exponential growth leads me to predict that the data will fit either a log-log or a log-linear model.
Choosing the Functional Form
From considering the four functional forms I obtained the following test results which are in line with my predictions:
Statistical Test | Critical value at 5% significance level | Linear | Log-Log | Log-Linear | Linear-Log
Ramsey RESET (RR) | 3.92 | 31.982
Durbin-Watson (DW) | DU = 1.898, DL = 1.462 | 1.32
Although the linear model and the linear-log model pass the F-test, only 85% and 83% of the variation in the data is explained by the respective regression models. Both models also fail the Breusch-Godfrey test, Durbin-Watson test, White’s Test, Jarque-Bera test and the Ramsey RESET test. From these results I can conclude that the demand function for fruit and vegetables does not take a linear or linear-log form.
The log-log functional form and the log-linear functional form both explain around 93.5% of the variation in the data, which is relatively high. They both pass the T-test, Durbin-Watson test, White’s Test, Jarque-Bera test, Breusch-Godfrey test and the Ramsey RESET test at 5%. Although they both pass the same tests, the log-log form has a Ramsey RESET test statistic of 0.0028123 whilst the log-linear form has one of 1.7429. Since the log-log statistic lies further below the critical value, there is less evidence of misspecification in that model. Additionally, a log-log model allows easier interpretation, as each elasticity is constant and equal to the corresponding b at every point. I will therefore choose the log-log functional form as the demand function for fruit and vegetables. For analysis, if an independent variable changes by 1% whilst the other independent variables are held constant, then the dependent variable will change by approximately b%, where b is the coefficient on that independent variable.
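To illustrate why the slope of a log-log regression can be read as an elasticity, the following minimal Python sketch fits a one-regressor model by ordinary least squares. The data are hypothetical, generated from Q = 50 * P^(-0.6) rather than taken from the Ruritanian data set, and the helper name ols_slope_intercept is an assumption made for this example.

```python
import math


def ols_slope_intercept(x, y):
    """Simple one-regressor OLS: returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical data generated from Q = 50 * P^(-0.6); not the project data.
prices = [1.0, 1.5, 2.0, 2.5, 3.0, 4.0]
quantities = [50.0 * p ** -0.6 for p in prices]

# Regress ln(Q) on ln(P): the slope is the constant price elasticity.
b0, b1 = ols_slope_intercept([math.log(p) for p in prices],
                             [math.log(q) for q in quantities])
print(f"intercept = {b0:.4f}, elasticity = {b1:.4f}")
```

Because the synthetic data contain no error term, the regression recovers the exponent used to generate them; with real data the slope is an estimate of the elasticity.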
Testing individual parameters
Having identified the preferred functional form, I will now test the significance of individual parameters at a 5% significance level.
Calculated using a 2-tailed T-test
H0: b0 = 0
H1: b0 ≠ 0
Test statistic: t = (estimate of b0 – hypothesised value of b0)/(standard error of the estimate) ~ T(N – 2), where N = 120, so T(118)
If –tc ≤ t ≤ tc, fail to reject the null hypothesis: b0 is not significant
If t > tc or t < –tc, reject the null hypothesis: b0 is significant
The critical values for the t-test are +/- 1.98.
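The two-tailed decision rule above can be sketched as follows. The function names, coefficients and standard errors are hypothetical and chosen only to show one significant and one insignificant case; the critical value of 1.98 is the one quoted in the text.

```python
def t_statistic(estimate, std_error, hypothesised=0.0):
    """t = (estimate - hypothesised value) / standard error."""
    return (estimate - hypothesised) / std_error


def is_significant(t, critical=1.98):
    """Two-tailed test: reject H0 when |t| exceeds the critical value."""
    return abs(t) > critical


# Hypothetical estimates and standard errors, for illustration only.
t_large = t_statistic(-0.63, 0.21)
t_small = t_statistic(0.15, 0.10)

print(round(t_large, 2), is_significant(t_large))
print(round(t_small, 2), is_significant(t_small))
```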
From the t-test I have found that only four of the parameters are significant at a 5% significance level. They are: price of fruit and vegetables, price of tea, price of all other goods and the level of income. Since the test statistics for the price of meat and fish, the intercept and the price of travel are close to the critical value, I will keep these in the model. I will now run a second regression excluding the variables price of coffee, price of beer, price of wine and price of leisure, and will use further t-tests to determine which of the parameters are significant. The results are shown in the table below.
Whilst the intercept is still insignificant, I will continue to include it in the model as removing it can create bias in the regression. The price of meat and fish and the price of travel are still insignificant in this regression so I will remove them from the model.
The restricted regression model has the functional form:
ln(QFTVG20) = b0 + b2ln(PFTVG) + b3ln(PTEA) + b9ln(PALLOTH) + b10ln(INCOME) + et
To ensure the removal of the six parameters improves the model, I will run an F-test on the restricted model:
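F = [(SSRR – SSUR)/r] / [SSUR/(n – k)] ~ F(r, n – k)
Where SSRR = residual sum of squares of the restricted model, SSUR = residual sum of squares of the unrestricted model, r = number of restrictions in the model, n = number of observations, k = number of parameters in the unrestricted model (including the intercept)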
The null hypothesis is:
H0: b1 = b4 = b5 = b6 = b7 = b8 = 0
HA: Null hypothesis is untrue
At 5% significance level, critical value F(6,109) = 2.18
F = [(16.7224302 – 16.0433624)/6] / [16.0433624/(120 – 11)]
F = 0.7689409526 < 2.18
Since the test statistic is less than the critical value, I fail to reject the null hypothesis, so the variables are collectively insignificant and can now be removed.
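The joint F-test above can be checked with a short Python sketch. The function name restricted_f is an assumption made for this example; the residual sums of squares, the number of restrictions and the critical value are the figures reported in the text, with k = 11 parameters in the unrestricted model.

```python
def restricted_f(ssr_restricted, ssr_unrestricted, r, n, k):
    """F statistic for r linear restrictions:
    [(SSRR - SSUR)/r] / [SSUR/(n - k)]."""
    numerator = (ssr_restricted - ssr_unrestricted) / r
    denominator = ssr_unrestricted / (n - k)
    return numerator / denominator


# Figures reported in the text for the six excluded price variables.
f_joint = restricted_f(16.7224302, 16.0433624, r=6, n=120, k=11)

critical_value = 2.18  # F(6,109) at the 5% level, as quoted in the text
print(round(f_joint, 4), "reject H0" if f_joint > critical_value
      else "fail to reject H0")
```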
I will consider whether there are structural breaks and seasonal changes.
I have chosen to graph QFTVG20 over time rather than lnQFTVG20 as there is a marked increase in fruit and vegetables consumption after 1998 which does not appear on the graph for lnQFTVG20. This increase in consumption may be due to a structural change. I will therefore split the regression model into two, and carry out a Chow Test, where:
H0: no structural change
HA: structural change
n1 = number of observations in the first regression
n2 = number of observations in the second regression
k = number of parameters including the constant
SSRR = RSS from original model
SSUR = RSS from regression 1 + RSS from regression 2
Time Period | Number of observations | Residual sum of squares
1981 – 1998 | 72 | 10.7905333
1999 – 2010 | 48 | 5.82816287
1981 – 2010 | 120 | 16.7224302
F = [(16.7224302 – 10.7905333 – 5.82816287)/5] / [(10.7905333 + 5.82816287)/(72 + 48 – 2×5)] = 0.1373241701
At a 5% significance level, the critical value is F(5,110) = 2.29
Since 0.137 < 2.29 I fail to reject the null hypothesis and can conclude that there is no structural change when tested at the 5% significance level.
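As a check on the Chow test arithmetic, the calculation can be sketched in Python. The function name chow_f is an assumption made for this example; the residual sums of squares, the sample sizes and k = 5 are the values given in the text.

```python
def chow_f(rss_pooled, rss_1, rss_2, n1, n2, k):
    """Chow test statistic:
    [(RSS_pooled - (RSS_1 + RSS_2))/k] / [(RSS_1 + RSS_2)/(n1 + n2 - 2k)]."""
    rss_split = rss_1 + rss_2
    numerator = (rss_pooled - rss_split) / k
    denominator = rss_split / (n1 + n2 - 2 * k)
    return numerator / denominator


# Values from the table: 1981-2010 pooled, 1981-1998 and 1999-2010.
f_chow = chow_f(16.7224302, 10.7905333, 5.82816287, n1=72, n2=48, k=5)

critical_value = 2.29  # F(5,110) at the 5% level, as quoted in the text
print(round(f_chow, 4), "structural change" if f_chow > critical_value
      else "no evidence of structural change")
```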
Seasonal Dummy Variables
Since fruit and vegetables grow on a seasonal basis, it is prudent to include seasonal dummy variables to see whether the data exhibits seasonality. To do this, I will create four dummy variables, one for each quarter, but include only three of them in the regression so as to avoid falling into the dummy variable trap, that is, perfect multicollinearity between the four dummies and the intercept. The coefficient on each included dummy measures the difference between that quarter and the omitted (reference) quarter.
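A minimal Python sketch of how three seasonal dummies might be constructed for quarterly data is given below. It assumes that the fourth quarter is the omitted reference category and that the 120 observations run from the first quarter of 1981 to the fourth quarter of 2010; neither assumption is stated in the data description, and the function name is hypothetical.

```python
def seasonal_dummies(quarter):
    """Return (D1, D2, D3) for a quarter numbered 1 to 4.
    Quarter 4 is the assumed reference category, so it maps to (0, 0, 0)."""
    if quarter not in (1, 2, 3, 4):
        raise ValueError("quarter must be 1, 2, 3 or 4")
    return tuple(1 if quarter == q else 0 for q in (1, 2, 3))


# 120 quarterly observations, assumed to start in quarter 1.
quarters = [(i % 4) + 1 for i in range(120)]
dummies = [seasonal_dummies(q) for q in quarters]

print(dummies[:4])
```

Including a fourth dummy would make the dummies sum to one in every row, which duplicates the intercept column and produces the perfect multicollinearity described above.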
With the inclusion of three dummy variables, the model becomes:
ln(QFTVG20) = b0 + b2ln(PFTVG) + b3ln(PTEA) + b9ln(PALLOTH) + b10ln(INCOME) + baD1 + bbD2 + bcD3 + et
Quarter | Parameter | Coefficient | Estimated standard error | Test statistic | Significant at 5% (critical value +/- 1.98)
This shows that the dummy variables are individually insignificant at the 5% significance level. Before removing the dummy variables, I run an F-test to check their combined significance.
H0: ba = bb = bc = 0
HA: H0 is not true
F = [(SSRR – SSUR)/r] / [SSUR/(n – k)] ~ F(r, n – k)
F = [(16.7224302 – 16.3332741)/3] / [16.3332741/(120 – 8)] = 0.8890750414
At 5% significance level, the critical value for F(3,112) is 2.68. Since 0.889 < 2.68 I fail to reject the null hypothesis. From this, it can be seen that at the 5% significance level, there is no evidence of seasonality. I can now remove the seasonal dummy variables.
A demand function is homogeneous of degree zero if, when all prices and income are doubled, the optimal quantities demanded do not change.
H0: b2 + b3 + b9 + b10 = 0
HA: b2 + b3 + b9 + b10 ≠ 0
If H0 is true, the equation can be rearranged as:
b10 = – b2 – b3 – b9
The regression model thus becomes:
ln(QFTVG20) = b0 + b2ln(PFTVG) + b3ln(PTEA) + b9ln(PALLOTH) + (- b2 – b3 – b9)ln(INCOME)
From logarithmic rules, the equation can be written as:
ln(QFTVG20) = b0 + b2ln(PFTVG/INCOME) + b3ln(PTEA/INCOME) + b9ln(PALLOTH/INCOME)
F = [(SSRR – SSUR)/r] / [SSUR/(n – k)]
F = [(17.3810772 – 16.7224302)/1] / [16.7224302/(120 – 5)] = 4.529509413
The critical value for F(1,115) is 3.92. Since 4.5295 > 3.92, I reject the null hypothesis and conclude that demand is not homogeneous of degree zero. Laitinen has undertaken a study which concludes that the test of homogeneity is ‘seriously biased’ towards rejecting the null hypothesis. This leads me to believe that my result is acceptable: the rejection could be due to this bias, or to money illusion, where consumers mistake changes in nominal values for changes in real values.
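To see what the rejection of homogeneity implies in a log-log model, the sketch below sums the estimated elasticities of the preferred model reported in this project. If every price and income is multiplied by the same factor, ln(QFTVG20) changes by that sum times the log of the factor, so a sum of zero would leave demand unchanged. The function and variable names are assumptions made for this illustration.

```python
# Estimated elasticities of the preferred log-log model in this project.
elasticities = {
    "PFTVG": -0.626791,
    "PTEA": -0.579563,
    "PALLOTH": 2.80783,
    "INCOME": -0.470995,
}


def quantity_multiplier(scale, coefs):
    """Factor by which predicted Q changes when every price and income
    is multiplied by `scale` in a log-log model."""
    return scale ** sum(coefs.values())


degree = sum(elasticities.values())
doubling = quantity_multiplier(2.0, elasticities)

print(f"sum of elasticities = {degree:.6f}")
print(f"Q is multiplied by {doubling:.3f} when prices and income double")
```

Under homogeneity of degree zero the sum would be zero and the multiplier would be exactly 1; the estimated coefficients instead imply that predicted demand more than doubles.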
The Slutsky equation shows how a price change can lead to an income effect and a substitution effect.
To calculate the price elasticity of demand I multiply through by P/Q and multiply the last term by I/I giving:
Price elasticity of demand = substitution effect – (income elasticity x fraction of income spent)
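The steps described above can be written out as follows. This is a sketch in standard textbook notation rather than a reproduction of the original working: Q^c (compensated demand), e (elasticities) and s = PQ/I (the fraction of income spent on the good) are symbols introduced here for illustration.

```latex
% Slutsky equation in levels
\[
\frac{\partial Q}{\partial P}
  = \frac{\partial Q^{c}}{\partial P} - Q\,\frac{\partial Q}{\partial I}
\]
% Multiply through by P/Q and multiply the last term by I/I
\[
\frac{\partial Q}{\partial P}\,\frac{P}{Q}
  = \frac{\partial Q^{c}}{\partial P}\,\frac{P}{Q}
  - \left(\frac{\partial Q}{\partial I}\,\frac{I}{Q}\right)
    \left(\frac{PQ}{I}\right)
\]
% Elasticity form
\[
e_{P} = e^{c}_{P} - e_{I}\,s
\]
```

Here e_P is the price elasticity of demand, e^c_P is the substitution (compensated) elasticity, e_I is the income elasticity and s is the fraction of income spent on the good.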
From table 10 it can be seen that the income elasticity of demand is -0.470995 and price elasticity of demand of fruit and vegetables is -0.626791. The fraction of income spent on fruit and vegetables is 3%.
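Income elasticity x fraction of income spent = -0.470995 x 0.03 = -0.01412985
Income effect = – (-0.01412985) = 0.01412985
Substitution effect = -0.626791 + (-0.01412985) = -0.64092085
Since income elasticity of demand is negative, fruit and vegetables are inferior goods, so the income effect is positive and partly offsets the substitution effect. The substitution effect must always be negative, as it is here.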
Interpretation Of The Preferred Model
Having identified that there are no structural breaks in the model and that there is no evidence of seasonality, I can run a third regression with all the insignificant variables removed. The demand function is determined by:
ln(QFTVG20) = b0 + b2ln(PFTVG) + b3ln(PTEA) + b9ln(PALLOTH) + b10ln(INCOME) + et
The restricted regression model gives the following results to the aforementioned diagnostic tests:
Statistical Test | Critical value at 5% significance level | Log Log (restricted) | Log Log (unrestricted)
Ramsey RESET (RR) | 3.92 | 0.26863* | 0.0028123*
Durbin-Watson (DW) | upper 1.898, lower 1.462 | 2.01* | 2.05*
* Significant at 5% significance level
The restricted log-log model passes every test carried out and passes the F test and White’s Test more satisfactorily than the unrestricted log-log model.
I will now run further t-tests and consider whether the remaining variables are still significant. The results are shown in the table below.
The table shows that all the remaining parameters (except the constant) are significant at a 5% significance level.
Regression equation for the preferred model
ln(QFTVG20) = 0.814700 – 0.626791ln(PFTVG) – 0.579563ln(PTEA) + 2.80783ln(PALLOTH) – 0.470995ln(INCOME)
The equation suggests that fruit and vegetables are inferior goods as the coefficient for income is negative. This means that as income increases, the demand for fruit and vegetables decreases.
Interpretation of Elasticities
LPFTVG | -0.626791 | Own-price inelastic
LINCOME | -0.470995 | FTVG20 is income inelastic and is an inferior good.
Constant – represents the value that is predicted for the dependent variable when all the independent variables are equal to zero.
LPFTVG – A 1% increase in price will lead to a 0.626791% fall in quantity demanded of fruit and vegetables. The average own-price elasticity for fresh fruit from 10 studies combined by Durham and Eales is -0.6, which is very close to the elasticity I have found.
LPTEA – A 1% increase in the price of tea will lead to a fall in demand for FTVG20 of 0.579563%. The negative cross-price elasticity suggests that the two goods are complements, which could be due to fruit and tea being consumed together, for example as part of breakfast.
LPALLOTH – A 1% increase in the price of all other goods will cause a 2.80783% increase in demand for fruit and vegetables, which suggests that consumers substitute towards fruit and vegetables when other goods become more expensive.
LINCOME – A 1% increase in income means the demand for fruit and vegetables will fall by 0.470995%. From this I can conclude that fruit and vegetables are inferior goods. Purcell and Raunikar found that at lower incomes, fruit and vegetables are normal goods but at higher incomes they are inferior goods. They also found that green vegetables are inferior goods for all levels of income from 1958-62. Their results correspond to a more recent study (2005) by Ruel, Minot and Smith, who found that in 10 (relatively poor) African countries the average income elasticity of demand for fruit and vegetables was 0.766, i.e. fruit and vegetables are normal goods for low-income countries.
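The interpretations above use the rule that a 1% change in a variable changes demand by approximately b%. The sketch below compares that approximation with the exact change implied by the log-log equation, using the elasticities estimated in this project. The function name and the 10% scenario are hypothetical and are included only to show that the approximation becomes less accurate for larger changes.

```python
# Elasticities from the preferred log-log model estimated in this project.
ELASTICITIES = {
    "PFTVG": -0.626791,
    "PTEA": -0.579563,
    "PALLOTH": 2.80783,
    "INCOME": -0.470995,
}


def pct_change_in_quantity(variable, pct_change):
    """Return (approximate, exact) % change in Q for a % change in one
    variable, holding the other variables constant."""
    b = ELASTICITIES[variable]
    approx = b * pct_change
    exact = ((1 + pct_change / 100) ** b - 1) * 100
    return approx, exact


approx_1, exact_1 = pct_change_in_quantity("PFTVG", 1.0)
approx_10, exact_10 = pct_change_in_quantity("PFTVG", 10.0)

print(f"1% price rise:  approx {approx_1:.3f}%, exact {exact_1:.3f}%")
print(f"10% price rise: approx {approx_10:.3f}%, exact {exact_10:.3f}%")
```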
In this project I have estimated a demand function for fruit and vegetables (FTVG20) in Ruritania. Through using diagnostic tests and regression analysis I have found it to be a log-log model. I was able to remove insignificant variables, leaving the price of fruit and vegetables, the price of tea, the price of all other goods and income as independent variables. I then tested the data for seasonality and structural breaks and found no evidence of either between 1981 and 2010. I found that demand is not homogeneous of degree zero and justified this with reference to Laitinen’s research. Using Slutsky’s equation, I separated the income and substitution effects, and the negative income elasticity indicates that fruit and vegetables are inferior goods.
To improve the model I could separate the demand for fruit from the demand for vegetables to see whether they both remain inferior goods. It would also be interesting to consider socioeconomic factors, such as those studied by Nayga. Additionally, since a large proportion of demand for fruit is made up of the demand for juice, it would be useful to consider the demand for whole fruit and vegetables separately from that pressed into juice. These factors combined may improve the model so that it explains some of the remaining 6.6% of the variation in the data.
Ashworth, J. Durham Economics Lecture Notes
Bath Lecture Notes: www.people.bath.ac.uk/bm232/EC50161/Dummy%20Variables.ppt
Block, S., ‘Maternal Nutritional Knowledge and the Demand for Micronutrient Rich Foods: Evidence From Indonesia’
Cook, R. ‘U.S. Per Capita Fruit and Vegetables Consumption’
Cook, R. ‘Some Key Changes In U.S. Consumption Patterns’
Durham, C., Eales, J. ‘Demand Elasticities For Fresh Fruit at the Retail Level’
Greenwood, S. ‘Consumer Trends for the New Millennium Impact Fresh-cut Produce’
Han, T., Wahl, T. ‘China’s Rural Demand For Fruit and Vegetables’
Griffiths, W., Judge, G. ‘Undergraduate Economics’
Laitinen, K. ‘Why is demand homogeneity so often rejected?’
Nau, F. ‘Additional Notes On Regression Analysis’ Duke Fuqua Business School
Nayga. ‘Determinants of US Household Expenditures on Fruit and Vegetables. A Note and Update.’
Nicholson, W. ‘Microeconomic Theory: Basic Principles and Extensions’
Purcell, J.C., Raunikar, R. ‘Quantity-Income Elasticities For Foods By Level of Income’ Journal of Farm Economics, December 1967
Ruel, M.T., Minot, N., Smith, L. ‘Patterns and Determinants of Fruit and Vegetable Consumption In Sub-Saharan Africa: A Multicountry Comparison’ International Food Policy Research Institute, 2005
Seale, J., Regmi, A., Bernstein, J. ‘International Evidence on Food Consumption Patterns’
Studenmund, A. ‘Using Econometrics’
Wang, X. Durham Economics Lecture Notes