Problem set 3

Due by 11:59 PM on Wednesday, February 12, 2020


You must submit all assignments as a Word document. Answer all questions carefully and neatly. Use section headings to show which question you are answering. If a question involves a graph, copy and paste the graph into your document and make sure the graph has a title.


Tutorial for this week here

Task 1: Land Invasions in Brazil

Let’s look at this data on land invasions in Brazilian municipalities (kinda like a US district) between 1988 and 2004. This data covers the whole of Brazil, not just rural areas.

original  labels
ext_pov   Extreme Poverty (Percent) 1991
landgini  Land Gini
occs      Land Occupations 1988-2004
logarea   Logged (Land Area)
logfam    Logged (Families Involved in Land Occupations) 1988-2004
logpop    Logged (Population) 1991
  1. Look at one or two of the municipalities with the highest rates of land invasions. Look them up online, and also see if you can find any reference to invasions happening there (OK if you don’t); what are these places like? Where are they within Brazil? What stands out about them?
  2. Plot the relationship between land invasions and two of the other variables in the dataset using whatever type of plot you like (2 plots). What’s the relationship in the plots? What stands out?
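
If you want a number to go with what you see in a scatter plot, a correlation coefficient is one option. The sketch below is not part of the assignment; it computes a Pearson correlation by hand in Python. The rows in `toy` are made up for illustration (only the column names `occs` and `landgini` come from the table above), and `pearson_r` is a helper name invented here.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / sqrt(var_x * var_y)


# Made-up rows for illustration only; these are NOT values from the real data.
toy = [
    {"occs": 0, "landgini": 0.55},
    {"occs": 1, "landgini": 0.62},
    {"occs": 0, "landgini": 0.70},
    {"occs": 4, "landgini": 0.81},
    {"occs": 2, "landgini": 0.77},
]

r = pearson_r(
    [row["landgini"] for row in toy],
    [row["occs"] for row in toy],
)
print(f"correlation between landgini and occs (toy data): {r:.2f}")
```

A value near 1 or -1 means the points fall close to a straight line; a value near 0 means there is little linear relationship. A single number like this can hide outliers and curved patterns, so it complements the plot rather than replacing it.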

Task 2: Measuring Poverty

Measuring poverty in Latin America is hard. You can simply ask people how much they make each month, but their incomes might vary wildly from month to month. Some may not know, exactly, how much they make. And two people with the same monthly income may have very different levels of wealth as a result of divergent life circumstances.

We’re going to try to measure poverty using data on household assets from Honduras in 2018. Each of the “r” columns (r1 through r18 in the table below) tells you whether or not a household has a particular asset (e.g., fridge, cell phone, etc.). If a household has the asset, the cell equals 1; otherwise it equals 0.

  1. Calculate the proportion of households that own an asset, for 3 assets of your choosing, separately for urban and rural areas. You can either do this by filtering or (easier approach) by using a pivot table, putting the asset on “value” and ur on “rows”.

    Are you surprised by the results? Are they higher or lower than you thought they would be? How different are urban and rural households?
  2. Plot the distribution (bar plot, column chart) of “poverty_index” in the data, separately for urban and rural areas. Do they fit what you’d expect? How skewed are they? What stands out about them?

  3. You’ve just created measures of poverty based on household assets. Let’s see how well these measures capture poverty. Using a pivot table, take the average of two non-asset variables you think should be related to poverty for each level of “poverty_index”. Then, plot “poverty_index” on the x-axis and these averages on the y-axis in two new plots. What does the relationship look like? Would you say the measure you created captures poverty well, or not?
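
To see what the pivot tables in the steps above are doing, here is a rough Python sketch of the same idea: split the rows into groups, then average a column within each group. Because asset columns are coded 0/1, the average of an asset column within a group is the proportion of households in that group that own the asset. Everything here is an assumption for illustration: the rows in `toy` are made up, the codings used for `ur` and `poverty_index` may not match the real data, and `group_mean` is a helper name invented here.

```python
from collections import defaultdict


def group_mean(rows, by, value):
    """Average of column `value` within each level of column `by`.

    Rows where `value` is None (missing) are skipped.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        if row[value] is None:
            continue
        sums[row[by]] += row[value]
        counts[row[by]] += 1
    return {key: sums[key] / counts[key] for key in sorted(counts)}


# Made-up rows for illustration only; these are NOT values from the real data.
# The "urban"/"rural" labels and the poverty_index levels are assumed codings.
toy = [
    {"ur": "urban", "r3": 1, "poverty_index": 1, "ed": 12},
    {"ur": "urban", "r3": 1, "poverty_index": 2, "ed": 9},
    {"ur": "urban", "r3": 0, "poverty_index": 3, "ed": 6},
    {"ur": "rural", "r3": 1, "poverty_index": 2, "ed": 7},
    {"ur": "rural", "r3": 0, "poverty_index": 3, "ed": 4},
    {"ur": "rural", "r3": 0, "poverty_index": 3, "ed": 5},
]

# Step 1: share of households with a refrigerator (r3), by urban/rural.
share_fridge = group_mean(toy, by="ur", value="r3")

# Step 3: average years of schooling (ed) at each level of poverty_index.
mean_ed = group_mean(toy, by="poverty_index", value="ed")

print(share_fridge)
print(mean_ed)
```

In pivot-table terms, `by` plays the role of the “rows” field and `value` is the field being averaged.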

original   labels
pais       Country
ed         Years of Schooling
q10new_18  Monthly Household Income
r1         Television in Home
r3         Refrigerator in Home
r4         Landline in Home
r4a        Cellular Telephone in Home
r5         Number of Vehicles at the House
r6         Washing Machine in Home
r7         Microwave Oven in Home
r8         Owns Motorcycle
r12        Drinking water in Home
r14        Indoor Bathroom in Home
r15        Computer in Home
r16        Flat Panel TV in Home
r18        Internet Service in Home
ur         Urban/Rural
q14        Intends to Live or Work Abroad
fs2        Has Run Out of Food in the Last 3 Months (1 = yes)
fs8        Has Gone without Meals in the Last 3 Months (1 = yes)
wf1        Receives Government Assistance (1 = yes)