Problem Set 8

Due by 11:59 PM on Tuesday, March 30, 2021

We’re going to look at data from the most recent legislative elections in El Salvador, in which Bukele’s Nuevas Ideas party participated for the first time. I scraped the data from the internet and did some cleaning, but it’s still incomplete and should not be interpreted as final vote results.

Every row here tells you how many votes each party got in a municipality, plus some other information about economic and social variables in the municipality (e.g., access to electricity).

Task 1: What did Nuevas Ideas run on?

Go online and try to figure out how Nuevas Ideas campaigned. What were their platform and campaign promises, and how did they present themselves to the public? Include a link for reference.

Task 2: What are these municipalities like?

  1. Pick three variables you are interested in and make a histogram of each. Describe and discuss the distribution: what does the average look like? How much do municipalities vary along this variable (i.e., is there a big range of values or a small range)?

  2. Pick two of these variables and make a scatterplot with a trend line. Describe the relationship and what it tells you about El Salvador’s municipalities.
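If you would rather check your numbers outside a spreadsheet, here is a minimal Python sketch of the two calculations behind this task: summarizing a variable (average and range) and fitting a trend line. It assumes the trend line is a straight least-squares line, which is what most spreadsheet programs draw by default. The helper names `describe` and `trend_line` are made up for illustration, and the values are invented, not taken from the dataset.

```python
from statistics import mean

def describe(values):
    """Summarize one variable: its average and how widely it varies."""
    return {
        "mean": mean(values),
        "min": min(values),
        "max": max(values),
        "range": max(values) - min(values),
    }

def trend_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

# Invented values for four hypothetical municipalities (not real data).
electricidad = [60.0, 70.0, 80.0, 90.0]
agua_potable = [50.0, 60.0, 70.0, 80.0]

summary = describe(electricidad)
slope, intercept = trend_line(electricidad, agua_potable)
print(summary)
print(slope, intercept)
```

A positive slope means municipalities with more of the x-axis variable tend to have more of the y-axis variable; a slope near zero means the two are roughly unrelated.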

Task 3: How did Bukele do?

  1. What percent of the vote did Bukele’s Nuevas Ideas party win in each municipality? To answer this question you need to first create a column that tells you the TOTAL number of votes cast in each municipality (you can use the SUM function). Then, create a column that tells you the vote share for Nuevas Ideas and make a histogram. How did Nuevas Ideas do? What does the distribution look like?

  2. How did El Salvador’s traditional parties do (the FMLN and ARENA)? Repeat (1), make the histogram for each party, and discuss their performance in the election. How badly were they wrecked?

  3. What variables explain where Nuevas Ideas did best? Pick 3 municipal characteristics, make scatterplots with trend lines (with the characteristic on the x-axis, Nuevas Ideas vote share on the y-axis) and discuss the relationship. How well do these variables explain who voted for Nuevas Ideas?
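The total-votes and vote-share columns from part 1 can also be sketched in Python. This is one possible reading of "total votes", not the required method: it assumes the total is the sum of the party and candidate columns only, leaving out `nulos` and `impugnados`, so shares are out of valid votes. Add those two columns to the list if you want shares out of all ballots. The helper `add_vote_share` is hypothetical, and the example row uses invented numbers.

```python
# Party/candidate vote columns, as named in the variable list.
# Assumption: null (nulos) and contested (impugnados) ballots are left out,
# so shares are computed over valid votes only.
VOTE_COLUMNS = [
    "nuestro_tiempo", "fmln", "n", "pdc", "vamos", "pcn", "gana", "cd",
    "leonardo_bonilla", "arena", "coalicion_pcn", "jesus_segovia",
]

def add_vote_share(row, party="n"):
    """Return a copy of row with total_votes and the party's share in percent."""
    # Missing or blank cells are treated as zero votes (an assumption).
    total = sum(float(row.get(col) or 0) for col in VOTE_COLUMNS)
    share = 100 * float(row.get(party) or 0) / total if total else None
    return {**row, "total_votes": total, f"{party}_share": share}

# Invented numbers for illustration only, not real results.
example = {"municipio": "Example", "n": 600, "fmln": 100, "arena": 200, "gana": 100}
result = add_vote_share(example)
print(result["total_votes"], result["n_share"])
```

Calling `add_vote_share(row, party="fmln")` or `party="arena"` gives the shares needed for part 2.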

Description of variables in dataset
original labels
departamento Department
municipio Municipality
nuestro_tiempo Nuestro Tiempo (political party)
fmln FMLN (political party)
n Nuevas Ideas (political party)
pdc PDC (political party; Christian Democrats)
vamos Vamos (political party)
pcn PCN (political party)
gana Gana (political party)
cd CD (political party)
leonardo_bonilla No idea; one guy?
impugnados Number of contested ballots
nulos Number of null ballots
arena ARENA (political party)
coalicion_pcn A coalition involving PCN?
jesus_segovia No idea; one guy?
densidad Population density (ppl per sq km?)
percent_urbano % urban population
masculinidad Male index (# of men per 100 women)
relacion_dependencia Dependency ratio (non-working age / working age)
x60_anos_y_mas Percent over 60
tgf Fertility rate (avg. number of kids per woman)
tmi Infant mortality rate (number of deaths per 1,000 live births)
tasa_analfabetismo Illiteracy rate
asistencia_escolar School attendance rate (ages?)
agua_potable % access potable water
electricidad % access electricity
sin_servicio_sanitario % without sanitation
con_piso_de_tierra % with dirt floor