Category Archives: Uncategorized

Interest Rates and Effective Federal Funds Rate

As I learn SAS, I would still like to retain a working knowledge of R.  Unlike SAS and Minitab, R is free software.
As many comparison sites indicate, R has a steeper learning curve.  Initially you can become frustrated by small typos that produce error messages, and I had to overcome an initial aversion to finding and installing the right packages.  There are no automatic updates, so you need to discern the cause of a mistake yourself, be it formatting, a typo, or an outdated package.  An organized project folder also saves time later.
Given my time limitations, I will try to “explore” data.  Data mining can be much more time consuming than the actual analysis.  For this reason, the scope of these explorations will be limited, as this blog is kept as a hobby.

Below are three questions I want to explore based on a data set that contains the following variables, recorded monthly, from 1954 to 2016:


## How have interest rates evolved over the last few decades? 


## How many times was the effective federal funds rate above the rate of inflation?


## Are real GDP changes, unemployment rates, and inflation rates variables that help predict the variation in the effective federal funds rate?

#These are the variables:

> str(interest_rates)

Classes 'tbl_df', 'tbl' and 'data.frame':    904 obs. of  10 variables:

 $ Year                        : int  1954 1954 1954 1954 1954 1954 1955 1955 1955 1955 ...

 $ Month                       : chr  "07" "08" "09" "10" ...

 $ Day                         : chr  "01" "01" "01" "01" ...

 $ Federal Funds Target Rate   : num  NA NA NA NA NA NA NA NA NA NA ...

 $ Federal Funds Upper Target  : num  NA NA NA NA NA NA NA NA NA NA ...

 $ Federal Funds Lower Target  : num  NA NA NA NA NA NA NA NA NA NA ...

 $ Effective Federal Funds Rate: num  0.8 1.22 1.06 0.85 0.83 1.28 1.39 1.29 1.35 1.43 ...

 $ Real GDP (Percent Change)   : num  4.6 NA NA 8 NA NA 11.9 NA NA 6.7 ...

 $ Unemployment Rate           : num  5.8 6 6.1 5.7 5.3 5 4.9 4.7 4.6 4.7 ...

 $ Inflation Rate              : num  NA NA NA NA NA NA NA NA NA NA ...




1.) How have interest rates evolved over the last few decades?

#We should first begin with a general time series plot with the year on the x-axis and the effective federal funds rate on the y-axis.

plot(`Effective Federal Funds Rate` ~ Year, data = interest_rates, xlab = "Year", ylab = "Effective Federal Funds Rate", main = "Rates over Time", cex = 2, col = "blue")

lending rate per year

#Although we notice the general rise and fall of this lending interest rate, how about the variability of the effective federal funds rate within a single year?  There seem to be monthly observations that change during some years and remain constant during others.  The 1980s look very different from the first half of the 2010s.

model.Year <- lm(`Effective Federal Funds Rate` ~ `Year`, data = interest_rates)

plot(model.Year, which =4)

We can use the Cook’s distance value to see the combined effect of each observation’s leverage and residual values.  To calculate the Cook’s distance, the ith data point is removed from the model and the regression is recalculated.  The Cook’s distance summarizes how much all of the other values in the regression model changed when the ith observation was removed.
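The same distances shown by `which = 4` can also be pulled out numerically with `cooks.distance()`.  A sketch, assuming the `model.Year` object defined above (the 4/n cutoff is just a common rule of thumb, not part of the original analysis):

```r
# Cook's distance for each observation that entered the fit
cd <- cooks.distance(model.Year)

# Flag observations above the 4/n rule-of-thumb cutoff
cutoff <- 4 / length(cd)

# The names of cd are the row numbers of interest_rates used in the model,
# so they can index back into the original data
rows <- as.numeric(names(cd)[cd > cutoff])
head(interest_rates[rows, c("Year", "Month", "Effective Federal Funds Rate")])
```

Because `lm()` drops rows with missing values, indexing by the names of the returned vector (rather than by position) keeps the flagged rows aligned with the original data.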

Cook's distance values

#The middle observations (the 1980s) are associated with relatively large Cook’s distance values.  Normally the Cook’s distances are represented by clearly separated vertical lines; since we have so many observations, this visual effect is lost.

#Are there certain months where the effective rate changes?

boxplot(`Effective Federal Funds Rate` ~ Month, data = interest_rates, tck = 0.02, xlab = "Month", ylab = "Effective Federal Funds Rate", main = "Does the month matter?", col = c("darkgreen", "orangered"))

Does the month matter?

#The short answer is no, the months do not matter.  The medians are similar, probably around 4.5 to 5%, and a regression confirms this, even though the graph above should suffice as an explanation.

> model.Month <- lm(`Effective Federal Funds Rate` ~ Month, data = interest_rates)

> summary(model.Month)


Call:
lm(formula = `Effective Federal Funds Rate` ~ Month, data = interest_rates)

Residuals:
   Min     1Q Median     3Q    Max
-4.914 -2.450 -0.199  1.715 14.261

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  4.81905    0.45829  10.515   <2e-16 ***
Month02     -0.05032    0.64812  -0.078    0.938
Month03      0.04950    0.65073   0.076    0.939
Month04      0.11886    0.65073   0.183    0.855
Month05      0.12918    0.65073   0.199    0.843
Month06      0.18482    0.65073   0.284    0.776
Month07      0.12841    0.64812   0.198    0.843
Month08      0.15413    0.64812   0.238    0.812
Month09      0.15095    0.64812   0.233    0.816
Month10      0.11063    0.64812   0.171    0.865
Month11      0.06810    0.64812   0.105    0.916
Month12      0.06095    0.64812   0.094    0.925
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 3.638 on 740 degrees of freedom
  (152 observations deleted due to missingness)
Multiple R-squared:  0.0003318,    Adjusted R-squared:  -0.01453
F-statistic: 0.02233 on 11 and 740 DF,  p-value: 1


#None of the month coefficients is significantly different from zero.  Just in case the 2010s proved unusually steady while other years experienced more month-to-month oscillation, all months in years 2010 through 2016 were removed from the data and a new regression was fit.

> model.Month2 <- lm(`Effective Federal Funds Rate` ~ Month, data = subset(interest_rates, Year <2010))

> summary(model.Month2)


Call:
lm(formula = `Effective Federal Funds Rate` ~ Month, data = subset(interest_rates,
    Year < 2010))

Residuals:
    Min      1Q  Median      3Q     Max
-5.4213 -2.4667 -0.3437  1.5644 13.5904

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  5.48964    0.46005  11.933   <2e-16 ***
Month02     -0.05927    0.65061  -0.091    0.927
Month03     -0.02182    0.65061  -0.034    0.973
Month04      0.05545    0.65061   0.085    0.932
Month05      0.06764    0.65061   0.104    0.917
Month06      0.13055    0.65061   0.201    0.841
Month07      0.05644    0.64770   0.087    0.931
Month08      0.08501    0.64770   0.131    0.896
Month09      0.08161    0.64770   0.126    0.900
Month10      0.03626    0.64770   0.056    0.955
Month11     -0.01178    0.64770  -0.018    0.985
Month12     -0.02464    0.64770  -0.038    0.970
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 3.412 on 654 degrees of freedom
  (148 observations deleted due to missingness)
Multiple R-squared:  0.0002527,    Adjusted R-squared:  -0.01656
F-statistic: 0.01503 on 11 and 654 DF,  p-value: 1


#Nothing changes by excluding these seven years.  So let’s create a model object for the effective federal funds rate regressed on year.  We come to a different conclusion for the year predictor variable.

> model.Year <- lm(`Effective Federal Funds Rate` ~ Year, data = interest_rates)
> summary(model.Year)


Call:
lm(formula = `Effective Federal Funds Rate` ~ Year, data = interest_rates)

Residuals:
    Min      1Q  Median      3Q     Max
-5.7058 -2.9800 -0.4617  1.7083 13.9685

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept) 105.963938  13.982013   7.579 1.03e-13 ***
Year         -0.050900   0.007042  -7.228 1.21e-12 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 3.494 on 750 degrees of freedom
  (152 observations deleted due to missingness)
Multiple R-squared:  0.06512, Adjusted R-squared:  0.06387
F-statistic: 52.24 on 1 and 750 DF,  p-value: 1.212e-12



2.) How many times was the effective federal funds rate above the rate of inflation?

> occurrences <- interest_rates$`Effective Federal Funds Rate` > interest_rates$`Inflation Rate`

> summary(occurrences)
   Mode   FALSE    TRUE    NA's 
logical     226     484     194


#The FALSE and TRUE counts indicate how many times the effective federal funds rate was, or was not, larger than the inflation rate.  Whenever data is missing for either the effective federal funds rate or the inflation rate, an NA is produced for occurrences.  Therefore we only consider observations with values for both rates.
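The complete-case counts, and the share of months in which the effective rate exceeded inflation, can be computed directly.  A sketch using the occurrences vector defined above:

```r
# Counts over complete cases only (table() drops NAs by default)
table(occurrences)

# Proportion of complete cases where the effective rate exceeded inflation;
# given the counts shown above this is roughly 484 / (484 + 226), about 0.68
mean(occurrences, na.rm = TRUE)
```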



#More often than not, the effective federal funds target rate is higher than the rate of inflation.
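To push the comparison a bit further, the effective rate could be summarized within each group.  A sketch, again using the occurrences vector:

```r
# Mean effective rate in months when it was below (FALSE) vs. above (TRUE)
# the inflation rate
tapply(interest_rates$`Effective Federal Funds Rate`, occurrences,
       mean, na.rm = TRUE)
```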

#The effective federal funds rate is the interest rate depository institutions charge one another when lending funds.


#It makes sense that banking institutions would want to earn interest on the funds loaned.


3.) Are real GDP changes, unemployment rates, and inflation rates good variables for predicting the variation in the effective federal funds rate?

#We can run a simple linear regression for each candidate predictor and assign each to a model object.  This is done three times below.

> model.GDP <- lm(`Effective Federal Funds Rate` ~ `Real GDP (Percent Change)`, data = interest_rates)

> summary(model.GDP)


Call:
lm(formula = `Effective Federal Funds Rate` ~ `Real GDP (Percent Change)`,
    data = interest_rates)

Residuals:
    Min      1Q  Median      3Q     Max
-5.6613 -2.5303 -0.1314  1.6837 14.7109

Coefficients:
                            Estimate Std. Error t value Pr(>|t|)
(Intercept)                  5.25102    0.30661  17.126   <2e-16 ***
`Real GDP (Percent Change)` -0.10375    0.06429  -1.614    0.108
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 3.651 on 248 degrees of freedom
  (654 observations deleted due to missingness)
Multiple R-squared:  0.01039, Adjusted R-squared:  0.006402
F-statistic: 2.604 on 1 and 248 DF,  p-value: 0.1078


> model.Unemployment <- lm(`Effective Federal Funds Rate` ~ `Unemployment Rate`, data = interest_rates)

> summary(model.Unemployment)



Call:
lm(formula = `Effective Federal Funds Rate` ~ `Unemployment Rate`,
    data = interest_rates)

Residuals:
    Min      1Q  Median      3Q     Max
-5.1303 -2.4760 -0.1807  1.7769 14.0607

Coefficients:
                    Estimate Std. Error t value Pr(>|t|)
(Intercept)          4.40650    0.51960   8.481   <2e-16 ***
`Unemployment Rate`  0.08438    0.08406   1.004    0.316
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 3.611 on 750 degrees of freedom
  (152 observations deleted due to missingness)
Multiple R-squared:  0.001341,    Adjusted R-squared:  9.914e-06
F-statistic: 1.007 on 1 and 750 DF,  p-value: 0.3158


> model.Inflation.Rate <- lm(`Effective Federal Funds Rate` ~ `Inflation Rate`, data = interest_rates)

> summary(model.Inflation.Rate)



Call:
lm(formula = `Effective Federal Funds Rate` ~ `Inflation Rate`,
    data = interest_rates)

Residuals:
    Min      1Q  Median      3Q     Max
-8.0637 -1.6861  0.1715  1.5918  7.7240

Coefficients:
                 Estimate Std. Error t value Pr(>|t|)
(Intercept)        0.9058     0.1501   6.036 2.55e-09 ***
`Inflation Rate`   1.1139     0.0331  33.647  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2.269 on 708 degrees of freedom
  (194 observations deleted due to missingness)
Multiple R-squared:  0.6152,  Adjusted R-squared:  0.6147
F-statistic:  1132 on 1 and 708 DF,  p-value: < 2.2e-16


#The slope coefficient of the predictor variable in the second model is not significant at any reasonable significance level.  In other words, the slope for the unemployment rate is not significantly different from zero.

#The unemployment rate is not a good predictor variable for the effective federal funds rate.  It should be noted that we do not know whether the unemployment rate is recorded as a U-3 or U-6 percentage.  These percentages are most likely U-3, since our first observations are from 1954 and U-6 unemployment rates were first calculated in the early 1990s.

#The unemployment rate also never reaches a lower bound of 0%; frictional unemployment exists even in the best economic conditions.

#The real GDP (percent change) is marginally significant only if we set alpha at 0.10, a questionably lenient significance level to accept.  If we were to build a best subsets model with multiple predictor variables, the alpha to enter and exit could be set higher, possibly 0.15.  We can revisit this later.

#The inflation rate is significant at any reasonable level of significance.  On average, the effective federal funds rate will be 1.1139 times the inflation rate plus 0.9058.  The inflation rate explains 61.52% of the variation in the effective federal funds rate.
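As a quick sanity check on that interpretation, the fitted line can be used for prediction.  A sketch, assuming the model.Inflation.Rate object from above (check.names = FALSE is needed so the column keeps its backticked name):

```r
# Predict the effective federal funds rate at 2% and 5% inflation
newdata <- data.frame(`Inflation Rate` = c(2, 5), check.names = FALSE)
predict(model.Inflation.Rate, newdata = newdata)

# By hand, at 2% inflation: 0.9058 + 1.1139 * 2 = 3.1336
```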

#We should see if the real GDP in percentage change terms belongs as a second predictor variable in a model that contains the inflation rate as the first predictor variable.  A new linear regression is assigned to a new model object and the summary function is run on that object.

> model.MLR <- lm(`Effective Federal Funds Rate` ~ `Inflation Rate` + `Real GDP (Percent Change)`, data = interest_rates)

> summary(model.MLR)


Call:
lm(formula = `Effective Federal Funds Rate` ~ `Inflation Rate` +
    `Real GDP (Percent Change)`, data = interest_rates)

Residuals:
    Min      1Q  Median      3Q     Max
-8.2808 -1.6203  0.1622  1.5577  6.2258

Coefficients:
                            Estimate Std. Error t value Pr(>|t|)
(Intercept)                  0.61688    0.31189   1.978   0.0491 *
`Inflation Rate`             1.14912    0.05840  19.678   <2e-16 ***
`Real GDP (Percent Change)`  0.05447    0.04223   1.290   0.1983
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2.27 on 233 degrees of freedom
  (668 observations deleted due to missingness)
Multiple R-squared:  0.6275,        Adjusted R-squared:  0.6243
F-statistic: 196.3 on 2 and 233 DF,  p-value: < 2.2e-16

#The adjusted R-squared value does not improve much by adding a second predictor variable.  The p-value for the individual t-test on the real GDP (percent change) slope also shows that the coefficient is not significantly different from zero at any reasonable significance level.
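The same conclusion can be reached with a partial F-test comparing the nested models, though both must be fit on the same rows.  A sketch (complete.cases handles the differing missingness between the two fits):

```r
# Keep only rows where the response and both predictors are present
vars <- c("Effective Federal Funds Rate", "Inflation Rate",
          "Real GDP (Percent Change)")
complete <- complete.cases(interest_rates[, vars])

# Refit the nested models on the common rows, then compare
m1 <- lm(`Effective Federal Funds Rate` ~ `Inflation Rate`,
         data = interest_rates[complete, ])
m2 <- lm(`Effective Federal Funds Rate` ~ `Inflation Rate` +
           `Real GDP (Percent Change)`, data = interest_rates[complete, ])
anova(m1, m2)  # F-test for whether the GDP term adds explanatory power
```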

#Before settling for the single linear regression model that only contains the inflation rate, let us check the diagnostic plots for the residuals in the model.

> op <- par(mfrow=c(2,2))

> plot(model.Inflation.Rate)

SLR diagnostic plots

#Looking at the first row of plots, the density and variance of the residuals look good for the first half of the fitted values, but the variance increases and the density decreases as the fitted values grow; there is no clear systematic pattern, however.  The normal Q-Q plot also looks decent, but has an issue in the lower tail.  In the future we may want to consider a transformation, or debate whether the model only fits well for a subset of inflation rate values.
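If the widening spread is a concern, one candidate remedy to try is a log transformation of the response.  This is only a sketch of that idea (all observed effective rates in this sample are positive, so the log is defined):

```r
# Refit with a log-transformed response to damp the growing variance
model.log <- lm(log(`Effective Federal Funds Rate`) ~ `Inflation Rate`,
                data = interest_rates)

# Compare these diagnostic plots with those of the untransformed fit
par(mfrow = c(2, 2))
plot(model.log)
par(mfrow = c(1, 1))
```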

The Electoral College and the Tidewater Nation

Colin Woodard, the author of American Nations: A History of the Eleven Rival Regional Cultures of North America, tries to show us why we should not view policy positions as simply “Democrat” or “Republican”.  According to Woodard, we live in a country of 11 nations that form coalitions based upon various issues.  The objective of each nation is to preserve its identity and to be influential in national politics.



Woodard (c)2011

The author suggests that contrary to popular notion of the United States being a melting pot, new arrivals either specifically moved to one of the 11 nations because the nation encompassed their values or the newcomers were assimilated, adopting the pre-existing values of a nation.   In this second scenario, the original founders of a community set the framework for that nation and new arrivals conform to or otherwise reinforce that culture.

Colin Woodard also explains that different nations in the United States held different conceptions of democracy.  The Yankeedom nation held the Nordic or Germanic conception of democracy, which encouraged near universal male suffrage.  Yankeedom was founded primarily by middle-class, well-educated Puritans.  Immigrants came in family units and they valued community structure and shared values.  When migrants settled other parts of the United States, they carried these tendencies and traditions with them.  When confronting other nations, such as New Amsterdam, the Midlands, and Greater Appalachia, they sought to impose their Puritanism.

Other nations were founded upon deep inequalities.  The Tidewater and Deep South treasured the Greek and Roman democratic systems, in which the existence of slaves coincided with their perception of democracy.  Greek or Roman democracy exists to benefit the few, allowing a select group of men to become “enlightened” and guide their societies, a benefit seen as outweighing the agonies of those enslaved.  They viewed slavery as more humane than the treatment of the urban poor in the northern nations, reasoning that at least the slaves had a master who was supposed to care for them.  “Enlightened” Tidewater and Deep South gentry also argued that Yankeedom was a society of shopkeepers, which prevented individuals from becoming educated enough to advance their societies.

The Tidewater and Deep South were also not founded by equal proportions of men and women, and they tended to support the Royalists back in the United Kingdom; during the English Civil War, they sided with the King.  The Tidewater saw itself as an extension of Norman culture while Yankeedom was Anglo-Saxon.  Things changed for the Tidewater when the British Empire sought to homogenize control over its colonies.  The King redefined the rights of his British subjects so that only those living in England had full rights.  This clarification of who was considered an Englishman did not go over well with the gentry of the Tidewater.

It should be interesting to note that other nations did not value the democratic system at all.  New Netherland (New York) preferred a hegemonic system and hoped to be reabsorbed by the Dutch or British monarchies on several occasions.   Autocracy worked given that citizens showed tolerance towards one another.

It should not be surprising which cultures would support the continued use of the Electoral College system.  The National Constitution Center features a podcast from December 1, 2016 titled “Should we abolish the Electoral College?”.  The two panelists have biographies included on the website.  From this limited information, we might conclude that one panelist is from either Yankeedom or the Left Coast while the other is from the Tidewater.  If Woodard’s theory is correct, both natives and migrants become assimilated by their nations, and panelists will eventually advocate the ideals of their nations.

This perspective is interesting because “Yankeedom” or “the Left Coast” could be considered “Democrat” in this past election cycle.  They will be on the defensive when faced with the new administration.  The representative from the “Tidewater” may or may not be considered a “Democrat”, but he comes from a dying nation.  The Tidewater nation may not exist in the future.  The growth of the DC metropolitan area into Maryland and northern Virginia essentially divides this nation, and incremental growth from the Midlands also reduces its power.  With rising sea levels, the region will also lose territory to the east.  Essentially, the representative from the Tidewater seeks to preserve any formerly established advantage at all costs.

Both panelists introduce us to the history of the Electoral College.  Some of the original founders envisioned the electors to choose the President and Vice President that were most qualified for the position.  Initially the most qualified person would become the President and the second-most qualified would be Vice President.  Electors were supposed to deliberate and select candidates to run for the final election.

The Electoral College was one of the last systems established during the Constitutional Convention.  The framers were concerned about the excesses of democracy and the emergence of demagogues, but showed “haste and fatigue” by the time they got around to the Electoral College.  Modern campaigns were also not envisioned.  Founding fathers thought the President should be determined based on their reputation and history of service, not by their cleverness, or radicalism, during a campaign.

According to Alex, the electoral college was supposed to serve as a nominating board to send candidates to the House of Representatives.  From this cohort we would end up with the best candidate.  However, by the 1820s, the responsibility for narrowing down the candidate list was being usurped from the Electoral College and handed over to the political parties.

During the 19th century a series of reforms were advocated.  Since several nations exist in the same state, district elections were advocated versus the “winner-take-all”. Some also wanted to eliminate human electors.  Andrew Jackson, an Appalachian, was one of these strong advocates for changing the system.

The moderator and President of the National Constitution Center reminds us that when elections are close, the Electoral College provides us with a clear winner.  A series of small differences in certain states are magnified by the electoral system.  In effect, there is “no room for doubt”.

The Tidewater representative suggests that the smaller states look favorably on keeping the Electoral College.  Its existence helps preserve the Federation; all constituencies matter.  The Yankeedom or Left Coast representative refutes this idea, stating that two strong advocates for ditching the Electoral College in favor of a popular vote came from small states, Rhode Island and North Dakota.  Candidates do not campaign in these states under the Electoral College system, and they probably still would not if we switched to a popular election system.

Both representatives do agree that a popular vote system would lead to increased role for the federal government, since national standards for registration and voting would need to be set and enforced.  The Tidewater representative shows deep concern over this possibility.

It is important to put this concern in its proper context.  As previously mentioned, the Tidewater nation is the only one today that is at risk of extinction.  During the expansion of the Deep South, the values of the Tidewater were eroded and made more extreme, especially its policy towards slavery.  Tidewater leaders eventually followed the lead of the Deep South.  The tobacco industry declined in the Tidewater just as the cotton industry became prosperous in the Deep South.  The Deep South was also able to expand west whereas the Tidewater was cut off by a new nation, Greater Appalachia.

Woodard (c)2011

The Yankeedom representative tells us that the conception of “democracy” has changed over time.  The Electoral College does not conform to people’s everyday notions of democracy.  He uses our gubernatorial and student body elections as classic examples.  In these instances the popular vote installs the new leader.

This argument rests on the belief that all people in the 11 nations share this belief.  We might question if the Deep South uses wealth and race, or if Greater Appalachia uses strength, in place of popular elections as their preferred method for finding a new leader.

The panelists also discuss the geography of states.  The blue oases in red states do not count.  Woodard addresses this issue by analyzing nations at the county level.

They also discuss the implications of a popular vote system.  The Tidewater representative reminds us that having “run-off elections” creates an entirely different system.  Other “fringe” political parties would have a stronger incentive to enter the contest, and these “fringe” parties would be able to form coalitions and run in a second round.  The Tidewater representative also warns us that with more than two political parties, there would be less of a “moderating” influence.  It is also uncertain whether third parties would increase or reduce the emergence of demagogues.  Regardless of how many exist, political parties were not viewed favorably by most of the founding fathers.

Cotton subsidies and total production per county

Cotton subsidies have been decreasing since 2005.  This may be associated with the Dispute Settlement Body of the World Trade Organization’s recommendation that the United States cease subsidizing upland cotton.

The subsidies in question were:

(i) the export credit guarantees under the GSM 102, GSM 103 and SCGP export credit guarantee programmes in respect of exports of upland cotton and other unscheduled agricultural products supported under the programmes, and in respect of one scheduled product (rice);

(ii) Section 1207(a) of the Farm Security and Rural Investment (FSRI) Act of 2002 providing for user marketing (STEP2) payments to exporters of upland cotton; and

(iii) Section 1207(a) of the FSRI Act of 2002 providing for user marketing (STEP2) payments to domestic users of upland cotton. As for the actionable subsidies the recommendation is that the United States takes appropriate steps to remove the adverse effects of certain subsidies or withdraw these subsidies within six months from the date of adoption of the Panel and Appellate Body reports, i.e. the compliance period expired on 21 September 2005.


The aggregate amount of subsidies between 2000 and 2014 is depicted below:


The following county maps were produced by the United States Department of Agriculture.  We can see the decrease in both the total counties and the amount of cotton (pima and upland varieties) produced by each county between 2010 and 2015.  During these five years, the direct payments, production flexibility contracts, and counter-cyclical programs are completely phased out.  “Other cotton programs” emerge in 2013 and to a great extent in 2014 (EWG Farm Subsidy Database).


Pima cotton moves out of western Texas and emerges in Arizona.  Kern County in California almost completely stops producing pima cotton.





There are major decreases in the amount of upland cotton produced by counties between 2010 and 2015.  In particular, we should note the color shade changes in Arkansas, Louisiana, North Carolina, and South Carolina.


False Allegations of Voter Fraud

Allegations of voter fraud make great headlines, but are generally false.  These allegations are based on “feelings” instead of data.  The claims are ultimately stoked by the interest of incumbent politicians and political parties seeking to suppress the vote when greater voter turnout does not favor their odds.

The maximum occurrence of voter fraud in the United States between 2000 and 2014 has been calculated at 0.0000031%. 

We typically see the mobilization efforts of political parties.  What we often do not see, are their demobilization efforts.  According to Groarke, both methods are equally important during an election cycle (Groarke 2016).  We see an example of a demobilization effort when voter requirements become more stringent in response to allegations of “voter fraud”.

Groarke studied three campaigns to improve voter turnout and the allegations waged against these efforts in the name of “voter fraud”.  Groarke found that the representative’s tenure and propensity to want more “unreliable voters” in their district influenced their political calculus.

There was an observed difference between northern and southern Democrats in response to each effort.  Some Republicans even initially supported these efforts, but eventually retreated from these positions.

Postcard Registration Bills (1971 – 1976) introduced by Senator Gale McGee (D-WY)

The arguments against the legislation:

  • Registration of unqualified persons
  • Registration of nonexistent persons
  • Postcard registration cost
  • Danger of federal intrusion into election process
  • Nonvoters are naturally uninterested in politics

Minnesota and Maryland served as a case example of how postcard registration improved voter turnout.  Between the two, a single case of fraud was determined during the registration leading up to the 1974 election.  In exchange for that one case of fraud, registration increased by 1.5% from the previous voting period and was a cheap way to increase participation (Ford Foundation 715).

Election Day Registration (~1976) introduced by President Jimmy Carter and VP Mondale

The arguments against the legislation:

  • Endangers the integrity of franchise
  • Serious threats of fraud even if voters showed identification while registering
  • Increased federal regulation and would discourage participation

Minnesota had allowed same-day voter registration since 1973.  Between 1972 and 1976, the percentage of the population actively voting increased from 68.4% to 71.4%, and 22.9% of these voters took advantage of the ability to register on election day.  In 1976, Minnesota had the highest voter turnout in the nation (Smolka 26).

Groarke once again noticed that the largest opponents of same day election registration were Republicans or Southern Democrats who were incumbent members of Congress within “safe” districts (578-580).

The beginnings of the Motor Voter Law (mid to late 1970s, 1980s)

The arguments against the legislation:

  • Fraud
  • People who really want to vote will find a way (comparing voters in El Salvador to those in the U.S.)

If effective, the Motor Voter Law would have required states to offer Election Day registration, mail registration, and ensure that government agencies offered voter registration and the ability to update existing registration.  The initial movement also did not include provisions for “non-voter purging”.

Although Election Day registration failed, some states opted to use mail registration and had government agencies assist in registering voters and updating existing voter information.  The Reagan administration fought motor voter laws with the Hatch Act, and litigation tied up the ability of government agencies to consistently offer voter registration and updates to existing registrations.  This law became known as the “Motor Voter Law” because essentially only motor vehicle agencies were seamlessly incorporating voter registration into their processes (582).

In the 1980s, certain portions of the population tended to live in cities while others lived in the suburbs, and suburban life translated into the need for a driver’s license.  Registration through motor vehicle agencies therefore reached some groups more readily than others.

Today, the reality is much different than it was in the 1980s.  Unfortunately, this data from the U.S. Census Bureau’s American FactFinder is not readily available for the 1980s.  It would require more time to juxtapose a snapshot of 1980 and 2015.  I could also then see if the changes are statistically significant.


Proponents of the law had to negotiate to win over opponents.  Penalties were included for fraudulent registration, and same-day registrants were segregated and further scrutinized before their votes were counted (Groarke 585–586).

The mail delivery service is not consistent in all communities.  This became a problem when a mailing purge was suggested to occur every two years.  Voters that did not respond to the mailer would be removed from the voting list.

National Voter Registration Act – (1993) President Clinton and Rep. Al Swift (D-WA)

The Motor Voter Law reemerges and undergoes significant changes.  Election Day registration is dropped and voter list maintenance requirements were added.  It was also no longer considered mandatory for unemployment agencies to offer voter registration.

Some states threw up roadblocks by requiring two separate registration processes for voting in state versus national elections.

Groarke reminds us that political parties exert equal efforts to mobilize and demobilize potential voters.  Her Table 3 shows annual voter removals as a percentage of voter registration applications.  Thanks to the National Voter Registration Act, the purging process has been implemented nationally: during the first year, a name is flagged “to be purged”, and if the potential voter has not communicated with voter registration by year two, they are purged from the voting rolls.


With the emphasis on “fraud”, we should remember that anyone tempted to commit fraud can run a simple cost-benefit analysis: what is the incremental benefit of one vote against the fines and prison sentences if that one fraudulent vote is discovered?  Levitt argues that for a rational agent, the incremental vote is not worth the fines and penalties (Levitt 2007).
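Levitt's point can be made concrete with a toy expected-value calculation. This is a minimal Python sketch (used here purely for illustration); the detection probability, fine, and vote value are hypothetical placeholders, not figures from Levitt (2007):

```python
# Toy expected-value sketch of the rational-agent argument. All three
# numbers below are hypothetical placeholders chosen for illustration.
def expected_net_benefit(p_caught, fine, vote_value):
    """Expected payoff of casting one fraudulent vote."""
    return vote_value - p_caught * fine

# Even a small chance of a $10,000 fine swamps the value of one marginal vote.
payoff = expected_net_benefit(p_caught=0.05, fine=10_000, vote_value=1.0)
print(payoff)  # roughly -499
```

Under almost any plausible values, the expected payoff is deeply negative, which is the heart of the argument.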

Instead, we should consider other explanations for mistakes made at the polls.  Some possibilities are:

  • Clerical or typographical errors (e.g., signing the wrong line or choosing an identical or nearly identical name)
  • First and last names, or parts of street addresses, are inverted
  • Incomplete written data matches another person’s record (e.g., consider middle initials)
  • Common names are prone to being flagged and purged
  • Certain birthdates are more common than others (e.g., see this article about the probability of being born on a given day.)
  • Voters move and can be registered at two addresses, but vote only once.
  • Voters can begin filling out a form on Election Day, make a mistake, be given another form, and an election official can accidentally count the discarded form as well.
  • Voters can vote before an election and die by the time the vote is confirmed.
  • The right to vote for felons is not consistent across all states. In some states the right is restored upon release; in others, a process is required to regain it.  In addition, misdemeanor offenders retain the right to vote.
  • “Caging” efforts to purge voters are not always accurate. This tactic checks which postcards are returned by the USPS, but sometimes the potential voter is out of the country or their area is poorly serviced by the USPS.
  • An unusual address is not necessarily illegitimate. Homeless persons can register their address as the local shelter, and business owners can live in the same building as their business.

Levitt calculates overall documented fraud rates in the following states:

  • Missouri: 0.0003%
  • New Hampshire: 0.0000%
  • New Jersey: 0.0002%
  • New York: 0.000009%
  • Wisconsin: 0.0000%

In a Washington Post article, Levitt reports 31 credible incidents of voter fraud across all general, primary, special, and municipal elections from 2000 through 2014 (Levitt 2014).  These 31 incidents occurred out of more than 1 billion ballots cast over that period.  Some of the fraud allegations have not been fully investigated, which may indicate that they are being falsely flagged as “fraud”.
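The overall rate implied by those numbers is easy to verify with a quick Python sketch (the ballot total is Levitt's approximate "over 1 billion" figure, treated here as exactly 1 billion):

```python
# Back-of-the-envelope rate implied by Levitt's numbers: 31 credible
# incidents out of roughly 1 billion ballots cast.
incidents = 31
ballots = 1_000_000_000  # "over 1 billion" per Levitt; treated as approximate
rate_pct = incidents / ballots * 100
print(f"{rate_pct:.7f}%")  # 0.0000031%
```

That is an order of magnitude smaller than even the tiny state-level rates listed above.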

U.S. state fertilizer indices and growth of factor productivity levels

I use USDA data from 1960 and 2004 for a brief exploratory analysis of what makes some states more agriculturally productive than others.

Is a higher fertilizer index associated with higher factor productivity levels?

H0: There is no correlation between fertilizer indices and the growth of factor productivity.

Ha: There is a correlation between fertilizer indices and the growth of factor productivity.


Both the growth of factor productivity levels and the fertilizer consumption indices are expressed relative to Alabama in 1998.  Alaska and Hawaii are the only states excluded.

I run the regression analysis on the 1960 and 2004 data.  The small p-value for the 1960 data leads us to reject the null hypothesis and conclude that a correlation between the two variables exists.  The larger p-value for the 2004 data leads us to fail to reject the null hypothesis.

We should note that the fertilizer index explains only a small percentage of the variability in the response variable: the data points are scattered far from the regression line, as the R-sq values show (14.41% in 1960, 4.40% in 2004).  Over time, the fertilizer index predictor explains even less of that variability.

Regression Analysis: Factor Productivity (1960) versus Fertilizer Indices in 1960

 Analysis of Variance

Source                        DF   Adj SS    Adj MS  F-Value  P-Value

Regression                     1  0.07532  0.075324     7.75    0.008

  Fertilizer Indices in 1960   1  0.07532  0.075324     7.75    0.008

Error                         46  0.44736  0.009725

Total                         47  0.52269


Model Summary 

        S    R-sq  R-sq(adj)  R-sq(pred)

0.0986168  14.41%     12.55%       1.83%



Term                          Coef  SE Coef  T-Value  P-Value   VIF

Constant                    0.4969   0.0222    22.40    0.000

Fertilizer Indices in 1960  0.0499   0.0179     2.78    0.008  1.00


Regression Equation

 Factor Productivity (1960) = 0.4969 + 0.0499(Fertilizer Indices in 1960)


 Fits and Diagnostics for Unusual Observations



Obs        (1960)     Fit    Resid  Std Resid

  2        0.7057  0.5104   0.1953       2.02  R         [Arizona]

  4        0.8643  0.6561   0.2082       2.34  R  X      [California]

  8        0.8649  0.5997   0.2652       2.78  R          [Florida]

 33        0.4673  0.6438  -0.1765      -1.94     X        [Ohio]


R  Large residual

X  Unusual X
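The pieces of the 1960 output above fit together through the standard one-predictor identities, which can be checked with a few lines of Python (used here for illustration; the analysis itself was run in Minitab):

```python
from math import sqrt

# Consistency check of the 1960 Minitab ANOVA output, using the standard
# simple-regression identities: R-sq = SSR/SST and, with one predictor,
# the slope's t-statistic equals sqrt(F).
ss_regression = 0.07532   # Adj SS for Regression
ss_error = 0.44736        # Adj SS for Error
df_error = 46

ss_total = ss_regression + ss_error               # 0.52268, matching Total ≈ 0.52269
r_sq = ss_regression / ss_total * 100             # ≈ 14.41%
f_value = ss_regression / (ss_error / df_error)   # ≈ 7.75, up to rounding of the printed SS
t_slope = sqrt(f_value)                           # ≈ 2.78, the slope's T-Value

print(round(r_sq, 2), round(t_slope, 2))
```

The recovered R-sq, F, and t values match the printed Minitab table up to rounding.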



Regression Analysis: Factor Productivity (2004) versus Fertilizer Indices in 2004

 Analysis of Variance

Source                        DF  Adj SS   Adj MS  F-Value  P-Value

Regression                     1  0.1356  0.13556     2.12    0.152

  Fertilizer Indices in 2004   1  0.1356  0.13556     2.12    0.152

Error                         46  2.9435  0.06399

Total                         47  3.0791


Model Summary

       S   R-sq  R-sq(adj)  R-sq(pred)

0.252961  4.40%      2.32%       0.00%



Term                          Coef  SE Coef  T-Value  P-Value   VIF

Constant                    1.1049   0.0493    22.39    0.000

Fertilizer Indices in 2004  0.0184   0.0127     1.46    0.152  1.00

 Regression Equation

 Factor Productivity (2004) = 1.1049 + 0.0184(Fertilizer Indices in 2004)

 Fits and Diagnostics for Unusual Observations



Obs        (2004)     Fit    Resid  Std Resid

  1        1.7979  1.1305   0.6674       2.67  R     [Alabama]

  2        1.6304  1.1162   0.5142       2.07  R     [Arizona]

  4        1.5297  1.2817   0.2480       1.06     X  [California]

 13        1.3554  1.3211   0.0343       0.15     X  [Iowa]

 47        0.5777  1.1679  -0.5902      -2.36  R     [Wisconsin]

 48        0.5712  1.1103  -0.5391      -2.17  R     [Wyoming]


R  Large residual

X  Unusual X


[Figure: fitted line plot]


We could create a prediction interval for the 1960 data, but the low R-sq value indicates that this interval would be wider than desired.
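A sketch of how such a 95% prediction interval would be computed, using b0, b1, and S from the 1960 output above. The predictor mean (x_bar) and its sum of squares (sxx) are not shown in the printed output, so the values below are hypothetical placeholders, and t_crit ≈ 2.013 is the two-sided 5% t quantile for 46 error degrees of freedom:

```python
from math import sqrt

# 95% prediction interval for a new observation at x0 in simple linear
# regression. b0, b1, and s come from the 1960 Minitab output; x_bar and
# sxx are NOT printed there, so made-up illustrative values are used.
def prediction_interval(x0, b0, b1, s, n, x_bar, sxx, t_crit):
    """Return (lower, upper) bounds of the prediction interval at x0."""
    y_hat = b0 + b1 * x0
    half_width = t_crit * s * sqrt(1 + 1 / n + (x0 - x_bar) ** 2 / sxx)
    return y_hat - half_width, y_hat + half_width

lo, hi = prediction_interval(
    x0=1.0, b0=0.4969, b1=0.0499, s=0.0986168,
    n=48, x_bar=0.9, sxx=30.0,  # x_bar and sxx are hypothetical placeholders
    t_crit=2.013,
)
print(round(lo, 3), round(hi, 3))  # roughly (0.346, 0.747)
```

Even with these placeholder values, the interval spans roughly 0.4 units of the productivity index, which is wide relative to the range of fitted values in the output above.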

We may want to include other variables in the regression to see if we can better capture the variability of the response variable.

 Data for this brief exploratory analysis was gathered from the USDA website, specifically this page:






Sugarcane smut (“carvão da cana-de-açúcar”) susceptibility

I haven’t posted for a long time.  A lot of new things have happened within the last four months, forcing me to stop writing temporarily.

I began a graduate program in applied statistics two months ago.  Below is a very basic example of some experimental learning with R.

[Figure: Sugarcane smut R exploration]


How is the internet population distributed?

I would like to share here what I have read on this Mexican blog.

See: ¿Cómo se distribuye la población de internet?

Brazilian colleges, ranked

I now have eighty incomplete drafts sitting here gathering dust.  This draft will not be just another forgotten one!

Here I simply want to leave a link to a site that ranks Brazilian colleges by their specializations.  This will serve mostly as a resource for my future self.

The college rankings: here is the site.

Meanwhile, I am still experimenting with free online courses.  Does a person need to pay to really take a course seriously?  Last time I failed because I attempted to go it alone, without considering the importance of belonging to a supportive community.

Equally important, since time is limited and I do paid work around ten hours a day (including weekends), I will have to reduce distractions at work and in my personal life.  I will also have to make further changes to keep advancing.

The art of translation teaches us to be more humble

Earlier I wrote about an essay by Cláudio Almir Dalbosco of UNICAMP.  Here I want to continue exploring the themes the author introduced.

I am expanding a bit more on the second pillar, called “universal citizenship.”  Specifically, I want to write more about the exercise of translation and how it affects us as people.

Dalbosco proposes that while we are learning another language, we come to understand our own characteristic shortcomings.  At the start our interpretation will always be imperfect, which in turn teaches us a lesson in humility.

We become humble by recognizing that, being human, each of us faces our own limitations.  Without running the risk of erring, we can never learn another language.  The whole process is full of mistakes and misunderstandings.  As we learn from our errors, we also have to train both listening and speaking.

In Campinas I bought a book, Quase A Mesma Coisa, that discusses the act of translation.  One part of the text describes the need for discipline while translating, lest you “betray the intentions of the source text” (127).  It is wrong to make any attempt to enrich the text, even if the text might become more interesting or the expressed sentiment might be deepened in the second language.

Finally, you must stay focused on what you are translating to avoid any distractions.  Learning another language, and then translating between two, requires constant focus.  The process is full of doubts until the person becomes confident in learning by erring.


The Bet of Visión México 2030

From now on, the idea of adopting neo-Keynesian policies is going to matter in Mexico.  Many developing countries still hesitate to stop depending on neoliberal policies: those that adopt the new economic model are punished by the financial community of “first”-world investors and risk being labeled as pursuing purely nationalist interests.

Yet even in the United States, we accept neo-Keynesian policies without realizing it.  A country’s government plays a fundamental role in designing and carrying out strategies to foster the country’s development and improvement, whatever the level of “development” reached so far.

Above I have included a video in which the Mexican federal deputy Alberto Curi Naime discusses the ideas of Visión México 2030 at a meeting of the Grupo Visión Prospectiva México 2030.  The emphasis was placed on the importance of infrastructure in Mexico’s economic development.  They propose a new development model, or paradigm.

Here I will write a bit about some points that interest me.

There will be several project stages through 2018, with spending of approximately $7.7 trillion.  The plans call for building and modernizing 46 new highways (roads with two, three, or more lanes that allow motorists to drive at speeds up to 120 km/h) and 90 federal roads (single-carriageway roads that allow drivers to drive at speeds up to 90 km/h).

Also worth mentioning is the revitalization of the railway system, which has three axes: interurban, rapid, and transpeninsular.  Some commentators later criticized the plans for the Toluca-Mexico City interurban rail line; one argued the train should be considered “suburban” rather than “interurban,” since the interurban and rapid lines do not span cities that are truly far apart.

Another interesting point is the expansion of ports such as the Port of Guaymas in Sonora (see another article on this site about the port of Manzanillo).  Mexico will have four international-class ports.  Also interesting is the expansion of infrastructure in the south of the country, such as the construction of an airport in the city of Ixtepec in the Isthmus region.

The budget for this project will depend on the resources directed to finance it, and the fall in oil prices is an important variable to keep in mind.  Previously, revenue estimates for that resource came in below the income eventually obtained.  That is no longer the case: the price of oil has ended up below what was assumed to sustain the Mexican budget.

Another thing I liked was Naime’s definition of “efficacy” (1 hr 20 min): efficacy was defined as the point of delivering results.  Mexico has to find the balance between government efficacy and the completion and safety of any public work.

Finally, CEPAL also made available a document on Visión México 2030; here it is.  The end of that report lists a series of goals for the proposal.  Goal 13, “prosperity,” is that no Mexican live in conditions of food poverty.  See another article of mine that begins to discuss this topic in Mexico.