cluelessresearch.com

political methodology, brazilian politics, etc.

Archive for June, 2007

Breaking news: A college degree in Brazil is worth 4 fm radios more than high school!

ABEP, the Brazilian association of market research firms, has just approved the new "Brazil criterion", or CCEB. The CCEB is a standardized way to measure survey respondents’ consumption power by asking which consumption items they own and what the education of the head of the household is. ABEP argues this is better than just asking for an individual’s household income, particularly in countries with high inflation or black market economies.

The new CCEB was designed by regressing household income on a set of items and then trimming the list down using qualitative criteria. For example, computers are excluded because computer ownership is increasing so quickly. We don’t want items that are subject to large changes in a short period of time, since consumption power itself does not, in general, move very fast.

My first qualm with it is (you guessed it) methodological in nature. You see, the regression they estimated as the basis of the index regresses the log of household income on the set of items with no interactions whatsoever! This can’t possibly be the "best" regression they could find! And I am pretty sure it wasn’t. The underlying objective is to give interviewers in the field a way to categorize the "class" of the respondent as a filter (for quotas), and they probably think interviewers in Brazil know how to add but not how to multiply. ABEP criticizes the Mexican index, which uses a classification tree (which neatly allows interactions), for being too prone to error by interviewers. I would like to see the study showing this.
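To make the complaint concrete, here is a rough sketch of the two kinds of specification at stake. The notation is mine, not ABEP's: y_i is household income, x_ij is the number of units of item j that household i owns, and the second equation is just one possible way of adding interactions, not a model anyone has proposed for the CCEB.

```latex
% Additive specification, as described above: no interactions
\log y_i = \beta_0 + \sum_{j=1}^{J} \beta_j x_{ij} + \varepsilon_i

% One hypothetical alternative with pairwise interactions
\log y_i = \beta_0 + \sum_{j=1}^{J} \beta_j x_{ij}
         + \sum_{j<k} \gamma_{jk} \, x_{ij} x_{ik} + \varepsilon_i
```

Scoring a respondent with the first equation only requires adding up terms; the second requires multiplying item counts, which is presumably the step interviewers are not trusted with.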

So, what does the CCEB look like? Based on the regression, they created a point system. Thus, if you have one color TV you get one point, and if you have four you get four points. In addition, you get extra points for the education of the head of the household.
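A point system like this is easy to sketch in a few lines of Python. The weights below are hypothetical and are NOT the official CCEB values: I only assume one point per color TV (as in the example above), one point per FM radio, and an education gap of four points between college and high school to mirror the headline.

```python
# Hypothetical weights for illustration only; NOT the official CCEB table.
POINTS_PER_ITEM = {"color_tv": 1, "fm_radio": 1}
EDUCATION_POINTS = {"high_school": 2, "college": 6}


def cceb_style_score(items, education):
    """Add item points (weight times count) to the education points.

    `items` maps an item name to the number of units the household owns;
    `education` is the schooling level of the head of the household.
    """
    item_points = sum(POINTS_PER_ITEM[name] * count for name, count in items.items())
    return item_points + EDUCATION_POINTS[education]


# Under these made-up weights, a college degree and no radios scores the
# same as a high school diploma plus four radios.
college_no_radios = cceb_style_score({"fm_radio": 0}, "college")
high_school_four_radios = cceb_style_score({"fm_radio": 4}, "high_school")
```

Note that the score is purely additive: nothing in it lets the value of one item depend on owning another.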

Now excuse me, for I have to go to the store to buy some cheap radios…


Biologists and statistics

So, I am reading this interesting review by the University of Chicago’s Jerry Coyne of a book proposing the Intelligent Design (a.k.a. the new creationist) "theory". The book in question, by Lehigh University biologist Michael J. Behe, basically argues that random mutations cannot lead to the complex changes we observe in the fossil record. The reviewer then tries to clarify what they (biologists) mean by random.

What we do not mean by "random" is that all genes are equally likely to mutate (some are more mutable than others) or that all mutations are equally likely (some types of DNA change are more common than others). It is more accurate, then, to call mutations "indifferent" rather than "random": the chance of a mutation happening is indifferent to whether it would be helpful or harmful.

Indifferent?!? Come on! Wouldn’t it be better to use an actual statistical term for a, uh, statistical concept? Yes, like independent: [tex]P(A|B)=P(A)[/tex]. My dictionary informs me that indifferent can mean three things: a) "having no particular interest"; b) "neither good nor bad" (both would seem more fitting to an ID proponent, don’t you think?); or c) in archaic biology, "neutral in respect of some specified physical property". Our ability to create unnecessary new terms never ceases to amaze me.
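For anyone who wants to see independence with numbers, here is a minimal Python sketch. The counts are invented purely for illustration (they are not from the review or from any data set): A is "a mutation happens at this site" and B is "a mutation at this site would be helpful".

```python
from fractions import Fraction

# Hypothetical counts of genome sites, made up for illustration.
# Key: (mutated?, would a mutation here be helpful?)
site_counts = {
    (True, True): 2,
    (True, False): 18,
    (False, True): 98,
    (False, False): 882,
}


def prob_mutated(counts):
    """P(A): share of all sites that mutated."""
    total = sum(counts.values())
    mutated = sum(n for (is_mutated, _), n in counts.items() if is_mutated)
    return Fraction(mutated, total)


def prob_mutated_given(counts, helpful):
    """P(A|B) when helpful=True, P(A|not B) when helpful=False."""
    in_group = sum(n for (_, is_helpful), n in counts.items() if is_helpful == helpful)
    return Fraction(counts[(True, helpful)], in_group)


p_a = prob_mutated(site_counts)
p_a_given_helpful = prob_mutated_given(site_counts, True)
p_a_given_harmful = prob_mutated_given(site_counts, False)
```

With these counts all three probabilities come out equal, which is exactly what independence says: knowing whether a mutation would help tells you nothing about whether it happens.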


Another Stata rant

So, I am using my MacBook and suddenly it becomes really hot, and the fan starts running at full speed. Perhaps I am encoding some music or video? Or am I doing some fancy statistical analysis?

Not really. It is Stata waiting for me to press a key! WTF!! It’s been like this for as many versions as I can remember.


Regression plots

I am writing a paper with a coauthor promoting the use of graphs instead of tables in political science. We did some research on the current use of tables and graphs and found out that a substantial proportion of the tables is devoted to the display of regression results.

So we thought that creating graphs to display regression tables was essential to our task. That is, to turn this:

into a nice graph.
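The basic idea, for readers who cannot see the images: each coefficient becomes a dot, with a line around it for the confidence interval and a reference line at zero. Below is a rough text-only Python sketch of that idea. It is not the code behind our graphs; the function name, the variable names, the estimates and standard errors, and the use of a normal-approximation 95% interval (estimate ± 1.96 × standard error) are all assumptions made for illustration.

```python
def coefficient_plot(rows, width=41, z=1.96):
    """Render (name, estimate, std_error) rows as text dot-and-whisker lines."""
    intervals = [(name, est - z * se, est, est + z * se) for name, est, se in rows]
    # The axis always includes zero so the reference line can be drawn.
    lo = min(0.0, min(low for _, low, _, _ in intervals))
    hi = max(0.0, max(high for _, _, _, high in intervals))
    span = (hi - lo) or 1.0  # avoid dividing by zero in a degenerate case

    def col(value):
        # Map a value on the coefficient scale to a character column.
        return round((value - lo) / span * (width - 1))

    label_width = max(len(name) for name, _, _, _ in intervals)
    lines = []
    for name, low, est, high in intervals:
        cells = [" "] * width
        for c in range(col(low), col(high) + 1):
            cells[c] = "-"          # confidence interval
        cells[col(0.0)] = "|"       # reference line at zero
        cells[col(est)] = "*"       # point estimate
        lines.append(f"{name:<{label_width}} {''.join(cells)}")
    return lines


# Made-up regression results: (variable, estimate, standard error).
example_rows = [("education", 0.50, 0.10), ("age", -0.20, 0.15), ("income", 0.05, 0.02)]
example_lines = coefficient_plot(example_rows)
print("\n".join(example_lines))
```

Even in this crude form, a reader can see at a glance which intervals cross zero, which is the comparison a regression table forces you to do in your head.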

We are currently revising the paper for (re)submission, and are still undecided on how to display such graphs. Here are some of the several revisions, starting with one of the first (back in November):

This one is from later in the same month:

And here are the two I am currently looking at, which take out the boxes around the plots. This one is minimalist:

And the next one has the x-axis repeated in each plot:

My coauthor thinks the boxes are necessary; I cite Tufte over and over and say they aren’t. I think we will have to arrange an intercontinental boxing match to settle the issue.
