5: Measurement and Outcomes

Topics covered: Measurement and Outcomes

Key hypotheses
Primary and intermediate outcomes
Interpreting multiple outcomes
Theory of change Model
Questionnaire design
Data collection/entry

Instructor: Esther Duflo

The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high quality educational resources for free. To make a donation or view additional materials from hundreds of MIT courses, visit MIT OpenCourseWare at ocw.mit.edu.

PROFESSOR: So we now have, hopefully, I don't know. I've been going to some of the groups and we seem to be there for at least one of the groups I surveyed by the end of yesterday. We have a question. We know what we'd like to evaluate, and we have some kind of a design. So we know how we are going to get a sample. We know how we are going to randomize. And that's a good start.

So what we are going to do today is, given that we now have a way of randomizing, get into the details of how we are actually going to go about collecting the data for the evaluation. So it's not going to be so much about the program, it's going to be about thinking how we really want to do the survey. Thinking about what is the sample size we need, how many data points we need, and what data should we go about collecting, and how much is it going to cost, and what are the trade-offs we are facing when we are trying to address those questions. So we could be spending from one week to one year on these questions. In particular, on the survey design aspect. These are complicated questions. There are many people who are much more qualified, certainly, than me on this issue of how to ask the right question correctly. So we are not going to get into any of that.

It's going to remain at a somewhat abstract level. It's going to be both abstract and quite particular in a way. At the abstract level, it's how do we even think about the type of data we want to collect? And at the concrete level, we're going to do it for one particular example, which is this Panchayat example, which happens to be one that I know pretty well.

AUDIENCE: [UNINTELLIGIBLE]

PROFESSOR: Not making much progress, huh?

So what are the questions we need to know the answer to before getting ready to start our evaluation? One is what data should we collect? It's kind of a big question. What should go in the questionnaire? Anybody among you who's been involved in doing questionnaires knows that you cannot ask a million questions. And so how do we choose what to put in? And then, what systems should we put in place to ensure that the quality of the data is good? And what sample size do we need to plan? So we're not going to do sample size this morning. It has its own little lecture and a full set of exercises. So that's going to go this afternoon. We are going to be doing a lot of that and a bit of that today.

So let's start with the Panchayat example. So let's talk a bit about the setting. So the setting is a reservation for women in the Panchayati raj in India. So what's the Panchayat and what's the reservation policy? So the rule of the game: I don't ask rhetorical questions. So when I ask a question, someone needs to answer, otherwise we're stuck.

AUDIENCE: It was a program designed to decentralize the allocation of public resources in villages to the lower level. And at the same time, you wanted to ensure that minorities and scheduled tribes in India got represented somehow, and their preferences and opinions, at this Panchayat level.

PROFESSOR: That's right. So Panchayat stands for a council of five people. And maybe in some historical India there was this council of five people who made decisions for the villages. Or maybe not, but anyway, that's where the name comes from. And the idea, and that's not only India, it's something that you see in a lot of developed and developing countries today, is that decisions regarding local public goods such as drinking water infrastructure, the local roads, the buildings-- at least the buildings so, for example, schools, health facilities and such things-- would better represent people's needs if people had a say in what they wanted. So it's very difficult for a bureaucrat sitting in Delhi to know that such and such village in the middle of Bihar needs a road as opposed to a water well. And so you can get this vast misallocation of money in this way.

So the idea with the Panchayat is that now, even though the revenue collection is still pretty much not coming from the villages, where there is not much taxation ability, the spending decisions should be taken increasingly at this local level to ensure a better adequation, a better fit, between what you build and what people want. The problem with that, of course, is that when you have local decision making, the question is who is going to be controlling the power. So India, like many other countries, perhaps even more than other countries, has this history of discrimination against some particular groups.

So, for example, the scheduled castes, who are the former untouchables. The scheduled tribes, who don't even have a caste. And so the worry is this: there are enough people, strong people, in these groups that at the national level they can organize to make sure that there are policies that help these particular groups. Same thing for women. There definitely are very strong women in India. But in any local village, you might not have scheduled caste, tribal, or women members who are able to take responsibility to ensure that their group is represented. So they might never be represented, you might have some tyranny of the majority. So that's why the system also introduced this reservation concept, which is to ensure some representation.

So how does a reservation work? How did they put it in place?

AUDIENCE: They reserved a certain number of seats for women in each council.

PROFESSOR: Right. So there is one reservation at the level of the council. So we can say these are various councils. Each council has about, say, 10-12 representatives which represent a population of 10,000 to 12,000. And so the first reservation is within each council. How many seats for women?

AUDIENCE: One third.

PROFESSOR: One third. So each council gets one third for women. And what about SC and ST? What do they get?

STUDENT: In proportion to their population?

PROFESSOR: In proportion to their population. Not in that particular village, but in the whole district. So, for example, in Birbhum, there is about, I think, 30% of scheduled caste. So every village needs to have 30% of scheduled caste, even if they have much less. Except that if they have less than five, they're exempted. If you have no scheduled tribe in the village, you cannot have a scheduled tribe representative. So that's the first thing. So are we going to be able to evaluate the impact of having this policy which, within each council, gives one third of the seats to women?

AUDIENCE: No, because there's no control.

PROFESSOR: Right. That's going to be very difficult because there is no control. So certainly we won't be able to look at the impact of this policy on what the council does as a whole. You might ask whether the particular village where she comes from, or segment of the village where she comes from, gets different goods, possibly. But certainly you're not going to be able to compare the GPs [gram panchayats], since they are not different. So what is the second layer of the policy that's going to be more helpful?

AUDIENCE: The Pradhan?

PROFESSOR: At the level of the Pradhan. And how does that work?

AUDIENCE: It's like a rotation?

PROFESSOR: Right. So how does it work for any particular election?

AUDIENCE: It's like it gets sorted or something?

PROFESSOR: Right. So what they do exactly is that they rank them. So these GPs have a serial number, which is some number that they have had forever and ever. So they rank them by their serial number and by geographic unit. So they first rank them by block, and within each block they rank them by their serial number. And then they constitute three lists. One list of what they call general, one list of SC, and one list of ST. So the general list means it's not reserved for an SC or ST Pradhan. The SC list means it's reserved for an SC Pradhan, and ST means it's reserved for an ST Pradhan.

And the way they did this selection is that they have a table of random numbers, like a very long table, which tells you, if you need to reserve five GPs, you pick number 1, 15, 17, and 21 for SC. So they do this list this way. And then this gives them three lists, which are ordered by serial number. And in each list, they count 1-2-3, 1-2-3, 1-2-3. In the first election, all the GPs numbered one in all the lists got reserved for women. In the second election, again this goes through the same process, and all the GPs numbered two are reserved. And in the third election, all the GPs numbered three were reserved.

So in principle there is a rotation, though it is not necessarily one for one in the sense that if a GP happens to be number one on this list in the election in 2003, and number two in this list in 2008, they get reserved twice in a row. Possibly if they become number three in this list, they would do that three times in a row. That could happen, too. But twice in a row is pretty frequent, actually. So this is more hopeful because now we have some GPs that are reserved for women. Let's say those three are reserved for women. And some are not, in the same year. So now we can compare them. Compare the decisions that are made in those places, compare anything we would like to compare, and we have a chance to actually identify the effects. Yeah.
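The counting rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the official procedure: the function name `reserved_for_women`, the dictionary keys (`name`, `block`, `serial`, `category`), and the seven example GPs are all hypothetical, and the split of GPs into the general, SC, and ST lists is taken as given rather than drawn from the random-number table.

```python
def reserved_for_women(gps, election_index):
    """Return the names of GPs reserved for women in one election.

    gps: list of dicts with hypothetical keys 'name', 'block', 'serial',
         and 'category' (one of 'general', 'SC', 'ST').
    election_index: 0 for the first election, 1 for the second, and so on.
    """
    reserved = set()
    for category in ("general", "SC", "ST"):
        members = [gp for gp in gps if gp["category"] == category]
        # Rank by block first, then by serial number within each block.
        members.sort(key=lambda gp: (gp["block"], gp["serial"]))
        for position, gp in enumerate(members):
            # Counting 1-2-3, 1-2-3: position % 3 == 0 is "number one".
            if position % 3 == election_index % 3:
                reserved.add(gp["name"])
    return reserved


# Made-up example: four general GPs and three SC GPs across two blocks.
example_gps = [
    {"name": "A", "block": 1, "serial": 1, "category": "general"},
    {"name": "B", "block": 1, "serial": 2, "category": "general"},
    {"name": "C", "block": 1, "serial": 3, "category": "general"},
    {"name": "D", "block": 2, "serial": 1, "category": "general"},
    {"name": "E", "block": 2, "serial": 2, "category": "SC"},
    {"name": "F", "block": 2, "serial": 3, "category": "SC"},
    {"name": "G", "block": 2, "serial": 4, "category": "SC"},
]

first_election = reserved_for_women(example_gps, 0)
second_election = reserved_for_women(example_gps, 1)
third_election = reserved_for_women(example_gps, 2)
```

In this sketch the lists are the same in every election, so the rotation is exact. If the lists are rebuilt before each election, a GP's position can change, which is how a GP can end up reserved twice in a row.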

AUDIENCE: Can you compare if in the non-reserve you elect a woman, does that become part of the control group, or?

PROFESSOR: So what's your thinking on that? So two things could happen, right? It could happen that in the places that are reserved for women, they elect a man anyway in defiance? So you don't have that problem, actually. They elected a woman. And the other thing that could happen is what you're talking about, is that some of those places might decide to elect a woman anyway. In fact, about 7% of them do. So what do you think we do with those guys?

AUDIENCE: They're part of the control.

PROFESSOR: We keep them in the control, right? We're going to spend more time on that tomorrow, but if we did put them in the treatment, then we don't have a random selection anymore, because now in the treatment we have the places that are forced to have a woman, and that's random, plus the places that choose to, and that's not random. So we'll be keeping them here, and this is going to mean that it's going to look a little bit like an encouragement design like you saw yesterday. Which is, in the places that are reserved for women, 100% of them have women. In the places that were not reserved, 7% have women anyway. So the difference is not quite 100%.

So here we are going to focus on the impact of being reserved, and we're not going to be concerned so much about the impact of having a woman. So we are still going to always compare treatment versus control. Tomorrow you will see how, in fact, we can go from the impact of being reserved to the impact of having a woman, if we are willing to make a couple more assumptions.
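The arithmetic behind "not quite 100%" can be made concrete. The sketch below computes the difference in the share of women leaders between reserved and unreserved places, and then applies one standard scaling (the Wald estimator, which divides the effect of being reserved by that difference). The lecture defers this step, so the scaling shown here is an assumption about the approach, not a summary of it. The function names and the effect size `0.093` are hypothetical.

```python
def first_stage_difference(share_women_reserved, share_women_unreserved):
    """Difference in the share of places led by a woman, reserved minus unreserved."""
    return share_women_reserved - share_women_unreserved


def scale_by_first_stage(effect_of_reservation, first_stage):
    """Wald-style scaling: effect of being reserved divided by the first stage."""
    if first_stage == 0:
        raise ValueError("no difference in take-up between the two groups")
    return effect_of_reservation / first_stage


# Shares quoted in the lecture: 100% in reserved places, 7% in unreserved ones.
first_stage = first_stage_difference(1.00, 0.07)

# Made-up effect of being reserved on some outcome, for illustration only.
hypothetical_effect_of_reservation = 0.093
implied_effect_of_woman_leader = scale_by_first_stage(
    hypothetical_effect_of_reservation, first_stage
)
```

Because the first-stage difference is 0.93 rather than 1, the scaled effect is slightly larger than the effect of being reserved. Interpreting it as the effect of having a woman leader needs the extra assumptions the lecture mentions.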

So that's a better setting. We can now compare those places, even though the Indian government didn't do this beautiful experiment with an evaluation in mind. And they might have. And then they might contact you and say, hey, we have this setup. How should we go about evaluating it? And so that's the name of the game.

So I think we went through this maybe? We can spend a little more time on the goals. So what do we expect from the Panchayati raj? What will give us a first sense of what data we should think about collecting is knowing what the goal of the institution is. This big constitutional amendment took place in 1993. What was in the mind of the policy makers at that time?

AUDIENCE: That there would be more accountability to the interests and priorities of interested countries?

PROFESSOR: Right.

AUDIENCE: Greater women's empowerment?

PROFESSOR: So I don't know if it's a goal of the Panchayati raj amendment per se, but we could think it was something people had in mind in the reservation policies.

AUDIENCE: That program selection would be more aligned with the local preferences?

PROFESSOR: Right. Program selection might reflect local preferences. We should keep that in mind.

AUDIENCE: To increase the quality and quantity of public goods?

PROFESSOR: Right. We can think that, as a result of the accountability, there might be better and more public goods.

AUDIENCE: More participation in public life?

PROFESSOR: Right. We might also think that we might value the participation for its own good. I would think it's a good thing if people are involved in local democracy.

AUDIENCE: They vote more?

PROFESSOR: They vote more, participate in meetings, things like that. So that's kind of the sum of the objectives of the Panchayati raj, generally. And within that there was this reservation. And the reservation policy was, and still is in a sense, quite controversial. It is still controversial in the sense that the next step they are asking themselves about now, and in fact I think it has always been somewhat lurking in the background, is whether there should be reservations at other levels.

For example, for MPs. Like, should there be reservations in the parliament modeled on the one [UNINTELLIGIBLE] model, but akin to what we have in the Panchayat. And that debate is lively in India, but it's also lively elsewhere. Many countries have reservation policies. Last I counted, a hundred or so have some form of mandated representation for women. So it is a very prominent policy, and some countries are very opposed to that. For example, the US.

So what are the potential pros and cons of a reservation policy? Since our evaluation is meant in a sense to inform this debate, we should think about what are the questions that are in people's minds?

AUDIENCE: It's not democratic. It's not majority rule.

PROFESSOR: Right. So that's a disadvantage. So it's not democratic, I'll just put it that way. We are constraining people's choices. They might not like it. So that reduces their utility.

AUDIENCE: But there is this structural inequality, so you need to empower your women [INAUDIBLE] do it, [INAUDIBLE] this reservation [UNINTELLIGIBLE]. But at some point in history, you need to do it to push women farther.

PROFESSOR: So I see more than one thing in that comment. One is that in the short run we need to ensure women's representation. That's the short run. And I think what I also hear in your comment is that--

AUDIENCE: So many noises in the room. It's not working as it should.

PROFESSOR: Right. So one of the reasons why you might want to ensure women's participation is just because-- just like that. Not even because you think it will affect anything. Again, democracy here is an end in itself. You might think, well, democracy is not real because women are not participating. I'm interested in having women participating, period. Even if it doesn't change anything in the outcome, that's what makes democracy real. So that could be one outcome. So we are interested in that as an end in itself. And I think one thing that I overhear in your comment, which might not be there in reality, is also that maybe that will help jump-start the process, so that it would not be needed in the long run.

So it would ensure women's representation in the long run, and maybe change voters' views, such that in the long run the effect would persist. In which case you might go to a bigger and better democracy where everybody's actually involved. In fact, it might be good for men as well to avail themselves of half the population's talents, which now, for some reason, are shut out.

So that's a first set of things which concerns the participation itself.

AUDIENCE: So one of the advantages of ensuring women's representation is that they might then allocate more money to goods or services that benefit women's [INAUDIBLE].

PROFESSOR: So that is certainly something you hear a lot. So we allocate more money to goods and services which, sorry, can you repeat the end?

AUDIENCE: Which benefit women in [UNINTELLIGIBLE].

PROFESSOR: Which benefit women. Which might be good if-- you were saying that women were not represented adequately before, and all-- yeah.

AUDIENCE: And then there might also, I mean, there's been some literature saying that there's been a spillover effect for children and families, but if [INAUDIBLE].

PROFESSOR: That's also something you hear a lot. So the first one is an argument in favor of redistribution to half the population: there is a group that doesn't get very much in the normal system, and they need to get some part of the rent as well. And then that's sort of one step further, which is an efficiency argument. You say, as you do that, you will also get a different type of goods.

STUDENT: Increases a woman's empowerment.

PROFESSOR: Right. So as you do that you might also increase a woman's empowerment. I think it's a double positive, but I'll leave it like that. Empowerment. Which may be good again as a redistribution or because we think that women in the house will do things differently.

AUDIENCE: Maybe the same thing as empowerment, but maybe more measurable is that it claims to reverse the trend of increasing gender gap in the population?

PROFESSOR: So one of the possible outcomes would be gender gap. You mean among children or among adults?

AUDIENCE: Eventually it will be among the adults, but it can be noticed [INAUDIBLE] indicators would be like reduction in girl infanticide.

PROFESSOR: So do we think there is a direct route, through what women policy makers can do in terms of what type of public goods they can provide? Or do you think it's going to be all indirect, through doing stuff that benefits women, which increases the power of women, which means that they make decisions that are better for girls in the household?

AUDIENCE: And perception, it will quickly change in our society.

PROFESSOR: So there is an issue of perception as well.

AUDIENCE: [INAUDIBLE] that when you see [INAUDIBLE] it would be more [UNINTELLIGIBLE] to follow.

PROFESSOR: Right. So something that people have also talked about is, again, not just the aspect of what she might be doing. Just having the woman as the figurehead changes the image in a somewhat more permanent way. You don't just think, well, she's there because of the reservation. You see a woman in a position of power and that might change your view of what's possible.

AUDIENCE: As a disadvantage, you also have the potential of the women doing a poor job, and so the perception actually becomes negative, if it's perceived that the woman might be unqualified.

PROFESSOR: So the first woman might be unqualified, and that would worsen perception. And then the other thing that your comment could be read as is maybe they're not particularly unqualified, but they are perceived as being unqualified. Because they are perceived as being there because of the reservation policy. And, in fact, it makes it even more difficult for competent women to assert themselves. That's something that has been said a lot about affirmative action in the US. For blacks, for example, now every time you see a black person succeeding you're thinking, why? It's because they got some unfair advantage somewhere, and so that makes things worse.

AUDIENCE: Backlash.

PROFESSOR: So we can write it as backlash. That's a very good point. So that's a question that we can also ask. What's the perception of women, and women in power before and after?

AUDIENCE: We also create this backlash. It's my perception of it, too, but it's a different issue. It's not necessarily clear that women will, in fact, represent women better. It is possible that women will feel the need to compensate and therefore go out of their way to identify with men's issues, and go out of their way to not be perceived as a feminine candidate.

PROFESSOR: Right. So I'm looking for a term for that but I don't find one. Women may overcompensate and not represent women's interests.

AUDIENCE: An advantage of having women on the councils is that you bring more information that may benefit women but also may just benefit the entire society. Because you weren't getting this channel of information either directly from their communities, or from their constituents who don't feel comfortable with women [INAUDIBLE].

PROFESSOR: Right. There may be more information. For example, because women speak up more, and so they give their opinion on stuff, and there is stuff they might just know better about, even if they don't particularly care more about it. Everybody may have the same preferences, so there might not be a conflict. If you never hear from the women that the water well is blocked, then you may never think of fixing it, even though that's something that would benefit everybody ultimately.

AUDIENCE: An advantage would be building capacity in women for leadership for the higher levels.

PROFESSOR: Right. So we are now talking about spillover over time. That you have one woman one time, and people understand that they are good and then they can continue on. But it can also be, of course, different levels while you're building up a cadre of powerful women.

AUDIENCE: I would think that you could maybe get an income of that, right away, on women. We vote next year to put in a water well. That means that there's a lot more time to do my trading business, [UNINTELLIGIBLE] business, so that right away I might make more money.

PROFESSOR: Right. So one of the ways we can put it into women's empowerment, I'm going to add "through," so it could be through income. It could be through time that you generate, because now you spend less time collecting water. Or it could be through the role models as already discussed. But it's just the fact that I have the public good that is convenient for me, that I need. And presumably if women's power increased, they're going to do some kind of public good that's good for them, which might be these types of things that will free up time for them.

AUDIENCE: Kind of a counterpoint to the last disadvantage. It's also possible that because a woman doesn't have to compete against a man, they don't have to pander to men, and in a sense, overcompensate to men's preferences as much.

PROFESSOR: Right. So in a reservation-- we have too many advantages.

AUDIENCE: You can add a disadvantage also which is that, I think, this may be more indirect in the sense that when you're electing someone on the premise that they'll be able to deliver public goods to a specific constituency, you risk perhaps instituting a culture of paternalistic government that is elected just to deliver the pork to a specific constituency.

PROFESSOR: Right. And that is something that has been discussed a lot. For example, with the scheduled caste, you could have this shifting. So you go it's your turn, it's our turn, it's your turn, it's our turn, and then people are not really watching the scheduled caste guy because they understand this is the scheduled caste guy's time to take their turn, and vice versa. It might also go the other way in the sense that the scheduled caste people who were previously disenfranchised might now feel, well, we might as well get this guy to deliver really well for us while he is in power. Same thing for women. But the effect on accountability, it's a good point. Because since the whole Panchayat is about accountability to the people, the effect of the reservation system on accountability is, at best, ambiguous.

So we can say this is related to this first point, which is that it's not democratic anymore. So with any tampering with the democracy you might have stories going either way. But that's a disadvantage: the effect on accountability. You're now accountable to nobody. First, you are a lame duck. There is a very poor chance that you get reelected, so that's your point here. Which in a sense makes you free to pander to your guys, which is maybe what we wanted in this case, but on the other hand, makes you maybe less likely to deliver. Sorry, there was a--

AUDIENCE: Another potential disadvantage which is kind of still under whether or not women are in fact representing, is that if there's a culture of women not speaking up or not advocating, then you might end up with no representation at all.

PROFESSOR: Right. So there is a question of who is in charge. So it might be that when it's an election you elect someone, they're in charge. Here it's like maybe no woman really wants to run, so you pick any figurehead. It might go back to elite control. If nobody who's democratically elected is in a position to exercise the power, whoever is a natural leader will take it back. It might be her husband, it might be anybody else. It's a reasonably recent effort, this local democracy at that level. And it might be easy to go back to elite control. So who is in charge? And is there a risk of elite control? Which is of course not why we did this in the first place.

AUDIENCE: I guess it builds upon the last point: maybe overall less efficiency in elite administration functions because there may be resentment against the women, so less cooperation with them. So maybe there's more stalemate policies being passed.

PROFESSOR: Yeah. So that's a risk of stalemate. So unless you have a burning point, then we stop here because I have no space anymore on the board. So that's it, that's budget constraint for thoughts.

So there are kind of a lot of ideas, and so somehow we are going to want them organized in order to go and collect our data. So one thing we could think of doing is, oh, let's postpone the big reorganization of the thoughts. We can see that it can go all over the place, so let's go and collect a bunch of outcomes and see how it goes. So given all of what we have discussed here, what are the possible things that might be affected by this policy? I'd like to have a board. I can just write. Given all the discussion we had, what are the things that we think might be affected by having this reservation policy? A lot of these things already came up, but we can make a little list for ourselves.

AUDIENCE: Choice of which public goods--

PROFESSOR: So one is definitely the public goods. So the disadvantage I'm going to-- so outcomes-- oh, wow. I don't usually use boards, so I'm going to-- ta da. That has to be MIT. So we have a whole [UNINTELLIGIBLE], so one is clearly public good, and potentially we have lots of them. What are the public goods that we can see in villages?

AUDIENCE: Water.

PROFESSOR: Water.

AUDIENCE: Roads.

PROFESSOR: Roads.

AUDIENCE: Schools.

PROFESSOR: Schools.

AUDIENCE: Hospitals.

AUDIENCE: Health centers.

PROFESSOR: Small health centers, yeah.

AUDIENCE: Young centers.

PROFESSOR: Young centers.

AUDIENCE: Large buildings.

PROFESSOR: Large buildings, irrigation, biogas.

AUDIENCE: Electricity.

PROFESSOR: Sorry?

AUDIENCE: Electricity.

PROFESSOR: Electricity, potentially. Sanitation. So a long list of public goods that could go either way, so that's a long list. What else?

AUDIENCE: Perceptions about women.

PROFESSOR: Perception of women. So if you want we'll talk a bit more about how we measure perception about women.

AUDIENCE: Participation.

PROFESSOR: Political participation. So attendance at meetings, voting. And of course we have men and women. That might be different.

AUDIENCE: Better governance?

PROFESSOR: So that's kind of the same thing. So better governance. In practice that's going to be whether there is graft, maybe budget utilization. So some measure of corruption. Bribes. You didn't say it actually in the advantages or disadvantages. The women are less corrupt, more corrupt. I guess it came up in the accountability. What else?

AUDIENCE: Household income.

PROFESSOR: Household income. And while we are in the household?

AUDIENCE: Health education.

PROFESSOR: Health education. Any gender differences in these things, both for the household and for the kids. We were talking about gender discrimination within the household. So again it's like a long list of household stuff potentially. Was a woman participating in savings, groups, blah, blah, blah? There is both perception of women politicians, and we also discussed about perception of women in general. What else could we need?

AUDIENCE: Maybe greater employability of women?

PROFESSOR: Right. So that's going to be maybe an income and then employment. That's part of the long list of stuff we might collect in a household.

AUDIENCE: Social cohesion.

PROFESSOR: Right. So that's maybe political participation and social cohesion. Don't know how you measure that, but we can think about it later.

AUDIENCE: Sustainability. Or sort of decentralization of power. The majority that's been in the government, it's like a vicious cycle. They'll do everything to keep the minorities out of the governance, so if you had jump started this process by introducing 30% of women--

PROFESSOR: Right, right, right. So it's perception of women politicians, and also women's future electoral success. Is it the case that, once a place is reserved for a woman, obviously you have a woman. How about the next time, when it's not reserved anymore, do you have more women candidates? Do you have a woman elected?

Women's future electoral successes. That's one of the first things we had discussed. More? I think it's a pretty long list already. And so now that we have this long list, the issue is what do we do? Suppose you had no money problem. Suppose budget was not a problem. Would you just take this long list? So one possible thing is to say, well, we'll think about this thing later. Let's make a long list of outcomes, and go and collect data. So we're going to do a household survey and a community survey, and then an audit of what's there in the villages, and we are going to collect a bunch of data. And then this data will come back, and we'll start to look at it. So what are the pluses and minuses of that approach?

AUDIENCE: [INAUDIBLE] statistical significance. If you collect so much data, eventually you'll randomly stumble upon a result, and maybe that's the result that you end up reporting.

PROFESSOR: Right. So in particular, since we have seen at least some of the groups starting to discuss power, you've seen hypothesis testing. So if I ran a hundred regressions where there is no true effect, and independently looked for significant results in each of them, how many of them would come out significant at the 5% level?

AUDIENCE: Five.

PROFESSOR: Five, on average. So if I run 20, I would expect about one, and getting two would not be surprising. So for example, suppose I collect 20 public goods, which is not such a large number. And we find that water wells go up in places which have more women, and irrigation goes down, and nothing else changes. What can I conclude if I have just gone on this big fishing expedition?

AUDIENCE: [INAUDIBLE]

PROFESSOR: Well, you don't know whether it was just random. You can definitely make up a story. What is a story you could easily make up on the basis of those results?

AUDIENCE: That women invest in wells because collecting water used to be a woman's job, and men invest in irrigation because farming is a man's occupation.

PROFESSOR: Exactly. So you could say, well, women are not going to benefit from irrigation anyway, they're going to benefit from drinking water a lot. So this is quite consistent with what we were saying about women leaders investing in the goods women want, and so that's great. The only problem is, if we've done that ex-post, someone else could say, well, an alternative interpretation is you've run 20 regressions, you've found two significant, and you're making up a story ex-post to explain the results.

It's not that it's morally wrong to do that, but there is always going to be some suspicion. And then it's a little bit sad to have spent so much money collecting so much data and not to be very sure how to interpret the results. So maybe that's not the right approach. Maybe we need to do something slightly different, and that is we need to try and put a little bit more thought into why we are collecting each piece of data and where it's going to fit in our global explanation of the results.
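
A minimal sketch of the arithmetic behind the fishing expedition, assuming 20 independent tests at the 5% level and no true effect on any outcome. The names `ALPHA`, `N_TESTS` and `simulate` are illustrative only.

```python
import random

ALPHA = 0.05      # significance level used in the discussion
N_TESTS = 20      # e.g. 20 public goods, each tested separately

# Expected number of spurious "significant" results when nothing truly changes.
expected_false_positives = ALPHA * N_TESTS

# Chance that at least one of the independent tests comes out "significant".
prob_at_least_one = 1 - (1 - ALPHA) ** N_TESTS


def simulate(n_studies=10_000, seed=0):
    """Share of simulated studies with at least one spurious result."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_studies):
        # Under the null hypothesis each p-value is uniform on [0, 1].
        if any(rng.random() < ALPHA for _ in range(N_TESTS)):
            hits += 1
    return hits / n_studies


print(expected_false_positives)
print(round(prob_at_least_one, 3))
print(simulate())
```

Under these assumptions roughly one spurious result is expected per study, and well over half of such studies would turn up at least one, which is why a story built ex-post on one or two significant goods invites suspicion.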

So if you take this specific example, instead of doing what we just supposed we did, which is making this long list of results and hoping for the best, suppose we had said we are going to go after this one question. Potentially we can have more than one big question. But we are at least going to go after this one question, which is: is it the case that women leaders do what women want? And suppose we had given ourselves the means to first find out what women want for real, and not make it up from the results.

The key is to try to not be in a position where you're going to have to reverse engineer your explanation of the result ex-post. If you want me to be completely honest, it's very difficult not to do that. Because once you see the results you always want to explain them to yourself, and explain to others. So I'm not about to tell you that from a position of where I sit and say well, don't you do that or you will be damned and go to hell. But the truth is that we do it, but the truth is the less we have to do it the better. And you have to do it less if you've been thinking ex-ante about what it is that you want to test, and what are the steps that align themselves in order to get to where you want to be.

And this result might have been totally fine if we had a good way to say this is what women are really interested in, and we had collected data on that. And then our test would not be, good by good, is it the case that women do different things. Instead we would ask the question: in general, do women go in that direction? And what is missing in that case? We're missing a model explaining to us why it is. How do we go from this hypothesis we have, that women are going to do what women want, to how it is going to translate into the public goods?

So the hypotheses to test must be defined before the beginning of the experiment, or we don't know how to assess their validity. And what we missed when we did this big list of outcomes, or what we would have missed if we just did this big list of outcomes and then went out and collected the data and then thought about how to interpret them, is these steps. That is, starting from a discussion which is very likely at an implicit level at the back of our minds. In fact, now it's very much in the front of our minds since we just had it. But usually when you do an evaluation, everybody sort of has that in mind, but it might remain implicit, and it's much better to make it explicit. First because you're more likely to collect the data that you actually will need. For example, here we might have missed collecting women's preferences. It's not in the list. It's not in the list of data that's here. Why? Because it's not an outcome. So if we're just thinking about, oh, what are the effects, we forget to collect women's preferences. But then we have no good way to interpret the results in the context of that model, if that was the model.

So if we don't do this thinking explicitly, we might be missing a key step. In particular, what we call the intermediate variables, which might be needed not as a measure of the impact of the program, but as what is going to help us interpret the impact of this program. So you need to try and define the hypotheses before the beginning of the experiment. And this is actually an exercise that's very useful in my experience working with implementation partners. It's very useful for both sides. Because from the side of the evaluation team, you strive to understand what the program you're evaluating is, and how it connects to what you know about, say, development, poverty, et cetera. From the side of the partner, many of you have more experience than me on that, at least the half that are actually into implementation. It's like, why are we doing this? Sometimes the answer is not as forthcoming as you might hope.

AUDIENCE: The other thing that was noticeable to me when we were doing this is that there are different levels maybe of importance, or that a lot of them are subsets of other things.

PROFESSOR: Exactly. So one of the things that you might do in this process is to prioritize what is sort of the big news, sort of the million dollar question? What are subsidiary? What are things that are going to enlighten whatever impacts you find? So you might also, we are going to do that in a minute, is to think depending on how much money you have-- is it that good? So sorry. We are trying to do something.

Given how much money you have, you might be thinking of some small things, or some things that might-- for example, say one of the outcomes we are thinking of is relative mortality of girls. You might think that, realistically, two years after the women's empowerment, a change is unlikely to have happened. So if you had an infinite amount of money, you might still collect it as a subsidiary outcome, but you wouldn't want it in a long list up there with whether or not they invest in water wells. There's no way to do this prioritization unless you have gone through the exercise of thinking through your causal model, linking whatever intervention you have to the results.

AUDIENCE: In this discussion I believe you also raised the possibility of doing like a factor analysis to create an index of different factors so that we could reduce the number of variables.

PROFESSOR: Right. So that's a very good idea. But on the other hand, you have to think about how to-- your instinct is the right one, which is to say how are we going to combine this stuff? So if you take, for example, the 20 public goods. If your model is that these women are more efficient, they're going to build more goods, then you might want to do that, or you might want to sum them or to average them somehow, which is what making an index is, to see whether in general they all go positive. But if your model is along this line, which is that women are doing what women want, then you might not get more goods. You may get more of some, and less of others, so you might need something other than an index to deal with this mess.

But you're exactly right with the idea that what we want is a way to not have many, many, many hypotheses, but fewer. So, for example, it can be one, which is that a woman does what women want. That would be the first one, and then what comes out of that. If that's your hypothesis, the index you create would be something different than what you would get from a factor analysis. If, on the other hand, it's an education intervention, and you have a math result, and an English result, and a science result, and [UNINTELLIGIBLE] results, and geographic results, then you know that they should all go up. Or at least your hypothesis is that they should all go up, and then you can do something like that to average the effect across the outcomes.
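
One way to average the effect across outcomes that are all expected to move in the same direction can be sketched as follows. This is an illustration under stated assumptions, not the method used in any particular study: each effect is expressed in control-group standard deviations and the effects are then averaged with equal weight. The function names and the scores are made up.

```python
from statistics import mean, stdev


def standardized_effect(treated, control):
    """Treatment-control difference, in control-group standard deviations."""
    return (mean(treated) - mean(control)) / stdev(control)


def average_effect(outcomes):
    """Average the standardized effects across all outcomes.

    `outcomes` maps an outcome name to (treated_scores, control_scores).
    """
    return mean(standardized_effect(t, c) for t, c in outcomes.values())


# Made-up test scores, purely for illustration.
outcomes = {
    "math":    ([52, 55, 61, 58], [50, 53, 57, 56]),
    "english": ([64, 60, 67, 65], [61, 59, 66, 62]),
    "science": ([48, 51, 55, 50], [47, 49, 54, 50]),
}
print(round(average_effect(outcomes), 2))
```

Standardizing first keeps a subject with a wide score range from dominating the average. This only makes sense when the hypothesis is that every outcome goes up, which is the education case and not the public goods case.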

AUDIENCE: I understand why you don't want to throw in 20 different outcomes. But in terms of how you prioritize the outcomes along with their causal model, there are some fields where it's going to be fairly clear. You know, in the health field, they would say, OK, well, our objective is to reduce malaria, so we didn't reduce malaria. But if something that has more intermediate steps along the way, such as a--

PROFESSOR: What was that about?

AUDIENCE: --such as like a conflict mitigation program, where you're saying, OK, we are going to get children into clubs because we believe that this is going to create more solidarity among ethnic groups, and that this is going to lead to conflict reduction. When you have, like, six or seven different steps along this causal model, how and at what point do you prioritize, this is the impact that we're looking for, versus these are the steps that we have to take to get to this impact?

PROFESSOR: So that's a great question, and you can answer it at two levels in a way. One is, where and when you're going to see an impact first. For example, in this case, you might think that ultimately what we care about is not the number of water wells. Ultimately what we care about is the growth of the Indian economy through these complicated channels. But this is not what we are trying to measure, because we think it's going to take a bit more time to get to this question. Or ultimately what we care about is girls' mortality. And we're thinking that through this complicated path we're going to get there eventually, but this is not the yardstick by which you measure the success of the program, because it takes too long, and too many other things will have diluted it by then. But at least you want to know that you're going in the right direction, so you might decide to stop at the water wells for the beginning.

Likewise in your program, you might say you're going to put youth into groups to work together. You might say, well, our first measure of success with this program is whether or not the youths' opinion of the other guys has changed in a way that we can reliably measure. And then you're going to ask yourself the question of the measurement, and then you can take it at various steps. And one is, am I still going to find an effect after all the things that have happened? So is it fair to my program? And the other is what can I measure? Some things are easier and some harder to measure, and they might be at different levels in the chain. It might be that what is happening first is easier to measure, sometimes it's what's happening a little later that's easier to measure. For example, perceptions might be very hard to measure. But whether or not people are willing to work together to build something might be very easy to measure, so you might go for that.

AUDIENCE: And at the same time, you do want to make sure that you're measuring something far enough along the chain that it is actually an impact, and not just an output measure, or--

PROFESSOR: Exactly. So this is where this conversation's going to be helpful. What is our program hoping to do, and how. And sometimes what you realize in that conversation is that what the program is hoping to do is so squishy that maybe we should do something else. But that's not the case of the program that you just described, which sounded actually quite specific in what it was hoping to achieve.

So here's our example. So we have to define the hypothesis. So, for example, we can take here, we could take several. One possibility is to take this one which is public goods favored by women are more likely to be chosen by women. So to test that, we need to know what women want, and then we need to collect the data on public goods. And then the one thing we are going to do is, we are creating this index of wanted by women more than by men, which is the way we would aggregate. And then the key test would be are the public goods moving towards what women want? And so the one little snag is how are we going to measure women's preferences? What is it that women want? So what do you guys think? What would be possibilities to measure a woman's preferences?

AUDIENCE: Household surveys?

PROFESSOR: So we could ask them, in household surveys. That would be one way. What they care about. Any other ideas?

AUDIENCE: If there were communities where women were well represented, what were the choices of those communities, historically?

PROFESSOR: Well, it's slightly circular because--

AUDIENCE: [INAUDIBLE] that have a majority of women on the local council have higher percentage of wells.

PROFESSOR: Well, yeah. That's a little bit circular, because that means we can do that only if, A, we believe that it is in fact true that women leaders do what women want, which is what we are trying to test. And B, if we think there was no selection in which village elected a woman, and we don't believe that, which is the whole reason why we are going through this tortuous exercise. So that might--

AUDIENCE: I know in the exercise it talked about the transcripts from some of the meetings, so you could look to see if there were patterns for what issues were raised. Whether or not the council voted on them, it was what they brought to the table.

PROFESSOR: Right. So you could do that. In fact, this is what we did in this case. Cheaper than a household survey is to say, in general, what do women ask about in those meetings? Preferably, not in places where the woman is the head, because that might have changed the dynamic, but in places which are not reserved. What are women talking about? What are men talking about? And the advantage over a household survey is that it's a little bit costly to go and speak up about something. In a meeting, you speak up and other people look at you. If you have to go to the leader and make a written complaint, it takes some time, and people wouldn't do that unless they cared about it. So maybe that's one way to proceed. But a household survey is also a good way to proceed.

AUDIENCE: But if you're measuring whether women represent women, like you've assumed at that point, but if you're looking at the discussion within the council, you've assumed at that point that the woman that was selected--

PROFESSOR: Alright.

AUDIENCE: --for women.

PROFESSOR: Alright. So I was thinking of not the small council, but the big council when they have this big meeting. What she was referring to is the transcript of the [UNINTELLIGIBLE] meetings. Or even when women have gone to the office of the Pradhan and asked them. So those are different ways, but the key is, we need to have that. Because then we can see whether it goes in that direction. And now we don't have 20 public goods, we have one hypothesis.
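
The step from 20 public goods to one hypothesis can be sketched as a preference-weighted index. This is a minimal illustration, assuming preferences are measured by counts of requests or complaints from women and from men in unreserved villages. The function names, the weighting formula and all numbers are hypothetical, not the ones used in the study.

```python
def preference_weights(requests_by_women, requests_by_men):
    """Weight each good by how much more women ask for it than men.

    Inputs map a good to the number of requests about it.
    The weight is women's share of requests minus men's share.
    """
    total_w = sum(requests_by_women.values())
    total_m = sum(requests_by_men.values())
    goods = set(requests_by_women) | set(requests_by_men)
    return {
        g: requests_by_women.get(g, 0) / total_w
        - requests_by_men.get(g, 0) / total_m
        for g in goods
    }


def womens_preference_index(investment, weights):
    """Collapse many public goods into a single outcome for one village."""
    return sum(weights[g] * investment.get(g, 0) for g in weights)


# Made-up request counts and investments, purely for illustration.
women = {"water": 60, "roads": 30, "schools": 10}
men = {"water": 30, "roads": 30, "schools": 40}
weights = preference_weights(women, men)

village = {"water": 5, "roads": 2, "schools": 1}
print(womens_preference_index(village, weights))
```

The single test is then whether this index is higher on average in reserved villages than in unreserved ones. Goods that women and men request equally get a weight of zero and drop out, and goods that men request more count negatively.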

AUDIENCE: Is it necessarily critical that what women are speaking about when they go visit to large meetings that's what they want, and what the men are speaking about, that's what they want? What if a man is speaking on behalf of his wife's interests?

PROFESSOR: Right. So it is not entirely clear. And so this measure of women's and men's preferences might not be the best you can think of. The key is to have a good one. I think you're right, which is it could be that women never speak, for example, or never complain about anything, and they would still have preferences, so we would have to figure something out to get them. To the extent that both genders speak, then you might think that they are speaking about what they care about, as long as the household is not a fully harmonious unit. Or you might think that it's not that, and that, for example, women tend to talk about water just because they know more about water, as someone suggested earlier. And then it doesn't represent needs, it just represents relative advantages.

So this particular way of measuring preferences might not be the best one, but the key is to have one. More than one would be even better. In this case, that's the only one we had, so it was a bit of a gamble. But the key thing is that we want to do that. Then there are other things we might be interested in. So once we have that, do women invest more in the public goods women prefer, that can kind of be our study number one. Which is, yes, we are showing that women invest more in the public goods that women prefer. And then we can ask the next question.

Then there was this other question that also came up, a completely different question, which is, does having a woman as a policy maker change the perception of women as policy makers? And that we can think of as a separate hypothesis altogether that can be tested separately. And it could be that women as leaders do different things and do what women want, but that there is no lingering effect of having a woman because that doesn't affect people's preferences or because there is a backlash. So we can think of these two as different bins. We can even do one study and not the other, or both of them. But they are kind of separate tracks along which to go.

So if we take a minute on people's perception, do women affect the perception of voters of women leaders? How would you go about thinking about this? How would you go about trying to measure people's political preferences?

AUDIENCE: Asking questions about the role of women in the household, or women in the community.

PROFESSOR: So you can ask questions. So what is an issue with asking questions in this context? Do you like women leaders? What do you think of--

AUDIENCE: People might feel like there's a right answer.

PROFESSOR: People may feel there is a right answer. And what's the right answer, do you think, in this context?

AUDIENCE: I see that you can tell the story right away that it's important to believe in equality between genders. Or that if you're a male, in front of your other male friends you want to appear like a tough guy.

AUDIENCE: Or who's asking the questions.

PROFESSOR: Or who is asking the questions. Or you might think that it's the right time to send a message to say, those people from the capital who are imposing this woman on us, better tell them that really we don't like it. So it could really go either way, we don't know. And in a sense that is something that is interesting as well, to ask what people are willing to reveal to you. In which direction is it biased? So then we would need some measure of true preferences, of willingness to consider women as policymakers.

AUDIENCE: Their voting preferences in the next cycle?

PROFESSOR: Yeah. So it seems that the litmus test here would be voting. Which is, after one cycle of reservation, or two cycles of reservation, are women more likely to be elected? And that seems like it would be the easiest place to start, or at least to end, which is, does it make a difference? And along the way, if you find that, you can think, well, maybe this is because of various things. Maybe women started to develop their networks. Maybe women figured out that they could do this. So if we wanted to know that it's really the change in the perception of women as policy makers, you would try and get a measure of perception. Again, more to illuminate the endline result, and to try to get the measure of pure, just one sort of--

AUDIENCE: Could you also-- so there are women elected to these councils.

PROFESSOR: Yeah.

AUDIENCE: Could you also though measure the women's ascendancy to other kinds of more management posts, like at school level, or at community groups or something like that, as stepping stones on the way to that?

PROFESSOR: Right. That goes into this margin on women's participation. So you're thinking [STUDENT NAME] earlier was thinking up, like do we see more women--

AUDIENCE: At higher levels.

PROFESSOR: --going at higher level. Or you could say do we see more women in position of power within the village.

AUDIENCE: Sink or swim.

PROFESSOR: School headmasters, things like that. School council, other things like that. Going back to the perception question, do you have an idea of how we could go about measuring people's perception of women as policymakers?

AUDIENCE: We could have an index of the satisfaction they have with investments that women have made, and try to relate it to how much they like or dislike the woman who decided about this investment?

PROFESSOR: So we could try satisfaction of the goods. Thinking that in particular if we had an objective quality of the goods, we could say that if their satisfaction of the goods is lower even though the goods are the same, it indicates that they like women less. But that requires having a very good measure of the quality of the goods. Otherwise, someone could always say they are worse in some dimension that you didn't observe.

AUDIENCE: We could do some kind of hypothetical question that involves psychology, or you're trying to--

PROFESSOR: So yeah, great. Continue in this direction.

AUDIENCE: Where you're interviewing people or doing surveys, and you asked them to imagine a scenario where you have two different candidates, and you give them the backgrounds and one's a man and one is a woman, and ask them who they would pick. I mean, that would be a poorly disguised question--

PROFESSOR: If you have the two candidates and you ask them to compare, you might get into the same issue. But just go one step further with this same idea. You're almost there. Or at least that's one way of doing it.

AUDIENCE: You could present two hypothetical CVs to people and say, would you like to vote for this person? But on one of the hypothetical CVs, make it up as a woman, and the other make it up as a man, but they have all the same qualifications. And just see if there's a sense that people would vote more for one versus the other.

PROFESSOR: Exactly. You are [? interpreting ?] values to different people. Nothing forces you to present the same questionnaire to every person. You could randomize within your questionnaire what you're going to show them. So you're going to show them, for example, exactly the same CV and ask them, are you going to vote for this person? Or you could present a scenario saying, so this and this happened, and there was a choice to be made, and they decided to do this and what do you think? Was it a good decision? And in one case it's like Mr. So and So decided to do this, and in one case Mrs. So and So decided to do this. If you ask both, you're exactly right, it was poorly disguised. But if you ask only one, people will answer the question. And they don't have enough information to really judge the person fully. Especially if you give them a small scenario and then ask them, will you vote for them?

So then they would bring in, presumably, whatever else, their other views on the person. So then you will have to compare across surveys whether people tend to rank more highly the survey that has Mr. So and So, versus the survey that has Mrs. So and So. The way we did it in the follow up study to the one that is in the case is we actually taped speeches. So we had speeches that someone had given, and we had a bunch of women record the same speech. And we then played a tape and said, so what do you think? So the advantage is, we don't even have to insist that it's Mr. So and So, it's just, would you listen to this speech. And the voice immediately tells you, and then you can compare people's answers. And so now you have to compare people's answers to this question in the villages that were reserved and the villages that were not reserved, and see whether any gap changes.
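
The comparison just described can be sketched in two steps: randomize which recording each respondent hears, then compare the male-female rating gap between reserved and unreserved villages. This is a minimal sketch with hypothetical function names and a hypothetical rating scale, not the study's actual procedure.

```python
import random
from statistics import mean


def assign_voice(respondent_ids, seed=0):
    """Randomly assign each respondent a male or a female recording."""
    rng = random.Random(seed)
    return {rid: rng.choice(["male", "female"]) for rid in respondent_ids}


def gender_gap(responses):
    """Mean rating of the male voice minus mean rating of the female voice.

    `responses` is a list of (voice, rating) pairs, one per respondent.
    """
    male = [r for v, r in responses if v == "male"]
    female = [r for v, r in responses if v == "female"]
    return mean(male) - mean(female)


def change_in_gap(reserved, unreserved):
    """Gap in reserved villages minus gap in unreserved villages.

    A negative value means the bias against the female voice is smaller
    where the Pradhan position was reserved for a woman.
    """
    return gender_gap(reserved) - gender_gap(unreserved)


# Made-up ratings, purely for illustration.
unreserved = [("male", 6), ("male", 7), ("female", 4), ("female", 5)]
reserved = [("male", 6), ("male", 6), ("female", 5), ("female", 6)]
print(change_in_gap(reserved, unreserved))
```

Because each respondent hears only one version, nobody is asked to compare a man with a woman directly, and the gap is recovered only by comparing groups.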

An interesting fact is that then you can correlate that to what they actually tell you and see whether, for example, it might be that the real gap doesn't change, but what they tell you narrows, or it could be the opposite, if they're trying to signal their dislike of the policy or something like that. And at the end of the day, all of this leads to sort of a final hypothesis of, did this work, in a sense. Which would be, what are the vote shares for women in politics? Or if you were less ambitious, something like, what is the share of women who are represented in positions of power within the village, as you suggested earlier.

So that's something for day two because we were on measurement of outcomes. I just wanted to give you one example to show that there is no lack of ways we can collect outcomes. That's one way. Going back to the example of, do women leaders better represent the preferences of women. So what we try and do with an evaluation is to find out not just whether the program is effective, but also why. It's really more helpful, both because it will enrich our understanding of the program and because it makes it easier to draw more general lessons. Because for example, this program you evaluate in West Bengal and someone can say, well, we need it to work in Rajasthan. So one way is to go and do it in Rajasthan. But then someone can say, well, it worked in Rajasthan, too, but we need it to work in South India, and you cannot replicate everywhere, everywhere.

So eventually you want to have the ability to say something. It's not necessarily that we'll go with some hypothesis, but to say something about what is your take, at the end of the day, on the reasons why this program will have such and such effect. So if you say, well, what I think is happening is that when there is a reservation for women, they are doing stuff that women prefer. Then you can say, well, in West Bengal what they want is water so that's what they'll do. In, say, Tamil Nadu, where water's not so important-- well, it is very important to me-- so in any place where it's not so much of an issue, I guess that wouldn't be India, then they'll do something else.

So as long as you tell me what women care about, I can tell you that it's going to go in this direction. It has the advantage that it makes the replication more interesting. If all you can say after West Bengal is that some goods go up and some goods go down, and then you replicate in Rajasthan and some goods go up and some goods go down, it's like, well, maybe you didn't even need to do the evaluation, because you would probably find that anyway. Whereas if after the West Bengal study you say, whatever I find from this, however imperfect, way of collecting preferences is what women prefer, and this is the direction it's going to move in Rajasthan, that's a new level of test that you are willing to subject yourself to.

So, in fact, in this case it's exactly what happened. Because we first did West Bengal, and we found that women preferred water, according to the measure of preferences, and goods went to water. And men preferred schools, which was surprising to us and others, but that's the way it is, and the goods went to schools. So now for Rajasthan, we say, well, we are going to do the same thing. Find out what women want, what men want, in the same way. It's going to go in this direction. And we find that there in Rajasthan, women also prefer water, but men love roads. So we should have fewer roads. More water, fewer roads. While in West Bengal, women also like roads, for the reason that they work on them. They are the people who build the roads, so it's an employment activity for them.

So it's interesting because we have different predictions. Our prediction is that the road will go up in West Bengal where you have women reservation, and they will go down in Rajasthan. We know why, because it's related to the thing. And in fact we can make this leap of faith beforehand. And so that makes it much more powerful once you replicate, rather than I'll replicate as I can. To say I will replicate with a good sense of what I'm expecting to find.

Which brings me to the same thing we were saying, which is that it's very difficult ex-post not to use the data to learn more than what the hypothesis was at the beginning because, you know, it's sad. So you could think about it in a more constructive way, which is you could think, well, this is what my hypothesis was in the beginning, this is what I find. In addition, I have also these interesting, tantalizing tidbits of results. I'm putting them in front of you, admitting that this was not my hypothesis to start with, but I am suggesting that it should be the hypothesis of the next study.

I'll give you one very good example of that. There was one project by David McKenzie from the World Bank, and Suresh de Mel, who works in Sri Lanka, and Chris Woodruff at UCSD, and they were interested in the return to capital for very small entrepreneurs. So what they did is they gave people in Sri Lanka a grant. Small entrepreneurs, people who had about $200 of working capital, and that's just what we call a helicopter drop of money. So you get a $100 grant or a $200 grant. And they did that, and at first they looked at the average and they found great returns to capital. Very high returns to capital, on the order of 5% a month, so very beneficial to give people the money. So very high return to capital. Great.

And then they decided to look at it separately for women and men. And oh, surprise, they found no return whatsoever for the women, and a huge return for the men. So you can say, sorry, it was not in your original design. It was not stratified by gender, so we have really no intents that you are-- so we have to throw this result away. Of course, we don't want to throw this result away, because it's so surprising and striking that we kind of want to think about it. So what's the idea? You write this up being very explicit that we found this ex-post, but it seems really robust. We are going to go and do a new experiment, so we could redo it in Sri Lanka or do it somewhere else. And in this case, our hypothesis, the H0, is that the returns to capital are the same for men and women, and that's what you are trying to reject.
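
A minimal sketch of what testing that H0 could look like in the follow-up experiment, assuming the returns for men and for women are estimated on independent subsamples, each with its own standard error, and that a normal approximation is acceptable. The function name and the numbers are hypothetical and are not results from the Sri Lanka study.

```python
import math


def z_test_equal_returns(effect_men, se_men, effect_women, se_women):
    """Two-sided z-test of H0: the return to capital is equal for both groups.

    Returns the z statistic and the p-value under a normal approximation.
    """
    diff = effect_men - effect_women
    # Standard error of a difference between two independent estimates.
    se_diff = math.sqrt(se_men ** 2 + se_women ** 2)
    z = diff / se_diff
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)), the two-sided p-value.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical monthly returns and standard errors, for illustration only.
z, p = z_test_equal_returns(0.10, 0.03, 0.00, 0.04)
print(round(z, 2), round(p, 4))
```

The point of declaring this test before the new experiment is that it is a single pre-specified comparison, so it does not carry the suspicion attached to a subgroup split found ex-post.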

So these things are good to think about. These evaluations are part of a process, we are not alone. A lot of people are working on this, there will be replication either by you or by others. And it helps to be explicit up front about what was your first hypothesis and your current model, and what it is that you found out as well. Our goal is to not get mixed up, and at the same time not to lose the information that is going to be useful in the future.

So we stop at 12:00 whenever we started, or what are the social norms in this?

AUDIENCE: As long as you want.

PROFESSOR: Oh. I have a phone call at 12:30, otherwise--

AUDIENCE: It's 12:15.

PROFESSOR: I'd probably be finished before that. I don't have time to finish what's on the slides anyway.

So I just wanted to give you a sense of what might be a causal model in this case. The whole perception angle is not there, it's the public goods thing. And it's a way of disciplining all of the outcomes, as well as the various things we spoke about. So you start from reservation, so one thing that reservations definitely do is that they will lead to more women Pradhan. And then the question is whether or not having more women Pradhan will change the public goods, and in what way? And there are really two channels through which that can happen, which we have discussed. One is that the women as the Pradhan do what women want. And we've not really discussed that, but that comes with its own set of assumptions, which is that the Pradhan are not simply representing the majority. Otherwise, the majority hasn't changed, so we shouldn't see a difference, even if you are saying, well, you don't have to be accountable to the men. That's not true, because they still vote for you. Ex-ante.

Several women compete, and the issue is what platform are you going to run on? Well, you should be running on the platform that is going to get you elected. And whether or not you're a man or a woman, you're elected by the same group of people. So in a totally standard model where democracy is perfect, who is in charge doesn't matter. Because who is in charge is representing the desire of the majority, what we call the median voter. So if you had perfect democracy, that channel would be killed, and we wouldn't see an impact. On the other hand, if all the decisions were made by a group of elite villagers, that again wouldn't matter because who is in charge doesn't matter.

So the identity of the Pradhan is going to make a difference only in some middle-of-the-road kind of world where the politician has some control over what is going on. It is not completely controlled by a bureaucracy or by an elite, and the politician is not completely accountable to the people, in that he cannot fully commit, for example, to a platform. That doesn't seem unrealistic. People make electoral promises and sometimes they go against them. Almost never, but sometimes. But that's something. So if we do learn that the public goods change, we have learned something broader than just the impact of this program. We have learned something about politics in India, which is, OK, there is some democracy. Some people are contesting that. Some people say the Panchayat is just a facade. There is no democracy really. So if you find a difference by the identity of the Pradhan, it shows you, by the by, that there is some reality in the democratic system, but it's not perfect democracy. So what we've learned can be broader than this one program as well. And that is a lesson you can take elsewhere.

Another channel by which reservation can influence how well women are represented is through more participation by women. For example, we were talking about more political participation of women if a woman is the head. So one thing that could be true is that they're more likely to show up in meetings, that they're more likely to speak up. For example, because the woman Pradhan has to be at the village meeting, she had better put it at a time when she can go. So not in the middle of the night in a field. And so as long as she can go, other women can also go.

So through either of these channels, you'd have the fact that the public goods will better reflect women's preferences. We have to add another assumption, which is that women have different preferences. And if that's the case, then the public goods will be different in a specific way, which is towards those different preferences. And then you might have different outcomes. More income for the women, better health and education outcomes, if it turns out that that's what women care about. And you'll be able to follow the trace exactly. You know, if you find, like in West Bengal, more water, maybe you're going to be interested in diarrhea. If you find fewer schools, you're going to be interested in education, to see whether education goes down, and things like that.

So now we have the complete channel. And we can now think about all of the variables that we had collected, and we're going to slot them into, well, what are they going to do for us? So all the public goods go here. We need to collect women's preferences somewhere, it will go here. We want to know whether women are empowered, so we are going to be collecting all of these. Whether people come to the meeting, whether they speak up, how they are answered once they speak up. This is all going to go here. So now, even if we have an infinite amount of money, we're still going to collect a large amount of data, but we know in advance what it is we are going to do with it. You can write it down, put it in an envelope, send it to your grandmother, and this is the thing that really gives a lot of credibility to what you're going to do. Another version of sending it to your grandmother that we are going to try and implement here at J-PAL is to allow you to put it on a website, to upload it somewhere where nobody can see it but you, but it's secure, and the date is stamped. So this is whatever your analysis plan was at the beginning of your study. So you are tying your hands behind your back.
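One way to picture "slotting variables into the chain and sealing the plan" is the following minimal Python sketch. The stage labels and variable names are hypothetical placeholders, not the actual survey variables, and the SHA-256 fingerprint is simply a stand-in for the sealed envelope: it shows that the plan has not been edited since the fingerprint was taken, but it does not by itself prove the date, which would need a third party (the grandmother, or a registry) to hold the fingerprint.

```python
import hashlib
import json

# Hypothetical analysis plan: each stage of the causal chain lists the
# variables that will be used to measure it. All names are illustrative.
analysis_plan = {
    "1. reservation leads to women Pradhans": ["pradhan_is_woman"],
    "2. women's participation": [
        "women_attend_meeting",
        "women_speak_up",
        "response_when_women_speak",
    ],
    "3. women's preferences": ["issues_raised_by_women", "issues_raised_by_men"],
    "4. public goods": ["water_wells", "schools"],
    "5. final outcomes": ["diarrhea_incidence", "education", "women_income"],
}


def stage_of(variable, plan):
    """Return the stage a variable was assigned to, or None if it was
    never part of the pre-specified plan."""
    for stage, variables in plan.items():
        if variable in variables:
            return stage
    return None


def fingerprint(plan):
    """SHA-256 digest of a canonical JSON rendering of the plan.
    Any later change to the plan produces a different digest."""
    canonical = json.dumps(plan, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


sealed = fingerprint(analysis_plan)

# Later: check that the plan used in the analysis is the one that was sealed.
plan_is_unchanged = fingerprint(analysis_plan) == sealed
```

A variable for which `stage_of` returns `None` was not in the plan, so any result on it would need to be reported as exploratory rather than as a pre-specified test.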

AUDIENCE: Excuse me, but why would you do that?

PROFESSOR: Because we want to-- or maybe someone can answer that question. Why do we do that?

AUDIENCE: I guess it's just so people aren't relying on your personal integrity. You're saying this was our hypothesis. We didn't ex-post change our hypothesis.

PROFESSOR: Right. So the reason why you want to say what your hypothesis was in advance is because then you can test it. Whereas if you find something, you can always reverse engineer an explanation for it. And again, I have no big issue with that, with reverse engineering. Personally, I think it's useful, but it needs to be very clear what was there before and what was reverse engineered after, otherwise we have no notion of statistical tests.

AUDIENCE: I'm just pushing out a little bit further, too. What if you just had a question where you just don't know, right? You're like, wow, we really think there might be an effect of this on that, or we're not sure whether the effect would be upward or downward. Do you still have to just pick one for the sake of having a hypothesis that you're testing?

PROFESSOR: I think you wouldn't want to embark on an evaluation without at least having a sense of why it would go up and why it would go down. So take this specific example. To start with, you can't really know whether the water wells are going to go up or down, because it's going to depend on what women want. So I'm not taking a stance. It would be silly to write to my grandmother, I'm betting that the water wells will go up. Because they could go up or down depending on what women want. Of course, you may have a very strong prior that water is what women want, but the statement would be of the form, if it is the case that women have a strong preference for water, water should go up.

So if you have some uncertainty, it's probably because there is an if somewhere that you were not thinking about and that you're making explicit now. I think by thinking sufficiently hard about something, you can know under what conditions it would go up or down. There are a lot of programs whose effects could go up or down, that's why we evaluate them. I mean, sometimes you think that they should really go up, but it could also be zero. And it's good to know, if this and this happened then this effect would be expected. If this and this doesn't happen, then I wouldn't see this effect, so you could write that. And in fact, I think these types of statements, in a sense, are almost more informative than, this will happen.
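A conditional hypothesis of the form "if women have a strong preference for water, water should go up" can be written down as a small rule. The sketch below is a hypothetical illustration in Python; the function name, the arguments, and the example preferences are assumptions made for the example, not part of the study.

```python
def predicted_direction(women_prefer_more, pradhan_has_discretion):
    """Pre-specified prediction for one public good under reservation.

    women_prefer_more: True if women want more of this good than men do,
        False if they want less, None if their preference is not yet known.
    pradhan_has_discretion: False in the two extreme cases (perfect
        democracy or full elite control), True in the in-between case.
    """
    if not pradhan_has_discretion:
        return "no change"
    if women_prefer_more is None:
        return "no prediction until preferences are measured"
    return "up" if women_prefer_more else "down"


# Illustrative preferences only: water wanted more by women, schools less.
hypothetical_preferences = {"water_wells": True, "schools": False}

predictions = {
    good: predicted_direction(prefers_more, pradhan_has_discretion=True)
    for good, prefers_more in hypothetical_preferences.items()
}
```

Writing the prediction as a function of the conditions, rather than as a single bet, is what lets either direction of effect count as a test once the conditions themselves have been measured.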

So let me stop here. What's in the rest of the slides is kind of a little bit random-- not random in the sense of randomized, but random in the sense of every which way. I pass out advice on how to collect data and how to enter data and things like that. For those of you who are going to IPA training after this, you're going to be sick of it by the end of the three days so that's not needed. For the other ones, it's in the slides and it's really-- that's the problem of it being too short anyway. So it is pretty self-explanatory and relatively common sense and not sufficient anyway, but a starting point. Thank you very much.