
Part 1: What is an Argument?

1.1 What is an argument?


Transcript

Since arguments are at the heart of logic and argumentation it's natural to start with this
question.

The first thing to say about arguments is that, as this term is used in logic, it isn't intended
to imply anything like an emotional confrontation, like when I say that "an argument broke
out at a bar" or "I just had a huge argument with my parents about my grades".
In logic, "argument" is a technical term. It doesn't carry any connotation of conflict or
confrontation.

So here's our definition. It will have three parts.


(1) An argument is a set of "claims", or "statements".
We'll have more to say about what a claim or statement is later, but for now it's enough to
say that a claim is the sort of thing that can be true or false.
(2) One of the claims is singled out for special attention. We call it the "conclusion".
The remaining claims are called the "premises".
(3) The premises are interpreted as offering reasons to believe or accept the
conclusion.
That's it, that's the definition of an argument.
Now let's have a look at one:
1. All musicians can read music.
2. John is a musician.
Therefore, John can read music.


Premises 1 and 2 are being offered as reasons to accept the conclusion that John can
read music.
Actually, this may not be a particularly good argument, since the first premise makes a
pretty broad generalization about all musicians that isn't very plausible. I'm sure there are
a few great musicians out there who don't read sheet music. But it's an argument
nonetheless.

Now, notice how it's written. The premises are each numbered and put on separate lines,
and the conclusion is at the bottom and set off from the rest by a line and flagged with the
word "therefore".

This is called putting an argument in "standard form" and it can be useful when you're
doing argument analysis.

In ordinary language we're almost never this formal, but when you're trying to analyze
arguments, when you're investigating their logical properties, or considering whether the
premises are true or not, putting an argument in standard form can make life a lot easier.
Now just to highlight this point, here's another way of saying the same thing:
"Can John read music? Of course, he's a musician, isn't he?"

These actually express the very same argument. But notice how much easier it is to see
the structure of the argument when it's written in standard form. In this version you have to
infer the conclusion, "John can read music", from the question and the "of course" part.
And you have to fill in a missing premise. What you're given is "John is a musician", but
the conclusion only follows if you assume that all musicians, or most musicians, can read
music, which is not a given, it's just a background assumption. The argument only makes
sense because you're filling in the background premise automatically. You can imagine
that this might become a problem for more complex arguments. You can't always be sure
that everyone is filling in the same background premise.

So, standard form can be helpful, and we're going to be using it a lot in this course.
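
To make the structure concrete, here's a minimal sketch in Python (the Argument class and standard_form helper are illustrative names of my own, not anything from a standard library) that represents an argument as a list of premises plus a conclusion and prints it in standard form:

    class Argument:
        def __init__(self, premises, conclusion):
            self.premises = premises      # the claims offered as reasons
            self.conclusion = conclusion  # the claim they're reasons for

        def standard_form(self):
            lines = [f"{i}. {p}" for i, p in enumerate(self.premises, start=1)]
            lines.append("----------")
            lines.append(f"Therefore, {self.conclusion}")
            return "\n".join(lines)

    john = Argument(
        premises=["All musicians can read music.", "John is a musician."],
        conclusion="John can read music.",
    )
    print(john.standard_form())

Running this prints the musician argument in exactly the numbered-premises-plus-"Therefore" layout described above.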
1.2 What is a claim, or statement?
Transcript

Arguments are made up of "claims", or "statements". In this section I want to say a few
words about what this means and why it's important for logic.

A claim is a sentence that can be true or false (but not both).

Actually in logic texts the more commonly used term is "statement" or "proposition". These
are all intended to mean the same thing. A claim, or a statement, or a proposition, is a bit
of language whose defining characteristic is that it makes an assertion that could be true
or false but not both.

The "true or false" part of this definition expresses a principle of classical logic that's called
the Principle of Bivalence. This principle asserts that a claim can only assume one of
two truth values, "true" or "false"; there's no third option like "half-true" or "half-false", or
"almost true".

The "but not both" part of this definition expresses a different principle of classical logic
called the Principle of Non-Contradiction. This principle states that a claim can't be both
true and false at the same time, it's either one or the other. To assert otherwise is to assert
a contradiction.

Actually there are systems of logic where both the Principle of Bivalence and the Principle
of Non-Contradiction are relaxed. Logicians can study different systems of reasoning that
don't assume these principles. But in classical logic they hold, and in what follows we're
going to assume that they hold.
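
If it helps to see the two principles concretely, here's a tiny Python sketch (illustrative only) that uses Python's two-valued bool type as a stand-in for classical truth values and checks both principles over every possible assignment:

    for p in (True, False):
        # Principle of Bivalence: every claim takes exactly one of two values.
        assert p is True or p is False
        # Principle of Non-Contradiction: no claim is both true and false.
        assert not (p and not p)
    print("Both principles hold for every two-valued truth assignment.")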

Now, you might ask why defining claims in this way is important. Here are two reasons.
First, not all sentences can function as claims for argumentative purposes.

For example, questions don't count as claims, since they don't assert anything that can
be true or false. If you have a question like "Do you like mushrooms on your pizza?", that's
a request for information, it doesn't assert that such-and-such is true or false.

Another type of sentence that doesn't count as a claim is a command, or an imperative,


like "Step away from the car and put your hands behind your head!". That's a request to
perform an action, it makes no sense to ask whether the action or the command is true or
false.

So, not every bit of language counts as a claim, and so not every bit of language can play
the role of a premise or a conclusion of an argument.
A second reason why this notion of a claim or statement is important is that it gives
us a way of talking about clarity and precision in our use of language.

This is because in saying that a sentence can function as a claim in an argument, what
we're saying is that all the relevant parties, both the person giving the argument and the
intended audience of the argument, have a shared understanding of the meaning of that
sentence.
In this context, what it means to understand the meaning of a sentence is to
understand what it would mean for the sentence to be true or false. That is, it involves
being able to recognize and distinguish in your mind the state of affairs in which the
sentence is true from the state of affairs in which the sentence is false.

So, in the context of logic and argumentation, for a sentence to be able to function as a
claim, for it to be able to function as a premise or a conclusion of an argument, there has
to be a shared understanding of what the sentence is asserting.

So, if the sentence is too vague or its meaning is ambiguous, then it can't function
as a claim in an argument, because in this case we literally don't know what we're talking
about.

The requirement that a sentence be able to function as a claim is not a trivial one. It's
actually pretty demanding. Not all bits of language make assertions, and not all assertions
are sufficiently clear in their meaning to function as claims.

This is important. Before we can even begin to ask whether an argument is good or bad,
we need to have a shared understanding of what the argument actually is, and this is the
requirement that you're expressing when you say that an argument is made up of claims
that can be true or false.

What this means in practice is that, ideally, you want everyone to have a shared
understanding of what the argument is about and what the premises are asserting. If
people can't agree on what the issue is, or what's being asserted, then you have to go back
and forth and clarify the issue and the arguments until all parties finally have a
shared understanding of the argument.

Only then can you have a rational discussion of the strengths and weaknesses of the
argument.
Questions and Discussion

Note: This material is not included in the video above.

1. Sometimes a statement can be expressed using a question, can’t it?

That’s true. A good example is “rhetorical questions”, which are really statements. For
example, if you ask me whether I’m going to study for the final exam, and I look at you and
say “What do you think I am, an idiot?”,

I’m not actually asking whether you think I’m an idiot. It’s a rhetorical question, used to
(sarcastically) assert the statement “Yes, I’m going to study for the final exam”.

This example illustrates a general point, that identifying arguments in ordinary language
requires paying attention to rhetorical forms and stylistic conventions in speech and in
writing. A complex writing style, or a failure to understand the rhetorical context in which an
argument is given (for example, failing to recognize SATIRE), can lead to confusion about
what is being asserted.

2. What’s the difference between a sentence being VAGUE and a sentence being
AMBIGUOUS?

If I ask my daughter when she’ll be back from visiting friends and she says “Later”, that’s a
VAGUE answer. It's not specific enough to be useful.

On the other hand, if I ask her which friend she’ll be visiting, and she says “Kelly”, and she
has three friends named Kelly, then that’s an AMBIGUOUS answer, since I don’t know
which Kelly she’s talking about. The problem isn’t one of specificity, it’s about identifying
which of a set of well-specified meanings is the one that was intended.

Note that all natural language suffers from vagueness to some degree. If I say that the
grass is looking very green after last week’s rain, one could always ask which shade of
green I’m referring to. But it would be silly to think that you don’t understand what I’m
saying just because I haven’t specified the shade of green.

For purposes of identifying claims in arguments, the question to ask isn’t “Is this
sentence vague?”, but rather, “Is this sentence TOO vague, given the context?”.

If all I’m doing is trying to determine whether the grass needs watering or not, the specific
shade of green probably doesn’t matter. But if I’m trying to pick a green to paint a room in
my house, specifying the shade will be more important.
1.3 What is a good argument (I)?
Transcript

An argument is an attempt to persuade, but the goal of logic and argumentation isn't
simply to persuade — it's to persuade for good reasons.

The most basic definition of a good argument is straightforward. It's an argument that gives
us good reasons to believe the conclusion.

There's not much we can do with this definition, though. It's too vague. Obviously, we
have to say more about what we mean by "good reasons".

To do this we'll start by looking at some ways that arguments can fail to be good, and from
these we'll extract a couple of necessary conditions for an argument to be good.

Here's an argument:
1. All actors are robots.
2. Tom Cruise is an actor.
Therefore, Tom Cruise is a robot.

Here's another argument:


1. All tigers are mammals.
2. Tony is a mammal.
Therefore, Tony is a tiger.

Both of these are bad arguments, as you might be able to see. But they're bad in different
ways.

In the first argument, the problem is obviously that the first premise is false — not all
actors are robots. But that's the only problem with this argument.

In particular, the logic of this argument is perfectly good. When I say that the logic is good
what I mean is that the premises logically support or imply the conclusion, or that the
conclusion follows from the premises.

In this case it's clear that if the premises of this argument were all true then the conclusion
would have to be true, right? If all actors were robots, and if Tom Cruise is an actor, then it
would follow that Tom Cruise would have to be a robot. The conclusion does follow from
these premises.

Now let's look at the other argument:


1. All tigers are mammals.
2. Tony is a mammal.
Therefore, Tony is a tiger.

First question: Are all the premises true?

Are all tigers mammals? Yes they are. Is Tony a mammal? Well in this case we have no
reason to question the premise so we can stipulate that it's true. So in this argument all the
premises are true.
Now what about the logic?

This is where we have a problem. Just because all tigers are mammals and Tony is a
mammal, it doesn't follow that Tony has to be a tiger. Tony could be a dog or a cat or a
mouse. So even if we grant that all the premises are true, those premises don't give us
good reason to accept that conclusion.

So this argument has the opposite problem of the first argument. The first argument has
good logic but a false premise. This argument has all true premises but bad logic. They're
both bad arguments but they're bad in different ways. And these two distinct ways of
being bad give us a pair of conditions that an argument must satisfy if it's going to be good.

First condition:
If an argument is good then all the premises must be true.
We'll call this the "Truth Condition".

Second condition:
If an argument is good then the conclusion must follow from the premises.
We'll call this the "Logic Condition".

Note that at the top I called these "necessary conditions". What this means is that any
good argument has to satisfy these conditions. If an argument is good then it satisfies both
the Truth Condition and the Logic Condition.

But I'm not saying that they're "sufficient" conditions. By that I mean that they don't by
themselves guarantee that an argument is going to be good. An argument can satisfy both
conditions but still fail to be good for other reasons.

Still, these are the two most important conditions to be thinking about when you're doing
argument analysis.

In later tutorials we'll look more closely at both the Truth Condition and the Logic
Condition, and we'll also look at ways in which an argument can satisfy both conditions
and still fail to be good.

Questions and Discussion


(Note: This material does not appear in the video above.)
1. What’s the difference between “persuading” and “persuading for good reasons”?

If my only goal is to persuade you to accept my conclusion, then I might use all kinds of
rhetorical tricks to achieve that goal. I might choose to outright lie to you. If mere
persuasion is the ultimate goal then there would be no normative distinction between
argumentation and using lies, rhetorical tricks and psychological manipulation to persuade.

Mere persuasion is NOT the ultimate goal of argumentation, at least as this term is used in
philosophy and rhetoric. Argumentation is about persuasion for good reasons. Precisely
what this means is not obvious, and it will take us some time to work it out, but at a
minimum it involves offering premises that your audience is willing to accept, and
demonstrating how the conclusion follows logically from those premises.
Here’s another way to think about argumentation. From a broader perspective, to argue
with a person, as opposed to merely trying to persuade or influence a person, is to treat
that person as a rational agent capable of acting from, and being moved by, reasons. It’s
part of what it means to treat a person as an “end” in themselves, rather than as a mere
means to some other end.

In this respect, theories of argumentation are normative theories of how we ought to


reason, if we’re treating our audience as rational agents. They’re a component of a basic
moral stance that we adopt toward beings who we recognize as capable of rational
thought.

Consider: if I can argue with a being, most of us think that’s sufficient reason to think it
would be morally wrong to kill and eat it for food, right? To use a bit of philosophical
jargon, the capacity to argue is a component of our concept of moral personhood.

2. How can an argument satisfy both the Truth Condition and the Logic Condition
and still fail to be good?
I said I would consider this question in a later tutorial, but just to give an example,
arguments that commit the fallacy of “begging the question” may satisfy both these
conditions but fail to be good.

Consider this argument:


1. Capital punishment involves the killing of a person by the state as punishment for
a crime.

2. It is morally unjustified for the state to take the life of a person as punishment for
a crime.

Therefore, capital punishment is morally unjustified.


Let’s grant that the conclusion follows from the premises. Could this count as a good
argument?

The problem is that the second premise simply asserts what is precisely at issue in the
debate over capital punishment. It “begs the question” about the ethics of capital
punishment by assuming as a premise that it’s wrong. Such arguments are also called
“circular”, for obvious reasons: the conclusion simply restates what is already asserted in
the premises.

The problem with this kind of argument isn’t that the premises are obviously false. The
problem is that they don’t provide any independent reasons for accepting that conclusion.
Logic and critical thinking texts will treat this kind of argument as fallacious, a bad
argument. But what makes it bad isn’t captured by the Truth Condition or the Logic
Condition.

Arguments that beg the question in this way may well have all true premises and good
logic, but they would still be judged as bad arguments due to their circularity.
So this is an example of how an argument may be judged bad even though it satisfies both
the Truth Condition and the Logic Condition. And this is what it means to say that these
are merely necessary conditions for a good argument, not sufficient conditions. All good
arguments will satisfy these two conditions, but not all arguments that satisfy these two
conditions will be good.

Still, the distinctions captured by the Truth Condition and the Logic Condition are
absolutely central to argument analysis.

3. You keep using arguments with only two premises. Do all arguments only have
two premises?

I can see how people might initially get this impression, since so many of the introductory
examples you see in logic and critical thinking texts are short two-premise arguments. For
the purpose of introducing new logical concepts, the short two-premise argument forms
are very useful.

But you can have arguments with five, ten or a hundred premises. In longer and more
complex arguments what you usually see are nested sets of sub-arguments where each
sub-argument has relatively few premises, but the overarching argument (when you
expand it all out) might be very long.

You see this kind of progression in learning almost any complex skill. You start out by
rehearsing the most elementary concepts or skill elements (basic programming statements
in computer languages; the positions and moves of the pieces in chess; forehand and
backhand strokes in tennis; basic statement types and argument forms in logic; etc.), and
then combine them to create more complex structures or perform more complex tasks.
Learning logic and argument analysis isn’t any different.
1.4 Identifying premises and conclusions
Transcript

Argument analysis would be a lot easier if people gave their arguments in standard form,
with the premises and conclusions flagged in an obvious way.

But people don’t usually talk this way, or write this way. Sometimes the conclusion of an
argument is obvious, but sometimes it’s not. Sometimes the conclusion is buried or implicit
and we have to reconstruct the argument based on what’s given, and it’s not always
obvious how to do this.

In this tutorial we’re going to look at some principles that will help us identify premises and
conclusions and put natural language arguments in standard form. This is a very important
critical thinking skill.
Here’s an argument.

“Abortion is wrong because all human life is sacred.”



Question: which is the conclusion? “Abortion is wrong”? or “All human life is sacred”?
For most of us the answer is clear. “Abortion is wrong” is the conclusion, and “All human
life is sacred” is the premise.

How did we know this? Well, two things are going on.
First, we’re consciously, intentionally, reading for the argument, and when we do this we’re
asking ourselves, “What claim are we being asked to believe or accept, and what other
claims are being offered as reasons to accept that claim?”.

Second, we recognize the logical significance of the word “because”. “Because” is what we
call an “indicator word”, a word that indicates the logical relationship of claims that come
before it or after it. In this case it indicates that the claim following it is being offered as a
reason to accept the claim before it.

So, rewriting this argument in standard form, it looks like this...


1. All human life is sacred.
Therefore, abortion is wrong.


At this point we could start talking about whether this is a good argument or not, but that’s
not really the point of this tutorial. Right now we’re more concerned with identifying
premises and conclusions and getting the logical structure of an argument right.

Here are some key words or phrases that indicate a CONCLUSION:


therefore, so, hence, thus, it follows that, as a result, consequently,
and of course there are others.
This argument gives an example using “so”:
It’s flu season and you work with kids, SO you should get a flu shot.


Now, keywords like these make it much easier to identify conclusions, but not all
arguments have keywords that flag the conclusion. Some arguments have no indicator
words of any kind. In these cases you have to rely on your ability to analyze context and
read for the argument.

Here’s a more complex argument that illustrates this point:


"We must reduce the amount of money we spend on space exploration. Right now,
the enemy is launching a massive military buildup, and we need the additional
money to purchase military equipment to match the anticipated increase in the
enemy’s strength." 


Notice that there are no indicator words that might help us flag the conclusion.
So, which claim is the conclusion of this argument?

Is it...
“We must reduce the amount of money we spend on space exploration.”?

Is it...
“The enemy is launching a massive military buildup”?

Or is it...
“We need the additional money to purchase military equipment to match the
anticipated increase in the enemy’s strength”?

The answer is...


“We must reduce the amount of money we spend on space exploration.”

Most people can see this just by looking at the argument for a few seconds, but from
experience I know that some people have a harder time seeing logical relationships like
this.

If it’s not obvious, the way to work the problem is this: for each claim asserted in the
argument you have to ask yourself,

“Is this the main point that the arguer is trying to convey?”

or,

“Is this a claim that is being offered as a reason to believe another claim?”

If it’s being offered as a reason to believe another claim, then it’s functioning as a premise.
If it’s expressing the main point of the argument, what the argument is trying to persuade
you to accept, then it’s the conclusion.

There are words and phrases that indicate premises too. Here are a few:
since, if, because, from which it follows, for these reasons,
and of course there are others.
And here’s an example that uses “since”:

"John will probably receive the next promotion SINCE he’s been here the longest."


“Since” is used to indicate that John’s being here the longest is a reason for thinking that
he will probably receive the next promotion.
So, let’s summarize:
•Arguments in natural language aren’t usually presented in standard form, so we need to
know how to extract the logical structure from the language that’s given.
•To do this, we look at each of the claims in the argument and we ask ourselves, is this the
main point that the arguer is trying to convey, or is this being offered as a reason to accept
some other claim?
•The claim that expresses the main point is the conclusion.
•The claims that are functioning as reasons to accept the main point are the premises.
•And finally, premises and conclusions are often flagged by the presence of indicator
words. Paying attention to indicator words can really help to simplify the task of
reconstructing an argument (a rough scanning heuristic is sketched just below).
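
As a rough illustration of how indicator words can guide this process, here's a small Python sketch (a hypothetical helper of my own, not a substitute for actually reading for the argument) that scans a sentence for the indicator words listed above:

    import re

    CONCLUSION_WORDS = ("therefore", "so", "hence", "thus", "it follows that",
                        "as a result", "consequently")
    PREMISE_WORDS = ("since", "because", "from which it follows",
                     "for these reasons")

    def flag_indicators(sentence):
        # Return the indicator words found in the sentence, grouped by role.
        text = sentence.lower()
        return {
            "conclusion": [w for w in CONCLUSION_WORDS
                           if re.search(rf"\b{re.escape(w)}\b", text)],
            "premise": [w for w in PREMISE_WORDS
                        if re.search(rf"\b{re.escape(w)}\b", text)],
        }

    print(flag_indicators("It's flu season and you work with kids, "
                          "so you should get a flu shot."))
    # {'conclusion': ['so'], 'premise': []}

A word list like this is only a first pass; arguments with no indicator words still require the "main point versus reason" questions described above.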
Part 2: What is a Good Argument?
2.1 The Truth Condition
Transcript

The Truth Condition is a necessary condition for an argument to be good. We stated it as


the condition that all the premises of an argument have to be true. In this tutorial I want to
talk about what this condition amounts to in real world contexts where arguments are used
to persuade specific audiences to accept specific claims.

I'm going to try to show why we actually need to modify this definition somewhat to capture
what's really important in argumentation. I'll offer a modification of the definition that I think
does a better job of capturing this.

Here's our current definition of the Truth Condition: All premises must be true.
I want to use a simple example to illustrate the problem with this definition.
Consider the claim that "The earth rotates on its axis once every 24 hours".

We all agree that this claim is true. It's an accepted part of our modern scientific
understanding of the world.

But, say, 500 years ago, this claim would have been regarded by almost everyone as
obviously false. The common understanding was that the earth does not move. Almost
everyone believed that the planets and everything else in the universe revolved around the
earth, which is fixed at the center of the universe.

And they had good reason to believe this. When we look outside we see the moon and the
sun and the stars and planets all moving around us. It certainly doesn't seem like we're all
moving at hundreds of miles an hour toward the east. Around the equator it would be
closer to a thousand miles an hour, as fast as a rifle bullet. If the earth was really rotating
that fast, why don't centrifugal forces make us fly off the surface of the earth? Or why don't
we experience perpetual hurricane-force winds as the rotating earth drags us through the
atmosphere at hundreds of miles per hour?

These are the sorts of arguments that medievals might have given, and did give, to
support their contention that the earth in fact does not move. For their time, given what
they knew about physics and astronomy, these would have been compelling arguments.

So, for a medieval audience, any argument that employed this claim as a premise — the
claim that the earth rotates once every 24 hours — or that argued for it as a conclusion,
would have been judged as a bad argument, because for them the claim is clearly false.
Now, why does this situation pose a problem for our version of the Truth Condition? It
poses a problem because if we read "true" as REALLY true, true IN ACTUALITY, and we
think it's really true that the earth rotates, and it's really false that it's fixed at the center of
the universe, then this version of the Truth Condition makes it so that no medieval person
can have a good argument for their belief that the earth does not move.

And this just seems wrong. It seems like we want to say that yes, they were wrong about
this, but at the time they had perfectly good reasons to think they were right.
And if they had good reasons, that means they had good arguments. But this version of
the Truth Condition doesn't allow us to acknowledge that they had good arguments for
their belief that the earth does not move.

So, we need to modify our phrasing of the Truth Condition so that it's sensitive to the
background beliefs and assumptions of particular audiences.

This is a natural modification that does the trick.

We'll call a claim "plausible" (for a given audience; plausibility is always relative to a
given audience even if we don't specifically say so) if that audience believes they have
good reason to think it's true.

So to say that a claim is plausible for a given audience is just to say that the audience is
willing to grant it as a premise, that they're not inclined to challenge it, since they think they
have good reason to believe it's true.

What we're doing here is pointing out that in real-world argumentation, when someone is
offering reasons for someone else to believe or accept something, an argument will only
be persuasive if the target audience is willing to grant the premises being offered.
Premises have to be plausible to them, not just plausible to the arguer, or some
hypothetical audience.

So here are our conditions for an argument to be good, with the truth condition modified in
the way that I've just suggested.

Condition one: The Truth Condition


All premises must be true (where “true” is read as “plausible”)

Condition two: The Logic Condition


The conclusion must follow from the premises.

Just to note, for the rest of these tutorials I'll keep using the expression "Truth Condition",
even though it's really a "plausibility condition", just because the language of "truth" is so
commonly used in logic and critical thinking texts when describing this feature of good
arguments.

Just remember, when we talk about evaluating the premises of an argument to see if the
argument satisfies the Truth Condition — when I give an argument and I ask, are all the
premises true? — what I'm really asking is whether the intended audience of the argument
would be willing to grant those premises. In other words, whether they would find those
premises plausible.

Let me just wrap up with an objection to this modification that some of my students
usually raise at this point.

Some people might object that what I've done here is redefine the concept of truth into
something purely relative and subjective, that I'm denying the existence of objective truth.
This isn't what I'm saying. All I'm saying is that the persuasive power of an argument isn't a
function of the actual truth of its premises. It's a function of the subjective plausibility of its
premises for a given audience. A premise may be genuinely, objectively true, but if no one
believes it's true, then no one will accept it as a premise, and any argument that employs it
is guaranteed to fail, in the sense that no one will judge it as offering good reasons to
accept the conclusion.

This point doesn't imply anything about the actual truth or falsity of the claims. We can say
this and still say that claims or beliefs can be objectively true or false. The point is just that
the objective truth or falsity of the claims isn't the feature that plays a role in the actual
success or failure of real world arguments. It's the subjective plausibility of premises that
plays a role, and that's what this reading of the Truth Condition is intended to capture.
2.2 The Logic Condition
Transcript

The Logic Condition is another necessary condition for an argument to be "good".

In the tutorial that introduced the notion of a "good argument" we defined the Logic
Condition in very general terms: an argument satisfies the Logic Condition if the
conclusion "follows from" the premises.

The main point I wanted to make in that discussion was to distinguish arguments that are
bad because of false premises, from arguments that are bad because of bad logic. These
are two distinct ways in which an argument can fail to provide good reasons to believe the
conclusion.

But we need to say a lot more about the Logic Condition, and what it means to say that an
argument has good logic.

In fact, this tutorial and the next two tutorials are devoted to it. The concepts that we'll be
looking at are absolutely central to logic.

In this tutorial I want to highlight the hypothetical character of the Logic Condition, and how
it differs from the Truth Condition.

We say that an argument satisfies the Logic Condition if the conclusion "follows from" the
premises, or equivalently, if the premises "support" the conclusion.

The following two arguments give examples of good logic and bad logic, respectively.
1. All tigers are mammals.
2. Tony is a tiger.
Therefore, Tony is a mammal.


In this first argument the conclusion clearly follows from the premises. If all tigers are
mammals and if Tony is a tiger then it follows that Tony is a mammal.
1. All tigers are mammals.
2. Tony is a mammal.
Therefore, Tony is a tiger.


In this second argument the conclusion doesn't follow. If all tigers are mammals and if
Tony is a mammal, we can't infer that Tony is a tiger. Those premises may be true but they
don't support the conclusion.

I want to draw attention to the way in which we make these kinds of judgments. In judging
the logic of the argument we ask ourselves this hypothetical question:
"IF all the premises WERE true, WOULD they give us good reason to believe the
conclusion?"

If the answer is "yes" then the logic is good, and the argument satisfies the Logic
Condition.

If the answer is "no" then the logic is bad, and the argument doesn't satisfy the Logic
Condition.
This gives us a more helpful way of phrasing the Logic Condition. An argument satisfies
the Logic Condition if it satisfies the following hypothetical condition:

If the premises are all true, then we have good reason to believe the conclusion.
The key part of this definition is the hypothetical "IF".

When we're evaluating the logic of an argument, we're not interested in whether the
premises are actually true or false. The premises might all be false, but that's irrelevant to
whether the logic is good or bad. What matters to the logic is only this hypothetical "if". IF
all the premises WERE true, WOULD the conclusion follow?

So, when evaluating the logic of an argument we just ASSUME the premises are all true,
and ask ourselves what follows from those premises.

This is fundamentally different from the Truth Condition, where what we're interested in is
the actual truth or falsity of the premises themselves (or as we talked earlier, the
"plausibility" or "implausibility" of premises).

The Truth Condition and the Logic Condition are focusing on very different properties of
arguments.

In fact you can have arguments with all false premises that satisfy the Logic Condition.
Here's an example:

1. If the moon is made of green cheese, then steel melts at room temperature.
2. The moon is made of green cheese.
Therefore, steel melts at room temperature.


This argument satisfies the Logic Condition, even though both premises are clearly false.
Why?

Because IF the first premise WERE true, and IF the second premise WERE true, then the
conclusion WOULD follow.

In fact, this argument is an instance of a well known argument FORM that always satisfies
the Logic Condition:
1. If A then B
2. A
Therefore, B


If A is true then B is true. A is true, therefore B is true. ANY argument that instantiates this
argument form is going to satisfy the Logic Condition.
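
One way to see why this form always satisfies the Logic Condition is to check it by brute force. Here's a short Python sketch (illustrative, using the standard truth-functional reading of "if A then B") that enumerates every possible assignment of truth values and looks for a case where the premises are true and the conclusion false:

    from itertools import product

    def implies(a, b):
        return (not a) or b  # the truth-functional reading of "if A then B"

    counterexamples = [(a, b)
                       for a, b in product([True, False], repeat=2)
                       if implies(a, b) and a   # both premises true...
                       and not b]               # ...but conclusion false
    print(counterexamples)  # [] -- no counterexample, so the form is valid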
Here's another example:
1. All actors are billionaires.
2. All billionaires are women.
Therefore, all actors are women. 


Each claim in this argument is false, so it's a bad argument, but the logic is airtight. This
argument fails the Truth Condition but satisfies the Logic Condition.
And like the previous example it's an instance of an argument FORM that always satisfies
the Logic Condition:
1. All A are B.
2. All B are C.
Therefore, all A are C.
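
The same brute-force check works for this form too, if we read each "All X are Y" claim elementwise, as "if something is an X, then it's a Y". A sketch, under that simplifying reading:

    from itertools import product

    def implies(p, q):
        return (not p) or q

    # For an arbitrary thing x, check every way x could fall in or out of A, B, C.
    counterexamples = [(a, b, c)
                       for a, b, c in product([True, False], repeat=3)
                       if implies(a, b)         # "All A are B" holds of x
                       and implies(b, c)        # "All B are C" holds of x
                       and not implies(a, c)]   # "All A are C" fails of x
    print(counterexamples)  # [] -- the premises can't hold while the conclusion fails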

Alternately, you can have arguments that have all true premises but FAIL the Logic
Condition, like this one:
1. If I live in Iowa then I live in the United States.
2. I live in the United States.
Therefore, I live in Iowa.


The premises are true (at the time of writing), but the conclusion doesn't follow,
because even if they're all true it doesn't follow that I have to live in Iowa, or even that it's
likely that I live in Iowa. I might live in New York or Florida or any of the other forty-nine states.
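
Run the same kind of brute-force check on this form, "If A then B; B; therefore A", and it turns up a counterexample, which is exactly what invalidity means. A sketch:

    from itertools import product

    def implies(a, b):
        return (not a) or b

    counterexamples = [(a, b)
                       for a, b in product([True, False], repeat=2)
                       if implies(a, b) and b   # both premises true...
                       and not a]               # ...but conclusion false
    print(counterexamples)
    # [(False, True)]: A ("I live in Iowa") false, B ("I live in the United
    # States") true -- premises true, conclusion false, so the form is invalid.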
So, to summarize:

•We can rephrase the Logic Condition in a more helpful way by emphasizing the
hypothetical character of the property that we're interested in, which is the LOGICAL
RELATIONSHIP between premises and conclusion: If the premises are all true, then we
have good reason to accept the conclusion.

•The actual truth or falsity of the premises is irrelevant to the logic of the argument.

•Argument analysis is a two-stage process. When we evaluate the logic of the argument
we're not concerned about the actual truth or falsity of the premises. All we're concerned
with is whether the conclusion follows from those premises.

•Once we've evaluated the logic, then we can ask whether the premises are actually true
or plausible.

•If you confuse these steps, and make the mistake of judging the logic of an argument in
terms of the truth or falsity of the premises, then you won't be able to properly evaluate an
argument.

This distinction, between TRUTH and LOGIC, is arguably the most important distinction for
critical argument analysis.
2.3 Valid vs invalid arguments

An argument has to satisfy the Logic Condition in order for it to qualify as a good
argument. But there are two importantly different ways in which an argument can satisfy
the Logic Condition.

One way is if the argument is "valid". Another way is if the argument is "strong".

"Validity" and "strength" are technical terms that logicians and philosophers use to
describe the logical "glue" that binds premises and conclusions together. Valid arguments
have the strongest logical glue possible.

In this tutorial we're going to talk about "validity" and the difference between "valid" versus
"invalid" arguments. In the next tutorial we'll talk about "strength" and the difference
between "strong" versus "weak" arguments.

Together, these two concepts, validity and strength, will help us to specify precisely what it
means for an argument to satisfy the Logic Condition.

We've seen valid arguments before. Recall the Tom Cruise argument:
1. All actors are robots.
2. Tom Cruise is an actor.
Therefore, Tom Cruise is a robot.


This is an example of a valid argument.


Here's the standard definition of a valid argument.
An argument is VALID if it has the following hypothetical or conditional property:
IF all the premises are true, then the conclusion CANNOT be false.

In this case we know that in fact the first premise is false — not all actors are robots — but
the argument is still valid because IF the premises were true it would be IMPOSSIBLE for
the conclusion to be false.

In a hypothetical world where all actors are robots, and Tom Cruise also happens to be an
actor, it's logically impossible for Tom Cruise NOT to be a robot.

THAT is the distinctive property of this argument that we're pointing to when we call it
valid; that it's logically impossible for the premises to be true and the conclusion false.
Or to put it another way, the truth of the premises guarantees the truth of the conclusion.
These are all different ways of saying the same thing. Validity is the strongest possible
logical glue you can have between premises and conclusion.

Here's an example of an INVALID argument:


1. All actors are robots.
2. Tom Cruise is a robot.
Therefore, Tom Cruise is an actor.


The first premise is the same, "All actors are robots". But the second premise is different.
Instead of assuming that Tom Cruise is an actor, we're assuming that Tom Cruise is a
robot.
Now, if these are both true, does it follow that Tom Cruise HAS to be an actor? No, it does
not follow. It would follow if we said that ONLY actors are robots, but the first premise
doesn't say that.

All we can assume is that in this hypothetical world, anyone in the acting profession is a
robot, but robots might be doing lots of different jobs besides acting. They might be
mechanics or teachers or politicians or whatever. So in this hypothetical world the fact that
Tom Cruise is a robot doesn't guarantee that he's also an actor.

And THAT is what makes this an invalid argument.


An argument is INVALID just in case it's NOT VALID.
What this means is that even if all the premises are true, it's still possible for the conclusion
to be false. The truth of the premises doesn't guarantee the truth of the conclusion.

That's ALL it means to call an argument "invalid".


In particular, it doesn't imply that the argument is bad. As we'll see in the next tutorial,
invalid arguments can still be good arguments. Even if they don't guarantee the conclusion
they can still give us good reasons to believe the conclusion, so they can still satisfy the
Logic Condition.

But like I said, we'll talk more about this later.


I'll end with a cautionary note about this terminology.
We're using the terms "valid" and "invalid" in a very specific technical sense that is
commonly used in logic and philosophy but not so common outside of these fields.

As we all know in ordinary language the word "valid" is used in a bunch of different ways.
Like when we say "that contract is valid", meaning something like the contract is “legally
legitimate” or that it's “executed with proper legal authority”.

Or when we say "You make a valid point", we mean that the point is “relevant” or
“appropriate”, or that it has some justification behind it.

These are perfectly acceptable uses of the term "valid". But I just want to emphasize that
this isn't how we're using the term in logic when we're doing argument analysis. It's
important to keep the various meanings of "valid" and "invalid" distinct so there's no
confusion.

Note for example that when we use the terms valid and invalid in logic we're talking about
properties of whole arguments, not of individual claims.

If we're using the terms in the way we've defined them in this tutorial then it makes NO
SENSE to say that an individual premise or claim is valid or invalid.

Validity is a property that describes the logical relationship between premises and
conclusions. It's a feature of arguments taken as a whole. Still, it's very common for
students who are new to logic to confuse the various senses of valid and invalid, and make
the mistake of describing a premise as invalid when what they mean is simply that it's false
or dubious.

So that's just a cautionary note about the terminology. If you keep the logical definition
clear in your mind then you shouldn't have a problem.
2.4 Strong vs weak arguments
Transcript

There are two importantly different ways in which an argument can satisfy the Logic
Condition. One way is if the argument is VALID. Another way is if the argument is
STRONG. We've talked about validity. Now let's talk about strength.

Here's an argument:
1. All humans have DNA.
2. Pat is human.
Therefore, Pat has DNA.


This is a valid argument. If the premises are true the conclusion can't possibly be false.
Now take a look at this argument:
1. 50% of humans are female.
2. Pat is human.
Therefore, Pat is female. 


The percentage isn't exact, but we're not interested in whether the premises are actually
true. We're interested in whether, if they were true, the conclusion would follow.
In this case the answer is clearly NO. Knowing that Pat is human doesn't give us any good
reason to think that Pat is female.

This is an example of an argument that does NOT satisfy the Logic Condition.
Now take a look at this argument:
1. 90% of humans are right-handed.
2. Pat is human.
Therefore, Pat is right-handed. 


This argument is different. In this case the premises make it very likely — 90% likely —
that the conclusion is true. They don't guarantee that Pat is right-handed, but we might still
want to say that they provide good reasons to think that Pat is right-handed.

And if that's the case then we should say that this argument satisfies the Logic Condition.
Because it has the property that, if all the premises are true, they give us good reason to
believe the conclusion.

This difference is what the distinction between weak and strong arguments amounts to.
The first argument is what we call a logically WEAK argument. It does not satisfy the Logic
Condition and so it can't be a good argument.

The second argument is a logically STRONG argument. It does satisfy the Logic Condition
so it can be a good argument.
This is what distinguishes these arguments, but note what they have in common. They're
both logically INVALID.

In a valid argument if the premises are true the conclusion can't possibly be false. Neither
of these arguments guarantees certainty. They're both fallible inferences. Even if the
premises are true you could still be wrong about the conclusion.
The difference is that in a STRONG argument the premises make the conclusion VERY
LIKELY true. A WEAK argument doesn't even give us this.
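
A quick simulation can make the difference vivid. In this Python sketch (the probabilities are stipulated for illustration, not real statistics) we draw random individuals and see how often the conclusion would come out true under a 90% premise versus a 50% premise:

    import random

    random.seed(0)  # make the run repeatable

    def conclusion_rate(premise_probability, trials=100_000):
        # Fraction of random individuals for whom the conclusion comes out true.
        hits = sum(random.random() < premise_probability for _ in range(trials))
        return hits / trials

    print(f"90% premise: conclusion true {conclusion_rate(0.90):.1%} of the time")
    print(f"50% premise: conclusion true {conclusion_rate(0.50):.1%} of the time")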

Now these examples immediately raise an important question:


HOW strong does the inference have to be for the argument to satisfy the Logic Condition
and qualify as a strong argument?

To put it another way, with what probability must the conclusion follow from the premises
for the argument to qualify as strong?

50% is clearly too weak. 90% is clearly strong enough. But where's the cut-off? What
threshold does the strength of the logical inference have to meet to count as satisfying the
Logic Condition?

Well, it turns out that there is no principled answer to this question.


The distinction between valid and invalid arguments is a sharp one. Every argument is
either valid or invalid. There are no "degrees" of validity. Validity is like pregnancy — you
can't be almost pregnant or a little bit pregnant.

The distinction between strong and weak arguments, on the other hand, is a matter of
degree. It does make sense to say that an argument is very strong, or moderately strong,
or moderately weak or very weak.

But the threshold between weak and strong arguments isn't fixed or specified by logic. It is,
in fact, a conventional choice that we make. We decide when the premises provide
sufficient evidence or reason to justify accepting the conclusion. There are no formal
principles of logic that make this decision for us.

This is actually a big topic that needs a lot more space to discuss properly. It really belongs
in a course on inductive and scientific reasoning.

But there are some common argument forms that people generally recognize as valid,
strong or weak, and these are helpful to know.

Here are three simple argument forms, recognized as valid, strong and weak
respectively.
1. ALL A are B.
2. x is an A.
Therefore, x is a B.


An example of a valid argument of this form is


1. All actors are robots.
2. Tom is an actor.
Therefore, Tom is a robot. 


We've seen this one before. But if we change "ALL" to "MOST" we get an invalid but
strong argument:
1. Most A are B.
2. x is an A.
Therefore, x is a B.

1. Most actors are robots.
2. Tom is an actor.
Therefore, Tom is a robot.


The conclusion doesn't follow with certainty, but we're stipulating that "most" means
"enough to make it reasonable to believe the conclusion".
If we switch from "most" to "some" we get a weak argument:
1. Some A are B.
2. x is an A.
Therefore, x is a B.


1. Some actors are robots.


2. Tom is an actor.
Therefore, Tom is a robot.


"Some actors are robots" doesn't even guarantee 50-50 odds. The way this term is
commonly used in logic, "some" just means that AT LEAST ONE actor is a robot.
These definitions summarize what we've seen so far:

VALID: If all the premises are true, the conclusion follows with certainty.
STRONG: If all the premises are true, the conclusion follows with high probability.
WEAK: If all the premises are true, the conclusion follows neither with certainty nor with
high probability.

Validity, strength and weakness are logical properties of arguments that characterize the
logical relationship between the premises and the conclusion.

With valid arguments the conclusion follows with certainty, it's impossible for the premises
to be true and the conclusion false.

With strong arguments it's possible for the conclusion to be false but it's unlikely; the
conclusion follows with high probability.

With weak arguments the conclusion isn't even likely or highly probable. It doesn't
necessarily mean that it's unlikely either, or that the conclusion has a very low probability
of being true. It just means that the premises don't give us good enough reason to think
the conclusion is true.

And finally, both valid and strong arguments satisfy the Logic Condition for an argument to
be good. Weak arguments fail to satisfy the Logic Condition and so are automatically ruled
out as bad.
2.5 Definition of a good argument (II)
Transcript

Now that we’ve discussed the truth condition and the logic condition in more detail, we can
state the conditions for an argument to be good with more precision than in our first
attempt.

Here was our first pass at a definition (see “What is a good argument (I)?”): if an
argument is good then

(1) it has all true premises, and


(2) the conclusion follows from the premises.
Here’s our current definition:
If an argument is good then
(1) all the premises are plausible, and
(2) the argument is either valid or strong.
Plausible

We’re using “plausible” in the Truth Condition to highlight the fact that a given premise
might be regarded as true by one audience but as false by another, and what we want are
premises that are regarded as true by the target audience of the argument.

So, a plausible premise is one where the audience believes it has good reason to think it’s
true, and so is willing to grant it as a premise.

This helps to distinguish a plausible premise from a reading of “true premise” that’s defined
in terms of correspondence with the objective facts. We’d like our premises to be true in
this sense, but what really matters to argument evaluation is whether they’re regarded as
plausible or not by the intended audience of the argument.

Valid or Strong
We use “valid” and “strong” to help specify precisely what we mean when we say that the
conclusion follows from the premises.
A valid argument is one where the conclusion follows with absolute certainty from the
premises, where the truth of the premises logically necessitates the truth of the conclusion.
A strong argument is one where the conclusion follows not with absolute certainty, but with
some high probability.

Together, these help to clarify what we mean when we say that an argument satisfies the
Logic Condition.
Altogether, these definitions of plausibility, validity and strength give us a set of
necessary conditions for an argument to be good.
Part 3: Deductive versus Inductive Arguments
3.1 Deduction and valid reasoning
Transcript

In this final series of tutorials we’re going to look at the distinction between “deductive”
reasoning and “inductive” reasoning, and see how they relate to the concepts of validity
and strength that we just introduced.

Both of these terms, “deductive” and “inductive”, have a life outside of their usage in logic,
and they can be used in different ways so it’s helpful to be familiar with the various ways
they’re used.

In this section we’ll look at the relationship between deduction and valid arguments.
In ordinary logic, the term “deductive argument” or “deductive inference” is basically a
synonym for “valid argument” or “valid inference”. The terms are often used
interchangeably.

However, it’s also common to describe an argument as a deductive argument even if the
argument fails to be valid.

For example, someone might give an argument like this one:


1. If the match is burning then there is oxygen in the room.
2. The match is not burning.
Therefore, there is no oxygen in the room.


and they might intend for this argument to be valid. They believe the conclusion follows
with certainty from the premises.

But in this case they’ve made a mistake — this argument isn’t valid, it’s invalid. Just
because the match isn’t burning it doesn’t follow that there’s no oxygen in the room.
So this is an invalid argument, but it was intended as a valid argument. In this case, we’ll
still want to call it a deductive argument, but it’s a failed deductive argument, a deductive
argument that is guilty of a formal fallacy, a mistake in reasoning.

So, while the terms “deductive” and “valid” are sometimes used interchangeably, they
aren’t strict synonyms.

•When you describe an argument as valid you’re saying something about the logic of the
argument itself.
•When you describe an argument as deductive you’re saying something about the
conscious intentions of the person presenting the argument, namely, that they are
intending to offer a valid argument.
You need to draw this distinction in order for it to be meaningful to say things like “this is a
valid deductive argument” or “this is an invalid deductive argument”, which is a pretty
common thing to say in logic.
3.2 Induction and invalid reasoning
Transcript

In logic there’s a close relationship between deductive and valid arguments, and there’s a
similar relationship between inductive and strong arguments.

In standard logic, the term “inductive argument” basically means “an argument that is
intended to be strong rather than valid”.

So, when you give an inductive argument for a conclusion, you’re not intending it to be
read as valid. You’re acknowledging that the conclusion doesn’t follow with certainty from
the premises, but you think the inference is strong, that the conclusion is very likely true,
given the premises.

Here’s an example of a strong argument:


1. Most Chinese people have dark hair.
2. Julie is Chinese.
Therefore, Julie has dark hair.


We would call this an inductive argument because it’s obvious that the argument is
intended to be strong, not valid. Since the argument is in fact strong, it counts as a
successful inductive argument.
And as with deductive arguments, we also want to be able to talk about FAILED inductive
arguments, arguments that are intended to be strong but are in fact weak.
Like this one:
1. Most Chinese people have dark hair.
2. Julie has dark hair.
Therefore, Julie is Chinese.


Here we’re supposed to infer that, simply because Julie has dark hair, she’s probably
Chinese. This is a weak argument.

But we still want to call it an inductive argument if the intention was for it to be strong. In
this case the word “most” indicates that the inference is intended to be strong rather than
valid. We would call this a WEAK inductive argument.

So the terms “strong” and “inductive” have a relationship similar to the terms “valid” and
“deductive”.
•To call an argument STRONG is to say something about the logical properties of the
argument itself (that if the premises are true, the conclusion is very likely true).
•To call an argument INDUCTIVE is to say something about the INTENTIONS of the arguer
(that the argument is intended to be strong).
3.3 Induction and scientific reasoning
Transcript

In the terminology of standard textbook logic an inductive argument is one that's intended
to be strong rather than valid. But this terminology isn't necessarily standard outside of
logic.

In the sciences in particular there is a more common reading of "induction" that means
something like "making an inference from particular cases to the general case".

In this tutorial we're going to talk about this reading of the term and how it relates to the
standard usage in logic, and the role of inductive reasoning in the sciences more broadly.
The Standard Scientific Meaning of “Induction”

In the sciences the term "induction" is commonly used to describe inferences from
particular cases to the general case, or from a finite sample of data to a generalization
about a whole population.

Here's the prototype argument form that illustrates this notion of induction:
1. a1 is B.
2. a2 is B.
:
n. an is B.
Therefore, all A are B


You note that some individual of a certain kind, a1, has a property B. An example would be
"This swan is white".

Then you note that some other individual of the same kind a2 has the same property —
"This OTHER swan is white".

And you keep doing this for all the individuals available to you: THIS swan is white,
and THIS swan is white, and THIS SWAN OVER THERE is white, and so on.

So you've observed n swans, and all of them are white. From here the inductive move is to
say that ALL swans, EVERYWHERE are white. Even the swans that you haven't observed
and will never observe.

This is an example of an inductive generalization.
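
Here's a small Python sketch of enumerative induction (the swan observations are invented for illustration). It also shows why such conclusions are fallible: a single new observation can overturn the generalization.

    observed_swans = ["white"] * 20  # twenty observed cases, all with property B

    if all(color == "white" for color in observed_swans):
        print("Inductive conclusion: all swans are white.")

    observed_swans.append("black")   # one new observation...
    if not all(color == "white" for color in observed_swans):
        print("...and the generalization is refuted by a single counterexample.")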

Arguments of this form, or arguments that do something similar (namely, infer a general
conclusion from a finite sample), exemplify the way the term "induction" is most commonly
used in science.
Another example that illustrates inductive reasoning in this sense is the reasoning involved
in inferring a functional relationship between two variables based on a finite set of data
points.

Let's say you heat a metal rod. You observe that it expands the hotter it gets. So for
various temperatures you plot the length of the rod against the temperature, and you get
a spread of data points.

What you may want to know, though, is how length varies with temperature generally, so
that for any value of temperature you can then predict the length of the rod.
To do that you might try to draw a curve of best fit through the data points.

Suppose the plot shows a pretty linear relationship; then a straight line seems like it'll
give a good fit to the data.

Given the equation for this functional relationship, you can now plug in a value for the
temperature and derive a value for the length.

The equation for the straight line is an inductive generalization you've inferred from
the finite set of data points.

The data points in fact are functioning as premises and the straight line is the general
conclusion that you're inferring from those premises.
When you plug in a value for the temperature and derive a value for the length of the rod
based on the equation for the straight line, you're deriving a prediction about a specific
event based on the generalization you've inferred from the data.

This example illustrates just how common inductive generalizations are in science, so it's
not surprising that scientists have a word for this kind of reasoning.

In fact, the language of induction used in this sense can be traced back to people like
Francis Bacon who back in the 17th century articulated and defended this kind of
reasoning as a general method for doing science.

Scientific vs Logical Senses of Induction

So how does this kind of reasoning relate to the definition of induction used in logic?

Recall, in standard logic an argument is inductive if it's intended to be strong rather than
valid.

The key thing to note is that this is a much broader definition than the one commonly used
in the sciences. That definition focuses on arguments of a specific form, those where the
premises are claims about particular things or cases, and the conclusion is a
generalization from those cases.

But if you take the standard logical definition of an inductive argument you find that many
different kinds of arguments will qualify as inductive, not just arguments that infer
generalizations from particular cases.

So, for example, on the logical definition, a prediction about the future based on past data
will count as an inductive argument.

1. The sun has risen every day in the east for all of recorded history.
Therefore, the sun will rise in the east tomorrow.

The sun has risen every day for as long as the earth has existed, as far as we know. So
we expect the sun to rise tomorrow as well.

This is an inductive argument on our definition because we acknowledge that even with
this reliable track record it's still possible for the sun to not rise tomorrow. Aliens, for
example, might blow up the earth or the sun overnight.

So this inference from the past to the future is inductive, and most of us would say that it's
a strong inference. But notice that it's not an argument from the particular to the general.
The conclusion isn't a generalization, it's a claim about a particular event, the rising of the
sun tomorrow.

So this kind of argument wouldn't count as inductive under the standard science
definition, but it does count under the standard logical definition.
Here's a second example that illustrates the difference:
1. 90% of human beings are right-handed.
2. John is a human being.
Therefore, John is right-handed. 


Notice that the main premise is a general claim, while the conclusion is a claim about a
particular person. On the standard science definition this isn't an inductive argument, since
it's moving from the general to the particular rather than from the particular to the general.

But on the logical definition of induction this argument does count, since the argument is
intended to be strong, not valid.

The relationship between the two definitions looks like this:

The arguments that qualify as inductive under the standard science definition are a subset
of the arguments that qualify as inductive under the standard logical definition.

So from a logical point of view there's no problem with calling an inference from the
particular to the general an inductive argument since all such arguments satisfy the basic
logical definition.
But scientists are sometimes confused when they see the term "induction" used to
describe other forms of reasoning than the ones they normally associate with inductive
inferences. There shouldn't be any confusion as long as you keep the two senses in mind
and distinguish them when it's appropriate.

But if you don't distinguish them then you may run into discussions like this one that
contradict themselves. Below are the first two sentences of the Wikipedia entry on "induction"
at the time of making this tutorial (2010):

"Induction or inductive reasoning, sometimes called inductive logic, is the process of


reasoning in which the premises of an argument are believed to support the conclusion but
do not entail it, i.e. they do not ensure its truth. Induction is a form of reasoning that makes
generalizations based on individual instances."

The first sentence presents the standard logical definition — inductive reasoning is defined
as strong reasoning, reasoning that doesn't guarantee truth. The second sentence
presents the standard science definition of induction, defining it as reasoning from the
particular to the general.

Later on in the article they present a number of examples of inductive arguments that
satisfy the logical definition but not the scientific definition, such as inferences from
correlations to causes, or predictions of future events based on past events, and so on.
These examples flat out contradict the definition of induction in the second sentence.
So if anyone out there is inclined, you might want to edit that entry to clarify the distinction
we've been discussing here. :)

Summary

Now let's summarize some key points of this discussion.

The first is that we should be aware that there is a difference between the way the term
"induction" is defined in general scientific usage and the way it's defined in logic. The
logical definition is much broader — it's basically synonymous with "non-deductive"
inference. The scientific usage is narrower, and focuses on inferences from the particular
to the general.
Second, induction in the broader logical sense is fundamental to scientific reasoning in
general. Inductive reasoning is risky reasoning, it's fallible reasoning, where you're moving
from known facts about observable phenomena, say, to a hypothesis or a conclusion
about the world beyond the observable facts.
The distinctive feature of this reasoning is that you can have all the observable facts
right, and you can still be wrong about the generalizations you draw from those
observations, or the theoretical story you tell to try to explain those observations. It's a
fundamental feature of scientific theorizing that it's revisable in light of new evidence and
new experience.
It follows from this observation that scientific reasoning is broadly speaking inductive
reasoning — that scientific arguments should aim to be strong rather than valid, and that
it's both unrealistic and confused to expect them to be valid.
Disciplines that trade in valid arguments and valid inferences are fields like mathematics,
computer science and formal deductive logic. The natural and social sciences, on the
other hand, deal with fallible, risky inferences. They aim for strong arguments.
Basic Concepts in Propositional Logic
Part 1: Compound Claims
1.1 Conjunctions (A and B)

A conjunction is a compound claim formed by, as the name suggests, conjoining two
or more component claims. The component claims are called the “conjuncts”.

The logic of conjunctions is pretty straightforward.

Here’s a simple conjunction pertaining to my preferences for pizza toppings:

Let A stand for the claim "I love pepperoni".

Let B stand for the claim "I hate anchovies".

Then the conjunction of A and B is the compound claim “I love pepperoni AND I hate
anchovies”.

We want to know the conditions under which the conjunction as a whole is true or false.

In this case it’s pretty obvious. The conjunction “A and B” is true just in case each of the
conjuncts is true. If either one is false then the conjunction as a whole is false.

It’s sometimes handy to represent the logic of compound claims with a table that gives the
truth value for the compound claim for every possible combination of truth values of the
component claims.

For conjunctions the “truth table” looks like this:

Under each of the conjuncts we list all the possible truth values in such a way that each
row represents a distinct logical combination of truth values.
In the first row, A is true and B is true. In the second row, A is true and B is false.

In the third row, A is false and B is true, and in the last row, A is false and B is false.

This exhausts all the possible combinations of truth values.

The middle column under the “and” represents the truth value of the conjunction taken as
a whole, “A and B”, as a function of the truth values for A and B in the adjacent row.

So, in the first row, we see that if both A and B are true, then the conjunction as a whole is
true.

But for every other combination of truth values, where at least one of the conjuncts is
false, the conjunction as a whole is false.

We’ll use truth tables like this one to represent the logic of all the compound forms we’ll be
looking at.
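
A quick way to see this table for yourself, if you like to tinker, is to have Python enumerate it. This is just a sketch using the built-in "and" operator:

from itertools import product

# Print the truth table for the conjunction "A and B".
print("A     B     A and B")
for A, B in product([True, False], repeat=2):
    print(f"{A!s:<5} {B!s:<5} {A and B}")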

Knowing the logic of the conjunction doesn’t help much if you can’t recognize when a
conjunction is being used in ordinary language. Here are a few things to look out for.

“John is a Rolling Stones fan and John is a teacher.”

“John is a Rolling Stones fan and a teacher.”

In the first sentence the conjunctive form is transparent. Each of the conjuncts, "John is a
Rolling Stones fan" and "John is a teacher", shows up as a complete sentence on either
side of the “and”.

But the second sentence represents the very same conjunction as the first sentence. The
syntax is different, but from the standpoint of propositional logic, the semantics, the
meaning of the sentence, is exactly the same as the first sentence. It’s implicit that “a
teacher” is a predicate term that takes “John” as the subject. Don’t make the mistake of
reading a sentence like this as a simple, non-compound claim.

Also, conjunctions don’t always use the word “and” to flag the conjunction. In ordinary
language there may be a subtle difference in meaning between this sentence using “but”
...

“John is a Rolling Stones fan BUT he doesn’t like The Who.”

... and the same sentence using “and”, but from the standpoint of propositional logic,
where all we care about is whether the sentence is true or not, and how the truth of the
sentence depends on the truth of any component claims, this sentence still represents a
conjunction. The “but” doesn’t make any difference from this perspective. This sentence is
still true just in case John is a Rolling Stones fan AND John doesn’t like The Who.

There are other words that sometimes function to conjoin claims together, like “although”,
“however”, and “yet”. You can substitute all of these for “but” in this sentence and you’ll get
slight variations in the sense of what’s being said, but from the standpoint of propositional
logic all of these represent the same conjunction, a claim of the form “A and B”.

One last point. Conjunctions can have more than two component claims. A claim like

“A and B and C and D and E”

might represent a compound claim like,

“John is a writer and a director and a producer of The Simpsons TV show, but he’s
also a stand-up comic and an accomplished violinist”.

This is still a conjunction, and it follows the same rules as any conjunction, namely, it’s true
as a whole just in case all those component claims are true, and false otherwise.

This is about all you need to know about the logic of conjunctions and how conjunctions
are expressed in ordinary language.
1.2 Disjunctions (A or B)

You form a conjunction when you assert that two or more claims are all true at the same
time. You form a disjunction when you assert that AT LEAST ONE of a set of claims
is true. We’ll look at the logic of disjunctive claims in this video.

“John is at the movies or John is at the library.”

You form a disjunction by saying that either one of a set of claims is true. In this case
we’ve got two claims, “John is at the movies” and “John is at the library”. The
disjunction asserts that one of these is true, John is either at the movies or he’s at the
library.

The individual claims that make up a disjunction are called the “disjuncts”. So, you’d say
that in this case “A” and “B” are the disjuncts, and the disjunction is the whole claim, “A or
B”.

We’re interested in when a disjunction as a whole counts as true or false.

There are two kinds of cases we need to distinguish. Both of the sentences below express
disjunctions. You can represent these as claims of the form “A or B”.

“A triangle can be defined as a polygon with three sides or as a polygon with three
vertices.”

“The coin landed either heads or tails.”

But there’s a difference between the top sentence and the bottom sentence. With the top
sentence, both disjuncts can be true at the same time, they’re not mutually exclusive. You
can define a triangle as a polygon with three sides, or as a polygon with three vertices, and
both are equally good definitions.

With the sentence on the bottom it’s different. A coin toss is either heads or tails, it can’t be
both. So the “or” expressed in the bottom sentence is more restrictive than the “or”
expressed in the top sentence.

The “or” on top is called an “inclusive OR”. It includes the case where both disjuncts are
true.

The “or” on the bottom is called an “exclusive OR”. It excludes the case where both
disjuncts are true.

When examining arguments that use “OR” you need to know what kind of OR you’re
dealing with, an inclusive OR or an exclusive OR, because the logic is different.
Here are the truth tables for the inclusive OR and the exclusive OR:

With the inclusive OR, if either A or B is true, then the disjunction as a whole is true. The
only case where it’s false is if both A and B are false.

The truth table for the exclusive OR is exactly the same except for the first row, where both
A and B are true. The exclusive OR says that A and B can’t both be true at the same time,
so for this combination the disjunction is false.
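
Here's a small Python sketch that prints both tables side by side. For booleans, "A != B" behaves as an exclusive OR (true when exactly one of A and B is true):

from itertools import product

print("A     B     inclusive OR  exclusive OR")
for A, B in product([True, False], repeat=2):
    print(f"{A!s:<5} {B!s:<5} {(A or B)!s:<13} {A != B}")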

You’re using an exclusive OR when you say things like, “The dice rolled either a six or a
two”, “The door is either open or shut”, “I’m either pregnant or I’m not pregnant”, “I either
passed the course or I failed the course”.

On the other hand, if a psychic predicts that you will either come into some money or meet
a significant new person in the next month, that’s probably an inclusive OR, since they’re
probably not excluding the possibility that both might happen.

But sometimes it’s hard to know whether an OR is intended to be inclusive or exclusive,
and in those cases you might need to ask for clarification if an argument turns on how you
read the “OR”.

That’s it for the logic of the disjunction.

I’ll finish here with the same point I made about conjunctions, namely that a string of
disjunctions like this — “A or B or C or D or E” — is still a disjunction.

When I see strings like this I think of detective work, where you’re given a set of clues and
you’ve got to figure out who did it, and the string is a list of suspects, and with each new
clue or bit of evidence you systematically eliminate one of the options until you’re left with
the killer. You see similar logic with medical diagnosis or forensic research.

We look more closely at the logic of disjunctive arguments in the video course titled
“Common Valid and Invalid Argument Forms”.
1.3 Conditionals (If A then B)

Conditionals are claims of the form “If A is true, then B is true”. We use and reason with
conditionals all the time.

There is a lot that can be said about conditionals. In this tutorial we’re just going to give the
basic definition and the truth table for the conditional.

In part 4 of this course I’ll come back to conditionals and say a few more things about the
different ways in which we express conditional relationships in language.

Here’s a conditional. “If I miss the bus then I’ll be late for work”.

It’s composed of two separate claims, “I miss the bus”, and “I’ll be late for work.”

The conditional claim is telling us that if the first claim is true, then the second claim is also
true.

We have names for the component parts of a conditional. The first part, the claim that
comes after the “if”, is called the “antecedent” of the conditional. The second part, the
claim that comes after the “then”, is called the “consequent” of the conditional.

The names are a bit obscure but they do convey a sense of the role that the claims are
playing. What “antecedes” is what “comes before”. The “consequent” is a “consequence”
of what has come before.

The names are handy to know because they’re used in translation exercises where you’re
asked to express a bit of natural language as a conditional, and they’re used to identify
the most common logical fallacies that are associated with conditional arguments.

One of these fallacies, for example, is called “affirming the consequent”. You commit this
fallacy when you’re given a conditional like this and assume, from the fact that I was late
for work, that I must have missed the bus. You’re affirming the consequent and trying to
infer the antecedent. This is an invalid inference, and the name for the fallacy, which you’ll
find in any logic or critical thinking textbook, is “affirming the consequent”.

It’s pretty easy to understand what conjunctions and disjunctions assert, but it’s not quite
as easy to see exactly what it is that you’re asserting when you assert a conditional.

Here’s a conditional: “If I drive drunk then I’ll get into a car accident”.

Question: If I assert that this is true, am I asserting that I’m driving drunk?

No, I’m not.

Am I asserting that I’m going to get into a car accident?

No, I’m not asserting that either.

When I assert “if A then B”, I’m not asserting A, and I’m not asserting B. What I’m
asserting is a conditional relationship, a relationship of logical dependency between A and
B. I’m saying that if A were true, then B would also be true. But I can say
that without asserting that either A or B is in fact true.

It follows that a conditional can be true even when both the antecedent and the
consequent are false. Here’s an example:

“If I live in Beijing then I live in China.”

I don’t live in Beijing, and I don’t live in China, so both the antecedent and the consequent
are false, but this conditional is clearly true: if I did live in the city of Beijing, I would live in
China.

In a minute we’ll look at the truth table for the conditional, which gives you the truth value
of the conditional for every possible combination of truth values of A and B. The easiest
way to understand that truth table is to think about the case where we would judge a
conditional to be false. Here’s one:

“If I study hard then I’ll pass the test.”

Under what conditions would we say that this conditional claim is false?

Well, let’s consider some possibilities. Let’s say I didn’t study hard, but I still passed the
test. Here the antecedent is false but the consequent is true. In this case, would the
conditional have been false?

Well, no. The conditional could still be true in this case. What it says is that if I study hard
then I’ll pass. It doesn’t say that the only way I’ll pass is if I study hard. So, my failing to
study hard and still passing doesn’t falsify the conditional.

So, this combination of truth values does not make the conditional false.

Now, what about this case? I didn’t study hard and I didn’t pass the test. Here, both the
antecedent and the consequent are false.

This clearly doesn’t falsify the conditional. If anything this is what you might expect would
happen if the conditional was true and the test was hard!

So this combination of truth values doesn’t make the conditional false either.

Now let’s look at a third case: I studied hard but I didn’t pass the test. Here, the antecedent
is true but the consequent is false.

Under these conditions, could the conditional still be true?

No, it can’t. THESE are the conditions under which a conditional is false, when the
antecedent is true but the consequent turns out to be false.

If I studied hard and didn’t pass, this is precisely what the conditional says won’t happen.

So, in general, a conditional is false just in case the antecedent is true and the consequent
is false.
This turns out to be the ONLY case when we want to say with certainty that a conditional
claim is false. All other combinations of truth values are consistent with the conditional
being true.

This, now, gives us the truth table for the conditional.

For the sake of consistency and familiarity with the truth tables for the conjunction and the
disjunction, I’ve placed the arrow symbol in between the A and the B to represent the
conditional operator, where “A arrow B” means “If A then B”. Actually in formal symbolic
logic an arrow is often used as the symbol for the conditional. It’s also common in formal
logic to use a sideways horseshoe symbol for the conditional, but there’s no need to
complicate things more than they are.
The arrow has the virtue that it gives a nice visual cue about the direction of the logical
dependency between the antecedent and the consequent. The conditional asserts that if A
is true then you can infer B, it doesn’t go the other way, it doesn’t say that if B is true then
you can infer A.

Notice that the second row gives the only combination of truth values that makes the
conditional false, when A is true and B is false. For all other combinations the conditional
counts as true. This definition doesn’t always give intuitive results about how to interpret
language that uses conditionals, but for purposes of doing propositional logic it gets it right
in all the cases that matter.
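
If you want to generate this table yourself, here's a short Python sketch. It uses the fact, noted above, that the conditional is false only when A is true and B is false, which is the same truth function as "(not A) or B":

from itertools import product

print("A     B     if A then B")
for A, B in product([True, False], repeat=2):
    print(f"{A!s:<5} {B!s:<5} {(not A) or B}")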

That’s it for now for the logic of the conditional. We’ll come back to conditionals when we
talk about contradictories, and in part four of this course, where we look at different ways
we express conditionals in ordinary language.
Part 2: Contradiction and Consistency
2.1 Contradictories (not-A)

In Part 2 of this series of tutorials we’re going to look at the concept of the contradictory of
a claim, distinguish it from the contrary of a claim, define what a contradiction is, and
introduce the concepts of consistency and inconsistency when applied to a set of claims.

Let A be the claim “John is at the movies”.

The contradictory of A is defined as a claim that always has the opposite truth value
of A. So, whenever A is true, the contradictory of A is false, and whenever A is false, the
contradictory of A is true.

There are a couple of different ways that people write the contradictory of A. We’re going to
write it in English as “not-A”. But in textbooks and websites that treat logic in a more formal
way you’ll likely see “not-A” written with a tilde (~, a wavy symbol) or with the negation
sign (¬, a corner-of-a-rectangle shape).

What does the contradictory assert? It asserts that the claim A is false. There are a couple
of ways of saying this, some more natural than others.

You can read the contradictory of A as “A is false”, i.e.

“‘John is at the movies’ is false”,

or

“It is not the case that John is at the movies”.

But the most natural formulation is obviously

“John is NOT at the movies”.

For simple claims like this it’s not too hard to find a natural way of expressing the
contradictory. For compound claims, like conjunctions or disjunctions or conditionals,
finding the contradictory isn’t so simple, and sometimes we have to revert to more formal
language to make sure we’re expressing the contradictory accurately. In part 3 we’ll spend
some time looking at the contradictories of compound claims.

Here’s the truth table for the contradictory. Pretty simple. Whenever A is true, not-A is
false, and vice versa.
The definition is simple, but the concept is important, and it isn’t trivial when you’re looking
at real-world arguments involving more complex claims.

For example, when you’re debating an issue it’s important that all parties understand what
it means for the claim at issue to be true and what it means for it to be false, so that
everyone understands what sorts of evidence would count for or against the claim. And
this requires that you understand contradictories.

So the concept is simple, but you shouldn’t think it’s trivial.


2.2 Contradictories vs contraries

One of the problems that people have with identifying contradictories is that they
sometimes confuse them with contraries. In this video we’ll clear up the distinction.

Here’s our claim,

A = “The ball is black.”

Now consider the claim “The ball is white”. Is this the contradictory of A?

It’s tempting to say this. There’s a natural sense in which “the ball is white” is opposite in
meaning to “the ball is black”, since black and white are regarded as opposites in some
sense. And it’s also true that they both can’t be true at the same time: if the ball is black
then it’s not white, and vice versa.

But this is NOT the contradictory of A.

Why not?

Well, what if the ball we’re dealing with is a GREY ball? The ball isn’t black, and it isn’t
white -- it’s grey.

This possibility is relevant because it’s a counterexample to the basic definition of a
contradictory.

If “the ball is white” is the contradictory of “the ball is black”, then these are supposed to
have opposite truth values in all possible worlds, whenever one is true the other is false,
and vice versa.

But if the ball is grey, then A, “The ball is black” is false, since grey is not black.

But if A is false, then the contradictory of A must be TRUE. But “The ball is white” is NOT
true, it’s false as well. Both of these claims are false. And that’s not supposed to be
possible if these are genuine contradictories of one another.

We do have a word to describe pairs of claims like these, though. They’re called
“contraries”.

Two claims are contraries of one another if they can’t both be true at the same time,
but they can both be false.

So, the ball can’t be both black and white, but if it’s grey, or red, or blue, then it’s neither
black nor white. These are contraries, not contradictories.

Now, how do we formulate the contradictory of “The ball is black” so that it always has
the opposite truth value?

Like so: you say “The ball is NOT black”.


Now, the ball being grey doesn’t violate the definition of a contradictory. In this world, A is
false, but not-A is true -- it’s true that the ball is not black.

Examples like this illustrate why it’s sometimes helpful to have the formal language in the
back of your head. A more formal way of stating the contradictory is “it is not the case that
A”, “it is not the case that the ball is black”, which is equivalent to “the ball is not black”.

The language is pretty stiff but if you stick “it is not the case that ...” in front of the claim,
you’re guaranteed not to make the mistake we’ve seen here, of mistaking contrary
properties for contradictory properties. In some cases, when dealing with more complex,
compound claims, it’s the only way to formulate the contradictory.
2.3 Contradictions (A and not-A)

The concept of a contradiction is very important in logic. In this video we’ll look at the
standard logical definition of a contradiction.

Here’s the standard definition. A contradiction is a conjunction of the form “A and not-
A”, where not-A is the contradictory of A.

So, this is a compound claim, where you’re simultaneously asserting that a proposition is
both true and false.

Given the logic of the conjunction and the contradictory that we’ve looked at in this course,
we can see that the defining feature of a contradiction is that for all possible combinations
of truth values, the conjunction comes out false, since a conjunction is only true when both
of the conjuncts are true, but by definition, if the conjuncts are contradictories, they can
never be true at the same time.

So, propositional logic requires that all contradictions be interpreted as false. It’s
logically impossible for a claim to be both true and false in the same sense at the same
time.

This is known as the “principle of non-contradiction”, and some people have argued
that this is the most fundamental principle of logical reasoning, in that no argument could
be rationally persuasive to anyone if they were consciously willing to embrace
contradictory beliefs.
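
A two-line Python check makes the point concrete: whichever truth value A takes, the conjunction "A and not-A" comes out false.

# For both possible values of A, "A and (not A)" evaluates to False.
for A in [True, False]:
    print(A, A and (not A))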

There’s a minor subtlety in the definition of a contradiction that I want to mention.

Here’s a pair of claims:

“John is at the movies” and “John is not at the movies”


This is clearly a contradiction, since these are contradictories of one another. John can’t
be both at the movies and not at the movies at the same time.

Now, what about this pair?

“John is at the movies” and “John is at the store”


Recall, now, that these are contraries of one another, not contradictories. They can’t both
be true at the same time, but they can both be false at the same time.

Our question is: Does this form a contradiction?

This is actually an interesting case from a formal point of view. Let’s assume that being at
the store implies that you’re not at the movies (so we’re excluding the odd possibility
where a movie theater might actually be in a store).

Then, it seems appropriate to say that, since they both can’t be true at the same time, it
would be contradictory to assert that John is both at the movies and at the store. And
that’s the way most logicians would interpret this. They’d say that the law of
non-contradiction applies to this conjunction even though, strictly speaking, these aren’t
logical contradictories of one another. The key property it has is that it’s a claim that is
false for all possible truth values.

Here’s another way to look at it.

This is the truth table for the conjunction:

A conjunction is only true when both conjuncts are true. For all other truth values it’s false.

But in our case, the top line of the truth table doesn’t apply, since our two claims are
contraries -- they can’t both be true at the same time. So this case never applies.
The remaining three lines give you all the possible truth values for contraries, and now we
see that the conjunction comes out false for all of them.

This kind of example raises a question that logicians might debate: whether, on the one
hand, a contradiction should be defined as a conjunction of contradictory claims, or, on the
other hand, whether it should be defined as any claim that is false in all logically possible
worlds.

Examples like these suggest to some people that it is this latter definition which is more
fundamental, that it’s more fundamental to say that a contradiction is a claim that is
logically false, false in all possible worlds.

This issue isn’t something you’ll have to worry about, though. If you’re a philosopher or a
logician this may be interesting, but for solving logic problems and analyzing arguments, it
doesn’t make any difference.
2.4 Consistent vs inconsistent sets of claims

Like the terms “valid” and “invalid”, the most common use of “consistency” and
“inconsistency” in everyday language is different from their use in logic.

In everyday language, something is “consistent” if it’s predictable or reliable. So, a
“consistent A student” is a student who regularly and predictably gets As. A consistent
athlete is one who reliably performs at a certain level regardless of the circumstances. An
inconsistent athlete performs well sometimes and not so well other times, and their
performance is hard to predict.

This isn’t how we use the terms “consistent” and “inconsistent” in logic.

In logic, “consistency” is a property of sets of claims. We say that a set of claims is
consistent if it’s logically possible for all of them to be true at the same time.

What does “logically possible” mean here? Logically possible means that the set of
claims doesn’t entail a logical contradiction.

A contradiction is a claim that is false in all logically possible worlds, and we usually write
the general form of a contradiction as a claim of the form “A and not-A”.

“not-A” is usually interpreted as the contradictory of A, but as we saw in the last tutorial,
it can also be the contrary of A.

So, if it’s logically impossible for a set of claims to be true at the same time, then we say
that the set is logically inconsistent.
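
In propositional terms, checking consistency amounts to searching for at least one assignment of truth values that makes every claim in the set true. Here's a brute-force Python sketch of that idea; the claims are toy lambdas invented purely for illustration:

from itertools import product

def consistent(claims, num_vars):
    # Consistent if SOME assignment makes ALL the claims true.
    return any(all(claim(*vals) for claim in claims)
               for vals in product([True, False], repeat=num_vars))

print(consistent([lambda A: A, lambda A: not A], 1))    # A and not-A: False
print(consistent([lambda A, B: A, lambda A, B: B], 2))  # A, B: True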

Let’s look at some examples:

“All humans are mortal.”

“Some humans are not mortal.”

These clearly form an inconsistent set, since these are logical contradictories of one
another. “Mortal” means you will some day die. “Not mortal” means you’ll never die, you’re
“immortal”. If one is true then the other must be false, and vice versa.

Now what about this set?

“All humans are mortal.”

and

“Simon is immortal.”

Can both of these be true at the same time?


In this case, the answer is “yes”, both of these can be true. They’re only inconsistent if you
assume that Simon is human -- if that were true, then these would be inconsistent. But
“Simon” is just a name for an individual, so Simon could be a robot or an angel or an alien,
and if so then the claim about all humans being mortal wouldn’t apply.

Now, if you added the assumption about Simon being human as a claim to this set, like so

“All humans are mortal.”

“Simon is immortal.”

“Simon is human.”

... then you’d have an inconsistent set.

Here you have three claims where, if any two of them are true, the third has to be false.

Let’s take a moment and look at this a little closer.

If all humans are mortal, and if Simon is immortal, then it logically follows that Simon can’t
be human.

By “logically follows” I mean that you can construct a valid argument from these premises
for the conclusion that Simon is not human, like so:

1. All humans are mortal.
2. Simon is immortal.
Therefore, Simon is not human.

This is what it means to say that the set entails a logical contradiction. From the set one
can deduce a claim that is either the contradictory or the contrary of one of the other
claims in the set.

Here’s another way to represent this. The first two claims entail a claim that is the
contradictory of the third claim. And from this it becomes evident that, to assert that all the
claims in the set are true is to assert a formal contradiction.

Now, we can run this with any pair of claims in the set. If we set the second and third
claims as true, for example, then we can infer that the first must be false. If Simon is
immortal and if Simon is human, then it must be the case that not all humans are mortal,
which contradicts the first claim.

The only remaining pair to check is the first and the third claims. If it’s true that all humans
are mortal, and it’s true that Simon is human, we can validly infer that Simon is mortal,
which contradicts the second claim.

This example helps to illustrate another important fact about inconsistent sets of claims. IF
we’re given a set of claims that we know is inconsistent, then we know that at least one of
the claims in the set must be FALSE.
So, if we want to re-establish consistency, we need to abandon or modify at least one of
these claims.

We used logic to establish that the set is inconsistent, but it’s important to understand that
logic alone can’t tell us which of these claims to modify.

All that logic tells us is that you can’t consistently believe all of these claims at the same
time, it doesn’t tell us how in any particular case to resolve the inconsistency.

Nevertheless, it can be very helpful in argumentation to have a group of people come to
agree that a set of claims is inconsistent. In the end they may disagree about how to
resolve the inconsistency, but it’s still an achievement to get everyone to realize that they
can’t accept everything on the table, that something has to go.
Part 3: Contradictories of Compound Claims
3.1 not-(not-A)

In this series of tutorials we’ll be looking at how to write and interpret the contradictories of
the basic compound claim forms -- conjunctions, disjunctions, and conditionals.

But before we do that we should first talk about the contradictory of a contradictory.
Contradictories are sometimes called “negations”, so this rule is commonly called “double
negation”.

The rule is straightforward. If you take a claim and negate it, and then negate the negation,
you recover the original claim.

Here’s a simple example:

“Sarah makes good coffee.”

The contradictory of this is “It is not the case that Sarah makes good coffee”, or, more
naturally, “Sarah does not make good coffee”.

If we now take the contradictory of this, we get an awkward expression. If you were being
very formal you’d say

“It is not the case that it is not the case that Sarah makes good coffee”,

or

“it is false that it is false that Sarah makes good coffee”.

These are pretty unnatural. “It is not the case that Sarah does not make good coffee”
is also pretty unnatural. But you don’t have to say this. With double negation you recover
the original claim, so you can just say “Sarah makes good coffee”.

Double negation is mostly used as a simplification rule in formal logic, but we use it
intuitively in ordinary speech all the time.
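
Double negation is also easy to verify mechanically. A quick Python sketch:

# For both truth values, not-(not-A) has the same value as A.
for A in [True, False]:
    print(A, (not (not A)) == A)   # prints True both times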

One word of caution. To use double negation correctly you need to know how to construct
and recognize the contradictories of different kinds of claims.

For example, let’s say someone wants to say

“It’s false that Sarah and Tom didn’t go bike riding”.

If I want to simplify this using double negation I need to know how to interpret the
contradictory of “Sarah and Tom didn’t go bike riding”.

But does this mean that “Both Sarah and Tom went bike riding”? Or does it mean “Either
Sarah or Tom went bike riding”?
To be sure about this you need to know how to interpret the contradictory of a conjunction
where each of the conjuncts is already negated -- “Sarah didn’t go bike riding AND Tom
didn’t go bike riding”.

The correct answer is the second one, “Either Sarah or Tom went bike riding.”

But you’ll need to check out the tutorial on negating conjunctions to see why.
3.2 not-(A and B)

Let’s look at how to construct the contradictory of a conjunction.

Here’s our claim: “Dan got an A in physics and a B in history”

When I say that this claim is false, what am I saying?

The conjunction says that both of these are true. The conjunction is false when either one
or the other of the conjuncts is false, or both are false.

This gives us the form of the contradictory: “Either Dan didn’t get an A in physics OR he
didn’t get a B in history”.

The contradictory of a conjunction is a DISJUNCTION, an “OR” claim. You construct it by
changing the AND to an OR and negating each of the disjuncts.

If you wanted to be really formal about it you could write a derivation of the contradictory,
but the rule is fairly simple to remember. When you have a “not-” sign in front of a conjunction,
you “push” the “not-” inside the brackets, distribute the “not-” across both of the conjuncts,
and then switch the “AND” to an “OR”.

The basic rule looks like this:

not-(A and B) = (not-A) or (not-B)

If this formula looks “algebraic” to you, that’s because it is. This is a formal equivalence in
propositional logic. It’s also a basic formula of Boolean logic, which computer scientists will
be familiar with since it’s the logic that governs the functioning of digital electronic devices.
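
Since it's Boolean logic, you can check the rule in a few lines of Python by running through all four rows of the truth table:

from itertools import product

# not-(A and B) agrees with (not-A) or (not-B) on every row.
for A, B in product([True, False], repeat=2):
    print(A, B, (not (A and B)) == ((not A) or (not B)))   # always True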

Let’s do some problems with it.

Question: What’s the contradictory of “John and Mary went to the zoo”?

Answer: “Either John didn’t go to the zoo or Mary didn’t go to the zoo.”

You need to recognize the individual conjuncts, negate them, and put an “OR” in between.
Note that the “either” is just a stylistic choice -- “either A or B” is equivalent to “A or B”.

You can also write the answer like this, of course: “Either John or Mary didn’t go to the
zoo”.

Let’s try another one.

Question: What’s the contradictory of “Sam loves hotdogs but he doesn’t like relish”?

With this one you have to remember that, from the standpoint of propositional logic, the
“but” is functioning just like “and”, and the whole thing is still a conjunction.
You’ll also need to pay attention to the negation, “doesn’t like relish”, because you’re
going to end up negating this negation, which gives us an opportunity to use the double
negation rule.

Here’s the answer: “Either Sam doesn’t like hotdogs or he likes relish”.

You replace the conjunction with a disjunction, and you negate the disjuncts. Note that
we’ve used double negation on the second disjunct. It’s much easier to write “he likes
relish” than “it’s not the case that he doesn’t like relish”.

Well, that’s about all you need to know about negating conjunctions. The basic rule is easy
to remember:

“not-(A and B)” = “not-A OR not-B”

We’ll see in the next video that the rule for the contradictory of a disjunction is very similar.
3.3 not-(A or B)

Now that we’ve done the contradictory of a conjunction, the contradictory of a disjunction
will be no problem.

“Dan will either go to law school or become a priest”.

This is a disjunction. What am I saying when I say that this disjunction is false?

The disjunction says that either one, or the other, or both of these are true -- Dan will either
go to law school, or he’ll become a priest, or both.

If this is false, that means that Dan doesn’t do any of these things. He doesn’t go to law
school, and he doesn’t become a priest.

So the contradictory looks like this:

“Dan will not go to law school AND Dan will not become a priest.”

The disjunction has become a conjunction, with each of the conjuncts negated. This is
structurally identical to the rule we saw in the previous video, with the “OR” and the “AND”
switched.

In English we have a natural construction that is equivalent to this conjunction:

“Dan will neither go to law school nor become a priest.”

Remember: “not-A” and “not-B” is the same as “neither A nor B”.

Don’t be fooled by the “or” in “nor” -- this is not a disjunction, it’s a conjunction.

Let’s put the rules for the contradictory of the conjunction and the disjunction side-by-side,
so we can appreciate the formal similarity:

not-(A and B) = (not-A) or (not-B)

not-(A or B) = (not-A) and (not-B)

In propositional logic these together are known as “DeMorgan’s Rules” or “DeMorgan’s
Laws”, named after Augustus DeMorgan who formalized these rules in propositional logic
in the 19th century.

They’re also part of Boolean logic, and are used all the time in computer science and
electrical engineering in the design of digital circuits.
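
The second law can be checked the same way the first one was, with a brute-force run over the truth table in Python:

from itertools import product

# not-(A or B) agrees with (not-A) and (not-B) on every row.
for A, B in product([True, False], repeat=2):
    print(A, B, (not (A or B)) == ((not A) and (not B)))   # always True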

That’s it. These are the rules you need to know to construct the contradictories of
conjunctions and disjunctions.
3.4 not-(If A then B)

The contradictory of the conditional is probably the least intuitive of all the contradictory
forms that we’ll look at. But we’ve already discussed this topic when we introduced the
conditional and presented the truth table for the conditional, so this should be review.

Here’s a conditional:

“If I pay for dinner then you’ll pay for drinks.”

What does it mean to say that this conditional claim is false?

When we first introduced the conditional we looked at this question. We determined that
the only condition under which we would certainly agree that this claim is false is when the
antecedent is true but the consequent is false.

This gives us the form for the contradictory. The most natural way to say it is “I pay for
dinner but you don’t pay for drinks”. I’m affirming the antecedent and denying the
consequent.

Recall from our discussion of the conjunction that “but” just means “and”, and that this is a
conjunction, not a conditional.

Let me repeat that. The contradictory of a conditional is not itself a conditional, it’s a
conjunction.

Here’s the general rule that makes this clear:

not-(If A then B) = A and not-B = A but not-B

The contradictory of a conditional is a conjunction that affirms the antecedent of the
conditional but denies the consequent. Almost always, though, it’s more natural to phrase
the contradictory as “A but not-B”, as in “I pay for dinner but you don’t pay for drinks”.

These are the conditions under which, if they obtained, we’d say that the original
conditional was false.

The most common mistake that students make when solving problems that require taking
the contradictory of a conditional is to interpret the contradictory as a conditional of this
form:

“If A then not-B”

This is a tempting interpretation of the contradictory, but it just doesn’t work. There are a
couple of ways of seeing why this is so. One way uses truth tables.
On the left is the truth table for the conditional. The conditional is true for all truth values of
A and B except when A is true and B is false.

The contradictory of the conditional is, by definition, a claim that is true whenever the
conditional is false, and vice versa. So the middle column has the opposite truth value of
the conditional, for the same values of A and B.

This, we know, must be the truth table for the contradictory of the conditional. The question
is, what operations on A and B will yield this truth table? That’s the question represented
by the question mark in between A and B.

We can see right away that a truth table formed by simply negating the consequent won’t
do. Here’s the truth table for the conditional where the only change is negating the
consequent.

As you can see I’ve switched the truth values in the B-column, and I’ve evaluated the truth
value of the conditional in the middle column according to the rule that the conditional as a
whole is true except when A is true and B is false.

You can see that the truth values for this new conditional don’t match up with the truth
values for the contradictory of the conditional. They match for the cases where A is true,
but not where A is false.
From this alone we can rule this out as a candidate for the contradictory. Whatever
functions as the contradictory of the conditional has to be more restrictive in its truth value,
so that it comes out false whenever A is false.

Now let’s look at the truth table for the conjunction with B negated.

As you can see, this gives us exactly what we need -- the truth tables match. What we’ve
just done confirms our rule, that the contradictory of a conditional is a conjunction with the
B-term negated.

Well, this is the end of Part 3 of this video course.

For the sake of having them all in one place, here are the formulas we introduced that give
the contradictories for the basic compound claims of propositional logic:

not-(not-A) = A

not-(A and B) = (not-A) or (not-B)

not-(A or B) = (not-A) and (not-B)

not-(If A then B) = A and (not-B)


Part 4: Ways of Saying “If A then B”
4.1 A if B

In this series of tutorials we’ll be looking at various different ways that we express
conditional relationships in language.

The basic syntax for the conditional is “if A then B”, but in ordinary language we have lots
of ways of expressing conditionals that don’t use this form.

We’ll start with the form “A if B”.

“If I pay for dinner then you’ll pay for drinks."


This is written in standard conditional form. The antecedent is “I pay for dinner”.
The consequent is “you’ll pay for drinks”.

The “if” is what flags the antecedent. In standard form, the antecedent comes immediately
after the “if”.

Now, I can write the same claim like this:

“You’ll pay for drinks if I pay for dinner.”

Here the consequent is now at the beginning of the sentence and the antecedent is at the
end. But the antecedent is still “I pay for dinner”. The “if” flags the antecedent just as it
does when the conditional is written in standard form.

Here’s the general translation rule:

B if A = If A then B

Now, I want to mention something here that might be a source of confusion. I’ve written it
as a “B if A” rather than “A if B”, so that the As and Bs correspond when compared with the
conditional in standard form. So the same symbols represent the antecedent and the
consequent in both versions.

But you shouldn’t expect the same letter to always represent the antecedent of the
conditional. The symbols are arbitrary. I could write the same rule in all these different
ways,

A if B = If B then A
Q if P = If P then Q
$ if @ = If @ then $

and it would still represent the same rule.

What matters is that in standard form, whatever follows the “if” is the antecedent. The trick
in interpreting different versions of the conditional is to identify the claim that is functioning
as the antecedent, so that you can then re-write the conditional in standard form.

This is actually a very useful skill when analyzing ordinary arguments. We’ll eventually
cover the valid and invalid argument forms that use the conditional, and these are always
expressed using the conditional in standard form, so in order to apply your knowledge of
valid and invalid argument forms you need to be able to translate conditionals into
standard form.

Let’s finish with a couple examples. The exercise is to write these conditionals in standard
form:

“David will be late if he misses the bus.”

and

“You’ll gain weight if you don’t exercise.”


And the answers look like this:

“David will be late if he misses the bus.” = “If David misses the bus then he’ll be late.”

“You’ll gain weight if you don’t exercise.” = “If you don’t exercise then you’ll gain weight.”

The rule is that you look for the “if”, and whatever follows the “if” is the antecedent of the
conditional.

This is the simplest alternative form for the conditional. As we’ll see, there are other forms,
and they can be trickier to translate.
4.2 A only if B

There’s a big difference between saying that A is true IF B is true, and A is true ONLY
IF B is true. Let’s look at this difference.

Here’s a conditional expressed using “only if”:

“The match is burning only if there’s oxygen in the room.”

We need to figure out which of these is the antecedent, “the match is burning” or “there’s
oxygen in the room”.

Given what we did in the last tutorial, it’s tempting to just look for the “if” and apply the rule,
whatever comes after the “if” is the antecedent, and conclude that “there’s oxygen in the
room” is the antecedent.

But that’s wrong. “There’s oxygen in the room” isn’t the antecedent.

If this was the antecedent, then the sentence would be saying that if there’s oxygen in the
room then the match will be burning. But if you’re saying that then you’re saying that the
presence of oxygen is enough to guarantee that the match is burning.

That’s not what’s being said. What’s being said is that the presence of oxygen in the room
is necessary for the match to be burning; it doesn’t say that it will be burning.

This sentence expresses a conditional, but the antecedent of the conditional is, in fact,
“the match is burning”.

The “only if” makes a dramatic difference. This sentence is equivalent to the following
conditional written in standard form:

“If the match is burning then there’s oxygen in the room.”

The “only if” actually reverses the direction of logical dependency. When you have “only
if”, the claim that precedes the “only if” is the antecedent, and what follows it is the consequent.

Here’s the “only if” rule:

“A only if B” = “If A then B”

The antecedent doesn’t come after the “if”, the consequent comes after the “if”.

Let’s take away the symbols and compare the “if” and “only if’ rules.

___ if ___

___ only if ___

When you’re given a conditional that uses “if” or “only if”, you look for the “if”, and if the “if”
is all by itself, then the antecedent is what immediately follows the “if”.
If the “if” is preceded by “only” then you do the opposite: what follows the “only if” is
the consequent, and what precedes the “only if” is the antecedent. Once you’ve got that, then
the rest is easy:

(consequent) if (antecedent)

(antecedent) only if (consequent)

From here you can easily write the conditional in standard form, “If A then B”.

Let’s look at some examples. Here are two sentences:

“Our team will kick off if the coin lands heads.”

“I’ll buy you a puppy only if you promise to take care of it.”

They both express conditionals. You need to write these in standard form, in the form “If A
then B”. That requires that you identify the antecedent and the consequent in each
sentence.

In the first sentence, “Our team will kick off if the coin lands heads”, the “if” appears by
itself, so we know that what immediately follows the “if” is the antecedent.

So we write the conditional in standard form as follows:

“If the coin lands heads then our team will kick off.”

For the second sentence we have an “only if”, so we know to do precisely the opposite of
what we did in the previous case. The antecedent is what precedes the “only if”. So, you
write the conditional in standard form like this:

“If I buy you a puppy then you promise to take care of it.”

This conditional doesn’t say that if you promise to take care of it I’ll buy you a puppy. It’s
saying that if I buy you a puppy, then you can be sure that you promised to take care of it,
because that was a necessary condition for buying the puppy. But merely promising to
take care of the puppy doesn’t guarantee that I’ll buy it.

It might be a bit more natural to write it like this: “If I bought you a puppy then you
promised to take care of it”.

Sometimes shifting the tenses around can be helpful in expressing conditionals like this in
a more natural way. For our purposes they mean the same thing.

Here’s the general rule once again:

“A only if B” = “If A then B”

In the next tutorial we’ll look at what happens when you combine the “if” and “only if”.
4.3 A if and only if B

You may have heard the expression “if and only if” in a math class or some other
context. In this tutorial we’ll look at what this means as a conditional relationship.

As the name suggests, “A if and only if B” is formed by conjoining two conditionals using
the “if” rule and the “only if” rule.

So it asserts two things, that A is true if B is true, AND that A is true ONLY IF B is true.

You can use the “if” rule and the “only if” rule to translate these into standard conditionals,
and when you do the expression looks like this:

“If B then A” and “if A then B”

This asserts that the conditional relationship runs both ways. Given A you’re allowed to
infer B, and, given B you’re allowed to infer A.

It’s not surprising, then, that this is also called a “biconditional”. You might encounter the
biconditional written in different ways, but they all mean the same thing, that A implies B
and B implies A.

Biconditionals show up a lot in formal logic and mathematics. They’re used to demonstrate
the logical equivalence of two different expressions. From a propositional logic
standpoint, the defining feature of a biconditional is that the claims, A and B, always
have the same truth value -- if A is true, then B is true, and vice versa.
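
In code, that defining feature is just an equality test on truth values. A short Python sketch:

from itertools import product

for A, B in product([True, False], repeat=2):
    both_ways = ((not A) or B) and ((not B) or A)   # (if A then B) and (if B then A)
    print(A, B, both_ways == (A == B))              # always True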

Here’s an example of a biconditional relationship whose truth is obvious.

Let A be the claim that the triangle ABC has two equal sides.

Let B be the claim that the triangle ABC has two equal angles.

It’s clear that if A is true then B is also true. The sides AB and AC are equal, and from the
diagram you can see this requires that the angles at B and C must also be equal.

And it’s also clear that the converse is true as well, that if a triangle has two equal angles
then it also has two equal sides. So “if B then A” is also true.
But if both of these conditionals are true, then we can say that A is true if and only if B is
true, and vice versa.

One of the helpful things about learning about the biconditional as a concept is that it helps
us to remember that ordinary conditionals are only half a biconditional, they only go one
way. If A implies B it doesn’t follow that you can go backwards and say that B implies A. It
reminds us that you need to argue or demonstrate that you can run the inference in the
other direction as well.
4.4 A unless B

There are a lot of ways of saying “if A then B”. We can even say it without using the words
“if” or “then”.

Here’s an example:

“Jennifer won’t go to the party unless Lauren goes too.”

The word “unless” is acting like a conditional operator in this sentence.

If you were asked to rewrite this as a standard conditional, you would probably translate
this as

“If Lauren doesn’t go to the party then Jennifer won’t go to the party.”

This is exactly right.

We’ve done two things here. First, we recognize that the antecedent of the conditional is
what comes immediately after the “unless”.

Second, we recognize that we need to take the contradictory of this claim to get the
meaning of the conditional right.

Here it is in a way that highlights these two moves.

Look for the “unless”, take what immediately follows, negate it, and make that the
antecedent of the conditional.

This gives us the form of the general rule:

B unless A = If not-A then B


If you say that B is true unless A is true, then you’re saying that if A is false then B is true.

Once again, don’t be too fixated on which letters we’re using to represent the antecedent
and the consequent.
For me, I like simple translation rules that are easy to remember, so I usually say to
myself, read “unless” as “if not-”. This is probably the easiest way to remember this
rule.

Here’s a final example that puts a small spin on things. The claim is

“Unless you pay us one million dollars, you’ll never see your pet goldfish again.”

Here the “unless” is at the beginning of the sentence rather than in the middle, but the rule
still applies. You should read “unless” as “if not-”.

So the translation is

“If you don’t pay us one million dollars, then you’ll never see your pet goldfish
again”.

It doesn’t matter where the “unless” shows up in the sentence, the translation rule still
applies.
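
Before moving on, here’s a quick mechanical check of the “unless” rule in Python (a sketch of mine, again reading the material “if-then” as (not A) or B):

    from itertools import product

    def if_then(p, q):
        return (not p) or q  # material conditional: "if p then q"

    for A, B in product([True, False], repeat=2):
        unless = A or B                      # "B unless A": they can't both be false
        assert unless == if_then(not A, B)   # "If not-A then B"

The two expressions agree in every possible world, which is just what the translation rule claims.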
4.5 The contrapositive (If not-B then not-A)

The contrapositive is a very important translation rule. It’s mainly used for simplifying conditionals that involve negations, and it’s used extensively in LSAT logic games.

The contrapositive is easy to illustrate.

“If I live in Paris then I live in France.”

This is a conditional claim, it happens to be true (assuming we’re talking about Paris, the
city with the Eiffel Tower, and not some other city with the same name).

Now, if this is true, then this is also true:

“If I don’t live in France then I don’t live in Paris”.

This is the contrapositive of the conditional above.

The contrapositive is a conditional formed by switching the antecedent and the consequent
and negating them.

Here’s the general rule:

“If A then B” = “If not-B then not-A”.

Here are a few more examples:

Conditional: “If we win the game then we’ll win the championship.”

Contrapositive: “If we didn’t win the championship then we didn’t win the game.”

If the first conditional is true, then if we didn’t win the championship, we can be sure that
we didn’t win the game.

The rule applies to all conditionals, even false or nonsensical conditionals like this one:

Conditional: “If George Bush is a robot then Jackie Chan is Vice-President.”

This is absurd, of course, but if this conditional were true -- if George Bush’s being a robot actually entailed that Jackie Chan was the Vice President -- then the contrapositive would
also be true:

Contrapositive: "If Jackie Chan is not Vice President, then George Bush is not a robot."

For our last example let’s mix it up.

Conditional: “You won’t become a good player if you don’t practice.”

It’s written with the “if” in the middle, so to write the contrapositive you have to make sure
you’ve got the antecedent and the consequent right.
Well, the “if” rule says that whatever follows the “if” is the antecedent, so we know the antecedent is “You don’t practice”, and the consequent is “you won’t become a good player”.

Now, to write the contrapositive, you switch the antecedent and the consequent and
negate both parts.

The consequent of the original is “you won’t become a good player”. Negating this you get
“You will become a good player”. This becomes the antecedent of the contrapositive.

Contrapositive: “If you become a good player then you practiced.”

Sometimes you may want to shift tenses a bit to make a claim sound more natural. In this
case I’ve written it as “you become a good player” rather than “you will become a good
player”, but it doesn’t make much difference. I could have also written it as “you became a
good player”.

Once you’ve got the antecedent of the contrapositive it’s easy to write the consequent -- “if
you become a good player then you must have practiced.”

The challenge with problems like these is to not get turned around and mistake an
antecedent for a consequent. In this case, the most common error would be to interpret
the contrapositive as “If you practice then you’ll become a good player”. This is very
tempting, but it’s not entailed by the original claim.

The original claim doesn’t say that if you practice you’re guaranteed to become a good
player. All it says is that if you don’t practice then you’re certainly not going to become a good
player. So, what we can infer is that if you end up becoming a good player, then we can be
sure of one thing, that you practiced.

The value of these rules is that they keep you from assuming that you know more than you
do, based on the information given.
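
Since the contrapositive gets used so often, here’s a tiny Python check (my sketch, material reading of the conditional) that a conditional and its contrapositive never come apart:

    from itertools import product

    def if_then(p, q):
        return (not p) or q  # material conditional

    for A, B in product([True, False], repeat=2):
        assert if_then(A, B) == if_then(not B, not A)  # contrapositive

No assignment of truth values can make one true and the other false, so the two forms are interchangeable.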
4.6 (not-A) or B

Here’s another translation rule for conditionals that doesn’t use “if” or “then”. This one is
useful for interpreting disjunctions as conditionals, or rewriting conditionals in the form of a
disjunction.

It turns out you can write any conditional as a disjunction, a claim of the form “A or
B”.

Consider this conditional: “If I live in Paris, then I live in France.”

Here are three candidates for the way the disjunction might be phrased:

A or B

A or (not-B)

(not-A) or B

From the title of this tutorial you already know the answer to this question, but for the sake
of demonstrating why this answer is correct let’s work through these.

A or B = I live in Paris or I live in France.

If the original conditional is true, is this disjunction true? No, this disjunction doesn’t have
to be true. The original conditional is consistent with me living in New York, say. There’s
no reason why I have to live in Paris or France. So this won’t work. Let’s try the next one.

A or (not-B) = I live in Paris or I don’t live in France.

If the original conditional is true, does it follow that either I live in Paris or I don’t live in
France? That would be an odd inference, wouldn’t it? This entails that Paris is the only city
in France that I’m allowed to live in. This doesn’t follow, so strike that one out.

(not-A) or B = I don’t live in Paris or I live in France.

Now, if the original conditional is true, does it imply that either I don’t live in Paris or I live in
France?

It does. Let’s see why.

Recall that what the “OR” means is that one or the other of these must be true, they can’t
both be false. It’s easy to see that they can’t both be false. If they were both false, I’d be
saying that I live in Paris but I don’t live in France. That’s impossible, since Paris is in
France. So both of these disjuncts can’t be false at the same time; one of them must be
true.

Now let’s assume that the first disjunct is false, so I do live in Paris. Does it then follow that I live in France? Yes it does -- Paris is in France.
Now assume that the second disjunct is false, so I don’t live in France. Does it then follow that I don’t live in Paris? Yes it does, for the same reason. This is indeed the correct translation.

If you’re not convinced, you can show that these are logically equivalent with a truth table. List all the possible truth values of A and B, then work out the truth values of the disjunction “(not-A) or B” and the conditional “If A then B” in each row:

A | B | (not-A) or B | If A then B
T | T | T | T
T | F | F | F
F | T | T | T
F | F | T | T

You can see that the truth values for the disjunction and the conditional match up in every row.

Some people find these kinds of explanations helpful, and some don’t. Either way, the
general rule is easy to remember:

If A then B = (not-A) or B

It’s helpful to have the brackets around “not-A” so that you don’t confuse this expression
with the contradictory of a disjunction, “not-(A or B)”. Brackets in logic function like they do
in math. They clarify the order of operations when it might otherwise be unclear.
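
Here’s the same check done mechanically in Python (my own sketch). To avoid circularity, the conditional’s truth table is written out explicitly rather than defined as a disjunction:

    from itertools import product

    IF_THEN = {(True, True): True, (True, False): False,
               (False, True): True, (False, False): True}  # truth table for "If A then B"

    for A, B in product([True, False], repeat=2):
        assert IF_THEN[(A, B)] == ((not A) or B)  # matches "(not-A) or B" in every row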

Let’s look at a few examples.

Conditional: If we win the game then we’ll win the championship.

Disjunction: We won’t win the game or we’ll win the championship.

I admit that these translations don’t always sound very natural, but if you think about the
semantics of disjunctions and work through the reasoning you’ll see that they get the logic
right.

And sometimes you can express them in a more natural way, like this:

“Either we lost the game or we won the championship.”

Regardless, you won’t go wrong if you trust the translation rule.

Here’s one more example:


Conditional: "If there’s no gas in the car then the car won’t run."

Disjunction: "There’s gas in the car or the car won’t run."

This one sounds pretty natural as it is.

This translation rule can be handy for working through certain kinds of LSAT logic
problems where you have to represent conditional rules on a diagram. Sometimes it’s
easier to do this when the rule is expressed as an “either __ or ___” proposition.
4.7 Necessary and sufficient

The last set of terms we’ll look at for expressing conditional relationships involve the
concepts of “necessity” and “sufficiency”.

“If I become rich, then I’ll be happy.”

Here’s a question: When I say this, am I saying that becoming rich is necessary for me to
be happy?

If I say that becoming rich is necessary for my happiness, then I’m saying that there’s no
way for me to be happy unless I’m rich.

That doesn’t seem right.

What does seem right is to say that my becoming rich is “sufficient” for my being happy.

“Sufficient” means that it’s enough to guarantee that I’ll be happy. But it doesn’t imply that becoming rich is the only way that I can be happy. My becoming rich is sufficient for my happiness, but it’s not necessary for it.

In terms of the antecedent and the consequent of the original conditional, we can say that

“If A then B” = “A is sufficient for B”.

Or in other words, the antecedent is sufficient to establish the consequent.

This is the first general rule. A conditional can always be translated into a sentence stating
that the truth of the antecedent is sufficient to ensure the truth of the consequent.

So how do we interpret the language of “necessity”?

Well, let’s go back to our original claim.

“If I become rich, then I’ll be happy.”

We can’t say that A is necessary for B. But we can say that B is necessary for A.

In other words, we can’t say that my being rich is necessary for my happiness. But we can
say that my happiness is a necessary consequence of my being rich. In other words, if I
end up rich then I’m necessarily happy.

So, relationships of necessity and relationships of sufficiency are converses of one another. If A is sufficient for B then B is necessary for A.

It’s easier to see if you have the rules side by side:

A is sufficient for B = If A then B

A is necessary for B = If B then A


When you see a claim of the form “A is sufficient for B”, then you read that as saying that
if A is true then B is guaranteed; the truth of A is sufficient for the truth of B.

When you see a claim of the form “A is necessary for B”, then you should
imagine flipping the conditional around, because the B term is now playing the role of the
antecedent.

Another way of stating the rule for “necessary” is to express it in terms of the
contrapositive, “If not-A then not-B”. The only way to make this clear is to look at
examples.

“Oxygen is necessary for combustion.”

This doesn’t mean that if there’s oxygen in the room then something is going to combust.
Matches don’t spontaneously burst into flame just because there’s oxygen in the room.

What this statement says is that if there’s combustion going on then you know that oxygen
must be present. And you would write that like this:

“If there’s combustion then there’s oxygen.”

Or, you could write it in the contrapositive form,

“If there’s no oxygen then there’s no combustion”.

Either way will do.

Let’s do an example working the other way. We’re given the conditional,

“If I have a driver’s license then I passed a driver’s test.”

How do we write this in terms of necessary and sufficient conditions?

How about this?

“Having a driver’s license is necessary for passing a driver’s test.”

Does this work?

No, it doesn’t. It would be very odd to say this, since it implies that you already have to
have a driver’s license in order to pass a driver’s test!

We need to switch these around:

“Passing a driver’s test is necessary for having a driver’s license.”

Using the language of sufficiency, you’ll reverse these:

“Having a driver’s license is sufficient for passing a driver’s test.”


This is a little awkward, but the logic is right. If you know that someone has a driver’s
license, that’s sufficient to guarantee that at some point they passed a driver’s test.

Finally, I want to draw attention to the parallels between the language of necessary and
sufficient conditions and the language of “if and only if”. These function in exactly the
same way.

A is necessary for B = A if B = If B then A

A is sufficient for B = A only if B = If A then B

Both emphasize that a conditional relationship only goes one way, and that if you can establish that both are true then you’ve established a biconditional relationship:

A is necessary and sufficient for B = A if and only if B = (If B then A) and (If A then B)
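
To tie the whole section together, here’s one last Python sketch (mine, again using the material reading of “if-then”) encoding sufficiency, necessity, and their combination:

    from itertools import product

    def if_then(p, q):
        return (not p) or q  # material conditional

    for A, B in product([True, False], repeat=2):
        sufficient = if_then(A, B)  # "A is sufficient for B"
        necessary = if_then(B, A)   # "A is necessary for B"
        # "A is necessary and sufficient for B" is the biconditional:
        assert (sufficient and necessary) == (A == B)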

That’s it for this section on the different ways we express conditionals in ordinary
language.

I know from experience that mastering the rules that we’ve been discussing in these last few tutorials really does make you more aware of logical relationships, and by itself this will help you to detect errors in reasoning and to be clear and precise in your own reasoning.
Appendix: Categorical Claims and Their Contradictories
Categorical vs propositional logic

In any standard logic textbook you’ll see separate chapters on both propositional logic and categorical logic. Sometimes categorical logic is called “Aristotelian” logic, since the key concepts in this branch of logic were first developed by the Greek philosopher Aristotle.

I’m not planning on doing a whole course on categorical logic at this stage, but there are a
few concepts from this tradition that are important to have under your belt when doing very
basic argument analysis, so in this next series of tutorials I’m going to introduce some of
these basic concepts.

In this introduction I’m going to say a few words about the basic difference between categorical logic and propositional logic.

In the course on “Basic Concepts in Logic and Argumentation” we saw a lot of arguments
and argument forms that are basically categorical arguments and that use the formalism of
categorical logic.

Here’s a classic example.

1. All humans are mortal.


2. Simon is human.
Therefore, Simon is mortal.

This argument is valid. When you extract the form of this argument it looks like this:

1. All H are M
2. x is an H
Therefore, x is an M

The letters are arbitrary, but it’s usually a good idea to pick them so they can help us
remember what they represent.

Now, the thing I want to direct your attention to is how different this symbolization is from
the symbolization in propositional logic.

When we use the expression “All H are M”, the “H” and the “M” DO NOT represent
PROPOSITIONS, they don’t represent complete CLAIMS. In propositional logic each letter
symbolizes a complete proposition, a bit of language that can be true or false. Here, the H
and the M aren’t propositions.

So what are they?

They’re categories, or classes. H stands for the category of human beings, M stands for the category of all things that are mortal, that don’t live forever.

These categories or classes are like buckets that contain all the things that satisfy the
description of the category.
What they don’t represent is a complete claim that can be true or false. This is a
fundamental difference in how you interpret symbolizations in categorical logic compared
to how you interpret them in propositional logic.

In categorical logic, you get a complete claim by stating that there is a particular relationship between different categories of things.

In this case, when we say that all humans are mortal, you can visualize the relationship
between the categories like this:

We’re saying that the category of mortals CONTAINS the category of humans. Humans
are a SUBSET of the category of things that die. The category of mortals is larger than the
category of humans because lots of other things can die besides human beings. This
category includes all living things on earth.

Now, when you assert this relationship between these two categories, you have a
complete proposition, a claim that makes an assertion that can be true or false.

This is the fundamental difference between symbolizations in propositional logic and categorical logic.

In propositional logic you use a single letter to represent a complete proposition.

In categorical logic the analysis is more fine-grained. You’re looking INSIDE a proposition
and symbolizing the categories that represent the subject and predicate terms in the
proposition, and you construct a proposition by showing how these categories relate to
one another.

Now, what does that small “x” represent, in “x is an H”?

It represents an INDIVIDUAL human being, Simon.


In categorical logic you use capital letters to represent categories or classes of things, and you use lower-case letters to represent individual members of any particular category.

On a diagram you’d normally use a little x to represent Simon.

So, putting the x for Simon inside the category of humans is a way of representing the
whole proposition, “Simon is human”.

Notice, also, that from this diagram you can see at a glance why the argument is VALID.
This diagram represents the first two premises of the argument. When judging validity you
ask yourself, if these premises are true, could the conclusion possibly be false?

And you can see that it can’t be false. If x is inside the category of humans, then it HAS to be inside the category of mortals, since humans are a subset of mortals.

In a full course in categorical logic you would learn a whole set of diagramming techniques
for representing and evaluating categorical arguments, but that’s not something we’re
going to get into here.

What we’re going to talk about is what sorts of claims lend themselves to categorical
analysis. Claims with the following forms:

All A are B
All A are not-B
Some A are B
Some A are not-B
No A are B
No A are not-B

Claims like “All humans are mortal”, “Some men have brown hair”, “No US President has
been female”, “All mammals do not have gills” and so on.

Aristotle worked out a general scheme for analyzing arguments that use premises of this
form. In the video course on common valid and invalid argument forms we’ll look at a few
of the most common valid and invalid categorical argument forms.
In the remaining tutorials in this section all I really want to do is look at the semantics of
categorical claims, what they actually assert, and how to write the contradictory of these
categorical claims.

This material is important for knowing how to reason about generalizations, and it’s a
translation skill that you need to know how to do in order to work out certain LSAT
problems.
All A are B

This is the classic universal generalization, “All A are B”.

Here some examples of claims with this form:

“All humans are mortal.”

“All whales are mammals.”

“All lawyers are decent people.”

Two things to note about these sorts of generalizations:

First, the “all” is strict; when you read “All”, it really means “ALL”, no exceptions.

Second, we often don’t use the “all” to express a universal generalization.

“Humans are mortal” means “humans in general are mortal”, it’s implied that you’re
talking about all humans.

Similarly, “whales are mammals” means “all whales are mammals”, and “lawyers are
decent people” means “all lawyers are decent people.”

Now, in the real world, people sometimes aren’t careful and will make a generalization that
they will acknowledge has exceptions. They might say “All politicians are crooks”, but then
they might admit that one or two are pretty decent. They’re not lying, they’re just not being
precise, or maybe they’re exaggerating for the sake of dramatic effect. What they really
mean is “Most” or “Almost all” politicians are crooks.

In logic it matters a great deal whether you mean “all” or “almost all”, so just be aware of
the strictness of your language; if you don’t really mean “all”, then don’t say “all”.

The contradictory of a universal generalization is pretty straightforward; there’s just one thing to be on the lookout for.

If I say “All humans are mortal”, it’s tempting to think that the contradictory might be “No
humans are mortal”.

But this is wrong. This isn’t the contradictory.

Remember that the contradictory has to have the opposite truth value in all possible
worlds; if one is true then the other must be false, and vice versa.

But imagine a world in which half the people are mortal and the other half are immortal. In
this world, BOTH of these statements would be FALSE. This shouldn’t happen if they’re
genuine contradictories.

No, these are CONTRARIES. They can’t both be true at the same time, but they can both
be false.
The contradictory of “All humans are mortal” is

“SOME humans are NOT mortal”.

or, “Some humans are immortal.”

These two claims always have opposite truth values. If one is true the other has to be
false.

Here’s the general form:

not-(All A are B) = Some A are not-B
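
Python’s built-in all() and any() functions make this rule concrete. Here’s a sketch (mine, with a made-up list recording whether each dog barks):

    dogs_bark = [True, True, False, True]  # hypothetical: does each dog bark?

    # not-(All dogs bark) is equivalent to: Some dogs do not bark
    assert (not all(dogs_bark)) == any(not barks for barks in dogs_bark)

The equivalence holds no matter what values you put in the list.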

And here are some examples:

Claim: “All dogs bark.”

Contradictory: “Some dogs don’t bark”.

Claim: “Canadians are funny.”

Contradictory: “Some Canadians are not funny.”

Note that here you need to remember that “Canadians are funny” makes a claim about all Canadians; logically you need to read it as “All Canadians are funny”.

Claim: "All Michael Moore films are not good.”

The translation rule works just the same, but you need to use double-negation on the B-
part. So the contradictory looks like

Contradictory: “Some Michael Moore films are good”.

You see how this works. This is how you write the contradictory of a universal
generalization.
Some A are B

Let’s look at this expression, “Some A are B”.

“Some dogs have long hair.”

“Some people weigh over 200 pounds.”

“Some animals make good pets.”

It might seem that “some” is so vague that it’s hard to know exactly what it means, but in
logic “some” actually has a very precise meaning. In logic, “some” means “at least
one”.

“Some” is vague in one sense, but it sets a precise lower bound. If “some dogs have long
hair”, then you can be certain that at least one dog has long hair.

So, “some people weigh over 200 pounds” means “at least one person weighs over 200
lbs”.

“Some animals make good pets” means “at least one animal makes a good pet”.

There are a couple of equivalent ways of saying this. If you want to say “some dogs have
long hair”, then you could say

“At least one dog has long hair”, or

“There is a dog that has long hair”, or

“There exists a long-haired dog”.

These are all different ways of saying “at least one”.

Here’s something to be aware of. The standard reading of “At least one A is B” is
consistent with it being true that “All A are B”.

So if I say “Some dogs have long hair”, this doesn’t rule out the possibility that all dogs in
fact have long hair.

But sometimes, “some” is intended to rule out this possibility. Sometimes we want it to
mean “at least one, but not all”. Like if I say, “some people will win money in next
month’s lottery”, I mean “at least one person will win, but not everyone will win”.
Which reading is correct -- whether it means “at least one” or “at least one but not all” --
will depend on the specific context.

Now, let’s look at the contradictory of “Some A are B”.

“Some dogs have long hair”.

If this is false, what does this imply? Does it imply that ALL dogs have long hair?
No. At most, this would be a contrary, if we were reading “Some” as “At least one, but not
all”.

No, the contradictory of “Some dogs have long hair” is

“No dogs have long hair”.

If no dogs have long hair then it’s always false that at least one dog has long hair, and vice
versa.

So, the general form of the contradictory looks like this:

not-(Some A are B) = No A are B
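
Here’s the same kind of all()/any() check for this rule (again my own sketch, run over a few hypothetical scenarios):

    for sightings in ([True, False], [False, False], [True, True]):
        # not-(Some A are B) is equivalent to: No A are B
        assert (not any(sightings)) == all(not s for s in sightings)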

Examples:

Claim: “Some movie stars are rich.”


Contradictory: “No movie stars are rich.”
Claim: “There is a bird in that tree.”

Note that this is equivalent to saying that there is at least one bird in that tree, or “Some
bird is in that tree”. So the contradictory is

Contradictory: “No bird is in that tree.”

Another example:

Claim: “Some dogs don’t have long hair.”


Contradictory: “No dogs don’t have long hair.”

This is a bit awkward. The easiest way to say this is “All dogs have long hair”. Here
we’re just applying the rule for writing the contradictory of a universal generalization:

not-(All A are B) = Some A are not-B

which is the form the original claim has.

This is about all you need to know about the logic of “Some A are B”.
Only A are B

Let’s take a look at “ONLY A are B”.

“Only dogs make good pets.”

“Only Great White sharks are dangerous.”

“Only postal employees deliver U.S. mail.”


You won’t be surprised to learn that the logic of “Only A are B” parallels the use of “only if”
in the logic of conditionals. There, “A if B” is equivalent to “B only if A”, where the
antecedent is switched between “if” and “only if”.

Here, the switch is between “Only” and “All”.

“Only dogs make good pets” means the same thing as “All good pets are dogs”.

“Only Great White sharks are dangerous” means “All dangerous sharks are Great
Whites”.

“Only postal employees deliver U.S. mail” means “All people who deliver U.S. mail
are postal employees”.

The general translation rule is this: “Only A are B” can re-written as “All B are A”.

Note how similar this is to the translation rule for conditionals:

“A only if B” is equivalent to “B if A”.

The difference, of course, is that the As and Bs refer to very different things. In categorical
logic the As and Bs refer to categories or classes of things. In propositional logic the As
and Bs refer to whole claims, in this case either the antecedent or the consequent of a
conditional claim.

It’s true, though, that the logical dependency relationships are very similar -- in both cases when you do the translation you switch the As and the Bs -- and recognizing the analogy can be helpful when you’re doing argument analysis. We’ll come back to this in
the course on “common valid and invalid argument forms”.

Now, let’s look at the contradictory of “Only A are B”.


Someone says that “only dogs make good pets”. You say “no”, that’s not true. What’s
the contradictory?

The contradictory would be to say that “Some good pets are not dogs”.

How do we know this? How do we know that, for example, it’s not “Some dogs are not
good pets”?

Well, you can figure it out just by thinking about the semantics and knowing what a
contradictory is, but there’s a formal shortcut we can use to check the answer.

We can exploit the fact that we know that “Only dogs make good pets” is equivalent
to “All good pets are dogs”.

Now, we have the original claim in the form “All A are B”, and we already know how to write the contradictory of a universal generalization: it’s “Some A are not-B”. And this gives us the form of the contradictory, “Some good pets are not dogs”.

This is typical with these sorts of rules. Once you know some basic translation rules and
some rules for writing contradictories, you can often rewrite an unfamiliar claim in a form
that is more familiar and that allows you to apply the rules that you do know.

The general form of the contradictory looks like this:

not-(Only A are B) = Some B are not A
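
In set terms, “Only A are B” says that the B’s form a subset of the A’s, and its contradictory says that B has a member outside A. A Python sketch (mine, with made-up sets):

    dogs = {"rex", "fido"}
    good_pets = {"rex", "whiskers"}  # hypothetical: a cat is a good pet too

    only_dogs_are_good_pets = good_pets <= dogs         # "All good pets are dogs"
    some_good_pets_arent_dogs = bool(good_pets - dogs)  # the contradictory
    assert only_dogs_are_good_pets != some_good_pets_arent_dogs

Exactly one of the two claims comes out true, whatever the sets contain, which is just what being contradictories requires.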

So, if our original claim is “Only movie stars are rich”, the contradictory is “Some rich
people are not movie stars”. It’s like the rule for “ALL”, but you need to reverse the As
and the Bs.

Here’s another one: “Only Starbucks makes good coffee”.


The contradictory is “Some good coffee is not made by Starbucks”.

You replace “only” with “some”, switch the As and Bs, but make sure you take the negation
of the predicate class.

This is all you need to know about the logic of “Only A are B”.
Square of Opposition

Here’s a handy diagram that might help some of you memorize the contradictories of the
different categorical forms.

This diagram is sometimes called the “Square of Opposition”.

It’s not a complete version of the Square of Opposition that shows up in most textbooks.
I’ve left off some relationships that appear in the complete diagram, since we haven’t
talked about them. But it captures at a glance the contradictories of the categorical claims
that use All, Some and No.

Here is the layout in text form:

All A are B ---- (contraries) ---- No A are B
Some A are B -- (subcontraries) -- Some A are not-B

The contradictories are on the diagonals: “All A are B” pairs with “Some A are not-B”, and “No A are B” pairs with “Some A are B”. At the top you have contraries. “All A are B” and “No A are B” can’t both be true, but they can both be false.

At the bottom you have what are called “subcontraries”. Can you guess what this is?

Well, as the name suggests, it’s related to the contrary relationship, but reversed: the two claims can both be true at the same time, but they can’t both be false. That’s the opposite of a contrary, so it’s called a “subcontrary”.

Anyway, I thought I’d put this up in case anyone finds it useful.

Common Valid and Invalid Argument Forms


Part 1: Argument Forms Using Disjunctions (A or B)
1.1 Valid forms using OR
Disjunctions are compound claims of the form “A or B”. We looked at the logic of
disjunctive claims in the propositional logic course.

Here we’re going to look at the valid argument forms that use the disjunction as a major
premise.

1. Either you’re with me or you’re against me.


2. You’re not with me.
So, you must be against me.

This is a valid argument that uses a disjunction as a major premise.

The basic valid argument form looks like this:

1. A or B
2. not-A
Therefore, B

A and B are the disjuncts. If you can show that one of the disjuncts is false, then the
remaining disjunct MUST be true.

Note that here we’re negating “A” and inferring “B”, but this is arbitrary, you can just as
easily negate “B” and infer “A”.

In logic texts this argument form is sometimes called “disjunctive argument” or “disjunctive
syllogism”.

Now, what about this one?

1. A or B or C or D
2. not-A
Therefore, … ?

You’re given a disjunction with four alternatives, four disjuncts. I say that the first one, “A”,
is false. What can I validly infer?

Well, you can’t infer that any specific one of the remaining alternatives is true. All that you
can infer is that they can’t all be false.

So the inference looks like this:

1. A or B or C or D
2. not-A
Therefore, B or C or D

Eliminating one possibility just leaves the rest. Either B or C or D must be true.

You can only arrive at a single conclusion if you can eliminate all of the remaining
alternatives. This is how that argument would look.

1. A or B or C or D
2. not-A
3. not-B
4. not-C
Therefore, D

This is the basic logic behind any reasoning based on a process of elimination.

We use this kind of reasoning every day, but you see it most prominently in areas like medical diagnosis, forensic research, detective work, or scientific reasoning
generally, where we’ve got a range of possible hypotheses that might explain some piece
of data, and you’ve got to find additional clues or information that will narrow down the list
of possibilities.
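
Here’s what that process-of-elimination reasoning looks like as a toy Python sketch (my own example, with hypothetical diagnoses):

    possibilities = {"flu", "cold", "allergies", "strep"}  # the disjuncts

    # Each new piece of evidence negates one disjunct:
    for ruled_out in ["flu", "cold", "allergies"]:
        possibilities.discard(ruled_out)

    # Only when a single disjunct remains can we draw a definite conclusion:
    if len(possibilities) == 1:
        print("Therefore,", possibilities.pop())  # -> Therefore, strep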

Just to close, here’s the basic valid argument form again.

1. A or B
2. not-A
Therefore, B

As I said at the top, in logic texts this argument form is often called “disjunctive argument”
or “disjunctive syllogism”.

The term “syllogism”, by the way, comes from Aristotle’s writings on logic. It’s normally
used to describe three-line categorical arguments, like “All humans are mortal, Socrates is
human, therefore Socrates is mortal”, but sometimes it’s used a bit more broadly, like in
this case, to refer to any simple valid argument form that has two premises and a
conclusion.

Now, if you’ve looked at the tutorials on propositional logic then you might be wondering
about how the “inclusive OR”-“exclusive OR” distinction factors in here. It is relevant, but
we’ll look at that in the next tutorial on invalid argument forms that use the “OR”.

1.2 Invalid forms using OR

Let’s talk about invalid argument forms that use “OR”.

Let’s look at this example again:

1. Either you’re with me or you’re against me.


2. You’re not with me.
So, you must be against me.

This is valid, because the disjunction states that both of these disjuncts can’t be false, at
least one of them must be true, so if you can eliminate one then the remainder has to be
true.

But what if I said something like this?

1. College teachers have to have either a Master’s degree or a Ph.D.


2. Professor Smith has a Master’s degree.
Therefore, he doesn’t have a Ph.D.
Is THIS a valid argument?

It doesn’t seem so. After all, why can’t it be the case that Professor Smith has BOTH a Master’s AND a PhD? Generally this is the case: if you have a PhD then you also have a Master’s degree, since having a Master’s degree is usually a prerequisite for attaining the PhD.

But if so, then this inference is clearly INVALID.

The general form of this invalid inference looks like this:

1. A or B
2. A
Therefore, not-B

In this form you’re affirming that one of the disjuncts is true, and on the basis of this,
inferring that the remaining disjunct must be false.

In general, this is not a valid inference when it’s logically possible for the two disjuncts to
be true at the same time.

In other words, it’s invalid when the “OR” is an “INCLUSIVE” OR.

An inclusive OR is one that asserts that “A is true, or B is true, OR BOTH may be


true.” The only case that it rules out is the case where both are FALSE.

Now, as you might expect, the case is different if the OR is “exclusive”. Here’s a clear example of an exclusive OR:

1. The coin landed heads or tails.


2. The coin landed heads.
Therefore, the coin did not land tails.

Here you’re doing the same thing, you’re affirming one of the disjuncts and inferring that
the remaining disjunct must be false.

But in this case the inference is VALID, since the OR is an “exclusive or” -- it excludes
the case where both of the disjuncts can be true.

So, this argument form

1. A or B
2. A
Therefore, not-B

is VALID when the OR is an “exclusive OR”.

Let’s put the two OR forms together:

Disjunctive syllogism (always valid):

1. A or B
2. not-A
Therefore, B

Affirming a disjunct (invalid if the OR is inclusive, valid if the OR is exclusive):

1. A or B
2. A
Therefore, not-B
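
In code the distinction is explicit: Python’s “or” is inclusive, while “!=” applied to two booleans behaves like an exclusive OR. Here’s a sketch (my own illustration):

    from itertools import product

    # Exclusive OR: from "A xor B" and "A", the inference to "not-B" is safe.
    for A, B in product([True, False], repeat=2):
        if (A != B) and A:  # both premises true in this world
            assert not B    # ...and the conclusion holds

    # Inclusive OR: here is the counterexample world for the same inference.
    A, B = True, True
    assert (A or B) and A   # both premises are true here
    assert B                # but B is true, so the conclusion "not-B" is false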


Part 2: Argument Forms Using Conditionals
2.1 Modus ponens

Conditional arguments and argument forms are central to logic and central to critical
thinking more broadly. You absolutely must know the basic valid and invalid argument
forms that use the conditional.

So here goes. We’ll start with the most basic valid form, known as “modus ponens”.

This is modus ponens:

1. If A then B
2. A
Therefore, B

If you’ve watched the tutorials in the basic concepts course then you’ve seen a number of examples of this argument form already.

The term “modus ponens” is another hold-over from the days when logic was taught in
universities during the middle ages and the language of instruction was Latin, so we have
all these Latin names for argument forms and fallacies that people still use.

The full Latin name is “modus ponendo ponens”, which means something like “the mode of affirming by affirming”. It refers to the fact that with this conditional form we’re affirming the consequent by affirming the antecedent. But everyone today just calls it “modus ponens”.

Here’s an example:

1. If your king is in checkmate then you’ve lost the game.


2. Your king is in checkmate.
Therefore, you’ve lost the game.

The conditional premise asserts that if the antecedent is true then the consequent is true. The second premise affirms that the antecedent is in fact true, and then you validly infer that the consequent must also be true.

Now, if you’ve watched the tutorials in the propositional logic course then you know that
there are lots of different ways of writing conditionals. As long as the conditional premise is
equivalent in meaning to the one you see here, the argument will be an instance of modus
ponens.

For example, I might write the same conditional as

1. You’ve lost the game UNLESS your king is NOT in checkmate.


2. Your king is in checkmate.
Therefore, you’ve lost the game.

As we showed in the propositional logic course, this is an equivalent way of saying “If your
king is in checkmate, then you’ve lost the game”. If you’re not sure why this is so and
you’re curious then you might want to check out that course.
But the point I want to make here is that this argument has the same logical form as the
previous version, it’s an instance of modus ponens, even though it doesn’t use the “if A
then B” syntax to express the conditional. What matters is that the claim is logically
equivalent to a conditional of the form “If A then B”.

So, when we say that an argument has the form of modus ponens, we’re not saying that
it’s necessarily written in the form “if A then B, A, therefore, B”, we’re saying that
it’s logically equivalent to an argument of the form “if A then B, A, therefore, B”.

This is why knowing those translation rules for conditionals is important. They can help you
see past the superficial grammar and into the underlying logical structure of a conditional
argument.

I’d like to finish with a point that is important for all of the conditional argument forms we’ll
be looking at.

1. If (not-P) then (not-Q)


2. not-P
Therefore, not-Q

You see the conditional form I’ve written above. I’ve replaced the antecedent and the
consequent with negations throughout the argument.

The point I want to make is that this argument is still an instance of modus ponens, even
with those negation signs.

It’s equivalent to the standard form with the obvious substitutions,

1. If A then B
2. A
Therefore, B

where A = not-P and B = not-Q. The antecedent of the argument isn’t P, it’s not-P, and the
consequent isn’t Q, it’s not-Q.

In fact, you can have long, complex compound claims playing the role of the antecedent
and the consequent, and as long as they’re related in the right way, they’ll still be
instances of modus ponens.

Here’s an example:

1. If I get an A in Spanish and don’t fail French, then I’ll graduate this year.
2. I got an A in Spanish and didn’t fail French.
Therefore, I’ll graduate this year.

This is just modus ponens. The antecedent in this case has some structure to it, it’s a compound claim, a conjunction, “I got an A in Spanish and didn’t fail French”. It’s
important when analyzing conditional arguments to understand that conditional claims can
have complex parts to them and yet still be equivalent to a simple conditional of the form
“If A then B”.
2.2 Modus tollens

Modus ponens is the direct way of reasoning with conditionals. There’s an indirect way
that is also commonly used, and it’s called “modus tollens”.

This is modus tollens, and it’s a valid argument form.

1. If A then B
2. not-B
Therefore, not-A

The full Latin name is “modus tollendo tollens”, which means “the mode of denying by denying”. It refers to the fact that with this conditional form we’re denying the antecedent by denying the consequent. Everyone today just calls it “modus tollens”.

Let’s look at that checkmate example:

1. If your king is in checkmate then you’ve lost the game.


2. You have not lost the game.
Therefore, your king is not in checkmate.

Sounds reasonable! Put in anything for A and B and it works.

1. If the match is lit then there is oxygen in the room.


2. There is no oxygen in the room.
Therefore, the match is not lit.

Now, I want to show you an easy way of remembering the form.

From the propositional logic course you’ll remember that the contrapositive of a conditional is logically equivalent to the conditional, and you can always substitute the contrapositive form for the standard form without loss of meaning.

Conditional: If A then B
Contrapositive: If not-B then not-A

If we do this, the argument form looks like this:

1. If not-B then not-A


2. not-B
Therefore, not-A.

This is the same argument with the conditional premise written in the contrapositive form.

But we should recognize the form of this argument -- it’s just modus ponens. This is an
argument of the form “If A then B, A, therefore B”, where the antecedent is “not-B”, and the
consequent is “not-A”. Since we know that modus ponens is valid, we can see directly
that modus tollens must also be valid.
So, if you’re ever confused about how to write modus tollens, just remember that if you
rewrite it in terms of the contrapositive then you should recover an argument that has the
form of modus ponens.
2.3 Hypothetical syllogism

Before we move on to the invalid forms, here’s one more valid conditional form that you
should know. It’s sometimes called “hypothetical syllogism” or “hypothetical
argument”, or more informally it’s sometimes called “reasoning in a chain”.

It’s obvious why it’s called this when you see the argument form in action:

1. If A then B
2. If B then C
Therefore, if A then C

It’s what you get when you chain a series of conditionals together, where the consequent
of one becomes the antecedent of another. You can chain as many of these together as
you like.

Note that both the premises and the conclusion are conditionals. In this argument form
we’re never actually asserting that A or B or C is true. All we’re asserting is a set of
hypothetical relationships: IF A was true, then B would follow; and IF B was true then C
would follow. And from this we can assert that IF A was true, then C would follow.

In logic and mathematics, this kind of relationship is called a “transitive” relationship. Some
relationships are transitive and some aren’t. “Tallness”, for example, is transitive. If Andrew
is taller than Brian, and Brian is taller than Chris, then Andrew must be taller than Chris.

On the other hand, being an object of admiration is NOT transitive. If Andrew admires
Brian, and Brian admires Chris, there’s no guarantee that Andrew will admire Chris.

At any rate, what this argument form shows is that the relation of logical implication is
transitive. If A logically implies B, and B logically implies C, then A logically implies C.

This argument form is often used to represent chains of cause and effect, like in this
example:

1. If the cue ball hits the red ball, then the red ball will hit the blue ball.
2. If the red ball hits the blue ball, then the blue ball will go into the pocket.
Therefore, if the cue ball hits the red ball, then the blue ball will go into the pocket.

This is precisely the reasoning that a pool player is using when they’re setting up a
combination shot like this. They have to reason through a chain of conditionals to figure
out how to best strike the cue ball.

It should be obvious, but in case it’s not, you can also use this form to generate a modus ponens type of argument:

1. If A then B
2. If B then C
3. A
Therefore, C
To establish this, note that from premises 1 and 2 we can derive the intermediate conclusion, “If A then C”:

1. If A then B
2. If B then C
Therefore, if A then C [by hypothetical syllogism on 1 and 2]

Then from this new premise, plus the affirmation of A, we can derive C using modus
ponens:

1. If A then B
2. If B then C
3. If A then C
4. A
Therefore, C [by modus ponens on 3 and 4]
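
Chains of conditionals are easy to mechanize. Here’s a small forward-chaining sketch in Python (my own illustration, with placeholder claim names):

    rules = {"A": "B", "B": "C"}  # "If A then B", "If B then C"

    def chain(affirmed, rules):
        """Repeatedly apply modus ponens, following the chain of conditionals."""
        derived = [affirmed]
        while derived[-1] in rules:
            derived.append(rules[derived[-1]])
        return derived

    print(chain("A", rules))  # ['A', 'B', 'C'] -- given A, we eventually get C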

Now, note that with hypothetical syllogism, the order of the terms is important. The
argument form below, for example, is not valid:

1. If A then B
2. If B then C
Therefore, if C then A

This is arguing backwards with conditionals. The direction of logical dependency doesn’t work this way. With conditionals the only valid inference is from antecedent to consequent; you can’t go the other way. To illustrate, let’s assume the following
conditionals are true.

1. If John studies hard then he’ll pass the test.


2. If John passes the test, then he’ll pass the class.

Now, assume that John does indeed pass the class. Does it follow with deductive certainty
that John studied hard for that test? That is, is this argument form valid?

1. If John studies hard then he’ll pass the test.


2. If he passes the test, then he’ll pass the class.
3. John passed the class.
Therefore, John studied hard for the test.

No, it isn’t. Maybe the teacher gave a very easy test that day that John could pass without
studying hard. Maybe he bribed the teacher to pass him. There are lots of possible ways
that these premises could be true and the conclusion still false.

On the other hand, if we knew that he studied hard for the test, we could validly infer that
he passed the class. That would be the valid form of reasoning in a chain.
2.4 Affirming the consequent

“Affirming the Consequent” is the name of an invalid conditional argument form. You
can think of it as the invalid version of modus ponens.

Below is modus ponens, which is valid:

1. If A then B
2. A
Therefore, B

Now, below is the invalid form that you get when you try to infer the antecedent by
affirming the consequent:

1. If A then B
2. B
Therefore, A

No matter what claims you substitute for A and B, any argument that has the form of
modus ponens will be valid, and any argument that AFFIRMS THE CONSEQUENT will
be INVALID.

Remember, what it means to say that an argument is invalid is that IF the premises are all
true, the conclusion could still be false. In other words, the truth of the premises does not
guarantee the truth of the conclusion.
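
This “possible world” test can be automated for simple propositional forms: enumerate every assignment of truth values and look for a row where all the premises are true and the conclusion is false. A minimal Python sketch (mine, using the material reading of the conditional):

    from itertools import product

    def if_then(p, q):
        return (not p) or q  # material conditional

    def valid(premises, conclusion):
        """Valid iff no truth-value assignment makes the premises true
        and the conclusion false."""
        for A, B in product([True, False], repeat=2):
            if all(p(A, B) for p in premises) and not conclusion(A, B):
                return False  # counterexample world found
        return True

    # Modus ponens: If A then B, A, therefore B.
    print(valid([if_then, lambda A, B: A], lambda A, B: B))  # True
    # Affirming the consequent: If A then B, B, therefore A.
    print(valid([if_then, lambda A, B: B], lambda A, B: A))  # False

The same checker reports modus tollens as valid and denying the antecedent (coming up in the next tutorial) as invalid.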

Here’s an example:

1. If I have the flu then I’ll have a fever.


2. I have a fever.
Therefore, I have the flu.

Here we’re affirming that the consequent is true, and from this, inferring that the
antecedent is also true.

But it’s obvious that the conclusion doesn’t have to be true. Lots of different illnesses can
give rise to a fever, so from the fact that you’ve got a fever there’s no guarantee that
you’ve got the flu.

More formally, if you were asked to justify why this argument is invalid, you’d say that it’s
invalid because there exists a possible world in which the premises are all true but the
conclusion turns out false, and you could defend this claim by giving a concrete
example of such a world. For example, you could describe a world in which I don’t have
the flu but my fever is brought on by bronchitis, or by a reaction to a drug that I’m taking.

Another example:

1. If there’s no gas in the car then the car won’t run.


2. The car won’t run.
Therefore, there’s no gas in the car.
This doesn’t follow either. Maybe the battery is dead, maybe the engine is shot. Being out
of gas isn’t the only possible explanation for why the car won’t run.

Here’s a tougher one. The argument isn’t written in standard form, and the form of the
conditional isn’t quite as transparent:

“You said you’d give me a call if you got home before 9 PM, and you did call, so you
must have gotten home before 9 PM.”

Is this inference valid or invalid? It’s not as obvious as the other examples, and partly this
is because there’s no natural causal relationship between the antecedent and the
consequent that can help us think through the conditional logic. We understand that cars
need gas to operate and flus cause fevers, but there’s no natural causal association
between getting home before a certain time and making a phone call.

To be sure about arguments like these you need to draw upon your knowledge of
conditional claims and conditional argument forms. You identify the antecedent and
consequent of the conditional claim, rewrite the argument in standard form, and see
whether it fits one of the valid or invalid argument forms that you know.

Here’s the argument written in standard form, where we’ve been careful to note that the
antecedent of the conditional is what comes after the “if”:

1. If you got home before 9 PM, then you’ll give me a call.


2. You gave me a call.
Therefore, you got home before 9 PM.

Now it’s clearer that the argument has the form of “affirming the consequent”, which we
know is invalid.

The argument would be valid if you had said that you’d give me a call ONLY IF you got home before 9 PM, but that’s not what’s being said here. If you got home at 9:30 or 10 o’clock and gave me a call, you wouldn’t be contradicting any of the premises.

If these sorts of translation exercises using conditional statements are unfamiliar to you
then you should check out the tutorial course on basic concepts in propositional logic,
which has a whole section on ways of saying “If A then B”.
2.5 Denying the antecedent

“Denying the antecedent” is the name of another invalid conditional argument form. You
should think of this as the invalid version of modus tollens.

Below is modus tollens, which is valid:

1. If A then B

2. not-B
Therefore, not-A

Below is the invalid form, known as “denying the antecedent”:

1. If A then B

2. not-A

Therefore, not-B

It’s no mystery why it’s called this. You’re denying the antecedent and trying to infer the
denial of the consequent.

Let’s look at some examples.

1. If the pavement is wet in the morning, then it rained last night.

2. The pavement is not wet this morning.

Therefore, it didn’t rain last night.

It’s not hard to see why this is invalid. It could have rained last night but it stopped early
and the rain on the pavement evaporated before morning. Clearly, these premises don’t
guarantee the truth of the conclusion.

Here’s another one, inspired by an example from the last tutorial:

1. If there’s no gas in the car then the car won’t run.

2. There is gas in the car.

Therefore, the car will run.


Note that we’ve eliminated the negations in the second premise and the conclusion by
using double-negation on the antecedent and the consequent. This still has the form of
denying the antecedent, “if A then B, not-A, therefore not-B”, but the antecedent, A, is
already a negation, so by denying the antecedent you’re saying “it’s not the case that
there’s no gas in the car”, which just means that there is gas in the car.

This one is obviously invalid too. The fact that there’s gas in the car is no guarantee that the car is going to run.

Let’s do a trickier one:

“I know you didn’t say your wish out loud, because if you had, it wouldn’t have
come true, and your wish did come true.”

Hmm. The only way to be really sure about an argument like this is to re-write it in
standard form, either in your head or on paper.

First of all, what’s the conclusion?

This is the conclusion: “You didn’t say your wish out loud”. The word “because” is an
indicator word that flags this. (The “I know” isn’t part of the content of the conclusion, it just helps to indicate that this is an inference that follows from something else.)

Okay, so what is the conditional premise?

In the original it reads “if you had, it wouldn’t have come true”. This is the conditional
premise. To make the antecedent explicit you need to clarify what “it” refers to -- “it” refers
to “your wish”.

Conditional premise: “If you say your wish out loud, then it won’t come true.”
I’ve rewritten the conditional in the present tense, because it will sound more natural when
it’s written in standard form, but you’re not altering the content of the claim in any
significant way by doing this.

Now, what we have left is the phrase, “and your wish did come true”. The “and” isn’t part
of the claim, the claim is “Your wish did come true”. This is the second premise.

Now we have all we need to write this in standard form:

1. If you say your wish out loud, then it won’t come true.

2. Your wish did come true.

Therefore, you didn’t say your wish out loud.

Now, does this argument have the invalid form of “denying the antecedent”?

No, it does not. This argument has the form of modus tollens, and it’s valid.

Some people can see the logical relationships right away just by glancing at the original
argument, but those people are in the minority. Most of us need to check to make sure
we’ve got it right, and the only way to do that is to reconstruct the argument and put it in
standard form, so we can compare the form with the valid and invalid forms that we do
know.
Part 3: Argument Forms Using Generalizations
3.1 Valid and invalid forms using ALL

Let’s look at the valid and invalid argument forms that use “All”.

Here’s an example of one of the most basic valid argument forms:

1. All monkeys love bananas.


2. Rico is a monkey.
Therefore, Rico loves bananas.

Structurally, it has the following form:

1. All A are B
2. x is an A
Therefore, x is a B

In the propositional logic course I talked about the semantics of categorical claims like this,
and I recommend checking that out for a more detailed discussion of categorical claims.

Here I want to point out that the As and the Bs in this argument don’t refer to whole claims,
they refer to categories or classes of things.

In this example, A is the category of “monkeys”. B is the category of “things that love
bananas”. Lowercase letters are used to refer to individual members of a category, so
Rico, an individual monkey, is represented by a lowercase x.

Now, note the directionality of the inference. If x is a member of the category A, then I can
validly infer that x is also a member of the category B. But I can’t go the other way.

Here’s an example of arguing backwards with “all”.

1. All monkeys love bananas.


2. Sammy loves bananas.
Therefore, Sammy is a monkey.

This is obviously invalid. If you grant that all monkeys love bananas, and you grant that
Sammy loves bananas, there’s no reason to think that Sammy has to be a monkey. Lots of
things other than monkeys love bananas.

Note that what makes the argument invalid has nothing to do with being a monkey or
being a thing that loves bananas. ANY argument that has this form …

1. All A are B
2. x is a B
Therefore, x is an A

… is going to be invalid. What makes it invalid is the relationship between the categories
that is being asserted, and the relationship of the individual, x, to the categories.
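
Sets make the directionality vivid. Here’s a Python sketch (mine, with made-up category members) of the valid and invalid directions:

    monkeys = {"rico", "sammy"}
    banana_lovers = {"rico", "sammy", "toucan"}  # hypothetical members

    assert monkeys <= banana_lovers  # premise: All monkeys love bananas

    # Valid direction: a member of A is guaranteed to be a member of B.
    assert "rico" in monkeys and "rico" in banana_lovers

    # Invalid direction: membership in B doesn't put you in A.
    assert "toucan" in banana_lovers and "toucan" not in monkeys

Notice, too, that reasoning in a chain with ALL, which we meet just below, is simply the transitivity of the subset relation: if A <= B and B <= C then A <= C.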

Here’s another valid argument that uses “ALL”.


1. All tigers are mammals.
2. All mammals are warm-blooded.
Therefore, all tigers are warm-blooded.

The only difference between this argument form and the first valid form we looked at is that
the second premise isn’t a claim about an individual, it’s another generalization.

This argument has features similar to “reasoning in a chain” with conditionals, but here
we’re reasoning in a chain with “ALL”.

1. All A are B
2. All B are C
Therefore, all A are C

And like the conditional, the inference is directional; you can argue “forward”, but you can’t
argue “backward”. Here’s a backward version.

1. All tigers are mammals.


2. All mammals are warm-blooded.
Therefore, all warm-blooded things are tigers.

Clearly an invalid argument. I’m warm-blooded, but I’m not a tiger!

There are lots of other combinations we could go through, but these are the basic valid
and invalid argument forms using “ALL” that you should know.
3.2 Valid and invalid forms using SOME

Let’s look at a couple of argument forms that use “Some”.

1. Some monkeys love bananas.


2. Rico is a monkey.
Therefore, Rico loves bananas.

This obviously is not valid. Once you downgrade the generalization from “all” to “some”,
you lose the validity of the inference. If you recall from the propositional logic course,
“some” just means “at least one”, so from the fact that at least one monkey loves bananas,
there’s no good reason to think that a random monkey like Rico is going to love bananas.

So this form is clearly invalid:

1. Some A are B.
2. x is an A.
Therefore, x is a B.

Here’s another example where replacing “all” with “some” makes the argument invalid.

1. Some musicians are rich.


2. Some rich people live in New York.
Therefore, some musicians live in New York.

This is “reasoning in a chain with SOME” and it’s invalid:

1. Some A are B
2. Some B are C
Therefore, some A are C

Reasoning in a chain with “ALL” is valid, but not with “SOME”.

This example may not strike everyone as obviously invalid, since in our world, every claim
in the argument is true. But when we’re assessing validity we’re not allowed to assume
background information that isn’t stated in the premises.

So, in this hypothetical world we know that at least one musician is rich, and we know that
at least one rich person lives in New York. But from this alone we can’t validly infer that
any musicians live in New York.
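
You can make the counterexample world concrete. Here’s a Python sketch (mine, with made-up names) in which both premises are true and the conclusion is false:

    musicians = {"miles"}
    rich = {"miles", "donald"}
    new_yorkers = {"donald"}

    assert musicians & rich               # Some musicians are rich (Miles)
    assert rich & new_yorkers             # Some rich people live in New York (Donald)
    assert not (musicians & new_yorkers)  # ...and yet no musician lives in New York

Since such a world is possible, the premises don’t guarantee the conclusion, and the form is invalid.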

For most purposes this is all you really need to know about categorical arguments that use
“SOME”. The two forms that we’ve looked at are both invalid, and this will generally be the
case when reasoning with generalizations of the form “Some A are B”.

I would add that there are valid argument forms that use “some”, but they tend to be
awkward and not worth memorizing. For example, if I said, “It’s not the case that some
tigers are not mammals, Tony is a tiger, therefore, Tony is a mammal”, that’s a valid
argument, but the form is too convoluted to be helpful as a memory aid.
Introduction to Fallacies
Part 1: Introduction
1.1 What is a fallacy?

Here’s a very general definition of a fallacy:

A fallacy is an argument of a type that is generally recognized to be bad.

So, first and foremost, a fallacy is a bad argument.

But not every bad argument should be labelled a fallacy. What makes it a fallacy is that the
argument has certain general features that allow you to characterize it as a type, and it is
these general features that are responsible for the argument being bad.

This allows you to say that a given argument is bad because it’s an example or instance of
a particular KIND of argument that is generally recognized to be bad.

Here’s an example we’ve seen already.

“If John exercises every day and watches what he eats, then he’ll lose weight. John
lost weight, so he must be exercising every day and watching what he eats.”

This argument is bad because the logic is weak. From the fact that John has lost weight it
doesn’t follow that he’s exercising every day AND watching what he eats. He could be
restricting his diet and not exercising at all. Or he could be exercising and not changing his
diet. Or he could be sick and bed-ridden and he’s lost weight for that reason. And so on.

What makes this a fallacy is that we can recognize that this is an argument of a certain
general type -- it’s an instance of the invalid conditional argument form known as “affirming
the consequent”. For this reason, this is called “the fallacy of affirming the consequent”.

1. If A then B
2. B
Therefore, A

1. If John exercises every day and watches what he eats, then he’ll lose weight.
2. John lost weight.
Therefore, John exercised every day and watched what he ate.
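If it helps, you can verify the invalidity of this form mechanically with a truth table. This little check (my sketch, not part of the course) treats “If A then B” as the material conditional and looks for a row where both premises are true and the conclusion is false:

```python
from itertools import product

# Affirming the consequent: 1. If A then B   2. B   Therefore, A
for A, B in product([False, True], repeat=2):
    if ((not A) or B) and B and not A:   # premises true, conclusion false
        print(f"Counterexample row: A={A}, B={B}")
# Prints: Counterexample row: A=False, B=True
```

The row A=False, B=True is exactly the John case: he lost weight (B) without exercising and dieting (A), yet both premises hold.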

This is an example of what is called a “formal” or “structural” fallacy, because the argument
form is invalid. But not all fallacies are formal in this way.

Here’s an example:

“It’s okay to lie on your taxes. Everyone does it.”

This is a bad argument, but it’s not bad because of its structural form. It’s bad because it
relies on an assumed premise that most of us would reject.
Here’s how you might reconstruct this argument:

1. If everyone does something, then it’s okay to do it.
2. Everyone lies on their taxes.
Therefore, it’s okay to lie on your taxes.

The assumed premise is something like “If everyone does something then it’s okay to
do it”.

When you reconstruct the argument in this way, the logic is perfectly fine, it’s a valid
argument. The problem is with that first general premise. Most people would reject the
view that an action is morally alright as long as everyone does it. What if you lived in a
slave culture and everyone practiced slavery? Would that make slavery morally okay?

This is sometimes called the “bandwagon” fallacy, or “the appeal to common practice”, and
it’s an example of what is usually called a “content” fallacy.

A content fallacy is a fallacy that relies on a false or dubious premise of a certain kind.

Note that here the second premise is false too; not everyone lies on their taxes. But simply
having a false premise isn’t enough to make an argument guilty of a content fallacy. It’s a
content fallacy only if the major premise that the argument is relying on is of a certain
general type that would be judged as false or dubious in the particular case in question. In
this case it’s the first premise, the general premise that relates common practice to moral
acceptability, that is responsible for this being a fallacious argument.

An important point to remember about fallacies is that you can’t judge an argument based
on its superficial form as given, since we often leave out parts and rely on our audience to
fill in the gaps.

So if I give an argument like this …

“Whales can’t breathe underwater. They’re mammals.”

... it would be unfair to say, for example, that it’s bad because it’s logically weak, since
there’s no stated connection between being a mammal and being able to breathe
underwater.

In other words, it would be unreasonable to evaluate the argument based on this interpretation ...

1. Whales are mammals.
Therefore, whales can’t breathe underwater.

… and argue that the logic is weak.


Why? Because it’s obvious that the argument relies on an implicit premise, “Mammals
can’t breathe underwater”. Only after you add this premise can you then evaluate the
argument. And in this case the argument is good. Both premises are true and the
argument is valid.

So, let’s summarize:

One: A fallacy is an argument of a type that is generally recognized to be bad.

Two: You should evaluate an argument only after the argument has been
reconstructed to include any implicit or assumed premises.

Very often, the fallacy only becomes obvious after you’ve reconstructed the reasoning.
1.2 Categorizing Fallacies

Fallacies are often categorized into different groups or families. We’ve already seen one
type of categorization, between formal or structural fallacies and content fallacies.

If you search online it’s not hard to find long lists of fallacies grouped into hierarchies, like
a biological classification scheme. Here’s one example. This is Gary Curtis’s website:

FallacyFiles.org

It’s a great resource, tons of information on different types of fallacies.

Now if you click on the link titled “Taxonomy” then you go to a page that has over a
hundred different fallacies organized into a hierarchy. Here’s a part of the hierarchy:

As you move from left to right you have general fallacy types, then sub-types of that type,
then sub-sub-types of that type, and so on.

So, within the category of formal fallacies there are a variety of sub-types, including
fallacies of propositional logic. Within this category you can see some that you should
recognize if you’ve watched the tutorial course on common valid and invalid argument
forms. In particular we looked at affirming a disjunct, affirming the consequent, and denying
the antecedent.

As you can see, you could spend a lot of time just looking at formal fallacies.
Now let’s shift down a bit. This is a section of the hierarchy rooted in the category of
“informal fallacies”:

"Informal fallacy" is another way of identifying fallacies where the problem with the
argument doesn’t come down to an issue of logical form. The problem has to do with
the content of what’s actually being asserted, or with other aspects of argumentation.

Notice here you can find the “bandwagon fallacy" that I mentioned in the previous tutorial.
It’s been categorized as a sub-type of the category of “red herrings”.

Now, why am I showing you this?


Well, it’s to make a point about the pros and cons of learning logic and argumentation by
studying fallacy types.

There’s no doubt that you can learn a lot about logic and critical thinking by studying and
memorizing fallacy types. And when you’re given a classification scheme like this it can
help you to understand how different types of fallacies relate to one another.

There are some downsides, however.

First, it’s easy to get lost in all of this.

There are so many fallacies, it’s hard to remember their names, it’s easy to get confused.

Second, it’s easy to lose sight of the basic principles of argument evaluation if your focus
is entirely on memorizing fallacy types.

Every fallacy is just a bad argument, and arguments are bad either because they have
weak logic, or they rely on a false premise, or they violate some other basic principle of
argumentation. In principle you should be able to analyze any argument in terms of a small
handful of basic principles.

But you can lose sight of these basic principles if you start thinking of argument analysis
as essentially an exercise in pigeon-holing arguments into fallacy types.

But there are some up-sides!

Some fallacy types are very well-known and commonly referred to by their names, like
“straw man” and “red herring” and “ad hominem”. It’s important for basic “critical thinking
literacy” to know some of these more common fallacy types.

I want to mention one other reason why studying fallacies can be helpful.
A fallacy type is a kind of “pattern”. At first, learning to categorize arguments into fallacies
can be hard, because you haven’t yet internalized the logical patterns, you find yourself
needing to check and re-check the definitions to make sure you’ve got the right one.

But after a while you do start to internalize the patterns, and then something cool happens.
You can be given an argument and you’ll be able to recognize a fallacy in it without doing
a lot of conscious analysis in your head; you can just “see” it, because your brain has
learned to recognize and respond to the pattern.

I think it’s harder to develop this pattern recognition skill if you’re always starting your
argument analysis from first principles. So this is another reason, and I think the best
reason, why studying fallacy types is important for developing critical thinking skills.

So, we’ll be looking at a few of the more common fallacy types in this tutorial course.

I’m not going to make a big deal about categorizing fallacies into a hierarchy of types. Why
not? Well, first, because we’re not doing a comprehensive survey of fallacy types.

And second, because there isn’t a universal consensus on how to categorize fallacies. If
you look at different online sources or at different textbooks, you’ll find a range of
classification schemes. Some categories are universally used, but others aren’t, and I
don’t want to waste time arguing about classification schemes.

One thing I will try to do is show how each fallacy type can be analyzed using basic
principles of argument analysis, to make it clear where and how each fallacy violates the
basic definition of a good argument.

I think this helps avoid the problem of focusing too much on definitions of fallacies and
losing touch with the basic logical principles that underlie them.

Once you see this, you see that identifying fallacies by name isn’t really what’s
important. What’s important is being able to recognize a bad argument and understand
why it’s bad.

1.3 The rules of rational argumentation

Formal or structural fallacies involve problems with the logic of an argument. They
violate what we’ve called the “Logic Condition”.

Content fallacies involve problems with the truth or plausibility of one of the premises of
an argument; they violate what we’ve called the “Truth Condition”.

But some fallacies don’t fit easily into either of these categories. Sometimes an
argument is bad because the arguer either isn’t willing or isn’t able to reason well. This is a
different kind of problem than what we’ve seen before.

In part 3 of this tutorial course we’ll be looking at fallacies that fall into this category, that
are best viewed as violations of the basic rules or principles on which rational
argumentation depends. In this video we’ll introduce a few of these rules.

Here’s an example of the kind of rule we’re talking about.

Rule #1:

You can’t argue with someone who is intentionally trying to mislead or deceive you.

In short, if someone is willing to lie to persuade you of their position-- to assert a premise
as true when they know it’s false, or vice versa -- and you know that they’re intentionally
misleading you, then there’s no point in engaging their argument, unless it’s to uncover the
lies and unmask the deception. Genuine argumentation requires that all parties be open
to rational persuasion, and a willingness to lie or mislead indicates that the person really is
more interested in rhetoric and persuasion than offering and responding to good reasons.
Rule #2:

You can’t argue with someone who is UNWILLING to reason well.

Rule #1 is really just a special case of this rule. This is very broad, I know, but I’m thinking
of cases where someone’s mind is clearly made up, and their main aim is to convert
people to their side, or undermine opposing views, by whatever means are judged to
be most effective. We can all think of examples of people or occupations that are prone to
this: strongly ideological politicians come to mind; paid spokespersons and spin doctors;
lawyers advocating for a client; religious ideologues; anyone who is invested in a certain
position or outcome, who is not genuinely open to rational persuasion from opposing
viewpoints, and who is willing to use rhetorical devices in place of good argumentation to
further their case.

In cases like this, it really makes no sense to argue with such people, because even
though they may deploy arguments on occasion, they’re not really engaged in
argumentation.

Rule #3:

You can’t argue with someone who is UNABLE to reason well.

Here I’m thinking of cases like, if someone is very upset or very emotional, they’re often
not able to reason well, and if that’s the case, arguing with them really isn’t appropriate.

Also, younger children don’t have as developed a capacity for reason as adults, so arguing
with them is often not appropriate either.

And we have to admit that some adults can have a very difficult time following the logic of
more complex arguments. In such cases just repeating the argument isn’t helpful, some
other strategy is called for, maybe simplifying the argument, or tutoring the person on how
the logic works using other examples that are easier to grasp. But when we’re tutoring or
educating we’re not arguing anymore, we’re doing something different.

Rule #4:

An argument has to give reasons for believing or accepting the conclusion.

Of course this is obvious, it’s part of the definition of an argument. But as we’ll see, there
are whole categories of fallacies where the main problem is that the argument doesn’t
actually give its intended audience any reasons to accept the conclusion.

So, the take-home message is that if these rules are violated, then the conditions for
genuine argumentation simply aren’t present.

And the final point to note, which we’ll see in Part 3 of this tutorial course, is that there are
some important fallacies that are really best understood as violations of the rules of
rational argumentation. We’ll look at the strawman or strawperson fallacy, the red
herring fallacy, and the fallacy of begging the question, as examples.

Part 2: Some Important Content Fallacies


2.1 Ad hominem (abusive)

“Ad hominem” is the name of a well known fallacy type. The name is derived from Latin
meaning “to the man”, or “to the person”. It’s the fallacy of rejecting a claim or an
argument given by someone because we don’t like something about the person.
We’re mistaking criticism of a person for criticism of a claim or an argument.

There are several different kinds of ad hominem fallacies. In this video we’ll look at the
most blatant form of ad hominem, the abusive ad hominem.

Here’s the most blatant version of this most blatant form of ad hominem: “Your argument is
bad because YOU SUCK.”

This is a fallacy because even if it’s true that you suck, your sucking isn’t relevant to the
goodness or badness of your argument.

If your argument is bad it’s bad because it has a dubious premise, or it has weak logic, or
some other necessary condition for an argument to be good is violated. Your sucking
might be a reason not to like you personally, but it’s not a reason to reject your argument.

Here’s a less blatant and more challenging example:

1. Hitler argued for the superiority of the Aryan race and the inferiority of the Jews.
2. But Hitler was a murderous, megalomaniacal anti-semite.
Therefore, we should reject Hitler’s arguments.
In his book Mein Kampf, Adolf Hitler gives his account of race and history and famously
argues for the superiority of the Aryan race and the inferiority of Jewish people.

Now, what if I said that we should reject Hitler’s arguments because he was a mass
murderer, or an insane megalomaniac, or racist and anti-semitic, or whatever other nasty
thing you want to say about him.

And let’s say that these nasty things are all true. Would these true facts about Hitler’s
character give us good reason to reject Hitler’s arguments about racial differences?

Here’s a case where a lot of people, maybe most people, will say that this DOES give us
good reason to reject his arguments.

But, if we accept that the ad hominem is indeed a fallacy, that it’s a mistake to reject an
argument based solely on qualities of the person giving the argument, then we have to
reject the argument given here. This does not give us good reason to reject Hitler’s
arguments.
Now, I think Hitler’s arguments are bad, I’m hoping that most people viewing this do too,
but the point is they’re not bad because Hitler was bad; they’re bad because they violate
one or more of the necessary conditions for an argument to be good.

This works the other way too, of course. If Mother Theresa gives an argument for giving
charity and aid to the poor, and we think Mother Theresa is a moral saint, that shouldn’t by
itself count as a reason to accept her argument for giving aid to the poor.

We wouldn’t normally call this an ad hominem fallacy, of course, since the term is usually
associated with criticism rather than praise, but it’s still a fallacy, and for the exact same
reasons. The problem isn’t with criticism or praise, it’s with confusing the judging of a
person with the judging of an argument.

Now, let’s look at this example from an argument analysis perspective. Let’s ignore the
fact that we’ve already labelled it an ad hominem fallacy and ask ourselves how we would
normally assess this argument using the tools we’ve learned so far.

Well, there are two basic questions to ask: Does the argument satisfy the Truth Condition,
and does it satisfy the Logic Condition? In other words...

1. Are all the premises true or plausible?

and

2. Is the argument valid or strong?

If the answer to either of these is “no” then the argument is bad.

Well, premise 1 is clearly true, and premise 2, while it uses a lot of loaded and judgmental
language, would be regarded as true or at least defensible by many people. As the
argument is given, the problem isn’t with the truth of the premises. As given, the problem is
with the logic. If those two premises are true, the conclusion doesn’t follow either with
certainty or with high probability. In other words, the logic of the argument is neither valid
nor strong -- it’s weak.
So, one way to evaluate this argument is simply to say that, as given, it’s bad because the
logic is weak. And that’s true. But I want you to recall now the discussion about putting
arguments in standard form way back in the first tutorial course on Basic Concepts in
Logic and Argumentation.

There we emphasized that arguments are often presented as incomplete and rely on
background assumptions to be interpreted correctly; an argument might be weak as given,
but it might be relying on a background premise that would make it strong or valid. So in
general you’re always encouraged to look for implicit background assumptions like this,
and only evaluate the argument after you’ve reconstructed it, and that means
making explicit any background assumptions that the argument is relying on. Only then
should you go ahead and evaluate whether the argument is good or bad.

So, does this argument rely on a background premise that makes the argument valid or
strong? Well, it’s not hard to see what would be required to make the logic work. You can
often use a simple conditional claim, if A then B, or a generalization like All A are B, to fix
the logic of an argument.

So here you might add a conditional claim like this: “IF Hitler was a murderous,
megalomaniacal anti-semite, then his arguments on racial superiority are (very
likely) bad.” If you add the “very likely” then the argument is just strong, if you take it out
then it’s valid.

It’s acceptable to add an assumed premise like this because it’s clear that we’re not
putting words into the arguer’s mouth -- it reflects what the arguer is trying to get at. We
have every reason to think that someone advancing this argument would accept a premise
like this.

So with this reconstruction we’ve fixed the logical problem. It’s not appropriate anymore to
say that the argument is bad because the logic is weak. The logic is fine.

The problem, now, is with the plausibility of that assumed premise. If this argument is
bad, it’s bad because this assumed premise is false or dubious.

Now, this is a hard case for some people because it’s still very tempting to think that this
kind of character flaw is relevant to assessing the goodness of the arguments given, but by
now I hope it’s clear why this is a mistake.

An argument is a collection of claims, linked by relations of logical entailment or support.


The plausibility or implausibility of those claims, and the validity or invalidity of the
argument given, isn’t determined by facts about the moral character of the person
asserting the argument.

I grant that in cases like this it’s tempting to make the slide from criticism of a person to
criticism of an argument, but that’s a mistake. The value of discussing a hard case like this
is that if you can see the fallacy here, then you’ve probably understood the essence of the
fallacy. Facts about someone’s moral character, by themselves, don’t make it any more or
less likely that their arguments are good or bad.
So our final assessment is that this is a bad argument, and it’s bad because the
background assumption necessary to make the argument valid or strong is false or
dubious. So it violates the TRUTH CONDITION, as we’ve defined that term.

This might seem like a long-winded way of saying that the argument is fallacious, but the
point of this discussion is to show why it’s a fallacy, why ad hominems in general are
fallacies, by showing how the argument violates one of the basic conditions for an
argument to be good.

To sum up, we can say a few things in general about ad hominems.

When you reconstruct them, ad hominem arguments typically rely on the following types of
assumed premise:

Almost any CLAIM that a person makes about topic X is (probably) FALSE, because
of some feature of that person.

or

Almost any ARGUMENT that a person gives about topic X is (probably) BAD, because
of some feature of that person.

The ad hominem is a fallacy whenever these implicit premises are false or dubious.

If you include the terms in brackets you get a more qualified version of the premise that
would make the argument strong, rather than valid.

Now, the characterization given here is somewhat broader than your typical abusive ad
hominem -- you get your typical ad hominem when you base your objection on a criticism
of someone’s character. But this broader characterization is helpful because it also
covers ad hominem cases that don’t necessarily involve insulting a person or criticizing
their character, as we’ll see in the next couple of tutorials.

Finally, I want to direct your attention to the “whenever” in that final statement. You commit
an ad hominem fallacy when you give an argument that relies on premises of this type,
but it’s only a fallacy if the premise is false. I want to point this out because, as most
textbooks will tell you, premises of this type aren’t always false, and in these cases, the
arguments don’t commit the ad hominem fallacy.

Here’s an example:

1. Johnny says that he saw Mrs. Jones stab her husband.
2. But Johnny is a known liar and has a motive to lie in this case.
Therefore, Johnny’s testimony does not give good reason to conclude that Mrs.
Jones stabbed her husband.

Johnny is on the witness stand testifying against Mrs. Jones in a murder case. He says
that he saw Mrs. Jones stab her husband. The argument for her guilt relies solely on his
testimony.
Now, in a case like this, where an argument relies on trusting someone’s testimony, facts
about a person’s character and motives ARE relevant to assessing the argument. If it’s
true that Johnny is a known liar, and he has a motive to lie in this case -- maybe he himself
is a suspect in the murder -- then it makes perfect sense to reject an argument that is
based solely on Johnny’s testimony.

So, while this argument for rejecting Johnny’s testimony does rely on claims about
Johnny’s character, it doesn’t commit the ad hominem fallacy, because in this case the
claim about his character is relevant to assessing the argument.

This example shows why we needed to qualify our characterization of the ad hominem.
We commit the ad hominem fallacy whenever the argument relies on premises like these,
and the premises are false -- it’s a fallacy because the argument violates the Truth
Condition. But premises like these aren’t always false.

In this case, the implicit assumption we’re making about Johnny’s testimony is that it’s
probably false, or at least we don’t have good reason to think it’s true, because Johnny
has a record of false testimony and a motive to lie in this case.

And in this case it’s a perfectly reasonable assumption. So the argument doesn’t violate
the Truth Condition, and consequently doesn’t commit the ad hominem fallacy.

Now, this discussion raises the question of whether there are any general rules for
deciding when the relevant assumptions are true or false. Well, to my knowledge this is
still a subject of debate among experts on the philosophy of argumentation.

But on a case-by-case basis it’s not hard to spot exceptions to the fallacy, so your best
guide, I think, is to look at cases as they come up and ask yourself whether the truth or
plausibility of a central premise in the argument really does turn on facts about the arguer.
The best examples are arguments that rely solely on the authority or testimony of an
individual, but context matters a great deal too.
2.2 Ad hominem (guilt by association)

Ad hominems can come in a variety of forms. The most blatant forms involve personal
attacks -- these are the “abusive” ad hominems. But some forms are more subtle. A very
common form of ad hominem fallacy involves guilt-by-association.

I was inspired to do a tutorial on this after the recent 2008 federal elections here in the US.
Criticism of candidates based on their associations has always been a part of politics, but
the number and frequency of guilt by association arguments that we heard in this
campaign was notable (in my experience at least).

We saw it most often with criticisms of Barack Obama from various conservative circles,
where it was argued that Obama had many “radical associations” and that these indicated
that he himself was much more socially and politically radical than he was letting on.

This has the structure of a “guilt-by-association” argument. X believes A, X has an
association with Y, and you conclude that Y probably also believes A.

Not everyone classifies guilt-by-association as an ad hominem argument, but it’s easy to
see how the main ideas can be used to generate an ad hominem-type argument.

1. Obama says X.
2. But Obama is associated with people who say Y, which contradicts X.
Therefore, Obama probably believes Y instead of X.

This is one way of phrasing the reasoning. Obama says X, but he’s associated with people
who seem to deny X, or say other things, Y, that seem to contradict X.

So we conclude that Obama probably doesn’t believe X, or is more sympathetic to Y than
he lets on.

The conclusion of an argument like this usually isn’t very specific -- but its primary use is to
ground a charge of hypocrisy or misrepresentation, and this is generally how it was used
against Obama.

I don’t want to suggest, by the way, that only Republicans are guilty of this sort of
reasoning. One of Obama’s main political tactics was to stress John McCain’s associations
with President Bush and the policies of his administration.

But one could argue that guilt-by-association was a much more prominent feature of
the campaign against Obama than it was in the campaign against McCain. Certainly there
was more media discussion of the use of this argument form against Obama than there
was of its use against McCain.

Now, back to our main concern, is this a fallacy?

Looking at the argument above, it’s clear that, as stated, the argument is bad, and it’s bad
because the logic is weak -- the conclusion simply doesn’t follow from those premises.

Why doesn’t it follow? Two reasons.


First, it’s missing a premise that connects being associated with someone who believes Y,
with the conclusion that you probably believe Y too. So you’d have to add a premise to that
effect to fix the logic.

Second, the term “associated with” is too vague to be informative. Any defensible version
of this argument would have to get very specific about the kind of association that is at
issue, and the added premise to fix the logic would have to say something specific about
how that particular association gives reason to believe that a person is lying about their
stated beliefs.

Now, in principle this is doable. Certain kinds of associations may give good reason to
question someone’s honesty. But very often these details aren’t given, and the argument
relies on vague and general claims like the one above.

Under these conditions, the argument is bad and guilty of the ad hominem fallacy of
“guilt-by-association”.

But I said that in principle one could make an argument like this work if you were more
specific about the kind of association you have in mind, and how that association supports
the conclusion.

The problem with this strategy is that the additional premises needed to make the logic
work tend to rely on generalizations that aren’t very plausible, or claims of a specific nature
for which there just isn’t good evidence.

To make the point, let’s look at some examples:

1. Obama says he has always condemned the bombing of public buildings
conducted by the Weather Underground in the 1960s and 70s (the group that Bill
Ayers helped to found).

2. Obama has had casual but friendly relations with Ayers since 1995, and has
served on a couple of administrative boards with him.

Therefore, Obama probably condones the actions of the Weather Underground.

In the case of Bill Ayers, one of the ways that the guilt-by-association argument has played
out looks like this. The Weather Underground was a radical protest group that Ayers
co-founded in the 60s, and in the 60s and 70s they were responsible for some bombings of
government buildings as part of their protest against the Vietnam war.

Ayers is now a Professor of Education at the University of Chicago, and for many years
he’s been active in education reform and the fight against poverty in the Chicago area. He
and Obama met in Chicago in the mid-1990s while Obama was working as a community
organizer. They’ve served on a couple of boards together and by both of their admissions,
have generally had friendly though not particularly close relations over the years.

Now, if the conclusion we’re after is that in virtue of this association with Ayers, we have
good reason to think that Obama actually condones the bombings of those government
buildings carried out by the Weather Underground, then it’s obvious that the logic is
still weak. To fix it, you’d need a premise like this:
“Anyone who has friendly relations with a person (of the sort described in premise
2) probably condones the actions of that person.”

This would fix the logic and make the argument strong. However, this premise, as a
generalization, is wildly implausible. We can all think of examples of friends and
acquaintances who have done bad things in the past that we judge to be wrong, but with
whom we nevertheless remain friends or acquaintances.

The same applies for political affiliations. Having friendly relations with people who lean
strongly to the left doesn’t by itself give good reason to think that you lean strongly to the
left. This is what I mean when I say that guilt-by-association arguments often rely on
generalizations about people that are implausible.

Now, maybe Obama is more sympathetic to radical views than he lets on. My point is
that this kind of argument doesn’t give good reason to think so.

For the sake of contrast, an argument that WOULD give us good reason might look like
this:

1. Obama says he has always condemned the bombing of public buildings
conducted by the Weather Underground in the 1960s and 70s (the group that Bill
Ayers helped to found).

2. But we have tape recorded evidence of Obama speaking to Ayers in private,
where he admits that in fact he condones the radical actions of the Weather
Underground, and admires the people who had the courage to take them, but
realizes that he can’t say so in public.

Therefore, Obama condones the bombings of government buildings carried out
by Ayers and his associates.

If we knew -- maybe because we have tape recorded evidence -- that Obama had private
meetings with Ayers, where he admits that he condones the activities of the Weather
Underground, but acknowledges that he can’t say this in public without destroying his
political career, then of course we’d have good reason to accept the conclusion. If the
premises were true, this would be a good argument. But this is the problem: we don’t have
any evidence that this new premise is true.

Also note that when your association is very specific like this, and contains information that
directly supports the conclusion, then you’re really not dealing with a guilt-by-association
argument anymore, since the mere association with Ayers isn’t what’s driving the
inference; it’s the tape-recorded evidence of Obama’s own words that is driving the
inference.

I think this is a common pattern with guilt-by-association arguments. If the argument is
running solely on the association, then it’s generally a bad argument. But if the
association is specific enough and contains information that directly supports the
conclusion, then it’s really not a guilt-by-association argument anymore, it’s an argument
based on more tangible and relevant forms of evidence.

The upshot is that guilt-by-association is a fallacy when the argument relies entirely on
the association to drive the conclusion; but if it relies on other kinds of information, then
it’s not a guilt-by-association argument anymore.
2.3 Appeal to hypocrisy (tu quoque)

An “appeal to hypocrisy” is a type of ad hominem where you reject someone’s
conclusion or someone’s argument because that person is somehow
being inconsistent or hypocritical. The Latin term for this is the tu quoque fallacy, which
means something like “You, too”, or “You, also”.

Here’s a typical setup.

“Jason, you spent an hour last week lecturing me on the evils of eating factory-
farmed meat products. But I saw you buying a McDonald’s hamburger yesterday!”

This isn’t a fallacy yet. It’s not even an argument. You get the fallacy when you conclude
something like

“Why should I take your arguments seriously, since you’re obviously a hypocrite!”

This is a fallacy if the suggestion is that Jason’s arguments against meat-eating are bad,
or that his conclusions are false, simply because Jason himself is a hypocrite.

Now, why is this a fallacy?

Well, because whether Jason’s arguments are good or bad is independent of his own
beliefs or behavior.

The charge of hypocrisy might be justified, but that alone won’t change a true premise into
a false premise, or a valid argument into an invalid argument. To think otherwise is to
mistake the person for the argument.

Here’s a schematic version of the fallacy:

1. X gives argument A for conclusion C.
2. X does not believe the conclusion C, or acts in ways that are inconsistent with C.
Therefore,
A is a bad argument and should be rejected,
or
C is false and should be rejected.

A person X gives an argument A for conclusion C. We discover that X, the person giving
the argument, doesn’t actually believe the conclusion, or maybe acts in ways that are
inconsistent with the conclusion (like buying and eating a hamburger at McDonald’s after
arguing that eating factory-farmed animal products is wrong).

Then we infer from this that the argument is bad and should be rejected, or that the
conclusion is false and should be rejected. So we’re moving from claims about the person
making the argument, to claims about the argument itself.

From an argument analysis standpoint, as given, this argument form is bad because the
logic is weak. The conclusion simply doesn’t follow from those premises.
Now, to fix the logic you could always add a premise like this:

“If X doesn’t believe their own conclusion, or acts in ways that are inconsistent with
that conclusion, then A is (probably) a bad argument, or C is (probably) false.”

This is a conditional claim that ties together the premises and the conclusion. IF I don’t
believe my own conclusion, or act in ways that are inconsistent with my conclusion, THEN
my arguments for this conclusion are probably bad, or my conclusion is probably false.

But now the problem with the argument is with this additional premise. We just don’t have
any reason to think it’s true. Facts about ME and MY beliefs are irrelevant to whether my
ARGUMENT is good or bad.

If I say that eating factory-farmed animals is bad because, say, factory-farming methods
cause unnecessary suffering to animals, and it’s wrong to inflict unnecessary suffering on
animals, then what makes these claims true, if they are true, isn’t anything to do with me.
What makes them true is facts about factory-farming methods, or facts about the moral
status of animals and their suffering. I could be a closet sadist and enjoy torturing animals
in private, but that has no bearing on the truth or falsity of the claims being made in the
argument.

But this is just to repeat what we’ve been saying about ad hominem arguments all along,
that they’re based on a false belief that facts about a person are relevant to assessing
facts about arguments.

On the other hand, like with our previous examples, there are cases where charges of
hypocrisy are relevant.

They’re relevant when the issue at hand is either about someone’s character, or about the
consistency of the views they hold.

So, in the case of Jason, the moralizing anti-factory farming guy who secretly enjoys the
occasional McDonald’s hamburger, his behavior would be relevant if the issue is, say,
whether Jason has integrity or is a good public spokesperson for the animal rights
movement, but it’s not relevant if the issue is whether factory farming is good or bad.

2.4 Appeal to popular belief (or practice)

An appeal to popular belief says that an argument is good or bad, or a claim is true or
false, because it is widely believed to be so. An appeal to popular practice is similar
except we’re dealing not with beliefs but with practices, things you do, like giving to charity,
or spanking your kids.

Here’s an example. As tax season approaches, you might hear the expression “well,
everyone lies on their taxes”. So if our unscrupulous Jason says “Yeah, there was
some income that I didn’t declare on my taxes. But look, everyone lies on their
taxes”, this would be an appeal to common or popular practice.

Putting the argument in standard form, and adding the key premise, it looks like this:

1. If (almost) everyone lies on their taxes, then it’s okay to lie on your taxes.
2. (Almost) everyone lies on their taxes.
Therefore, it was okay for me to lie on my taxes.

The key premise is the first one, this is the premise that asserts that if everyone or almost
everyone does something, then it’s acceptable to do it.

Appeals to popular belief or popular practice are fallacies if that first major premise is
false, or dubious. The logic works fine, it’s the truth of the premises that is at issue.

In this case I think most of us would agree that even if everyone did lie on their taxes, that
by itself wouldn’t justify lying on your taxes. Would we want to say that if everyone stole
things that didn’t belong to them, then stealing would be okay, or if everyone, or the
majority, believed that slavery was acceptable, then slavery would be acceptable?

Here’s an example of appeal to popular belief.

“Surveys tell us that over 90% of the population believes in some form of God or
higher power. Surely we can’t ALL be wrong.”

This is an appeal to popular belief rather than popular practice because the issue is
whether a claim is true or false, not whether a practice is acceptable or unacceptable.

In standard form the argument might look like this:

1. If (almost) everyone believes X, then X is (probably) true.
2. (Almost) everyone believes in some form of God or higher power.
Therefore, it’s (probably) true that there exists some form of God or higher power.

Here I’ve written the key premise as a statement form with X as a placeholder for whatever
claim is at issue. Every appeal to popular belief relies on a premise of this or a similar
form, whether it’s explicitly stated or not.

Once again, the point to note is that the logic isn’t the problem with an argument like this,
the problem is with the truth or falsity of the premises. That’s what makes it a “content”
fallacy rather than a logical fallacy.

Let’s assume the second premise is true. In this case, the argument is fallacious just in
case that first major premise is false.

And in most cases where the claim at issue makes an assertion about what exists or
doesn’t exist objectively in the world, this premise is going to be false. Simply believing
that something exists doesn’t make it exist.

On the other hand, sometimes the claim at issue is about what people believe, like this
example:

“Vanilla is the most popular flavor of ice cream in the world.”


If this is true, it’s true simply in virtue of the fact that more people prefer vanilla to any other
flavor of ice cream. So the appeal to popular belief is relevant because what’s at issue is
precisely what people believe.

So if you surveyed people and found out that this was the case, then of course you
wouldn’t be guilty of this fallacy.

On the other hand, there’s obviously another sense in which whether or not vanilla IS the
most popular ice cream ISN’T determined solely by popular belief about the issue.

If you just ask people what they think the most popular ice cream is, and the majority says
“chocolate”, that doesn’t by itself make chocolate the most popular ice cream, since the
majority could easily be mistaken about what the actual preferences of people are. It’s
possible that in fact more people prefer vanilla to chocolate, but more people think that
chocolate is the more popular flavor.

We’re not being contradictory here, because we’re saying two very different things. In the
first case, we’re saying that if more people actually prefer vanilla to any other flavor of ice
cream, then vanilla really is the most popular flavor, since this is what it means for it to be
the most popular.

And this is clearly true, so there’s no fallacy here.

In the second case, we’re saying that if more people believe that vanilla is the most
popular ice cream, then vanilla really is the most popular ice cream. But this is clearly
false. An argument that relied on THIS kind of premise would be guilty of a fallacious
appeal to popular belief.

So, to sum up, appeals to popular belief or popular practice generally rely on major
premises like these, whether they’re stated explicitly or not:

“If (almost) everyone believes X, then X is (probably) true.”

“If (almost) everyone does X, then X is (probably) okay or acceptable.”

These arguments are fallacious when the major premise is judged to be false or dubious.
2.5 Appeal to authority

An appeal to authority says that an argument is probably good or bad, or a claim is
probably true or false, because an authority says so.

The authority in question is often a person, but it can also be a book, or a website, or an
institution. What makes it an appeal to authority is that the justification for the inference
rests primarily on the authority of the source.

Not all appeals to authority are fallacious. The trick is to figure out when they are and
when they aren’t.

Here’s an appeal to authority:

Two kids are talking about life on other planets and one reports that his Dad says that
Venus is too hot to have life on it. The other kid is dismissive, he says “So, what does
he know?”. The first kid responds that his dad is a planetary scientist who works for
NASA.

Assuming that he’s not lying and his Dad really is a planetary scientist, this looks like it
could be a good appeal to authority.

On the other hand, if he’d said this …

“Oh, my dad looked it up on a website.”


 ... then the argument wouldn’t be as convincing. Now the claim rests on the authority of a
nameless website. Without anything else to go on, this is a bad argument, since we don’t
know anything about the reliability of the website. It could be right, the internet is full of
reliable information, but it’s also full of false information and crackpot sites -- “the world
wide web”, as a collective body, can’t be treated as a reliable authority on anything.

Every appeal to authority relies on a claim like the following:

“(Almost) anything that A says about S is (probably) true.”

where A is the authority and S is the subject matter in question.

An appeal to authority is good just in case a claim of this sort can be plausibly defended. If
it’s true, then you can use a claim like this as a premise and use it to infer the truth of
claims about S, the subject matter in question.

On the other hand, if we don’t have good reason to think the claim is true, then it’s a
bad appeal to authority, and guilty of a fallacy.

So our planetary scientist example might look like this:

1. (Almost) everything that a planetary scientist says about the conditions
necessary for life to exist on a planet is (probably) true.

2. James is a planetary scientist.

3. James says that Venus is most likely too hot for life to exist.

Therefore, Venus is most likely too hot for life to exist.

The conclusion follows, the logic is fine. The only question is whether that first premise that
makes the authority claim is plausible or not. If we think it’s plausible then we should judge
the argument to be good, if we don’t then we should judge it bad, that it’s a fallacious
appeal to authority.

Unfortunately there’s no easy rule for judging authority claims. It rests entirely on
our background knowledge. To judge this claim we have to know something about what
planetary scientists do, what their area of expertise is, how close the claim in question is to
their area of expertise, and so on.

In this case my first reaction is that a planetary scientist is a very good authority on this
kind of question. It seems right up their alley.

I’ve had students challenge this example though. They think that the term “life” is too
broad, and they’d want to restrict the authority claims of a scientist to “life as we know it”.
Maybe organic life as we know it can’t exist on Venus, but maybe there are other kinds of
living things that could evolve or survive on Venus, maybe non-organic life forms that
operate on very different physical principles than organic life on earth does. A planetary
scientist isn’t necessarily an expert on all possible forms of life -- maybe NO ONE is an
expert on this.

So they would reject premise 1 as it stands, but they would accept an amended form of the
argument like this ...

1. (Almost) everything that a planetary scientist says about the conditions
necessary for life as we know it to exist on a planet is (probably) true.

2. James is a planetary scientist.

3. James says that Venus is most likely too hot for life as we know it to exist.

Therefore, Venus is most likely too hot for life as we know it to exist.

… where we’ve restricted the claim at issue to life “as we know it”.

Now, they say, we’ve got a good appeal to authority.

I’ll buy this. This seems like a reasonable amendment to the original argument. And it
illustrates nicely the kind of thinking you might have to do when evaluating appeals to
authority. You really have to think hard about whether the proposed authority really has
relevant expertise on the matter in question.

In lots of cases the answer is obvious. My daughter’s eleven year old friend isn’t going to
be a reliable authority on quantum field theory, but she may well be an authority on who
the popular and unpopular kids are in her class at school.
In other cases the judgment isn’t so obvious and people’s initial reactions may differ.

Here’s an example where the claim at issue is what happens to us after we die.

“The Pope says that when we die, if we’ve lived a good life, we go to Heaven.”

When the POPE makes a claim about this, how should we judge his authority on the
matter?

Well, a devout Catholic may well treat the Pope as an authority on such things, and they
would judge the argument to be good.

But you might find that even among practicing Catholics there is disagreement about what
kind of authority the Pope really has, and certainly among non-Catholics and atheists
you’re not likely to find many who take the Pope to be an authority on the afterlife. Many
might question whether anyone could be an authority on a question like this.

This just highlights a fact that we discussed in the very first tutorial course on basic logical
concepts -- judgments about the plausibility or implausibility of premises can vary from
audience to audience, depending on the background assumptions that different audiences
bring to the table.

There’s no getting around it, and appeals to authority are particularly sensitive to this kind
of variation.

So, to sum up:

1. Appeals to authority rest on claims that assert that “Anything, or almost everything,
that A says about S is true, or probably true.” This is the “authority claim”.

2. An appeal to authority is good when the authority claim is plausible;
it’s fallacious when the authority claim is not plausible.

3. Judgments about the plausibility of authority claims are sensitive to differences in
the experience and background of different audiences.

One audience might recognize A as an authority on a subject while another audience
might reject A, or at least be skeptical about A as an authority. In cases like this, if you
want to pursue an appeal to authority then you’ll need additional argumentation to defend
the authority claim.

Now, let me make a final comment about appeals to authority that you might encounter if
you browse other sources on fallacies.

You’ll commonly find people saying that certain kinds of appeals to authority
are always fallacious.

Probably the most common example is about the authority of claims about a commercial
product coming from the lips of a paid spokesperson for the product. Many sources will tell
you that you should always treat celebrity endorsements as fallacious appeals to authority,
since these people are being paid for their endorsement, so they have a motive to be
biased, and on top of that they probably don’t have any special expertise in the pros and
cons of the product in question as compared to rival products on the market.

My response is that this is good advice as far as it goes, but I can’t see a rationale for
turning this into an absolute rule.

Sometimes paid spokespersons are very well informed about the pros and cons of a
product, and sometimes they really are good authorities on the subject matter.

Yes, a paid endorsement introduces concerns about bias that an unpaid endorsement
avoids, but I prefer to treat this as just one of many factors that people have to take into
consideration when evaluating appeals to authority.

For any appeal to authority you should always be asking questions like:

“is the source biased, or is there some reason to mislead?”
“how does the source’s claim compare with expert opinion on the subject?”
“is the claim plausible or implausible on its face?”
“is the source being cited correctly or is the claim being taken out of context?”
You need to consider many factors when judging appeals to authority.
2.6 False dilemma

“False dilemma” is also known as “false dichotomy”. I’ve also heard it called the
“either-or” fallacy.

You’re given two options, and told that you should certainly reject one of them, so you’re
forced to accept the remaining option. This is a fallacy when the options you’re given don’t
include all the real options.

Here’s an example:

“My dad is voting Democrat in the next election.”

“How do you know?”

“Because I overheard him say he’s definitely not voting Republican.”

Now, looking at this in purely logical terms, there’s nothing wrong with this argument:

1. He’s either voting Democrat or Republican.
2. He’s not voting Republican.
Therefore, he’s voting Democrat.

In fact, this is an instance of one of the valid argument forms we saw in the tutorial course
called “common valid and invalid argument forms”. This is called “disjunctive syllogism”,
or “disjunctive argument”:

1. A or B
2. not-B

Therefore, A

The problem here isn’t with the logic. The problem is with the assumption that the only two
options are voting Democrat or Republican.

Maybe his dad is going to vote for the “green” candidate, or the “libertarian” candidate.
Maybe he’ll decide not to vote at all. There are lots of other possibilities.

So the problem is with the truth of that first premise. It sets up a false dilemma, or false
dichotomy. Either A is true or B is true, there’s no third option. But there are other options.

Note that you can’t fix this argument by including the other options, because then the logic
wouldn’t work:

1. A or B or C or D
2. not-B
Therefore, A or C or D

If you add other options then you can’t infer any single option, just that one among the
remaining options must be true. If the options are that he’ll definitely vote Democrat or
Republican or Green or Libertarian, and you know that he’s not voting Republican, then all
we can infer is that he’s going to vote Democrat or Green or Libertarian.
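A quick truth-table check (a sketch of mine, not from the course) confirms both halves of this: the two-option disjunctive syllogism is valid, but with four options the premises no longer force A:

```python
from itertools import product

# Two options: "A or B, not-B, therefore A" -- valid.
# In every row where both premises hold, A is true.
assert all(A for A, B in product([False, True], repeat=2)
           if (A or B) and not B)

# Four options: "A or B or C or D, not-B, therefore A" -- invalid.
# Here's a row where the premises hold but A is false.
for A, B, C, D in product([False, True], repeat=4):
    if (A or B or C or D) and not B and not A:
        print(f"A={A}, B={B}, C={C}, D={D}")   # C or D carries the disjunction
        break
```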

Here’s the summary version of the fallacy. False dilemma involves mistaking an argument
of this form:

1. A or B or C or D
2. not-B
Therefore, A or C or D

for an argument of this form:

1. A or B [this premise asserts the “false dilemma”]
2. not-B
Therefore, A

This is a content fallacy and not a logical fallacy, so detecting it requires that you evaluate
the truth or falsity of the major premise, the one that sets up the dilemma.

This fallacy can be hard to detect because it relies so much on your background
knowledge to judge whether the dilemma is plausible or not, and it’s subject to the same
relativity that any judgment of plausibility is subject to -- different audiences may judge the
premise to be plausible or implausible, depending on their background knowledge and
their preconceptions about the issue.

For example:

“Either you believe in God or you believe in evolution, you can’t have it both
ways. Well, I believe in God. That’s why I don’t believe in evolution.”

1. Either (God) or (evolution)
2. God
Therefore, not-(evolution)

Here’s a pretty common argument about God and evolution that you may have
encountered. Before we talk about the plausibility of that first premise, let me make a note
about the logical form being used.

If you recall the tutorial on common valid and invalid argument forms, then you might
remember these argument forms that use “OR”, the disjunction.

You get the following form …

1. A or B
2. not-A
Therefore, B
… when you assert that A or B must be true, and then deny one of the disjuncts, which
allows you to infer that the remaining disjunct must be true. This is a valid argument form.

Now, our example above doesn’t have this form. In premise 2 we’re affirming one of the
disjuncts and using this to infer that the remaining disjunct must be false. That is, we’re
making an argument of this form:

1. A or B
2. A
Therefore, not-B

But we know that this argument isn’t valid if the “OR” is an inclusive OR. It’s only valid if
the OR is an exclusive OR. To say that the OR is inclusive is to say that it’s possible that
A and B could BOTH be true, that it includes this possibility. With an exclusive OR you’re
excluding this possibility, you’re saying that either A is true or B is true, but they’re mutually
exclusive, they can’t both be true.
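You can see the difference between the two readings with a quick truth-table check. In this sketch (mine, not part of the course), the form “A or B, A, therefore not-B” is tested under each reading of OR:

```python
from itertools import product

# The form: 1. A or B   2. A   Therefore, not-B
# Valid only if no row makes both premises true and "not-B" false.
def valid(or_func):
    return all(not B for A, B in product([False, True], repeat=2)
               if or_func(A, B) and A)

inclusive = lambda A, B: A or B
exclusive = lambda A, B: A != B    # true when exactly one of A, B is true

print(valid(inclusive))   # False: the row A=True, B=True is a counterexample
print(valid(exclusive))   # True: exclusive OR rules out A and B both being true
```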

So, should we treat this argument about God and evolution as making an exclusive OR
claim or an inclusive OR claim? It’s important, because if it’s an inclusive OR then the
argument as given is invalid; if it’s an exclusive OR claim then the argument is valid.

In this case it’s not too hard to see that the arguer is intending this as an exclusive
OR. The key phrase is “you can’t have it both ways”. So we’re looking at an argument
that is intended to function like the valid form.

On this reading, the argument is valid, it satisfies the Logic Condition. The only question
that remains is whether it satisfies the Truth Condition, whether that “exclusive OR”
disjunctive premise is plausible or not.

This is our question: is it TRUE that belief in God is incompatible with belief in
evolution?

If you think it’s false, then this argument is guilty of posing a false dilemma. If you think
it’s true, then it’s not posing a false dilemma, it’s posing a genuine dilemma.

Now, one thing we can say for sure is that, as phrased, the meaning of that first premise is
ambiguous, too ambiguous to allow us to properly evaluate the claim. How you answer
this will depend on what specifically you think belief in God entails, and what belief in
evolution entails.

For example, if you interpret belief in God and belief in evolution like this …

(God) = The God of the Old Testament exists and created the world and all living
organisms in six literal days as described in Genesis.
(evolution) = The earth is billions of years old, all organisms are evolutionarily
descended from a common ancestor, and the primary mechanism of evolutionary
change is natural selection.

… the way a biblical literalist would, then you do have a real incompatibility. A literal
reading of Genesis really is incompatible with the orthodox Darwinian conception of the
origins and evolution of life on Earth.

So, for an audience who holds these beliefs, this argument does NOT commit the fallacy
of false dilemma. Given this reading of the premises, this is a real dilemma.

But it would be different if the conception of God at issue was something like this ...

(God) = An all-knowing, all-powerful, all-good being is responsible for the existence
of the universe and the laws of nature.

(evolution) = The earth is billions of years old, all organisms are evolutionarily
descended from a common ancestor, and the primary mechanism of evolutionary
change is natural selection.

On this reading, the God hypothesis merely entails the view that there is an all-knowing,
all-powerful, all-good being who is responsible for the existence of the universe and the
laws of nature. It doesn’t say anything about the Bible or creation in six days.
Now, if someone offered this argument using this conception of God and evolution, they’d
be guilty of posing a false dilemma. Why? Because there’s no reason to think that these
are mutually exclusive claims. A creator God like this might exist, OR the conventional
evolutionary story about the origins of life on earth might be true, or BOTH might be true.
This is, in fact, the view held by many religious people and religious scientists. This view is
consistent with the official position of the Catholic church, for example.

So, whether or not this argument commits the false dilemma fallacy depends on how you
interpret each of the horns of the proposed dilemma. Vague or ambiguous language
allows for multiple interpretations.

In my view, this is one of the ways that false dilemmas acquire the persuasive power that
they have. When the claims at issue are clear and precisely articulated, it’s easier to see
on their face whether they’re logically compatible or not, or whether there are other
alternatives that aren’t being considered. If they’re vague or ambiguous, different
interpretations can get muddled in your head, which makes it easier for a false dilemma to
come across as a genuine dilemma.

Here are some examples that illustrate a couple of general points about false dilemmas.

First, as we’ve seen, not all dilemmas or dichotomies are false:

“Every POSITIVE NATURAL NUMBER is either even or odd”.

This is true. This is just the number sequence 1, 2, 3, 4, 5, and so on, forever. Every
number in this set is either even or odd.
But this claim is different:

“Every REAL NUMBER is either even or odd.”

This claim is false. “Even” and “odd” only apply to integers. The real numbers include all
the integers, so some real numbers are even or odd, but the reals also include decimal
numbers like 2.5 or 1.99, and these are neither even nor odd. So this poses
a false dilemma.
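
As a quick illustration -- this little Python function is my own, not the text’s -- parity simply has no answer for non-integers:

    def parity(n):
        # "Even" and "odd" are only defined for integers; for anything else
        # the question doesn't apply.
        if isinstance(n, int):
            return "even" if n % 2 == 0 else "odd"
        return "neither"

    print(parity(4))    # even
    print(parity(7))    # odd
    print(parity(2.5))  # neither -- the dichotomy doesn't cover this case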

Second, one of the most common forms of the fallacy occurs when a choice is presented
that really represents the two ends of a continuum of possibilities. For example, when
someone says …

“Either you support minimal government and free markets or you’re a socialist.”

... they’re setting up the dilemma as a choice between libertarianism -- which is associated
with minimal government, free markets and minimal state interference in the lives of
citizens -- and socialism, which is associated with collective ownership, state regulation of
the economy, and forced redistribution of wealth from the rich to the poor.

This setup ignores the range of intermediate possibilities between these two poles, such
as the various forms of classical “welfare liberalism” that try to strike a balance between
libertarianism and socialism.

When false dilemma shows up in this form, it’s sometimes called the fallacy of “black-and-
white thinking”, for obvious reasons. It sets up a choice in terms of stark contrasts and
ignores the various shades of grey that might exist in between.

This is, I think, the most common and worrisome form of the false dilemma fallacy.
Unfortunately, because the fallacy isn’t a purely logical one, diagnosing it requires that you
actually know something about the subject matter, and that’s not something that can be
taught in a logic class.
2.7 Slippery slope

The last “content fallacy” that we’re going to look at is “slippery slope”.

Here’s a pretty extreme example of a slippery slope fallacy:

A high school kid’s mom insists that she study on Saturdays. Why? Because if she
DOESN’T study on Saturdays then her grades will suffer and she won’t graduate high
school with honors, and if she doesn’t graduate with honors then she won’t be able to get
into the university of her choice, and ... well, the rest isn’t clear, but the result of all this is
that she’ll end up flipping burgers for the rest of her life, and surely she doesn’t want
THAT, so she’d better darn well get serious and study!

I’ve actually heard a version of this discussion between two wealthy mothers who were
talking about which preschool to send their kids to. The gist was that if they didn’t get their
kid into a prestigious preschool then they’d be disadvantaged from that point forward in
ways that could ultimately threaten their future life prospects, so this was not a decision to
be taken lightly!

I did not envy those kids.

Here’s the schematic form of a slippery slope argument.

1. If A then B
2. If B then C
3. If C then D
4. not-D
Therefore, not-A

It’s a series of connected conditional claims, to the effect that if you assume that A is true
or allow A to occur, then B will follow, and if B follows then C will follow, and if C follows
then D will follow. But D is something nasty that we all want to avoid, so the conclusion is
that if we want to avoid D, we need to reject A, or not allow A to happen.

Note that, as stated, the logic of this argument is fine. In fact, this is a valid argument form
that we’ve seen before, we’ve called it “hypothetical syllogism” or “reasoning-in-a-chain”
with conditionals.
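
For the record, here is a small Python sketch (my addition, not the course’s) that brute-forces all sixteen truth-value assignments and confirms that this schema is deductively valid when the conditionals are read strictly:

    from itertools import product

    implies = lambda p, q: (not p) or q  # the material conditional "if p then q"

    # Premises: if A then B, if B then C, if C then D, and not-D.
    # Conclusion: not-A.
    valid = all(not a
                for a, b, c, d in product([True, False], repeat=4)
                if implies(a, b) and implies(b, c) and implies(c, d) and not d)

    print(valid)  # True -- the schema itself is logically fine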

Slippery slopes are fallacious only if the premises are false or implausible. Everything
turns on whether these conditional relationships hold. Sometimes they do, and if they do,
it’s not a fallacy. But very often they don’t, and when they don’t we’ve got a slippery slope
fallacy.

Now, there’s a caveat to this way of analyzing slippery slopes. It’s usually the case that
slippery slope arguments aren’t intended to be valid. That is, they’re not intended to
establish that the dreaded consequence will follow with absolute certainty. Usually the
intent is to argue that if you assume A, then D is very likely to follow, so what’s being
aimed for is really a strong argument.

And that means we shouldn’t really be reading the conditional claims as strict conditionals,
with every link in the chain following with absolute necessity. We should be asking
ourselves, how likely is it that D will follow, if A occurs? If it’s very likely, then the logic is
strong, if not then it’s weak. So in a sense we’re evaluating the logic of the argument, but it
turns out that in cases like this, the strength of the logic turns on the content of the
premises, so in the end we are evaluating the plausibility of premises, which makes this a
content fallacy, and not a logical or formal fallacy.

For our example the chain of inferences looks like this:

Doesn’t study on Saturdays —>
Doesn’t graduate high school with honors —>
Doesn’t get into a top university —>
Winds up working in a fast food restaurant (or similar “working class” career)

Now, this argument is obviously bad, at every stage of the reasoning.

It’s possible that not studying on Saturdays could make a difference to whether the student
gets on the honor roll, but there’s no evidence to suggest that this is likely.

Yes, if you’re not on the honor roll then maybe this will affect your chances of getting into a
top university, but without specifying what counts as a top university, and what other
factors may or may not be operating (like, for example, whether the student is a minority or
an athlete and might be eligible for non-academic scholarships of various kinds), then it’s
impossible to assess the chances of this case.

The last move, from failing to get into a top university to flipping burgers for a living, is
obviously the weakest link in the chain, this is just wildly pessimistic speculation with
nothing to support it.

So each link in the chain is weak, and the chain as a whole simply compounds these
weaknesses.

By saying this we’re saying that premises 1, 2 and 3 are not plausible, and so the
inference from A to D is not plausible. We have no reason to think that this slope is
slippery.

Now, there’s another obvious way that one can attack a slippery slope argument. You
might be willing to grant that the slope is slippery, but deny that what awaits at the bottom
of the slope is really all that bad.

This would be to challenge premise 4, “not-D”. “not-D” says that D is objectionable in some
way, that we don’t want to accept D. But this might be open to debate. If what awaits at the
bottom of the slope is “and then you die a painful death”, or “and then all our civil rights are
taken away”, then sure, just about everyone is going to agree that that’s a bad outcome.

But it’s not as obvious that everyone will find flipping burgers objectionable, or whatever
this notion stands for -- working in the service industry, or working in a low-paying job, or
whatever.

What’s important in evaluating a slippery slope argument is that the intended audience of
the argument finds the bottom of the slope objectionable. So this is another way to criticize
a slippery slope argument -- by arguing that the outcome of this chain of events really isn’t
as objectionable as the arguer would like you to think.

So, just to summarize what we’ve said so far, there are two ways of challenging a slippery
slope argument.

The first one is to challenge the strength of the conditional relationships that the argument
relies on. When people say that a slippery slope argument is fallacious, they usually mean
that this chain of inferences is weak.

(By the way, I hope it’s clear that slippery slope arguments don’t have to have only three
links -- my argument schema could have been longer or shorter.)

Second, you can also challenge a slippery slope argument by challenging the
“objectionableness” of whatever lies at the end of the chain. If it’s not obvious to the
intended audience that this is actually a bad thing, then the argument will fail to persuade,
regardless of how slippery the slope may be.

Before wrapping up, I’d like to make a few points about assessing the plausibility of
conditional chains. Fallacious slippery slope arguments often succeed at persuading their
audience because people misjudge the strength of the chain of inferences. They’re prone
to thinking that the chain is stronger than it actually is.

It’s important to realize two things. First, a chain of conditional inferences is only as strong
as its weakest link. The weakest conditional claim, the one that is least likely to be true, is
the one that sets the upper bound on the strength of the chain as a whole. So even if some
of the inferences in the chain are plausible, the chain itself is only as strong as the weakest
inference.

Second, weaknesses in the links have a compounding effect, so the strength of the whole
chain is almost always much weaker than the weakest link. To see why this is so, you can
think of conditional claims as probabilistic inferences -- If A is true, then B follows with
some probability, and this probability is usually less than 1, or less than 100%.

So the probability of D following from A, the probability of the whole inference, is actually a
multiplicative product of the probabilities of each of the individual links.

The odds of a coin landing heads on a single toss are 1/2, or 50%. The odds of a coin
landing heads twice in a row are 1/2 times 1/2, or 1/4, which is 25%. Conditional inferences
compound in a similar way.

So, if the odds for each link in the chain were, let’s say, 90%, then the odds of the whole
chain being true, of D actually following from A, would only be 0.73, or 73%, and this
number will go down further with each additional link in the chain.

People, in general, are very bad at estimating compound probabilities, and we’ll tend to
overestimate them.

Here’s the estimate if one of the links is weaker than the rest, say, 0.6, or 60%. The
probability of D following from A actually drops below 50%, a very weak inference, but very
few people will read the probabilities this way. Their attention will focus on the highly
probable bits of the story and their estimate of the overall odds will be anchored to these
bits, especially if they’re either at the very beginning or at the very end of the chain, since
these make the biggest impression.
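
Here is that arithmetic as a tiny Python sketch. The 90% and 60% figures are the illustrations used above; the code, and the simplifying assumption that the links multiply independently, are mine:

    from math import prod

    def chain_strength(link_probabilities):
        # The strength of the whole chain is the product of the links,
        # treating each conditional as an independent probabilistic step.
        return prod(link_probabilities)

    print(round(chain_strength([0.9, 0.9, 0.9]), 3))  # 0.729 -- three strong links
    print(round(chain_strength([0.9, 0.9, 0.6]), 3))  # 0.486 -- one weak link pulls
                                                      # the whole chain below 50%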

So, human beings in general are quite vulnerable to slippery slope reasoning, and knowing
these facts should motivate you to be more critical when you encounter these kinds of
arguments.
Part 3: Fallacies that Violate the Rules of Rational Argumentation
3.1 Straw man

We’re moving on now to fallacies that involve what I’ve called violations of the rules of
rational argumentation.

The rules in question are things like, being able and willing to reason well, not being willing
to lie or distort things simply to win an argument, and so on.

The first fallacy of this type that we’ll look at is more commonly known as the “straw man”
fallacy. For the sake of gender equity I’m going to call it “straw figure”, since there’s no
reason to think that men are the only ones who commit this fallacy or who are taken in by it
:).

The name comes from the practice of using human figures made of straw as practice
dummies in military training. Obviously it’s easier and safer to practice certain combat
techniques when your opponent is made of straw.

The fallacy works like this. Alice offers an argument to Bob, she wants to convince him of
something.

Let’s say that Alice’s argument is really pretty strong. Like this boxer.
Bob isn’t sure he can handle this argument.

So instead of trying to refute Alice’s actual argument, Bob decides to engage a different
argument. He decides to engage this straw figure. What is the straw figure? It’s a weaker,
distorted version of Alice’s original argument.

Because it’s weaker, Bob is easily able to refute the straw figure argument.

The straw figure fallacy is complete when Bob does the dance of joy and claims that he
has successfully refuted Alice’s argument.
But of course Bob hasn’t refuted the original argument, he’s only refuted a distorted
misrepresentation of it.

This is the straw figure or straw man fallacy.

This fallacy is often categorized as a fallacy of relevance, because the attacks made on
the weak straw figure are irrelevant to judging the actual strengths and weaknesses of the
original argument, and this is correct, but I prefer to think of it as a violation of the rules of
rational argumentation, especially when it involves knowingly and willfully misrepresenting
an argument.

When someone is willing to do this, they’re no longer playing by the rules, they’re more
concerned with the appearance of winning than with argumentation itself.

When you see this going on, you should try to correct the misrepresentation and get the
discussion back on track. If it’s an honest mistake and the arguer is willing to correct their
misunderstanding, that’s great, but if you catch them doing this again and again
then there’s probably no point in engaging argumentatively with this person, because
they’ve shown you that they’re not willing to play by the rules.

Let’s look at an example.

Jennifer has just finished giving her argument against mandatory prayer in public schools.
Let’s assume that her argument focused on separation of Church and State in the First
Amendment and the importance of respecting religious diversity in a multicultural society.

Bob responds like this:

“It’s clear from your argument that you’re really advocating for atheism. But we’ve
seen what state-sanctioned atheism does to societies. Look at Russia under Stalin
or China under Mao! Is that what you want for this country? The suppression of
religious freedom and the replacement of God by an omnipotent state?!”

It’s clear that Bob isn’t responding to Jennifer’s original argument, he’s responding to a
distorted misrepresentation of it, a straw figure. Appeals to religious diversity or separation
of Church and State are just as often made by religious people as by non-religious people.

But if Bob can reframe the argument so that it looks like an argument for atheism and
abolishing all forms of religious expression, then that’s a much easier argument to refute.

Rhetorically, this kind of move can be very powerful, and that’s why straw figure
arguments are so common in public debates on hot-button topics. But from a logic
standpoint, they represent a willful refusal to engage in genuine argumentation.
3.2 Red herring

“Red Herring” is another well-known fallacy type, but it’s easily confused with “straw
figure”, so here I want to highlight the differences between the two.

The name “red herring” comes from an old method of training dogs for fox hunting. The
goal is to train the dogs to follow the fox’s scent, even if the dogs encounter other smells
that are potentially distracting.

So what they do is they let the fox go, so the fox leaves a scent trail. Then, before letting
the dogs go, they drag a bunch of smelly red herrings across the fox’s trail.

Then they release the dogs. When the dogs hit the herring trail they’ll be distracted by the
smell and some will be inclined to follow the herring trail instead, so the trainers try to get
the dogs to stay on the fox trail and resist the urge to follow the herring.

So, how does this metaphor map onto arguments?

Well, the fox is some argument, the original argument that is at issue in a debate.

The dog can represent anyone who is interested and engaged in this argument.
The red herring is something that distracts you from following the trail of the
original argument.

It might be a new and different argument that raises a different issue, or simply an
irrelevant comment that distracts from the main issue.

What’s important is that it’s distracting enough to make the audience want to follow
this new trail, away from the original argument and the main issue.

So, putting all this together, you commit the red herring fallacy when, in an argument,
you divert attention away from the main issue or the main line of argumentation by
introducing something that changes the subject, that raises a new issue that isn’t relevant
to the preceding line of discussion.

The fallacy really occurs when you then conclude something from this different
issue, or presume that some conclusion has been established, and use this to claim
that you’ve won the argument or said something relevant about the original
argument.

In this respect the fallacy is very much like a “straw figure” fallacy, in that you’re mistakenly
or misleadingly saying that you’ve won an argument or refuted an argument when all that
you’ve really done is avoid engaging the original argument.

But it’s different from the straw figure in that a straw figure involves distorting or
misrepresenting some original argument, and then knocking down the distorted argument.

In a red herring, the arguer ignores the opponent’s argument, and subtly changes the
subject. So, to distinguish between the two, you need to ask yourself whether the arguer
has knocked down a distorted argument or simply changed the subject.

Here’s a summary of the points just made.

Straw Figure: Arguer misrepresents an opponent’s position.

Red Herring: Arguer tries to distract the attention of the audience by raising an
irrelevant issue.

To illustrate the difference, consider this example:

“I overheard my friend John argue that the Bible has errors in it. Funny, I never
figured him for an atheist.”

This is a straw figure, not a red herring, since the conclusion being drawn is related to the
main argument that his friend is making about the Bible, but it’s clearly working off of a
distorted or exaggerated version of it, since it equates biblical fallibilism with atheism.

Now compare that to this one:


“My opponent has argued that there’s an urgent need to reduce greenhouse gases
in order to minimize global warming. But the most serious problem facing future
generations is the risk posed by nuclear weapons in the hands of rogue states and
terrorists. This is where we need to focus our attention and resources.”

This is a red herring. The original issue was about greenhouse gases and the urgency of
global warming. This response side-steps that issue and introduces a new issue.

To avoid committing a red herring, the arguer would need to show that global warming isn’t
an urgent problem, or that reducing greenhouse gas emissions won’t be effective in
reducing it, or something like that. Nuclear weapons in the hands of terrorists is certainly a
serious issue, but that fact does nothing to undermine the original argument about global
warming.
3.3 Begging the question (narrow sense)

Begging the question is a very important fallacy. In my mind, it might be the most important
fallacy to understand on this whole list, because it bears directly on what it means to offer
someone reasons for believing something.

I’m going to split this tutorial into two discussions. The first one will focus on begging the
question in what I call it’s “narrow sense”, which is basically synonymous with “circularity”
or “circular reasoning”; and the second one will focus on what I call the “broader sense” of
begging the question, which is maybe less familiar but arguably even more important.

Let me first make a note about another way in which the term “to beg the question” is often
used in ordinary language.

Here’s a recent headline on a car blog:

“Toyota recalls 1.3 million Yaris models, which begs the question: What’s the plural of
Yaris?”

And here’s a Seth Meyers joke from Weekend Update on Saturday Night Live:

“A walrus at a zoo in Turkey has become a major attraction after learning to play the
saxophone. Which begs the question: How bored are Turkish zookeepers that they’re just
strapping instruments to animals and seeing what takes?”

So, in this context, “to beg the question” means something like “to raise the question”, or
“to inspire the question”.

People use this expression all the time, there’s nothing wrong with it, but what I want to
emphasize here is that this is NOT the sense that’s intended when we talk about “begging
the question” as a fallacy of argumentation.

Just like the terms “valid” and “invalid”, there’s a common usage in ordinary language and
there’s a more formal logical usage, which we want to keep separate.

But unlike the logical sense of the term “valid”, the logical sense of “to beg the question” is
fairly widely known and used outside of logic.

So both this more colloquial sense and the logical sense that we’ll look at next are “in play”
in ordinary language. But some people use the term almost exclusively in the sense of “to
raise or inspire a question” and aren’t even aware of the logical sense, so my advice is to
be on the lookout for confusions that might arise from misunderstanding the sense in
which you’re using this term.

In logic, we say that an argument begs the question when, in some way or another, it
assumes as true precisely what is at issue in the argument.

Another way to put this is that the argument contains a premise that in some way asserts
or presumes to be true what is being asserted in the conclusion.
Another common way of saying the same thing is that the reasoning in the argument is
“circular”.

Here’s the basic logical form of an argument that begs the question in this sense.

Premise P1
Premise P2
Premise P3
:
Premise Pi —> means the same thing as C
:
Premise Pn
Therefore, conclusion C

You’ve got an argument with premises P1, P2, and so on, down to Pn, and one of them
means the same thing as the conclusion, C, or asserts something that is logically
equivalent to C.

If this happens, then we’d say that this premise “begs the question”, meaning that it
assumes as true precisely what it is at issue, namely, whether the conclusion C is true or
not.

We call this “circular” because the conclusion C is supposed to be drawing logical support
from this premise, but the premise is simply restating the conclusion, so the argument as a
whole involves nothing more than repeating the conclusion without giving any additional
reasons to believe it.

Here’s an example:

“Capital punishment is justified for cases of murder because the state has a right to put
someone to death for having killed someone else.”

Maybe this doesn’t sound too bad when said this way, but let’s put this argument in
standard form.

1. The state has a right to put someone to death for having intentionally killed someone else.
Therefore, capital punishment is justified for cases of murder.

Notice that this is just a one-liner. And notice that, even though the wording is different, the
single premise and the conclusion are asserting the same thing. After all, “the state has a
right to put someone to death” just means “capital punishment is justified”, and “for having
intentionally killed someone else” just means “for cases of murder”.

So saying this is just like saying

“Capital punishment is justified for cases of murder, therefore, capital punishment is
justified for cases of murder.”

But obviously this won’t do as an argument. The issue is whether capital punishment is
justified for cases of murder -- that’s the question that’s being begged by this argument.
Here are some other examples:

“Sky-diving is dangerous because it’s unsafe.”

“Paper is combustible because it burns.”

These are just different ways of saying the same thing. Of course if sky-diving is unsafe
then it’s dangerous, because that’s just what “unsafe” means. And “combustible” just
means “can burn”. So the first begs the question “why is sky-diving unsafe?”, and the
second begs the question “why does paper burn?”.

But these are pretty obvious examples. Here’s a sneakier one...

“Murder is morally wrong. This being the case, then abortion must also be morally wrong.”

Here we’re not given a mere restatement of a premise, so the fallacy is a bit harder to
detect. But when you put the argument in standard form, and fill in the background
premise that the argument relies on, then you get this:

1. Murder is morally wrong.
2. Abortion is murder.
Therefore, abortion is morally wrong.

This argument relies on the assumed premise that abortion is murder. If we grant this
premise then the conclusion follows immediately, since calling abortion “murder” implies
that it’s morally wrong, but this begs the question that is at issue, namely, whether abortion
should be classified as “murder”. This argument gives us no additional reason to accept
the conclusion -- it would never persuade anyone who didn’t already believe that abortion
was wrong.

This is a good example of a very common way that circular arguments can pass
themselves off as genuine arguments. You give an argument but leave out a key premise
necessary to draw the conclusion, and let your audience fill it in. This key premise
assumes precisely what is really at issue in the argument, it’s the offending circular
premise, but because it goes unstated, it’s less likely to be called out and brought under
critical scrutiny, and this helps to make the argument seem superficially persuasive.

So, to sum up, when we use the term “begging the question” in logic, we mean that an
argument is guilty of assuming as true something that really needs to be argued for, if the
argument is to qualify as offering reasons to accept the conclusion. Arguments that beg the
question don’t offer any reasons to anyone who didn’t already accept the conclusion.

When I say that an argument begs the question “in the narrow sense”, I’m referring
specifically to arguments that employ premises that are roughly equivalent in meaning to
what the conclusion is asserting.

In the next tutorial I’ll loosen this definition and examine how a broader class of arguments
might be described as “begging the question”.
3.4 Begging the question (broad sense)

Begging the question is usually associated with arguments where a premise or set of
premises is logically equivalent to the conclusion, so the premises don’t give any more
support to the conclusion than the conclusion has all by itself. The argument essentially
involves restating the conclusion in different language.

However, this idea can be applied to a broader range of arguments than the ones that are
normally called “circular”. In this broader sense, an argument begs the question
whenever it uses premises that are no more plausible than the conclusion is
already.

This gets back to the basic question of what it means to offer reasons to believe
something. For an argument to be persuasive, the premises that you offer as reasons must
be more plausible than the conclusion is already.

Let’s assume that this green ruler is a “plausibility” meter.

It measures how plausible a particular claim is for a particular audience. Or in other words,
how confident the audience is that the claim is true.

So if you’re 100% certain that the claim is true then the marker would be up here:

If you’re only, say, 75% sure, it’ll be shifted to the left a bit, and so on.

So, for the claim that corresponds to the conclusion of an argument, we’d like to know how
plausible that claim is, for the intended audience of the argument, before being given any
premises to support it. We want to know the “initial plausibility” of the claim.

Now if this is a claim that is already widely accepted by the audience, then the plausibility
meter reading will be high, like it is here.
But of course if this is the case, then you’re not going to need an argument to convince
people to accept it, since they’re already inclined to accept it.

In order to have a situation where an argument is called for, the initial plausibility will
be lower, reflecting the fact that there’s some doubt about the claim. So let’s do that.

This is a claim that the audience regards as (initially) not very plausible. So this is the sort
of claim that could benefit from an argument to support it, to offer reasons for an audience
to believe it.

Now, let’s assume that any argument we’re going to give is VALID, so the premises will
entail the conclusion.

Now we’re talking about the plausibility of the premises. The question is, what general
condition do the premises have to satisfy for them to count as offering reasons to believe
this conclusion?
The general principle is this:

Any premises that are offered as reasons to accept the conclusion must be MORE
plausible than the conclusion was initially.

That is, the plausibility of each of the premises must be greater than the initial plausibility
of the conclusion, like shown here.

This means that each premise will be regarded as more likely to be true, to the intended
audience of the argument, than the conclusion was initially.

Our goal in argumentation is to get the audience to revise their plausibility judgments
about the conclusion -- their judgments about how likely it is that the conclusion is true -- in
light of the premises.

So, after being given the premises, we’d like to see an upward shift in the plausibility of the
conclusion, maybe like this ...
This would reflect an argument that was effective in making someone change their
mind. After considering the argument they’re more convinced that the conclusion is true
than they were before.

Now, note that I didn’t move the plausibility meter for the conclusion all the way up to
100%. It’s higher than it was initially, but it’s no higher than the least plausible premise in
the argument.

There’s no reason for anyone to accept the conclusion with a higher degree of confidence
than they accept the premises. If someone is only 75% confident that premise 2 is true,
and this premise is offered as a reason to accept the conclusion, then it wouldn’t be
rational for someone to accept the conclusion with a higher degree of confidence than
75%.

In short, a conclusion will only be as plausible as the least plausible premise in the
argument.

That means that if I wanted to convince someone of the conclusion even more strongly
than this, then I’ll need to use an argument with premises that are even more plausible,
maybe like this ...
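
Here is the ceiling principle as a toy Python sketch -- my formulation, with made-up plausibility numbers, not anything from the course:

    def rational_ceiling(premise_plausibilities):
        # For a valid argument, confidence in the conclusion shouldn't
        # exceed confidence in the least plausible premise.
        return min(premise_plausibilities)

    print(rational_ceiling([0.95, 0.75, 0.90]))  # 0.75 -- capped by the weakest premise
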
Now, let’s talk about what happens when the premises are less plausible than the
conclusion. Let’s say my initial confidence in the conclusion is high, like it is here, but I’m
willing to be persuaded that I should be even more confident in it.

Now let’s say the premises I’m given look like this ...

Each of the premises is less plausible than the conclusion is to me already. Even if the
argument is valid, an argument like this could never be successful in persuading me
to revise the initial plausibility of the conclusion upward.

The same principle would apply to an argument like this ...


The initial plausibility of the conclusion is fairly low, so there’s lots of room to move
upward, and one of the premises is more plausible than the conclusion is, but the other
one is less plausible than the conclusion.

An argument like this doesn’t give me any reason to revise my confidence in the
conclusion.

This is the general principle: For an argument to be persuasive, all of the premises
must be more plausible, to the intended audience of the argument, than the
conclusion is initially.
Now, what I want to say is, arguments that violate this principle are guilty of “begging
the question“ in the broad sense of that term.

I should say that I’m borrowing this formulation from Richard Epstein and his textbooks on
critical thinking; the term certainly isn’t always used in quite this sense, but I think it’s a
helpful way of thinking about what it means to beg the question.

Let’s look at some examples.

“Why is it wrong to kill chickens for food? I’ll tell you why. Because all animals
have divine souls, and it’s wrong to kill and eat for food anything that has a divine
soul.”

Interesting argument. Let’s look at it in standard form.

1. All animals have divine souls.

2. It’s wrong to kill and eat for food anything with a divine soul.

3. Chickens are animals.

Therefore, it’s wrong to kill and eat chickens for food.

This argument is valid, there’s no doubt that the conclusion follows from those premises.
But we’re interested in the relative plausibility of the conclusion and the premises.

Let’s say I’m a meat eater, I eat chicken. So the initial plausibility of that conclusion is
pretty low for me, but I’m willing to be convinced, I know some smart people who are
vegetarians. So let’s set the initial plausibility at the low end.
Working upward, let’s look at premise 3. Are chickens animals? Yes, that’s a very
plausible premise, I’ll shove that right up to the right end of the meter.

Premise 2, is it wrong to kill and eat for food anything with a divine soul?

This premise is certainly less plausible to me than the claim that chickens are animals, but
if I read it as a conditional, saying that "If a thing had a divine soul, THEN it would be
wrong to kill and eat it for food", well, maybe that’s plausible. Certainly there are lots of
people who think that killing and eating humans for food is wrong for this very reason. So
let’s say I’m willing to grant this conditional premise, put the plausibility up above 50%.

So far so good. Two of our premises satisfy our general principle.

But this last premise, premise 1. Gosh, that’s different. How plausible do I think it is that
animals have divine souls?

Well, seeing as we already know that I’m a meat eater, and seeing as such a view would
be extremely rare even among vegetarians and the very religious, I’d say that for myself,
and for most audiences, that premise will rank very low on the plausibility scale.

So, this argument violates the general condition that was stated earlier. It relies on a
premise that is less plausible than the conclusion is initially.

And the result is that there isn’t a chance in the world that an argument like this
would convince anyone to accept this conclusion who wasn’t already convinced of
it initially.

So, by our definition, this argument “begs the question” in the broad sense of that
term. When used in this way, the question that is being begged is precisely whether all
animals do in fact have divine souls -- the argument can only succeed if the audience has
reason to believe this is true, but no reasons are given.
Now, why do we call this “begging the question” in the “broad” sense?

Well, because it isn’t strictly begging the question in the “narrow” sense introduced in the
previous video. It’s not circular, in the sense that none of the premises is equivalent in
meaning to the conclusion. The premises don’t just restate the conclusion in slightly
different language.

But what’s wrong with this argument is precisely the same as what’s wrong with
arguments that beg the question in the narrow sense.

To illustrate, let’s take a look at a blatantly circular argument.

1. Abortion is the unjust killing of an innocent human being.
Therefore, abortion is morally wrong.

This is circular because calling abortion “unjust” in premise 1 automatically implies that it’s
morally wrong. The premise just restates the conclusion with slightly different wording. So
this begs the question in the narrow sense.

Now, imagine that I’m a pro-choice person, so I go into this thinking that abortion is morally
acceptable, or at least it’s not always unjust. For me, the plausibility of the conclusion is
going to be low.

But notice that, because the argument is circular, the premise can be no more plausible for
me than the conclusion is initially, since they assert the same thing. So by our definition,
this argument also begs the question in the broad sense, because it’s relying on premises
that are no more plausible than the conclusion is initially.

And this captures exactly why begging the question in the narrow sense is a fallacy --
because these kinds of arguments don’t give you any reasons to accept the conclusion.

Notice also that this argument would commit the fallacy even if I was a pro-lifer and
thought the initial plausibility was very high. Now the premise is very plausible to me, but it
still doesn’t give me any more reason to accept the conclusion than I already had to start
with, because the premise and the conclusion are saying the same thing.
So, to summarize, an argument begs the question in the narrow sense when it uses
premises that simply restate what’s being asserted in the conclusion, in slightly different
language.

An argument begs the question in the broader sense when it uses premises that are no
more plausible than the conclusion is already. To avoid this fallacy, all the premises must
be MORE plausible than the conclusion is initially.

Now, it’s important to note that every argument that begs the question in the narrow sense
ALSO begs the question in the broader sense. The former category is a subset of the latter
category because it’s just a special case of the latter.
So, the more fundamental fallacy is the broader one, because it reflects a necessary
condition for any argument to be good. If you want your arguments to be persuasive,
they have to employ premises that are more plausible than the conclusion is already.
What is Probability?
Part 1: Introduction
1.1 Probability: Why Learn This Stuff?

Hi everyone and welcome to this tutorial course on reasoning with probabilities. I want to
start off by acknowledging that studying probability theory isn’t high on most people’s
“bucket lists” of things to do before they die, so we should probably spend some time
talking about why this stuff is important from a critical thinking standpoint.

Here are five reasons to study probability.

First, it’s an essential component of so-called “inductive logic”, which is the branch of logic
that deals with risky inferences, and inductive logic is arguably more important for critical
thinking purposes than deductive logic.

Second, it’s an essential component of scientific reasoning, so if you want to understand
scientific reasoning, you need to understand something about probability.

Third, there are many interesting fallacies associated with probabilistic reasoning, and
critical thinkers should be aware of at least some of these fallacies.

Fourth, human beings suffer from what some have called “probability blindness” -- on our
own, we’re very bad at reasoning with probabilities and uncertainty; or to put it another
way, we’re very susceptible to probabilistic fallacies -- and this fact about us is absolutely
essential to understand if we’re going to devise strategies for avoiding these fallacies.

And finally, probability is philosophically very interesting, and a lot of important
philosophical debates turn on the interpretation of probabilistic statements, so some
grounding in the philosophy of probability can be very helpful in both understanding those
debates and making informed critical judgments about those issues.

Just to give an example, the so-called “fine-tuning” argument for the existence of God is
based on the premise that we live in a universe that is probabilistically very unlikely if it
wasn’t the product of some kind of intelligent design, and therefore the best explanation for
our existence in this universe is that it was, in fact, a product of intelligent design. But this
kind of argument turns on what it means for something to be “probabilistically unlikely”,
and whether it’s even meaningful to talk about the universe in this way. I won’t say any
more about that here, but that’s just one example of an interesting philosophical debate
where probability plays an important role.
1.2 What is Inductive Logic?

Okay, in the remainder of this introduction I want to revisit the first point we raised, which is
about inductive logic. I want to lay out some terms here so that it’s clear what we’re talking
about, and the role that probability concepts play in inductive reasoning.

We distinguish deductive logic from inductive logic. Deductive logic deals with deductive
arguments, inductive logic deals with inductive arguments. So what’s the difference
between a deductive argument and an inductive argument?

The difference has to do with the logical relationship between the premises and the
conclusion. Here we’ve got a schematic representation of an argument, a set of premises
from which we infer some conclusion.

1. Premise
2. Premise
:
n. Premise
∴ Conclusion

That three-point triangle shape is the mathematician’s symbol for “therefore”, so when you
see that just read it as “premise 1, premise 2, and so on, THEREFORE, conclusion”.

Now, in a deductive argument, the intention is for the conclusion to follow from the
premises with CERTAINTY. And by that we mean that IF the premises are all true, the
conclusion could not possibly be false. So the inference isn’t a risky one at all -- if we
assume the premises are true, we’re guaranteed that the conclusion will also be true.

For those who’ve worked through the course on basic concepts in logic and
argumentation, you’ll recognize this as the definition of a logically VALID argument. A
deductive argument is one that is intended to be valid. Here’s a simple, well-worn
example.

1. All humans are mortal.
2. Socrates is human.
∴ Socrates is mortal.

If we grant both of these premises, it follows with absolute deductive certainty that
Socrates must be mortal.

Now, by contrast, with inductive arguments, we don’t expect the conclusion to follow with
certainty. With an inductive argument, the conclusion only follows with some probability,
some likelihood. This makes it a “risky” inference in the sense that, even if the premises
are all true, and we’re 100% convinced of their truth, the conclusion that we infer from
them could still be false. So there’s always a bit of a gamble involved in accepting the
conclusion of an inductive argument.

Here’s a simple example.

1. 90% of humans are right-handed.
2. John is human.
∴ John is right-handed.

This conclusion obviously doesn’t follow with certainty. If we assume these two premises
are true, the conclusion could still be false, John could be one of the 10% of people who
are left-handed. In this case it’s highly likely that John is right-handed, so we’d say that,
while the inference isn’t logically valid, it is a logically STRONG inference. On the other
hand, an argument like this …

1. Roughly 50% of humans are female.
2. Julie has a new baby.
∴ Julie’s baby is female.

... is not STRONG. In this case the odds of this conclusion being correct are only about
50%, no better than a coin toss. Simply knowing that the baby is human doesn’t give us
good reasons to infer that the baby is a girl; the logical connection is TOO WEAK to justify
this inference.

These two examples show how probability concepts play a role in helping us distinguish
between logically strong and logically weak arguments.

Now, I want to draw attention to two different aspects of inductive reasoning.

When you’re given an inductive argument there are two questions that have to be
answered before you can properly evaluate the reasoning.

The first question is this: How strong is the inference from premises to conclusion? In
other words, what is the probability that the conclusion is true, given the premises?

This was easy to figure out with the previous examples, because the proportions in the
population were made explicit, and we all have at least some experience with reasoning
with percentages -- if 90% of people are right-handed, and you don’t know anything else
about John, we just assume there’s a 90% chance that John is right-handed, and 10%
chance that he’s left-handed. We're actually doing a little probability calculation in our head
when we draw this inference.

This is where probability theory can play a useful role in inductive reasoning. For more
complicated inferences the answers aren’t so obvious. For example, if I shuffle a deck of
cards and I ask you what are the odds that the first two cards I draw off the top of the deck
will both be ACES, you’ll probably be stumped. But you actually do have enough
information to answer this question, assuming you’re familiar with the layout of a normal
deck of cards. It’s just a matter of using your background knowledge and applying some
simple RULES for reasoning with probabilities.
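
For what it’s worth, here is one way to work that answer out, sketched in Python. The arithmetic is mine; it assumes a standard 52-card deck with 4 aces and uses the kind of multiplication rule developed in the second course:

    from fractions import Fraction

    p_first_ace = Fraction(4, 52)   # 4 aces among 52 cards
    p_second_ace = Fraction(3, 51)  # 3 aces left among the remaining 51 cards
    p_both_aces = p_first_ace * p_second_ace

    print(p_both_aces)         # 1/221
    print(float(p_both_aces))  # roughly 0.0045 -- about half a percent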

Now, the other question we need to ask about inductive arguments isn’t so easy to
answer.

The question is, “how high does the probability have to be before it’s rational to accept the
conclusion?”.
This is a very different question. This is a question about thresholds for rational
acceptance, how high the probability should be before we can say “okay, it’s reasonable
for me to accept this conclusion -- even though I know there’s still a chance it’s wrong”. In
inductive logic, this is the threshold between STRONG and WEAK arguments -- strong
arguments are those where the probability is high enough to warrant accepting the
conclusion, weak arguments are those where the probability isn’t high enough.

Now, I’m just going to say this up front. THIS is an unresolved problem in the philosophy of
inductive reasoning. Why? Because it gets into what is known as the “problem of
induction”.

This is a famous problem in philosophy, and it’s about how you justify inductive reasoning
in the first place. The Scottish philosopher David Hume first formulated the problem and
there’s no consensus on exactly how it should be answered. And for those who do think
there’s an answer and are confident that we are justified in distinguishing between strong
and weak inductive arguments, the best that we can say is that it’s at least partly a
conventional choice where we set the threshold.

To refine our reasoning on this question we need to get into rational choice theory where
we start comparing the costs and benefits of setting the bar too low versus the costs and
benefits of setting it high, and to make a long story short, that’s an area that I’m not
planning on going into in this course.

In this course we’re going to stick with the first question, and look at how probability theory,
and different interpretations of the probability concept, can be used to assign probabilities
to individual claims AND to logical inferences between claims.

With this under our belt we’ll then be in a good position to understand the material on
probabilistic fallacies and probability blindness, which is really, really important from a
critical thinking standpoint.

1.3 Probability as a Mathematical Object vs. What That Object Represents

The first thing I want to do is distinguish probability as a mathematical object from the
various things that this object is used to represent. This distinction helps to frame what
we’re doing in this first tutorial course in the probability series, which is the meaning of
probability, and how it differs from what we’re doing in the second tutorial course, which is
on the rules for reasoning with probabilities.

First thing to note is that modern probability theory is really a branch of mathematics. The
first formal work on the subject is from the 17th century in France by mathematicians
Pierre de Fermat and Blaise Pascal, who were trying to figure out whether, if you throw a
pair of dice 24 times, you should bet even money on getting at least one double-six over
those 24 throws. They had an exchange of letters, and out of this exchange grew the first
mathematical description of the rules for reasoning with probabilities.
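
Here is the calculation behind their answer, sketched in Python. The problem setup is as described above; the computation itself is my reconstruction:

    p_miss_one_throw = 35 / 36       # 35 of the 36 outcomes for a pair of dice miss double-six
    p_miss_all_24 = p_miss_one_throw ** 24
    p_at_least_one = 1 - p_miss_all_24

    print(round(p_at_least_one, 4))  # about 0.4914 -- just under 50%, so an even-money
                                     # bet on getting a double-six is a losing bet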

Modern probability theory is a complicated beast, but here are some of the key ideas. You
imagine some set of elementary events or outcomes. Let’s assume there are only six.
These could be the six sides of a die. We want to associate a probability with each
elementary outcome -- rolling a one, or a two, or a three, etc.

In this case it’s pretty obvious: the odds for each of these elementary outcomes are just 1/6.

But we also want to be able to figure out the odds of different logical combinations of these
elementary outcomes. Like for example, the odds of rolling an even number, or a number
less than 4, or a number that’s either a 1 or a 5, or a number that’s not a 6.

Now, I just want you to notice what’s going on here. We’ve got these expressions that read
“the probability of event A equals some number”, P(A) = n, where this event is an
elementary outcome or some logical combination of elementary outcomes.

Mathematically, what we have here is a function that assigns to each event a number.
That symbol, P, represents a mathematical function.

More specifically, this function takes as input some description of a possible event, and
maps it onto the real number line. The value of this number is going to lie between 0 and
1, where 0 represents events that can’t happen, that have probability 0, and 1 represents
events that MUST happen, that have probability 1.

So, the odds of rolling a 1 are just 1/6, which is about 0.17; the odds of rolling an even
number are just 1/2, or 0.5, because the even numbers include 2, 4 and 6, which make up
half of all the possible outcomes.

The other thing to note here is that, mathematically, the way we represent these different
events is in terms of subsets of the space of all possible events. That’s how a description
of an event gets translated into mathematical form. So, a probability function is a mapping
between the subsets of this larger set and the real numbers between 0 and 1.
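
Here is a minimal sketch of such a function in Python -- mine, not formal measure theory -- for a fair six-sided die, with events represented as subsets of the outcome space:

    from fractions import Fraction

    outcomes = {1, 2, 3, 4, 5, 6}  # the space of elementary outcomes

    def P(event):
        # Maps an event (a subset of the outcome space) to a number in [0, 1].
        return Fraction(len(event), len(outcomes))

    print(P({1}))        # 1/6 -- rolling a one
    print(P({2, 4, 6}))  # 1/2 -- rolling an even number
    print(P(set()))      # 0   -- the impossible event
    print(P(outcomes))   # 1   -- the certain event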

Now, we’re not doing formal probability theory here. This is just about all I want to say
about probability as a mathematical concept, since for critical thinking purposes this is
about all you need to know. Mathematicians will use all kinds of terminology to really
specify what’s going on here. They’ll talk about “sigma-algebras” and structures that satisfy
the Kolmogorov axioms, all of this is stuff that we don’t need to worry about.

The one thing I want you to note about probability theory is this. Given an assignment of
probabilities to events A and B, the mathematics of probability gives us rules for figuring
out the probabilities of various other events. We’ll learn the basic rules for these four basic
logical relations,

P(not-A)
P(A and B)
P(A or B)
P(A given B)

i.e., negation, conjunction, disjunction, and conditional probability, later in the second
tutorial course in this series.
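
The rules themselves come later, but just to make the four expressions concrete, here is a counting sketch of mine for a fair die, where A and B are two events I’ve picked purely for illustration (A = “rolls an even number”, B = “rolls a number less than 4”):

    from fractions import Fraction

    omega = {1, 2, 3, 4, 5, 6}
    A = {2, 4, 6}  # rolls an even number
    B = {1, 2, 3}  # rolls a number less than 4
    P = lambda event: Fraction(len(event), len(omega))

    print(P(omega - A))                  # P(not-A)     = 1/2
    print(P(A & B))                      # P(A and B)   = 1/6
    print(P(A | B))                      # P(A or B)    = 5/6
    print(Fraction(len(A & B), len(B)))  # P(A given B) = 1/3
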
But note that it says “GIVEN” an assignment of probabilities to A and B, we can work out
these other probabilities. Here’s a question: how exactly do we assign values to P(A) and
P(B) in the first place?

The mathematics of probability doesn’t really address this question.

Why not? Because this is really a question about what it means to say that the probability
of an event is such-and-such; this is about what probability, as a concept, represents in the
world OUTSIDE of mathematics.

This is the question that different interpretations of probability try to answer. We’re going to
look at a few of these and their variations in the next section of the course. They each
represent a distinct way of thinking about chance and uncertainty in the world.

The mathematics of probability puts some constraints on what can count as a viable
interpretation of probability, but it allows for more than one interpretation. So the question
isn’t which interpretation is correct, but rather which interpretation is suitable or appropriate
for a given application.

That’s why, as critical thinkers, it helps to be familiar with these different interpretations,
because no single interpretation is suitable for every situation, and there are some
situations where NO interpretation is suitable, and we have to conclude that it’s simply a
mistake to apply probabilistic concepts to situations like this.
Part 2: Interpretations of The Concept of Probability
2.1 Classical Probability.

In this next series of tutorials we’re going to be looking at different ways that
mathematicians and philosophers have interpreted the concept of probability, what it
means to say that the probability of rolling a six is 1 in 6, or there’s a sixty percent chance
of rain today. We’ll see that there are several different ways of interpreting this language,
and for this reason, these are sometimes called different “interpretations” or different
“theories” of probability.

The first interpretation we’re going to look at is also one of the earliest and most important,
and it’s come to be called the “classical” interpretation of probability.

The classical interpretation of probability comes from the work of mathematicians in the
17th and 18th century — people like Laplace, Pascal, Fermat, Huygens and Leibniz.
These guys were trying to work out the principles that governed games of chance and
gambling games, like dice and cards and roulette, and in particular they were interested
in working out the best betting strategies for different games. This was where the modern
mathematical theory of probability was born.

The main idea behind the classical interpretation is very straightforward. Given some
random trial with a set of possible outcomes, like tossing a coin or rolling a die, we say
that the probability of any particular outcome is just the ratio of the number of favorable
cases to the total number of equally possible cases. Here a “favorable” case is just a case
where the outcome in question occurs.

So, if we’re talking about a coin toss, the probability of it landing heads is obviously 1/2
on this interpretation. There are only two possible outcomes, heads or tails, so the
denominator is 2; and of those two there’s only one case where it lands heads, so the
numerator is a 1.

Let’s look at a dice example. What’s the probability of rolling a 2 on a six-sided die?
Well, there are 6 equally possible outcomes, and only one outcome where it lands 2, so
the numerator is 1 and the denominator is 6, so the answer is 1/6, or about 0.17, or about
17 percent.

If we want to know the probability of rolling an even number, then our situation is a bit
different. Now our favorable cases include three of the 6 possible outcomes -- 2, 4 and 6 --
which are even numbers. So the probability is just 3 out of 6, or 1/2.
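
If it helps to see the classical definition in one place, here's a minimal sketch in Python.
The function name is mine, and everything assumes we've already settled on the set of
equally possible outcomes:

def classical_probability(outcomes, is_favorable):
    # Classical definition: favorable cases / equally possible cases
    favorable = [o for o in outcomes if is_favorable(o)]
    return len(favorable) / len(outcomes)

die = [1, 2, 3, 4, 5, 6]
print(classical_probability(die, lambda o: o == 2))       # 1/6, about 0.17
print(classical_probability(die, lambda o: o % 2 == 0))   # 3/6 = 0.5
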

These results are all correct, and the reasoning seems intuitively right. But it’s clear
that this only works if each of the elementary outcomes is equally possible. The
classical interpretation is especially well suited to games of chance that are designed
precisely to satisfy this condition — this is an interpretation of probability that was born in
casinos and gambling halls and card tournaments.

However, it’s not at all clear that this interpretation of probability is adequate as
a general interpretation of the probability concept. In particular, this condition that all the
outcomes be equally possible has been a cause for concern. What exactly does “equally
possible” mean, in general? If we just mean “equally probable”, then there’s a risk of
circularity, since our definition of probability now invokes the very concept of probability
that it is supposed to define.

The French mathematician Laplace famously tried to clarify this idea. He says that we
should treat a set of outcomes as equally possible if we have no reason to consider
any one of them more probable than any other. This is known as Laplace’s “principle of
indifference” (though it was John Maynard Keynes who coined this expression). The idea
is that if we have no reason to consider one outcome more probable than another, then
we shouldn’t arbitrarily choose one outcome to favor over another; doing so would be
irrational.

We’re intended to use this principle of indifference in cases where we have no
evidence at all for what the elementary probabilities might be, and in cases where we
have symmetrically balanced evidence, like in the case of coin tosses and dice rolls,
where you know, given the geometry and symmetries of a cubical die, that each side is
as likely to land as any other.

So, a strength of the classical interpretation is that it gives intuitively satisfying answers to
a wide variety of cases where these conditions apply, like games of chance. But it has a lot
of weaknesses as a general theory of probability.

Let me just lay out a couple of objections to the theory.

Consider for example how we might use this interpretation to assign a probability value to
the question, what are the odds that it’s going to rain today? Okay, what’s the
favorable outcome? It rains. What’s the set of possible alternative outcomes? It rains or it
doesn’t. But if we’re forced to assign equal probabilities to each outcome in order to use
the classical definition of probability, as the principle of indifference suggests, and these
are the two elementary outcomes, then the probability of it raining is always going to
be 1/2 according to this definition.

That makes no sense, something is clearly not right. This is an example of a situation
where it’s very hard to see how the necessary conditions for the use of the classical
definition could apply and make intuitive sense of the question. In this case it’s not
obvious how to define the set of alternative outcomes that are supposed to be
equally possible.

But the most serious objections to the classical interpretation of probability are
consistency objections. It seems that under this interpretation it’s possible to come
up with contradictory probability assignments depending on how you describe the
up with contradictory probability assignments depending on how you describe the
favorable outcomes relative to the space of possible outcomes, and the interpretation
doesn’t have the resources to resolve these contradictions without smuggling in other
concepts of probability.

Here’s a well known example from the literature that illustrates the problem. Suppose a
factory produces cubes with a side-length between 0 and 1 meter. We don’t know anything
about the production process.
Question: what is the probability that a randomly chosen cube has a side-length
between 0 and half a meter? In other words, what is the probability that a randomly
chosen cube is smaller than that box X right there?

Well, given this phrasing of the question, it’s natural to spread the probability evenly over
these two event types: picking a cube that has a side length between 0 and half a meter,
and picking a cube that has a side length between half a meter and 1 meter.

Why? Because we don’t have any reason to think that one outcome is more probable than
the other.

So the classical interpretation would give an answer of 1/2 to this question, and we can
see why.
Since we’ve got two equally possible outcomes -- the side-length is between 0 and 0.5, OR
the side-length is between 0.5 and 1 -- and that number, 2, goes in the denominator; and only
one favored outcome -- the side-length is between 0 and 0.5 -- and that number, 1, goes in the
numerator, this gives us probability one half, 0.5.
Now, to see how the consistency problem arises, let’s take the exact same setup, but
let’s phrase the question slightly differently. Suppose our factory produces cubes
with face-area -- not side-length, but the area of the face of a cube -- between 0 and 1
square meter. So the area of the face of every cube is randomly between 0 and 1
square meter.

Question: What is the probability that a randomly chosen cube has a face-area
between 0 and one-quarter square meters?
Now, phrased this way, the natural answer, using the classical interpretation, is going to
end up being 1/4, instead of 1/2.

Why? Because it’s natural now to consider four equally possible event types: picking a
cube with an area between 0 and 1/4 square meters, picking a cube between 1/4 and 1/2
square meters, picking a cube between 1/2 and 3/4 square meters, and picking a cube
between 3/4 and 1 square meter.

We don’t have any reason to think that one of these outcomes is more likely than any
other, so the principle of indifference will tell us to assign equal probabilities to each.

Our favorable outcome is just one out of these four equally possible outcomes, so the
numerator is 1 and the denominator is 4, giving 1/4.

I hope that’s clear enough. In the diagram I’ve just labeled the outcomes a, b, c and d to
help make the point, all we’re doing is calculating the ratio of the number of favorable
outcomes to the total number of possible outcomes.

Now, here’s the point. I want you to see that these two questions - 1) what is the
probability of randomly choosing a cube with side-length between 0 and a half meter? and
(2) what is the probability of randomly choosing a cube with face-area between 0 and 1/4?
— are asking for the probability of the SAME EVENT.
Why? Because the cubes with a side-length of 1/2 are ALSO the cubes with a face-area of
1/4, since the area of the face is just 1/2 times 1/2, which is 1/4 (or .5 times .5, which is
.25). So, all the cubes that satisfy the first description also satisfy the second description.
The events are just described differently. In other words, that “box X” is the same box in
both cases.
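
You can watch the clash happen in a quick simulation. This is just a sketch, and it
assumes that “equally possible” gets cashed out as “uniformly distributed” over whichever
quantity the question happens to mention:

import random

N = 100_000

# Reading 1: side-length is uniform on [0, 1].
# Favorable: side-length <= 0.5.
reading_1 = sum(random.random() <= 0.5 for _ in range(N)) / N

# Reading 2: face-area is uniform on [0, 1].
# The SAME cubes are favorable: side <= 0.5 exactly when area <= 0.25.
reading_2 = sum(random.random() <= 0.25 for _ in range(N)) / N

print(reading_1)   # about 0.5
print(reading_2)   # about 0.25 -- a different probability for the same event
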

And here’s the problem: the classical interpretation of probability lets you assign
different probabilities to the same event, depending on how you formulate the
question. It turns out that there’s literally an infinite number of different ways of
reformulating this particular question, and the classical interpretation gives different
answers for every formulation.

Now, this might not seem like a big deal to you, but this is regarded by mathematicians
and philosophers as a fatal flaw in the theory, and it’s one of the reasons why you won’t
find any experts today who defend the classical interpretation of probability as a general
theory of probability. It gives the right answers in a bunch of special cases, and there’s
something about the reasoning in those special cases that is intuitively compelling, but
that’s about the most you can say for it.
2.2 Logical Probability

When you’re reading about different interpretations of the probability concept, you might
encounter the term “logical probability” used basically as a synonym for “classical
probability”, which we discussed in the previous tutorial. There’s nothing wrong with this
usage if it’s clear what you’re talking about, but there’s potential for confusion because the
term “logical probability” is also used to refer to a broader 20th century research
program in the foundations of probability theory and inductive reasoning, and there
are significant differences between this interpretation of probability and the classical
interpretation of the 17th and 18th century.

Okay. The basic idea behind logical probability is to treat it as a generalization of the
concept of logical entailment in deductive logic. Just to refresh your memory, in a
deductively valid argument the premises logically entail the conclusion in the sense that, if
the premises are all true, the conclusion can’t possibly be false -- the truth of the
premises guarantees the truth of the conclusion. If the argument is deductively invalid then
the premises do not logically entail the conclusion, which simply means that even if the
premises are all true, it’s still possible for the conclusion to be false.

Logical entailment in this sense is a bivalent or binary property, it only has two values,
“yes” or “no”, like a digital switch — every argument is either deductively valid or it isn’t.
There are no degrees of validity, no degrees of logical entailment.

However, we can all recognize examples where logical support seems to come in
degrees, and we can sometimes quantify the strength of the logical entailment between
premises and a conclusion.

Here’s an example:

1. There are 10 students in this room.
2. 9 of them are wearing green shirts.
3. 1 is wearing a red shirt.
4. One student is chosen randomly from this group.
Therefore, the student that was chosen is wearing a green shirt.

This is a case where the conclusion doesn’t follow with deductive certainty, the argument
isn’t deductively valid in this sense, but our intuition is that the conclusion does follow
with a high probability.

How high? Well, we’re about 90% sure that the conclusion is true; not 100%, but still
something that a reasonable person might bet on.

Another way to say this is that the premises don’t completely entail the conclusion, but
they partially entail the conclusion. So on this reading, the statement “The student is
probably wearing a green shirt”, or more precisely, “it is 90% likely that the student is
wearing a green shirt” — these statements can be read as making a claim about the
degree of partial entailment or logical support that the premises confer on the
conclusion.
The logical approach to probability defines probability in these terms, as a measure of
the degree of partial entailment, or degree of logical support that a conclusion has,
given certain premises. When you think of the probability of a statement as ranging in
value between 0 and 1, then 1 represents classical logical entailment, where the premises
guarantee the truth of that statement, and values less than 1 represent greater and lesser
degrees of partial entailment.

Another way this is often framed is in terms of the degree of confirmation that evidence
confers on a hypothesis, where the evidence is identified with the premises of an
argument, and the hypothesis is identified with the conclusion. When you phrase it this
way you’re making explicit the connection between logical probability and some very basic
issues in scientific reasoning, like how to estimate how likely it is that a scientific theory is
true, given all the evidence we have so far.

One of the interesting features of this approach to probability is that it makes no sense to
talk about the probability of the conclusion all by itself; you’re always talking
about conditional probability, the probability of the conclusion given the
premises. Unconditional probability makes no sense — probability is always relative to
some body of evidence, or some background assumptions. Many critics of logical
probability take issue with this point, they think it should be perfectly reasonable to talk
about unconditional probabilities.

So, that’s the basic idea behind logical probability. The difficulty with this approach
comes when you try to work out this idea in detail.

For this to be a candidate for a general theory of probability you’ll need to work out
a general account of what this logical relationship is and how to operationalize it, how to
actually assign a value to the strength of the logical support that evidence confers on a
hypothesis. And this has proven to be a very tricky problem to solve. It has
preoccupied some of the smartest minds of the 20th century, including the British
economist John Maynard Keynes in the 1920s, and most notably the
philosopher Rudolf Carnap in the 1950s. It’s fair to say that no one has yet come up
with a satisfactory way of defining and operationalizing logical probability.

The difficulty of the problem arises in part from the desire to have a genuinely logical
definition of partial entailment, or degree of confirmation. That means that this
relation should only depend on logical features of the world, or more
accurately, logical properties of our descriptions of the world. So, for example, it
shouldn’t depend on specific knowledge we may have about the particular world that we
find ourselves in; it should be independent of that kind of substantive empirical knowledge.

So, Carnap for example tries to define a confirmation function that applies to formal
languages, and to illustrate the idea he uses little toy models of worlds with, say, only three
objects in them, and only one property, that each object either has or doesn’t have.
In these toy worlds you can list all the possible states that such a world can be in, and you
can define different event-types as subsets on this state space. And when you do this you
can show how, given information about one of the objects, you can formally define how
likely it is that certain facts will be true of the other objects. So in this example, if you have
evidence that Ball C is red, then this will change your estimation of the likelihood that, say,
ball A is red, or that all the balls are red.
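
Here's a minimal sketch of such a toy world in Python, just to make the idea concrete.
Weighting all eight states equally is only one possible choice, which, as we're about to
see, is exactly the problem:

from itertools import product

# Three objects (A, B, C), one property ("red"): 2^3 = 8 possible states,
# here weighted equally.
states = list(product([True, False], repeat=3))   # (A red?, B red?, C red?)

def prob(event, given=lambda s: True):
    live = [s for s in states if given(s)]
    return sum(1 for s in live if event(s)) / len(live)

print(prob(lambda s: all(s)))                         # P(all red) = 1/8
print(prob(lambda s: all(s), given=lambda s: s[2]))   # P(all red given C red) = 1/4
print(prob(lambda s: s[0], given=lambda s: s[2]))     # P(A red given C red) = 1/2

# Note: with equal weights on the 8 states, learning that C is red raises
# the probability that ALL the balls are red, but does nothing to the
# probability that A is red. Getting evidence about one object to bear on
# another requires a different weighting, which is one reason Carnap
# considered more than one confirmation function.
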

Now, Carnap was trying to generalize this procedure in a way that would give a general
definition of logical confirmation in the form of a confirmation function that would apply to
all cases where evidence has a logical relationship to a hypothesis. But Carnap himself
realized that there’s more than one way to define the confirmation function in his system,
and logic alone can’t tell us which of these to choose.

The details don’t matter too much for our purposes, but Carnap’s system runs into
technical and philosophical problems when you try to work it out.

There are other objections to the whole program of logical probability that I won’t go into
here, again, but most of them arise, as I said, from the constraint that the definition of
probability be a logical or formal one that doesn’t rely on specific knowledge about the
world.

It’s fair to say that today, this program is mostly of academic interest to certain
philosophers and people working in the foundations of probability; it’s not where the
cutting edge of the discussion is among scientists or the mainstream of people working on
probability.

Today, most of the discussion is between proponents of frequency-based approaches to
probability, and proponents of subjective or Bayesian approaches, which is what we’ll turn
to in the next couple of tutorials.
2.3 Frequency Interpretations

In the last two videos we’ve looked at the classical and the logical interpretations of
probability. Now let’s turn to one of the most widely used interpretations of probability in
science, the “frequency” interpretation.

The frequency interpretation has a long history; it goes all the way back to Aristotle, who
said that the probable is that which happens often. It was elaborated with greater
precision by the British logician and philosopher John Venn (1834-1923), in his 1866
book The Logic of Chance, and there are plenty of more contemporary figures who
elaborated or endorsed some version of the frequency interpretation -- I’m thinking of
figures like Jerzy Neyman and Egon Pearson and Ronald Fisher and Richard von
Mises -- if you know the field, these are big names in the history of statistics and probability
theory.

The basic idea behind the frequency approach, as always, is pretty straightforward. Let’s
turn once again to coin tosses. How does the frequency interpretation define the
probability of a coin landing heads on a coin toss?

Well, let’s start flipping the coin, and let’s record the sequence of outcomes. And for each
sequence we’ll write down the number of heads divided by the total number of tosses.

First toss, heads. So that’s 1/1.
Second toss is a tail. So that’s 1/2.
Third toss is a head, so now we’ve got 2/3.
Fourth is a tail, so now it’s 2/4.
Fifth is a tail, so now it’s 2/5.
Let’s cycle through the next five tosses quickly:

These ratios on the bottom are called “relative frequencies”, and a sequence like this is
called a relative frequency sequence.
There are a few obvious observations we can make about this sequence. First, we see
that it jumps around the value of one half, sometimes exactly one half, sometimes higher,
sometimes lower.

It might be easier to look at the sequence in decimal notation to see this more clearly. And
to make it even clearer, let’s graph this sequence.

It’s more obvious now that the sequence bounces around the 0.5 mark. Three times it’s
exactly 0.5, but we know it can’t stay at 0.5, since the next toss will move the ratio either
above or below 0.5.

What’s also obvious, I think, is that the range of variation gets smaller as the number
of tosses increases, and if we were to continue tossing this coin and recording the
relative frequency of heads, we would expect that this number would get closer and
closer to 0.5 the more tosses we added.
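
If you want to watch this happen, here's a minimal simulation sketch in Python, assuming
a fair coin:

import random

heads = 0
for n in range(1, 10_001):
    heads += random.random() < 0.5        # one toss: heads with chance 1/2
    if n in (10, 100, 1_000, 10_000):
        print(n, heads / n)               # the relative frequency so far

# Typical run: the ratio bounces around 0.5, and the swings shrink as the
# number of tosses grows.
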

Now, none of this is surprising, but what does it have to do with the definition of
probability?

Well, everyone agrees that there are important relationships between probability and the
relative frequencies of events. This is exactly the sort of behavior you’d expect if this was a
fair coin that was tossed in an unbiased manner. We assume the probability of landing
heads is 1/2, so the fact that the relative frequency approaches 1/2 isn’t surprising.

But what’s distinctive about frequency interpretations of probability is that they want to
IDENTIFY probabilities WITH relative frequencies. On this interpretation, to say that
the probability of landing heads is 1/2 is JUST TO SAY that if you were to toss the coin,
it would generate a sequence of relative frequencies like this one. Not exactly like this
one, but similar.
For a case like this one, the frequency interpretation will define the probability of landing
heads as the relative frequency of heads that you would observe in the long run, as you
kept tossing the coin.

To be even more explicit, this long-run frequency is defined as the limit that the
sequence of relative frequencies approaches, as the number of tosses goes to
infinity. In this case it’s intuitive that the sequence will converge on 1/2 in the limit. And if
that’s the case, then, according to this approach, we’re justified in saying that the
probability of landing heads is exactly 1/2.

Actually, what we’ve done here is introduce two different relative frequency definitions:

We can talk about probabilities in terms of finite relative frequencies, where we’re only
dealing with an actual finite number of observed trials; or we can talk about probabilities in
terms of limiting relative frequencies, where we’re asked to consider what the relative
frequency would converge to in the long run as the number of trials approaches infinity.

Some cases are more suited to one definition than the other.

Batting averages in baseball, for example, are based on actual numbers of hits over
actual numbers of times at bat. It doesn’t make much sense to ask what Ty Cobb’s
batting average would be if he had kept playing forever, since (a) we’d expect his
performance to degrade as he got older, and (b) in the long run, to quote John Maynard
Keynes, we’re all dead!

Coin tosses, on the other hand -- and other games of chance -- look like suitable
candidates for a limiting frequency analysis. But it’s clear that more work needs to be
done to specify just what the criteria are and what cases lend themselves to a limiting
frequency treatment, and this is something that mathematicians and philosophers have
worked on and debated over the years.

Now, I’ve said a couple of times that frequency interpretations are widely used in
science, and I’d like to add a few words now to help explain this statement. There’s a
version of the frequency approach that shows up in ordinary statistical analysis, and it’s
arguably what most of us are more familiar with. It’s based on the fact that sequences of
random trials are formally related to proportions in a random sampling of populations.

Just to make the point obvious, when it comes to relative frequencies, there’s no real
difference between flipping a single coin ten times in a row and flipping ten coins all
at once. In either case some fraction of the tosses will come up heads.

In the single coin case, as you keep tossing the coin, we expect the relative frequency of
heads to converge on 1/2.

In the multiple coin case, as you increase the number of coins that you toss at once -- from
ten to twenty to a hundred to a thousand -- we expect the ratio of heads to number of coins
to converge on 1/2.

This fact leads to an obvious connection between relative frequency approaches and
standard statistical sampling theory, like what pollsters use when they try to figure out
the odds that a particular candidate will win an election. You survey a representative
sample of the population, record the proportions of “Yes” or “No” votes, and these become
the basis for an inference about the proportions one would expect to see if you surveyed
the whole population.
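
Here's a minimal sketch of that inference in Python. The 54 percent figure is made up
purely for illustration:

import random

true_yes_rate = 0.54      # hypothetical proportion in the whole population

# Survey a random sample of 1,000 people:
sample = [random.random() < true_yes_rate for _ in range(1_000)]
estimate = sum(sample) / len(sample)

print(estimate)   # typically lands within a few points of 0.54
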

All I’m drawing attention to here is the fact that frequency approaches to probability are
quite commonly used in standard statistical inference and hypothesis testing.

Now, let’s move on to some possible objections to the frequency interpretation of
probability. Let me reiterate that my interest here is not to give a comprehensive tutorial on
the philosophy of probability. My goal, as always, is nothing more than probability
literacy -- we should all understand that probability concepts can be used and interpreted
in different ways, and some contexts lend themselves to one interpretation better than
another. These objections lead some to believe that the frequency interpretation just won’t
cut it as a general theory of probability, but for my purposes I’m more concerned about
developing critical judgment, knowing when a particular interpretation is appropriate and
when it isn’t.

Let’s start with this objection. If probabilities are limiting frequencies, then how do we
know what the limiting frequencies are going to be? The problem arises from the fact
that these limiting behaviors are supposed to be inferred from the patterns observed in
actual, observed, finite sequences, they’re not defined beforehand, like a mathematical
function. So we can’t deductively PROVE that the relative frequencies of a coin toss will
converge on 0.5. Maybe the coin is biased, and it’s going to converge on something
else? Or suppose we now get a series of ten heads in a row. Does that indicate that
the coin is biased and that it won’t converge on 0.5? But isn’t a series of ten heads in a
row still consistent with it being a fair coin, since if you tossed the coin long enough you’d
eventually get ten in a row just by chance?
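
That last worry is easy to check with a quick simulation sketch, assuming a fair coin:

import random

def has_run_of_heads(n_tosses, run_length=10):
    # Does a sequence of fair tosses contain a run of 10 heads?
    streak = 0
    for _ in range(n_tosses):
        if random.random() < 0.5:
            streak += 1
            if streak >= run_length:
                return True
        else:
            streak = 0
    return False

trials = [has_run_of_heads(10_000) for _ in range(100)]
print(sum(trials) / len(trials))   # most 10,000-toss sequences from a
                                   # FAIR coin contain a run of ten heads
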
I’m not saying these questions can’t be worked out in a satisfying way, I’m just pointing out
one of the ways that the application of the limiting frequency approach to concrete cases
can be difficult, or can be challenged.

Let’s move on to another objection, which is sometimes called the “reference class
problem”. And this one applies both to finite and limiting frequency views.

Let’s say I want to know the probability that I, 43 years old at the time of making this video,
will live to reach 80 years old. One way to approach this is to use historical data to see
what proportion of people who are alive at 43, also survive to 80. The question is, how do
we select this group of people from which to measure the proportion? A random sample of
people will include men and women, smokers and non-smokers, people with histories of
heart disease and people who don’t, people of different ethnicities, and so on. Presumably
the relative frequency of those who live to age 80 will vary across most of these reference
classes. Smokers as a group are less likely to survive than non-smokers, all other things
being equal, right?

The problem for the frequency interpretation is that it doesn’t seem to give
a single answer to the question “What is the probability that I will live to 80?”.
Instead, what it’ll give me is a set of answers relative to a particular reference class —
my probability as a male, my probability as a non-smoker, my probability as a male non-
smoker, and so on.
To zero in on a probability specific to me, it seems like you need to define a reference
class that is so specific that it may only apply to a single person, me. But then you don’t
have a relative frequency anymore, what you’ve got is a “single-case” probability.

And single-case probabilities are another category of objection to frequency
interpretations. When I toss a coin, it doesn’t seem completely crazy to think that for this
one, single coin toss, there’s an associated probability of that toss landing
heads. But frequency interpretations have a hard time justifying this intuition. This is
important to see -- on the frequency interpretation, probabilities aren’t assigned to single
trials, they’re assigned to actual or hypothetical sequences of trials. For a strict
frequentist, it doesn’t make any sense to ask, "what is the probability of a single-
case event?" But a lot of people think this concept should make sense, and so they reject
frequency interpretations in favor of interpretations that do make sense of single-case
probabilities.

So, for these and other reasons, many believe that the frequency interpretation just can’t
function as a truly general interpretation of probability.

In the next two tutorials we’ll look at interpretations of probability that, as we’ll see, are
much better at handling single-case probabilities. These are the subjective interpretations
and the propensity interpretations, respectively.
2.4 Subjective (Bayesian) Probability

In this video we’re going to look at so-called “subjective” or “Bayesian” interpretations of
the probability concept. By the end you should have a better idea what this approach is all
about and why many people find it an attractive framework for thinking about probabilities.

In the previous video we looked at “relative frequency” interpretations of probability. On
this view, when we say that the probability of a fair coin landing heads is 1/2, what we’re
really saying is that if you were to toss the coin repeatedly, in the long run half of the
tosses would land heads, and the probability is just identified with the long-run relative
frequency behavior of the coin.

But there are lots of cases where the relative frequency interpretation just doesn’t make
intuitive sense of how we’re using a probability concept.

For example, let’s say I’m at work and I remember that I left the back door of my house
unlocked, and I’m worried about the house being robbed because there has been a rash
of robberies in my area over the past two weeks. So I’m driving home from work and I’m
asking myself, what are the odds that when I get home I’ll discover that my house has
been robbed?

This is an example of a single-case probability. I’m not interested in any kind of long-run
frequency behavior, I’m interested in the odds of my house being robbed on this specific
day, on this one, single occasion. Examples like these are very hard for frequency
approaches to analyze.

The more natural way to think of this case is this. I have a certain DEGREE OF BELIEF in
whether this event occurred. If someone asks me how likely it is that I’ve been robbed
today, and I say I think there’s at least a 10% chance I was robbed, what I’m doing
is reporting on the strength of my subjective degree of confidence in this outcome. What
I’m reporting on is a subjective attitude I have toward the belief that I’ve been robbed — if
my degree of belief is very low, that’s a low probability; if it’s moderate, that’s a moderate
probability; if it’s high, that’s a high probability.

This is what it means to say that probability is SUBJECTIVE. What you’re saying is that
what probabilities represent are not features of the external world, but rather features of
your personal subjective mental states, namely, your degree of belief that a given event
will occur or that a given statement is true.

Subjectivists about probability want to generalize this idea and say that probability in
general should be interpreted as a measure of degree of belief.

This conceptual framework can be applied even to cases where the frequency
interpretation also works. If we’re talking about the probability of a coin landing heads
being equal to 1/2, the subjectivist will interpret this as saying that you’re 50% confident in
your belief that the coin will land heads.
Okay, at this point there’s an obvious objection to interpreting probability in this way. The
objection is this: People are notoriously BAD at reasoning with probabilities, our degrees of
belief routinely violate the basic mathematical rules for reasoning with probabilities.

If probabilities are interpreted as mere subjective degrees of belief, then in what sense can
we possibly have a theory of probability, a theory that distinguishes good reasoning from
bad reasoning?

Subjectivists, or Bayesians, as they’re often called, have an answer to this question.

They argue that the only logically consistent way of reasoning with subjective degrees of
belief is if those degrees of belief satisfy the basic mathematical rules for reasoning with
probabilities. All the action in the subjective interpretation lies in the details of this
argument. What I’m going to give here is just a rough sketch of the reasoning, which I’ll
break down into three steps.

One of the challenges of reasoning with subjective probabilities is that because they’re a
feature of our inner mental states, they’re hard to access. We need some way of assigning
a number to represent a degree of belief. How do we do this?

In the early part of the 20th century Frank Ramsey and Bruno de Finetti independently
came up with the basic solution, which exploits the fact that there is a close
relationship between belief and action, and in this case, between degrees of belief and
one’s betting behavior when one is asked to choose between different bets or gambles or
lotteries.

I’m going to use a visual device to help illustrate the idea.

Let’s say I want to measure the strength of your belief that this coin will land heads on the
next toss. I know we know the probability in this case, it’s 50%, but let’s just use this case
to illustrate the procedure. Then we can use an example that isn’t so obvious.

Okay, we imagine that you’re faced with a choice, to select between two different bets.
Bet 1: The bet is whether you think it’s true or not that the coin will land heads. If it lands
heads, you win $1000. If it lands tails, you win nothing.

Bet 2: The bet is whether you should play a lottery, where the lottery has a thousand
tickets. And in this lottery there are 250 winning tickets. If you draw a winning ticket you
win $1000. If you don’t draw a winning ticket you win nothing.

So the question is, which bet would you prefer to take, Bet 1 or Bet 2?

This is easy, we’re all going to pick Bet 1, right? Because we think the odds of the coin
landing heads are higher than the odds of winning the lottery, which are only 25%. We
believe that we’re more likely to win Bet 1 than Bet 2.

Now imagine that Bet 2 was different. Imagine that the lottery in Bet 2 has 750 winning
tickets, so you win if you draw any of those 750 winning tickets. Now which bet would you
pick?

Well, this time we’d all pick Bet 2, because now the odds of winning the lottery are 75%, so
we’re more confident that we’d win this bet than Bet 1.

So, what we’ve established here, by examining your preferences between different bets, is
that your degree of confidence that the coin will land heads lies somewhere between .25
and .75.

We can narrow this range by selecting different bets with different numbers of winning
lottery tickets.

Now, what will happen if we’re offered a lottery with exactly 500 winning tickets?
In this case, we should be indifferent between these two bets, since we think the odds of
winning the 1000 dollars are the same in both cases. We wouldn’t prefer one bet over the
other.

And THIS is the behavioral fact that fixes your degree of belief in the proposition at hand.
When you’re indifferent between these two choices, the percentage of winning tickets in
the lottery can function as a numerical measure of the strength of your belief in the
proposition that you’re betting on in Bet 1.

We can use this imaginary procedure to measure the strength of your belief in any
proposition. Like my belief that when I get home I’ll discover that my house was robbed.
If I end up being indifferent between a bet that my house was robbed and a bet that I’ll win
the lottery with, say, 100 winning tickets, then we can say that I’m about 10% confident that
my house was robbed.
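
Here's a minimal sketch of this narrowing procedure in Python, written as a bisection
search. The oracle function prefers_bet1 is hypothetical; it stands in for the person
actually telling us which bet they'd take:

def elicit_degree_of_belief(prefers_bet1, tickets=1000):
    # Narrow the range by offering lotteries with different numbers of
    # winning tickets until we find the indifference point.
    lo, hi = 0, tickets
    while hi - lo > 1:
        k = (lo + hi) // 2
        if prefers_bet1(k):   # still prefers betting on the proposition,
            lo = k            # so their confidence exceeds k/tickets
        else:
            hi = k
    return lo / tickets       # approximate degree of belief

# Someone roughly 10% confident their house was robbed:
print(elicit_degree_of_belief(lambda k: k < 100))   # about 0.1
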

Now, if you ever find yourself reading the subjective probability literature, the more
common language you’ll encounter is the language of betting ratios and betting rates, but
the main idea is the same. The procedure I’m describing here using lotteries is more
commonly used in decision theory, but it’s inspired by the same body of work by Ramsey
and de Finetti.

So, we now have a way of representing our personal degrees of belief by betting rates on
imaginary gambles. This gives us an operational procedure for assigning a real number to
a degree of belief. But we still don’t have any rules for how to reason with these degrees of
belief.

The next step in the subjectivist program is to show that a rational betting strategy will
automatically satisfy the basic mathematical rules for reasoning with probabilities.

In this context, all we mean by a rational betting strategy is this: no rational person will
willingly agree to a bet that is guaranteed to lose them money. A bet that is guaranteed to
lose you money is called a ‘sure-loss contract’.

If someone’s personal degrees of belief are open to a sure-loss contract, then that person
can become a ‘money pump’ — a bookmaker could exploit this knowledge to sell you
betting contracts that you will accept, but that you will never win, you’ll always lose money.
Not a good thing.
These sure-loss contracts are also known as “Dutch book” contracts, and this kind of
argument is called a “Dutch book” argument. Ramsey was the one who introduced this
language but I don’t know why he called it a Dutch book, I’m not sure what being Dutch
has to do with it, but the term Dutch book is now standard in probability theory and
economics. I’m going to follow Ian Hacking and just call it a “sure-loss contract”.

We can now define an important concept: If a set of personal degrees of belief is not open
to a sure-loss contract, then the set of beliefs is called “coherent”.

In other words, if your set of beliefs is coherent, then by definition you can’t be turned into
a money pump for an unscrupulous bookie. Note that this is a technical sense of
“coherence” specific to this context. It’s intended as an extension of the logical concept of
consistency, applied to partial degrees of belief.
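
To make the sure-loss idea concrete, here's a made-up numeric example in Python,
assuming that someone whose degree of belief in an event is p regards p dollars as a fair
price for a bet that pays $1 if the event occurs:

# Incoherent beliefs: 60% confident it will rain AND 60% confident it
# won't (these should sum to 1). A bookie sells this person both $1 bets:
price_rain, price_dry = 0.60, 0.60

for it_rains in (True, False):
    payout = 1.00   # exactly one of the two bets pays off either way
    net = payout - (price_rain + price_dry)
    print("rains" if it_rains else "dry", round(net, 2))   # -0.2 both ways
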

Now, the main theoretical result that Ramsey and de Finetti developed was this: A set of
personal degrees of belief is coherent if and only if the set satisfies the basic rules of
probability theory. And here we’re just talking about the standard mathematical rules.

The details of this theorem aren’t important, what’s important is what it represents for the
subjectivist program. Our original concern, remember, was that personal degrees of belief
are unconstrained, they don’t follow any rules. What Ramsey and de Finetti and others
have shown is that if one adopts this very pragmatic and self-interested concept of
rationality — namely, that a rational person won’t willingly adopt a set of subjective
degrees of belief that is guaranteed to lose them money — then it follows that this person’s
belief set will satisfy all the basic rules of probability theory. And it is in this sense that the
subjective approach to probability brings with it a normative theory of probability.

Now, in the literature, people who work within the subjective framework I’m describing here
are often called “Bayesians”, and this approach is called “Bayesianism”. So let’s say a few
words about this language.

Bayes’ Rule can be derived from the basic mathematical rules of probability, it’s basically
just a way of calculating conditional probabilities given certain information. Here’s the
simplest form of Bayes’ Rule:

P(H|E) = P(E|H) x P(H) / P(E)
H and E can stand for any two propositions, but in practice we often use Bayes’ Rule to
evaluate how strongly a bit of evidence supports a hypothesis, so let H be a
hypothesis and let E be some bit of evidence. Maybe H is the hypothesis that a patient has
the HIV virus, and E is a positive blood test for the virus.

We read this term, P(H|E), as the probability that H is true, given that E is true. On the
subjectivist reading, this is the degree of belief that we should have in hypothesis H, once
we’ve learned about the evidence E. This is also called the “posterior probability” of the
hypothesis.

P(H), all by itself, is called the “prior probability” of the hypothesis. This is the degree of
belief we had in H before ever learning about the new evidence E. In our example, this
would be the probability that the patient has HIV, before learning the results of the blood
test.

P(E|H) is called the “likelihood” of the evidence, given the hypothesis. This is how likely it
is that we would observe evidence E, if the hypothesis H was in fact true. So in our
example, this is the probability that someone will test positive for the HIV virus, given that
they actually have the virus.

The term in the denominator is called the “total probability” of the evidence E. In our
example, this term is going to represent the probability of testing positive for the HIV virus,
whether or not the patient actually has the virus. So this term will also depend on
information about the false-positive rate for the test, the percentage of times a patient will
test positive even when they don’t have the virus.
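
Just to make the shape of the calculation visible, here's a minimal sketch in Python with
made-up numbers: a 1% prior, a 99% true-positive rate, and a 2% false-positive rate:

p_h = 0.01               # prior P(H): patient has the virus
p_e_given_h = 0.99       # likelihood P(E|H): positive test if infected
p_e_given_not_h = 0.02   # false-positive rate P(E|not-H)

# Total probability of a positive test, infected or not:
p_e = p_e_given_h * p_h + p_e_given_not_h * (1 - p_h)

# Bayes' Rule:
posterior = p_e_given_h * p_h / p_e
print(round(posterior, 3))   # about 0.333: the positive test moves the
                             # probability from 1% up to roughly 33%
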

I’m not going to spend any more time explaining how this calculation will go right here,
because it’s not vital to the point I’m making, and I’ve got a whole other course on the rules
of probability theory that explains it in more detail.

The point I want to make here is that when you interpret probabilities the way that
subjectivists do, Bayes’ Rule gives us a model for how we ought to learn from experience,
how we ought to update our degrees of belief in a hypothesis in light of new evidence.
Bayes’ Rule has lots of important applications in statistical inference theory, but in the
hands of subjectivists it also functions as the central principle of a theory of rational belief
formation and rational inference.

So this is why subjectivists are often called “Bayesians”; it’s because within this
interpretation, Bayes’ Rule takes on great importance as part of a general theory of
rationality. For frequency theorists, Bayes’ rule is just another useful formulation of
conditional probability, and its use is restricted to cases where relative frequencies can be
defined. For subjectivists, it’s fundamental to their approach to rationality, and it can be
used in a much wider range of applications, since they’re not restricted to applications
using relative frequencies.

There’s also a whole field of philosophical work that you could describe as falling under the
label of “Bayesian epistemology”, which applies Bayesian principles to various problems in
the philosophy of knowledge, the philosophy of science, in decision theory and learning
theory, and so forth.

Regardless of what you think of it, this approach to probability has had a huge impact on
philosophy and science.

Here’s the summary of what we’ve been talking about.


Step 1 in the Bayesian program is to find a way of numerically representing a person’s
degree of belief. We use betting rates to do this. Once we’ve got this, we can talk about
rational and irrational betting strategies.

In Step 2 we show that if our degrees of belief are “coherent”, then they’ll automatically
satisfy the basic mathematical rules of probability theory.

And Step 3 involves the use of Bayes’ Rule as a guide for how we ought to update our
beliefs based on evidence.

Now, we’ve been looking at objections to all the previous interpretations of probability that
we’ve covered, so it’s only fair to mention that of course there are objections to the kind of
subjective Bayesianism that I’ve been describing here. The best I can do here is just name
a few, since it would take too long to try to explain them all in detail here.

Here we go:

First objection: Bayesianism assumes logical omniscience

The claim is that if our beliefs satisfy the basic rules of probability, then the rules require
that all beliefs about logical truths have probability 1, and beliefs about logical
contradictions have probability 0. So on this view, if our beliefs are coherent then we can
never believe a contradiction. The objection is that this is just false of human beings, none
of us are logically omniscient in this way, and so it’s an unreasonable standard to impose on
our beliefs.

Second objection: Bayesianism assumes that classical logic is the only logic
This follows from the bit about logical omniscience. We’ve never talked about non-classical
logics before, but there are such things -- logical theories that use different fundamental
rules of inference from standard classical logic. We’ve largely moved away from the days
when everyone thought that classical logic was the only possible logic one could use. The
objection is that Bayesianism presupposes that the rules of classical logic are correct, and
makes them immune to revision based on empirical evidence, and consequently it grants
them a kind of a priori status that few people actually think they have anymore.

Third objection: The problem of old evidence

From Bayes’ rule it follows that if the probability of a piece of evidence is 1, then the
likelihood of the evidence given some hypothesis is also 1. But if this is so, then such
evidence can never raise the probability of a hypothesis: the posterior probability will
always be just the same as the prior probability.

This poses a problem for Bayesian views on how so-called “old evidence” might support a
new scientific theory. For example, Newton’s theory of gravity doesn’t completely predict
the orbit of Mercury, it doesn’t adequately account for the precession of Mercury’s orbit
around the sun. This behavior of Mercury’s orbit was known in the mid 19th century. 60
years later Einstein comes up with the general theory of relativity and his new theory
accurately predicts this piece of “old evidence”, the precession of Mercury’s orbit. The
objection is that this is rightly viewed as an empirical success of Einstein’s theory, it should
lend support to his theory; but the Bayesian has a hard time explaining HOW this old
evidence can give us additional reason to believe the theory.

Fourth objection: The problem of new theories

It seems intuitive that sometimes, the invention of a new theory, all by itself, can influence
our confidence in an old theory, especially when the old theory didn’t have any rivals.
Imagine the old earth-centered cosmology of Ptolemy, where all the heavenly bodies move
around a motionless Earth. This theory had no competition for a long time. Then along
comes Copernicus with his Sun-centered cosmology, which can explain everything that
Ptolemy’s theory did. Wouldn’t this fact alone lead some people to reconsider their support
for Ptolemy’s theory, to lower their conviction in the truth of this theory? The objection to
Bayesianism is that it’s not clear how this kind of shift in support can be explained or
justified in the Bayesian framework.

Fifth objection: Additional constraints on prior probabilities are needed


This is sometimes just called “the problem of the priors”. The issue is this. What we’re
calling “subjective” Bayesianism doesn’t place any restrictions on the values of the prior
probabilities, beyond the requirement of coherence. In other words, it doesn’t constrain
your beliefs, beyond the requirement that they be consistent with the rules of probability,
and when you learn new evidence, you update your beliefs according to Bayes’ rule.

The objection is that this is just way too permissive. You can have literally crazy views of
the world that would be permitted by these rules. Within those belief sets, you’d be
updating your beliefs rationally when you encountered new evidence, but the belief sets
themselves would be wildly different.

So different constraints on prior probabilities have been proposed. One proposal, for
example, is that subjective degrees of belief should, at the very least, track the relative
frequencies that are known. So, for example, if it’s known that a baseball player is hitting
.350 this season, then all other things being equal, it seems reasonable to assign a degree
of belief of .35, or 35%, as the prior probability that he’ll get a hit the next time at bat. Now,
if you’re a Bayesian and you think along these lines, then you’re not a strict, subjective
Bayesian; you’re what’s called an “objective” Bayesian, because you think there are
objective features of the world, like relative frequencies, that should restrict the
probabilities you assign to your beliefs. What we end up with is really a family of Bayesian
approaches to probability, that range from more subjective to more objective varieties, and
where you fall on this range depends on how many and what sorts of additional constraints
you’re willing to place on the prior probabilities.

Okay, I think that’s more than enough for this introduction to subjective probability. There
are other objections of course, but these are some of the main ones.
To wrap things up, I’ll just conclude with this: There are a lot of smart people working today
in philosophy, statistics, applied math and science, computer science and artificial
intelligence, who are engaged in research within what can be described as a broadly
Bayesian framework. That doesn’t mean that there aren’t a lot of open problems with the
framework that need to be solved. But it does mean that this is a framework that people
are willing to openly endorse without embarrassment.
2.5 Propensity Interpretations

The last interpretation of probability that we’re going to look at is known as the propensity
interpretation. The term was coined by the philosopher Karl Popper in the 1950s.

Actually before we get into the concept of a propensity, let me just back up and situate this
discussion a little bit.

We’ve looked at a number of different interpretations of the probability concept, but you
can carve up interpretations into roughly two camps, corresponding to two larger, umbrella
concepts of probability.

The first concept is sometimes called “epistemic probability” or “inductive probability”. Ian
Hacking calls it ‘belief-type’ probability. This kind of probability is about how probable a
statement is, or how strongly we should hold a belief, given certain facts or evidence.
Given such-and-such evidence, what is the probability that the Big Bang Theory is true, or
that it’ll rain tomorrow? And so on.

The key thing about this kind of probability is that it doesn’t depend on unknown facts
about the world, it only depends on our available evidence; probability judgments of this
kind are always relative to the evidence that is available to some agent.

Looking back at the probability concepts that we’ve discussed, it’s clear that LOGICAL
probability and SUBJECTIVE probability belong in this camp.

But there’s another probability concept that we’ve also been discussing, which some call
“objective probability”, or “physical probability”. This kind of probability is associated with
properties of the world itself, independent of available evidence, independent of what
anyone happens to believe about the world.

So, for example, when we hear reports of an outbreak of a new flu virus, we’re told that in
certain regions there’s an increased chance of contracting the virus, and if this is true, it’s
true independently of what anyone happens to believe about the world.

Or think about radioactive decay, where there’s, say, a 50% chance that a particular atom
of some element will decay in the next hour. The half-life of a radioactive element is an
objective feature of the world that we discover, it’s not something that depends on the
evidence or beliefs that we have about it.

Of the probability concepts that we’ve looked at, CLASSICAL probability and
FREQUENCY interpretations of probability belong more to this camp. Now, admittedly the
physical properties that are associated with probabilities in these theories are a little weird
and abstract. In the classical theory they’re ratios of favored outcomes over all equally
possible outcomes; in frequency theories they’re relative frequencies of observed or
hypothetical trials. But in both cases, the probability of a coin toss landing heads is
identified with a feature of coin tossings, not with our beliefs about coin tossings.

So what does this have to do with Popper and the propensity interpretation?

Well, propensity interpretations land squarely in the objective, physical probability camp.
Popper introduced this concept because he thought that frequency-style interpretations of
physical probability weren’t adequate, so he’s trying to articulate the concept of a physical
probability in a better way.

So, what’s the difference between relative frequencies and propensities? Let’s consider
our coin tossing example again.

On a relative frequency interpretation, the probability of the coin landing heads is identified
with the long-run relative frequency of heads; so in the long run this frequency will
converge on 0.5 for a fair coin, and this limiting frequency, this ratio, is what we’re referring
to when we say that P(H) = 0.5.

In other words, a probability, on this view, isn’t a property of any individual coin toss; it’s a
property of a potentially infinite sequence of coin tosses.

Now, Popper thinks there’s something incomplete about this approach. Consider two
coins. The first is a fair, unbiased coin. The other is a biased coin, it’s weighted more on
one side than the other; so that when you toss it, it lands heads more often than tails. Let’s
say that in the long run it lands heads 3/4 of the time.

Popper asks us to consider these two coins, sitting in front of us on the table. These coins
will generate different long-run frequencies when you toss them. Why? What explains this
difference in behavior?

The obvious answer, says Popper, is that the two coins have different physical
characteristics that are causally responsible for their long-term frequency behavior.

It’s these different physical characteristics that Popper calls “propensities”. It’s their
different propensities to land heads that account for the differences in their frequency
behavior. And this is what numerical probabilities are taken to represent, propensities of an
experimental setup to generate these different relative frequencies of outcomes.

Now, an important feature of these propensities is that they belong to individual coin
tosses, not to sequences of coin tosses. Propensities are supposed to be causally
responsible for the patterns you see in sequences of coin tosses, but propensities
themselves are properties of individual coin tosses.

So on a propensity interpretation, if you toss both of these coins just once, you can say of
this singular event, this individual coin toss, that the unbiased coin has a probability = 0.5
of landing heads, and the biased coin has a probability = 0.75 of landing heads.
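
Here's a minimal sketch in Python of the two coins Popper has in mind, modeling each
single-toss propensity as a fixed chance of landing heads, with the numbers from the
example above:

import random

def long_run_frequency(propensity, n=100_000):
    # Each toss is a single case with the given propensity to land heads;
    # the long-run relative frequency reflects that propensity.
    return sum(random.random() < propensity for _ in range(n)) / n

print(long_run_frequency(0.50))   # fair coin: about 0.5
print(long_run_frequency(0.75))   # biased coin: about 0.75
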

Popper and other propensity theorists take it as a major advantage of this approach that it
lets us talk about single-case probabilities, and it has a theoretical advantage in that it
explains the long-run frequency behavior of chance setups, rather than just treating them as
brute empirical facts, as frequency approaches tend to.

Popper also thought that a propensity interpretation was the only way to interpret the
probabilities associated with quantum mechanical properties, like the decay rates of
atoms. He interprets quantum mechanical probabilities as measuring genuine
indeterminacies in the world, not just our ignorance of the physical details that actually
determine when the atom decays. According to standard interpretations of quantum
mechanics, there are no such details, the quantum probabilities represent genuinely
indeterministic processes, an objective chanciness in the laws of nature itself. So Popper
thinks that the propensity interpretation is the most natural way to interpret these kinds of
physical probabilities.

Okay, that’s the basic idea behind propensities. As you might suspect by now, this is of
course just the tip of the iceberg. We haven’t said anything yet about possible objections to
propensity interpretations, or even whether they’re a viable interpretation of the probability
calculus. I mean, maybe propensities can help us understand objective indeterminacy in
the world, but what’s the guarantee that they’ll obey the mathematical rules of probability
theory? And how exactly do propensities relate to relative frequencies? And what exactly
are propensities, metaphysically speaking?

All of these questions are interesting, and in the decades since Popper introduced this
approach various different theories of propensity and objective chance have been
developed to help answer these questions. I’m a little hesitant to get into this literature
because, (a) it’s mostly of philosophical interest — this is something that scientists or
statisticians tend not to have much interest in; and (b) I don’t want this introduction to be
any longer than it has to be. In an introductory classroom discussion I would probably stop
right here.

But … since I talked about objections in all the other tutorials I might as well say something
about how this approach has been developed and the sorts of challenges it faces.

First of all, there really are two kinds of propensity theories in circulation, and these
theories differ in how they view the relation between propensities and relative frequencies.

For Popper, for example, the probability of landing a 2 on a dice roll is interpreted as a
propensity of a certain kind of repeatable experimental setup — in this case, the dice-
rolling setup — to produce a sequence of dice rolls where, in the long run, the die lands a
2 with relative frequency 1 in 6.

So on this view, propensities are always associated with long-run relative frequencies;
they’re precisely the physical features of the experimental setup that are causally
responsible for those long-run frequencies.

So on Popper’s view, even though he talks about single-case propensities, these
propensities are only defined for single cases that involve some repeatable experimental
setup, and the physical property associated with the propensity is defined in terms of its
ability to generate these long-run frequencies, if you were to repeat the experiment over
and over. Notice that this is not a propensity to produce a particular result on a particular
occasion; this is a propensity to produce a sequence of results over repeated trials.

For this reason, some people call this kind of propensity theory a “long-run” propensity
theory, and they distinguish it from a genuinely single-case propensity theory, which treats
propensities as propensities to produce particular results on specific, singular occasions.

I know this might seem like just a verbal distinction, but metaphysically the two views really
are quite different.

For example, for a single-case propensity theory, the propensity for rolling a 2 on a fair
dice roll is relatively weak, it’s measured by the ratio of 1 in 6, or about 0.17. That’s a low
number. The probability is a direct measure of this weak tendency, or propensity, to land
‘2’ on a single dice roll.

For a long-run propensity theorist like Popper, on the other hand, the propensity in
question is NOT measured by this low number; it’s not identified with the probability of
rolling a 2. There is still a propensity here, and it’s an extremely strong one, but it’s not a
tendency to roll a 2 on any particular occasion. The propensity is the tendency of the dice
rolling setup to land ‘2’ with a long-run relative frequency of 1/6, and THAT is a VERY,
VERY strong tendency. We get the same outcome as with the single-case propensity
approach — the probability of rolling a ‘2’ is defined as 1/6, but the interpretation of the
physical property that is responsible for this outcome is very different.

Now, I draw this distinction because objections to propensity theories differ between
these two types of theory.

As I mentioned earlier, one concern that all interpretations of probability face is whether
they can function as a suitable interpretation of the probability calculus, the mathematical
theory of probability.

Long-run propensity approaches tie propensities to relative frequencies, which is good in
one sense, since it can piggy-back on the widespread use of relative frequencies in
science. But from a foundational standpoint it’s not so good, since — as we saw in the
tutorial on frequency interpretations — there are reasons to question whether relative
frequencies can provide a suitable interpretation of the probability calculus.

With single-case propensities it’s even less clear why we should think they would obey the
laws of probability theory. Of course if we wanted to we could DEFINE single-case
propensities in such a way that they necessarily satisfy the laws, but as Alan Hajek puts it,
simply defining what a witch is doesn’t show us that witches exist; so simply defining
propensities in this way doesn’t give us any additional reason to think they exist.

Another class of objections focuses precisely on this question of existence. Unlike relative
frequencies or subjective degrees of belief, which aren’t metaphysically mysterious to most
of us, it’s not at all clear what propensities are, metaphysically speaking. The closest
category we have to describe physical tendencies of things is “dispositions” — certain
kinds of physical properties are dispositional properties. Think of a property like “fragility”,
which we can think of as a disposition to break when subjected to a suitably strong and
sudden stress. So maybe a propensity is a probabilistic disposition of some kind. But
making this idea clear is more challenging than it looks.

Some people object that, in the absence of a proper metaphysical theory, the term
“propensity” is an empty concept. If believing in propensities amounts to nothing more than
believing there is SOME property of this dice rolling setup which entails that the dice will
land ‘2’ with a certain long-run frequency, then this is fine as far as it goes, but it doesn’t
add to our understanding of what generates those frequencies. It’s like saying that I
understand how it is that birds know how to build nests, by saying that they have an
“instinct” for nest-building, and defining this instinct as “an innate ability to build nests”.
This language only tricks us into thinking we understand something when we really don’t.

So these are some objections to propensity interpretations. But the story isn’t all bad. If
you survey the literature you’ll see there’s been quite a bit of work on propensity
interpretations that have been re-branded as theories of “objective chance”. Here I’m
thinking of work by David Lewis and Isaac Levi and Hugh Mellor, and including more recent
work by people like Carl Hoefer and Michael Strevens and others. These folks are trying to
fit the concept of objective chance into a broader theory of probability that integrates
elements of subjective and frequentist approaches, and to show how these various
probability concepts are implicitly defined by their relationships to one another. From a
philosophical standpoint I think this work is very interesting, but it’s still very much a
heterogeneous research program with a lot of unresolved problems.

From a critical thinking standpoint, however, I don’t think that much of this matters. What
matters are the broad distinctions, like the distinction between epistemic probability and
physical probability, or the distinction between subjective or Bayesian approaches and
frequency approaches.

Critical thinking about probabilities and probabilistic fallacies requires a certain level of
basic philosophical literacy, but I don’t think it requires anything beyond what we’ve
covered here.
The Rules for Reasoning With Probabilities
PART I: PRELIMINARY CONCEPTS
1. What has a probability? Propositions versus Events

In this video I want to talk about what sorts of things can have probabilities. I know we did
a whole course on philosophical interpretations of the probability concept, but in
applications you generally see one of two different languages used to talk about
probabilities, what I’ll call a “proposition language” and an “event language”. These really
are just different ways of saying the same thing, but it helps to know how to translate back
and forth between them.

Let’s say I toss a coin, and I ask “What is the probability that the coin will land heads?”

Grammatically, what I’m attributing the probability to is this statement, “The coin will land
heads.” And the question we’re asking is, what is the probability that this proposition, this
statement, is true?

So, when we say P(H) = 0.5, that the probability of heads is 0.5, and we’re using the
language of propositions, we’re interpreting H as the proposition that the coin will land
heads, and we read the answer as saying that it’s 50% likely that this proposition is true.

Now, we can also ask the question this way: What is the probability of the coin landing
heads?

The grammar is subtly different. Here, the probability is being associated with an event,
the event of the coin landing heads.

What’s the difference? A proposition is a linguistic entity that asserts a claim that can be
either true or false. An event is not a linguistic entity, and events don’t assert anything;
they aren’t the sort of thing that can be true or false. An event is a state of affairs in the
world that either happens or doesn’t happen.

So in the event language, we interpret “the probability of heads is 0.5” as asserting that the
probability of an event occurring, the event of the coin landing heads, is 0.5.

For the most part you can think of these as just two different ways of saying the same
thing. So why do we have these two different languages?

Well, the proposition language is most natural when we’re talking about what beliefs can
be inferred on the basis of what evidence, or how likely it is that the conclusion of an
argument is true, given the premises. This is the domain of “inductive logic”, so you’re
more likely to see this language in a logic text.

Also, the proposition language might be more natural under certain interpretations of the
probability concept than others. For example, statements about subjective probability,
where probabilities are associated with degrees of belief, are more natural in the
proposition language than in the event language.

On the other hand, the event language is more commonly encountered in statistics and
probability theory textbooks, and it’s often the more natural language when talking about
statistical analysis of data.

It’s also more natural when we’re using, say, the frequency interpretation of probability,
since frequencies are usually defined in terms of ratios of events.

The main point here is just that you’ll encounter both ways of talking about probabilities,
and in most cases you can convert back and forth between them. The mathematical rules
are applied the same either way; it’s just a matter of how you interpret the language.

So the value of knowing both languages is similar to the value of knowing both languages
in any bilingual community. You want to be able to understand both languages so you can
understand the conversations that people are having. What you want to avoid is thinking
that there’s only one right way to talk about probabilities, because a) it’s just false, and b)
it’ll get in the way of a productive dialogue between people in each camp.
Probabilities range between 0 and 1

In this video we’re going to talk about the range of mathematical values that a probability
can take, and how we interpret probabilities at the extremes and between the extremes.
We also talk about how the language of “necessity” and “contingency” relates to all this.

We commonly talk about probabilities in terms of percentages, but it’s easy to forget that a
percentage is just a fraction. 50% probability is 50/100, or 1/2. 25% is 25/100, or 1/4.

The first convention we adopt when talking about probabilities is that probabilities are real
numbers that can take on any value between 0 and 1.

Now, the extremes are interesting. What do we mean when we say that the probability of
an event or proposition is 0, or 1?

Well, if the probability is 1, in the event language that means the event is certain to occur,
there is no chance that it won’t occur. In the proposition language, it means that the
proposition must be true, it cannot be false.

Now, this is just a convention. It doesn’t say anything about when we can judge an event
or proposition to have probability 1. But there are some obvious examples.

If we grant that the only two options in a coin toss are to land heads or to land tails,
then it’s safe to say that the probability of the coin landing heads OR tails is 1, since these
exhaust all the possibilities.

And this is the common way that this concept is used in probability theory and statistics.
Given a set of mutually exclusive and exhaustive possibilities, the probability that one of
these possibilities will be realized is equal to 1.

Now, a probability equal to 0 means just the opposite. This refers to an event that can’t
possibly occur, or a proposition that can’t possibly be true.

In mathematics and logic there’s a convention that contradictory statements can’t be true,
so contradictions are automatically assigned a probability value of 0. If a coin landing
heads means that it didn’t land tails, and vice versa, then to say that the coin landed both
heads and tails on the same toss is a contradiction; it can’t possibly be true, so we assign
it probability 0.
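To make this concrete, here’s a minimal Python sketch (my own illustration, not from the
lesson; the set names are mine) that models the coin-toss sample space as a set of
equally likely outcomes and checks both extremes:

from fractions import Fraction

omega = {"H", "T"}  # the two possible outcomes of a single toss

def prob(event):
    # Probability as the fraction of equally likely outcomes in the event
    return Fraction(len(event & omega), len(omega))

print(prob({"H", "T"}))  # heads OR tails exhausts the possibilities: 1
print(prob(set()))       # heads AND tails on one toss is impossible: 0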

In philosophy we have a pair of concepts that we often use to distinguish events or
propositions at the extremes from those that aren’t at the extremes.

If a proposition or event has probability 1, that means it must be true, or it must occur, and
we say that the proposition is NECESSARILY TRUE, or that it’s a NECESSARY EVENT.

Similarly, if it has probability 0, that means it must be false, it can’t possibly occur, and we
say that it refers to a NECESSARILY FALSE proposition, or that the event is an
IMPOSSIBLE EVENT.

So, if a proposition or an event has probability that is not 0 or 1, but lies between 0 and 1,
that means that it’s possible that the proposition is true, or that the event will occur. When
this is the case, we commonly say that the proposition is a CONTINGENT proposition,
rather than a necessary proposition. Similarly, we’d say that it’s a contingent event, rather
than a necessary event. In this context, contingent just means “possible” — it’s possible
that the event will occur. If it occurs, it occurs, but we understand that it could have
happened otherwise, and that’s what we mean when we say it’s contingent rather than
necessary.

Now, having just said this, I think it’s important to point out that these concepts of
necessary and contingent propositions and events are not part of formal probability theory,
they’re really a part of a broader philosophical framework for interpreting the world, and
they’re separable from probability theory. A philosopher might argue, for example, that
there is no principled way of distinguishing necessary propositions from contingent
propositions, or might dispute the existence of necessary truths in a given domain. But
these philosophical debates are largely independent of the conventions we use in
probability theory and statistics.

I just wanted to make this point clear. The distinction between necessarily and contingently
true propositions or events is an important one in philosophy, and probability theory can
help us articulate what we mean by these concepts.

But within probability theory, probability 1 or probability 0 has an independent formal
meaning and formal justification, it’s part of the definition of what a probability is as a
mathematical concept, and this definition is independent of any philosophical uses we
might want to make of these concepts.
Mutually exclusive events

This is a short one. I used the expression “mutually exclusive” in the previous tutorial. Here
I want to clarify what this means.

Simply put, two propositions, or two events, are called “mutually exclusive” if they can’t
both be true, or occur, at the same time.

So, a coin can’t land both heads and tails on the same toss.

A spin of a roulette wheel can’t land on both red and black on the same spin.

These are mutually exclusive events. Mutually exclusive events are sometimes also called
“disjoint” events.

Obviously, then, if A and B can both be true at the same time, or occur at the same time,
then they’re NOT mutually exclusive.

If A is “I draw a heart from a deck of cards”, and B is “I draw a King from a deck of cards”,
those aren’t mutually exclusive -- if you draw the King of Hearts then both A and B are true
at the same time.
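If you like to see this in code, here’s a quick Python sketch (my own example, with a
hypothetical card encoding) that tests mutual exclusivity by intersecting event sets:

suits = ("hearts", "diamonds", "clubs", "spades")
hearts = {("hearts", rank) for rank in range(1, 14)}   # all 13 hearts
kings = {(suit, 13) for suit in suits}                 # the four kings

def mutually_exclusive(a, b):
    # Two events are mutually exclusive if their intersection is empty
    return len(a & b) == 0

print(mutually_exclusive(hearts, kings))  # False: the King of Hearts is in both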

So, that’s it. I just wanted to make sure the concept was clear, because it’s important for
understanding some of the rules for reasoning with probabilities.
Independent events

Independence is a very important concept in probability theory.

The basic idea is straightforward. Two events are said to be independent if the occurrence
of one doesn’t influence the probability of the occurrence of the other.

In other words, given two events A and B, if A occurring or not occurring has no effect on
the probability of B occurring, then we say that A and B are probabilistically independent.

On the other hand, if the occurrence or non-occurrence of A does have an effect on the
probability of B, then we say that A and B are probabilistically dependent.

We can say the same thing in proposition language: Two propositions are independent if
the truth of one doesn’t make the truth of the other any more or less likely.

A schematic way of representing independence is like this:

P(A) = P(A given B)


P(B) = P(B given A)

If A and B are independent, then the probability of A given B, is just the same as the
probability of A all by itself.

And if they’re independent, then it works the other way too. The probability of B given A is
the same as the probability of B all by itself.

If these are NOT equal, then A and B are probabilistically DEPENDENT on one another.
We’re saying that the occurrence of B changes the probability of A, and vice versa.

Some examples:

Toss a coin. Assume this is a fair coin, so the probability of it landing heads is 50 percent,
or 0.5.

Let’s assume that it in fact landed heads. Given this, what is the probability that it will land
heads again on the second toss?

A = the coin lands heads on toss 1


B = the coin lands heads on toss 2

What is P(B given A)?

Well, if you think it’s higher or lower than 0.5, then you’d be making a mistake. These are
probabilistically independent events; that is, P(B given A) = P(B) = 0.5. The probability of
landing heads on a second toss is still 0.5. This would be the case even if you’d previously
landed ten heads in a row; the probability of the next toss landing heads is still just 0.5. To
believe otherwise is to commit what’s known as the “gambler’s fallacy”. We’ll talk more
about the gambler’s fallacy in the course in fallacies of probabilistic reasoning.

Now, consider this example:


A = the dice roll is an even number
B = the dice roll is a 2

We know that the probability of rolling a 2 on a six-sided dice is 1 in 6. So the probability of
event B by itself is 1/6.

But if A is true, then the set of possible values is restricted to just the even numbers. The
probability of rolling a 2, given that it’s even, is 1/3.

A and B are probabilistically dependent events. The occurrence of one affects the
probability of the other.

I want to note that dependence is a symmetrical relationship, in that it works both ways: if
the occurrence of A affects the probability of B, then the occurrence of B will also affect the
probability of A.

But I also want to point out that this doesn’t mean that the numerical values will be the
same. In our example, if we know the dice roll is even, then the probability of rolling a 2 is
1/3, rather than 1/6.

But let’s do it the other way around. We roll the dice but we don’t know anything about the
outcome. What are the odds that the dice roll is even? Well, that’s just 1/2, since the even
numbers are 2, 4 and 6, and that’s half of the possible outcomes.

But now let’s say we know that we rolled a 2. This obviously affects the probability that the
dice roll is even. In fact, it’s certain that it’s even -- given that it’s a 2, the probability that it’s
even is 1.

And this illustrates the point.

P(even) = 1/2, but P(even, given it's a 2) = 1


P(2) = 1/6, but P(2, given that it's even) = 1/3

They’re probabilistically dependent whichever way you go, but the numerical values may
differ depending on which way you go.
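As a rough sketch of how you might check this by counting (my own illustration, assuming
six equally likely outcomes; the function names are mine), compare each categorical
probability with its conditional counterpart:

from fractions import Fraction

omega = {1, 2, 3, 4, 5, 6}
even = {2, 4, 6}
two = {2}

def prob(e):
    return Fraction(len(e), len(omega))

def cond(a, b):
    # P(a given b): the fraction of b's outcomes that are also in a
    return Fraction(len(a & b), len(b))

print(prob(two), cond(two, even))    # 1/6 vs 1/3: dependent
print(prob(even), cond(even, two))   # 1/2 vs 1: dependent, different number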
PART II: THE BASIC RULES
The Negation Rule: P(not-A)

Okay, the first of the basic rules we’re going to cover is the negation rule.

We’re given an event A, and we know the probability of A occurring; call this P(A).

Question: What is the probability that A will NOT occur?

The rule is simple: Given P(A), the probability that A will occur, or that A is true, the
probability of A NOT occurring, or NOT being true, is equal to 1 minus the probability of A.

P(not-A) = 1 - P(A)

Here’s a simple example:


We know that the probability of rolling a 2 on a six-sided die is just 1 in 6. Our space of
possible outcomes has six elements in it, and the outcome that we care about is just one
of those elements, so the ratio is 1 in 6. Thus, P(2) = 1/6.

Now, what’s the probability of NOT rolling a 2? Well, this is just the probability of rolling a
1 or a 3 or a 4 or a 5 or a 6 -- anything that’s not a 2. This is 5 of the 6 possible outcomes,
so the probability is 5 in 6.

But note that 5/6 is just 1 − 1/6, or 1 - P(2), the probability of rolling a 2. If that’s not clear
just remember that 1 is equal to 6/6, so 6/6 - 1/6 equals 5/6. Thus,

P(not-A) = 1 - P(A)
P(not-2) = 1 - P(2)
P(not-2) = 1 − 1/6
P(not-2) = 5/6

Now, when you look at the set of numbers in the curly brackets here …

{1, 2, 3, 4, 5, 6}

… you notice something interesting. This set represents what is sometimes called the
“sample space” of this experimental setup, the set of all possible elementary outcomes of
a random trial. Notice that events, like rolling a 2 or not rolling a 2, can be represented by
subsets of this sample space. Let’s make this clearer.

The event of rolling a 2 is represented by this single element of the set: {2}

The event of NOT rolling a two is also represented by a subset of the sample space, in this
case the remaining five elements: {1, 3, 4, 5, 6}

In set theory language, we’d call the set on the bottom the “complement” of the set on the
top.

The general point is that given a space of possible outcomes, you can represent an event
A as a subset of that space, and the negation of A, “not-A” is represented by the
complement of that subset -- all the members of the space that are NOT in A.

And when you conjoin these sets together, you recover the whole sample space, because
A and not-A partition the sample space into two parts with nothing left over, so when you
put the parts back together, you get the whole space back.
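Here’s a small Python sketch of this point (mine, not the lesson’s), using set difference for
the complement:

from fractions import Fraction

omega = {1, 2, 3, 4, 5, 6}
A = {2}
not_A = omega - A   # the complement: every outcome not in A

def prob(e):
    return Fraction(len(e), len(omega))

print(prob(not_A))                  # 5/6
print(prob(not_A) == 1 - prob(A))   # True: P(not-A) = 1 - P(A)
print((A | not_A) == omega)         # True: the two parts recover the whole space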

We can generalize this point graphically. First let’s set up a convention that is often used to
graphically represent probability relations. I find it really helps to develop your intuition
about these relationships. This only works for certain kinds of sample spaces, but it’s still
very helpful.
Let’s represent the total sample space, the complete set of possible elementary outcomes,
by a square. Call it “omega”. We’ll set the area of this sample space equal to 1. Now, when
we do this, different events, different possible outcomes, can be represented by subsets of
this area. And the probability of those events will be proportional to the area of the
subsets.

So, let event A be represented by a given area. The probability of A is proportional to the
area of this subset -- the larger the area, the more probable the event. If the area of A
includes the whole sample space, omega, then the probability would be 1 -- that event
would happen with certainty. In general this will be a number less than 1.

So, how do we represent not-A on this diagram? Well, not-A is just the complement of A,
it’s the fraction of omega that is NOT in A.

Now, from this we can see a couple of useful relations.

First, as we’ve seen, when we conjoin or take the union of A and not-A, we recover the
whole sample space, which has probability 1. This simply involves adding up the areas,
and they fit together like the pieces of a jigsaw puzzle.

This schema shows us how to think about the probability of any event, or proposition, and
its negation, even ones that are hard to assign a numerical measure to.

For example, if A is the proposition that it’s going to rain tomorrow, and we think this has a
75% chance of being true, then according to this rule we’re compelled to assign a
probability of 25% to the proposition that it won’t rain tomorrow. This is a consistency
constraint on how we’re supposed to reason with probabilities as defined in standard
probability theory. If we don’t do this then we’re not following the rules of standard
probability theory. If you break this rule where standard probability is known to apply, like
when you’re at the roulette wheel or playing cards, then you’ll just be in error, you’ll
misjudge the probabilities of complementary events.

The second thing I wanted to point out is that this rule builds in a rule of standard formal
logic, which is known as the law of the excluded middle. For every proposition A, either A
is true, or its negation is true, there’s no third truth-value that A could have.

It’s an interesting fact that there are logical systems, and mathematical systems, that reject
this law. There’s no room to go into those here, but I just wanted to point out that there are
interesting relationships between probability theory and logic. But we’re doing standard
probability theory here, and it won’t pay to get too distracted, so for the most part I’ll be
ignoring these possible digressions.
Restricted Disjunction Rule: P(A or B) = P(A) + P(B)

In this video we’ll look at the disjunction rule, the rule for calculating the probability of a
disjunction of events, or, in more familiar terms, given probabilities for events A and B,
what is the probability that EITHER A OR B will occur? Statements of the form “A is true
OR B is true” are known as “disjunctions” in math and logic, so that’s where the rule gets
its name.

There’s a more general formulation for this rule, and there’s a more restricted special case.
In this video we’ll just deal with the special case, which occurs when the two events in
question are mutually exclusive, meaning that they cannot both occur at the same time.

Let’s consider dice rolls this time. The probability of any particular number, say, a 2,
coming up on a single dice roll is 1/6, right?

So what would be the probability of rolling EITHER a 2 OR a 6?

P(2 or 6) = ?

Well, getting a 2 or a 6 is more likely than getting just one or the other by itself, so we
know the probability is going to be higher. In this case we can actually count the
elementary outcomes to get the answer.

There are six possible elementary outcomes, and the event in question picks out two of
these outcomes. So the probability of getting a 2 or a 6 is just this ratio, which is equal to
2/6, or 1/3.

The algebraic translation of this reasoning is straightforward. What you’re doing is
ADDING the probabilities of the individual outcomes.

P(2 or 6)
= P(2) + P(6)
= (1/6) + (1/6)
= 2/6
= 1/3
And this is our rule:
P(A or B) = P(A) + P(B)
if events A and B are “mutually exclusive”.
The probability of A or B is just the probability of A, plus the probability of B.

Now, note the disclaimer: the rule only works in this simple form if A and B are mutually
exclusive: either A occurs, or B occurs, or neither occurs, but they can’t both occur at the
same time.

This is the case with our examples. You can’t roll both a 2 and a 6 at the same time on a
single dice roll, these are mutually exclusive outcomes. Similarly, you can’t toss both a
head and a tail on a single coin toss.

Let’s look at another example.

What is the probability of drawing either a face card, or a 10, from a well shuffled deck of
playing cards?

First we’ll need to make sure we’ve identified all the cards.

The face cards include Jacks, Queens and Kings. The ten is just a 10, so this is a total of
four types of cards.

So we’d write this as the probability of getting a face card or a 10, which is just the
probability of getting a jack or queen or a king or a 10.

Now, if these events are mutually exclusive then we can apply the restricted disjunction
rule.

Are they? Sure they are. If you draw any one of these cards you can’t simultaneously
draw any of the others, so they’re all mutually exclusive.
So, our expression would look like this:

P(face card OR 10)
= P(face card) + P(10)
= P(J or Q or K) + P(10)
= P(J) + P(Q) + P(K) + P(10)
The probability of drawing a jack or a queen or a king or a 10 is equal to the probability of
drawing a jack, plus the probability of drawing a queen, plus the probability of drawing a
king, plus the probability of drawing a 10.

Now, what’s the probability of drawing a jack? Well, there are four jacks in a deck of 52
cards, one for each suit. So the probability of drawing a jack is 4/52. And this will be the
same for all the cards.

This is our answer, the rest is just algebra.


P(face card OR 10)
= P(J) + P(Q) + P(K) + P(10)
= (4/52) + (4/52) + (4/52) + (4/52)
= (1/13) + (1/13) + (1/13) + (1/13)
= 4/13

Note however that you can simplify the algebra by recognizing that 4/52 is equal to 1/13.
So the answer is 4/13, which is roughly 0.31, or 31 percent.
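If it helps, here is the same calculation done by brute-force counting in Python (a sketch of
my own; the card encoding, with ranks 11, 12 and 13 standing in for J, Q and K, is
hypothetical):

from fractions import Fraction

deck = {(rank, suit) for rank in range(1, 14)
        for suit in ("spades", "hearts", "diamonds", "clubs")}

def prob(event):
    return Fraction(len(event), len(deck))

face_or_ten = {(r, s) for (r, s) in deck if r in (10, 11, 12, 13)}

# Four mutually exclusive ranks, 4/52 each, so the probabilities just add
print(prob(face_or_ten))  # 4/13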

Before we leave we should look at this expression from the sample space perspective to
get some additional insight into what it means.

If you recall from the tutorial on negations, we showed that you can represent the total
sample space by a unit area, call it “omega”. Subsets on this sample space represent
events, and the probability of the event is just the area of the subspace as a fraction of the
total sample space. The total sample space has area equal to 1.
The restricted disjunction rule looks like this, graphically:

Events A and B are represented by areas on the sample space. The larger the area, the
larger the probability associated with the event. The probability of A or B occurring is
represented by the sum of these areas -- you just add up the areas of A and B, and this
sum will obviously be a larger fraction of the total sample space.

What it means to say that A and B are mutually exclusive is that these areas don’t overlap,
there are no regions of intersection. So if A occurs, B doesn’t occur, and vice versa.

As we’ll see in the next video, if A and B are not mutually exclusive, this means that their
areas DO overlap, and this would correspond to cases where A and B both occur at the
same time. As a result it’s a bit trickier to calculate the sum of the areas, it’s not a simple
algebraic sum of the two areas taken separately. We’ll turn to this now in the next video.
General Disjunction Rule: P(A or B) = P(A) + P(B) - P(A and B)

In the last video we looked at the disjunction rule applied to the special case where the
events in question are mutually exclusive. Now let’s look at the general rule, which also
applies to cases where the events are NOT mutually exclusive.

Here’s our Venn diagram depiction of mutually exclusive events.


The blue square is the whole sample space, the areas of A and B represent the two events
in question, and the size of these areas is proportional to the probability of each event.

In this example of drawing either a King or a Jack from a deck of cards, these are mutually
exclusive events, so we represent these as having no overlap in the sample space, and
the probability of drawing either a King or a Jack is just the algebraic sum of their
individual probabilities.

Now let’s consider a different case. What is the probability of drawing a card that is either a
King or a Spade?

P(K or S) = ?

Let’s start with the first part: what is the probability of drawing a King?

Well, there are four kings in a deck of cards, one for each suit, so that’s just 4 out of 52.
Thus,

P(K) = 4/52

Now what’s the probability of drawing a spade?

Well, there are four suits, so one in every four cards is a spade. So the probability is 1 in 4,
but we’ll write this as 13 out of 52 to make it easier to add them in a second.

P(S) = 13/52
Now if we were to just add up these probabilities to get the probability of drawing a King or
a Spade, the answer would be this:

P(K or S)
= P(K) + P(S)
= (4/52) + (13/52)
= 17/52

But there’s a problem with this answer. The problem is that in calculating the probabilities
for drawing a king and a spade, we’ve double-counted one card; namely, the KING OF
SPADES.

The King of Spades is BOTH a KING AND a SPADE, so this card is included in the
calculation of both probabilities; he’s included in the probability of drawing a King, and he’s
included in the probability of drawing a spade.

But there’s only ONE King of Spades in the deck, so by counting him twice we’re
overestimating the probability of drawing either a King or a Spade, and that’s an error.

Graphically our situation looks like this:


In this case our events are NOT mutually exclusive, there are cases where they overlap,
and in this particular case, the overlap represents the event of drawing the King of spades.
By the way, I know that these areas shouldn’t be the same size, but for this introduction
here it’ll be helpful to keep them the same size.

In a Venn diagram you define the overlap region as “K and S”, the cards that are both
Kings and Spades.

And you can see how if we’re just adding up the areas of K and S then we’d be counting
the overlap region twice. What we’re going for is the area of the white space, that peanut-
shaped area defined by the external boundaries of K and S.
To get THAT area, all you need to do is SUBTRACT the area of the overlap region from
the sum of the two separate areas.

This picture lets you visualize what’s going on.


The probability of drawing a King OR a Spade is represented by that peanut shaped area
on the left, and you get it by subtracting the overlap region from the sum of the two areas.

And this gives us the algebraic expression that we need to fix the error caused by double-
counting the overlap region.

P(K or S) = P(K) + P(S) - P(K and S)

You just add up the probabilities of the two events taken separately, and then subtract the
probability associated with the conjunction of the two events. In this case there’s only one
card that is both a king and a spade, the probability of drawing that card is just 1 in 52.

P(K or S)
= (4/52) + (13/52) - (1/52)
= 16/52

So we subtract 1 in 52 from our sum, and we get the correct answer, which is 16/52, rather
than 17/52.

And here’s the general disjunction rule in terms of arbitrary events A and B:

P(A or B) = P(A) + P(B) - P(A and B)

Notice that this general rule includes the restricted rule as a special case, since if A and B
are mutually exclusive, then A and B don’t overlap and the conjunction term on the right
goes to zero, and we recover the restricted rule.

Let’s look at another example. What is the probability of drawing either a Face Card or a
Spade? A face card, remember, is a Jack, Queen or King, any card with a face on it.
Here’s our rule:

P(F or S) = P(F) + P(S) - P(F and S)

It’s equal to the probability of drawing a face card, plus the probability of drawing a spade,
minus the probability of drawing a card that is BOTH a face card and a spade.

With examples like these you can just count the cards to get the probabilities.

Let’s start with P(F). There are 3 face cards per suit, and four suits, so that gives us 12 out
of 52 cards that are face cards. So P(F) = 12/52.

P(S) is easy, there are 13 cards in a suit, so P(S) = 13/52.

Now, how many cards are there that are both face cards and spades? Well, just the Jack,
Queen and King of spades, so that’s 3 out of 52. So P(F and S) = 3/52.

And the rest is simple algebra:

P(F or S) = P(F) + P(S) - P(F and S)
= (12/52) + (13/52) - (3/52)
= 22/52

and our answer is 22/52, or 11/26, which is roughly 0.42, or 42 percent.

So this is how you use the general rule for calculating probabilities of disjunctions. You just
have to remember to check whether the events are mutually exclusive, and if not, you
need to subtract the probability of the conjunction of the two events, the cases where both
events occur, or where both of the corresponding propositions are true.
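And here’s a quick check of the general rule by counting (my own sketch, using the same
hypothetical card encoding as before):

from fractions import Fraction

deck = {(rank, suit) for rank in range(1, 14)
        for suit in ("spades", "hearts", "diamonds", "clubs")}

def prob(event):
    return Fraction(len(event), len(deck))

kings = {(r, s) for (r, s) in deck if r == 13}
spades = {(r, s) for (r, s) in deck if s == "spades"}

# Subtract the overlap so the King of Spades isn't counted twice
p = prob(kings) + prob(spades) - prob(kings & spades)
print(p)                            # 4/13, i.e. 16/52
print(p == prob(kings | spades))    # True: matches direct counting of the union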
Restricted Conjunction Rule: P(A and B) = P(A) x P(B)

In this video we’ll look at the conjunction rule, the rule for calculating the probability of a
conjunction of events. But we’ll deal with a special case, where the events in question
are independent, and the rule takes on a very simple form.

Let’s consider coin tosses again. The probability of a single coin landing heads is 1/2,
right?
Now what if we toss two coins at the same time? What is the probability that both
coins will land heads?

Most people will see the answer right away, because we’re familiar with these sorts of
cases. We know that the probability of both landing heads is going to be less than the
probability of just one landing heads. When we’re dealing with fractions, we know that
multiplying fractions gives us a smaller number.

In this case, if we multiply the probabilities of the two independent events, we get the right
answer: 1/2 times 1/2 equals 1/4. There’s a 25% chance of both coins landing heads.

If we go for three coins, it’s the same idea. We just multiply the probabilities of each
individual event, and the answer is half as small again, 1/8.

This is the restricted conjunction rule:


P(A and B) = P(A) x P(B)

If A and B are independent events, the probability of the conjunction of two events, which
is just the probability of the two events both occurring, or of the corresponding propositions
both being true, is just the product of the probabilities taken separately.

If the events are not independent, then we need to use a modified version of this rule, but
we’ll take that up in the next video.
Let’s just look at one more example.
Using a six-sided dice, what is the probability of rolling three sixes in a row? That is,
what is
P(6 and 6 and 6) = ?

Dice rolls, like coin tosses, are independent events, so we can use the restricted
conjunction rule. The probability of rolling a six is just one in six. So the probability of
rolling three sixes in a row is just 1/6 x 1/6 x 1/6, which is one in 216,

P(6 and 6 and 6)


= P(6) x P(6) x P(6)
= (1/6) x (1/6) x (1/6)
= 1/216

This is a pretty simple rule, but before we leave I want you to think about what this rule
means from the sample space perspective. In this framework, events are associated with
subsets of the sample space, and probabilities are associated with the ratios of the
corresponding subsets to the total space of possible outcomes.

Here’s the sample space for a single coin toss: {H, T}

It’s just a listing of the set of possible elementary outcomes, and in this case there are just
two, heads and tails. The probability of the coin landing heads is equal to the ratio of
outcomes where it lands heads, divided by the total number of elements in the sample
space, which in this case is obviously just 1 out of 2 or 1/2.

Here’s the sample space for a trial where two coins are tossed:

{(H, H), (H, T), (T, H), (T, T)}

You’ve got twice as many possible combinations of outcomes -- heads heads, heads tails,
tails heads, and tails tails. This sample space has four elementary outcomes, and the
probability of two heads is just the ratio of the number of elements where the outcome is
two heads, divided by the total number of elementary outcomes, which is 4. So the
probability for this event is just 1 in 4.
Here’s the sample space for three coin tosses:

{(H, H, H), (H, H, T), (H, T, H), (T, H, H), (H, T, T), (T, H, T), (T, T, H), (T, T, T)}

There are eight possible combinations of heads and tails, and the probability of landing
three heads is just 1/8.

Here’s the sample space for rolling two dice:

There are 36 possible outcomes. The probability of rolling two sixes is 1/6 times 1/6, or 1
in 36.
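In Python you can generate these sample spaces mechanically with itertools.product,
which is a handy way to check the multiplication rule (a sketch of mine, assuming fair,
independent trials):

from fractions import Fraction
from itertools import product

# All ordered outcomes of three coin tosses: 2 x 2 x 2 = 8 of them
tosses = list(product("HT", repeat=3))
p_hhh = Fraction(sum(t == ("H", "H", "H") for t in tosses), len(tosses))
print(p_hhh)   # 1/8

# All ordered outcomes of three dice rolls: 6 x 6 x 6 = 216 of them
rolls = list(product(range(1, 7), repeat=3))
p_666 = Fraction(sum(r == (6, 6, 6) for r in rolls), len(rolls))
print(p_666)   # 1/216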

It’s rare that you’ll have to write out the sample space like this to solve a problem, but it’s a
helpful reminder of what the mathematical rule means, and why it gives us the right
answers.
General Conjunction Rule: P(A and B) = P(A) x P(B|A)

In the last video we looked at the restricted conjunction rule, which is the rule for
calculating the probability of a conjunction of events, when those events are independent
of one another. What independence means in this context is that if one of the events
occurs, this has no effect on the probability of the other event occurring. In that case you
simply multiply the probabilities for each event.

Now we need to look at the more general case, where the events are not independent.

Let’s consider dice rolls once again. Let’s call E the event that the dice roll is an even
number.

And let’s call P the event that the dice roll is a prime number.

Here’s what those dice rolls look like:

E = {2, 4, 6}

P = {2, 3, 5}

We’re interested in the probability that the dice roll is BOTH even AND prime:

P(E and P) = ?

Now, if we were to use the restricted conjunction rule, we’d just multiply these probabilities
together. Let’s see how that would work.

The probability of a dice roll being even is just 3/6, or 1/2, since we’ve only got six
possibilities and three of those are even. So P(E) = 3/6.
The probability of a dice roll being prime is 3 in 6, since the primes, 2, 3 and 5, make up 3
of the six possible rolls (remember that 1 is not a prime number). So P(P) = 3/6.

Now if we use the restricted conjunction rule, the calculation looks like this:

P(E and P)
= P(E) x P(P)
= (3/6) x (3/6)
= 9/36
= 1/4

We just multiply these numbers together, and we get a final answer of 9 out of 36, which
is equal to 1/4. So according to this calculation, on any given dice roll there’s a one in
four chance of getting a roll that is both even and prime.

But we know this answer can’t be right.

How do we know this? Because by inspection we know that there’s only ONE possible
dice roll that is both even and prime -- it’s the 2.

But if there’s only one possible dice roll that is both even and prime, then we know the
answer. The answer has to be 1/6. But the restricted rule gives us 1/4 -- it overestimates
the probability.

So, this example shows us that the restricted conjunction rule doesn’t apply to this case.
Why doesn’t it apply?

It doesn’t apply because E and P are not independent events. We’re interested in the
probability that E and P are both true of a given dice roll, but if P is true, for example -- if
we know the dice roll is a prime, then that affects the probability that E is true, that it’s
even. If it’s prime then just look at the options: there’s only one even number in that list of
three prime numbers, so the probability of the roll being even is 1/3, not 1/2.

And similarly if we know that the dice roll is even, that affects the probability that it’s also
prime. In this case, if we know it’s even, then our options are 2, 4 and 6, and only one of
those is prime, so the probability of it being prime, given that it’s even, is 1/3.

So, we know that the restricted conjunction rule doesn’t work, and we know it doesn’t work
because the rule doesn’t take into account the probabilistic dependence of the two events
on one another. This gives us an idea for how we might modify the conjunction rule to fix
this problem.

Instead of just multiplying the probabilities of A and B, we can try multiplying the probability
of A with the probability not of B, but of B GIVEN A, or P(B|A):

General Conjunction Rule:

P(A and B) = P(A) x P(B|A)


An expression like this, P(B|A), is called a “conditional probability”. We’ve got a whole
other video on conditional probabilities, but for now it’s enough to just read it as the
probability of B given A.

Let’s try out this new rule with our example.

First of all, let’s remember that we know what the answer is supposed to be, just by
inspection. There’s only one dice roll that is both even and prime, so the probability has to
be 1/6. Let’s see if our general formula actually gives us this answer.

Here’s our formula:

P(E and P) = P(E) x P(P|E)

The probability of a dice roll being even and prime equals the probability of it being even
times the probability of it being prime, given that it’s even.

And here are the numbers:

P(E and P)
= P(E) x P(P|E)
= 1/2 x 1/3
= 1/6

The probability that a roll is even is 1/2. The probability that a roll is prime, given that it’s
even, is 1/3, since among the even numbers, 2, 4 and 6, only one of these, the 2, is prime,
so the probability is 1/3.

1/2 times 1/3 is 1/6, and lo and behold, we get the right answer!

Now, you might be wondering whether it also works the other way, if we consider the
probability of a roll being even, given that it’s prime.

And the answer is “yes it does”. Let’s do it that way. If we switch around the As and Bs in
our general rule, the result still holds.

P(E and P) = P(P) x P(E|P)

When you plug in the numbers you get this:

P(E and P)
= P(P) x P(E|P)
= (3/6) x (1/3)
= 3/18
= 1/6

The probability of a roll being a prime number is 3/6, since 2, 3 and 5 are primes.

The probability of a roll being even, given that it’s prime, is 1/3, since of those three primes,
only one, the 2, is even.
3/6 times 1/3 is 3/18, which is 1/6.

Same result. It works both ways.

So, to sum up, this is the general conjunction rule:

P(A and B) = P(A) x P(B|A)


P(A and B) = P(B) x P(A|B)

It’s written here in two forms, depending on which you choose as the conditional
probability term, but they’re equivalent formulations and they’ll give you the same answers.

Note also how this rule reduces to the restricted conjunction rule when A and B are
independent. In that case, following our definition of probabilistic independence, the
conditional probabilities just reduce to the unconditional probabilities and you recover the
simple restricted rule.
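Here’s a short Python check of both formulations against direct counting (my own sketch,
using the even-and-prime dice example from above):

from fractions import Fraction

omega = {1, 2, 3, 4, 5, 6}
even = {2, 4, 6}
prime = {2, 3, 5}

def prob(e):
    return Fraction(len(e), len(omega))

def cond(a, b):
    # P(a|b): the fraction of b's outcomes that are also in a
    return Fraction(len(a & b), len(b))

print(prob(even) * cond(prime, even))    # 1/6
print(prob(prime) * cond(even, prime))   # 1/6
print(prob(even & prime))                # 1/6: direct counting agrees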

In the next video we’re going to focus our attention on those conditional probabilities.
General Conditional Probability Rule

In the last video we looked at the general conjunction rule, which involves the use of
conditional probabilities. Now if you look at this expression for conditional probability on
the screen …

P(A|B) = P(A and B)/P(B)

… which I’m calling the “general conditional probability” rule, you might notice that it’s
exactly the same formula as the general conjunction rule, it’s just rearranged. This is, in
fact, how conditional probability is defined in standard probability theory. In this video I
want to explore what this formula means from the sample space perspective, with the
hope that it’ll help to develop some intuitions about why it works.

Let me say up front that I’m not going to be talking about the Bayes’ Rule formulation of
conditional probability here, I’m going to save that for another video.

Let’s start off with a simple example to refresh our memories. Consider these two events:
the first event is rolling a 2 on a dice roll. We’ll label this event with the number 2. The
second event is rolling an even number on a dice roll. We’ll label this event “E”. Thus,

2 = the dice roll is 2


E = the dice roll is even

We’ll call a probability a “categorical probability” if we’re just talking about the probability
of an event that is not conditional on other events: we’re just asking about the probability
of A occurring, P(A), not about the probability of A given that some other event occurs.

We contrast “categorical” probabilities with “conditional” probabilities -- here we’re
asking about the probability of A, given that some other event, B, has occurred — P(A|B).
Or in other words, we’re asking about the probability of A, on the condition that B occurs.

So, with these two events, what are the categorical probabilities?

That’s easy: the probability of rolling a 2 is 1/6, and the probability of rolling an even
number is 3/6, or 1/2.

P(2) = 1/6
P(E) = 1/2

But of course we’re interested in conditional probabilities, so a natural question to ask is,
what is the probability of rolling a 2, given that it’s even?

P(2|E) = ?

We know the answer to this just by inspection: if the dice roll is either a 2, 4 or 6, then the
probability that it’s a 2 is just one third. Thus,
P(2|E) = 1/3

Now let’s see how this answer squares with our definition of conditional probability.

Here’s our definition in terms of general events A and B:

P(A|B) = P(A and B) / P(B)

The probability of A given B is equal to the probability of the conjunction of A AND B,
divided by the unconditional probability of B all by itself.

If we substitute our events for this example it looks like this:

P(2|E) = P(2 and E) / P(E)

The probability of rolling a 2, given that it’s even, equals the probability of rolling both a 2
and an even number, divided by the probability of just rolling an even number.

We know the value of the denominator term, it’s just one half; P(E) = 1/2. The numerator
is the only tricky part. It’s a conjunction, and in the last video we covered the general
conjunction rule. In fact this is just another way of writing the general conjunction rule. The
question we want to ask is this:

How many possible dice rolls are there, where the dice roll is both a 2 and even?

Answer: Just one. Rolling a 2 is the only dice roll that is both a 2 and even. And the
probability of rolling a 2 is just 1/6, right? So now we have our numbers:

P(2|E) = P(2 and E) / P(E)
= (1/6) / (1/2)
= 2/6
= 1/3

And this is, indeed, the answer that we figured out just by inspection.
So the formula works. Now, I know for a fact that a lot of students don’t have a good
intuitive sense of why it works. They’re not sure exactly why the conjunction is relevant,
and they’re not sure why we’re dividing by the probability of the conditioning event. To
help see why the formula makes sense, it helps to look at the situation from the sample
space perspective.
Here’s our generic sample space that we’re familiar with from previous videos. The blue
square, labeled “omega”, represents the set of all possible outcomes of a probabilistic trial,
or what we’ve been calling the “sample space”. Events are represented by subsets of this
sample space, and probabilities of events are represented by the area of the subset
associated with a given event.

So in this example the ovals labeled A and B are events, and the areas of A and B are
proportional to the probability of A and B occurring.

If we consider the area of the whole sample space, omega, then we assign this probability
1, which means that the outcome has to land somewhere inside this area.

If we think of a probabilistic trial by analogy with throwing a dart at a board, then we’re
saying that the dart is guaranteed to land somewhere inside the blue square.
Now, in this diagram A and B overlap. This represents events where A and B both occur at
the same time, indicating that these aren’t mutually exclusive events. In previous videos
we used the example of drawing a playing card that is both a face card and a spade. In
this video we used the example of getting a dice roll that is both an even number and a 2.
The area of this overlap region represents the probability that both A and B will occur.

Okay, let’s clear the darts off the board.

And now let’s ask the question, how is conditional probability represented on this
diagram?

Well, here’s our formula for the probability of A given B.

P(A|B) = P(A and B) / P(B)

And let’s think about what’s going on when we ask “what is the probability of A, given B?”.

What we’re saying is that in this case, we know some additional information that we didn’t
know before. We know that event B occurred. This is like saying that we know that our dart
landed somewhere inside B.
So we’re saying, given that we know the dart landed in B, what are the odds that it also
landed in A? In other words, given that we know the dart landed inside B, what are the
odds that the dart landed in the overlap region between B and A?

The overlap region makes up a fraction of B, and that’s precisely the fraction that we’re
trying to estimate with the conditional probability rule. We’re asking for the ratio of the
overlap region to the area of B.

We can say the same thing with a slightly different emphasis. When we know that B is
true, or that B occurred, what we’re saying is that we’re no longer dealing with the whole
sample space. We’re dealing with a reduced sample space, and treating this as our new
“omega”.

The conditional probability of A given B is the area of the overlap region, the events where
A and B both occur, divided not by the area of the original sample space, but by the area
of the reduced sample space, B.
I like the darts, so let’s put them back on.

Now I think it’s much easier to visualize what the numerator and the denominator
represent in the general rule for conditional probability.

Also, notice that this discussion is consistent with what we did when we solved the dice
problem.
Omega is equal to the set of equally probable outcomes 1 through 6. We imagine
assigning an area to this set equal to 1.

The event of rolling a 2 is a subset of this sample space. In this case the 2 takes up
exactly one sixth of the total area, so the probability is 1/6.

The event of rolling an even number is a different, larger subset. It takes up exactly one
half of the total area, so the probability is 1/2.

Now, if we’re considering the probability of rolling a 2, given that it’s even, we’re
dealing with a reduced sample space. We’re treating the evens as the new sample space,
and looking at the proportion of events corresponding to the number 2, as a fraction of this
new sample space. And we get the answer, 1/3.

Note that in this case the overlap is complete. The 2s are entirely within the evens. This
corresponds to a sample space that looks like this:
Here the overlap is complete, with the 2 a proper subset of the evens. But the formula
works all the same.

If A is included inside B, then the intersection of A and B is just equal to the area of A. In
this case, the intersection of 2 and E is just equal to the area of 2. This simplifies the
calculation.

The numerator is just the area of 2, and the denominator is just the area of E. Plugging in
the probabilities we get the answer, 1/3.

Now, for the sake of completeness, let’s work it the other way. What’s the probability of
the dice landing even, given that it’s a 2?
Well we know the answer to this already, it’s got to be equal to 1, since 2 is an even
number. The calculation gets it right too.
We use the fact, once again, that the area of overlap is just equal to the area of 2, so P(E
and 2) is just equal to P(2). And now the conditional probability is just 1/6 over 1/6, which
is equal to 1. It’s like asking, what’s the probability that the dart landed on the 2, given that
it landed on the 2? That’s a sure bet!
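The dart picture translates directly into a little Monte Carlo sketch (my own, with made-up
regions chosen to mirror the 2-inside-the-evens case: B covers half the square, and A, a
sixth of the square, sits entirely inside B):

import random

def in_B(x, y):
    return x < 0.5                 # B is the left half of the board: area 1/2

def in_A(x, y):
    return x < 0.5 and y < 1 / 3   # A sits inside B: area 1/2 x 1/3 = 1/6

random.seed(0)
hits_B = hits_A_and_B = 0
for _ in range(100_000):
    x, y = random.random(), random.random()  # throw a dart at the unit square
    if in_B(x, y):
        hits_B += 1
        if in_A(x, y):
            hits_A_and_B += 1

# Conditional probability: darts in A-and-B as a fraction of darts in B
print(hits_A_and_B / hits_B)   # close to 1/3 = P(A and B) / P(B)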

Okay, that wraps up this introduction to the general conditional probability rule in
probability theory. I hope this video gives you a better sense of why the rule has the form it
does and why it works.

In the next video in this series we’re going to be looking at a special form of the conditional
probability rule known as Bayes’ rule.
Total Probability Rule

We’re working up to a derivation of Bayes’ Rule from the general rule for conditional
probability. An important step in that proof involves an application of what’s called the “law
of total probability”. This is actually a very handy rule, very useful for working out certain
kinds of problems, and it’s just interesting enough to warrant its own video, so here it is.
We’ll start off with an algebraic derivation using Venn diagrams that will help us
understand that scary expression above, and then look at an example calculation.

The basic idea behind the law of total probability is that you can think of the
unconditional probability of an event as a sum of conditional probabilities, where
you’re conditioning over all the various alternative ways that the event could come
about.

In this Venn diagram, which you should be pretty familiar with by now, we’re looking at a
representation of various kinds of events.

How many? At first glance it looks like three. There’s event A ...

... represented by this oval shape. There’s B, represented by this one ...

And there’s the event where A and B both occur, represented by the overlap region.
But there are more events on this diagram. There’s the event associated with A NOT
occurring, which looks like this:

We call this the logical or set-theoretic complement of A -- everything in the event space
which is outside of A, covering all the cases where A does not occur.

Similarly, there’s the complement of B:

which represents “not-B”, the event in which B does not occur.


And you could have the area outside of the overlap region as well, representing all the
cases where A and B don’t occur at the same time, but I want to look at a specific overlap
region ...
... I want to look at the overlap between A and not-B. These are the events that are
inside A, that are also OUTSIDE of B, the events where A occurs but B does NOT occur.

So, what’s the point of this exercise? The point is that we have more than one way of
representing an event, and we can use these alternate constructions to give us information
about an event.

Consider for example event A again. Here it’s just a simple event and we’re not paying
attention to what events it’s overlapping with. But when we do pay attention, we see the
following ...

Event A can be viewed as the sum of two overlap regions. The first is the overlap
region between A and B. The second is the overlap region between A and NOT-B. This
gives us a new logical representation of event A:

A is just the sum of (A and B) and (A and not-B).

But then we immediately have a representation for the probability of A, since these areas
correspond to probabilities.
P(A) = P(A and B) + P(A and not-B)

The probability of A is equal to the probability that A and B will both occur, plus the
probability that A will occur and B not occur.

This is a version of the law of total probability.

This doesn’t quite correspond to the big long formula we saw at the start of the video with
all the conditional probabilities in it, but to get there we just do a little rearranging and apply
the general conjunction rule, which, if you recall, expresses conjunctions in terms of
conditional probabilities.

So we start with this expression:

P(A) = P(A and B) + P(A and not-B)

And we can switch the order of terms, since conjunction is a commutative operation; you
get the same answer either way:

P(A) = P(B and A) + P(not-B and A)

And now we just apply the general conjunction rule to these conjunctions:

P(A) = P(B) x P(A|B) + P(not-B) x P(A|not-B)

We need to pay attention to the order when applying this rule so we don’t get confused
about what’s playing the role of the As and the Bs, but it’s just a simple substitution.

And finally, the result of this substitution is just the law of total probability that’s on the title
slide at the beginning of the video.

One thing I want to note here is that there’s nothing special about having only two events,
A and B. In general you can imagine A overlapping with any number of events.
Here are two events, B and C, that overlap with event A. Event B is everything to the left of
the red line in the event space, event C is everything to the right of the red line in the event
space.

The law of total probability says that we can represent the probability of A as the sum of
the probabilities of the intersections of these two events with event A.

Substituting our formulas for the conjunctions in terms of conditional probabilities, we get
this expression, which is the law of total probability for this case:

P(A) = P(B) x P(A|B) + P(C) x P(A|C)
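
Although we’ve only written out the two-alternative case here, the same pattern holds in
general. If B1, B2, ..., Bn are mutually exclusive and jointly exhaustive alternatives
(exactly one of them must occur), then the law of total probability reads:

P(A) = P(B1) x P(A|B1) + P(B2) x P(A|B2) + ... + P(Bn) x P(A|Bn)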

Okay, let’s look at an example total probability calculation.

You’re given two boxes, each with ten balls in them. You’re told that box 1 has 5 red balls
and 5 green balls in it, and box 2 has 7 red balls and 3 green balls. You can’t see inside
them, so you don’t know which is which. You pick a box at random and draw out a ball.

Question: What is the probability that the ball you drew out is green?
This is a typical total probability question. What you’re actually given is a lot of information
about conditional probabilities, and from this you’re supposed to work out
an unconditional probability, the probability of drawing a green ball.

Let’s first define our events:

Let G be the event that the drawn ball is green.


Let B1 and B2 be the events of picking box 1 and box 2 respectively.

Now let’s look at the equation for total probability. Here’s the equation:
P(G) = P(B1) x P(G|B1) + P(not-B1) x P(G|not-B1)
P(G) = P(B1) x P(G|B1) + P(B2) x P(G|B2)

In the second line I’ve simplified the expression in an obvious way: in this context, the act
of not choosing box 1 is equivalent to choosing box 2, since the two boxes are mutually
exclusive and exhaustive alternatives.

So this reads, the probability of drawing a green ball is equal to the probability of choosing
box 1, times the probability of drawing a green ball out of box 1, plus the probability of
choosing box 2, times the probability of drawing a green ball out of box 2. Now we write
down the information we have and see if we can fill in these terms.

The probabilities for choosing box 1 and box 2 are easy, they’re both one half or point-five
since we’re randomly choosing between them. So

P(B1) = 0.5
P(B2) = 0.5

And the probabilities for drawing green given that we’re in a particular box are easy too,
since they’re just the fractions given: half the balls in box 1 are green, so that’s 0.5, and 3
of the 10 balls in box 2 are green, so that’s 0.3. Thus,

P(G|B1) = 0.5
P(G|B2) = 0.3
And now we just substitute these values:

P(G) = P(B1) x P(G|B1) + P(B2) x P(G|B2)


P(G) = (0.5)(0.5) + (0.5)(0.3)
P(G) = (0.25) + (0.15)
P(G) = 0.4 or a 40% chance

Add these up and you get the probability of drawing a green ball as 0.4, or 40%.

Which makes sense. It should be less than 50%, since in box 2 only 30% of the balls are
green. But it should be higher than 30%, since there’s a 50% chance that you’ll pick box 1,
which has more green balls in it. The answer is a weighted sum of the two conditional
probabilities.
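
If you like checking this kind of arithmetic with code, here’s a minimal sketch in Python. It
isn’t part of the course, and the helper name total_probability is just my own label for
illustration.

def total_probability(priors, conditionals):
    # P(A) = P(B1) x P(A|B1) + ... + P(Bn) x P(A|Bn), where the Bi are
    # mutually exclusive and exhaustive alternatives.
    return sum(p * c for p, c in zip(priors, conditionals))

# The two-box example: P(B1) = P(B2) = 0.5, P(G|B1) = 0.5, P(G|B2) = 0.3.
p_green = total_probability([0.5, 0.5], [0.5, 0.3])
print(round(p_green, 2))  # 0.4 -- a 40% chance of drawing a green ball
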
We’re in a good position now to introduce Bayes’ Rule. Bayes’ rule is intended to answer
a different version of this problem.

Here’s the problem:

Given that a green ball was in fact drawn, and not knowing which box it was drawn
from, what is the probability that it came from box 1, or alternatively, box 2?

This is a typical Bayes’ Rule question.

Our intuition tells us that it’s more likely to have come from box 1 than box 2, since box 1
has more green balls, but Bayes’ Rule can tell us exactly what this probability is, given the
information we have. So let’s move on now to Bayes’ Rule.

Bayes’ Rule

In this video we’re going to take a look at Bayes’ Rule, which is arguably the most famous
rule of probability theory. It’s famous because it provides a model for an important
process in human reasoning, namely, learning from experience. Bayes’ rule can tell us
how we should modify the strength of our belief in a particular hypothesis after
we’ve learned some new bit of evidence. In this video we’re just going to show how
Bayes’ Rule follows from the general rule for conditional probabilities, and look at two
example calculations.

Here’s the general rule for conditional probability. It serves as the basic definition of
conditional probability in probability theory:

P(A|B) = P(A and B) / P(B)

We can rewrite this formula by swapping out the numerator with the general formula for
conjunctions:

P(A|B) = [P(A) x P(B|A)] / P(B)

This is just a rearranged form of the general conditional probability rule, so really we’re just
swapping around terms here.
What we have now is the simplest form of Bayes’ Rule. But there’s a more useful
formulation that is more commonly used, which we get by rewriting the denominator in
terms of the total probability of B:

P(A|B) = [P(A) x P(B|A)] / [P(A) x P(B|A) + P(not-A) x P(B|not-A)]

If you don’t have a clue where the expression for total probability comes from, I’d
recommend watching the previous tutorial where we derived it. But this is the “working
version” of Bayes’ Rule; it’s the version we generally use in calculations.

Now let’s look at the problem we left off with at the end of the last tutorial and see how
Bayes’ rule can be used to solve it.
We’re given two boxes, each with a different proportion of red and green balls in them. Box
1 has 5 red and 5 green balls, while box 2 has 7 red and 3 green. A box is randomly
chosen, we don’t know which one, and a ball is drawn. The ball is green.

Question: What is the probability that the ball came from box 1? Or alternately, from
box 2?

The information on the right summarizes what we know about the setup. The probability of
picking any given box is 0.5 because it’s random. And the probability of picking a green
ball out of any particular box is given by the proportion of green balls in the box, which are
0.5 and 0.3 respectively.

Now, what is it that we want to calculate? It’s this

P(B1|G),

the probability that box 1 was selected, given that the ball is green.
Now let’s look at Bayes’ Rule.

This is the general rule in terms of generic As and Bs. We need to rewrite this in terms of
G, B1 and B2.

Note that A is going to become B1, the event of choosing box 1. What is not-A? Well, the
setup says that we’ve only got two options, box 1 or box 2, so if it’s not box 1 then it’s got
to be box 2. So not-A is going to be replaced with B2, the event of selecting box 2.

And here’s what we get when we make those substitutions:

P(B1|G) = [P(B1) x P(G|B1)] / [P(B1) x P(G|B1) + P(B2) x P(G|B2)]

This is Bayes’ Rule for this particular question. And notice that every term in this
expression is known; they’re all right there in the information given. So now we just
substitute:

P(B1|G) = (0.5 x 0.5) / [(0.5 x 0.5) + (0.5 x 0.3)]

and we get this. You see now how the numbers are actually pretty easy to work with.
When we evaluate the products and do the sums we get the answer:

P(B1|G) = 0.25 / 0.40 = 0.625

The probability that the ball came from box 1 is 0.625, or 62.5 percent.

This example illustrates how the Bayes calculation models learning from experience.
Before we knew the color of the ball, we were completely ignorant about which box was
chosen, and the probability of it coming from box 1 was just 50%, reflecting this ignorance.
Now, after having come to know the color of the ball, we can revise the probability of the
hypothesis, and we see that it’s more likely that the ball came from box 1 than from box 2.
Which is exactly what we would expect, given the proportions of green and red balls in the
boxes, but Bayes’ rule gives us a precise estimate of how much more likely it is. That’s the
power of the rule.

Now, can you guess what the probability is of the ball coming from box 2? There are two
ways to do this, a short way and a longer way. The short way is just to realize that we’ve
only got two possibilities here and their probabilities have to add up to 1, so the short way
is like this:

P(B2|G) = 1 - P(B1|G)
= 1 − 0.625
= 0.375

The probability is 0.375, or 37.5%. The long way is to solve Bayes’ Rule for box 2 instead
of box 1. That setup looks like this:

P(B2|G) = [P(B2) x P(G|B2)] / [P(B1) x P(G|B1) + P(B2) x P(G|B2)]

And when you plug in the right values, you get this:

P(B2|G) = (0.5 x 0.3) / [(0.5 x 0.5) + (0.5 x 0.3)]

And when you evaluate the products and do the sums you get this:

P(B2|G) = 0.15 / 0.40 = 0.375

which is the same answer. So it works either way.
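
If you want to check these numbers in code, here’s a small Python sketch. The bayes
helper is my own illustration, not anything from the course; it handles exactly two
mutually exclusive and exhaustive hypotheses.

def bayes(prior_h, likelihood_h, prior_alt, likelihood_alt):
    # P(H|E) = [P(H) x P(E|H)] / [P(H) x P(E|H) + P(not-H) x P(E|not-H)]
    numerator = prior_h * likelihood_h
    return numerator / (numerator + prior_alt * likelihood_alt)

print(round(bayes(0.5, 0.5, 0.5, 0.3), 3))  # P(B1|G) = 0.625
print(round(bayes(0.5, 0.3, 0.5, 0.5), 3))  # P(B2|G) = 0.375
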
Now, let’s look at another example.
Here’s a typical sort of problem you often see on math tests.

“Two computer companies sell computer chips to a technology company. Company A
sold 100 chips of which 5 were defective. Company B sold 300 chips of which 21 were
defective.”

Question: What is the probability that a given defective chip came from Company B?

When we approach a problem like this the first thing we should do is define our variables
and write down what we know in terms of those variables.

So let
A = the chip came from Company A
B = the chip came from Company B
D = the chip is defective

The question we’re being asked to solve is, what is the probability of B, that the chip came
from company B, given D, that it was defective? That is, what is

P(B|D) = ?
Now let’s write down what we know.
P(D|A) = 5/100 = 0.05
P(D|B) = 21/300 = 7/100 = 0.07
These are the conditional probabilities of getting a defective chip from each of the
companies, respectively. You can just read these off the question, 5 out of 100 and 21 out
of 300, which give us 0.05 and 0.07 when we simplify them.

Now, we’re also going to need the unconditional probabilities for A and B, the prior
probabilities P(A) and P(B) before taking into account the new information that the chip
was defective.

In this case the prior probabilities aren’t distributed equally across the alternatives. The
question says that 400 chips in total were bought, with 100 coming from company A and
300 coming from company B. So any random chip is more likely to have come from B than
from A.

More precisely, the probability of a chip coming from Company A is 0.25, or 25%, and
from Company B is 0.75, or 75%.

P(A) = 100/400 = 0.25


P(B) = 300/400 = 0.75
These prior probabilities are also called “base rates”. People often ignore information
about base rates when estimating probabilities; this mistake is known as the “base rate
fallacy”. We’ll be looking more closely at base rate fallacies in the course on fallacies of
probabilistic reasoning.

Now, to solve this problem, all we have to do is apply Bayes’ Rule. When we substitute our
values we get this:

P(B|D) = [P(B) x P(D|B)] / [P(A) x P(D|A) + P(B) x P(D|B)]
P(B|D) = (0.75 x 0.07) / [(0.25 x 0.05) + (0.75 x 0.07)]
P(B|D) = 0.0525 / 0.065
P(B|D) = 0.81 (rounded), or 81%


The prior probability of any given chip coming from company B was 75%. But once we
learned that the chip was defective, and took into account that company B actually has a
worse defect rate, percentage-wise, than company A, this information raised the
probability that the defective chip came from company B to 81%.
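
And once more in code, for anyone who wants to double-check the arithmetic. This is just
a sketch of the same calculation; the variable names are mine, not from the course.

p_a, p_b = 0.25, 0.75                  # base rates: 100/400 and 300/400
p_d_given_a, p_d_given_b = 0.05, 0.07  # defect rates for companies A and B

posterior_b = (p_b * p_d_given_b) / (p_a * p_d_given_a + p_b * p_d_given_b)
print(round(posterior_b, 2))  # 0.81 -- about an 81% chance it came from company B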

Okay, I think this will do for an introduction to Bayes’ Rule. In the tutorial course on
fallacies of probabilistic reasoning we’ll come back to Bayes’ rule and talk about how most
people’s intuitive judgments about conditional probabilities are flawed because they fail to
consider base rates, the prior probabilities of events, in their estimates. This can have very
serious consequences when we’re dealing with, for example, a doctor’s estimate of how
likely it is that someone has HIV, given that they’ve tested positive for the virus. But we’ll
save that discussion for another course.

How to Write a Good Argumentative Essay


Introduction

In this tutorial series we’re going to be looking at the process of organizing and writing a
GOOD argumentative essay.

An argumentative essay is sometimes called a “persuasive” essay. It’s an essay that tries
to persuade the reader to accept some thesis or conclusion. It does this by providing the
reader reasons to accept the thesis.

So, an argumentative essay is really just an ARGUMENT. Now, in other tutorial courses
I’ve talked a great deal about recognizing and evaluating arguments, but I’ve never talked
about essays or essay writing. So in this tutorial course we’re going to focus on how to
write arguments in essay format.

In this introduction I want to say a few words about what makes an argumentative essay
an essay, rather than just an argument. That’s point number 1. But what I really want to
talk about is why it’s important that you know HOW to write one. That’s point number 2.
And finally I’ll say a bit about how we’re going to proceed in the tutorials that follow.

Okay, let’s start with what makes an argumentative essay an essay. What makes it an
essay is that it has a certain conventional structure. We’re all familiar with the basic
elements of this structure -- an essay will have, minimally, an introductory section, a main
body, and a concluding section. There’s a lot more to it than this, but these are the main
elements that everyone will recognize.

This structure is just a convention -- it’s not the only way you can present an argument --
but it’s an established convention.

Why? Because it has proven to be an effective and efficient means of communicating
complex ideas and arguments to readers who may not know much about the issue to
begin with, and may not even know whether they’re interested in the issue.

We’re going to talk a lot more about the functions of the various parts of an essay in later
tutorials, so I won’t say any more about this here.
Now, why is it important to learn how to write in this style?

Well, for starters, many classes in high school, college and university require you to write
argumentative essays. If you happen to be bad at this then you’re going to be penalized
for it over and over.

Second, and this is a frustrating point for many students, in MOST classes you’re expected
to know how to write in proper essay format before you come to class. Unless you’re
taking a composition class where essay writing is the subject of study, your teachers aren’t
going to spend much time lecturing on essay writing technique. So you end up learning, if
you ever do, by trial-and-error, and the error usually comes at the price of a grade that is
lower than it has to be.

And third, once you’ve internalized the logic of the argumentative essay style, once you’ve
understood the rationale for the conventions, then you can transfer that same logic to any
situation that requires presenting an argument to an audience. Situations like presenting
a sales pitch to a client, writing a memo to your boss requesting a raise, or delivering a
closing argument in a court case.

So the point here is that understanding the logic of the traditional essay format can help
you to construct more persuasive arguments in any form, in any situation. It’ll be more
obvious how this works once we’ve looked more closely at the logic of the essay format.

Okay, this is how we’re going to proceed in this tutorial series. First, we’ll review the basic
elements of the traditional argumentative essay style, and the focus here will be on
understanding the logic, the rationale, for why the conventions are what they are. We’ll
answer questions like,
why do introductions have the structure they do?
why does the main body have to have this structure?
why should conclusions be constructed like this?
and so on.

Along the way we’ll look at some good and bad examples of these elements to illustrate
the main ideas.

And finally, I’m going to present a short example of a bad student essay. Or maybe we
should just call it an essay that needs some work. The topic is “Should Teachers Be
Allowed to Ban Laptops in the Classroom?”.

We’ll analyze the logical structure of the essay, discuss recommendations for improving it,
and then I’ll present a revised and improved version of the essay based on those
recommendations.

After all this you should have a pretty good idea of what a good argumentative essay is,
how it’s organized, and how to go about writing one.

One final note:


In this tutorial series the focus is on the logic of the basic argumentative essay format.
There’s a lot that I won’t be covering. I’m not going to be focusing on things like, for
example, citation styles and footnotes and bibliographies, or how to use research tools, or
avoid plagiarism, and so on. These are all important topics, but they aren’t the focus of this
course.
Part 1: Guidelines for Structuring an Argumentative Essay
1.1 A Minimal Five-Part Structure

In this tutorial I’m going to review the minimal five-part structure that an essay has to have
to qualify as a good argumentative essay, and talk a bit about strategies for organizing this
structure on the page.

Now, by “minimal” I mean that any good argumentative essay is going to have at least
these five elements or parts. They can have many more parts, but they can’t have any
fewer.

As we’ve seen, an essay will have at least these three parts, an introduction, a main body,
and a conclusion. We’ll talk more about what should go into the introduction and the
conclusion later. Here I want to focus on the main body of the essay.

The main body is obviously going to include the main argument of the essay. This is the
argument that offers reasons in support of the main thesis of the essay.

Now, technically we could stop right here. We’ve got an essay and we’ve got an argument,
so we’ve got an argumentative essay, right?

Well, we’re not going to stop here. Why? Because our aim isn’t just to write an
argumentative essay. Our aim is to write a good argumentative essay, and a good
argumentative essay is always going to have more structure than this.

In fact, a good argumentative essay is going to contain at least three distinct arguments
within the main body.

For starters, a good argumentative essay is always going to consider an OBJECTION to
the main argument that was just given, and this objection is itself going to be an argument.
The conclusion of this argument, the objection, is that the main argument that was just
given is in fact a BAD argument, that the main argument fails in some way. It’s going to
argue that the main argument relies on a false or implausible premise, or that the logic is
weak, or that it fails to satisfy some other necessary condition for an argument to be good.

Now, why do we need to consider objections? Remember, we’re aiming for a good
argument -- we want our essay to give the most persuasive case possible for the intended
audience of the argument. But it’s important to remember that the intended audience of the
argument isn’t the people who are already inclined to agree with your thesis -- that’s what
we’d call “preaching to the choir”. If this was your audience then you wouldn’t need to give
an argument in the first place, since they’re already convinced of the conclusion. No, for an
argumentative essay, we have to assume that our audience is the people who aren’t
convinced yet of the main thesis, who are inclined to be skeptical of the conclusion and will
be looking for reasons to reject your argument.

So, if your essay is going to have any hope of persuading this audience, it’s going to have
to consider the skeptic’s point of view. That’s why any good argumentative essay is always
going to have a section that deals with objections to the main argument.

Of course raising an objection isn’t going to help your case unless you can come up with a
convincing reply to it. If you can’t meet the objection then it’ll have the opposite effect,
you’ll be making the case for the opposition. So a good argumentative essay is also going
to have a section where you defend your argument by replying to the objections raised.

It’s important to remember that the objection is a distinct argument, and the reply is
another distinct argument. The conclusion of the objection is that your main argument is a
bad argument. The conclusion of your REPLY is that the objection just given is a bad
objection.

So, the main body of your argumentative essay is actually going to contain at least three
distinct arguments: a main argument, an objection and a reply.

This is where we get the minimal 5-part structure. The introduction is the first part, then
you’ve got at least the three arguments in the main body, giving us four parts, and the
conclusion makes five.

I call this a minimal five-part structure because it’s the bare minimum that an essay has to
have if it’s going to qualify as a good argumentative essay. You can summarize it by
saying that a good argumentative essay is going to have an introduction and a conclusion,
and a main body where an argument is presented, objections are considered and replies
are offered that defend the argument against the objections.

Now, here’s a very important point about objections. It may be tempting to pick a weak
objection, one that’s easy to refute, and reply to that. But doing this won’t strengthen your
argument, because it won’t satisfy a thoughtful skeptic. What the skeptic wants to know is
how you would respond to what they consider the strongest and best objections. If you can
successfully refute what your audience regards as the strongest objections to your
position, then you’ve got the best chance of winning them over.

So, a good argumentative essay is always going to look for the strongest possible
objections to its main argument, present them accurately and fairly, and then attempt to
systematically respond to those objections.

Now, here’s a question that my students sometimes ask me. Let’s say you’ve developed
what you think is a pretty good argument, and then you come across an objection to that
argument that really stumps you -- it really does seem to point out a weakness in your
argument, and you honestly don’t know how you should respond to it. Now what do you
do? How do you proceed with the essay?

Well, if you were only concerned with the appearance of winning the argument then you
might consider using a rhetorical device, like misrepresenting the objection in a way that
makes it look weaker than it actually is, and then responding to that weaker version. But if
you’ve seen the tutorial course on fallacies then you’ll know that in doing this you’d be
guilty of a fallacy, the straw figure fallacy, and more importantly, a thoughtful critic will likely
see it as a fallacious move too, and it may actually weaken your case in the eyes of your
intended audience, which is the opposite effect of what you intended.

I think that if you’re really stumped by an objection, then you can do one of two things.
One, you can change your mind, you can accept that your argument fails, and either give
up the thesis or look for a better argument for it.

But maybe you’re not willing to give up your argument so soon. In the face of a tough
objection, there’s nothing wrong with saying “That’s a good objection, I’ll have to think
about that.” Maybe with a little thought you can come up with a good response. But until
then, in my view, rationality dictates that you should at least suspend judgment about
whether your argument is really as good as you thought it was. Maybe it is and you can
come up with a good defense, but maybe it’s not -- what you’re admitting when you can’t
come up with a good reply is that you’re not in a position to be confident about that.

Okay, another question. We’ve got this three-part structure to the main body, with a main
argument followed by an objection and then a reply. The question is, should this be the
way you actually organize the essay on the page, with a section devoted to the main
argument, followed by the objection, followed by the reply?

The answer is yes, you could, but no, you don’t have to. The logical structure I’ve given
here is what people will be focusing on when they try to extract the argumentative content
from your essay, but just as you can write the same argument in many different ways, you
can organize an argumentative essay in many different ways that preserves the same
logical structure.

How you choose to organize it will depend on a bunch of different things, like whether your
audience is already familiar with the main argument, or whether an objection is going to
focus on the truth of a specific premise or whether it’s going to challenge the logic of the
main argument taken as a whole, or whether you’re going to focus on lots of different
objections rather than one big objection, or whether you’re going to focus more on replies
to common objections, and so on. And some of it will come down to stylistic choices, how
you want to lead the reader through the argument. There’s no one set way of doing this.

Just to illustrate, here’s an example of an alternative organizational structure. You start off
presenting your main argument. You lay out premise 1 and premise 2 of your main
argument, but you anticipate that premise 2 is going to be contentious for some audiences,
so instead of waiting to address the natural objection, you deal with it right here. You
consider the objection to premise 2, and you respond to the objection right away. Then you
move on and finish the argument.

Now your main argument is presented, you’ve dealt with one objection, but maybe now
you want to consider another objection, one that turns on the logic of the argument as a
whole. So you raise that objection and follow up with a reply.

This is a perfectly good way of presenting the argument to the reader, even though some
of the replies and objections are mixed into the presentation of the main argument.

This is also a perfectly good way of organizing the essay into paragraphs. Not every
element in the reasoning needs its own paragraph, it all depends on context and how
much actually needs to be said to make a particular point. For example, sometimes you
can state an objection in a single sentence. Let’s say that the objection to premise 2 above
can be phrased as a single sentence. Then it might be very natural to combine the reply
and the objection into a single paragraph.
There are no set rules for how to do this, and you might find yourself adding and deleting
and reorganizing paragraphs as you work through the essay, but however you organize it,
the three-part structure of argument, objection and reply needs to be clear.

Okay, we’ve covered a lot here, so let’s sum up.


An argumentative essay has a minimal five-part structure. It has an introduction, a
conclusion, and a main body that itself contains at least three distinct arguments.
The main argument of the essay is a distinct argument, but you also have to consider the
strongest objections that you can think of, and offer replies to those objections, and each
of these is a distinct argument as well.
And finally, the organization of the logical elements of the main body can vary. You can
present a whole argument, then proceed to list objections, then consider replies, or you
can consider objections and replies on the fly, as you work through the main argument.
Regardless, your final paragraph structure should reflect the logical structure of these
argumentative elements, however that logical structure is organized.

1.2 Writing the Introduction

Every essay, if it’s following standard form, will have an introduction. In this tutorial we’ll
look at what should and shouldn’t go into the introductory section of an argumentative
essay.

To write a good introduction you need to know what functions an introduction is supposed
to serve. An introduction has several distinct functions, but they all come down to making
life easier for readers.

First, an introduction needs to tell the reader what the general subject matter of the essay
is, what the issue is that you’ll be discussing in the essay.

Second, unless the issue is well known to everyone in your audience, you might also need
to provide additional background information to help explain and set up the issue. How
much background will depend on the issue and what you can assume about your intended
audience. The key is that when you do finally state your main thesis, the reader has a
good idea of what you’re saying and what the issue is about.

That gets us to the third function, to state your main thesis. By “main thesis”, all we mean
is the conclusion of the overall argument of the essay, what you’re trying to argue for. One
of the most common problems with student essays is a failure to be clear about what the
main thesis of the essay is. This needs to be stated as clearly as possible in the
introduction, before you get into the main body.

Finally, if your argument has any kind of complexity to it at all, then it can be very helpful to
let the reader know what to expect in the remainder of the essay, how it’s going to be
structured and organized. You can think of it as providing a roadmap or plan or outline of
how the argument is going to proceed. For smaller, simpler essays these roadmaps may
not be vital, but they become more and more important for both the reader and the essay
writer as an essay becomes longer and more complex.

Something to watch out for if you’re going to give a roadmap of this kind is to make sure
that you actually do in the essay what you said you’d do in the introduction. The
introduction sets up expectations for the reader, and you want to do your best to fulfill
those expectations.

That’s what goes into the introductory section of an argumentative essay. It’s also
important to remember what doesn’t go in. Another common mistake that students make
with introductions is to begin describing arguments or providing other kinds of information
that really belong in the main body of the essay. The introduction is for setting up the main
argument, providing background and context so the reader is prepared to understand and
follow the arguments in the main body, but that’s it. Once you start giving premises and
considering objections that pertain to the main thesis of your essay, you’re not
“introducing” your essay anymore.

One last point: In essay writing guides people will often refer to the “introductory
paragraph” of the essay. It’s true that sometimes you can state what you need to state in
one paragraph, especially if the essay is short and simple, but more often you’ll need more
than one paragraph to introduce the issue, state your thesis and sketch the outline for the
essay.

So it’s more accurate and more helpful to talk about the introductory section of an essay,
where it’s understood that this introductory section can include more than one paragraph.

Okay, let’s look at an example. Here’s the introductory paragraph of a student essay on
the ethics of fighting in hockey.

“We’ve all seen hockey players drop the gloves and start swinging. Fighting is part of the
game of hockey that shouldn’t go away because it helps to regulate aggressive players
and is part of the entertainment value of the game that hockey fans enjoy. In this essay I
will argue that fighting should be allowed in hockey. Some people object that fighting in
hockey sends the message to children that violence is acceptable, but I will argue that
fighting actually prevents more injuries than it causes.”

Okay, the first thing to say is, yes, there are some style and sentence structure issues that
could be improved, but it’s important to distinguish issues of style from issues of function,
so for now let’s ignore the style issues. In terms of function, what does this introduction do
right?

Does it clearly introduce the issue? Is there a clear thesis statement? Does it tell the
reader what to expect in the remainder of the essay?

This introduction actually does a pretty good job on all three counts. It’s clear that the issue
is about the ethics of fighting in hockey.

Mind you, there’s still some room for clarification. Someone might read this and wonder
whether the issue is about whether fighting in hockey should be banned, or whether it’s
about fighting in hockey as a general moral issue. These two aren’t necessarily the same
thing. I might judge an action or a practice to be morally wrong but not necessarily agree
that the practice should be banned. And it’s also unclear whether this is about professional
hockey or whether it’s meant to include amateur hockey, and if so, what age-range of
players is being considered. So there’s room for improvement in clarifying precisely what
the issue is, but it’s still not too bad.

Now, does this introduction have a clear thesis statement? Yes it does!

The writer makes it clear in a couple of places what side he’s going to come down on in
this debate, but the clearest place is right here, in that middle sentence, “In this essay I will
argue that fighting should be allowed in hockey”. There are lingering questions about
precisely what this means, but there’s no ambiguity about what side of the issue the writer
is on.

Now, does the writer give us an idea of what to expect in the rest of the essay? Yes, he
does, especially in that last sentence:

“Some people object that fighting in hockey sends the message to children that violence is
acceptable, but I will argue that fighting actually prevents more injuries than it causes.”

This tells us something about the argumentative structure of the essay. We know that the
writer is going to consider an objection and present a reply to the objection, and we’re told
what issues the objection and the reply are going to address.

Does an introduction NEED this kind of outlining? For a shorter essay maybe not -- this is
partly a matter of preference -- but you’ll never go wrong by adding some discussion of
how the argument is going to proceed; it’s a good habit to pick up.

So this introduction gets a few important things right. Does it do anything wrong?

Well, if I were editing this I’d recommend that the writer re-think that second sentence:

“Fighting is part of the game of hockey that shouldn’t go away because it helps to regulate
aggressive players and is part of the entertainment value of the game that hockey fans
enjoy.”

This sentence tells us a lot about how the argument of the essay is going to go. I think it
tells us TOO MUCH -- it’s actually giving an argument for the main conclusion, and that’s
not what an introduction is for. This belongs in the MAIN BODY of the essay. In an
introduction you can talk about the argumentative issues that your essay is going to
address, but you want to reserve the presentation of these arguments for the main body.

The reason for this is simply to avoid confusing the reader. You don’t want to start arguing
for one side of an issue before you’ve finished explaining what the issue is.

So, if I were to summarize my editorial comments on this introduction, the key suggestions
would be to move this argumentative bit to the main body, and to spend a little more time
clarifying the issue -- is the claim just that fighting in hockey shouldn’t be banned, or that
fighting is actually a good thing, a desirable feature of the game; is it about fighting only at
the higher levels of amateur and professional hockey, or all levels? And so on.
I’m not going to bother rewriting this introduction here; the example is meant simply to
illustrate the thought process that goes into writing introductions. The main idea is to think
about the functions that an introduction is supposed to serve and to make sure that your
introduction fulfills those functions.
1.3 Writing the Conclusion

Every essay, if it’s following standard form, will also have a conclusion. In this tutorial we’ll
look at what should and shouldn’t go into the concluding section of an argumentative
essay.

Just as with introductions, to write a good conclusion you need to know what functions a
conclusion is supposed to serve. And just as with introductions, these functions are
designed to make life easier for the reader.

The introduction and the conclusion are like bookends that help to frame the essay,
impose some structure on it, and make it easy to get a grip on what the essay is about and
how the conclusion is going to be defended. Many readers will skim the introduction or the
conclusion of an essay to determine whether it’s something they’re interested in and worth
reading through.

Okay, let’s get to it.

The first thing that a conclusion should do is restate or summarize the main thesis or
conclusion of the main argument of the essay, what the essay argued for.

Second, the conclusion should briefly summarize the key argumentative moves that were
made in the essay. So you’re reminding the reader not just what you argued for, but how
you argued for it.

And third, the concluding section of an essay is a place where the writer can give some
additional commentary on the argument or the issue. This is optional, and there aren’t any
hard and fast rules about what sorts of comments are appropriate or inappropriate here.
Some writers use the conclusion as an opportunity to comment on the significance of the
issue, or point to questions that need further research.

What you should avoid doing, however, is adding additional argumentative material in the
conclusion. You shouldn’t be introducing new material relevant to the main argument. If
you find yourself wanting to add additional argumentative content in the conclusion, you
should think about how that material can be integrated into the main body, because that’s
where it belongs.

And finally, as we mentioned with introductions, in spite of the fact that many essay guides
will talk about the “concluding paragraph”, a conclusion can, and often will, require more
than one paragraph to do the job properly. So it’s better to think of it as the “concluding
section” of the essay, rather than the concluding “paragraph”.

Here’s an example. This is the conclusion of the essay on fighting in hockey, after it had
been through a couple rounds of editing.

“In professional and amateur hockey, fighting appears to be an accepted part of the culture
of the sport. Some have argued that fighting in hockey should be banned, or at least
penalized with the same severity as we see in other sports, like baseball or basketball. In
this paper I’ve tried to defend the toleration of fighting in hockey. I argued that the tradition
of using “enforcers” in hockey to pick fights with aggressive players on opposing teams
helps to protect smaller and more vulnerable players from more serious injury, by
functioning as a deterrent against the dirtiest and most dangerous behaviors. A natural
objection is that a general ban on fighting would also curtail these more dangerous
behaviors, but I offered reasons to believe that a general ban would not be as effective at
preventing the intentional infliction of injury as some might hope. It may be counter-
intuitive, but hockey with enforcers may actually be a safer sport than hockey without
enforcers.”

This paragraph performs all of the functions that we expect of a conclusion. It restates the
main thesis, and it outlines the key argumentative moves that were made in the essay,
including, in this case, an objection and a reply. Anyone reading this would have a clear
idea of what the essay was about and what was argued for, and that’s the key function of
the concluding section of an essay.
Part 2: A Sample Essay with Some Problems (and Strategies for Fixing Them)
2.1 The Essay: Should Teachers Be Allowed to Ban Laptops in Classrooms?

In Part 2 of this tutorial on how to write a good argumentative essay, we’re going to take a
look at an example of a short essay that has some problems, and spend the rest of the
tutorials diagnosing these problems and discussing solutions.

This example is a somewhat edited version of a real essay that was submitted as part of a
classroom writing assignment. I’ve got permission from the author to use his essay here.
In keeping with the main theme of these tutorials, I’m mostly going to ignore problems with
style and focus more on problems with organizational structure and function.

Should Teachers Be Allowed to Ban Laptops in Classrooms?

I know some college teachers are starting to ban laptops from classrooms. I think that
laptops should not be banned. Yes, some students surf the web or play games in class,
but that doesn't mean the rest of us who use laptops responsibly should be punished for
the actions of a few. In this essay I will argue that using laptops is a right that teachers
should not infringe upon.

For starters, many of us who have grown up with computers have very poor handwriting
and sometimes our fingers get sore if we have to use a pen to write a lot. That's why I like
to take notes on my laptop, I can type much faster than I can write, and I can keep my
notes organized in one place.

A second argument for laptops is that students should have a right to take notes as they
choose. We pay good money for our courses and we all have different learning styles, so
we should be free to choose the methods that work best for us.

Teachers complain that having a laptop is too much temptation for some students. They
just can't keep themselves from browsing Facebook or playing solitaire in class, so they
don't pay attention and miss out on important information or don't participate in class
discussions. To this I say that college students are adults and need to be treated as adults,
and that means they should take responsibility for their own education. If someone wants
to chat on Facebook all day let him, it's his choice to fail, not the teacher's.

In conclusion, I feel strongly that laptops should not be banned from classrooms. Laptops
may be a distraction for some students, but that's not a good enough reason to ban them.
2.2 Analysis: The Introduction

Let’s take a look at the introduction.

I know some college teachers are starting to ban laptops from classrooms. I think that
laptops should not be banned. Yes, some students surf the web or play games in class,
but that doesn't mean the rest of us who use laptops responsibly should be punished for
the actions of a few. In this essay I will argue that using laptops is a right that teachers
should not infringe upon.

The introduction should tell me what the issue is, what the essay is about. I think this
introduction does a pretty good job telling me what the issue is. It’s clear that we’re talking
about the use of laptop computers in college classrooms, and specifically whether
teachers should be able to ban the use of laptops. And because the issue is familiar
enough to most people it doesn’t need a lot of additional background information to clarify
it.

By the way, it’s helpful to remember the distinction between the issue being discussed in
the essay and the thesis of the essay. The answer to the question “what is the issue being
discussed?” can always be answered with a “whether” statement: “whether marijuana
should be legalized”, “whether Pete Rose should be allowed into the baseball hall of
fame”, “whether teachers should be allowed to ban laptops”, and so on.

The thesis, on the other hand, is the writer’s answer to this question. It’s the conclusion of
the main argument of the essay. You can always answer the thesis question with a “that”
statement: my thesis is “that marijuana should not be legalized”, “that Pete Rose should be
allowed in the hall of fame”, “that teachers should not be allowed to ban laptops”.

And that leads naturally to our next question, whether the thesis of this essay is clearly
stated in the introduction.

I think it’s pretty clear what the thesis is here, though I’ll point out a minor ambiguity. This
author states the thesis in two places, in the second and last sentences of the paragraph:

I know some college teachers are starting to ban laptops from classrooms. I think that
laptops should not be banned. Yes, some students surf the web or play games in class,
but that doesn't mean the rest of us who use laptops responsibly should be punished for
the actions of a few. In this essay I will argue that using laptops is a right that teachers
should not infringe upon.

The first statement says that laptops shouldn’t be banned; the second states that teachers
shouldn’t ban laptops because students have a right to use them.

That second statement is more specific and has more content than the first statement. It
says not only that laptops shouldn’t be banned, but also why they shouldn’t be banned. I
like this thesis statement better precisely because it’s more informative about what the
argumentative issue is that the essay is going to be addressing.

Now, it remains to be seen whether the rest of the essay actually makes good on this
thesis statement.
Another thing we like to see in an introduction, especially if the essay is longer or the
argumentation a bit more complex, is an outline or description of how the argument is
going to proceed, what the rest of the essay is going to look like.

But this is a short essay and it doesn’t really need an outline section to help the reader
follow along. It wouldn’t hurt to have one, especially if after rewrites you found the essay
becoming longer or more complex than you originally thought, but at this stage I wouldn’t
penalize the essay for not having an outline. However, if you look at this introduction you
do get a suggestion of how the argument is going to go.

I would recommend that the author be aware of the information they put in the introduction
(and what they don’t put in the introduction), because this information sets up expectations
in the reader’s mind about what they’re going to see in the essay, and you don’t want to
say you’ll do things that you end up not doing.

Anyway, to sum up, not knowing anything else about what’s in the essay, this is a pretty
good introduction. It may have to be re-written in light of what we actually find in the main
body of the essay, but at the very least it presents the issue clearly and has a clear thesis
statement.
2.3 Analysis: The Main Body: First Argument

Let’s look at the first argument that we encounter in the main body of the essay.

For starters, many of us who have grown up with computers have very poor handwriting
and sometimes our fingers get sore if we have to use a pen to write a lot. That's why I like
to take notes on my laptop, I can type much faster than I can write, and I can keep my
notes organized in one place.

Now, the first thing I would say to this author is that if there’s an argument here at all, it
needs to be clarified, because as it stands it sounds like what you’re giving me in this
paragraph is an explanation for why you happen to prefer using a laptop to type your
notes. But that by itself isn’t an argument for why teachers shouldn’t be allowed to ban
their use.

So, my first question to the author would be, how are these facts supposed to bear on the
issue? How do we get from this to the conclusion that teachers shouldn’t be allowed to ban
laptops?

We need to try to understand what the author was really trying to get at here.

What’s actually going on here, it seems, is that the author is responding to a possible
objection that questions the necessity of using laptops.

The teacher says “Why can’t you just take notes with pen and paper? You don’t really
NEED to use a laptop to take notes.”

This is relevant, right? Because if using a laptop to take notes really is nothing more than a
matter of personal preference, then it’s hard to see how that could supersede a teacher’s
right to conduct their class as they see fit.

But this paragraph looks like it’s a response to this objection: It’s saying “No, I DO need to
use a laptop. The quality of my note-taking will suffer if I don’t use a laptop. It’s not just a
matter of preference.”

Okay, let’s assume that this was the author’s intent. Now, you still need to connect this to
the main conclusion in some way.

I want to point out that the conclusion we’re going for is a moral conclusion: it says that
teachers shouldn’t be allowed to ban the use of laptops in classrooms. So at some point
the author has to think about how these facts about sore fingers and slow note-taking are
relevant to this moral claim.

Now, it’s a generally accepted principle of moral reasoning that you can’t derive a moral
conclusion from purely descriptive premises. Somewhere in the premises you need to
refer to a general moral principle or a statement about moral values.

So my question to the author is, what sort of moral argument are you going for? You
should make this explicit so that readers can see how the claims you make in this
paragraph are relevant to the conclusion.
If someone were to ask me, I think a natural way to develop this argument (maybe not the
only way) is to cast it as a fairness issue. Banning the use of laptops would be UNFAIR to
the students who rely on laptops for their note-taking; taking their laptops away would
UNFAIRLY DISADVANTAGE them relative to the other students in the class who don’t
rely on them.

So if we go with this strategy, then the argument might look like this:

1. Banning the use of laptops will disadvantage certain students in the classroom (those
that really benefit from the use of laptops...).

2. Teachers should not adopt classroom policies that disadvantage certain students but
not others.

Therefore, teachers should not ban the use of laptops in classrooms.

“Banning the use of laptops will disadvantage certain students in the classroom” --
namely, those students who, like the author, have poor handwriting, write slowly, get sore
fingers, are more disorganized with paper notes, etc.

“Teachers should not adopt classroom policies that systematically disadvantage certain
students but not others.” This is the moral premise that’s doing all the work for us, a
premise that basically says that, all other things being equal, a policy that disadvantages
one group of students but not others, is unfair, it’s unjust discrimination.

Now, what we’ve done here is the sort of thing I’d try to work through with a student if they
were looking for feedback on a draft. We’re trying to clarify and make explicit the reasoning
that’s really animating this paragraph.

From here I might have some suggestions for rewriting the paragraph to make this logic
clear, but I’d usually let the student take a crack at it first.

In this tutorial, all we’re doing is argument analysis, so I’ll stop here. Later on we’ll look at a
rewritten version of the essay that incorporates some of this analysis.

2.4 Analysis: The Main Body: Second Argument

Now let’s look at the second argument we encounter in the main body.

A second argument for laptops is that students should have a right to take notes as they
choose. We pay good money for our courses and we all have different learning styles, so
we should be free to choose the methods that work best for us.

Unfortunately, this is a confusing pair of sentences. It sounds like there are some different
competing considerations going on at once. That first sentence just begs the question, if
taken as an argument. If not taken as an argument, then it just restates the conclusion.

All the action seems to be in the second sentence. If we try to reconstruct this as an
argument, the reasoning seems to rely on two premises:
1. Students pay good money for their classes.
2. Students have different learning styles.
Therefore, teachers should not ban the use of laptops in classrooms.

The premises are both true, but it’s not at all clear how our intended conclusion is
supposed to follow from this. As it stands it’s clearly a weak argument.

Nor is it obvious how we might charitably repair this argument in a way that reflects the
author's intentions, because the intentions aren't clear. How is the point about paying good
money relevant to the conclusion? Are the two premises intended to work together or are
they really intended to be separate points? If you have to be a mind-reader to properly
reconstruct an argument, that’s a bad sign, and this is a case where you have to be a
mind-reader.

So, given this, I would advise the author to rethink what they're trying to say here, and
especially whether they really think that the money issue is relevant. And the different learning
styles thing seems like it belongs more naturally with the first argument, where you can
think of different note-taking methods as part of different "learning styles".

My advice would be to either rethink this paragraph or get rid of it entirely.


2.5 Analysis: The Main Body: Third Argument

Now let’s look at the third argument that we encounter in the main body of the essay.

Teachers complain that having a laptop is too much temptation for some students. They
just can't keep themselves from browsing Facebook or playing solitaire in class, so they
don't pay attention and miss out on important information or don't participate in class
discussions. To this I say that college students are adults and need to be treated as adults,
and that means they should take responsibility for their own education. If someone wants
to chat on Facebook all day let him, it's his choice to fail, not the teacher's.

This is nice. The author is considering an objection to the main argument and offering a
reply. Remember that an objection is itself an argument, so we should be able to state the
objection as an argument, distinct from the reply, and for our purposes it’ll be helpful to do
so. So let’s restate this objection in a way that fills in the reasoning.

What we’re trying to do is make explicit the reasoning in the first half of the paragraph.
Remember, this is going to be an argument for giving teachers the right to ban laptops
from the classroom.

Here’s a premise that summarizes the point being made here: “Some students are unable
to resist the temptation to use laptops in ways that interfere with their ability to learn and
participate in class.”

Next premise: “Banning laptops would remove this obstacle to learning for certain
students.”

Third premise — this is the relevant moral premise — “Teachers have a right to set
classroom policies that remove obstacles to learning and fulfill the educational goals of the
class.”

Remember, what we’re doing here is trying to reconstruct the argument underlying the
objection. This third premise isn’t stated explicitly in the essay, but what we’re doing is
reconstructing an argument that we think best fits the author’s intentions. It’s a moral
conclusion, so we need to refer at some point to a moral premise, and one like this would
do the job.

Now we can infer the moral conclusion: “Therefore, teachers have a right to ban laptops
from the classroom.”

Here's the full reconstructed argument written in standard form:

1. Some students are unable to resist the temptation to use laptops in ways that interfere
with their ability to learn and participate in class.

2. Banning laptops would remove this obstacle to learning for certain students.

3. Teachers have a right to set classroom policies that remove obstacles to learning and
fulfill the educational goals of the class.

Therefore, teachers have a right to ban laptops from the classroom.


Okay, in the second half of this paragraph, the author gives a reply to this objection.
Remember that a reply is also an argument. And it’s an argument for a specific conclusion
-- namely, that the objection just given is a bad objection. If we grant that in this objection
the conclusion follows from the premises, then the reply is going to have to target the truth
of one of the premises.

So let’s take a look at the author’s reply:

To this I say that college students are adults and need to be treated as adults, and that
means they should take responsibility for their own education. If someone wants to chat on
Facebook all day let him, it’s his choice to fail, not the teacher’s.

The key idea here is expressed in these lines: “college students are adults and need to be
treated like adults” and “they should take responsibility for their own education”.

So, how do these points challenge any of the premises in the objection?

This response seems to be directed at the “paternalism” underlying the objection, the
notion that teachers are like parents, or father figures -- the root word of paternal is “pater”,
which is Latin for “father” -- and teachers know what's best for students and have a right to
force them to do what they judge to be in the students' best interest.
The author's response is to say "no", you don't have that right, or at least not an
unconditional right. Part of being an adult is being free to make bad choices and taking
responsibility for those choices.
In terms of the argument as reconstructed above, this reply is really a challenge to premise 3 — the
premise that says that teachers have a right to set classroom policies that remove
obstacles to learning and fulfill the educational goals of the class. Teachers don't have an
unconditional right to set classroom policies as they see fit. In some cases the rights of the
students outweigh the rights of the teacher, and the implication is that this is one of those
cases.
So, as we’ve reconstructed it here, this comes down to a “conflict of rights” issue -- the
rights of teachers versus the rights of students.

And that’s how I would recommend the author of this essay frame the objection and the
reply, as a conflict of rights issue. It already does this to a certain extent, but I would push
the author to make the reasoning more explicit.
2.6 Analysis: The Main Body: Evaluation and Recommendations

In this video I’ll summarize the overall logic of the argument presented in the essay, and
compare it with the organizational structure recommended in Part 1 of this tutorial series.
Then I’ll offer some suggestions on how to strengthen the argument of the essay.

The original essay has five paragraphs, three of which constitute the main body of the
essay. The author intended for these three paragraphs to be read as giving three separate
arguments for the conclusion, but our analysis showed that this wasn’t really the case.

The first paragraph does give an argument, what we might call the “hardship” argument.
This is the one that says that removing laptops would impose an unfair hardship on
students who really benefit from the advantages of taking notes on a laptop. If you take
away their laptops then those students are at an unfair disadvantage in the classroom.

The second paragraph, however, doesn’t really have any argumentative content. All it
does is repeat the conclusion, that teachers don’t have a right to take away students’
laptops. Then there’s the point about students “paying good money for their education”, as
though that fact alone is supposed to entitle them to use their laptops, but this point isn’t
developed, and it’s not clear how the author thought it should be relevant to the moral
issue, so it was hard to know how to reconstruct an argument.

The third paragraph is interesting in that it doesn’t really present a third argument as such;
rather, it presents an objection to the main conclusion and offers a reply to the objection.

Once we did some argument reconstruction, we saw that the objection was that some
students who use laptops in class are going to be distracted and won’t learn as well, and
teachers have a right -- maybe even a duty -- to set policies that remove obstacles to learning,
and therefore they have a right to ban laptops from the classroom, for the sake of those
students who just can’t help but be distracted by the presence of their laptops.

The reply focused on the assumption that teachers have a right to set whatever classroom
policies they want if they think the policies are in the best interests of the students. The
author challenges this paternalistic assumption, arguing that it treats college students like
children who can’t take responsibility for their own educational choices. But college
students aren’t children, they’re adults, and teachers should treat them like adults, and if
that means a student fails because they can’t stay away from Facebook during class, then
so be it.

Okay, my first recommendation is to get rid of paragraph 2, since it’s not really doing any
work for us.

Now we can work on developing the arguments in the first and third paragraphs.

The first thing I want to point out is that the objection considered in paragraph three isn’t
really an objection to the “hardship” argument given in paragraph 1.

Remember the hardship argument is based on the claim that some students would be
disadvantaged by the loss of their laptops, and the claim that teachers shouldn’t adopt
policies that unfairly discriminate against certain students. The objection considered in
paragraph 3 doesn’t address either of these claims. The objection does challenge the main
conclusion, the main thesis of the essay, but it’s really a separate argument against the
conclusion, it’s not targeting the premises or the logic of the argument given in the first
paragraph.

So, my next recommendation is that the author consider an objection and a reply to this
argument, the hardship argument.

The principle I’m appealing to here is that a good argumentative essay should consider
objections to every distinct argument for the main thesis that is presented in the main
body. Objections to one argument don’t automatically count as objections to other
arguments.

So, let’s go back and take a look at this argument and ask ourselves what a natural
objection to it might be.

1. Banning the use of laptops will disadvantage certain students in the classroom (those
who really benefit from the use of laptops...).

2. Teachers should not adopt classroom policies that disadvantage certain students but
not others.

Therefore, teachers should not ban the use of laptops in classrooms.

As presented here, the logic works fine; if there’s a problem, it’s with the plausibility of the
premises.

Now, I’m inclined to accept premise 2, that teachers shouldn’t adopt policies that
disadvantage some students but not others. But only if the disadvantage is significant -- if
the disadvantage is minor, a mere inconvenience, then premise 2 isn’t so compelling. So
the question is whether the disadvantage to students caused by removing laptops is a
significant disadvantage.

So, a weakness of this argument is that premise 2 is only plausible if the hardships
imposed on students by banning laptops are significant.

Consequently, the natural objection is this: the disadvantages, the hardships, imposed on
students by banning laptops are, in general, not significant. If you’ve got a student with a
disability that’s one thing, but if the complaint is that your fingers get tired fast, or your
handwriting isn’t all that clear, or you’re forced to use a paper filing system rather than an
electronic filing system, that sounds more like an inconvenience than a genuine hardship.

Now, if this is the objection, then you can only reply in one of two ways. You could argue
that even if the disadvantages are only minor inconveniences, teachers still shouldn’t be
allowed to adopt policies that impose those disadvantages. I’m not sure off the top of my
head how to defend that, but that’s one way to go.

The other way to reply is just to argue the empirical issue: “No, the disadvantages imposed
ARE significant, for some students.”

You could make this reply stronger by presenting, say, the results of studies that show
what fraction of students use typing as their primary mode of written communication, or
studies on learning styles and the value of supporting a diversity of learning styles in the
classroom. And so on, you get the idea.

Now, let’s go back and take a look at the objection-reply pair in paragraph 3.

A question that might naturally arise here is whether the author should make an effort to
first present an argument that can serve as the target of the objection, so that we have a
nice 3-argument set, with argument, followed by objection, followed by reply.

My answer is sure, you could do that, but you don’t have to, there are lots of ways of
organizing the argumentative points here that could be equally effective.

Sometimes an argumentative essay is structured around responses to possible objections
to the main thesis, so the format is closer to “Here’s my claim. Tell me why I should reject
it”, and the burden of proof is passed on to the opposition to provide compelling arguments
against the claim, and the essay focuses on systematically replying to possible objections.
That’s a perfectly good format, and you can use some of that format here, in this part of
the essay.

A more important issue is whether the reply is as strong as it could be. Once again, here’s
the reply from the original essay:

To this I say that college students are adults and need to be treated as adults, and that
means they should take responsibility for their own education. If someone wants to chat on
Facebook all day let him, it's his choice to fail, not the teacher's.

You can see the focus is on treating students as adults, and we interpreted this as a
challenge to the unconditional truth of premise 3 above. Yes, teachers have a right to set
classroom policies that remove obstacles to learning, but this right isn’t absolute. In this
case it conflicts with the right of students to choose their own learning styles, and it’s
unjustifiably paternalistic, it treats students like children rather than like adults.

That’s the reply. My concern about this reply -- and this is what I would tell the author of
this essay -- is that it pits one rights claim against another, the teachers’ versus the
students’, but it’s not clear in the reply why the students’ interests in having the freedom to
fail should outweigh the teacher’s interests in optimizing the learning experience for
students.

This is an important concept in moral reasoning. When you pit rights claims against one
another, or moral considerations of any kind against one another, you’ve got a situation
like a teeter-totter, where the moral issue turns on which claim is stronger -- are they equally strong,
or does one outweigh the other? If they’re equally strong then you’re at a stalemate, it’s
unclear what the policy should be.

However, if the student’s rights clearly outweigh the teacher’s rights in this case, then we
judge the policy to be wrong and the students win.

But it goes both ways, if the teacher’s rights clearly outweigh the student’s rights, then the
policy is justified and the teachers win.

The problem, from an argumentative standpoint, is that different people may have different
intuitions about which rights claim is stronger. Relying on people’s intuitions about the
case is risky, because you might have people who grant the setup but think that the
teacher’s rights in this case outweigh the students’ rights.

So, in a case like this, what you might want to do is offer some additional reasons why one
set of rights claims should outweigh the other set. The author of this essay needs to say
why the students’ rights claims should outweigh the teacher’s rights claims. As it stands the
essay doesn’t give us any additional reasons.

So how do you do this? There are different ways you could do it. (A good tutorial course
on moral reasoning would help, but that’s a future project).

Here’s one way: You could reason by analogy. Let’s consider some other policies that
would have the effect of removing obstacles to learning.

Claim: Coming to school tired and hungry impedes student learning.

Clearly a true claim. So, if teachers have a right to impose policies that remove obstacles
to student learning, then why not impose this policy?

Policy: All students are required to sign a contract promising that they’ll eat three well-
balanced meals a day and get at least 8 hours of sleep at night.

Heck, why not require that they all wear monitors that keep track of their food intake and
sleep periods, so we don’t have to rely on the honor system? That would be even more
effective.

Or how about this?

Claim: Students who work more than 20 hours a week at part-time jobs do worse in
school, on average, than students who work fewer hours.

Let’s say this claim is true, that students who work more than 20 hours a week are likely to
do worse in school than those who work fewer hours.

Why, then, shouldn’t teachers be allowed to put limits on student work hours?

Policy: Students are not allowed to work more than 20 hours a week at part-time jobs.

You see where this is going. Most of us will think “no, these policies are not defensible”.
Even if these policies would improve student learning, our intuition is that teachers don’t
have a right to micro-manage the lives of students, it’s a violation of a student’s right to
non-interference, and it fails to respect the autonomy of students, their right to make and
take responsibility for their own choices.

So the question is, why isn’t a ban on laptops similar?

This is an example of reasoning by analogy, or reasoning by “similar cases”. You present
a series of cases where the intuitions are clearer, and then claim that the case under
consideration is similar in all relevant respects to the cases you just presented; therefore,
rationality dictates that the intuitions in those cases should carry over.

Anyway, this is meant only to give some idea of how one might beef up this part of the
essay.

I’m not saying this line of reasoning is ultimately persuasive, arguments from analogy are
notoriously vulnerable to certain kinds of objection -- like, whether the cases really are
similar in all the relevant respects -- but on the whole, offering some considerations like
this might be helpful in strengthening the reasoning in this section.

Okay, that was long, but we’re done with my recommendations for strengthening the logic
of the essay.

As a point of summary, let me just note the two principles that informed my evaluation and
recommendations in this section.

The first is that a good argumentative essay should consider objections to every distinct
argument presented in the main body. We saw that in the essay we’ve been looking at, the
author didn’t consider possible objections to the first argument given, so we had to give
that some thought.

The second principle is that when a moral issue is framed as a conflict of rights or conflict
of values issue, and it’s not obvious to your audience which rights or values should
outweigh the other, you need to provide additional argumentation in favor of one side. This
was the guiding principle in my recommendations for strengthening the objection-reply pair
in paragraph 3.
2.7 Analysis: The Conclusion

After that lengthy discussion of the main body it may seem a bit anti-climactic to look at the
conclusion, since it’s only two lines long, and when we rewrite this essay in light of this
discussion, the conclusion will likely be rewritten as well. But for the sake of completeness
let’s do it anyway. Actually, I’ll use this discussion to highlight two important points about
conclusions.

Here’s the conclusion of the essay:

In conclusion, I feel strongly that laptops should not be banned from classrooms. Laptops
may be a distraction for some students, but that's not a good enough reason to ban them.

Is this a good conclusion? Well, it could be better, but it could also be worse.

What’s good about it is that, one, it restates the main thesis of the essay -- laptops should
not be banned from classrooms.

And two, in that second sentence it gives some indication of how the conclusion was
argued for.

These are things we like to see in a conclusion.

On the other hand, this conclusion also has a couple features that I think students should
avoid when they can. So let’s talk about those here.

First, this expression, “I feel strongly that laptops should not be banned”. I see these sorts
of expressions a lot in student essays -- “I feel that”, “I believe that”, “in my opinion”, and
so on. And students also seem to think that by being more emphatic about it -- by saying,
“I feel STRONGLY that”, or “I FIRMLY believe that” -- they’re somehow making the
conclusion more persuasive, or the argument stronger.

But the fact is that this kind of language actually tends to weaken an essay rather than
strengthen it.

Now, why is this? Well, compare

“I feel strongly that laptops should not be banned.”

with

“Laptops should not be banned.”

These two sentences assert very different things. For one thing, the subject and predicate
of each sentence are completely different.

The subject of the first sentence is ME, or rather YOU, the author of the essay.

And the predicate is what? The predicate is “feel strongly that laptops should not be
banned.” So the sentence is about YOU, the author, and it asserts something about how
you feel, namely, that you feel strongly that laptops should not be banned.
When I see a sentence like this, I have to admit that in my mind, I find myself saying “I
thought this essay was about the pros and cons of laptop use in the classroom, not about
your feelings about laptop use.” I’ll return to this point later.

Now compare this to “Laptops should not be banned.” What’s the subject?

The subject is “laptops”. And what’s the predicate? The predicate is “should not be
banned”.

So, this sentence isn’t about the feelings or beliefs of you, the author, it’s about laptops
and their use in the classroom.

Remember, this is what the author is supposed to be arguing for in this essay, this is the
conclusion.

And here’s the point. The essay isn’t about you or how strongly you feel about the
conclusion. The fact that you may feel strongly about it is irrelevant to the conclusion, it
carries no argumentative weight whatsoever.

I remember once when I was an undergrad student in a philosophy class, and we had to
submit weekly writing assignments on the assigned readings for the class. And when my
professor returned the first couple of assignments to me I saw all this red ink in my
concluding section, and it said something to the effect of “Ack, this was going so well until the
end! You have such a strong voice right up until the end, and then you start qualifying your
argument with ‘I feel that ...’ and ‘I think that ...’ and ‘I believe that ...’”

And then he wrote this, which has stuck in my head ever since:

“No one cares what you believe. They only care about why you believe it.”

Now, let me be clear about the point that my professor was trying to make here, and the
point I want to make to the author of this essay. I, as a person, may very well care about
and be interested in what you, as a person, feel and believe about a particular issue. But
from the standpoint of an argumentative essay, where the goal is to provide good reasons
for your audience to accept a claim, the fact that you may feel strongly about the claim is
not, by itself, a reason for anyone else to accept that claim. That’s the sense in which your
feelings about the issue are irrelevant.

And that’s why expressions like “I feel strongly that …” tend to weaken rather than
strengthen the argumentative force of your essay.

If the arguments that you’ve given in your essay are good then that should be enough to
persuade the reader, and that should be the focus of your concluding statement, not your
personal convictions about the conclusion.

Switching into this mode can also be distracting because the subject of the conclusion
suddenly becomes you and your beliefs and feelings, and this can come across as
amateurish. If you do this a lot in an essay then the essay will read more like a diary entry
or an essay about you and your personal feelings, rather than an argumentative essay
about the issue in question.
So, that’s a long way of saying that you should avoid phrasing like “I feel strongly that ...”,
and stick with phrasing like the following:

“In this essay, I have argued that …”
“I have shown that ...”
“It was demonstrated that ...”
“Reasons were given to believe that ...”

where the emphasis isn’t on the author’s feelings about the conclusion, but on the reasons
that were presented in the main body of the essay.

Okay, that was the first comment I wanted to make about this conclusion. Here’s the
second.

I said earlier that one thing this conclusion does well is that it gives some indication of how
the conclusion was argued for. I was referring to this second sentence,

“Laptops may be a distraction for some students, but that’s not a good enough reason to
ban them.”

It points to the conflict of rights issue that we talked about in the third section of the main
body.

However, saying it does a good job at indicating how the conclusion was argued for is
really overstating it. It doesn’t do a very good job. It has two problems:

First, it’s too vague. Or I should say, if it’s to function as a summary statement of how the
conclusion was argued for, it’s too vague. If it’s just functioning as additional commentary,
then I guess it’s not too vague, but that’s not what this looks like.

Second, it gives an incomplete picture of the logic of the argument. More specifically, it
only refers -- obliquely -- to the argument developed in paragraph three, the paternalism
argument. It doesn’t say anything about the hardship argument developed in the first
paragraph.

The principle here is that if you’re going to bother reviewing the argumentative moves that
you made in the essay, don’t tell only half the story, tell the whole story. Otherwise you risk
misleading the reader who jumps to the conclusion looking for a summary review of the
essay.

So, in summary, here are my recommendations for this conclusion:

One, get rid of the “I feel strongly” language. You want to talk about what you argued for,
what you demonstrated, not what you happen to feel or think about the conclusion.

And two, give a more complete summary of the argumentative moves in the essay.
2.8 The Essay: Improved Version

In this video I’m going to present a rewritten and hopefully improved version of the essay,
“Should Teachers Be Allowed to Ban Laptops in Classrooms”. This version might strike
you as a completely different essay, it’s so heavily rewritten, but as I’ll try to show in the
commentary on the next video, the major changes in the introduction, the conclusion, and
the logical structure of the main body were all motivated by the analysis and
recommendations we just discussed. There are stylistic changes too, the writing style is a
bit more mature, but what I want to focus on is the logical and organizational structure of
this rewritten version, and the continuities between this version and the original version.

Should Teachers Be Allowed to Ban Laptops in Classrooms?

Introduction
It is increasingly common to see students using laptops in college classrooms. Many
students use laptops for taking notes, and others use the internet to help research points
of interest that are relevant to class lectures. However, it is also common for students to
spend time in class casually browsing the net, instant-messaging and reading Facebook
pages, or playing games. Some college teachers have found laptops so distracting to
students in their classes that they have banned their use. This policy invites the question,
should college teachers be allowed to ban laptops from the classroom?

In this essay I will argue that teachers should not be allowed to ban the use of laptops in
classrooms. The essay will attempt to defend two claims: first, that a ban on laptop use
unfairly discriminates against students who will be disadvantaged by not using a laptop for
note-taking in the classroom; and second, that a ban on laptop use is an unjustifiable
infringement on the rights of students to make and take responsibility for their own
educational choices.

A Ban on Laptop Use is Discriminatory


I use a laptop for taking notes. I and many other students have grown up using computers
and keyboards as our preferred mode of written communication. We rarely hand-write
anything, our handwriting is hard to read, and our fingers get tired and sore if forced to
write for extended periods of time. Some of my college classes are seventy-five minutes
long, some as long as three hours. For myself and many other students, it is a
considerable hardship to be forced to take handwritten notes in these classes. In addition,
I organize my notes electronically, using a system that is convenient and that suits my
learning style.

I grant that many students are comfortable taking handwritten notes and would not be
impacted by a ban on laptops. The hardships I am describing would be felt by only a
minority of students. But the point is that a ban on laptops would unfairly disadvantage
this minority who rely on electronic note-taking in the classroom. I can type much faster
than I can write, and consequently I can pay better attention in class if I'm typing than if I'm
handwriting (and if my fingers aren't sore). Students like me would be put at an unfair
disadvantage if laptops were banned. Teachers should not be allowed to adopt policies
that put a certain group of students at an unfair disadvantage.

It might be objected that I'm exaggerating the hardships imposed by being forced to hand-
write notes, or that only a very small minority of students will be subject to them. I can
attest that I am not exaggerating the hardship in my own case, and my informal survey of
laptop users in my classes showed that about a quarter would feel seriously
disadvantaged by being forced to hand write notes. This is a minority, but it's not a "very
small minority". Ultimately this is an empirical question, and proper scientific studies must
be conducted to determine just how many students would be seriously disadvantaged by
the removal of laptops.

However, it is reasonable to believe that the problem will only increase with time. A
number of studies have shown that increasing percentages of elementary school children
have difficulty mastering and experience frustration with handwriting. The introduction of
computer labs in elementary schools, and the increased number of computers in homes,
have led to more students bypassing handwriting and moving straight to word processors
to complete most of their written school work. One study of middle-school students
revealed that in some school districts, 15% of students could not even read cursive
handwriting because they had little or no exposure to it in their elementary schooling.
Given these trends, college teachers should expect that more and more students will be
entering their classes with poorer and poorer handwriting skills, and that the hardships
imposed by banning laptop use in the classroom will become increasingly common and
obvious over time.

To sum up, I affirm that the banning of laptops in the classroom imposes a significant
hardship on an increasing number of students, that this would be unjustly discriminatory,
and hence that teachers should not be allowed to ban their use.

A Ban on Laptop Use is Paternalistic


A common argument for banning laptops is that some students are simply unable to resist
the urge to use their laptops in ways that distract their attention from classroom lectures
and discussion, leading to poorer academic performance overall. Consequently, removing
laptops from the classroom is likely to lead to improved academic performance for these
students. Teachers, it is argued, have a right to set classroom policies that remove
obstacles to learning for their students. Hence, teachers have the right to ban laptops
from the classroom, if they believe that doing so will improve the academic performance
and learning experience of these students.

I'm willing to grant the factual premises here. Yes, some students find it difficult to resist
the urge to browse the web and be distracted in class, and some would benefit from a ban
on laptops. Where I disagree is with the implied assumption that teachers have an
unconditional right to set academic policies that restrict the freedoms of students whenever
they believe that doing so is in the students' best interest. I would argue that teachers only
have a right to restrict student behavior when that behavior would be harmful to the
teacher or the other students. For example, we can agree that students shouldn't be
allowed to be disruptive in class, but this is because doing so interferes with the ability of
the teacher to teach the class and with the learning experience of the other students.
Teachers shouldn't have a right to restrict a student’s behavior when that behavior is only
harmful to the student.

Let me give an example to illustrate the principle. It's well established that students who
come to school tired and hungry perform less well than students who are not tired and
hungry. But if teachers have a right to set classroom policies that promote student
learning, why not set the following policy? All students must sign a contract promising to
come to class well fed and rested. Heck, why not require that they wear medical monitors
that track their food intake and sleep cycles, so we don't have to rely on the honor system?
Of course no one believes that teachers have a right to impose this kind of policy, even
though it would likely improve the learning experience of many students. Why not?
Because it's overly paternalistic. It assumes that students can't or shouldn't be allowed to
take responsibility for their own academic choices, that they should be treated like
children. If I choose to come to school hungry and tired, that's my choice, and I bear the
responsibility for the consequences, not my teacher.

My claim is that the argument for banning laptops based on the harm caused by their use
to the laptop user (and not to the other students in the class, or to the teacher) is open to
the same objection, that it's overly paternalistic. If I choose to come to class and surf the
web or update my Facebook page, that's my choice, I should be allowed to bear the
responsibility for those choices. To say otherwise is to deny me the freedom and respect I
deserve as an adult. 
 


Conclusion

In this essay I argued that teachers should not be allowed to ban the use of laptops in
classrooms. My argument was based on two separate lines of reasoning: one, that a ban
on laptop use unfairly discriminates against students who will be disadvantaged by not
using a laptop for note-taking in the classroom; and two, that a ban on laptop use is an
unjustifiable infringement on the rights of students to make and take responsibility for their
own educational choices.

2.9 The Essay: Improved Version with Commentary

Let’s walk through our new and improved essay.

By the way, the student who wrote the original essay also did a rewrite. His rewritten essay
has some elements in common with this one, especially in terms of the overall
argumentative structure, but I’ve had a larger hand in editing the language for this version.

I’ve broken the essay into four sections on this page and added some commentary after
each section.

You’ll have noticed that we’ve added headings to help flag the introduction, the conclusion,
and the two arguments presented in the main body. Headings aren’t necessary, but they
do help the reader to follow the organization of the essay, and in academic essay writing
they’re the norm rather than the exception. It is a style thing, though, and whether you use
them depends on the venue and the genre in which you’re writing. For example, you usually
won’t find headings in newspaper opinion columns, even though opinion columns really
are a form of argumentative essay writing. On the other hand, bloggers tend to use
headings a lot, so it really does depend on the venue.

Should Teachers Be Allowed to Ban Laptops in Classrooms?


Introduction
It is increasingly common to see students using laptops in college classrooms. Many
students use laptops for taking notes, and others use the internet to help research points
of interest that are relevant to class lectures. However, it is also common for students to
spend time in class casually browsing the net, instant-messaging and reading Facebook
pages, or playing games. Some college teachers have found laptops so distracting to
students in their classes that they have banned their use. This policy invites the question,
should college teachers be allowed to ban laptops from the classroom?

In this essay I will argue that teachers should not be allowed to ban the use of laptops in
classrooms. The essay will attempt to defend two claims: first, that a ban on laptop use
unfairly discriminates against students who will be disadvantaged by not using a laptop for
note-taking in the classroom; and second, that a ban on laptop use is an unjustifiable
infringement on the rights of students to make and take responsibility for their own
educational choices.

Notice that we’ve broken it into two paragraphs, one paragraph for presenting background
information and setting up the issue, and another paragraph for stating the thesis and
giving a summary outline of how the main argument will proceed.

Notice also that we got rid of that second paragraph of the original essay, the one that we
decided wasn’t doing any work for us. So the main body of the essay focuses on the two
lines of argumentation developed in the first and third paragraphs of the original essay,
what we called the “hardship argument” and the “paternalism argument”. Let’s look at the
hardship argument.

A Ban on Laptop Use is Discriminatory


I use a laptop for taking notes. I and many other students have grown up using computers
and keyboards as our preferred mode of written communication. We rarely hand-write
anything, our handwriting is hard to read, and our fingers get tired and sore if forced to
write for extended periods of time. Some of my college classes are seventy-five minutes
long, some as long as three hours. For myself and many other students, it is a
considerable hardship to be forced to take handwritten notes in these classes. In addition,
I organize my notes electronically, using a system that is convenient and that suits my
learning style.

I grant that many students are comfortable taking handwritten notes and would not be
impacted by a ban on laptops. The hardships I am describing would be felt by only a
minority of students. But the point is that a ban on laptops would unfairly disadvantage
this minority who rely on electronic note-taking in the classroom. I can type much faster
than I can write, and consequently I can pay better attention in class if I'm typing than if I'm
handwriting (and if my fingers aren't sore). Students like me would be put at an unfair
disadvantage if laptops were banned. Teachers should not be allowed to adopt policies
that put a certain group of students at an unfair disadvantage.

It might be objected that I'm exaggerating the hardships imposed by being forced to hand-
write notes, or that only a very small minority of students will be subject to them. I can
attest that I am not exaggerating the hardship in my own case, and my informal survey of
laptop users in my classes showed that about a quarter would feel seriously
disadvantaged by being forced to hand write notes. This is a minority, but it's not a "very
small minority". Ultimately this is an empirical question, and proper scientific studies must
be conducted to determine just how many students would be seriously disadvantaged by
the removal of laptops.

However, it is reasonable to believe that the problem will only increase with time. A
number of studies have shown that increasing percentages of elementary school children
have difficulty mastering and experience frustration with handwriting. The introduction of
computer labs in elementary schools, and the increased number of computers in homes,
have led to more students bypassing handwriting and moving straight to word processors
to complete most of their written school work. One study of middle-school students
revealed that in some school districts, 15% of students could not even read cursive
handwriting because they had little or no exposure to it in their elementary schooling.
Given these trends, college teachers should expect that more and more students will be
entering their classes with poorer and poorer handwriting skills, and that the hardships
imposed by banning laptop use in the classroom will become increasingly common and
obvious over time.

To sum up, I affirm that the banning of laptops in the classroom imposes a significant
hardship on an increasing number of students, that this would be unjustly discriminatory,
and hence that teachers should not be allowed to ban their use.

This section starts off much like the original essay, describing the hardships that some
students would suffer if laptops were banned.

The second paragraph completes the main argument -- notice those last two lines:
“Students like me would be put at an unfair disadvantage if laptops were banned. Teachers
should not be allowed to adopt policies that put a certain group of students at an unfair
disadvantage.”

It’s important that you be as explicit as possible about what your conclusion is and how it’s
supposed to follow from what you’ve said before.

Now, in our earlier evaluation of this line of reasoning I pointed out that the original author
never considered any obvious objections to this argument.

Well, that’s what we do in the next paragraph, in the first line: “It might be objected that I’m
exaggerating the hardships imposed by being forced to handwrite notes, or that only a
very small minority of students will be subject to them.”

The rest of this section is an attempt to answer this objection. This isn’t an easy thing to
do. You’re trying to argue that a certain proportion of the student population either is
suffering or is going to suffer a significant hardship by not being allowed to take notes on
their laptops. This makes it an empirical issue that would be best answered by citing
research studies on the question. But what if there are no studies, or no studies that
you’ve been able to discover? Then how do you make the case?

Well, my thought was that you could at least try to make the claim plausible, give some
reasons why at the very least it shouldn’t be dismissed.

And so we have a reference to an informal survey of the author’s peers, not conclusive by
any means, but it’s something. The author admits that this is an empirical question,
though, and that studies would need to be done to properly estimate just how many
students would be seriously disadvantaged by the removal of laptops.

Now, the next paragraph is meant to support this line of reasoning by arguing that, even if
the numbers are small now, it’s reasonable to think that the problem will only increase over
time, as more and more students arrive in college with less and less experience with
handwriting.

Notice the conclusion. “Given these trends, college teachers SHOULD EXPECT that more
and more students will be entering their classes with poorer and poorer handwriting skills
...”.

This conclusion is actually much easier to argue for, since the increasing difficulties that
children are having with handwriting are easier to document, and the conclusion is simply
that it’s reasonable to expect that this problem will grow over time. It’s rhetorically effective
in the context of this argument because even if someone wasn’t convinced that this is a
serious problem for college students right now, they may still be convinced that it’s likely to
become a problem, and that might be just enough to persuade a skeptic to accept the
empirical premises of the main argument.

Notice the summary concluding paragraph in this section. This is often a good
idea. If you’ve just finished presenting an important argument and you’re about to switch
gears and talk about something else, then it’s helpful to flag this transition with a summary
conclusion like this, to remind the reader of what you’ve established and to clearly
demarcate one section of the essay from another.

The next section takes up the “paternalism argument”. Or rather, we introduce the
objection to which the paternalism argument is a reply.

A Ban on Laptop Use is Paternalistic


A common argument for banning laptops is that some students are simply unable to resist
the urge to use their laptops in ways that distract their attention from classroom lectures
and discussion, leading to poorer academic performance overall. Consequently, removing
laptops from the classroom is likely to lead to improved academic performance for these
students. Teachers, it is argued, have a right to set classroom policies that remove
obstacles to learning for their students. Hence, teachers have the right to ban laptops
from the classroom, if they believe that doing so will improve the academic performance
and learning experience of these students.

I'm willing to grant the factual premises here. Yes, some students find it difficult to resist
the urge to browse the web and be distracted in class, and some would benefit from a ban
on laptops. Where I disagree is with the implied assumption that teachers have an
unconditional right to set academic policies that restrict the freedoms of students whenever
they believe that doing so is in the students' best interest. I would argue that teachers only
have a right to restrict student behavior when that behavior would be harmful to the
teacher or the other students. For example, we can agree that students shouldn't be
allowed to be disruptive in class, but this is because doing so interferes with the ability of
the teacher to teach the class and with the learning experience of the other students.
Teachers shouldn't have a right to restrict a student’s behavior when that behavior is only
harmful to the student.

Let me give an example to illustrate the principle. It's well established that students who
come to school tired and hungry perform less well than students who are not tired and
hungry. But if teachers have a right to set classroom policies that promote student
learning, why not set the following policy? All students must sign a contract promising to
come to class well fed and rested. Heck, why not require that they wear medical monitors
that track their food intake and sleep cycles, so we don't have to rely on the honor system?

Of course no one believes that teachers have a right to impose this kind of policy, even
though it would likely improve the learning experience of many students. Why not?
Because it's overly paternalistic. It assumes that students can't or shouldn't be allowed to
take responsibility for their own academic choices, that they should be treated like
children. If I choose to come to school hungry and tired, that's my choice, and I bear the
responsibility for the consequences, not my teacher.

My claim is that the argument for banning laptops based on the harm caused by their use
to the laptop user (and not to the other students in the class, or to the teacher) is open to
the same objection, that it's overly paternalistic. If I choose to come to class and surf the
web or update my Facebook page, that's my choice, I should be allowed to bear the
responsibility for those choices. To say otherwise is to deny me the freedom and respect I
deserve as an adult. 


This section starts off much as it did in the original essay, but here I tried to flesh out the
reasoning and fill in the premises that lead to the conclusion that teachers SHOULD have
a right to ban laptops. The reply starts when we identify the key premise that we want to
challenge:

“Where I disagree is with the implied assumption that teachers have an unconditional right
to set academic policies that restrict the freedoms of students whenever they believe that
doing so is in the students’ best interest.”

“I would argue that teachers only have a right to restrict student behavior when that
behavior would be harmful to the teacher or the other students.”

This is the key to the reply. We’re basically invoking a classical notion of liberal freedom:
the freedom to act as you wish, as long as you’re not hurting anybody else.
We’re trying to cast a ban on laptops, when it’s done ostensibly for the sake of the student
using the laptop, as an unjustifiable infringement on a student’s right to non-interference.
That’s the argumentative strategy, anyway.

Do you remember the image of the teeter-totter in our discussion of this argument, and
about the need to give some reasons why the student’s rights in this case should outweigh
the teacher’s rights? Remember we discussed one way of doing this, by drawing
analogies between a ban on laptop use and more obviously unjustifiable policies, like
forcing all students to come to class rested and well fed?

Well that’s what we’re doing in the rest of this reply. Here’s the analogy: “My claim is that
the argument for banning laptops based on the harm caused by their use to the laptop
user (and not to the other students in the class, or to the teacher) is open to the same
objection, that it’s overly paternalistic. ... To say otherwise is to deny me the freedom and
respect I deserve as an adult.”

Conclusion

 In this essay I argued that teachers should not be allowed to ban the use of laptops in
classrooms. My argument was based on two separate lines of reasoning: one, that a ban
on laptop use unfairly discriminates against students who will be disadvantaged by not
using a laptop for note-taking in the classroom; and two, that a ban on laptop use is an
unjustifiable infringement on the rights of students to make and take responsibility for their
own educational choices.

Rounding off the essay is the conclusion. I didn’t put a lot of effort into this, it basically
parallels the structure of the outline presented in the introduction, but that’s not an
uncommon stylistic device in argumentative essays. You sort of want the introduction and
the conclusion to be like bookends, propping up and framing the arguments in the main
body. If there’s a certain symmetry between the introduction and the conclusion, that’s not
a bad thing.

Still, you could write this conclusion differently, maybe add some additional commentary
on how you argued for it, but the important thing is that it do its job as a conclusion, which
is to restate the main thesis and give a brief summary of how the thesis was argued for.

One final note before we end this. I just want to reiterate that the purpose of this sample
essay, and the analysis we did on it, and this revision, was to illustrate some of the general
principles of argumentative essay writing. The specific issue we’re discussing here is
irrelevant. Whether laptops should or shouldn’t be banned from the classroom is
irrelevant, that wasn’t the point of this exercise. So if you find yourself disagreeing with the
conclusion of this essay, or wanting to criticize the argumentative moves that were made
in it, that’s fine, that’s great. I can think of three or four weaknesses in this argument that a
skeptic could exploit in developing a rebuttal to this essay, so feel free to take it apart.

What I do think is clear, however, is that this revised version of the essay is a better
argumentative essay than the original version. The point of the exercise is to get a better
understanding of the general principles of argumentative essay writing that explain why
this is so, so that you can bring these principles to bear on your own writing, and help you
to improve your essay writing skills.
How to Cite Sources and Avoid Plagiarism
Part 1: What is Plagiarism?
1.1 Plagiarism: The Basic Definition

Here’s a widely cited definition of plagiarism. It’s from the Modern Language Association
Style Manual.

“Scholarly authors generously acknowledge their debts to predecessors by carefully giving
credit to each source. Whenever you draw on another’s work, you must specify what you
borrowed, whether facts, opinions, or quotations, and where you borrowed it from. Using
another person’s ideas or expressions in your writing without acknowledging the source
constitutes plagiarism. Derived from the Latin plagiarius (which means “kidnapper"),
plagiarism refers to a form of intellectual theft. . . In short, to plagiarize is to give the
impression that you wrote or thought something that you in fact borrowed from someone
else, and to do so is a violation of professional ethics.”

This is the basic idea that we try to convey to students. You plagiarize when you take
someone else’s ideas or words and pass them off as your own.

Now, there’s a lot of agreement over this core definition, and over obvious examples of
plagiarism, like paying for someone else to write your paper for you. But there are also lots
of cases where students simply aren’t aware that what they’re doing constitutes
plagiarism.

In the next few tutorials we’ll take a look at a range of examples of plagiarism, starting with
the most obvious and egregious cases.
1.2 Downloading or Buying Whole Papers

Of course there’s no question that handing in a paper that you downloaded from an essay
mill or paying for someone to write your paper for you constitutes plagiarism. What may
surprise some people is just how easy it is to do this, and how large and sophisticated an
industry there is devoted to fulfilling student demand for essays.

The common term for these online sources for papers is “paper mills”. There are two broad
categories. You’ve got your free essay sites, where you can search for and download
essays for free, and you’ve got your for-pay sites, where the student pays a (sometimes
hefty) fee for the essay, usually by the page. These sites often offer a range of services,
including writing original essays that are custom-designed for your particular writing
assignment.

Here I’m showing only a sample of the websites that currently offer these services; there
are plenty more. A recent trend in the for-pay site category is to specialize in certain topic
areas or niches, like anthropology essays, English literature essays, or computer science
essays, and so on. These sites employ people with PhDs in the relevant field to write the
essays. You can even pay to have your master’s or doctoral thesis written for you; there
are sites that specialize entirely in thesis writing for graduate students.

From a student perspective (apart from the ethical issue), the main disadvantages of the
free essay sites are that the quality of the essays varies widely, from very good to totally
awful, and, because there will likely be multiple copies of the essay online that are
accessible to search engines like Google, it’s much easier for a teacher to find the original
copy online and confirm the plagiarism.

Because there’s a demand for original papers, for-pay sites often advertise that they offer
original content that won’t be detected by search engines and other plagiarism-detection
services. Sometimes this is true, but other times it’s just a scam, or at least misleading.
Some will sell a paper and say it’s original when it isn’t, or they’ll write an original paper
and sell it to a student, but then turn the paper over to a paper-mill so that other students
can also buy copies of it, making it no longer unique. In either case it’s not hard for a
teacher to find the duplicates online if they have any experience at all finding plagiarized
sources on the web.

I don’t have much else to say about paper mill sites here. Handing in one of these papers
and putting your name on it is plagiarism in its purest form. Students should know that
penalties for this kind of plagiarism are often stiffer than for more subtle kinds of
plagiarism, because there’s just no way to plead ignorance, it’s obvious that you’re
cheating, and that you intended to cheat.

Students should also know that it’s often much easier to spot a plagiarized paper, and find
the original paper on the web, than they might think. When I suspect a paper is plagiarized
it usually takes me no more than five or ten minutes to find a duplicate somewhere on the
web. So keep that in mind if you’re considering this route.
1.3 Cutting and Pasting From Several Sources

Though it certainly happens enough, comparatively speaking, downloading a free paper
or buying one from a paper mill is the rarest form of plagiarism. More common is the “cut
and paste from several sources” paper.

Here’s what they tend to look like. A student is assigned a topic and they get their hands
on several useful sources. Wikipedia articles are usually at the top of the list, but the
sources might include textbooks, blog entries, newspaper articles, and whatever else
Google can scoop up on the topic. The student then browses the materials, identifies
sections that he or she likes or thinks would work in the essay, and then copies and pastes
them into a Word document, maybe does a little rearranging and editing, and “voila”,
you’ve got an essay. It’s also common for students to write their own introductory and
closing paragraphs, I guess to make it read more like a student essay and maybe tie the
copied and pasted bits together a bit better.

Is this plagiarism? Yes it is, and most students understand this, but you’d be surprised how
many students think it’s not. They think this is an acceptable research essay because
they’re doing their own research and organizing the bits and pieces into a coherent
narrative. It’s true, they’re doing more work than plagiarizing an entire essay, but they’re
still taking other people’s words and ideas and passing them off as their own, and that’s
plagiarism.

Students should know that cut and pasted essays are often very easy to spot, because of
the dramatic changes in writing style and vocabulary that you often see, and the awkward
transitions from one section to the next.

Like this example.

The second paragraph is not the student’s, it’s copied and pasted from the Stanford Online
Encyclopedia of Philosophy article on Dante. If this was a graduate student writing a thesis
then I wouldn’t have batted an eye, but this was a second-year university student in an
introductory philosophy class who had a hard time putting together three or four sentences
without a grammar or spelling error. That student would not have written

“After his banishment he addressed himself to Italians generally, and devoted much
of his long exile to transmitting the riches of ancient thought and learning, as these
informed contemporary scholastic culture, to an increasingly sophisticated lay
readership in their own vernacular”.
It took about 30 seconds to find the original source online.
1.4 Changing Some Words But Copying Whole Phrases

Cut and pasted essays are common enough, but even more common are essays that
have a mixture of the student’s own writing and writing borrowed from other sources, but
the student changes up the wording so it’s not an exact duplicate. Here’s an example:

The text on the left is from the Stanford Online Encyclopedia of Philosophy entry on
Aristotle’s Ethics, written by Richard Kraut. The text on the right is from a student essay.
It’s clearly just a shortened and slightly modified version of the text on the left. The student
cut out these sections ...
... that I’ve highlighted in yellow with a strike-through, and then made one wording change,
changing the words “more or less” in the original to “similar” in the student version. And
they passed off this paragraph as their own writing, with no citation to the original source.

This kind of plagiarism is harder to spot, but in my experience it’s very common. A lot of
students aren’t even aware that this qualifies as plagiarism, and those who are aware
tend to think of it as a fairly minor violation. But it still qualifies, and they can still get in
trouble for it, so students need to learn how to avoid this kind of writing.
1.5 Paraphrasing Without Attribution

Moving down our list, we have “paraphrasing without attribution”. Students are usually
taught that paraphrasing is good, it’s better than relying heavily on direct quotations, but
they sometimes aren’t told that they still have to cite the original source.

Here’s an example.

The text on the left is the original source, it’s a paragraph from an essay on the works of
the British philosopher Thomas Hobbes. The text on the right is a paraphrase of that
source.

To paraphrase is to restate the same content but using your own words and sentence
structure. The ability to paraphrase accurately is a good sign of understanding, it’s a very
useful skill. But if a student were to use this paraphrase in an essay, without citing the
original source, they’d be guilty of plagiarism.

Why? Because plagiarism involves not just using someone else’s words and passing them
off as your own; it also involves using someone else’s ideas and passing them off as your
own. Here you’re borrowing ideas, and if they’re not your own, or not common knowledge,
then you need to cite the source.

1.6 The Debate Over Patchwriting

Before moving on to talking about how to properly cite sources, I want to acknowledge that
there’s a debate going on in academic circles over just how wrong certain kinds of
plagiarism are. In particular, there’s a debate about what’s called “patchwriting”, and
whether patchwriting is an acceptable form of plagiarism.

Let’s go back to our paraphrase example. When a paragraph stays this close to the
original text, and includes some of the original wording and sentence structure, this is
sometimes called “patchwriting”. Technically, it’s a form of plagiarism, but there are people
who study writing, and in particular the process of learning how to write within a specific
academic genre, who want to argue that patchwriting should be viewed as a predictable
stage that people pass through on the way to acquiring mastery of the vocabulary and
writing conventions of a particular genre.

A lot of attention has been paid to how international students -- students for whom
English isn’t their first language -- use patchwriting, but the argument has been made
about writing in general: that we learn the conventions of a genre by mixing our ideas and
language with the ideas and language of those we’re trying to emulate.

So the question is whether we should teach students to avoid patchwriting because it’s
wrong, and punish it when it’s detected, or whether we should accept that it’s a natural part
of the learning process and just educate students about the conventions and standards for
academic writing.

I guess I’m sympathetic to the latter view. It seems obviously true that people learn to
speak and write by copying the conventions and style and sometimes the language of
others. But it also seems obvious to me that the ultimate goal should be to move beyond
this stage, to fully internalize the conventions and the language of the genre so that the
writer can truly claim ownership of their words and ideas.

At any rate, I just wanted to point out that there is an ongoing debate in academic circles
over what forms of plagiarism are genuinely wrong and should count as “cheating”, and
what forms are a predictable part of the normal process of learning how to write. Either
way, we still need to learn how to cite sources correctly, and that’s a good segue into Part
II of this tutorial course.

Part 2: How to Cite Sources


2.1 When Should I Cite a Source?

In the previous set of tutorials we looked at some examples of plagiarism. To avoid
plagiarism you need to cite your sources. In this next series we’re going to look at when
and how to cite sources, and talk about different citation style conventions. In this video
we’re going to start with the basic question: “when should I cite a source?”

I should say at the outset that in the next few videos I’ll be using some terminology and
rules of thumb that I first picked up from this great book by Robert Harris (The Plagiarism
Handbook: Strategies for Preventing, Detecting, and Dealing With Plagiarism, 2001). The
content is common knowledge, but the specific flowchart method of presentation and
terms like “mark the boundaries” that I’ll use are Harris’s.

Okay, our first question is, when should you cite a source?

There are two questions you need to ask.

One: “Did you think of it?” In other words, is it your idea or your words?

If the answer is “Yes”, then you don’t need to cite it.

If your answer is “no”, if you didn’t think of it, if it’s not your words or your idea, then you
need to ask one other question:

"Is the fact or claim that you’re referring to common knowledge?"

By “common knowledge” I mean: are you referring to an easily observed or commonly
reported fact, or a common saying that most people would be familiar with? If so, then you
don’t need to cite it.

For example, you don’t need to give a citation to support the statement that George
Washington was the first President of the United States, or that Seinfeld was a popular
American sitcom, or that biology is the study of living organisms.

But if your answer is “no”, then you need to cite the source.

This is the basic idea. If the claim that you’re making isn’t your idea, and it’s not common
knowledge, then it had to come from somewhere, and you need to cite the source of the
claim. But if you thought of it, or if it’s common knowledge, then you don’t have to cite it.
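
If it helps, the two-question flowchart can be restated as a tiny decision function. Here’s a
minimal sketch in Python (the function name and its inputs are my own, purely illustrative;
this just restates Harris’s two questions, it’s not a plagiarism-detection tool):

def needs_citation(you_thought_of_it: bool, is_common_knowledge: bool) -> bool:
    """Restates the two-question 'when should I cite?' flowchart."""
    # Question 1: Did you think of it? Your own words or idea -> no citation needed.
    if you_thought_of_it:
        return False
    # Question 2: Is it common knowledge? Commonly reported fact -> no citation needed.
    if is_common_knowledge:
        return False
    # Otherwise it came from somewhere else, so cite the source.
    return True

# A claim you didn't think of, and that isn't common knowledge:
print(needs_citation(you_thought_of_it=False, is_common_knowledge=False))  # True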

Now there are some subtleties about this notion of “common knowledge” that are worth
looking at more closely, but I’m going to save that discussion for a later tutorial.
2.2 What Needs to be Cited?

In this video I want to clarify a distinction that students sometimes forget to draw when
they cite sources. It’s the distinction between using someone else’s words and using
someone else’s ideas. You need to cite both, and you cite them differently.

Our first question is, are you using someone else’s actual words? If the answer is yes,
then you need to quote and cite the source.

If the answer is “no”, you still have to ask whether you’re using someone else’s IDEAS.

Sometimes students think that if you’re not using a direct quotation then you don’t have to
cite the source, but you do.

Now, if the answer to this is “no” then you don’t have to cite anything. But if it’s “yes”, then
you cite the source.

This might seem obvious, but I just want to reinforce the point that when you’re citing
sources you have to pay attention both to the language you’ve borrowed and the ideas
that you’ve borrowed. If you’re borrowing language, you need to quote the language and
cite the source. If you’re just borrowing an idea, you won’t be quoting anything, but you still
need to cite the source of that idea.
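
Again, just as a sketch, here’s the words-versus-ideas rule as a small Python function (the
function and its inputs are hypothetical, purely to restate the distinction):

def citation_requirement(using_their_words: bool, using_their_ideas: bool) -> str:
    """Restates the words-versus-ideas rule for citing sources."""
    if using_their_words:
        # Borrowed language: quote it AND cite the source.
        return "quote and cite"
    if using_their_ideas:
        # Borrowed idea in your own wording: no quotation marks, but still cite.
        return "cite (no quotation)"
    return "no citation needed"

print(citation_requirement(using_their_words=False, using_their_ideas=True))  # cite (no quotation)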

2.3 How to Cite: Mark the Boundaries


We’ve talked about when to cite a source and what needs to be cited, but we haven’t
talked about HOW to cite a source. In this video we’ll talk about a simple rule that will help
you remember how to cite sources. Robert Harris calls this rule “marking the boundaries”.
This means that the words or the ideas that you’re borrowing must have a beginning and
an ending that are marked in some way. This is done differently depending on whether
you’re citing a short quotation, a long quotation, or just the source without a quotation.

For a short quotation, where you’re citing exact words, the boundaries are marked by
opening and closing quotation marks and a citation. Here’s an example, using the APA
citation style. APA stands for American Psychological Association, and it’s one of several
citation styles that are commonly used in academic writing. I’ve got several tutorials on
citation styles later in the course so I won’t dwell on the formatting issues. Here I just want
to highlight the general idea of marking the beginning and ending of a citation.

In this example we’ve got more than just quotation marks serving to mark the boundaries
of the quotation. This introductory bit that I’ve highlighted is commonly called a “lead-in”. A
“lead-in” is a bit of text that signals to the reader that you’re about to quote a source. In this
case the lead-in actually includes the source citation, which happens to be common in the
APA citation style, but in other styles the source citation might come after the quotation.

Our closing marker in this example is a page number reference, to help identify where the
quotation is located in the original source. Again, the particular formatting used here is
characteristic of a particular citation style, the APA style, and there are other citation styles
with different formatting conventions, but they all function in the same basic way, to mark
the boundaries of the quotation or the idea that you’re citing, so that it’s clear to the reader
which are your words and ideas and which are the words and ideas of someone else.

In the next few videos we’ll look at different examples of marking the boundaries, for
shorter quotations, longer quotations, and citing ideas without quotations.
2.4 Citing Exact Words

When you’re citing the exact words of a source, and the quotation isn’t too long, then you’ll
use what’s called an “in-text” citation, which means you’ll embed the quotation inside the
paragraph you’re working on, rather than give it a separate paragraph of its own.

You’ll use this style of in-text citation when you’re citing:

 a term or phrase that’s distinctive to the source
 a part of a sentence
 a single sentence
 or two or three sentences. Any more than this and you should be thinking about
using a separate block quote.

The basic rule is that the boundaries of the quoted material are marked by opening and
closing quotation marks and a citation, which may come before or after the quotation
marks. Let’s look at a couple of examples.

In this paragraph the author is citing a book by Glenn Gray published in 1970. It’s an in-
text citation.
The phrase “atmosphere of violence” is quoted, followed by part of a sentence, “draws a
veil over our eyes, preventing us from seeing the plainest facts of our daily existence”. The
citation with the author’s last name, date of publication of the source, and page number,
comes after the closing quotes. And again, this is an APA style citation, which I’ll generally
use in these examples, but the citation style can vary depending on what discipline you’re
in, where you’re publishing and who your audience is. The important thing is that it be
clear where the boundaries are between your words and the words of the source you’re
citing.

Here’s an example with two complete sentences cited.

The quoted sentences are in blue, and the citation itself is in red. Notice that this time the
citation is split, you’ve got the author and date embedded in the “lead-in” before the
quotation, with the page number reference following the quotation. There’s some flexibility
in how you do this, but you’ll need to refer to a style guide to see what variations are
acceptable within a particular citation style.

Next we’ll look at how to cite longer quotations.


2.5 Citing a Longer Quotation

You use an in-text citation for quotes that are shorter, but once your quotation gets up
around four sentences or longer, it makes more sense to give the quote its own paragraph.

As a rule of thumb you should be using this “block quote” style when citing:

 a quotation that’s more than four lines long (note that this is “lines”, not
“sentences”)
 a whole paragraph
 or multiple paragraphs.

Sometimes a style guide will be more specific and give a word-limit. The APA style guide
says to use a block quote when your quotation is longer than 40 words, for example, but
the four-line limit is a reasonable approximation.
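
Just to make that rule of thumb concrete, here’s a minimal sketch in Python (the helper
function is hypothetical, and the 40-word threshold is APA’s; other style guides may draw
the line differently):

def quote_style(quotation: str, word_limit: int = 40) -> str:
    """Apply APA's 40-word rule of thumb for choosing a quotation style."""
    word_count = len(quotation.split())
    return "block quote" if word_count > word_limit else "in-text quotation"

print(quote_style("A short phrase borrowed from a source."))  # in-text quotation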

So, what does a block quote look like?

It looks something like this. There’s typically a lead-in sentence that sets up the quotation,
then the quotation itself is inserted as a separate paragraph, and indented about five
spaces from the margin.

Apart from the indent, the important difference between this and a standard in-text
quotation is that you don’t use quotation marks. The indenting by itself is enough to mark
the quotation.

In this case the actual citation is split, with the name of the author and the date of
publication in the lead-in, and the page number immediately following the block quote.
Again, this is a convention that might vary between citation formats, and even within a
citation format there’s often some leeway in where the citation information is placed. But
the basic idea, as always, is to clearly mark the boundaries between your words and ideas
and the words and ideas of your source, and the indented block quote makes this very
easy to see.
2.6 Citing a Source But Not Quoting

In academic writing we often refer to books or studies or ideas from other sources, but we
don’t always use direct quotations from the source. The same principles apply, though: we
need to mark the boundaries that distinguish our own words and ideas from the words and
ideas of someone else. In this case we’re not citing the exact words, we’re citing the ideas.

When do we use this kind of citation?

We use this when we’re

 summarizing the source
 paraphrasing the source
 mentioning the source briefly
 or simply making use of an idea taken from the source.

Summaries, by the way, are generally shorter than paraphrases, which is why I’m
distinguishing them here. I can summarize the results of a study or the plot of a novel in a
paragraph, but that’s a very different thing from paraphrasing, which involves re-writing a
bit of text in your own words, but in a way that is faithful to the original meaning of the text.
A paraphrase will be much closer to the length of the original text than a summary will be (I
can summarize a novel, but I can’t paraphrase it).

Okay, let’s look at a couple of examples.

In this example there’s a reference to an article by Vanderburg, using the name-date
system and a page reference. We’re not directly quoting the article, we’re just referring to
the claim, made in the article, that a couple of inquiries had concluded that the Captain
and crew of the SS Californian failed to give proper assistance to the victims of the Titanic
iceberg disaster.
A problem that can arise when you use citations without quotations is that it can
sometimes be unclear where the ideas or the claims that you’re citing end, and where your
ideas begin. In other words, the boundaries are unclear. Here, the lead-in and the citation
clearly mark the beginning of the citation, and the page reference marks the end, so when
the reader starts on the next sentence, beginning with “However”, they know that they’re
reading your words and ideas. But if you didn’t have that page number there it would be
harder to tell. So that’s another reason to use some kind of device to clearly mark the end
of the citation.

Just to highlight this point, look at this version on top, where I’ve bolded the lead-in and the
page number. Here the boundaries are clear. The version on the bottom doesn’t have the
page number reference, so it’s unclear exactly where the citation ends.

Here’s another way to make the boundary clear:


I’ve changed the version on the bottom so the name and date citation come at the end of
the citation, and this helps to clearly mark the boundary. (You could put the page number
here as well, but it wouldn’t make the boundary any clearer.)

Now, in doing this we might be introducing some new ambiguity over where the citation
begins, but it’s pretty clear that at least the entire preceding sentence is being cited.
Paragraph breaks and good lead-in sentences can help to mark boundaries too. But this
ambiguity might be a reason to prefer some version of the method above, which splits the
citation so that part of it marks the beginning and part of it marks the end of the text being
cited.

There are lots of occasions where a page reference just doesn’t make sense, of course,
like when you’re summarizing or describing or commenting on a whole book or article.
Here some kind of author-date citation style is most commonly used, like in this example:

Okay, well, there’s lots more that could be said about style issues when it comes to
citations, but that’s not really our goal here. The goal was to focus on how citations
function to mark the boundaries between your ideas and the ideas of someone else, and I
think we’ve done that.
2.7 A Comment About Common Knowledge

I’ve received essays from students who’ll say something like “Plato was a student of
Socrates”, and then give a citation for the statement, where they cite the textbook I’ve
assigned for the course.

That’s an example of a claim that doesn’t need a citation, because in this context it would
count as “common knowledge”, and as a general rule you don’t have to cite facts or claims
that are common knowledge.

But this can be confusing to students who may not have known this particular fact about
Plato before taking the course or reading it in the textbook, so it wouldn’t count as common
knowledge to them. And this points to ambiguities in the concept of common knowledge
that need to be cleared up.

So, what counts as common knowledge? Well, it turns out that there is no consensus view
of what should count as common knowledge for citation purposes. It’s one of those topics
that writing experts will disagree about, and the criteria for common knowledge can vary
among disciplines and among academic and professional cultures. But there are a few
guidelines that most people would agree on.

First, you don’t have to cite commonly reported facts. The statement that Plato was a
student of Socrates is an example of a commonly reported fact. Now, this doesn’t imply
that you, the writer, knew this fact before sitting down to write your essay. What matters is
that the fact can be found in numerous, independent sources.

So if you can look it up in an encyclopedia or a textbook or a magazine article, and it’s the
same claim being made in all of these independent sources, then it’s a commonly reported
fact. You don’t have to cite the melting point of lead, for instance, or who won the Academy
Award for Best Picture in 2002, since anyone can look up this information.

Second, there may be some facts that aren’t easily found in reference works, but that are
widely regarded as true by a large group of people, or by people within a certain discipline.
If this is the case, then you can often treat these kinds of facts as common knowledge. But
whether this is a good idea or not may depend on the discipline you’re writing in, on the
context in which you’re writing, and on the specific audience you’re writing for.

For example, if I was writing for an audience of physicists that was familiar with Albert
Einstein’s theories and achievements, I could get away with saying, without citing a
source, that Einstein was unhappy with the standard interpretation of quantum theory, or
that he was a better physicist than he was a mathematician, since both of these would be
regarded as common knowledge by this particular audience. But if I was writing for a more
general audience I would probably want to back up those claims with a source.

Now, let me add an additional caution when it comes to citing commonly reported facts.

You don’t have to cite common knowledge, but you do have to cite specific expressions of
this common knowledge, when those expressions are not your own.
For example, if I look up “Charles Darwin” in the encyclopedia I can use common facts
about Darwin that I learn from that article in my essay without citing the encyclopedia
entry, but I can’t summarize or paraphrase long passages from this entry without citing the
source, since now I’m doing something else -- now I’m borrowing the language and the
specific expression of ideas from that source, rather than just the facts.
2.8 Citation Styles: MLA, APA, CSE, Chicago, Turabian

When I started writing essays I had no clue how to format bibliographies and citations; I
had never heard of citation styles or style manuals. So what I did was just find a journal
article or an essay or a book and copy the citation style from that source. Eventually I
noticed that there were different citation styles, but I still had no clue why one was used
instead of another. In this video I want to give you a survey of the different citation styles
that you might encounter in academic writing, and some tips on what styles to use and
what style manuals you should have on your bookshelf.

Let’s start with the MLA style. MLA stands for Modern Language Association. It’s actually a
pretty recent citation format: the MLA Style Manual is only in its third edition as of this
recording, with the first edition in 1985. This is the most commonly used style for
academics working in the humanities, like English literature, literary criticism, media and
cultural studies, and a hodge-podge of other arts disciplines.

In the examples I used in the plagiarism videos, I was using APA style for the most part.
MLA style is like the APA style in that it uses in-text citations, but an important difference is
that MLA uses the “author-page number” system rather than the “author-date” system. In
this example the quote is from a book by William Wordsworth, and the citation is
“Wordsworth 263”, where 263 is the page number from that source.

Here’s what the source reference would look like in the bibliography, which in MLA style is
titled “Works Cited”. Every citation style requires similar information -- author, title,
publication date, where it was published, and so on, but the specific formatting will be
different in each style. Here, for example, the date of publication comes at the end, but in
the APA style, as we’ll see, the date comes right after the author’s name.
APA stands for American Psychological Association. The first APA style manual came out
in 1927, and was designed to serve the writing needs of psychologists and
anthropologists, but the APA style is now widely used across the social sciences, and
there are branches of the humanities that use the style as well.

If you were to write the Wordsworth citation in APA style it would look like this.

APA style uses the author and the date of publication rather than the author and page
number, though page number references are often included as well, like here. APA uses
more commas to separate the elements of the citation than the MLA, and it uses that “p.”
preface to indicate page numbers, where MLA doesn’t require this.
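
To make the formatting difference easy to see side by side, here’s a minimal sketch in
Python (the helper functions are hypothetical, and the 1850 publication date for the
Wordsworth source is made up for illustration, since the original date isn’t given here):

def mla_citation(author: str, page: int) -> str:
    # MLA in-text style: author and page number, no comma, no "p." prefix.
    return f"({author} {page})"

def apa_citation(author: str, year: int, page: int) -> str:
    # APA in-text style: author and date, comma-separated, "p." before the page.
    return f"({author}, {year}, p. {page})"

print(mla_citation("Wordsworth", 263))        # (Wordsworth 263)
print(apa_citation("Wordsworth", 1850, 263))  # (Wordsworth, 1850, p. 263)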

A student once asked me why APA uses the author-date system while MLA uses the
author-page number system. I hadn’t a clue, so I went and researched it.

The basic idea, as far as I can tell, is that there’s a presumption about what kinds of
information are most important for someone writing in the humanities versus someone
writing in the social sciences. In the social sciences you’re more often citing the
conclusions of the most current sources that bear on your work, so the date of publication
is emphasized, and details of where in the source the quote is located less so. In the
humanities it’s assumed that your work is engaging with the ideas of other writers as they
appear at specific places in their works, so the precise location of the ideas within a given
work is emphasized more than the date of publication of the work.
Here’s what the bibliography entry would look like in an APA formatted paper (see the
above image). Here you would title the bibliography “References” rather than “Works
Cited”. Notice that you only give the initial for the first name of an author, and the date
follows in brackets.

Okay, next on our list is the CSE style, which can be quite different from the MLA and APA
styles. CSE stands for “Council of Science Editors”, and this citation style is commonly
found in the natural sciences. Originally the editors were mostly biologists and the format
was focused on life sciences like microbiology, zoology, plant sciences, and so on, but
now it’s used across a wide array of physical and biological sciences.

CSE format actually supports two quite different citation styles. The first is just a variant of
the author-date system that we’ve seen before. Here’s an example. Nothing too unusual
here.

The second is the “citation-sequence” system, and this is different. Here the citation is
labeled with a superscripted number; it looks like a footnote reference. And each new
citation is numbered sequentially, so the next would be “2”, followed by “3”, and so on.
Sometimes the number won’t be superscripted; it’ll just be regular size, placed in brackets
at the location of the citation.

Obviously with a system like this, the bibliography has to reflect the citation sequence, so
each bibliographic entry is labeled by the corresponding citation number. But if you refer to
the same reference more than once, you use the same number.

Here’s an example. The citations are superscripted, the first sentence cites references 1
and 4, the second sentence cites reference 4 again.
A bibliography might look like this. Each reference is numbered, so the numbers 1 and 4 in
the document would refer to references 1 and 4 in this reference list.
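
The bookkeeping behind the citation-sequence system is simple enough to sketch in a few
lines of Python (the function and the sample sources are hypothetical, just to illustrate
how numbers are assigned in order of first citation and reused on repeat citations):

def assign_sequence_numbers(citations):
    """Number sources in order of first citation; repeat citations reuse the number."""
    numbers = {}   # maps each source to its assigned citation number
    sequence = []
    for source in citations:
        if source not in numbers:
            numbers[source] = len(numbers) + 1  # next number in the sequence
        sequence.append(numbers[source])
    return sequence

# Citing Smith, then Jones, then Smith again:
print(assign_sequence_numbers(["Smith 2010", "Jones 2008", "Smith 2010"]))  # [1, 2, 1]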

This list also illustrates the CSE citation format for bibliographic entries, which is quite
different from, say, the APA format. Just to note a few differences: here you only give
initials for the authors’ first names, without spaces between them; you don’t capitalize the
titles of journal articles or put them in quotes, you just capitalize the first letter of the title;
and you abbreviate the journal titles. This is the sort of thing you just have to consult a
style manual to figure out, or (as most people do) just use a representative journal article
or essay as a model when formatting your own essay.

Okay, the last citation style we’ll look at is the Chicago style, sometimes called “CMS” or
“CMOS”, after the initials of the Chicago Manual of Style. It’s named after Chicago because
it’s been published by the University of Chicago Press since the first edition came out in
1906.

The Chicago Manual of Style is a beast, almost a thousand pages long. It’s not just a style
manual for essays, it also has sections on editorial practice and English grammar and
usage, and it covers style and format issues for all kinds of publications, including books
for non-academic audiences. The Chicago style manual is also used for some social
science and humanities journals, so you might see this used in places where you’d
otherwise see MLA or APA style used.

Luckily, for students who need help writing research papers, there’s an excellent and much
shorter style guide, written originally by Kate Turabian in 1937, that basically compiles all
the relevant style rules that students would need to know. The book is called A Manual for
Writers of Research Papers, Theses, and Dissertations, and it’s been periodically revised
over the years by various teams. This is sometimes called “Turabian style”, but it’s really
just the Chicago style edited and compiled for undergraduate and graduate students.

What’s distinctive about the Chicago style is that it supports two different reference styles,
and supports the mixing of those two styles in the same publication.

When most writers hear the term “Chicago style” they think of a sequential numbering,
notes-based citation system, like the one illustrated here. The citations are numbered, and
then the source information is given either in a footnote at the bottom of the page, or in an
endnote at the end of the document. With this system you can avoid having a separate
bibliography section altogether.

One of the advantages of a system like this is that the page isn’t cluttered with citation
information and it can make for a more pleasant reading experience.
Now, although this is, in some respects, the stereotypical Chicago style system, the
Chicago Manual of Style also supports in-line citation styles, like the MLA and APA styles,
so it gets complicated explaining what makes the Chicago citation style distinctive.

For example, here’s a citation system that the Chicago style also supports. On top you’ve
got a paragraph with some standard in-text citations. You’ve also got a reference to a
footnote, footnote number 6.

At the bottom of the page is footnote number 6, and in it you’ve got some commentary --
these are sometimes called “discursive notes” to distinguish them from footnotes that just
contain source information. And in the commentary there’s another in-text citation.

And all of this is supported by the Chicago manual of style. So you can see that it
encourages flexibility and mixing of citation styles, which can come in handy for different
purposes.

Okay, that about wraps up this introduction to citation styles. Here’s the summary table.
What I would recommend, for any student, is to pick up a copy of a good style guide
suitable to your discipline, because it’ll make life easier for you in the end. Just google any
of these style names and you’ll get lots of online resources, and if you go to Amazon and
search for these names you’ll get the style manuals themselves as well as a bunch of
third-party style guides that are basically like “dummies guides” to a particular citation
style, which can also be very helpful.
