"Strong Inference"
I keep six honest serving men
(They taught me all I knew);
Their names are What and Why and When
And How and Where and Who. - Kipling
If I have seen further, it is by standing on the shoulders of giants
- Sir Isaac Newton, Letter to Robert Hooke, 5 February 1675
A dwarf standing on the shoulders of a giant may see farther than a giant
himself
- Robert Burton (1577-1640), The Anatomy of Melancholy
Assignment
Audesirk et al., Chapter 1, pp. 610-611 & 629 (also pay attention to
several figures and readings in other chapters)
Today's musical selection
Tina Turner, One of the Living
Nature of scientific inquiry
Fig. 1-4
Scientific inquiry and the scientific method are based on observation - systematic,
objective, repeatable
Figure 14-7
You cannot always manipulate things, example: studying the fossil record.
But you can make observations like the similarities in the forelimbs of
birds and mammals.
Think about high school geometry. In the textbook, there are "postulates"
that we assume must be true and "theorems" that can be proven.
In the homework, there are "if - then" problems in which something
is assumed ("if") and a conclusion ("then") can be worked
out by applying the theorems and postulates step by step.
There are various kinds of science.
Some scientists make models (mathematical or electrical) of biological systems
that have predictive value. Other scientists collect descriptive data that
further substantiates a global theory.
Fig. 1-10
(Already shown in first lecture outline)
Model: energy flow in biology
Is there one right kind of science?
"Strong inference" is the title of a paper by John R. Platt (Science,
146, 347-353, 1964) in which he criticizes some approaches to science. Here
are some quotes:
"...some fields of science are moving forward much faster than others"
"Those rapidly moving fields are fields where a particular method of
doing scientific research is systematically used and taught, an accumulative
method of inductive inference that is so effective that I think it should
be given the name of 'strong inference.'"
"Strong inference consists of...:
(1) Devising alternative hypotheses;
and
(2) Devising a crucial experiment ... with alternative outcomes ... each
of which will ... exclude one or more of the hypotheses"
"It is like climbing a tree."
"[focus] on the exclusion of a hypothesis"
"How small and elegant an experiment can you perform?"
"You must study the simplest system you think has properties you are
interested in." (attributed to C. Levinthal)
"...there is no such thing as proof in science... science advances
only by disproofs." (attributed to Karl Popper)
Doing an experiment
In an experiment:
An independent variable is what we manipulate, typically graphed on the
abscissa (X-axis)
A dependent variable is what we measure, typically graphed on the ordinate
(Y-axis)
Control - other variables are to be controlled
The population is everything out there
The sample is what, out of the population, we measure (to make inferences
about the population)
Sampling - random sampling is best
Descriptive statistics gives us averages (means)
as well as measurements of how variable the data are (standard deviations)
In inferential statistics we propose a null hypothesis vs. an alternative
hypothesis.
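The vocabulary above (population, random sample, mean, standard deviation) can be tried out in a few lines of Python. This is only a sketch: the "population" is invented (5,000 simulated heights in cm), and the names `population`, `sample`, `sample_mean` and `sample_sd` are made up for illustration.

```python
import random
import statistics

random.seed(110)  # fixed seed so the sketch is repeatable

# Hypothetical population: 5,000 simulated heights in cm (invented numbers).
population = [random.gauss(170, 10) for _ in range(5000)]

# Random sample: every member of the population is equally likely to be picked.
sample = random.sample(population, 30)

# Descriptive statistics of the sample.
sample_mean = statistics.mean(sample)  # the average
sample_sd = statistics.stdev(sample)   # how variable the data are

print(f"population mean = {statistics.mean(population):.1f} cm")
print(f"sample mean     = {sample_mean:.1f} cm")
print(f"sample sd       = {sample_sd:.1f} cm")
```

The sample mean will not equal the population mean exactly, which is why inferential statistics is needed to say anything about the population.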
Doing statistics
Hypotheses are small questions. You "test" these hypotheses -
answers (in the form of "rejecting the null hypothesis") are never
certain but rather involve an acceptably low probability (usually 5%) of
being wrong, hence the involvement of statistics in experimental design.
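One simple way to "test" a hypothesis is a permutation test, sketched below. This is one technique among many and is not necessarily the one used in this course. The measurements are invented. The null hypothesis is that the treatment makes no difference, so the group labels could be shuffled without changing anything.

```python
import random
from statistics import fmean

random.seed(1)  # fixed seed so the sketch is repeatable

# Invented measurements of the dependent variable for two groups.
control = [4.1, 3.8, 4.4, 4.0, 3.9, 4.2, 4.3, 3.7]
treated = [4.9, 5.2, 4.6, 5.0, 4.8, 5.3, 4.7, 5.1]

def mean_difference(group_a, group_b):
    """Difference between the group means (b minus a)."""
    return fmean(group_b) - fmean(group_a)

observed = mean_difference(control, treated)

# Under the null hypothesis the labels are arbitrary, so shuffle them many
# times and count how often chance alone gives a difference this large.
pooled = control + treated
n_control = len(control)
n_shuffles = 10_000
as_extreme = 0
for _ in range(n_shuffles):
    random.shuffle(pooled)
    diff = mean_difference(pooled[:n_control], pooled[n_control:])
    if abs(diff) >= abs(observed):
        as_extreme += 1

p_value = as_extreme / n_shuffles
reject_null = p_value < 0.05  # the usual 5% threshold

print(f"observed difference = {observed:.2f}")
print(f"p-value = {p_value:.4f}, reject null hypothesis: {reject_null}")
```

Rejecting the null hypothesis here is still not proof. It only says that a difference this large would rarely arise by chance.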
Enter (stage left): Statistics
"There are three kinds of lies: lies, damn lies and statistics"
Benjamin Disraeli (quoted in Mark Twain's autobiography, Chapter 29)
Enter (stage left) The normal distribution
Figure 15-13
(I just needed a picture showing the normal distribution)
Figure 15-13 will be shown again in the context of genetic
basis of evolution
Figure
Area under the curve = 1
Middle of the curve is the mean, 0 in the "standard" normal distribution
variation is indicated by standard deviation
z-score indicates the number of standard deviations away from the mean
a score is
About 5% of the area under the curve lies 2 or more standard deviations from the mean
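The properties listed above can be checked numerically with Python's standard library. This is a sketch; the helper `z_score` and the example numbers passed to it are made up for illustration.

```python
from statistics import NormalDist

# The "standard" normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0.0, sigma=1.0)

def z_score(score, mean, sd):
    """Number of standard deviations that a score lies from the mean."""
    return (score - mean) / sd

# Area under the curve within 2 standard deviations of the mean.
within_two_sd = standard_normal.cdf(2.0) - standard_normal.cdf(-2.0)

# Area in the two tails combined (2 or more standard deviations away).
tail_area = 1.0 - within_two_sd

# The z-score that leaves exactly 5% in the two tails combined.
z_for_5_percent = standard_normal.inv_cdf(0.975)

print(f"z-score of 190 when mean = 170, sd = 10: {z_score(190, 170, 10)}")
print(f"area beyond 2 sd: {tail_area:.4f}")
print(f"z leaving 5% in the tails: {z_for_5_percent:.2f}")
```

The area beyond 2 standard deviations is a little under 5%. The cutoff that leaves exactly 5% in the tails is about 1.96 standard deviations.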
Central limit theorem:
The distribution of sample averages approaches the normal distribution as
the sample size increases
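A simulation does not prove the theorem, but it can illustrate it. The sketch below assumes a strongly skewed (exponential) population, chosen only because it looks nothing like a normal distribution. The helper names `sample_mean` and `skewness` are made up. Skewness is 0 for a perfectly symmetric distribution such as the normal.

```python
import random
from statistics import fmean, pstdev

random.seed(42)  # fixed seed so the sketch is repeatable

def sample_mean(n):
    """Mean of n draws from a skewed (exponential) population."""
    return fmean([random.expovariate(1.0) for _ in range(n)])

def skewness(values):
    """Average cubed z-score: 0 for a symmetric distribution."""
    m = fmean(values)
    sd = pstdev(values)
    return fmean([((v - m) / sd) ** 3 for v in values])

skew_by_n = {}
spread_by_n = {}
for n in (1, 5, 50):
    means = [sample_mean(n) for _ in range(5000)]
    skew_by_n[n] = skewness(means)
    spread_by_n[n] = pstdev(means)
    print(f"n = {n:2d}: skewness of sample means = {skew_by_n[n]:.2f}, "
          f"spread = {spread_by_n[n]:.2f}")
```

As the sample size grows, the distribution of sample means becomes more symmetric and narrower, even though the population itself is skewed.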
(It's a theorem, that means it can be proven, it's not just an idea)
"The results are significant beyond the 0.05 level." (This is
typical.) This means that, if the null hypothesis were true, results at least this
extreme would happen by chance (rather than because of our independent variable)
no more than 5% of the time.
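What the 0.05 level buys can be seen by simulating many experiments in which the null hypothesis really is true (both groups come from the same population). The sketch below uses a simple z-test with a known standard deviation, which is an assumption made to keep the code short. The function name `experiment_rejects_null` is made up.

```python
import math
import random
from statistics import NormalDist, fmean

random.seed(5)  # fixed seed so the sketch is repeatable

ALPHA = 0.05
critical_z = NormalDist().inv_cdf(1 - ALPHA / 2)  # about 1.96

def experiment_rejects_null(n=20, sigma=1.0):
    """Run one fake experiment in which the treatment does nothing."""
    group_a = [random.gauss(0, sigma) for _ in range(n)]
    group_b = [random.gauss(0, sigma) for _ in range(n)]
    # Standard error of the difference between two means (sigma assumed known).
    standard_error = sigma * math.sqrt(2 / n)
    z = (fmean(group_b) - fmean(group_a)) / standard_error
    return abs(z) > critical_z

n_experiments = 2000
false_positive_rate = (
    sum(experiment_rejects_null() for _ in range(n_experiments)) / n_experiments
)
print(f"null hypothesis wrongly rejected in {false_positive_rate:.1%} of experiments")
```

Roughly 1 experiment in 20 comes out "significant" even though nothing is going on. That is the acceptably low probability of being wrong.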
"It has been proven beyond a shadow of a doubt using the scientific
method" - Nonsense!
Collect more data, and you will be more certain. You can only reject the
null hypothesis. You cannot accept (or prove) the null hypothesis. In other
words, "absence of data is not the same as data of absence."
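"Collect more data, and you will be more certain" can be put in numbers. The uncertainty in a sample mean (its standard error) is the standard deviation divided by the square root of the sample size. The sketch below assumes a standard deviation of 10 units; that value and the name `margin_of_error` are invented for illustration.

```python
import math
from statistics import NormalDist

z_95 = NormalDist().inv_cdf(0.975)  # about 1.96, for a 95% interval
assumed_sd = 10.0                   # invented standard deviation

def margin_of_error(n, sd=assumed_sd):
    """Half-width of an approximate 95% interval around a sample mean."""
    return z_95 * sd / math.sqrt(n)

margins = {n: margin_of_error(n) for n in (10, 40, 160, 640)}
for n, margin in margins.items():
    print(f"n = {n:3d}: sample mean is good to about +/- {margin:.2f}")
```

Four times as much data cuts the uncertainty in half. It never reaches zero, which is why nothing is proven "beyond a shadow of a doubt."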
Thought problem & story
Suppose someone says that a certain plant is extinct. How can he know? To
state that for certain would require examination of every square inch of
the Earth. Most naturalists admit that they can say for (fairly) certain
if a bird species is gone. The reason is that birds are so conspicuous and
that there are so many well-trained bird-watchers. Birds do go extinct!
Here is a SLIDE
I took at the Smithsonian natural history museum of Martha, the last passenger
pigeon, who died at 1 pm on Sept 1, 1914 at the Cincinnati Zoo. The famous ornithologist
Audubon saw passenger pigeons darken the skies hundreds of times during their migration.
However, the ivory-billed woodpecker was considered extinct after its last
sighting in 1944. Then it was (presumably) seen again in 2004 (although
there remains some uncertainty).
"We have no evidence showing that (whatever)" - Nonsense!
In summary, the step-by-step progress of science involves statistics, and
asking the right questions, that can be answered appropriately. A course
in advanced statistics is usually called "experimental design"
because you cannot even do the right experiment without the right design,
controls, and statistics in mind.
Your faculty are scientists
"Publish or perish." Perhaps you have heard that expression. In
my opinion, "a university is a community of scholars dedicated to the
acquisition and dissemination of knowledge." Faculty are expected to
do research as well as to teach courses and are usually granted tenure only
if their research publications are favorably viewed by their peers (at other
universities).
Watch out!
Sometimes "scientists" are not completely honest when there is
a conflict of interest
Personalities
Fig. E9-3
Science is a human endeavor.
See story on p. 155 about Maurice Wilkins and Rosalind Franklin
See story on p. 156 about James Watson and Francis Crick
1962 Nobel Prize in Physiology or Medicine
American chemist Linus Pauling might have made the discovery, but he was not
allowed to go to a meeting where data were presented because of the strong anti-Communist
movement in the US in the early 1950s
Scientific revolutions
Book by Thomas Kuhn (1962)
"paradigm" - set of beliefs shared by the scientific community
When too many contradictions accumulate - "paradigm shift"
Questions used in 2007 & 2008 relating to this outline
I said that "absence of data is not the same as data of absence"
to paraphrase
(a) "the sample must be random."
(b) "observations must be unbiased."
*(c) "you never accept the null hypothesis."
(d) "not all science is based on experiments."
(e) "publish or perish."
Description of a hypothesis:
(a) It unifies all the observations in a field.
*(b) It should be stated in a way so that it can be disproved.
(c) Like a postulate in geometry, we must assume it is true.
(d) Like a theorem in geometry, it can be proved.
(e) It is used to describe a process and to make predictions.
It is reasonable to hypothesize that the average height of the Bio 110 class
can be estimated by the average of the Tuesday morning lab because
(a) students were not biased when they did those measurements.
(b) we controlled all other variables in this experiment.
*(c) there is no reason to think it is not a random sample.
(d) we measured a population, not a sample, Tuesday morning.
(e) height is an independent variable.
The normal distribution is used because
*(a) it can be proven that the sample means are normally distributed.
(b) it is how the independent variable is related to the dependent variable.
(c) it is systematic, objective and repeatable.
(d) it is the best description of how postulates and theorems work in a
high school geometry "if-then" homework problem.
(e) it is the best way to graph homology on a taxonomic tree.
The ivory-billed woodpecker
(a) has wings that are not homologous to human forelimbs.
(b) is dangerous because it has prions.
(c) was found on the Galapagos by Darwin.
*(d) was thought to be extinct but may not be.
(e) became extinct in the Old World and then was re-introduced from the
New World.
Why do scientists study a sample instead of a population?
A) A sample is more accurate.
B) A population is random.
*C) A population is too big to study.
D) The sample is like a theorem while the population is like a postulate.
E) Only with the sample can you prove the null hypothesis.
Which is true about the discovery of the structure of DNA?
A) American Linus Pauling discovered it.
B) Nobody believes Watson and Crick's model because they had a conflict
of interest.
C) Its primary structure is the sequence of amino acids in it.
*D) The Nobel Prize went to three men, and Rosalind Franklin did not share
the award.
E) It can be altered by exposure to the abnormal structure of a prion.
The null hypothesis
A) is the same as a law.
B) is the same as a model.
C) is the same as a theory.
*D) cannot be accepted; it can only be rejected.
E) allows us to have a "paradigm shift" (scientific revolution).
I want to test a drug, so I do a clinical trial. I give half the subjects
the drug and half the people a "placebo" (a dud). Why?
A) The placebo guarantees that the researchers do not have a conflict of
interest.
B) The placebo guarantees that the sample is random.
C) The placebo guarantees that the observations are repeatable.
D) If we are testing a diabetes drug, we should give overweight people the
drug and thin people the placebo.
*E) The people getting the placebo serve as the control group.
I calculate the mean from my sample. What is true about the sample mean?
*A) The distribution of sample means is described by the normal distribution.
B) It is the same as the standard deviation.
C) It is the same as the median.
D) It is the same as the population mean.
E) It is one of the controlled variables.
What is the significance of the "tails" (more than 2 standard
deviations from the mean) of the normal distribution?
A) Homology is in the tails while analogy is in the rest of the distribution.
B) If it is in the tails, scientists consider it to be a law, while the
rest of the distribution is a model.
*C) It has to do with disproving the null hypothesis.
D) The tail on the left is the dependent variable and the tail on the right
is the independent variable.
E) The mean is in the tail on the left while the median is in the tail on
the right.
Question used in 2002 relating to this outline
A small scientific question which the typical biologist "tests"
with an experiment is called a
(a) theory.
*(b) hypothesis.
(c) model.
(d) correlation.
(e) law.
This page was last revised 6/8/09