
Artificial Intelligence

Chapter 1

Outline

♦ What is AI?
♦ A brief history
♦ The state of the art

What is AI?

Systems that think like humans      Systems that think rationally
Systems that act like humans        Systems that act rationally

Acting humanly: The Turing test

Turing (1950) “Computing machinery and intelligence”:

♦ “Can machines think?” −→ “Can machines behave intelligently?”

♦ Operational test for intelligent behavior: the Imitation Game

[Figure: an interrogator questions an unseen human and an unseen AI
system and must decide which is which]

♦ Predicted that by 2000, a machine might have a 30% chance of
  fooling a lay person for 5 minutes

♦ Anticipated all major arguments against AI in following 50 years

♦ Suggested major components of AI: knowledge, reasoning, language
  understanding, learning

Problem: Turing test is not reproducible, constructive, or
amenable to mathematical analysis

Thinking humanly: Cognitive Science

1960s “cognitive revolution”: information-processing psychology replaced
prevailing orthodoxy of behaviorism

Requires scientific theories of internal activities of the brain
– What level of abstraction? “Knowledge” or “circuits”?
– How to validate? Requires
  1) Predicting and testing behavior of human subjects (top-down)
  or 2) Direct identification from neurological data (bottom-up)

Both approaches (roughly, Cognitive Science and Cognitive Neuroscience)
are now distinct from AI

Both share with AI the following characteristic:
the available theories do not explain (or engender)
anything resembling human-level general intelligence

Hence, all three fields share one principal direction!

Thinking rationally: Laws of Thought

Normative (or prescriptive) rather than descriptive

Aristotle: what are correct arguments/thought processes?

Several Greek schools developed various forms of logic:
notation and rules of derivation for thoughts;
may or may not have proceeded to the idea of mechanization

Direct line through mathematics and philosophy to modern AI

Problems:
1) Not all intelligent behavior is mediated by logical deliberation
2) What is the purpose of thinking? What thoughts should I have
   out of all the thoughts (logical or otherwise) that I could have?

Acting rationally

Rational behavior: doing the right thing

The right thing: that which is expected to maximize goal achievement,
given the available information

Doesn’t necessarily involve thinking—e.g., blinking reflex—but
thinking should be in the service of rational action

Aristotle (Nicomachean Ethics):
Every art and every inquiry, and similarly every
action and pursuit, is thought to aim at some good

Rational agents

An agent is an entity that perceives and acts

This course is about designing rational agents

Abstractly, an agent is a function from percept histories to actions:

    f : P∗ → A

For any given class of environments and tasks, we seek the
agent (or class of agents) with the best performance

Caveat: computational limitations make perfect rationality unachievable
→ design best program for given machine resources
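
To make f : P∗ → A concrete, here is a minimal Lisp sketch in the style
of the AIMA code used later in these notes (the names make-agent-function
and decide are invented for illustration, not part of the AIMA code base):

(defun make-agent-function (decide)
  ;; Illustrative sketch.  Returns an agent: a function from the next
  ;; percept to an action.  The full percept history P* accumulates in
  ;; the closure, most recent percept first; DECIDE maps it to an action.
  (let ((percepts '()))
    #'(lambda (percept)
        (push percept percepts)
        (funcall decide percepts))))

Passing a decide function that ignores its argument and returns NoOp
yields the (perfectly legal, perfectly useless) agent that never acts.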

AI prehistory

Philosophy       logic, methods of reasoning
                 mind as physical system
                 foundations of learning, language, rationality
Mathematics      formal representation and proof
                 algorithms, computation, (un)decidability, (in)tractability
                 probability
Psychology       adaptation
                 phenomena of perception and motor control
                 experimental techniques (psychophysics, etc.)
Economics        formal theory of rational decisions
Linguistics      knowledge representation
                 grammar
Neuroscience     plastic physical substrate for mental activity
Control theory   homeostatic systems, stability
                 simple optimal agent designs

Potted history of AI

1943      McCulloch & Pitts: Boolean circuit model of brain
1950      Turing’s “Computing Machinery and Intelligence”
1952–69   Look, Ma, no hands!
1950s     Early AI programs, including Samuel’s checkers program,
          Newell & Simon’s Logic Theorist, Gelernter’s Geometry Engine
1956      Dartmouth meeting: “Artificial Intelligence” adopted
1965      Robinson’s complete algorithm for logical reasoning
1966–74   AI discovers computational complexity
          Neural network research almost disappears
1969–79   Early development of knowledge-based systems
1980–88   Expert systems industry booms
1988–93   Expert systems industry busts: “AI Winter”
1985–95   Neural networks return to popularity
1988–     Resurgence of probability; general increase in technical depth
          “Nouvelle AI”: ALife, GAs, soft computing
1995–     Agents, agents, everywhere . . .
2003–     Human-level AI back on the agenda

State of the art

Which of the following can be done at present?

♦ Play a decent game of table tennis
♦ Drive safely along a curving mountain road
♦ Drive safely along Telegraph Avenue
♦ Buy a week’s worth of groceries on the web
♦ Buy a week’s worth of groceries at Berkeley Bowl
♦ Play a decent game of bridge
♦ Discover and prove a new mathematical theorem
♦ Design and execute a research program in molecular biology
♦ Write an intentionally funny story
♦ Give competent legal advice in a specialized area of law
♦ Translate spoken English into spoken Swedish in real time
♦ Converse successfully with another person for an hour
♦ Perform a complex surgical operation
♦ Unload any dishwasher and put everything away

Unintentionally funny stories

One day Joe Bear was hungry. He asked his friend Irving Bird where some
honey was. Irving told him there was a beehive in the oak tree. Joe
threatened to hit Irving if he didn’t tell him where some honey was.
The End.

Henry Squirrel was thirsty. He walked over to the river bank where his
good friend Bill Bird was sitting. Henry slipped and fell in the river.
Gravity drowned. The End.

Once upon a time there was a dishonest fox and a vain crow. One day the
crow was sitting in his tree, holding a piece of cheese in his mouth. He
noticed that he was holding the piece of cheese. He became hungry, and
swallowed the cheese. The fox walked over to the crow. The End.

Joe Bear was hungry. He asked Irving Bird where some honey was. Irving
refused to tell him, so Joe offered to bring him a worm if he’d tell him
where some honey was. Irving agreed. But Joe didn’t know where any worms
were, so he asked Irving, who refused to say. So Joe offered to bring him
a worm if he’d tell him where a worm was. Irving agreed. But Joe didn’t
know where any worms were, so he asked Irving, who refused to say. So Joe
offered to bring him a worm if he’d tell him where a worm was . . .

Intelligent Agents

Chapter 2

Reminders

Assignment 0 (lisp refresher) due 1/28
Lisp/emacs/AIMA tutorial: 11–1 today and Monday, 271 Soda

Outline

♦ Agents and environments
♦ Rationality
♦ PEAS (Performance measure, Environment, Actuators, Sensors)
♦ Environment types
♦ Agent types

Agents and environments

[Figure: an agent receives percepts from the environment through its
sensors and acts on the environment through its actuators]

Agents include humans, robots, softbots, thermostats, etc.

The agent function maps from percept histories to actions:

    f : P∗ → A

The agent program runs on the physical architecture to produce f

Vacuum-cleaner world

[Figure: two squares, A and B; the vacuum agent sits in one of them]

Percepts: location and contents, e.g., [A, Dirty]

Actions: Left, Right, Suck, NoOp

A vacuum-cleaner agent

Percept sequence                      Action
[A, Clean]                            Right
[A, Dirty]                            Suck
[B, Clean]                            Left
[B, Dirty]                            Suck
[A, Clean], [A, Clean]                Right
[A, Clean], [A, Dirty]                Suck
. . .                                 . . .

function Reflex-Vacuum-Agent([location, status]) returns an action
    if status = Dirty then return Suck
    else if location = A then return Right
    else if location = B then return Left

What is the right function? Can it be implemented in a small agent program?

Rationality

Fixed performance measure evaluates the environment sequence
– one point per square cleaned up in time T?
– one point per clean square per time step, minus one per move?
– penalize for > k dirty squares?

A rational agent chooses whichever action maximizes the expected value of
the performance measure given the percept sequence to date

Rational ≠ omniscient
– percepts may not supply all relevant information
Rational ≠ clairvoyant
– action outcomes may not be as expected
Hence, rational ≠ successful

Rational ⇒ exploration, learning, autonomy

PEAS

To design a rational agent, we must specify the task environment

Consider, e.g., the task of designing an automated taxi:

Performance measure?? safety, destination, profits, legality, comfort, . . .

Environment?? US streets/freeways, traffic, pedestrians, weather, . . .

Actuators?? steering, accelerator, brake, horn, speaker/display, . . .

Sensors?? video, accelerometers, gauges, engine sensors, keyboard, GPS, . . .
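
Looking back at the percept-sequence table above: taken literally, the
agent function is a lookup table. A minimal Lisp sketch (the table is
truncated and purely illustrative, not part of the AIMA code):

(defparameter *vacuum-table*
  ;; Illustrative sketch: maps complete percept sequences to actions,
  ;; as in the table above.
  '((((A Clean))           . Right)
    (((A Dirty))           . Suck)
    (((B Clean))           . Left)
    (((B Dirty))           . Suck)
    (((A Clean) (A Clean)) . Right)
    (((A Clean) (A Dirty)) . Suck)))

(defun table-vacuum-agent (percept-sequence)
  ;; Sequences missing from the table fall through to NoOp.
  (let ((entry (assoc percept-sequence *vacuum-table* :test #'equal)))
    (if entry (cdr entry) 'NoOp)))

The table grows without bound as the percept sequence lengthens, which is
exactly why the slide asks whether the same function can be implemented in
a small agent program—Reflex-Vacuum-Agent is that small program.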

Internet shopping agent

Performance measure?? price, quality, appropriateness, efficiency

Environment?? current and future WWW sites, vendors, shippers

Actuators?? display to user, follow URL, fill in form

Sensors?? HTML pages (text, graphics, scripts)

Environment types

                    Solitaire   Backgammon   Internet shopping       Taxi
Observable??        Yes         Yes          No                      No
Deterministic??     Yes         No           Partly                  No
Episodic??          No          No           No                      No
Static??            Yes         Semi         Semi                    No
Discrete??          Yes         Yes          Yes                     No
Single-agent??      Yes         No           Yes (except auctions)   No

The environment type largely determines the agent design

The real world is (of course) partially observable, stochastic,
sequential, dynamic, continuous, multi-agent

Agent types

Four basic types in order of increasing generality:
– simple reflex agents
– reflex agents with state
– goal-based agents
– utility-based agents

All these can be turned into learning agents

Simple reflex agents

[Figure: sensors feed “What the world is like now”; condition–action
rules select “What action I should do now”, which goes to the actuators]

Example

function Reflex-Vacuum-Agent([location, status]) returns an action
    if status = Dirty then return Suck
    else if location = A then return Right
    else if location = B then return Left

(setq joe (make-agent :name 'joe :body (make-agent-body)
                      :program (make-reflex-vacuum-agent-program)))

(defun make-reflex-vacuum-agent-program ()
  #'(lambda (percept)
      (let ((location (first percept)) (status (second percept)))
        (cond ((eq status 'dirty) 'Suck)
              ((eq location 'A) 'Right)
              ((eq location 'B) 'Left)))))

Reflex agents with state

[Figure: sensors update an internal state via “How the world evolves”
and “What my actions do”; condition–action rules then pick the action]

Example

function Reflex-Vacuum-Agent-With-State([location, status]) returns an action
    static: last-A, last-B, initially ∞
    if status = Dirty then . . .

(defun make-reflex-vacuum-agent-with-state-program ()
  (let ((last-A infinity) (last-B infinity))
    #'(lambda (percept)
        (let ((location (first percept)) (status (second percept)))
          (incf last-A) (incf last-B)
          (cond ((eq status 'dirty)
                 (if (eq location 'A) (setq last-A 0) (setq last-B 0))
                 'Suck)
                ((eq location 'A) (if (> last-B 3) 'Right 'NoOp))
                ((eq location 'B) (if (> last-A 3) 'Left 'NoOp)))))))

Goal-based agents

[Figure: internal state plus “What my actions do” predict “What it will
be like if I do action A”; goals then determine the action]

Utility-based agents

[Figure: as for goal-based agents, but a utility function rates “How
happy I will be in such a state” and the agent maximizes expected utility]

Learning agents

[Figure: a performance element is monitored by a critic against a
performance standard; a learning element makes changes to the
performance element, and a problem generator suggests exploratory
actions]

Summary

Agents interact with environments through actuators and sensors

The agent function describes what the agent does in all circumstances

The performance measure evaluates the environment sequence

A perfectly rational agent maximizes expected performance

Agent programs implement (some) agent functions

PEAS descriptions define task environments

Environments are categorized along several dimensions:
observable? deterministic? episodic? static? discrete? single-agent?

Several basic agent architectures exist:
reflex, reflex with state, goal-based, utility-based

Reminders

Assignment 0 due 5pm today

Assignment 1 posted, due 2/9

Section 105 will move to 9–10am starting next week

Problem solving and search

Chapter 3

Outline

♦ Problem-solving agents
♦ Problem types
♦ Problem formulation
♦ Example problems
♦ Basic search algorithms

Problem-solving agents

Restricted form of general agent:

function Simple-Problem-Solving-Agent(percept) returns an action
    static: seq, an action sequence, initially empty
            state, some description of the current world state
            goal, a goal, initially null
            problem, a problem formulation
    state ← Update-State(state, percept)
    if seq is empty then
        goal ← Formulate-Goal(state)
        problem ← Formulate-Problem(state, goal)
        seq ← Search(problem)
    action ← Recommendation(seq, state)
    seq ← Remainder(seq, state)
    return action

Note: this is offline problem solving; solution executed “eyes closed.”
Online problem solving involves acting without complete knowledge.

Example: Romania

On holiday in Romania; currently in Arad.
Flight leaves tomorrow from Bucharest

Formulate goal:
    be in Bucharest

Formulate problem:
    states: various cities
    actions: drive between cities

Find solution:
    sequence of cities, e.g., Arad, Sibiu, Fagaras, Bucharest

[Figure: map of Romania with road distances between cities]

Problem types

Deterministic, fully observable =⇒ single-state problem
    Agent knows exactly which state it will be in; solution is a sequence

Non-observable =⇒ conformant problem
    Agent may have no idea where it is; solution (if any) is a sequence

Nondeterministic and/or partially observable =⇒ contingency problem
    percepts provide new information about current state
    solution is a contingent plan or a policy
    often interleave search, execution

Unknown state space =⇒ exploration problem (“online”)

Example: vacuum world

[Figure: the eight possible configurations of the two-square vacuum
world, numbered 1–8]

Single-state, start in #5. Solution?? [Right, Suck]

Conformant, start in {1, 2, 3, 4, 5, 6, 7, 8}
e.g., Right goes to {2, 4, 6, 8}. Solution?? [Right, Suck, Left, Suck]

Contingency, start in #5
    Murphy’s Law: Suck can dirty a clean carpet
    Local sensing: dirt, location only.
Solution?? [Right, if dirt then Suck]

Single-state problem formulation

A problem is defined by four items:

initial state, e.g., “at Arad”

successor function S(x) = set of action–state pairs
    e.g., S(Arad) = {⟨Arad → Zerind, Zerind⟩, . . .}

goal test, can be
    explicit, e.g., x = “at Bucharest”
    implicit, e.g., NoDirt(x)

path cost (additive)
    e.g., sum of distances, number of actions executed, etc.
    c(x, a, y) is the step cost, assumed to be ≥ 0

A solution is a sequence of actions
leading from the initial state to a goal state

Selecting a state space

Real world is absurdly complex
⇒ state space must be abstracted for problem solving

(Abstract) state = set of real states

(Abstract) action = complex combination of real actions
    e.g., “Arad → Zerind” represents a complex set
    of possible routes, detours, rest stops, etc.

For guaranteed realizability, any real state “in Arad”
must get to some real state “in Zerind”

(Abstract) solution =
    set of real paths that are solutions in the real world

Each abstract action should be “easier” than the original problem!

Example: vacuum world state space graph

[Figure: the eight vacuum-world states connected by L (Left), R (Right),
and S (Suck) transitions]

states??: integer dirt and robot locations (ignore dirt amounts etc.)
actions??: Left, Right, Suck, NoOp
goal test??: no dirt
path cost??: 1 per action (0 for NoOp)

Example: The 8-puzzle

[Figure: a scrambled 3×3 start state and the ordered goal state]

states??: integer locations of tiles (ignore intermediate positions)
actions??: move blank left, right, up, down (ignore unjamming etc.)
goal test??: = goal state (given)
path cost??: 1 per move

[Note: optimal solution of n-Puzzle family is NP-hard]

Example: robotic assembly

[Figure: a jointed two-armed robot manipulating a part]

states??: real-valued coordinates of robot joint angles
          parts of the object to be assembled
actions??: continuous motions of robot joints
goal test??: complete assembly with no robot included!
path cost??: time to execute

Tree search algorithms

Basic idea:
    offline, simulated exploration of state space
    by generating successors of already-explored states
    (a.k.a. expanding states)

function Tree-Search(problem, strategy) returns a solution, or failure
    initialize the search tree using the initial state of problem
    loop do
        if there are no candidates for expansion then return failure
        choose a leaf node for expansion according to strategy
        if the node contains a goal state then return the corresponding solution
        else expand the node and add the resulting nodes to the search tree
    end

Tree search example

[Figure: the first expansions of the Romania search tree rooted at Arad,
with children Sibiu, Timisoara, and Zerind, then Sibiu’s children Arad,
Fagaras, Oradea, and Rimnicu Vilcea]

depth. problem) returns a set of nodes 6 1 88 successors ← the empty set state for each action. children. or path cost! if fringe is empty then return failure parent. fringe) includes parent. Tree search example Tree search example Arad Arad Sibiu Timisoara Zerind Sibiu Timisoara Zerind Arad Fagaras Oradea Rimnicu Vilcea Arad Lugoj Arad Oradea Arad Fagaras Oradea Rimnicu Vilcea Arad Lugoj Arad Oradea Chapter 3 27 Chapter 3 28 Implementation: states vs. State[s] ← result The Expand function creates new nodes. fringe) returns a solution. path cost g(x) loop do States do not have parents. Depth[s] ← Depth[node] + 1 add s to successors return successors Chapter 3 29 Chapter 3 30 . filling in the various fields and Path-Cost[s] ← Path-Cost[node] + Step-Cost(node. depth. Action[s] ← action. State(node)) then return node fringe ← InsertAll(Expand(node. fringe) depth = 6 State 5 4 Node g=6 function Expand( node. action. result in Successor-Fn(problem. nodes Implementation: general tree search A state is a (representation of) a physical configuration function Tree-Search( problem. problem). s) using the SuccessorFn of the problem to create the corresponding states. action node ← Remove-Front(fringe) if Goal-Test(problem. or failure A node is a data structure constituting part of a search tree fringe ← Insert(Make-Node(Initial-State[problem]). State[node]) do 7 3 22 s ← a new Node Parent-Node[s] ← node. children.

Search strategies

A strategy is defined by picking the order of node expansion

Strategies are evaluated along the following dimensions:
    completeness—does it always find a solution if one exists?
    time complexity—number of nodes generated/expanded
    space complexity—maximum number of nodes in memory
    optimality—does it always find a least-cost solution?

Time and space complexity are measured in terms of
    b—maximum branching factor of the search tree
    d—depth of the least-cost solution
    m—maximum depth of the state space (may be ∞)

Uninformed search strategies

Uninformed strategies use only the information available
in the problem definition:

    Breadth-first search
    Uniform-cost search
    Depth-first search
    Depth-limited search
    Iterative deepening search

Breadth-first search

Expand shallowest unexpanded node

Implementation:
    fringe is a FIFO queue, i.e., new successors go at end

[Figure: a binary tree expanded in breadth-first order: A; then B, C;
then D, E, F, G]


Properties of breadth-first search

Complete?? Yes (if b is finite)

Time?? 1 + b + b^2 + b^3 + . . . + b^d + b(b^d − 1) = O(b^(d+1)), i.e., exp. in d

Space?? O(b^(d+1)) (keeps every node in memory)

Optimal?? Yes (if cost = 1 per step); not optimal in general

Space is the big problem: can easily generate nodes at 100MB/sec,
so 24hrs = 8640GB

Uniform-cost search

Expand least-cost unexpanded node

Implementation:
    fringe = queue ordered by path cost, lowest first

Equivalent to breadth-first if step costs all equal

Complete?? Yes, if step cost ≥ ε

Time?? # of nodes with g ≤ cost of optimal solution, O(b^⌈C*/ε⌉)
    where C* is the cost of the optimal solution

Space?? # of nodes with g ≤ cost of optimal solution, O(b^⌈C*/ε⌉)

Optimal?? Yes—nodes expanded in increasing order of g(n)

Depth-first search

Expand deepest unexpanded node

Implementation:
    fringe = LIFO queue, i.e., put successors at front

[Figure: a binary tree expanded in depth-first order, diving from A
to B to D, backtracking to E, and so on across the tree]



Properties of depth-first search

Complete?? No: fails in infinite-depth spaces, spaces with loops
    Modify to avoid repeated states along path
    ⇒ complete in finite spaces

Time?? O(b^m): terrible if m is much larger than d
    but if solutions are dense, may be much faster than breadth-first

Space?? O(bm), i.e., linear space!

Optimal?? No

Depth-limited search

= depth-first search with depth limit l,
i.e., nodes at depth l have no successors

Recursive implementation:

function Depth-Limited-Search(problem, limit) returns soln/fail/cutoff
    Recursive-DLS(Make-Node(Initial-State[problem]), problem, limit)

function Recursive-DLS(node, problem, limit) returns soln/fail/cutoff
    cutoff-occurred? ← false
    if Goal-Test(problem, State[node]) then return node
    else if Depth[node] = limit then return cutoff
    else for each successor in Expand(node, problem) do
        result ← Recursive-DLS(successor, problem, limit)
        if result = cutoff then cutoff-occurred? ← true
        else if result ≠ failure then return result
    if cutoff-occurred? then return cutoff else return failure

Iterative deepening search

function Iterative-Deepening-Search(problem) returns a solution
    inputs: problem, a problem
    for depth ← 0 to ∞ do
        result ← Depth-Limited-Search(problem, depth)
        if result ≠ cutoff then return result
    end

[Figure: with Limit = 0, only the root A is expanded]

[Figure: with Limit = 1, 2, and 3, iterative deepening re-expands the
tree from the root down to each successive depth]

Properties of iterative deepening search

Complete?? Yes

Time?? (d + 1)b^0 + db^1 + (d − 1)b^2 + . . . + b^d = O(b^d)

Space?? O(bd)

Optimal?? Yes, if step cost = 1
    Can be modified to explore uniform-cost tree

Numerical comparison for b = 10 and d = 5, solution at far right leaf:

    N(IDS) = 50 + 400 + 3,000 + 20,000 + 100,000 = 123,450
    N(BFS) = 10 + 100 + 1,000 + 10,000 + 100,000 + 999,990 = 1,111,100

IDS does better because other nodes at depth d are not expanded

BFS can be modified to apply goal test when a node is generated
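
A compact Lisp sketch of the same idea (illustrative names; unlike the
pseudocode above it does not distinguish cutoff from failure, so it
assumes a solution exists):

(defun depth-limited-search (state goalp successors limit)
  ;; Illustrative sketch: depth-first search that never descends
  ;; below LIMIT; returns a goal state or NIL.
  (cond ((funcall goalp state) state)
        ((zerop limit) nil)
        (t (some #'(lambda (s)
                     (depth-limited-search s goalp successors (1- limit)))
                 (funcall successors state)))))

(defun iterative-deepening-search (start goalp successors)
  ;; Repeated depth-limited search with limits 0, 1, 2, ...
  (loop for depth from 0
        thereis (depth-limited-search start goalp successors depth)))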

Summary of algorithms

Criterion     Breadth-    Uniform-    Depth-    Depth-          Iterative
              First       Cost        First     Limited         Deepening
Complete?     Yes*        Yes*        No        Yes, if l ≥ d   Yes
Time          b^(d+1)     b^⌈C*/ε⌉    b^m       b^l             b^d
Space         b^(d+1)     b^⌈C*/ε⌉    bm        bl              bd
Optimal?      Yes*        Yes*        No        No              Yes*

Repeated states

Failure to detect repeated states can turn a linear problem into an
exponential one!

[Figure: a linear chain of states A–B–C–D whose search tree doubles
in size at every level]

Graph search

function Graph-Search(problem, fringe) returns a solution, or failure
    closed ← an empty set
    fringe ← Insert(Make-Node(Initial-State[problem]), fringe)
    loop do
        if fringe is empty then return failure
        node ← Remove-Front(fringe)
        if Goal-Test(problem, State[node]) then return node
        if State[node] is not in closed then
            add State[node] to closed
            fringe ← InsertAll(Expand(node, problem), fringe)
    end

Summary

Problem formulation usually requires abstracting away real-world details
to define a state space that can feasibly be explored

Variety of uninformed search strategies

Iterative deepening search uses only linear space
and not much more time than other uninformed algorithms

Graph search can be exponentially more efficient than tree search
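
A minimal Lisp sketch of the Graph-Search loop above (illustrative, with
bare states as nodes, a FIFO fringe, and a hash table as the closed set):

(defun graph-search (start goalp successors)
  ;; Illustrative sketch: tree search plus a closed set, so each
  ;; state is expanded at most once.  Returns a goal state or NIL.
  (let ((closed (make-hash-table :test #'equal))
        (fringe (list start)))
    (loop while fringe do
          (let ((node (pop fringe)))
            (when (funcall goalp node) (return node))
            (unless (gethash node closed)
              (setf (gethash node closed) t)
              (setf fringe (append fringe (funcall successors node))))))))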

Informed search algorithms

Chapter 4, Sections 1–2

Outline

♦ Best-first search
♦ A∗ search
♦ Heuristics

Review: Tree search

function Tree-Search(problem, fringe) returns a solution, or failure
    fringe ← Insert(Make-Node(Initial-State[problem]), fringe)
    loop do
        if fringe is empty then return failure
        node ← Remove-Front(fringe)
        if Goal-Test[problem] applied to State(node) succeeds return node
        fringe ← InsertAll(Expand(node, problem), fringe)

A strategy is defined by picking the order of node expansion

Best-first search

Idea: use an evaluation function for each node
    – estimate of “desirability”
⇒ Expand most desirable unexpanded node

Implementation:
fringe is a queue sorted in decreasing order of desirability

Special cases:
    greedy search
    A∗ search

Romania with step costs in km

[Figure: map of Romania with road distances, plus a table of
straight-line distances to Bucharest, e.g., Arad 366, Bucharest 0,
Sibiu 253, Timisoara 329, Zerind 374]

Greedy search

Evaluation function h(n) (heuristic)
    = estimate of cost from n to the closest goal

E.g., hSLD(n) = straight-line distance from n to Bucharest

Greedy search expands the node that appears to be closest to goal

[Figure: greedy search from Arad (h = 366) expands Sibiu (253), then
Fagaras (176), reaching Bucharest (0) by a non-optimal route]

Properties of greedy search

Complete?? No–can get stuck in loops, e.g., with Oradea as goal,
    Iasi → Neamt → Iasi → Neamt →
Complete in finite space with repeated-state checking

Time?? O(b^m), but a good heuristic can give dramatic improvement

Space?? O(b^m)—keeps all nodes in memory

Optimal?? No

A∗ search

Idea: avoid expanding paths that are already expensive

Evaluation function f(n) = g(n) + h(n)

    g(n) = cost so far to reach n
    h(n) = estimated cost to goal from n
    f(n) = estimated total cost of path through n to goal

A∗ search uses an admissible heuristic,
i.e., h(n) ≤ h*(n) where h*(n) is the true cost from n.
(Also require h(n) ≥ 0, so h(G) = 0 for any goal G.)

E.g., hSLD(n) never overestimates the actual road distance

Theorem: A∗ search is optimal

[Figure: A∗ search from Arad; each node is labelled f = g + h, e.g.,
Sibiu 393=140+253, Rimnicu Vilcea 413=220+193, Fagaras 415=239+176,
Pitesti 417=317+100, and finally Bucharest 418=418+0, the optimal
solution]

Optimality of A∗ (standard proof)

Suppose some suboptimal goal G2 has been generated and is in the queue.
Let n be an unexpanded node on a shortest path to an optimal goal G1.

    f(G2) = g(G2)   since h(G2) = 0
          > g(G1)   since G2 is suboptimal
          ≥ f(n)    since h is admissible

Since f(G2) > f(n), A∗ will never select G2 for expansion

Optimality of A∗ (more useful)

Lemma: A∗ expands nodes in order of increasing f value*

Gradually adds “f-contours” of nodes (cf. breadth-first adds layers)

Contour i has all nodes with f = fi, where fi < fi+1

[Figure: elliptical contours with f = 380, 400, and 420 spreading out
from Arad over the Romania map]

Properties of A∗

Complete?? Yes, unless there are infinitely many nodes with f ≤ f(G)

Time?? Exponential in [relative error in h × length of soln.]

Space?? Keeps all nodes in memory

Optimal?? Yes—cannot expand fi+1 until fi is finished

A∗ expands all nodes with f(n) < C*
A∗ expands some nodes with f(n) = C*
A∗ expands no nodes with f(n) > C*

Proof of lemma: Consistency

A heuristic is consistent if

    h(n) ≤ c(n, a, n′) + h(n′)

If h is consistent, we have

    f(n′) = g(n′) + h(n′)
          = g(n) + c(n, a, n′) + h(n′)
          ≥ g(n) + h(n)
          = f(n)

I.e., f(n) is nondecreasing along any path.

Admissible heuristics

E.g., for the 8-puzzle:

h1(n) = number of misplaced tiles
h2(n) = total Manhattan distance
    (i.e., no. of squares from desired location of each tile)

[Figure: start state 7 2 4 / 5 _ 6 / 8 3 1 and goal state
_ 1 2 / 3 4 5 / 6 7 8]

h1(S) =?? 6
h2(S) =?? 4+0+3+3+1+0+2+1 = 14

Dominance

If h2(n) ≥ h1(n) for all n (both admissible)
then h2 dominates h1 and is better for search

Typical search costs:

    d = 14   IDS = 3,473,941 nodes
             A∗(h1) = 539 nodes
             A∗(h2) = 113 nodes
    d = 24   IDS ≈ 54,000,000,000 nodes
             A∗(h1) = 39,135 nodes
             A∗(h2) = 1,641 nodes

Given any admissible heuristics ha, hb,
    h(n) = max(ha(n), hb(n))
is also admissible and dominates ha, hb

Relaxed problems

Admissible heuristics can be derived from the exact solution cost of
a relaxed version of the problem

If the rules of the 8-puzzle are relaxed so that a tile can move
anywhere, then h1(n) gives the shortest solution

If the rules are relaxed so that a tile can move to any adjacent
square, then h2(n) gives the shortest solution

Key point: the optimal solution cost of a relaxed problem
is no greater than the optimal solution cost of the real problem

Relaxed problems contd.

Well-known example: travelling salesperson problem (TSP)
Find the shortest tour visiting all cities exactly once

[Figure: a set of cities joined into a closed tour]

Minimum spanning tree can be computed in O(n²)
and is a lower bound on the shortest (open) tour

Summary

Heuristic functions estimate costs of shortest paths

Good heuristics can dramatically reduce search cost

Greedy best-first search expands lowest h
    – incomplete and not always optimal

A∗ search expands lowest g + h
    – complete and optimal
    – also optimally efficient (up to tie-breaks, for forward search)

Admissible heuristics can be derived from exact solution of relaxed
problems
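
A small Lisp sketch of A∗ as defined above (illustrative and deliberately
inefficient: the fringe is re-sorted by f = g + h at every step rather
than kept in a priority queue; SUCCESSORS returns (state . step-cost)
pairs).  With h ≡ 0 the same code performs uniform-cost search:

(defun a-star-search (start goalp successors h)
  ;; Illustrative sketch.  Fringe entries are (state g path), with
  ;; PATH stored newest-first; returns the path in start-to-goal order.
  (let ((fringe (list (list start 0 (list start)))))
    (loop while fringe do
          (destructuring-bind (state g path) (pop fringe)
            (when (funcall goalp state)
              (return (reverse path)))
            (dolist (succ (funcall successors state))
              (push (list (car succ)
                          (+ g (cdr succ))
                          (cons (car succ) path))
                    fringe))
            (setf fringe
                  (sort fringe #'<
                        :key #'(lambda (n)
                                 ;; f(n) = g(n) + h(n)
                                 (+ (second n)
                                    (funcall h (first n))))))))))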

Local search algorithms

Chapter 4, Sections 3–4

Outline

♦ Hill-climbing
♦ Simulated annealing
♦ Genetic algorithms (briefly)
♦ Local search in continuous spaces (very briefly)

Iterative improvement algorithms

In many optimization problems, path is irrelevant;
the goal state itself is the solution

Then state space = set of “complete” configurations;
find optimal configuration, e.g., TSP,
or find configuration satisfying constraints, e.g., timetable

In such cases, can use iterative improvement algorithms:
keep a single “current” state, try to improve it

Constant space, suitable for online as well as offline search

Example: Travelling Salesperson Problem

Start with any complete tour, perform pairwise exchanges

[Figure: a tour before and after exchanging a pair of edges]

Variants of this approach get within 1% of optimal very quickly with
thousands of cities

Example: n-queens

Put n queens on an n × n board with no two queens on the same
row, column, or diagonal

Move a queen to reduce number of conflicts

[Figure: board positions with h = 5, h = 2, and h = 0 conflicts]

Almost always solves n-queens problems almost instantaneously
for very large n, e.g., n = 1 million

Hill-climbing (or gradient ascent/descent)

“Like climbing Everest in thick fog with amnesia”

function Hill-Climbing(problem) returns a state that is a local maximum
    inputs: problem, a problem
    local variables: current, a node
                     neighbor, a node
    current ← Make-Node(Initial-State[problem])
    loop do
        neighbor ← a highest-valued successor of current
        if Value[neighbor] ≤ Value[current] then return State[current]
        current ← neighbor
    end
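
The same loop in a few lines of Lisp (illustrative: NEIGHBORS and VALUE
stand in for the problem’s successor function and objective):

(defun hill-climbing (state neighbors value)
  ;; Illustrative sketch: move to the best neighbor until no neighbor
  ;; improves on the current state, i.e., until a local maximum.
  (let ((best (first (sort (copy-list (funcall neighbors state))
                           #'> :key value))))
    (if (and best (> (funcall value best) (funcall value state)))
        (hill-climbing best neighbors value)
        state)))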

Hill-climbing contd.

Useful to consider state space landscape

[Figure: objective function over the state space, with a global
maximum, local maxima, a “flat” local maximum, and a shoulder]

Random-restart hill climbing overcomes local maxima—trivially complete

Random sideways moves escape from shoulders, loop on flat maxima

Simulated annealing

Idea: escape local maxima by allowing some “bad” moves
but gradually decrease their size and frequency

function Simulated-Annealing(problem, schedule) returns a solution state
    inputs: problem, a problem
            schedule, a mapping from time to “temperature”
    local variables: current, a node
                     next, a node
                     T, a “temperature” controlling prob. of downward steps
    current ← Make-Node(Initial-State[problem])
    for t ← 1 to ∞ do
        T ← schedule[t]
        if T = 0 then return current
        next ← a randomly selected successor of current
        ∆E ← Value[next] – Value[current]
        if ∆E > 0 then current ← next
        else current ← next only with probability e^(∆E/T)

Properties of simulated annealing

At fixed “temperature” T, state occupation probability reaches
Boltzman distribution

    p(x) = α e^(E(x)/kT)

T decreased slowly enough =⇒ always reach best state x*
because e^(E(x*)/kT) / e^(E(x)/kT) = e^((E(x*)−E(x))/kT) ≫ 1 for small T

Is this necessarily an interesting guarantee??

Devised by Metropolis et al., 1953, for physical process modelling

Widely used in VLSI layout, airline scheduling, etc.

Local beam search

Idea: keep k states instead of 1; choose top k of all their successors

Not the same as k searches run in parallel!
Searches that find good states recruit other searches to join them

Problem: quite often, all k states end up on same local hill

Idea: choose k successors randomly, biased towards good ones

Observe the close analogy to natural selection!
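
Back to simulated annealing: its acceptance rule is just a few lines of
Lisp (illustrative; ∆E > 0 means the candidate move is uphill):

(defun accept-move-p (delta-e temperature)
  ;; Illustrative sketch: uphill moves are always taken; a downhill
  ;; move of (negative) size DELTA-E is taken with probability
  ;; e^(DELTA-E/T), which shrinks toward 0 as the temperature falls.
  (or (> delta-e 0)
      (< (random 1.0) (exp (/ delta-e temperature)))))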

Genetic algorithms

= stochastic local beam search + generate successors from pairs of states

[Figure: a population of 8-queens strings, e.g. 24748552 and 32752411,
ranked by fitness (24, 23, 20, 11) and selection percentage (31%, 29%,
26%, 14%), then paired, crossed over, and mutated]

Genetic algorithms contd.

GAs require states encoded as strings (GPs use programs)

Crossover helps iff substrings are meaningful components

GAs ≠ evolution: e.g., real genes encode replication machinery!

Continuous state spaces

Suppose we want to site three airports in Romania:
– 6-D state space defined by (x1, y1), (x2, y2), (x3, y3)
– objective function f(x1, y1, x2, y2, x3, y3) =
  sum of squared distances from each city to nearest airport

Discretization methods turn continuous space into discrete space,
e.g., empirical gradient considers ±δ change in each coordinate

Gradient methods compute

    ∇f = (∂f/∂x1, ∂f/∂y1, ∂f/∂x2, ∂f/∂y2, ∂f/∂x3, ∂f/∂y3)

to increase/reduce f, e.g., by x ← x + α∇f(x)

Sometimes can solve for ∇f(x) = 0 exactly (e.g., with one city).
Newton–Raphson (1664, 1690) iterates

    x ← x − H^(−1)(x)∇f(x)

to solve ∇f(x) = 0, where Hij = ∂²f/∂xi ∂xj

Constraint Satisfaction Problems

Chapter 5

Outline

♦ CSP examples
♦ Backtracking search for CSPs
♦ Problem structure and problem decomposition
♦ Local search for CSPs

Constraint satisfaction problems (CSPs)

Standard search problem:
    state is a “black box”—any old data structure
    that supports goal test, eval, successor

CSP:
    state is defined by variables Xi with values from domain Di
    goal test is a set of constraints specifying
    allowable combinations of values for subsets of variables

Simple example of a formal representation language

Allows useful general-purpose algorithms with more power
than standard search algorithms

Example: Map-Coloring

[Figure: map of Australia with its seven states and territories]

Variables: WA, NT, Q, NSW, V, SA, T
Domains: Di = {red, green, blue}
Constraints: adjacent regions must have different colors
    e.g., WA ≠ NT (if the language allows this), or
    (WA, NT) ∈ {(red, green), (red, blue), (green, red), . . .}

Example: Map-Coloring contd.

Solutions are assignments satisfying all constraints, e.g.,
    {WA = red, NT = green, Q = red, NSW = green,
     V = red, SA = blue, T = green}

Constraint graph

Binary CSP: each constraint relates at most two variables

Constraint graph: nodes are variables, arcs show constraints

[Figure: constraint graph for the Australia map-coloring problem;
Tasmania is disconnected from the mainland]

General-purpose CSP algorithms use the graph structure
to speed up search, e.g., Tasmania is an independent subproblem!

Varieties of CSPs

Discrete variables
    finite domains; size d ⇒ O(d^n) complete assignments
        ♦ e.g., Boolean CSPs, incl. Boolean satisfiability (NP-complete)
    infinite domains (integers, strings, etc.)
        ♦ e.g., job scheduling, variables are start/end days for each job
        ♦ need a constraint language, e.g., StartJob1 + 5 ≤ StartJob3
        ♦ linear constraints solvable, nonlinear undecidable

Continuous variables
    ♦ e.g., start/end times for Hubble Telescope observations
    ♦ linear constraints solvable in poly time by LP methods

Varieties of constraints

Unary constraints involve a single variable,
    e.g., SA ≠ green

Binary constraints involve pairs of variables,
    e.g., SA ≠ WA

Higher-order constraints involve 3 or more variables,
    e.g., cryptarithmetic column constraints

Preferences (soft constraints), e.g., red is better than green,
often representable by a cost for each variable assignment
→ constrained optimization problems

Example: Cryptarithmetic

      T W O
    + T W O
    -------
    F O U R

Variables: F T U W R O X1 X2 X3
Domains: {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
Constraints:
    alldiff(F, T, U, W, R, O)
    O + O = R + 10 · X1, etc.

Real-world CSPs

Assignment problems
    e.g., who teaches what class

Timetabling problems
    e.g., which class is offered when and where?

Hardware configuration

Spreadsheets

Transportation scheduling

Factory scheduling

Floorplanning

Notice that many real-world problems involve real-valued variables

Standard search formulation (incremental)

Let’s start with the straightforward, dumb approach, then fix it

States are defined by the values assigned so far

♦ Initial state: the empty assignment, { }

♦ Successor function: assign a value to an unassigned variable
  that does not conflict with current assignment.
  ⇒ fail if no legal assignments (not fixable!)

♦ Goal test: the current assignment is complete

1) This is the same for all CSPs!
2) Every solution appears at depth n with n variables
   ⇒ use depth-first search
3) Path is irrelevant, so can also use complete-state formulation
4) b = (n − ℓ)d at depth ℓ, hence n!d^n leaves!!!!

Backtracking search

Variable assignments are commutative, i.e.,
    [WA = red then NT = green] same as [NT = green then WA = red]

Only need to consider assignments to a single variable at each node
    ⇒ b = d and there are d^n leaves

Depth-first search for CSPs with single-variable assignments
is called backtracking search

function Backtracking-Search(csp) returns solution/failure
    return Recursive-Backtracking({ }, csp)

function Recursive-Backtracking(assignment, csp) returns soln/failure
    if assignment is complete then return assignment
    var ← Select-Unassigned-Variable(Variables[csp], assignment, csp)
    for each value in Order-Domain-Values(var, assignment, csp) do
        if value is consistent with assignment given Constraints[csp] then
            add {var = value} to assignment
            result ← Recursive-Backtracking(assignment, csp)
            if result ≠ failure then return result
            remove {var = value} from assignment
    return failure

Backtracking search is the basic uninformed algorithm for CSPs

Can solve n-queens for n ≈ 25
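
A minimal Lisp rendering of Recursive-Backtracking (illustrative:
variables are tried in the given order with no heuristics; an assignment
is an alist of (var . value) pairs, and CONSISTENTP tests a partial
assignment against the constraints):

(defun backtracking-search (vars domains consistentp
                            &optional assignment)
  ;; Illustrative sketch.  VARS: unassigned variables; DOMAINS: a
  ;; function from a variable to its list of values.  Returns a
  ;; complete consistent assignment, or NIL for failure.
  (if (null vars)
      assignment
      (dolist (value (funcall domains (first vars)))
        (let ((new (acons (first vars) value assignment)))
          (when (funcall consistentp new)
            (let ((result (backtracking-search (rest vars) domains
                                               consistentp new)))
              (when result (return result))))))))

For map coloring, vars would be (WA NT Q NSW V SA T), domains would
return (red green blue) for every variable, and consistentp would check
the adjacency constraints.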

Backtracking example

[Figure: successive stages of backtracking search on the Australia map,
assigning a color to WA, then NT, then Q]

Improving backtracking efficiency

General-purpose methods can give huge gains in speed:
    1. Which variable should be assigned next?
    2. In what order should its values be tried?
    3. Can we detect inevitable failure early?
    4. Can we take advantage of problem structure?

Minimum remaining values

Minimum remaining values (MRV):
choose the variable with the fewest legal values

[Figure: a partial map coloring in which SA has the fewest legal
values remaining]

Degree heuristic

Tie-breaker among MRV variables

Degree heuristic:
choose the variable with the most constraints on remaining variables

[Figure: on the empty map, SA has the most constraints]

Least constraining value

Given a variable, choose the least constraining value:
the one that rules out the fewest values in the remaining variables

[Figure: one choice for Q allows 1 value for SA; the other allows 0]

Combining these heuristics makes 1000 queens feasible

Forward checking
Idea: Keep track of remaining legal values for unassigned variables
Terminate search when any variable has no legal values
(four figure slides: after WA = red, red is deleted from the domains of NT and SA; after Q = green, green is deleted from NT, SA, NSW; after V = blue, SA is left with no legal values and the search terminates)
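A small Python rendering of the idea, specialised to the ≠ constraints of the map-coloring sketch; a hypothetical helper, not the slides' code.

    def forward_check(var, value, current_domains):
        # Prune `value` from each neighbor's remaining domain; report a wipeout.
        pruned = {v: set(d) for v, d in current_domains.items()}
        pruned[var] = {value}
        for nb in neighbors[var]:
            pruned[nb].discard(value)
            if not pruned[nb]:
                return None          # nb has no legal values left: fail early
        return pruned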

Constraint propagation
Forward checking propagates information from assigned to unassigned variables, but doesn't provide early detection for all failures:
(figure: after WA = red and Q = green, forward checking leaves only blue for both NT and SA)
NT and SA cannot both be blue!
Constraint propagation repeatedly enforces constraints locally

Arc consistency
Simplest form of propagation makes each arc consistent
X → Y is consistent iff for every value x of X there is some allowed y
(figure slides: checking arcs of the map-coloring graph; making an arc consistent deletes values from the tail variable's domain)
If X loses a value, neighbors of X need to be rechecked
Arc consistency detects failure earlier than forward checking
Can be run as a preprocessor or after each assignment

Arc consistency algorithm

function AC-3(csp) returns the CSP, possibly with reduced domains
   inputs: csp, a binary CSP with variables {X1, X2, . . . , Xn}
   local variables: queue, a queue of arcs, initially all the arcs in csp
   while queue is not empty do
      (Xi, Xj) ← Remove-First(queue)
      if Remove-Inconsistent-Values(Xi, Xj) then
         for each Xk in Neighbors[Xi] do
            add (Xk, Xi) to queue

function Remove-Inconsistent-Values(Xi, Xj) returns true iff succeeds
   removed ← false
   for each x in Domain[Xi] do
      if no value y in Domain[Xj] allows (x, y) to satisfy the constraint Xi ↔ Xj
         then delete x from Domain[Xi]; removed ← true
   return removed

O(n^2 d^3); can be reduced to O(n^2 d^2) (but detecting all inconsistencies is NP-hard)

Problem structure
(figure: the Australia constraint graph)
Tasmania and mainland are independent subproblems
Identifiable as connected components of constraint graph

Problem structure contd.
Suppose each subproblem has c variables out of n total
Worst-case solution cost is n/c · d^c, linear in n
E.g., n = 80, d = 2, c = 20:
   2^80 = 4 billion years at 10 million nodes/sec
   4 · 2^20 = 0.4 seconds at 10 million nodes/sec
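Returning to the AC-3 pseudocode above, here is a Python version specialised to the ≠ constraints of the map-coloring sketch (it mutates the domains in place); the helper names are ours.

    from collections import deque

    def ac3(domains):
        # Enforce consistency of every arc (Xi, Xj) in the constraint graph.
        queue = deque((xi, xj) for xi in variables for xj in neighbors[xi])
        while queue:
            xi, xj = queue.popleft()
            if remove_inconsistent_values(xi, xj, domains):
                if not domains[xi]:
                    return False     # a domain was wiped out: unsatisfiable
                for xk in neighbors[xi]:
                    queue.append((xk, xi))
        return True

    def remove_inconsistent_values(xi, xj, domains):
        # Delete x from Domain[Xi] if no y in Domain[Xj] satisfies Xi != Xj.
        removed = False
        for x in set(domains[xi]):
            if not any(x != y for y in domains[xj]):
                domains[xi].remove(x)
                removed = True
        return removed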
Tree-structured CSPs
(figure: a constraint graph with nodes A–F that forms a tree)
Theorem: if the constraint graph has no loops, the CSP can be solved in O(n d^2) time
Compare to general CSPs, where worst-case time is O(d^n)
This property also applies to logical and probabilistic reasoning: an important example of the relation between syntactic restrictions and the complexity of reasoning.

Algorithm for tree-structured CSPs
1. Choose a variable as root, order variables from root to leaves such that every node's parent precedes it in the ordering
2. For j from n down to 2, apply RemoveInconsistent(Parent(Xj), Xj)
3. For j from 1 to n, assign Xj consistently with Parent(Xj)

Nearly tree-structured CSPs
Conditioning: instantiate a variable, prune its neighbors' domains
(figure: instantiating SA and removing it from the Australia graph leaves a tree)
Cutset conditioning: instantiate (in all ways) a set of variables such that the remaining constraint graph is a tree
Cutset size c ⇒ runtime O(d^c · (n − c)d^2), very fast for small c

Iterative algorithms for CSPs
Hill-climbing, simulated annealing typically work with "complete" states, i.e., all variables assigned
To apply to CSPs:
   allow states with unsatisfied constraints
   operators reassign variable values
Variable selection: randomly select any conflicted variable
Value selection by min-conflicts heuristic: choose value that violates the fewest constraints, i.e., hillclimb with h(n) = total number of violated constraints

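A minimal min-conflicts sketch for the n-queens problem (anticipating the example on the next slide); the state representation, one queen per column mapped to a row, is our own choice.

    import random

    def min_conflicts(n, max_steps=100_000):
        cols = list(range(n))
        state = {c: random.randrange(n) for c in cols}
        def conflicts(col, row):
            # Attacks: same row or same diagonal as another column's queen.
            return sum(1 for c in cols if c != col and
                       (state[c] == row or abs(state[c] - row) == abs(c - col)))
        for _ in range(max_steps):
            conflicted = [c for c in cols if conflicts(c, state[c]) > 0]
            if not conflicted:
                return state                         # solution: no attacks
            col = random.choice(conflicted)          # random conflicted variable
            state[col] = min(range(n), key=lambda r: conflicts(col, r))
        return None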
Example: 4-Queens
States: 4 queens in 4 columns (4^4 = 256 states)
Operators: move queen in column
Goal test: no attacks
Evaluation: h(n) = number of attacks
(figure: three board states with h = 5, h = 2, h = 0)

Performance of min-conflicts
Given random initial state, can solve n-queens in almost constant time for arbitrary n with high probability (e.g., n = 10,000,000)
The same appears to be true for any randomly-generated CSP, except in a narrow range of the ratio
   R = number of constraints / number of variables
(figure: CPU time spikes sharply around the critical ratio R)

Summary
CSPs are a special kind of problem:
   states defined by values of a fixed set of variables
   goal test defined by constraints on variable values
Backtracking = depth-first search with one variable assigned per node
Variable ordering and value selection heuristics help significantly
Forward checking prevents assignments that guarantee later failure
Constraint propagation (e.g., arc consistency) does additional work to constrain values and detect inconsistencies
The CSP representation allows analysis of problem structure
Tree-structured CSPs can be solved in linear time
Iterative min-conflicts is usually effective in practice

Game playing

Chapter 6

Outline
♦ Games
♦ Perfect play
   – minimax decisions
   – α–β pruning
♦ Resource limits and approximate evaluation
♦ Games of chance
♦ Games of imperfect information

Games vs. search problems
"Unpredictable" opponent ⇒ solution is a strategy specifying a move for every possible opponent reply
Time limits ⇒ unlikely to find goal, must approximate
Plan of attack:
• Computer considers possible lines of play (Babbage, 1846)
• Algorithm for perfect play (Zermelo, 1912; Von Neumann, 1944)
• Finite horizon, approximate evaluation (Zuse, 1945; Wiener, 1948; Shannon, 1950)
• First chess program (Turing, 1951)
• Machine learning to improve evaluation accuracy (Samuel, 1952–57)
• Pruning to allow deeper search (McCarthy, 1956)

Types of games
                        deterministic              chance
perfect information     chess, checkers,           backgammon, monopoly
                        go, othello
imperfect information   battleships,               bridge, poker, scrabble,
                        blind tictactoe            nuclear war

Game tree (2-player, deterministic, turns)
(figure: the tic-tac-toe game tree; MAX (X) and MIN (O) alternate moves down to terminal boards with utility −1, 0, or +1)

Minimax
Perfect play for deterministic, perfect-information games
Idea: choose move to position with highest minimax value
   = best achievable payoff against best play
E.g., 2-ply game:
(figure: MAX node of value 3 over MIN nodes of values 3, 2, 2, with leaves 3 12 8 / 2 4 6 / 14 5 2; MAX picks A1)

Minimax algorithm

function Minimax-Decision(state) returns an action
   inputs: state, current state in game
   return the a in Actions(state) maximizing Min-Value(Result(a, state))

function Max-Value(state) returns a utility value
   if Terminal-Test(state) then return Utility(state)
   v ← −∞
   for a, s in Successors(state) do v ← Max(v, Min-Value(s))
   return v

function Min-Value(state) returns a utility value
   if Terminal-Test(state) then return Utility(state)
   v ← ∞
   for a, s in Successors(state) do v ← Min(v, Max-Value(s))
   return v

Properties of minimax
Complete?? Only if tree is finite (chess has specific rules for this).
NB a finite strategy can exist even in an infinite tree!
Optimal??
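A straightforward Python transcription of the minimax pseudocode above. The `game` object, with actions, result, terminal_test, and utility methods, is an assumed interface of our own, not anything the slides define.

    import math

    def minimax_decision(state, game):
        # Pick the action whose resulting MIN-node has the highest value.
        return max(game.actions(state),
                   key=lambda a: min_value(game.result(state, a), game))

    def max_value(state, game):
        if game.terminal_test(state):
            return game.utility(state)
        return max((min_value(game.result(state, a), game)
                    for a in game.actions(state)), default=-math.inf)

    def min_value(state, game):
        if game.terminal_test(state):
            return game.utility(state)
        return min((max_value(game.result(state, a), game)
                    for a in game.actions(state)), default=math.inf)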

Properties of minimax (contd.)
Complete?? Yes, if tree is finite (chess has specific rules for this)
Optimal?? Yes, against an optimal opponent. Otherwise??
Time complexity?? O(b^m)
Space complexity?? O(bm) (depth-first exploration)
For chess, b ≈ 35, m ≈ 100 for "reasonable" games
⇒ exact solution completely infeasible
But do we need to explore every path?

α–β pruning example
(figure: the first MIN subtree is evaluated to 3 from leaves 3, 12, 8)

α–β pruning example (contd.)
(four figure slides: once the second MIN node sees the leaf 2, its value is at most 2 ≤ 3 and its remaining leaves are pruned (marked X); the third MIN node's bound falls from 14 to 5 to 2 as its leaves are examined, and the root keeps value 3)

Why is it called α–β?
α is the best value (to max) found so far off the current path
If V is worse than α, max will avoid it ⇒ prune that branch
Define β similarly for min
(figure: a MAX/MIN path with the subtree of value V below)

The α–β algorithm

function Alpha-Beta-Decision(state) returns an action
   return the a in Actions(state) maximizing Min-Value(Result(a, state))

function Max-Value(state, α, β) returns a utility value
   inputs: state, current state in game
      α, the value of the best alternative for max along the path to state
      β, the value of the best alternative for min along the path to state
   if Terminal-Test(state) then return Utility(state)
   v ← −∞
   for a, s in Successors(state) do
      v ← Max(v, Min-Value(s, α, β))
      if v ≥ β then return v
      α ← Max(α, v)
   return v

function Min-Value(state, α, β) returns a utility value
   same as Max-Value but with roles of α, β reversed

Properties of α–β
Pruning does not affect final result
Good move ordering improves effectiveness of pruning
With "perfect ordering," time complexity = O(b^(m/2)) ⇒ doubles solvable depth
A simple example of the value of reasoning about which computations are relevant (a form of metareasoning)
Unfortunately, 35^50 is still impossible!

Resource limits
Standard approach:
• Use Cutoff-Test instead of Terminal-Test
   e.g., depth limit (perhaps add quiescence search)
• Use Eval instead of Utility
   i.e., evaluation function that estimates desirability of position
Suppose we have 100 seconds, explore 10^4 nodes/second
⇒ 10^6 nodes per move ≈ 35^(8/2)
⇒ α–β reaches depth 8 ⇒ pretty good chess program
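Continuing the minimax sketch above, a Python rendering of the α–β version (same assumed `game` interface; also reuses the math import):

    def alphabeta_decision(state, game):
        return max(game.actions(state),
                   key=lambda a: ab_min(game.result(state, a), game,
                                        -math.inf, math.inf))

    def ab_max(state, game, alpha, beta):
        if game.terminal_test(state):
            return game.utility(state)
        v = -math.inf
        for a in game.actions(state):
            v = max(v, ab_min(game.result(state, a), game, alpha, beta))
            if v >= beta:
                return v            # min would never allow this branch: prune
            alpha = max(alpha, v)
        return v

    def ab_min(state, game, alpha, beta):
        if game.terminal_test(state):
            return game.utility(state)
        v = math.inf
        for a in game.actions(state):
            v = min(v, ab_max(game.result(state, a), game, alpha, beta))
            if v <= alpha:
                return v            # max already has a better alternative: prune
            beta = min(beta, v)
        return v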

Evaluation functions
(figure: two chess positions, "White slightly better" and "Black winning")
For chess, typically linear weighted sum of features
   Eval(s) = w1 f1(s) + w2 f2(s) + . . . + wn fn(s)
e.g., w1 = 9 with f1(s) = (number of white queens) − (number of black queens), etc.

Digression: Exact values don't matter
(figure: two trees whose leaf values 1, 2, 20, 400 differ by a monotonic transformation yet yield the same move)
Behaviour is preserved under any monotonic transformation of Eval
Only the order matters: payoff in deterministic games acts as an ordinal utility function

Deterministic games in practice
Checkers: Chinook ended the 40-year reign of human world champion Marion Tinsley in 1994. Used an endgame database defining perfect play for all positions involving 8 or fewer pieces on the board, a total of 443,748,401,247 positions.
Chess: Deep Blue defeated human world champion Garry Kasparov in a six-game match in 1997. Deep Blue searches 200 million positions per second, uses very sophisticated evaluation, and undisclosed methods for extending some lines of search up to 40 ply.
Othello: human champions refuse to compete against computers, who are too good.
Go: human champions refuse to compete against computers, who are too bad. In go, b > 300, so most programs use pattern knowledge bases to suggest plausible moves.

Nondeterministic games: backgammon
(figure: a backgammon board with points numbered 0–25)

Nondeterministic games in general
In nondeterministic games, chance introduced by dice, card-shuffling
Simplified example with coin-flipping:
(figure: a MAX node over CHANCE nodes with 0.5/0.5 branches to MIN nodes; leaves 2 4 7 4 / 6 0 5 −2 give chance values 3 and −1)

Algorithm for nondeterministic games
Expectiminimax gives perfect play
Just like Minimax, except we must also handle chance nodes:
   . . .
   if state is a Max node then
      return the highest ExpectiMinimax-Value of Successors(state)
   if state is a Min node then
      return the lowest ExpectiMinimax-Value of Successors(state)
   if state is a chance node then
      return average of ExpectiMinimax-Value of Successors(state)
   . . .

Nondeterministic games in practice
Dice rolls increase b: 21 possible rolls with 2 dice
Backgammon ≈ 20 legal moves (can be 6,000 with 1-1 roll)
   depth 4 = 20 × (21 × 20)^3 ≈ 1.2 × 10^9
As depth increases, probability of reaching a given node shrinks
⇒ value of lookahead is diminished
α–β pruning is much less effective
TDGammon uses depth-2 search + very good Eval ≈ world-champion level

Digression: Exact values DO matter
(figure: the same tree with leaves 1 2 / 2 4 vs. 1 20 / 20 400 leads to different best moves)
Behaviour is preserved only by positive linear transformation of Eval
Hence Eval should be proportional to the expected payoff
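A compact sketch of expectiminimax. The node_type and outcomes methods, the latter yielding (probability, successor) pairs at chance nodes, are an assumed interface of ours layered on the earlier game object.

    def expectiminimax(state, game):
        if game.terminal_test(state):
            return game.utility(state)
        if game.node_type(state) == "chance":
            # Expected value over dice outcomes, weighted by probability.
            return sum(p * expectiminimax(s, game)
                       for p, s in game.outcomes(state))
        values = [expectiminimax(game.result(state, a), game)
                  for a in game.actions(state)]
        return max(values) if game.node_type(state) == "max" else min(values)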

Games of imperfect information
E.g., card games, where opponent's initial cards are unknown
Typically we can calculate a probability for each possible deal
Seems just like having one big dice roll at the beginning of the game*
Idea: compute the minimax value of each action in each deal, then choose the action with highest expected value over all deals*
Special case: if an action is optimal for all deals, it's optimal.*
GIB, current best bridge program, approximates this idea by
1) generating 100 deals consistent with bidding information
2) picking the action that wins most tricks on average

Example
Four-card bridge/whist/hearts hand, Max to play first
(three figure slides: the same four-card hand evaluated under two different deals and under the merged information state, where naive averaging assigns the value −0.5)

Commonsense example
Road A leads to a small heap of gold pieces
Road B leads to a fork:
   take the left fork and you'll find a mound of jewels;
   take the right fork and you'll be run over by a bus.

Road A leads to a small heap of gold pieces
Road B leads to a fork:
   take the left fork and you'll be run over by a bus;
   take the right fork and you'll find a mound of jewels.

Road A leads to a small heap of gold pieces
Road B leads to a fork:
   guess correctly and you'll find a mound of jewels;
   guess incorrectly and you'll be run over by a bus.

Proper analysis
* Intuition that the value of an action is the average of its values in all actual states is WRONG
With partial observability, value of an action depends on the information state or belief state the agent is in
Can generate and search a tree of information states
Leads to rational behaviors such as
♦ Acting to obtain information
♦ Signalling to one's partner
♦ Acting randomly to minimize information disclosure

e. what they know. satisfiability Tell it what it needs to know ♦ Inference rules and theorem proving Then it can Ask itself what to do—answers should follow from the KB – forward chaining – backward chaining Agents can be viewed at the knowledge level – resolution i.e. not real state Games are to AI as grand prix racing is to automobile design Chapter 6 38 Chapter 7 1 Outline Knowledge bases ♦ Knowledge-based agents Inference engine domainindependent algorithms ♦ Wumpus world Knowledge base domainspecific content ♦ Logic in general—models and entailment Knowledge base = set of sentences in a formal language ♦ Propositional (Boolean) logic Declarative approach to building an agent (or other system): ♦ Equivalence.. data structures in KB and algorithms that manipulate them Chapter 7 2 Chapter 7 3 .. validity. regardless of how implemented Or at the implementation level i. Summary Games are fun to work on! (and dangerous) They illustrate several important points about AI ♦ perfection is unattainable ⇒ must approximate Logical agents ♦ good idea to think about what to think about ♦ uncertainty constrains the assignment of values to states Chapter 7 ♦ optimal decisions depend on information state.

A simple knowledge-based agent

function KB-Agent(percept) returns an action
   static: KB, a knowledge base
      t, a counter, initially 0, indicating time
   Tell(KB, Make-Percept-Sentence(percept, t))
   action ← Ask(KB, Make-Action-Query(t))
   Tell(KB, Make-Action-Sentence(action, t))
   t ← t + 1
   return action

The agent must be able to:
   Represent states, actions, etc.
   Incorporate new percepts
   Update internal representations of the world
   Deduce hidden properties of the world
   Deduce appropriate actions

Wumpus World PEAS description
(figure: a 4×4 cave with pits, the gold, and the wumpus; the agent begins at START in square (1,1))
Performance measure: gold +1000, death −1000, −1 per step, −10 for using the arrow
Environment:
   Squares adjacent to wumpus are smelly
   Squares adjacent to pit are breezy
   Glitter iff gold is in the same square
   Shooting kills wumpus if you are facing it
   Shooting uses up the only arrow
   Grabbing picks up gold if in same square
   Releasing drops the gold in same square
Actuators: Left turn, Right turn, Forward, Grab, Release, Shoot
Sensors: Breeze, Glitter, Smell
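A minimal Python sketch of the KB-Agent loop above. The kb object's tell/ask methods are an assumed interface, and the make_* helpers are placeholders standing in for real sentence constructors in whatever logic the KB uses.

    def make_percept_sentence(percept, t):   # placeholder constructors; a real
        return ("Percept", percept, t)       # agent would build sentences in
    def make_action_query(t):                # the KB's logical language
        return ("Action", t)
    def make_action_sentence(action, t):
        return ("Did", action, t)

    class KBAgent:
        def __init__(self, kb):
            self.kb = kb
            self.t = 0                       # time counter
        def __call__(self, percept):
            self.kb.tell(make_percept_sentence(percept, self.t))
            action = self.kb.ask(make_action_query(self.t))
            self.kb.tell(make_action_sentence(action, self.t))
            self.t += 1
            return action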


Wumpus world characterization
Observable?? No—only local perception
Deterministic?? Yes—outcomes exactly specified
Episodic?? No—sequential at the level of actions
Static?? Yes—Wumpus and Pits do not move
Discrete?? Yes
Single-agent?? Yes—Wumpus is essentially a natural feature

Exploring a wumpus world
(figure: the agent starts at (1,1), which is OK; perceiving nothing there, it marks the adjacent squares (1,2) and (2,1) OK as well)

Exploring a wumpus world (contd.)
(figure slides: moving to (2,1) the agent feels a Breeze, so (2,2) and (3,1) are marked P?; returning to (1,1) and moving up to (1,2) it smells a Stench, so the wumpus W must be in (1,3); since (1,2) is breeze-free, (2,2) holds no pit, leaving the pit P in (3,1); (2,2) becomes OK and is explored next)

Exploring a wumpus world (concluded)
(figure: the agent reaches (2,3), perceives Breeze, Glitter, and Stench, and grabs the gold)

Other tight spots
(figure: Breeze felt in both (1,2) and (2,1))
Breeze in (1,2) and (2,1)
⇒ no safe actions
Assuming pits uniformly distributed, (2,2) has pit w/ prob 0.86, vs. 0.31
(figure: Smell perceived in (1,1))
Smell in (1,1)
⇒ cannot move
Can use a strategy of coercion:
   shoot straight ahead
   wumpus was there ⇒ dead ⇒ safe
   wumpus wasn't there ⇒ safe

Logic in general
Logics are formal languages for representing information such that conclusions can be drawn
Syntax defines the sentences in the language
Semantics define the "meaning" of sentences;
   i.e., define truth of a sentence in a world
E.g., the language of arithmetic
   x + 2 ≥ y is a sentence; x2 + y > is not a sentence
   x + 2 ≥ y is true iff the number x + 2 is no less than the number y
   x + 2 ≥ y is true in a world where x = 7, y = 1
   x + 2 ≥ y is false in a world where x = 0, y = 6

Entailment
Entailment means that one thing follows from another:
   KB |= α
Knowledge base KB entails sentence α if and only if α is true in all worlds where KB is true
E.g., the KB containing "the Giants won" and "the Reds won" entails "Either the Giants won or the Reds won"
E.g., x + y = 4 entails 4 = x + y
Entailment is a relationship between sentences (i.e., syntax) that is based on semantics
Note: brains process syntax (of some sort)

Models
Logicians typically think in terms of models, which are formally structured worlds with respect to which truth can be evaluated
We say m is a model of a sentence α if α is true in m
M(α) is the set of all models of α
Then KB |= α if and only if M(KB) ⊆ M(α)
(figure: the set M(KB) nested inside M(α))
E.g., KB = Giants won and Reds won, α = Giants won

Entailment in the wumpus world
Situation after detecting nothing in [1,1], moving right, breeze in [2,1]
Consider possible models for the ? squares assuming only pits
3 Boolean choices ⇒ 8 possible models

Wumpus models
(figure slides: the 8 possible pit configurations for the ? squares; KB = wumpus-world rules + observations is consistent with 3 of them)

Wumpus models (contd.)
KB = wumpus-world rules + observations
α1 = "[1,2] is safe": KB |= α1, proved by model checking
   (figure: every model of KB lies inside M(α1))
α2 = "[2,2] is safe": KB ⊭ α2
   (figure: some model of KB lies outside M(α2))

Inference
KB ⊢i α = sentence α can be derived from KB by procedure i
Consequences of KB are a haystack; α is a needle.
Entailment = needle in haystack; inference = finding it
Soundness: i is sound if whenever KB ⊢i α, it is also true that KB |= α
Completeness: i is complete if whenever KB |= α, it is also true that KB ⊢i α
Preview: we will define a logic (first-order logic) which is expressive enough to say almost anything of interest, and for which there exists a sound and complete inference procedure.
That is, the procedure will answer any question whose answer follows from what is known by the KB.

Propositional logic: Syntax
Propositional logic is the simplest logic—illustrates basic ideas
The proposition symbols P1, P2 etc are sentences
If S is a sentence, ¬S is a sentence (negation)
If S1 and S2 are sentences, S1 ∧ S2 is a sentence (conjunction)
If S1 and S2 are sentences, S1 ∨ S2 is a sentence (disjunction)
If S1 and S2 are sentences, S1 ⇒ S2 is a sentence (implication)
If S1 and S2 are sentences, S1 ⇔ S2 is a sentence (biconditional)

Propositional logic: Semantics
Each model specifies true/false for each proposition symbol
E.g., P1,2 = true, P2,2 = true, P3,1 = false
(with these symbols, 8 possible models, can be enumerated automatically)
Rules for evaluating truth with respect to a model m:
   ¬S is true iff S is false
   S1 ∧ S2 is true iff S1 is true and S2 is true
   S1 ∨ S2 is true iff S1 is true or S2 is true
   S1 ⇒ S2 is true iff S1 is false or S2 is true
      i.e., is false iff S1 is true and S2 is false
   S1 ⇔ S2 is true iff S1 ⇒ S2 is true and S2 ⇒ S1 is true
Simple recursive process evaluates an arbitrary sentence, e.g.,
   ¬P1,2 ∧ (P2,2 ∨ P3,1) = true ∧ (false ∨ true) = true ∧ true = true

Truth tables for connectives
P      Q      ¬P     P ∧ Q   P ∨ Q   P ⇒ Q   P ⇔ Q
false  false  true   false   false   true    true
false  true   true   false   true    true    false
true   false  false  false   true    false   false
true   true   false  true    true    true    true

Wumpus world sentences
Let Pi,j be true if there is a pit in [i, j].
Let Bi,j be true if there is a breeze in [i, j].
   ¬P1,1
   ¬B1,1
   B2,1
"Pits cause breezes in adjacent squares"

Wumpus world sentences (contd.)
"A square is breezy if and only if there is an adjacent pit":
   B1,1 ⇔ (P1,2 ∨ P2,1)
   B2,1 ⇔ (P1,1 ∨ P2,2 ∨ P3,1)

Truth tables for inference
(table: all assignments to P1,1 . . . P3,1, B1,1, B2,1 with columns R1–R5 for the KB sentences)
Enumerate rows (different assignments to symbols);
if KB is true in row, check that α is too

Inference by enumeration
Depth-first enumeration of all models is sound and complete

function TT-Entails?(KB, α) returns true or false
   inputs: KB, the knowledge base, a sentence in propositional logic
      α, the query, a sentence in propositional logic
   symbols ← a list of the proposition symbols in KB and α
   return TT-Check-All(KB, α, symbols, [ ])

function TT-Check-All(KB, α, symbols, model) returns true or false
   if Empty?(symbols) then
      if PL-True?(KB, model) then return PL-True?(α, model)
      else return true
   else do
      P ← First(symbols); rest ← Rest(symbols)
      return TT-Check-All(KB, α, rest, Extend(P, true, model)) and
             TT-Check-All(KB, α, rest, Extend(P, false, model))

O(2^n) for n symbols; problem is co-NP-complete

Logical equivalence
Two sentences are logically equivalent iff true in same models:
   α ≡ β if and only if α |= β and β |= α
(α ∧ β) ≡ (β ∧ α)   commutativity of ∧
(α ∨ β) ≡ (β ∨ α)   commutativity of ∨
((α ∧ β) ∧ γ) ≡ (α ∧ (β ∧ γ))   associativity of ∧
((α ∨ β) ∨ γ) ≡ (α ∨ (β ∨ γ))   associativity of ∨
¬(¬α) ≡ α   double-negation elimination
(α ⇒ β) ≡ (¬β ⇒ ¬α)   contraposition
(α ⇒ β) ≡ (¬α ∨ β)   implication elimination
(α ⇔ β) ≡ ((α ⇒ β) ∧ (β ⇒ α))   biconditional elimination
¬(α ∧ β) ≡ (¬α ∨ ¬β)   De Morgan
¬(α ∨ β) ≡ (¬α ∧ ¬β)   De Morgan
(α ∧ (β ∨ γ)) ≡ ((α ∧ β) ∨ (α ∧ γ))   distributivity of ∧ over ∨
(α ∨ (β ∧ γ)) ≡ ((α ∨ β) ∧ (α ∨ γ))   distributivity of ∨ over ∧
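An executable sketch of the TT-Entails? procedure above. Representing sentences as Python predicates over a model dict is our own simplification, avoiding a parser.

    from itertools import product

    def tt_entails(kb, alpha, symbols):
        # Model checking by enumeration: kb and alpha map a model
        # (dict symbol -> bool) to bool. Sound and complete, O(2^n).
        for values in product([True, False], repeat=len(symbols)):
            model = dict(zip(symbols, values))
            if kb(model) and not alpha(model):
                return False   # a model of KB in which alpha is false
        return True

    # Example: P and (P => Q) entails Q.
    kb = lambda m: m["P"] and ((not m["P"]) or m["Q"])
    print(tt_entails(kb, lambda m: m["Q"], ["P", "Q"]))  # True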

Validity and satisfiability
A sentence is valid if it is true in all models,
   e.g., True, A ∨ ¬A, A ⇒ A, (A ∧ (A ⇒ B)) ⇒ B
Validity is connected to inference via the Deduction Theorem:
   KB |= α if and only if (KB ⇒ α) is valid
A sentence is satisfiable if it is true in some model, e.g., A ∨ B, C
A sentence is unsatisfiable if it is true in no models, e.g., A ∧ ¬A
Satisfiability is connected to inference via the following:
   KB |= α if and only if (KB ∧ ¬α) is unsatisfiable
i.e., prove α by reductio ad absurdum

Proof methods
Proof methods divide into (roughly) two kinds:
Application of inference rules
   – Legitimate (sound) generation of new sentences from old
   – Proof = a sequence of inference rule applications;
     can use inference rules as operators in a standard search algorithm
   – Typically require translation of sentences into a normal form
Model checking
   truth table enumeration (always exponential in n)
   improved backtracking, e.g., Davis–Putnam–Logemann–Loveland
   heuristic search in model space (sound but incomplete),
      e.g., min-conflicts-like hill-climbing algorithms

Forward and backward chaining
Horn Form (restricted): KB = conjunction of Horn clauses
Horn clause =
   ♦ proposition symbol; or
   ♦ (conjunction of symbols) ⇒ symbol
E.g., C ∧ (B ⇒ A) ∧ (C ∧ D ⇒ B)
Modus Ponens (for Horn Form): complete for Horn KBs
   α1, . . . , αn,   α1 ∧ · · · ∧ αn ⇒ β
   -------------------------------------
                     β
Can be used with forward chaining or backward chaining.
These algorithms are very natural and run in linear time

Forward chaining
Idea: fire any rule whose premises are satisfied in the KB,
add its conclusion to the KB, until query is found
(figure: example Horn KB drawn as an AND–OR graph:
   P ⇒ Q;  L ∧ M ⇒ P;  B ∧ L ⇒ M;  A ∧ P ⇒ L;  A ∧ B ⇒ L;  A;  B)

Forward chaining algorithm

function PL-FC-Entails?(KB, q) returns true or false
   inputs: KB, the knowledge base, a set of propositional Horn clauses
      q, the query, a proposition symbol
   local variables: count, a table, indexed by clause, initially the number of premises
      inferred, a table, indexed by symbol, each entry initially false
      agenda, a list of symbols, initially the symbols known in KB
   while agenda is not empty do
      p ← Pop(agenda)
      unless inferred[p] do
         inferred[p] ← true
         for each Horn clause c in whose premise p appears do
            decrement count[c]
            if count[c] = 0 then do
               if Head[c] = q then return true
               Push(Head[c], agenda)
   return false

Forward chaining example
(figure slides: the AND–OR graph for the example KB, with the premise counts on the clauses counting down as A and B are processed and L is inferred)
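A compact Python version of PL-FC-Entails?, using (premises, conclusion) pairs for Horn clauses; this encoding is our own choice.

    def pl_fc_entails(clauses, facts, q):
        # clauses: list of (set-of-premises, conclusion); facts: known symbols.
        count = {i: len(prem) for i, (prem, _) in enumerate(clauses)}
        inferred = set()
        agenda = list(facts)
        while agenda:
            p = agenda.pop()
            if p == q:
                return True
            if p not in inferred:
                inferred.add(p)
                for i, (prem, concl) in enumerate(clauses):
                    if p in prem:
                        count[i] -= 1
                        if count[i] == 0:
                            agenda.append(concl)
        return False

    # The slides' example KB: it does entail Q.
    clauses = [({"P"}, "Q"), ({"L", "M"}, "P"), ({"B", "L"}, "M"),
               ({"A", "P"}, "L"), ({"A", "B"}, "L")]
    print(pl_fc_entails(clauses, ["A", "B"], "Q"))  # True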

Forward chaining example (contd.)
(four figure slides: M, then P, and finally Q are inferred as the remaining premise counts reach zero)

Proof of completeness
FC derives every atomic sentence that is entailed by KB
1. FC reaches a fixed point where no new atomic sentences are derived
2. Consider the final state as a model m, assigning true/false to symbols
3. Every clause in the original KB is true in m
   Proof: Suppose a clause a1 ∧ . . . ∧ ak ⇒ b is false in m
   Then a1 ∧ . . . ∧ ak is true in m and b is false in m
   Therefore the algorithm has not reached a fixed point!
4. Hence m is a model of KB
5. If KB |= q, q is true in every model of KB, including m
General idea: construct any model of KB by sound inference, check α

Backward chaining
Idea: work backwards from the query q:
   to prove q by BC,
      check if q is known already, or
      prove by BC all premises of some rule concluding q
Avoid loops: check if new subgoal is already on the goal stack
Avoid repeated work: check if new subgoal
   1) has already been proved true, or
   2) has already failed

Backward chaining example
(figure: the first slide of the backward-chaining trace on the same AND–OR graph)

Backward chaining example (contd.)
(figure slides: the goal stack grows from Q to P and then to L and M; sub-goals are checked against the known facts A and B as the proof tree for Q is filled in)

Backward chaining example (concluded)
(figure slides: the completed proof tree for Q)

Forward vs. backward chaining
FC is data-driven, cf. automatic, unconscious processing,
   e.g., object recognition, routine decisions
May do lots of work that is irrelevant to the goal
BC is goal-driven, appropriate for problem-solving,
   e.g., Where are my keys? How do I get into a PhD program?
Complexity of BC can be much less than linear in size of KB

Resolution
Conjunctive Normal Form (CNF—universal):
   conjunction of disjunctions of literals (clauses)
E.g., (A ∨ ¬B) ∧ (B ∨ ¬C ∨ ¬D)
Resolution inference rule (for CNF): complete for propositional logic
   ℓ1 ∨ · · · ∨ ℓk,   m1 ∨ · · · ∨ mn
   --------------------------------------------------------------------------
   ℓ1 ∨ · · · ∨ ℓi−1 ∨ ℓi+1 ∨ · · · ∨ ℓk ∨ m1 ∨ · · · ∨ mj−1 ∨ mj+1 ∨ · · · ∨ mn
where ℓi and mj are complementary literals. E.g.,
   P1,3 ∨ P2,2,   ¬P2,2
   --------------------
          P1,3
(figure: the wumpus-world situation motivating this inference)
Resolution is sound and complete for propositional logic

Conversion to CNF
B1,1 ⇔ (P1,2 ∨ P2,1)
1. Eliminate ⇔, replacing α ⇔ β with (α ⇒ β) ∧ (β ⇒ α).
   (B1,1 ⇒ (P1,2 ∨ P2,1)) ∧ ((P1,2 ∨ P2,1) ⇒ B1,1)
2. Eliminate ⇒, replacing α ⇒ β with ¬α ∨ β.
   (¬B1,1 ∨ P1,2 ∨ P2,1) ∧ (¬(P1,2 ∨ P2,1) ∨ B1,1)
3. Move ¬ inwards using de Morgan's rules and double-negation:
   (¬B1,1 ∨ P1,2 ∨ P2,1) ∧ ((¬P1,2 ∧ ¬P2,1) ∨ B1,1)
4. Apply distributivity law (∨ over ∧) and flatten:
   (¬B1,1 ∨ P1,2 ∨ P2,1) ∧ (¬P1,2 ∨ B1,1) ∧ (¬P2,1 ∨ B1,1)

Resolution algorithm
Proof by contradiction, i.e., show KB ∧ ¬α unsatisfiable

function PL-Resolution(KB, α) returns true or false
   inputs: KB, the knowledge base, a sentence in propositional logic
      α, the query, a sentence in propositional logic
   clauses ← the set of clauses in the CNF representation of KB ∧ ¬α
   new ← { }
   loop do
      for each Ci, Cj in clauses do
         resolvents ← PL-Resolve(Ci, Cj)
         if resolvents contains the empty clause then return true
         new ← new ∪ resolvents
      if new ⊆ clauses then return false
      clauses ← clauses ∪ new

Resolution example
KB = (B1,1 ⇔ (P1,2 ∨ P2,1)) ∧ ¬B1,1   α = ¬P1,2
(figure: the resolution proof tree deriving the empty clause)

Summary
Logical agents apply inference to a knowledge base to derive new information and make decisions
Basic concepts of logic:
   – syntax: formal structure of sentences
   – semantics: truth of sentences wrt models
   – entailment: necessary truth of one sentence given another
   – inference: deriving sentences from other sentences
   – soundness: derivations produce only entailed sentences
   – completeness: derivations can produce all entailed sentences
Wumpus world requires the ability to represent partial and negated information, reason by cases, etc.
Forward, backward chaining are linear-time, complete for Horn clauses
Resolution is complete for propositional logic
Propositional logic lacks expressive power
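Before moving on, a small executable rendering of PL-Resolution from above; representing clauses as frozensets of string literals, with "~" marking negation, is our own encoding.

    from itertools import combinations

    def pl_resolve(ci, cj):
        # All resolvents of two clauses on complementary literals.
        resolvents = []
        for lit in ci:
            neg = lit[1:] if lit.startswith("~") else "~" + lit
            if neg in cj:
                resolvents.append((ci - {lit}) | (cj - {neg}))
        return resolvents

    def pl_resolution(clauses):
        # True iff the clause set is unsatisfiable (empty clause derivable).
        clauses = {frozenset(c) for c in clauses}
        while True:
            new = set()
            for ci, cj in combinations(clauses, 2):
                for r in pl_resolve(ci, cj):
                    if not r:
                        return True          # derived the empty clause
                    new.add(frozenset(r))
            if new <= clauses:
                return False
            clauses |= new

    # The slides' example, KB ∧ ¬α with α = ¬P1,2: unsatisfiable, so KB |= α.
    kb_and_not_alpha = [{"~B11", "P12", "P21"}, {"~P12", "B11"},
                        {"~P21", "B11"}, {"~B11"}, {"P12"}]
    print(pl_resolution(kb_and_not_alpha))  # True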

First-order logic

Chapter 8

Outline
♦ Why FOL?
♦ Syntax and semantics of FOL
♦ Fun with sentences
♦ Wumpus world in FOL

Pros and cons of propositional logic
Propositional logic is declarative: pieces of syntax correspond to facts
Propositional logic allows partial/disjunctive/negated information (unlike most data structures and databases)
Propositional logic is compositional: meaning of B1,1 ∧ P1,2 is derived from meaning of B1,1 and of P1,2
Meaning in propositional logic is context-independent (unlike natural language, where meaning depends on context)
Propositional logic has very limited expressive power (unlike natural language), e.g., cannot say "pits cause breezes in adjacent squares" except by writing one sentence for each square

First-order logic
Whereas propositional logic assumes world contains facts, first-order logic (like natural language) assumes the world contains
• Objects: people, houses, numbers, theories, Ronald McDonald, colors, baseball games, wars, centuries . . .
• Relations: red, round, bogus, prime, multistoried . . ., brother of, bigger than, inside, part of, has color, occurred after, owns, comes between, . . .
• Functions: father of, best friend, third inning of, one more than, end of . . .

Logics in general
Language              Ontological commitment         Epistemological commitment
Propositional logic   facts                          true/false/unknown
First-order logic     facts, objects, relations      true/false/unknown
Temporal logic        facts, objects, relations,     true/false/unknown
                      times
Probability theory    facts                          degree of belief
Fuzzy logic           facts + degree of truth        known interval value

Syntax of FOL: Basic elements
Constants     KingJohn, 2, UCB, . . .
Predicates    Brother, >, . . .
Functions     Sqrt, LeftLegOf, . . .
Variables     x, y, a, b, . . .
Connectives   ∧ ∨ ¬ ⇒ ⇔
Equality      =
Quantifiers   ∀ ∃

Atomic sentences
Atomic sentence = predicate(term1, . . . , termn) or term1 = term2
Term = function(term1, . . . , termn) or constant or variable
E.g., Brother(KingJohn, RichardTheLionheart)
   >(Length(LeftLegOf(Richard)), Length(LeftLegOf(KingJohn)))

Complex sentences
Complex sentences are made from atomic sentences using connectives
   ¬S, S1 ∧ S2, S1 ∨ S2, S1 ⇒ S2, S1 ⇔ S2
E.g., Sibling(KingJohn, Richard) ⇒ Sibling(Richard, KingJohn)
   >(1, 2) ∨ ≤(1, 2)
   >(1, 2) ∧ ¬>(1, 2)

Truth in first-order logic
Sentences are true with respect to a model and an interpretation
Model contains ≥ 1 objects (domain elements) and relations among them
Interpretation specifies referents for
   constant symbols → objects
   predicate symbols → relations
   function symbols → functional relations
An atomic sentence predicate(term1, . . . , termn) is true
iff the objects referred to by term1, . . . , termn
are in the relation referred to by predicate

Models for FOL: Example
(figure: a model with two persons, one a king wearing a crown on his head, each with a left leg; R and J are related by brotherhood)

Truth example
Consider the interpretation in which
   Richard → Richard the Lionheart
   John → the evil King John
   Brother → the brotherhood relation
Under this interpretation, Brother(Richard, John) is true
just in case Richard the Lionheart and the evil King John
are in the brotherhood relation in the model

Models for FOL: Lots!
Entailment in propositional logic can be computed by enumerating models
We can enumerate the FOL models for a given KB vocabulary:
   For each number of domain elements n from 1 to ∞
      For each k-ary predicate Pk in the vocabulary
         For each possible k-ary relation on n objects
            For each constant symbol C in the vocabulary
               For each choice of referent for C from n objects . . .
Computing entailment by enumerating FOL models is not easy!

Universal quantification
∀ ⟨variables⟩ ⟨sentence⟩
Everyone at Berkeley is smart:
   ∀x At(x, Berkeley) ⇒ Smart(x)
∀x P is true in a model m iff P is true with x being each possible object in the model
Roughly speaking, equivalent to the conjunction of instantiations of P
   (At(KingJohn, Berkeley) ⇒ Smart(KingJohn))
   ∧ (At(Richard, Berkeley) ⇒ Smart(Richard))
   ∧ (At(Berkeley, Berkeley) ⇒ Smart(Berkeley))
   ∧ . . .

A common mistake to avoid
Typically, ⇒ is the main connective with ∀
Common mistake: using ∧ as the main connective with ∀:
   ∀x At(x, Berkeley) ∧ Smart(x)
means "Everyone is at Berkeley and everyone is smart"

Existential quantification
∃ ⟨variables⟩ ⟨sentence⟩
Someone at Stanford is smart:
   ∃x At(x, Stanford) ∧ Smart(x)
∃x P is true in a model m iff P is true with x being some possible object in the model
Roughly speaking, equivalent to the disjunction of instantiations of P
   (At(KingJohn, Stanford) ∧ Smart(KingJohn))
   ∨ (At(Richard, Stanford) ∧ Smart(Richard))
   ∨ (At(Stanford, Stanford) ∧ Smart(Stanford))
   ∨ . . .

Another common mistake to avoid
Typically, ∧ is the main connective with ∃
Common mistake: using ⇒ as the main connective with ∃:
   ∃x At(x, Stanford) ⇒ Smart(x)
is true if there is anyone who is not at Stanford!

Properties of quantifiers
∀x ∀y is the same as ∀y ∀x (why??)
∃x ∃y is the same as ∃y ∃x (why??)
∃x ∀y is not the same as ∀y ∃x
   ∃x ∀y Loves(x, y): "There is a person who loves everyone in the world"
   ∀y ∃x Loves(x, y): "Everyone in the world is loved by at least one person"
Quantifier duality: each can be expressed using the other
   ∀x Likes(x, IceCream)   ≡   ¬∃x ¬Likes(x, IceCream)
   ∃x Likes(x, Broccoli)   ≡   ¬∀x ¬Likes(x, Broccoli)

Fun with sentences
Brothers are siblings:
   ∀x, y Brother(x, y) ⇒ Sibling(x, y)
"Sibling" is symmetric:
   ∀x, y Sibling(x, y) ⇔ Sibling(y, x)
One's mother is one's female parent:
   ∀x, y Mother(x, y) ⇔ (Female(x) ∧ Parent(x, y))

Fun with sentences (contd.)
A first cousin is a child of a parent's sibling:
   ∀x, y FirstCousin(x, y) ⇔ ∃p, ps Parent(p, x) ∧ Sibling(ps, p) ∧ Parent(ps, y)

Equality
term1 = term2 is true under a given interpretation
if and only if term1 and term2 refer to the same object
E.g., 1 = 2 and ∀x ×(Sqrt(x), Sqrt(x)) = x are satisfiable; 2 = 2 is valid
E.g., definition of (full) Sibling in terms of Parent:
   ∀x, y Sibling(x, y) ⇔ [¬(x = y) ∧ ∃m, f ¬(m = f) ∧
      Parent(m, x) ∧ Parent(f, x) ∧ Parent(m, y) ∧ Parent(f, y)]

Interacting with FOL KBs
Suppose a wumpus-world agent is using an FOL KB and perceives a smell and a breeze (but no glitter) at t = 5:
   Tell(KB, Percept([Smell, Breeze, None], 5))
   Ask(KB, ∃a Action(a, 5))
I.e., does KB entail any particular actions at t = 5?
Answer: Yes, {a/Shoot} ← substitution (binding list)
Given a sentence S and a substitution σ,
Sσ denotes the result of plugging σ into S, e.g.,
   S = Smarter(x, y)
   σ = {x/Hillary, y/Bill}
   Sσ = Smarter(Hillary, Bill)
Ask(KB, S) returns some/all σ such that KB |= Sσ

Knowledge base for the wumpus world
"Perception"
   ∀b, g, t Percept([Smell, b, g], t) ⇒ Smelt(t)
   ∀s, b, t Percept([s, b, Glitter], t) ⇒ AtGold(t)
Reflex: ∀t AtGold(t) ⇒ Action(Grab, t)
Reflex with internal state: do we have the gold already?
   ∀t AtGold(t) ∧ ¬Holding(Gold, t) ⇒ Action(Grab, t)
Holding(Gold, t) cannot be observed
⇒ keeping track of change is essential

Deducing hidden properties
Properties of locations:
   ∀x, t At(Agent, x, t) ∧ Smelt(t) ⇒ Smelly(x)
   ∀x, t At(Agent, x, t) ∧ Breeze(t) ⇒ Breezy(x)
Squares are breezy near a pit:
Diagnostic rule—infer cause from effect
   ∀y Breezy(y) ⇒ ∃x Pit(x) ∧ Adjacent(x, y)
Causal rule—infer effect from cause
   ∀x, y Pit(x) ∧ Adjacent(x, y) ⇒ Breezy(y)
Neither of these is complete—e.g., the causal rule doesn't say whether squares far away from pits can be breezy
Definition for the Breezy predicate:
   ∀y Breezy(y) ⇔ [∃x Pit(x) ∧ Adjacent(x, y)]

Keeping track of change
Facts hold in situations, rather than eternally
E.g., Holding(Gold, Now) rather than just Holding(Gold)
Situation calculus is one way to represent change in FOL:
   Adds a situation argument to each non-eternal predicate
   E.g., Now in Holding(Gold, Now) denotes a situation
Situations are connected by the Result function
   Result(a, s) is the situation that results from doing a in s
(figure: the wumpus cave in situation S0 and, after Forward, in situation S1)

Describing actions I
"Effect" axiom—describe changes due to action
   ∀s AtGold(s) ⇒ Holding(Gold, Result(Grab, s))
"Frame" axiom—describe non-changes due to action
   ∀s HaveArrow(s) ⇒ HaveArrow(Result(Grab, s))
Frame problem: find an elegant way to handle non-change
   (a) representation—avoid frame axioms
   (b) inference—avoid repeated "copy-overs" to keep track of state
Qualification problem: true descriptions of real actions require endless caveats—what if gold is slippery or nailed down or . . .
Ramification problem: real actions have many secondary consequences—what about the dust on the gold, wear and tear on gloves, . . .

Describing actions II
Successor-state axioms solve the representational frame problem
Each axiom is "about" a predicate (not an action per se):
   P true afterwards ⇔ [an action made P true
      ∨ P true already and no action made P false]
E.g.,
   ∀a, s Holding(Gold, Result(a, s)) ⇔
      [(a = Grab ∧ AtGold(s))
      ∨ (Holding(Gold, s) ∧ a ≠ Release)]

Making plans
Initial condition in KB:
   At(Agent, [1, 1], S0)
   At(Gold, [1, 2], S0)
Query: Ask(KB, ∃s Holding(Gold, s))
i.e., in what situation will I be holding the gold?
Answer: {s/Result(Grab, Result(Forward, S0))}
i.e., go forward and then grab the gold
This assumes that the agent is interested in plans starting at S0 and that S0 is the only situation described in the KB

Making plans: A better way
Represent plans as action sequences [a1, a2, . . . , an]
PlanResult(p, s) is the result of executing p in s
Then the query Ask(KB, ∃p Holding(Gold, PlanResult(p, S0)))
has the solution {p/[Forward, Grab]}
Definition of PlanResult in terms of Result:
   ∀s PlanResult([ ], s) = s
   ∀a, p, s PlanResult([a|p], s) = PlanResult(p, Result(a, s))
Planning systems are special-purpose reasoners designed to do this type of inference more efficiently than a general-purpose reasoner

Summary
First-order logic:
   – objects and relations are semantic primitives
   – syntax: constants, functions, predicates, equality, quantifiers
Increased expressive power: sufficient to define wumpus world
Situation calculus:
   – conventions for describing actions and change in FOL
   – can formulate planning as inference on a situation calculus KB

Inference in first-order logic

Chapter 9

Outline
♦ Reducing first-order inference to propositional inference
♦ Unification
♦ Generalized Modus Ponens
♦ Forward and backward chaining
♦ Logic programming
♦ Resolution

A brief history of reasoning

450 B.C.  Stoics        propositional logic, inference (maybe)
322 B.C.  Aristotle     "syllogisms" (inference rules), quantifiers
1565      Cardano       probability theory (propositional logic + uncertainty)
1847      Boole         propositional logic (again)
1879      Frege         first-order logic
1922      Wittgenstein  proof by truth tables
1930      Gödel         ∃ complete algorithm for FOL
1930      Herbrand      complete algorithm for FOL (reduce to propositional)
1931      Gödel         ¬∃ complete algorithm for arithmetic
1960      Davis/Putnam  "practical" algorithm for propositional logic
1965      Robinson      "practical" algorithm for FOL—resolution

Universal instantiation (UI)

Every instantiation of a universally quantified sentence is entailed by it:

    ∀v α  entails  Subst({v/g}, α)

for any variable v and ground term g

E.g., ∀ x King(x) ∧ Greedy(x) ⇒ Evil(x) yields
    King(John) ∧ Greedy(John) ⇒ Evil(John)
    King(Richard) ∧ Greedy(Richard) ⇒ Evil(Richard)
    King(Father(John)) ∧ Greedy(Father(John)) ⇒ Evil(Father(John))

Existential instantiation (EI)

For any sentence α, variable v, and constant symbol k
that does not appear elsewhere in the knowledge base:

    ∃v α  entails  Subst({v/k}, α)

E.g., ∃ x Crown(x) ∧ OnHead(x, John) yields
    Crown(C1) ∧ OnHead(C1, John)
provided C1 is a new constant symbol, called a Skolem constant

Another example: from ∃ x d(x^y)/dy = x^y we obtain d(e^y)/dy = e^y
provided e is a new constant symbol

Existential instantiation contd.

UI can be applied several times to add new sentences;
the new KB is logically equivalent to the old

EI can be applied once to replace the existential sentence;
the new KB is not equivalent to the old,
but is satisfiable iff the old KB was satisfiable

Reduction to propositional inference

Suppose the KB contains just the following:
    ∀ x King(x) ∧ Greedy(x) ⇒ Evil(x)
    King(John)
    Greedy(John)
    Brother(Richard, John)

Instantiating the universal sentence in all possible ways, we have
    King(John) ∧ Greedy(John) ⇒ Evil(John)
    King(Richard) ∧ Greedy(Richard) ⇒ Evil(Richard)
    King(John)
    Greedy(John)
    Brother(Richard, John)

The new KB is propositionalized: proposition symbols are
    King(John), Greedy(John), Evil(John), King(Richard), etc.

Reduction contd.

Claim: a ground sentence is entailed by the new KB iff entailed by the original KB

Claim: every FOL KB can be propositionalized so as to preserve entailment

Idea: propositionalize KB and query, apply resolution, return result

Problem: with function symbols, there are infinitely many ground terms,
    e.g., Father(Father(Father(John)))

Theorem: Herbrand (1930). If a sentence α is entailed by an FOL KB,
it is entailed by a finite subset of the propositional KB

Idea: For n = 0 to ∞ do
    create a propositional KB by instantiating with depth-n terms
    see if α is entailed by this KB

Problem: works if α is entailed, loops if α is not entailed

Theorem: Turing (1936), Church (1936): entailment in FOL is semidecidable
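Herbrand's theorem suggests the semidecision procedure above: enumerate ground terms by increasing depth and propositionalize. A minimal sketch of the term-enumeration step; the helper name and the representation of compound terms as tuples are my own assumptions.

from itertools import product

def ground_terms(constants, functions, depth):
    """All ground terms of nesting depth <= depth.

    `functions` maps a function symbol to its arity,
    e.g. {"Father": 1} with constants ["John", "Richard"].
    """
    terms = set(constants)
    for _ in range(depth):
        new = set(terms)
        for f, arity in functions.items():
            for args in product(terms, repeat=arity):
                new.add((f,) + args)
        terms = new
    return terms

# Depth 0: John, Richard; depth 1 adds Father(John), Father(Richard); etc.
print(sorted(ground_terms(["John", "Richard"], {"Father": 1}, 2), key=str))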

Problems with propositionalization

Propositionalization seems to generate lots of irrelevant sentences.

E.g., from
    ∀ x King(x) ∧ Greedy(x) ⇒ Evil(x)
    King(John)
    ∀ y Greedy(y)
    Brother(Richard, John)
it seems obvious that Evil(John), but propositionalization produces lots of
facts such as Greedy(Richard) that are irrelevant

With p k-ary predicates and n constants, there are p · n^k instantiations

With function symbols, it gets much much worse!

Unification

We can get the inference immediately if we can find a substitution θ
such that King(x) and Greedy(x) match King(John) and Greedy(y)

θ = {x/John, y/John} works

Unify(α, β) = θ if αθ = βθ

    p              | q                   | θ
    Knows(John, x) | Knows(John, Jane)   | {x/Jane}
    Knows(John, x) | Knows(y, OJ)        | {x/OJ, y/John}
    Knows(John, x) | Knows(y, Mother(y)) | {y/John, x/Mother(John)}
    Knows(John, x) | Knows(x, OJ)        | fail

Standardizing apart eliminates overlap of variables, e.g., Knows(z17, OJ)
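A compact unifier in the spirit of the table above. This is a minimal sketch: the term representation (variables as lowercase strings, compound terms as tuples) and helper names are my own, and the occurs-check is omitted, as in many textbook presentations.

def is_var(t):
    # Convention: variables are lowercase strings; constants are capitalized;
    # compound terms are tuples like ("Knows", "John", "x").
    return isinstance(t, str) and t[:1].islower()

def unify(x, y, theta={}):
    """Return a substitution theta with x·theta == y·theta, or None (fail)."""
    if theta is None:
        return None
    if x == y:
        return theta
    if is_var(x):
        return unify_var(x, y, theta)
    if is_var(y):
        return unify_var(y, x, theta)
    if isinstance(x, tuple) and isinstance(y, tuple) and len(x) == len(y):
        for xi, yi in zip(x, y):         # unify argument lists pairwise
            theta = unify(xi, yi, theta)
        return theta
    return None

def unify_var(var, t, theta):
    if var in theta:
        return unify(theta[var], t, theta)
    if is_var(t) and t in theta:
        return unify(var, theta[t], theta)
    return {**theta, var: t}

print(unify(("Knows", "John", "x"), ("Knows", "y", ("Mother", "y"))))
# {'y': 'John', 'x': ('Mother', 'y')} — i.e., {y/John, x/Mother(John)}
# after chasing the binding chain
print(unify(("Knows", "John", "x"), ("Knows", "x", "OJ")))   # None: fail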

Generalized Modus Ponens (GMP)

    p1′, p2′, ..., pn′, (p1 ∧ p2 ∧ ... ∧ pn ⇒ q)
    ─────────────────────────────────────────────
                        qθ

provided that pi′θ = piθ for all i

    p1′ is King(John)           p1 is King(x)
    p2′ is Greedy(y)            p2 is Greedy(x)
    θ is {x/John, y/John}       q is Evil(x)
    qθ is Evil(John)

GMP used with KB of definite clauses (exactly one positive literal)
All variables assumed universally quantified

Soundness of GMP

Need to show that
    p1′, ..., pn′, (p1 ∧ ... ∧ pn ⇒ q) |= qθ
where pi′θ = piθ for all i

Lemma: For any definite clause p, we have p |= pθ by UI

1. (p1 ∧ ... ∧ pn ⇒ q) |= (p1 ∧ ... ∧ pn ⇒ q)θ = (p1θ ∧ ... ∧ pnθ ⇒ qθ)
2. p1′, ..., pn′ |= p1′ ∧ ... ∧ pn′ |= p1′θ ∧ ... ∧ pn′θ
3. From 1 and 2, qθ follows by ordinary Modus Ponens

Example knowledge base

The law says that it is a crime for an American to sell weapons to hostile
nations. The country Nono, an enemy of America, has some missiles, and
all of its missiles were sold to it by Colonel West, who is American.

Prove that Col. West is a criminal

Example knowledge base contd.

... it is a crime for an American to sell weapons to hostile nations:
    American(x) ∧ Weapon(y) ∧ Sells(x, y, z) ∧ Hostile(z) ⇒ Criminal(x)

Nono ... has some missiles, i.e., ∃ x Owns(Nono, x) ∧ Missile(x):
    Owns(Nono, M1) and Missile(M1)

... all of its missiles were sold to it by Colonel West
    ∀ x Missile(x) ∧ Owns(Nono, x) ⇒ Sells(West, x, Nono)

Missiles are weapons:
    Missile(x) ⇒ Weapon(x)

An enemy of America counts as "hostile":
    Enemy(x, America) ⇒ Hostile(x)

West, who is American ...
    American(West)

The country Nono, an enemy of America ...
    Enemy(Nono, America)

Forward chaining algorithm

function FOL-FC-Ask(KB, α) returns a substitution or false
  repeat until new is empty
      new ← { }
      for each sentence r in KB do
          (p1 ∧ ... ∧ pn ⇒ q) ← Standardize-Apart(r)
          for each θ such that (p1 ∧ ... ∧ pn)θ = (p1′ ∧ ... ∧ pn′)θ
                     for some p1′, ..., pn′ in KB
              q′ ← Subst(θ, q)
              if q′ is not a renaming of a sentence already in KB or new then do
                  add q′ to new
                  φ ← Unify(q′, α)
                  if φ is not fail then return φ
      add new to KB
  return false
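To make the algorithm concrete, here is a minimal sketch of forward chaining over the crime KB. It assumes the unify/is_var helpers sketched after the unification slides; the rule/fact encoding and the naive premise matcher (which enumerates substitutions fact by fact) are my own simplifications, not the book's implementation.

RULES = [
    ([("American", "x"), ("Weapon", "y"), ("Sells", "x", "y", "z"),
      ("Hostile", "z")], ("Criminal", "x")),
    ([("Missile", "x"), ("Owns", "Nono", "x")], ("Sells", "West", "x", "Nono")),
    ([("Missile", "x")], ("Weapon", "x")),
    ([("Enemy", "x", "America")], ("Hostile", "x")),
]
FACTS = {("Owns", "Nono", "M1"), ("Missile", "M1"),
         ("American", "West"), ("Enemy", "Nono", "America")}

def subst(theta, t):
    """Apply a substitution; chases chains of variable bindings."""
    if isinstance(t, tuple):
        return tuple(subst(theta, a) for a in t)
    while t in theta:
        t = theta[t]
    return t

def match(premises, facts, theta):
    """Yield every substitution grounding all premises against facts."""
    if not premises:
        yield theta
        return
    for fact in facts:
        t2 = unify(subst(theta, premises[0]), fact, dict(theta))
        if t2 is not None:
            yield from match(premises[1:], facts, t2)

def forward_chain(rules, facts):
    while True:
        new = set()
        for premises, conclusion in rules:
            for theta in match(premises, facts, {}):
                q = subst(theta, conclusion)
                if q not in facts:
                    new.add(q)
        if not new:          # fixed point: nothing left to derive
            return facts
        facts |= new

forward_chain(RULES, FACTS)
print(("Criminal", "West") in FACTS)   # True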

Forward chaining proof

[Figure: three snapshots of the proof tree. Starting from the base facts
American(West), Missile(M1), Owns(Nono, M1), Enemy(Nono, America),
the first round adds Weapon(M1), Sells(West, M1, Nono), Hostile(Nono),
and the second round adds Criminal(West).]

Properties of forward chaining

Sound and complete for first-order definite clauses
(proof similar to propositional proof)

Datalog = first-order definite clauses + no functions (e.g., crime KB)
FC terminates for Datalog in poly iterations: at most p · n^k literals

May not terminate in general if α is not entailed

This is unavoidable: entailment with definite clauses is semidecidable

Efficiency of forward chaining

Simple observation: no need to match a rule on iteration k
if a premise wasn't added on iteration k − 1
⇒ match each rule whose premise contains a newly added literal

Matching itself can be expensive

Database indexing allows O(1) retrieval of known facts
    e.g., query Missile(x) retrieves Missile(M1)

Matching conjunctive premises against known facts is NP-hard

Forward chaining is widely used in deductive databases

Hard matching example

    Diff(wa, nt) ∧ Diff(wa, sa) ∧ Diff(nt, q) ∧ Diff(nt, sa) ∧
    Diff(q, nsw) ∧ Diff(q, sa) ∧ Diff(nsw, v) ∧ Diff(nsw, sa) ∧
    Diff(v, sa) ⇒ Colorable()

    Diff(Red, Blue)    Diff(Red, Green)
    Diff(Green, Red)   Diff(Green, Blue)
    Diff(Blue, Red)    Diff(Blue, Green)

[Figure: map of Australia with regions WA, NT, SA, Q, NSW, V, T]

Colorable() is inferred iff the CSP has a solution
CSPs include 3SAT as a special case, hence matching is NP-hard

Backward chaining algorithm

function FOL-BC-Ask(KB, goals, θ) returns a set of substitutions
  inputs: KB, a knowledge base
          goals, a list of conjuncts forming a query (θ already applied)
          θ, the current substitution, initially the empty substitution { }
  local variables: answers, a set of substitutions, initially empty
  if goals is empty then return {θ}
  q′ ← Subst(θ, First(goals))
  for each sentence r in KB
          where Standardize-Apart(r) = (p1 ∧ ... ∧ pn ⇒ q)
          and θ′ ← Unify(q, q′) succeeds
      new goals ← [p1, ..., pn | Rest(goals)]
      answers ← FOL-BC-Ask(KB, new goals, Compose(θ′, θ)) ∪ answers
  return answers

Backward chaining example

[Figure: proof tree grown by backward chaining, shown over several steps.
The goal Criminal(West) unifies with the head of the crime rule, giving
{x/West} and subgoals American(West), Weapon(y), Sells(West, y, z),
Hostile(z). American(West) succeeds with { }; Weapon(y) reduces to
Missile(y), which succeeds with {y/M1}; Sells(West, M1, z) reduces via the
sales rule to Missile(M1) and Owns(Nono, M1), binding {z/Nono}.]

Backward chaining example contd.

[Figure: final proof tree. Hostile(Nono) reduces via Enemy(x, America) ⇒
Hostile(x) to Enemy(Nono, America), which succeeds with { }; the composed
substitution is {x/West, y/M1, z/Nono} and Criminal(West) is proved.]

Properties of backward chaining

Depth-first recursive proof search: space is linear in size of proof

Incomplete due to infinite loops
⇒ fix by checking current goal against every goal on stack

Inefficient due to repeated subgoals (both success and failure)
⇒ fix using caching of previous results (extra space!)

Widely used (without improvements!) for logic programming

Logic programming

Sound bite: computation as inference on logical KBs

       Logic programming                 Ordinary programming
    1. Identify problem                  Identify problem
    2. Assemble information              Assemble information
    3. Tea break                         Figure out solution
    4. Encode information in KB          Program solution
    5. Encode problem instance as facts  Encode problem instance as data
    6. Ask queries                       Apply program to data
    7. Find false facts                  Debug procedural errors

Should be easier to debug Capital(NewYork, US) than x := x + 2 !
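A minimal backward-chaining sketch over the same rule/fact encoding used in the forward-chaining sketch above, assuming the unify, is_var, and subst helpers and the RULES/FACTS from that sketch. The renaming step plays the role of Standardize-Apart; everything here is an illustrative simplification, not the book's code.

import itertools

_counter = itertools.count()

def rename(term, mapping, suffix):
    """Standardize apart: give one rule's variables a fresh suffix."""
    if isinstance(term, tuple):
        return tuple(rename(a, mapping, suffix) for a in term)
    if is_var(term):
        return mapping.setdefault(term, term + suffix)
    return term

def backward_chain(goals, theta, rules, facts):
    """Yield substitutions proving all goals, depth-first (FOL-BC-Ask)."""
    if not goals:
        yield theta
        return
    goal = subst(theta, goals[0])
    for fact in facts:                        # try ground facts first
        t2 = unify(goal, fact, dict(theta))
        if t2 is not None:
            yield from backward_chain(goals[1:], t2, rules, facts)
    for premises, conclusion in rules:        # then rules, renamed apart
        m, s = {}, "_%d" % next(_counter)
        premises = [rename(p, m, s) for p in premises]
        t2 = unify(goal, rename(conclusion, m, s), dict(theta))
        if t2 is not None:
            yield from backward_chain(premises + goals[1:], t2, rules, facts)

theta = next(backward_chain([("Criminal", "x")], {}, RULES, FACTS), None)
print(subst(theta, "x") if theta else "no proof")   # West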

Prolog systems

Basis: backward chaining with Horn clauses + bells & whistles
Widely used in Europe, Japan (basis of 5th Generation project)
Compilation techniques ⇒ approaching a billion LIPS

Program = set of clauses = head :- literal1, ..., literaln.

    criminal(X) :- american(X), weapon(Y), sells(X,Y,Z), hostile(Z).

Depth-first, left-to-right backward chaining
Built-in predicates for arithmetic etc., e.g., X is Y*Z+3
Closed-world assumption ("negation as failure")
    e.g., given alive(X) :- not dead(X).
    alive(joe) succeeds if dead(joe) fails
Efficient unification by open coding
Efficient retrieval of matching clauses by direct linking

Prolog examples

Depth-first search from a start state X:

    dfs(X) :- goal(X).
    dfs(X) :- successor(X,S), dfs(S).

No need to loop over S: successor succeeds for each

Appending two lists to produce a third:

    append([],Y,Y).
    append([X|L],Y,[X|Z]) :- append(L,Y,Z).

query:   append(A,B,[1,2]) ?
answers: A=[]    B=[1,2]
         A=[1]   B=[2]
         A=[1,2] B=[]

Resolution: brief summary

Full first-order version:

    ℓ1 ∨ ··· ∨ ℓk,    m1 ∨ ··· ∨ mn
    ──────────────────────────────────────────────────────────────────────
    (ℓ1 ∨ ··· ∨ ℓi−1 ∨ ℓi+1 ∨ ··· ∨ ℓk ∨ m1 ∨ ··· ∨ mj−1 ∨ mj+1 ∨ ··· ∨ mn)θ

where Unify(ℓi, ¬mj) = θ

For example,
    ¬Rich(x) ∨ Unhappy(x) and Rich(Ken)
yield Unhappy(Ken) with θ = {x/Ken}

Apply resolution steps to CNF(KB ∧ ¬α); complete for FOL

Conversion to CNF

Everyone who loves all animals is loved by someone:
    ∀ x [∀ y Animal(y) ⇒ Loves(x, y)] ⇒ [∃ y Loves(y, x)]

1. Eliminate biconditionals and implications
    ∀ x [¬∀ y ¬Animal(y) ∨ Loves(x, y)] ∨ [∃ y Loves(y, x)]

2. Move ¬ inwards: ¬∀ x, p ≡ ∃ x ¬p,  ¬∃ x, p ≡ ∀ x ¬p:
    ∀ x [∃ y ¬(¬Animal(y) ∨ Loves(x, y))] ∨ [∃ y Loves(y, x)]
    ∀ x [∃ y ¬¬Animal(y) ∧ ¬Loves(x, y)] ∨ [∃ y Loves(y, x)]
    ∀ x [∃ y Animal(y) ∧ ¬Loves(x, y)] ∨ [∃ y Loves(y, x)]

Conversion to CNF contd.

3. Standardize variables: each quantifier should use a different one
    ∀ x [∃ y Animal(y) ∧ ¬Loves(x, y)] ∨ [∃ z Loves(z, x)]

4. Skolemize: a more general form of existential instantiation.
Each existential variable is replaced by a Skolem function
of the enclosing universally quantified variables:
    ∀ x [Animal(F(x)) ∧ ¬Loves(x, F(x))] ∨ Loves(G(x), x)

5. Drop universal quantifiers:
    [Animal(F(x)) ∧ ¬Loves(x, F(x))] ∨ Loves(G(x), x)

6. Distribute ∧ over ∨:
    [Animal(F(x)) ∨ Loves(G(x), x)] ∧ [¬Loves(x, F(x)) ∨ Loves(G(x), x)]

Resolution proof: definite clauses

[Figure: resolution proof tree deriving the empty clause from
¬Criminal(West) and the crime KB clauses:
¬American(x) ∨ ¬Weapon(y) ∨ ¬Sells(x, y, z) ∨ ¬Hostile(z) ∨ Criminal(x),
American(West), ¬Missile(x) ∨ Weapon(x), Missile(M1),
¬Missile(x) ∨ ¬Owns(Nono, x) ∨ Sells(West, x, Nono), Owns(Nono, M1),
¬Enemy(x, America) ∨ Hostile(x), Enemy(Nono, America).]

Uncertainty

Chapter 13

Outline
♦ Uncertainty
♦ Probability
♦ Syntax and Semantics
♦ Inference
♦ Independence and Bayes' Rule

Uncertainty

Let action At = leave for airport t minutes before flight
Will At get me there on time?

Problems:
1) partial observability (road state, other drivers' plans, etc.)
2) noisy sensors (KCBS traffic reports)
3) uncertainty in action outcomes (flat tire, etc.)
4) immense complexity of modelling and predicting traffic

Hence a purely logical approach either
1) risks falsehood: "A25 will get me there on time"
or 2) leads to conclusions that are too weak for decision making:
"A25 will get me there on time if there's no accident on the bridge
and it doesn't rain and my tires remain intact etc etc."

(A1440 might reasonably be said to get me there on time
but I'd have to stay overnight in the airport ...)

Methods for handling uncertainty

Default or nonmonotonic logic:
    Assume my car does not have a flat tire
    Assume A25 works unless contradicted by evidence
Issues: What assumptions are reasonable? How to handle contradiction?

Rules with fudge factors:
    A25 ↦0.3 AtAirportOnTime
    Sprinkler ↦0.99 WetGrass
    WetGrass ↦0.7 Rain
Issues: Problems with combination, e.g., does Sprinkler cause Rain??

Probability
    Given the available evidence,
    A25 will get me there on time with probability 0.04
    Mahaviracarya (9th C.), Cardano (1565) theory of gambling

(Fuzzy logic handles degree of truth NOT uncertainty,
e.g., WetGrass is true to degree 0.2)

Probability

Probabilistic assertions summarize effects of
    laziness: failure to enumerate exceptions, qualifications, etc.
    ignorance: lack of relevant facts, initial conditions, etc.

Subjective or Bayesian probability:
Probabilities relate propositions to one's own state of knowledge
    e.g., P(A25 | no reported accidents) = 0.06

These are not claims of a "probabilistic tendency" in the current situation
(but might be learned from past experience of similar situations)

Probabilities of propositions change with new evidence:
    e.g., P(A25 | no reported accidents, 5 a.m.) = 0.15

(Analogous to logical entailment status KB |= α, not truth.)

Making decisions under uncertainty

Suppose I believe the following:
    P(A25 gets me there on time | ...)   = 0.04
    P(A90 gets me there on time | ...)   = 0.70
    P(A120 gets me there on time | ...)  = 0.95
    P(A1440 gets me there on time | ...) = 0.9999

Which action to choose?
Depends on my preferences for missing flight vs. airport cuisine, etc.

Utility theory is used to represent and infer preferences
Decision theory = utility theory + probability theory

Probability basics

Begin with a set Ω—the sample space
    e.g., 6 possible rolls of a die
ω ∈ Ω is a sample point/possible world/atomic event

A probability space or probability model is a sample space
with an assignment P(ω) for every ω ∈ Ω s.t.
    0 ≤ P(ω) ≤ 1
    Σω P(ω) = 1
e.g., P(1) = P(2) = P(3) = P(4) = P(5) = P(6) = 1/6

An event A is any subset of Ω
    P(A) = Σ{ω∈A} P(ω)
E.g., P(die roll < 4) = P(1) + P(2) + P(3) = 1/6 + 1/6 + 1/6 = 1/2

Random variables

A random variable is a function from sample points to some range,
e.g., the reals or Booleans
    e.g., Odd(1) = true

P induces a probability distribution for any r.v. X:
    P(X = xi) = Σ{ω: X(ω) = xi} P(ω)
e.g., P(Odd = true) = P(1) + P(3) + P(5) = 1/6 + 1/6 + 1/6 = 1/2

Propositions

Think of a proposition as the event (set of sample points)
where the proposition is true

Given Boolean random variables A and B:
    event a = set of sample points where A(ω) = true
    event ¬a = set of sample points where A(ω) = false
    event a ∧ b = points where A(ω) = true and B(ω) = true

Often in AI applications, the sample points are defined
by the values of a set of random variables,
i.e., the sample space is the Cartesian product of the ranges of the variables

With Boolean variables, sample point = propositional logic model
    e.g., A = true, B = false, or a ∧ ¬b

Proposition = disjunction of atomic events in which it is true
    e.g., (a ∨ b) ≡ (¬a ∧ b) ∨ (a ∧ ¬b) ∨ (a ∧ b)
    ⇒ P(a ∨ b) = P(¬a ∧ b) + P(a ∧ ¬b) + P(a ∧ b)

Why use probability?

The definitions imply that certain logically related events
must have related probabilities

E.g., P(a ∨ b) = P(a) + P(b) − P(a ∧ b)

[Figure: Venn diagram of events A and B and their intersection]

de Finetti (1931): an agent who bets according to probabilities that violate
these axioms can be forced to bet so as to lose money regardless of outcome.

Syntax for propositions

Propositional or Boolean random variables
    e.g., Cavity (do I have a cavity?)
    Cavity = true is a proposition, also written cavity

Discrete random variables (finite or infinite)
    e.g., Weather is one of ⟨sunny, rain, cloudy, snow⟩
    Weather = rain is a proposition
    Values must be exhaustive and mutually exclusive

Continuous random variables (bounded or unbounded)
    e.g., Temp = 21.6; also allow, e.g., Temp < 22.0

Arbitrary Boolean combinations of basic propositions

Prior probability

Prior or unconditional probabilities of propositions
    e.g., P(Cavity = true) = 0.1 and P(Weather = sunny) = 0.72
correspond to belief prior to arrival of any (new) evidence

Probability distribution gives values for all possible assignments:
    P(Weather) = ⟨0.72, 0.1, 0.08, 0.1⟩ (normalized, i.e., sums to 1)

Joint probability distribution for a set of r.v.s gives the probability
of every atomic event on those r.v.s (i.e., every sample point)

P(Weather, Cavity) = a 4 × 2 matrix of values:

    Weather =         sunny   rain   cloudy   snow
    Cavity = true     0.144   0.02   0.016    0.02
    Cavity = false    0.576   0.08   0.064    0.08

Every question about a domain can be answered by the joint distribution
because every event is a sum of sample points

Probability for continuous variables

Express distribution as a parameterized function of value:
    P(X = x) = U[18, 26](x) = uniform density between 18 and 26

Here P is a density; it integrates to 1.
P(X = 20.5) = 0.125 really means
    lim dx→0 P(20.5 ≤ X ≤ 20.5 + dx)/dx = 0.125

Gaussian density

    P(x) = (1/√(2πσ²)) e^(−(x−µ)²/(2σ²))

Conditional probability

Conditional or posterior probabilities
    e.g., P(cavity | toothache) = 0.8
i.e., given that toothache is all I know
NOT "if toothache then 80% chance of cavity"

(Notation for conditional distributions:
P(Cavity | Toothache) = 2-element vector of 2-element vectors)

If we know more, e.g., cavity is also given, then we have
    P(cavity | toothache, cavity) = 1
Note: the less specific belief remains valid after more evidence arrives,
but is not always useful

New evidence may be irrelevant, allowing simplification, e.g.,
    P(cavity | toothache, 49ersWin) = P(cavity | toothache) = 0.8
This kind of inference, sanctioned by domain knowledge, is crucial

Conditional probability

Definition of conditional probability:
    P(a | b) = P(a ∧ b) / P(b)   if P(b) ≠ 0

Product rule gives an alternative formulation:
    P(a ∧ b) = P(a | b) P(b) = P(b | a) P(a)

A general version holds for whole distributions, e.g.,
    P(Weather, Cavity) = P(Weather | Cavity) P(Cavity)
(View as a 4 × 2 set of equations, not matrix mult.)

Chain rule is derived by successive application of the product rule:
    P(X1, ..., Xn) = P(X1, ..., Xn−1) P(Xn | X1, ..., Xn−1)
                   = P(X1, ..., Xn−2) P(Xn−1 | X1, ..., Xn−2) P(Xn | X1, ..., Xn−1)
                   = ...
                   = Π i=1..n P(Xi | X1, ..., Xi−1)

Inference by enumeration

Start with the joint distribution:

                   toothache           ¬toothache
                catch    ¬catch     catch    ¬catch
    cavity      0.108    0.012      0.072    0.008
    ¬cavity     0.016    0.064      0.144    0.576

For any proposition φ, sum the atomic events where it is true:
    P(φ) = Σω:ω⊨φ P(ω)

P(toothache) = 0.108 + 0.012 + 0.016 + 0.064 = 0.2

Inference by enumeration contd.

Can also compute conditional probabilities:

    P(¬cavity | toothache) = P(¬cavity ∧ toothache) / P(toothache)
                           = (0.016 + 0.064) / (0.108 + 0.012 + 0.016 + 0.064)
                           = 0.4

P(cavity ∨ toothache) = 0.108 + 0.012 + 0.072 + 0.008 + 0.016 + 0.064 = 0.28

Normalization

Denominator can be viewed as a normalization constant α:

    P(Cavity | toothache) = α P(Cavity, toothache)
      = α [P(Cavity, toothache, catch) + P(Cavity, toothache, ¬catch)]
      = α [⟨0.108, 0.016⟩ + ⟨0.012, 0.064⟩]
      = α ⟨0.12, 0.08⟩ = ⟨0.6, 0.4⟩

General idea: compute distribution on query variable
by fixing evidence variables and summing over hidden variables

Inference by enumeration, contd.

Let X be all the variables. Typically, we want
the posterior joint distribution of the query variables Y
given specific values e for the evidence variables E

Let the hidden variables be H = X − Y − E

Then the required summation of joint entries is done by
summing out the hidden variables:

    P(Y | E = e) = α P(Y, E = e) = α Σh P(Y, E = e, H = h)

The terms in the summation are joint entries because Y, E, and H
together exhaust the set of random variables

Obvious problems:
1) Worst-case time complexity O(d^n) where d is the largest arity
2) Space complexity O(d^n) to store the joint distribution
3) How to find the numbers for O(d^n) entries???
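The toothache–catch–cavity joint above is small enough to enumerate directly. A minimal sketch; the variable names and the keying convention are my own.

# Joint distribution P(Cavity, Toothache, Catch) from the slide,
# keyed by (cavity, toothache, catch).
joint = {
    (True,  True,  True):  0.108, (True,  True,  False): 0.012,
    (True,  False, True):  0.072, (True,  False, False): 0.008,
    (False, True,  True):  0.016, (False, True,  False): 0.064,
    (False, False, True):  0.144, (False, False, False): 0.576,
}

def prob(phi):
    """P(phi) = sum of the atomic events where proposition phi holds."""
    return sum(p for w, p in joint.items() if phi(w))

toothache = prob(lambda w: w[1])                        # 0.2
posterior = prob(lambda w: w[0] and w[1]) / toothache   # P(cavity | toothache)
print(toothache, round(posterior, 3))                   # 0.2 0.6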

Independence

A and B are independent iff
    P(A | B) = P(A)  or  P(B | A) = P(B)  or  P(A, B) = P(A) P(B)

[Figure: P(Toothache, Catch, Cavity, Weather) decomposes into
P(Toothache, Catch, Cavity) and P(Weather)]

    P(Toothache, Catch, Cavity, Weather)
        = P(Toothache, Catch, Cavity) P(Weather)

32 entries reduced to 12; for n independent biased coins, 2^n → n

Absolute independence powerful but rare

Dentistry is a large field with hundreds of variables,
none of which are independent. What to do?

Conditional independence

P(Toothache, Catch, Cavity) has 2³ − 1 = 7 independent entries

If I have a cavity, the probability that the probe catches in it
doesn't depend on whether I have a toothache:
    (1) P(catch | toothache, cavity) = P(catch | cavity)

The same independence holds if I haven't got a cavity:
    (2) P(catch | toothache, ¬cavity) = P(catch | ¬cavity)

Catch is conditionally independent of Toothache given Cavity:
    P(Catch | Toothache, Cavity) = P(Catch | Cavity)

Equivalent statements:
    P(Toothache | Catch, Cavity) = P(Toothache | Cavity)
    P(Toothache, Catch | Cavity) = P(Toothache | Cavity) P(Catch | Cavity)

Conditional independence contd.

Write out full joint distribution using chain rule:
    P(Toothache, Catch, Cavity)
        = P(Toothache | Catch, Cavity) P(Catch, Cavity)
        = P(Toothache | Catch, Cavity) P(Catch | Cavity) P(Cavity)
        = P(Toothache | Cavity) P(Catch | Cavity) P(Cavity)

I.e., 2 + 2 + 1 = 5 independent numbers (equations 1 and 2 remove 2)

In most cases, the use of conditional independence reduces the size of the
representation of the joint distribution from exponential in n to linear in n.

Conditional independence is our most basic and robust
form of knowledge about uncertain environments.

Bayes' Rule

Product rule P(a ∧ b) = P(a | b) P(b) = P(b | a) P(a)
⇒ Bayes' rule
    P(a | b) = P(b | a) P(a) / P(b)

or in distribution form
    P(Y | X) = P(X | Y) P(Y) / P(X) = α P(X | Y) P(Y)

Useful for assessing diagnostic probability from causal probability:
    P(Cause | Effect) = P(Effect | Cause) P(Cause) / P(Effect)

E.g., let M be meningitis, S be stiff neck:
    P(m | s) = P(s | m) P(m) / P(s) = (0.8 × 0.0001) / 0.1 = 0.0008

Note: posterior probability of meningitis still very small!
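The meningitis computation is a one-liner; a minimal sketch with the slide's numbers (variable names are mine):

# Diagnostic probability from causal probability via Bayes' rule:
# P(m | s) = P(s | m) P(m) / P(s)
p_s_given_m, p_m, p_s = 0.8, 0.0001, 0.1
p_m_given_s = p_s_given_m * p_m / p_s
print(p_m_given_s)   # 0.0008 — still very small, despite the strong symptom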

Bayes' Rule and conditional independence

P(Cavity | toothache ∧ catch)
    = α P(toothache ∧ catch | Cavity) P(Cavity)
    = α P(toothache | Cavity) P(catch | Cavity) P(Cavity)

This is an example of a naive Bayes model:

    P(Cause, Effect1, ..., Effectn) = P(Cause) Πi P(Effecti | Cause)

[Figure: naive Bayes networks—Cause with children Effect1 ... Effectn;
Cavity with children Toothache and Catch]

Total number of parameters is linear in n

Wumpus World

[Figure: 4×4 wumpus grid; squares [1,1], [1,2], [2,1] are known (OK),
with breezes observed in [1,2] and [2,1]]

Pij = true iff [i, j] contains a pit
Bij = true iff [i, j] is breezy
Include only B1,1, B1,2, B2,1 in the probability model

Specifying the probability model

The full joint distribution is P(P1,1, ..., P4,4, B1,1, B1,2, B2,1)

Apply product rule: P(B1,1, B1,2, B2,1 | P1,1, ..., P4,4) P(P1,1, ..., P4,4)

(Do it this way to get P(Effect | Cause).)

First term: 1 if pits are adjacent to breezes, 0 otherwise

Second term: pits are placed randomly, probability 0.2 per square:

    P(P1,1, ..., P4,4) = Πi,j P(Pi,j) = 0.2^n × 0.8^(16−n)   for n pits

Observations and query

We know the following facts:
    b = ¬b1,1 ∧ b1,2 ∧ b2,1
    known = ¬p1,1 ∧ ¬p1,2 ∧ ¬p2,1

Query is P(P1,3 | known, b)

Define Unknown = the Pij s other than P1,3 and Known

For inference by enumeration, we have
    P(P1,3 | known, b) = α Σunknown P(P1,3, unknown, known, b)

Grows exponentially with number of squares!

Using conditional independence

Basic insight: observations are conditionally independent of other hidden
squares given neighbouring hidden squares

[Figure: grid partitioned into Known, Fringe, Query, and Other squares]

Define Unknown = Fringe ∪ Other

    P(b | P1,3, Known, Unknown) = P(b | P1,3, Known, Fringe)

Manipulate query into a form where we can use this!

Using conditional independence contd.

P(P1,3 | known, b)
  = α Σunknown P(P1,3, unknown, known, b)
  = α Σunknown P(b | P1,3, known, unknown) P(P1,3, known, unknown)
  = α Σfringe Σother P(b | known, P1,3, fringe, other) P(P1,3, known, fringe, other)
  = α Σfringe Σother P(b | known, P1,3, fringe) P(P1,3, known, fringe, other)
  = α Σfringe P(b | known, P1,3, fringe) Σother P(P1,3, known, fringe, other)
  = α Σfringe P(b | known, P1,3, fringe) Σother P(P1,3) P(known) P(fringe) P(other)
  = α P(known) P(P1,3) Σfringe P(b | known, P1,3, fringe) P(fringe) Σother P(other)
  = α′ P(P1,3) Σfringe P(b | known, P1,3, fringe) P(fringe)

Using conditional independence contd.

[Figure: the fringe configurations consistent with the breezes, with
probabilities 0.2 × 0.2 = 0.04, 0.2 × 0.8 = 0.16, 0.8 × 0.2 = 0.16
when [1,3] is a pit, and 0.04, 0.16 when it is not]

    P(P1,3 | known, b) = α′ ⟨0.2 (0.04 + 0.16 + 0.16), 0.8 (0.04 + 0.16)⟩
                       ≈ ⟨0.31, 0.69⟩

    P(P2,2 | known, b) ≈ ⟨0.86, 0.14⟩

Summary

Probability is a rigorous formalism for uncertain knowledge

Joint probability distribution specifies probability of every atomic event

Queries can be answered by summing over atomic events

For nontrivial domains, we must find a way to reduce the joint size

Independence and conditional independence provide the tools
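The final arithmetic can be checked directly. A minimal sketch of the fringe sum with pit prior p = 0.2; the consistent fringe configurations are hard-coded from the figure, and the names are mine.

p = 0.2   # pit prior per square

# P(fringe) for fringe configurations consistent with the observed breezes:
# three configurations work when [1,3] is a pit, two when it is not.
with_pit    = [p * p, p * (1 - p), (1 - p) * p]   # 0.04, 0.16, 0.16
without_pit = [p * p, p * (1 - p)]                # 0.04, 0.16

unnorm = (p * sum(with_pit), (1 - p) * sum(without_pit))
alpha = 1 / sum(unnorm)
print([round(alpha * u, 2) for u in unnorm])      # [0.31, 0.69]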

Learning from Observations

Chapter 18, Sections 1–3

Outline
♦ Learning agents
♦ Inductive learning
♦ Decision tree learning
♦ Measuring learning performance

Learning

Learning is essential for unknown environments,
i.e., when designer lacks omniscience

Learning is useful as a system construction method,
i.e., expose the agent to reality rather than trying to write it down

Learning modifies the agent's decision mechanisms to improve performance

Learning agents

[Figure: learning-agent architecture. A Critic, applying a fixed
Performance standard to percepts from the Sensors, sends feedback to the
Learning element; the Learning element exchanges knowledge with the
Performance element, makes changes to it, and sets learning goals for a
Problem generator, which proposes exploratory experiments; the Performance
element drives the Effectors acting on the Environment.]

Learning element

Design of learning element is dictated by
♦ what type of performance element is used
♦ which functional component is to be learned
♦ how that functional component is represented
♦ what kind of feedback is available

Example scenarios:

    Performance element  Component          Representation            Feedback
    Alpha-beta search    Eval. fn.          Weighted linear function  Win/loss
    Logical agent        Transition model   Successor-state axioms    Outcome
    Utility-based agent  Transition model   Dynamic Bayes net         Outcome
    Simple reflex agent  Percept-action fn  Neural net                Correct action

Supervised learning: correct answers for each instance
Reinforcement learning: occasional rewards

Inductive learning (a.k.a. Science)

Simplest form: learn a function from examples (tabula rasa)

f is the target function

An example is a pair x, f(x), e.g., a tic-tac-toe board position paired
with the label +1

Problem: find a(n) hypothesis h
    such that h ≈ f
given a training set of examples

(This is a highly simplified model of real learning:
– Ignores prior knowledge
– Assumes a deterministic, observable "environment"
– Assumes examples are given
– Assumes that the agent wants to learn f—why?)

Inductive learning method

Construct/adjust h to agree with f on training set
(h is consistent if it agrees with f on all examples)

E.g., curve fitting:

[Figure: a sequence of hypotheses fit to the same data points f(x) vs. x,
shown over several slides—a straight line, higher-order polynomials that
pass through more points, and finally a wiggly exact fit]

Ockham's razor: maximize a combination of consistency and simplicity

Attribute-based representations

Examples described by attribute values (Boolean, discrete, continuous, etc.)
E.g., situations where I will/won't wait for a table:

    Example | Alt Bar Fri Hun Pat   Price Rain Res Type    Est   | WillWait
    X1      |  T   F   F   T  Some  $$$    F    T  French  0–10  |   T
    X2      |  T   F   F   T  Full  $      F    F  Thai    30–60 |   F
    X3      |  F   T   F   F  Some  $      F    F  Burger  0–10  |   T
    X4      |  T   F   T   T  Full  $      F    F  Thai    10–30 |   T
    X5      |  T   F   T   F  Full  $$$    F    T  French  >60   |   F
    X6      |  F   T   F   T  Some  $$     T    T  Italian 0–10  |   T
    X7      |  F   T   F   F  None  $      T    F  Burger  0–10  |   F
    X8      |  F   F   F   T  Some  $$     T    T  Thai    0–10  |   T
    X9      |  F   T   T   F  Full  $      T    F  Burger  >60   |   F
    X10     |  T   T   T   T  Full  $$$    F    T  Italian 10–30 |   F
    X11     |  F   F   F   F  None  $      F    F  Thai    0–10  |   F
    X12     |  T   T   T   T  Full  $      F    F  Burger  30–60 |   T

Classification of examples is positive (T) or negative (F)

Decision trees

One possible representation for hypotheses
E.g., here is the "true" tree for deciding whether to wait:

    Patrons?
      None → F
      Some → T
      Full → WaitEstimate?
        >60   → F
        30–60 → Alternate?
                  No  → Reservation?
                          No  → Bar?  No → F,  Yes → T
                          Yes → T
                  Yes → Fri/Sat?  No → F,  Yes → T
        10–30 → Hungry?
                  No  → T
                  Yes → Alternate?
                          No  → T
                          Yes → Raining?  No → F,  Yes → T
        0–10  → T

Expressiveness

Decision trees can express any function of the input attributes.
E.g., for Boolean functions, truth table row → path to leaf:

    A   B   A xor B
    F   F      F
    F   T      T
    T   F      T
    T   T      F

[Figure: the corresponding tree tests A at the root and B on each branch]

Trivially, there is a consistent decision tree for any training set
w/ one path to leaf for each example (unless f nondeterministic in x)
but it probably won't generalize to new examples

Prefer to find more compact decision trees

709.g.616 trees Chapter 18. Sections 1–3 18 Hypothesis spaces Hypothesis spaces How many distinct decision trees with n Boolean attributes?? How many distinct decision trees with n Boolean attributes?? = number of Boolean functions = number of Boolean functions n n = number of distinct truth tables with 2n rows = 22 = number of distinct truth tables with 2n rows = 22 E.. Sections 1–3 19 Chapter 18.551. Sections 1–3 17 Chapter 18. Hypothesis spaces Hypothesis spaces How many distinct decision trees with n Boolean attributes?? How many distinct decision trees with n Boolean attributes?? = number of Boolean functions = number of Boolean functions = number of distinct truth tables with 2n rows Chapter 18.446. with 6 Boolean attributes. Sections 1–3 20 .073. there are 18.744.

Hypothesis spaces contd.

How many purely conjunctive hypotheses (e.g., Hungry ∧ ¬Rain)??

Each attribute can be in (positive), in (negative), or out
⇒ 3^n distinct conjunctive hypotheses

More expressive hypothesis space
– increases chance that target function can be expressed
– increases number of hypotheses consistent w/ training set
⇒ may get worse predictions

Decision tree learning

Aim: find a small tree consistent with the training examples

Idea: (recursively) choose "most significant" attribute as root of (sub)tree

function DTL(examples, attributes, default) returns a decision tree
  if examples is empty then return default
  else if all examples have the same classification then return the classification
  else if attributes is empty then return Mode(examples)
  else
      best ← Choose-Attribute(attributes, examples)
      tree ← a new decision tree with root test best
      for each value vi of best do
          examplesi ← {elements of examples with best = vi}
          subtree ← DTL(examplesi, attributes − best, Mode(examples))
          add a branch to tree with label vi and subtree subtree
      return tree

Choosing an attribute

Idea: a good attribute splits the examples into subsets that are (ideally)
"all positive" or "all negative"

[Figure: splitting the 12 examples by Patrons? (None/Some/Full) yields
nearly pure subsets, while splitting by Type? (French/Italian/Thai/Burger)
leaves each subset half positive, half negative]

Patrons? is a better choice—gives information about the classification

Information

Information answers questions

The more clueless I am about the answer initially,
the more information is contained in the answer

Scale: 1 bit = answer to Boolean question with prior ⟨0.5, 0.5⟩

Information in an answer when prior is ⟨P1, ..., Pn⟩ is
    H(⟨P1, ..., Pn⟩) = Σ i=1..n −Pi log2 Pi
(also called entropy of the prior)

Information contd.

Suppose we have p positive and n negative examples at the root
⇒ H(⟨p/(p+n), n/(p+n)⟩) bits needed to classify a new example
E.g., for 12 restaurant examples, p = n = 6 so we need 1 bit

An attribute splits the examples E into subsets Ei, each of which
(we hope) needs less information to complete the classification

Let Ei have pi positive and ni negative examples
⇒ H(⟨pi/(pi+ni), ni/(pi+ni)⟩) bits needed to classify a new example
⇒ expected number of bits per example over all branches is

    Σi ((pi + ni)/(p + n)) H(⟨pi/(pi+ni), ni/(pi+ni)⟩)

For Patrons?, this is 0.459 bits; for Type this is (still) 1 bit
⇒ choose the attribute that minimizes the remaining information needed

Example contd.

Decision tree learned from the 12 examples:

    Patrons?
      None → F
      Some → T
      Full → Hungry?
        No  → F
        Yes → Type?
          French  → T
          Italian → F
          Thai    → Fri/Sat?  No → F,  Yes → T
          Burger  → T

Substantially simpler than "true" tree—a more complex hypothesis isn't
justified by small amount of data

Performance measurement

How do we know that h ≈ f? (Hume's Problem of Induction)
1) Use theorems of computational/statistical learning theory
2) Try h on a new test set of examples
   (use same distribution over example space as training set)

Learning curve = % correct on test set as a function of training set size

[Figure: learning curve for the restaurant data, rising from about 0.4
with few examples toward about 0.9 at 100 training examples]
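A minimal sketch checking the 0.459-bit figure for Patrons? against 1 bit for Type?; the (positive, negative) counts are read off the 12 examples, and the helper names are mine.

from math import log2

def H(ps):
    """Entropy of a discrete prior, in bits."""
    return -sum(p * log2(p) for p in ps if p > 0)

def remainder(splits, total=12):
    """Expected bits still needed after splitting into (pos, neg) subsets."""
    return sum((p + n) / total * H((p / (p + n), n / (p + n)))
               for p, n in splits)

# Patrons?: None (0+, 2-), Some (4+, 0-), Full (2+, 4-)
print(round(remainder([(0, 2), (4, 0), (2, 4)]), 3))           # 0.459
# Type?: French (1+, 1-), Italian (1+, 1-), Thai (2+, 2-), Burger (2+, 2-)
print(round(remainder([(1, 1), (1, 1), (2, 2), (2, 2)]), 3))   # 1.0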

Performance measurement contd.

Learning curve depends on
– realizable (can express target function) vs. non-realizable
  non-realizability can be due to missing attributes
  or restricted hypothesis class (e.g., thresholded linear function)
– redundant expressiveness (e.g., loads of irrelevant attributes)

[Figure: % correct vs. # of examples for the realizable, redundant,
and nonrealizable cases]

Summary

Learning needed for unknown environments, lazy designers

Learning agent = performance element + learning element

Learning method depends on type of performance element, available
feedback, type of component to be improved, and its representation

For supervised learning, the aim is to find a simple hypothesis
that is approximately consistent with training examples

Decision tree learning using information gain

Learning performance = prediction accuracy measured on test set


Introduction to Probability and Reasoning

Ramprasad S Joshi

April 8, 2018

Abstract

After studying the modules on Search and Logical Inference in an Artificial Intelligence course, we usually enter the uncertain waters of probabilistic models and planning-learning. This note is supposed to be a bridge from the certain world of hard computing (and haphazard heuristic soft computing when the hard computing becomes NP-Hard) to embracing uncertainty with rigour, without sacrificing a lot of the guarantees coming from rigour. We use the Monty Hall Problem to demonstrate this. We demonstrate how the language of probability theory is a convenient shorthand to model the myriad possibilities of NP-Hard computation in boolean logic.

1 Introduction

The Monty Hall Problem is, in its bare essentials, this:

    Suppose you're on a game show, and you're given the choice of three
    doors: Behind one door are the keys of a new car that you can win;
    behind the others, goats. You pick a door, say A, and the host, who
    knows what's behind the doors, opens another door, say C, showing a
    goat. He then says to you, "Do you want to pick door B?" Is it to
    your advantage to switch your choice?

This is a game-show-related paradox which defied the arguments of no less than a genius like Paul Erdös. The solution is that it is advantageous to always switch (for some people somewhat counterintuitively), doubling the chance of winning the car, as shown in the first known formal solution to this problem (Selvin, 1975b; see Figure 1). Selvin again (1975a) gave a proof by the more traditional Bayesian argument, in response to a barrage of objections. Let us approach the same via propositional logic and formal modelling.

[Figure 1: Selvin's Solution]

2 Formal Enumeration

Selvin (1975b) has shown that there are 9 cases (see Figure 1) of composition of original configuration, first choice, and switching choice. Let us formalize Selvin's original (1975b) approach in our bridge-between-logic-and-probability way. Here we also parametrize the probabilities (generalizing and dropping the simple fairness assumptions) as in Table 1.

Table 1: Generalization and Enumeration

    Description                                             Door A   Door B   Door C
    Can be car door                                         a        b        c = 1 − a − b
    Contestant first chooses                                x        y        z = 1 − x − y
    Host shows (car door is A, Contestant also chooses it)  0        ta       1 − ta
    Host shows (car door is B, Contestant also chooses it)  tb       0        1 − tb
    Host shows (car door is C, Contestant also chooses it)  tc       1 − tc   0
    Host shows (car door is B, Contestant chose A)          0        0        1
    Host shows (car door is C, Contestant chose A)          0        1        0
    Host shows (car door is A, Contestant chose B)          0        0        1
    Host shows (car door is C, Contestant chose B)          1        0        0
    Host shows (car door is A, Contestant chose C)          0        1        0
    Host shows (car door is B, Contestant chose C)          1        0        0

Switching probabilities for the Contestant:

    Chose   Shown   Switches
    A       B       sab
    A       C       sac
    B       A       sba
    B       C       sbc
    C       A       sca
    C       B       scb

The propositions in the KB are:

    p : The car door is chosen in the first choice.
    q : After the show, the chosen door is switched.

    p   q   win   P
    T   T   F     P[p]P[q]
    T   F   T     P[p](1 − P[q])
    F   T   T     (1 − P[p])P[q]
    F   F   F     (1 − P[p])(1 − P[q])

And then we take a probability distribution P on these propositions under natural assumptions, such as that the probability of any door being the car door is equal: P[p] = 1/3. With P[q] = 1 (the fixed, "always switching" strategy), P[win] = 1 − P[p], which is 2/3 when P[p] = 1/3 (uniformity).
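A quick Monte Carlo check of the 2/3 claim under the uniform assumptions; a minimal sketch, with door labels, trial counts, and function names being illustrative choices of mine.

import random

def play(switch, doors=3, trials=100_000):
    wins = 0
    for _ in range(trials):
        car = random.randrange(doors)
        pick = random.randrange(doors)
        # Host opens a door that is neither the pick nor the car
        shown = random.choice([d for d in range(doors) if d not in (pick, car)])
        if switch:   # with 3 doors, exactly one unopened door remains
            pick = next(d for d in range(doors) if d not in (pick, shown))
        wins += (pick == car)
    return wins / trials

print(play(switch=False))   # ≈ 1/3
print(play(switch=True))    # ≈ 2/3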

3 Maximizing the Winning Probability

Let us express the probabilities in FOL thus:

The Car Door:
    P[car(A)] = a,   P[car(B)] = b,   P[car(C)] = c

The Chosen Door:
    P[chosen(A)] = x,   P[chosen(B)] = y,   P[chosen(C)] = z

The Shown Door:
    P[shown(B) | car(A) ∧ chosen(A)] = ta,   P[shown(C) | car(A) ∧ chosen(A)] = 1 − ta
    P[shown(A) | car(B) ∧ chosen(B)] = tb,   P[shown(C) | car(B) ∧ chosen(B)] = 1 − tb
    P[shown(A) | car(C) ∧ chosen(C)] = tc,   P[shown(B) | car(C) ∧ chosen(C)] = 1 − tc

Switching:
    P[switched | chosen(A) ∧ shown(B)] = sa,   P[switched | chosen(A) ∧ shown(C)] = 1 − sa
    P[switched | chosen(B) ∧ shown(A)] = sb,   P[switched | chosen(B) ∧ shown(C)] = 1 − sb
    P[switched | chosen(C) ∧ shown(A)] = sc,   P[switched | chosen(C) ∧ shown(B)] = 1 − sc

With the fully parametrized switching probabilities sab, ..., scb of Table 1, a door is won either by sticking with a correct first choice or by switching away from a wrong one:

    P[win] = Σ X∈{A,B,C} ( P[car(X) ∧ chosen(X) ∧ ¬switched]
                           + P[car(X) ∧ ¬chosen(X) ∧ switched] )

Expanding over the Host's possible choices (X, Y, Z distinct),

    P[win] = Σ X ( P[car(X) ∧ chosen(X) ∧ shown(Y) ∧ ¬switched]
                   + P[car(X) ∧ chosen(X) ∧ shown(Z) ∧ ¬switched]
                   + P[car(X) ∧ chosen(Y) ∧ switched]
                   + P[car(X) ∧ chosen(Z) ∧ switched] )

           = axta(1 − sab) + ax(1 − ta)(1 − sac)
             + bytb(1 − sba) + by(1 − tb)(1 − sbc)
             + cztc(1 − sca) + cz(1 − tc)(1 − scb)
             + aysbc + azscb + bxsac + bzsca + cxsab + cysba          (1)

3.1 Generalization

Claim 1. Given the probabilities in Table 1, with s∗∗ = 1 (the fixed "always switching" strategy), the winning probability is 1 − (ax + by + cz).

Proof. First note that ax + by + cz is the probability of the Contestant choosing the car door in the first chance. Substituting s∗∗ = 1 in (1),

    P[win | switching] = ay + az + bx + bz + cx + cy = 1 − (ax + by + cz).

Substitute a = b = c = x = y = z = 1/3: then P[win | switching] = 6/9 = 2/3, while P[win | ¬switching] = ax + by + cz = 3/(3 × 3) = 1/3. Thus

    P[win | switching] = 2 P[win | ¬switching] = 2(a + b + c)/3 = 2/3

if the Contestant chooses the doors uniformly.

The main lesson here is about conditional independence: with x = y = z and the fixed strategy of always switching, the winning probability is independent of the doors' priors a, b, c and of the Host's choice behaviour (t∗) when the Contestant does choose the car door the first time. Otherwise it is higher than 2/3 when the Contestant's choice probabilities x, y, z are in reverse order of the door priors a, b, c: by the Rearrangement Inequality (Hardy et al., 1952, Theorem 368), ax + by + cz is minimized when a < b < c and x > y > z, and maximized when a < b < c and x < y < z. Thus the lower the chance of choosing the car door in the first choice, the better is the winning probability with a fixed switching strategy. For example, if [a, b, c] = [1/6, 1/3, 1/2] and [x, y, z] = [1/2, 1/3, 1/6], then P[win] = 13/18 > 2/3. If, in the same situation, the Contestant never switches (s∗∗ = 0), the winning probability is ax + by + cz, which is 1/3 under uniformity.

If there are n > 3 doors, taking the reward-door priors to be ⟨ai⟩ and the Contestant's first-choice probabilities to be ⟨xi⟩, and the Host reveals k < n − 1 non-reward doors, then the probability that the reward is behind one of the n − k − 1 doors not revealed to and not chosen by the Contestant is again the complement of the probability that the Contestant chose the reward door in the first choice, P[win | switching] = 1 − Σi ai xi; the fully parametrized expression generalizes (1), with switching probabilities indexed by the chosen door, the switched-to door, and a multi-index α ranging over the k-element groups of doors the Host may reveal.

Moreover, if the Contestant can guess the car-door priors, the first choice should be made against the guess, so that the Host reveals more information, and the switch should then follow the priors: deliberately making a bad choice the first time, to confirm the correct guess about the car door, pays off even more! Noting that the two factors are governed by two different choices (first choice and switching), this is a multiobjective optimization problem: the left factor is maximized by ordering the first choice inversely to the door priors, but the right factor is minimized by the same. For that, we need more rigorous analysis.

References

Hardy, G., Littlewood, J., and Pólya, G. (1952). Inequalities. Cambridge University Press, 2nd edition.

Selvin, S. (1975a). A problem in probability (in letters to the editor). The American Statistician, 29(1):67–71.

Selvin, S. (1975b). On the Monty Hall problem (in letters to the editor). The American Statistician, 29(3):131–134.


When Many Queens Fight Over Territory
How to make MIN-CONFLICTS work

Ramprasad S Joshi

April 2, 2017

Abstract

This short note is about the assertion in AIMA (Russell and Norvig, 2010, p.221) that "on the n-queens problem, ... the run time of min-conflicts is roughly independent of problem size."

1 Introduction

The N-QUEENS problem is a classic pedagogical constraint satisfaction problem, or csp for short. In a csp there is a finite set of variables that are to be assigned values from (finite) known domains such that the values do not violate any of some fixed constraints. Here the problem is to place n queens on an n × n chessboard such that no queen attacks another; in chess, a queen can attack along the row, the column and each diagonal it is placed on.

[Figure 1: 8-Queens Instance Example]

In the instance of Figure 1, only the corner queens are attacking each other; note that to verify this state we need at least n(n−1)/2 comparisons. It is easy to see that if the first four queens (in a left-to-right column sequencing) are moved one row each down, the latter four queens are moved one row each up, and then one of the corner queens is moved to the other corner of its column, even the remaining conflict is resolved and we have a solution. In fact, moving any single queen anywhere else in its column from the position of Figure 1 only grows the number of conflicts, so the four-down-four-up solution is not found by a strategy (assessed by the usual conventional rules) unless and until it is spotted — unless any sequence of moves of any length assessed in hindsight is considered a strategy. This observation brings us to the question of what constitutes a strategy, and to the concomitant questions of how to design, verify and assess one. We will take that general question up for discussion later, when we have accounted for the particular question at hand.

2 Solution Strategies

There are two widely recognized effective solution strategies for csps in general:

1. Complete search, with heuristic preferences, such as choosing most constrained variables and least constraining values to be tried first, to reduce the search space. This is depth-first tree search, in which assignment of rows to queens happens in a one-queen-at-a-time fashion; whenever a "dead-end" happens in the search, the latest queens assigned lose their assignment in favour of potential improvement. BACKTRACKING SEARCH falls in this category, and its completeness ensures the efficacy: feasible assignments will be visited exhaustively.

2. Local search, or iterative improvement, which seeks to exploit some inherent solution-localization properties of the search space. Beginning with some initial complete assignment, such a strategy tries to improve it by some randomized or deterministic heuristics. These heuristics, when optimal, are efficient; otherwise the solution is not found easily.

2.1 Formal Definition

We know that two queens cannot be on the same row or column without attacking each other. Thus, to reduce the search space, we fix that each queen is on a different column; the domain is 1 to n when n is the number of queens (also determining that n × n is the board size), and what remains to be searched is only a permutation on the rows, so that the constraints are only on the diagonals.

Definition 1.1 (N-QUEENS). Given n, find a permutation x1, x2, ..., xn of [1..n] such that

    ∀ i ≠ j ∈ [1..n],  |xi − xj| ≠ |i − j|.

In this formal definition it is implicit that the ith queen remains on the ith column. The constraints constitute at least n(n−1)/2 comparisons of absolute differences of row-columns.
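A minimal sketch of the verification step implied by Definition 1.1: counting attacking pairs of a row permutation in the obvious O(n²) way. The helper names are mine, not from the note.

def conflicts(rows):
    """Number of attacking pairs; queen i sits on column i, row rows[i]."""
    n = len(rows)
    return sum(1
               for i in range(n) for j in range(i + 1, n)
               if rows[i] == rows[j] or abs(rows[i] - rows[j]) == j - i)

def is_solution(rows):
    return conflicts(rows) == 0

print(is_solution([2, 4, 6, 8, 3, 1, 7, 5]))   # True: a classic 8-queens solution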

the updation can depend on both the current (algorithmic) state and the curent assignment. Safe(Q)) 1. the number of steps allowed before giving up 1: Q ← RandomPermutation([1.nconflicts(var.csp) 5: Q ← UpdateState(State. We tested the same 52 1 79 0 0 with the following conditions: 85 1 117 0 0 1.current = current 4: if RandomChoice then for var in csp. p. though 26 1 249 0 0 again doing Python of this might have been equally useful or rewarding.221) describes the M IN -C ONFLICTS as follows (see figure 2): Algorithm 1 Local Search for NQueens function M IN -C ONFLICTS(csp.edu). If there is a tie. However. with again the choice of the tiebreaker being left open for choice and reinterpretation. current) 7: Q. q)) val = min_conflicts_value(csp.conflicted_vars(current) 9: end while if not conflicted: return current var = random. csp. with randomization 5 6 . 99 1 1000 0 2 lambda val: csp. on the n-queens problem. Table 2: Without BADCHOICE.domains[var]. 2010. choose at random. RandomChoice. ”Return the value that will give var the least number of conflicts.221): Amazingly. Do the same without that flag during compilation. just a surrendipitous choice. Table 1: With BADCHOICE 3. This remarkable observation was the stimulus leading to a great deal of research in the 1990s on local searc h and the distinction between easy and hard problems. current.v. the number of queens. max_steps=1000000): 1: Q ← RandomPermutation([1. var. and test run for various n repeatedly. line 4 is where the difference between the text.. This allows such a wide variety of heuristics that it can represent plain hill- climbing.berkeley. The assign- and the various implementations given in the accompanying material on the text-book’s website ment to value is made by a call to min conflicts value. and then go to the im. as we will see in the next section. distributed throughout the state space. number of steps passed or to go. Q) set var = value in current 6: end while return f ailure 7: return (Q. which is the subject matter of this article.v. max steps) returns a solution or failure inputs: csp. or deterministic state. We found that indeed. we reviewed our implementation. the run time of The line “value ← the value v for var that minimizes C ONFLICTS(var.VARIABLES 4: q ← ChooseRandom([1. The latter is what can be a source of inconsistent results. p. We implemented this in C (see Appendix A). S IMULATED A NNEALING and singleton population mutation-only G ENETIC A LGORITHM can similarly be retrofitted. Min-conflicts also works well for hard problems. current.1. It of 50 steps (after the initial assignment). what went wrong and why? 3. q)) # Now repeatedly choose a random conflicted variable and change it for i in range(max_steps): 8: end if conflicted = csp.csp)” in the text-book’s min-conflicts is roughly independent of problem size. 2010. var.n]) current ← an initial complete assignment for csp 2: State ← InitializeState(Q) for i = 1 to max steps do 3: while ¬(Safe(Q) ∨ Failed(State)) do if current is a solution for csp then return current var ← a randomly chosen conflicted variable from csp. val. the number of queens. val. we had to go back to the basics: looking at the algorithm’s description First. 3 4 #______________________________________________________________________________ Algorithm 2 MinConflicts Hill-Climbing for NQueens # Min-conflicts hillclimbing search for CSPs Require: n. max steps . in Table 2. there is this Python implementation from the text-book’s web-site (see figure 3). 
3 MIN-CONFLICTS in Action

AIMA says (Russell and Norvig, 2010, p.221):

    Amazingly, on the n-queens problem, the run time of min-conflicts is roughly independent of problem size. It solves even the million-queens problem in an average of 50 steps (after the initial assignment). This remarkable observation was the stimulus leading to a great deal of research in the 1990s on local search and the distinction between easy and hard problems, which we take up in Chapter 7. Roughly speaking, n-queens is easy for local search because solutions are densely distributed throughout the state space. Min-conflicts also works well for hard problems.

However, when we implemented MIN-CONFLICTS initially, we did not get such a nice performance: we got something between linear and quadratic time in the number of queens n (even if you don't count the initial placement of queens), making us wonder — what went wrong, and why?

The same page (Russell and Norvig, 2010, p.221) describes MIN-CONFLICTS as follows (see Figure 2):

function MIN-CONFLICTS(csp, max_steps) returns a solution or failure
    inputs: csp, a constraint satisfaction problem
            max_steps, the number of steps allowed before giving up
    current ← an initial complete assignment for csp
    for i = 1 to max_steps do
        if current is a solution for csp then return current
        var ← a randomly chosen conflicted variable from csp.VARIABLES
        value ← the value v for var that minimizes CONFLICTS(var, v, current, csp)
        set var = value in current
    return failure

Figure 2: The Text-Book Description

The line "value ← the value v for var that minimizes CONFLICTS(var, v, current, csp)" in the text-book's pseudo-code gives the impression that there is (always) a unique value that minimizes the conflicts; it does not answer — or, for that matter, even raise — the question of how to break the tie when there are more than one values that minimize. As we will see, this unfortunate gloss over an important question is the root of all the trouble.

Next, there is this Python implementation from the text-book's web-site, aima.cs.berkeley.edu (see Figure 3):

#______________________________________________________________________________
# Min-conflicts hillclimbing search for CSPs

def min_conflicts(csp, max_steps=1000000):
    """Solve a CSP by stochastic hillclimbing on the number of conflicts."""
    # Generate a complete assignment for all vars (probably with conflicts)
    current = {}
    for var in csp.vars:
        val = min_conflicts_value(csp, var, current)
        csp.assign(var, val, current)
    # Now repeatedly choose a random conflicted variable and change it
    for i in range(max_steps):
        conflicted = csp.conflicted_vars(current)
        if not conflicted:
            return current
        var = random.choice(conflicted)
        val = min_conflicts_value(csp, var, current)
        csp.assign(var, val, current)
    return None

def min_conflicts_value(csp, var, current):
    """Return the value that will give var the least number of conflicts.
    If there is a tie, choose at random."""
    return argmin_random_tie(csp.domains[var],
                             lambda val: csp.nconflicts(var, val, current))
#______________________________________________________________________________

Figure 3: Python implementation of MIN-CONFLICTS (aima.cs.berkeley.edu)

And in min_conflicts_value, the comment — "If there is a tie, choose at random." — and its implementation, argmin_random_tie(csp.domains[var], lambda val: csp.nconflicts(var, val, current)), are a clear giveaway: the question of the missing tiebreaker is both raised and answered here, with the choice of the tiebreaker again left open to choice and reinterpretation through "random". Now we know what was missing in the text-book itself. Let us see how much the missing tiebreaker matters.
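The two tie-breaking policies are easiest to compare side by side. The following is our own minimal sketch: the name argmin_first is ours, and the signature of argmin_random_tie merely mimics the call above — the actual AIMA utility may differ in detail.

import random

def argmin_first(seq, key):
    """Deterministic: the earliest value attaining the minimum."""
    return min(seq, key=key)   # Python's min() keeps the first of any ties

def argmin_random_tie(seq, key):
    """Randomized: a uniform choice among all values attaining the minimum."""
    values = list(seq)
    best = min(map(key, values))
    return random.choice([x for x in values if key(x) == best])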
3.1 The Resolution

To answer this question we had to go back to the basics. First we describe the description-specification of the algorithm; then we look at the implementations — the text-book's pseudo-code, the Python version given in the accompanying material on the text-book's web-site, and our own, which we reviewed against both — and so we finally got to the root of the problem.

Algorithm 2 makes the choice explicit. Line 4 is where the difference between the text-book description version and the Python version given above surfaces, through the boolean variable RandomChoice: ChooseFirst deterministically takes the first row among those with the minimum number of conflicts, while ChooseRandom takes a random one among them.

Algorithm 2 MinConflicts Hill-Climbing for NQueens
Require: n, the number of queens; RandomChoice, a boolean choice
 1: Q ← RandomPermutation([1..n])
 2: while ¬Safe(Q) do
 3:     q ← ChooseRandom([1..n])
 4:     if RandomChoice then
 5:         Q.q ← ChooseRandom(FindMinimumConflictingRows(Q, q))
 6:     else
 7:         Q.q ← ChooseFirst(FindMinimumConflictingRows(Q, q))
 8:     end if
 9: end while

We implemented Algorithm 2 in C (see Appendix A) — though, again, doing it in Python might have been equally useful or rewarding — and test ran it for various n repeatedly, under the following conditions:

1. Compile the program with the flag "-D BADCHOICE", which compiles in the deterministic ChooseFirst tiebreaker;
2. Do the same without that flag during compilation, which compiles in the randomized ChooseRandom tiebreaker.

3.2 The Solution and its Validation

Validation. See the performance in Table 1 with the bad choice, i.e. without randomization: within the budget of 1000 iterations the search never finishes, and every run ends with unresolved conflicts.

Table 1: With BADCHOICE

 n    Step Size   Iterations   Restarts   Unresolved Conflicts
 99       1          1000          0               2
 88       1          1000          0               2
 56       1          1000          0               2
 62       1          1000          0               1
 73       1          1000          0               2
 31       1          1000          0               1
 96       1          1000          0               2
 22       1          1000          0               1
 13       1          1000          0               1
 33       1          1000          0               2

The performance with random choice is below, in Table 2: every instance is solved, well within the budget.

Table 2: Without BADCHOICE, i.e. with randomization

 n    Step Size   Iterations   Restarts   Unresolved Conflicts
 52       1            79          0               0
 85       1           117          0               0
 86       1           111          0               0
 68       1           140          0               0
 26       1           249          0               0
 80       1            85          0               0
 66       1           124          0               0
 60       1           155          0               0
 69       1            96          0               0
 25       1           100          0               0

We found that indeed the removal of randomization from the tiebreaker destroys the performance: the randomized tiebreaker in the web-site's Python code is essential, and not just a serendipitous choice — it is exactly what the text-book's pseudo-code glosses over. Do the experiment and you will see for yourselves.
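For readers who would rather not set up a C toolchain, the same experiment can be approximated with the Python sketch from Section 2.2; the driver below is our own illustration, with an arbitrary fixed seed for repeatability.

# Our own illustrative driver: run the Section 2.2 sketch with and
# without the randomized tiebreaker for several board sizes.
if __name__ == "__main__":
    import random
    random.seed(0)
    for n in (13, 25, 50, 75, 100):
        for random_choice in (False, True):
            result = min_conflicts_nqueens(n, random_choice=random_choice,
                                           max_steps=1000)
            print(n, "random-tie" if random_choice else "first-tie",
                  "solved" if result else "unresolved after 1000 steps")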

A Appendix: C Implementation of Algorithm 2

/*
 * Compile:  cc -o nqueens nqueens.c               (randomized tiebreaker)
 *           cc -D BADCHOICE -o nqueens nqueens.c  (deterministic tiebreaker)
 * Run:      ./nqueens [n] [seed] [maxit] [step]
 * Output:   n, step size, iterations used, restarts, unresolved conflicts
 */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Total number of attacking pairs; c[i] receives the number of conflicts
   involving the queen in column i (s[i] is that queen's row). */
int Eval(int n, int *s, int *c)
{
    int i, j, totc = 0;
    for (i = 0; i < n; i++) c[i] = 0;
    for (i = 0; i < n - 1; i++)
        for (j = i + 1; j < n; j++)
            if (s[i] == s[j] || abs(s[i] - s[j]) == abs(i - j)) {
                c[i]++; c[j]++; totc++;
            }
    return totc;
}

/* Min-conflicts hill-climbing: returns the number of iterations left
   when a solution is found, or 0 if the budget is exhausted. */
int HillClimbNQueens(int n, int *s, int MaxIterations, int *restarts, int step)
{
    int val, min, temp, i, j, qi, qj;
    int u[n];          /* rows tying for the minimum conflict count */
    int tempc[n][n];   /* conflict vector for each candidate row    */
    int *c;

    (void)step;        /* carried through for reporting only (Step Size) */
    *restarts = 0;
    for (i = 0; i < n; i++) s[i] = rand() % n;
    val = Eval(n, s, c = &(tempc[0][0]));

    while (MaxIterations-- > 0) {
        if (val == 0) return MaxIterations;       /* Safe(Q)              */
        do qi = rand() % n; while (c[qi] == 0);   /* a random conflicted queen */
        min = val; qj = -1;
        for (j = 0; j < n; j++) {                 /* try every row for qi */
            s[qi] = j;
            temp = Eval(n, s, &(tempc[j][0]));
            if (temp < min)       { min = temp; qj = 0; u[qj] = j; }
            else if (temp == min) { qj++;       u[qj] = j; }
        }
        if (qj >= 0) {
#ifdef BADCHOICE
            qj = 0;                               /* ChooseFirst          */
#else
            if (qj > 0) qj = rand() % (qj + 1);   /* ChooseRandom         */
#endif
            s[qi] = u[qj];
            val = min;
            c = &(tempc[u[qj]][0]);
        } else {
            /* No row attains the old minimum: restart from scratch.  The
               current row always ties, so this is never reached in
               practice -- cf. the Restarts column of Tables 1 and 2. */
            (*restarts)++;
            for (i = 0; i < n; i++) s[i] = rand() % n;
            val = Eval(n, s, c = &(tempc[0][0]));
        }
    }
    return 0;
}

int main(int argc, char *argv[])
{
    int n = 10, maxit = 1000, step = 1, seed, restarts, tempit;

    if (argc > 1) n = atoi(argv[1]);
    if (argc > 2) seed = atoi(argv[2]); else seed = time(0);
    if (argc > 3) maxit = atoi(argv[3]);
    if (argc > 4) step = atoi(argv[4]);
    srand(seed);

    int s[n], c[n];
    tempit = maxit - HillClimbNQueens(n, s, maxit, &restarts, step);
    if (tempit < maxit)       /* solved within the budget                 */
        printf("%d\t%d\t%d\t%d\t0\n", n, step, tempit, restarts);
    else                      /* budget exhausted: report the conflicts   */
        printf("%d\t%d\t%d\t%d\t%d\n", n, step, maxit, restarts, Eval(n, s, c));
    return 0;
}

References

G. H. Hardy, J. E. Littlewood, and G. Pólya. Inequalities. Cambridge University Press, 2nd edition, 1952.

Stuart Russell and Peter Norvig. Artificial Intelligence: A Modern Approach. Prentice Hall, 3rd edition, 2010.

S. Selvin. A problem in probability (in letters to the editor). The American Statistician, 29(3):131–134, 1975.