On a more sophisticated neural network architecture, this method created the world's best Backgammon computer, TD-Gammon.
main part of a game of Backgammon and the final racing stage, when pieces no longer have to pass the opponent's pieces. This racing stage is less interesting than the main part, because there is an algorithm that exactly solves the end game.
Co-evolution here optimizes only the first function; the final racing part of the game uses Pubeval's racing weights.
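This split can be sketched as a two-phase evaluator. The structure below is a minimal illustration, not the paper's implementation: the contact-phase weights are the co-evolved parameters, while the racing phase falls back to a fixed weight vector standing in for Pubeval's racing weights (the placeholder values here are invented).

```python
# Hypothetical sketch of a two-phase Backgammon evaluator: only the
# contact-phase weights are co-evolved; once the game reaches the race,
# a fixed weight vector (standing in for Pubeval's racing weights) is used.

RACING_WEIGHTS = [0.1] * 122  # placeholder values, not Pubeval's actual weights

def evaluate(board_features, contact_weights, is_race):
    """Linear evaluation: evolved weights in contact, fixed weights in the race."""
    weights = RACING_WEIGHTS if is_race else contact_weights
    return sum(w * f for w, f in zip(weights, board_features))
```

The point of the split is that learning effort is spent only where it matters: the racing stage is already solved well enough that its weights can simply be fixed.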
The question facing this paper is: for a given way to represent a solution, how can co-evolutionary learning obtain the highest ability from the least CPU time?
Sampling more games would more accurately discern the differences in ability among the members of the population.
that the extra precision in those evaluations does indeed have an effect, but a negative one: more games reduce the behavioral diversity, which in turn requires still more evaluations to discern the smaller differences between players.
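Why smaller differences cost so many more games follows from basic sampling statistics. The calculation below is illustrative (the formula and parameters are standard, not taken from the paper): a win-rate estimate from n games has standard error sqrt(p(1-p)/n), so the games needed to resolve a gap of delta grow quadratically as the gap shrinks.

```python
import math

def games_needed(delta, p=0.5, z=2.0):
    """Approximate games per player to resolve a win-rate gap `delta`.

    p is the baseline win probability, z the number of standard errors
    of separation demanded. Cost grows as 1/delta**2.
    """
    sigma = math.sqrt(p * (1 - p))       # per-game standard deviation
    return math.ceil((z * sigma / delta) ** 2)
```

Halving the gap between players quadruples the number of games needed to tell them apart, which is exactly the trap described above: reduced diversity shrinks the gaps, and the evaluation bill grows quadratically.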
Corollary: Small Population, More Games is Worse
Since we are interested in achieving a representation's peak ability from the least CPU time, it is reasonable to ask what happens when the population size is barely large enough, instead of generously large. Smaller populations use less CPU time. But a smaller population has more trouble maintaining diversity, and Figures 3 and 4 show that
Co-Evolution, What Is It Good For?
Conclusion
Use a generously large population: more noise requires a larger population. If you skimp on population size, more evaluations can be worse for learning, not better, because of their tendency to reduce diversity. Use just enough evaluations that adding more does not improve learning. This threshold depends on the task and your implementation; here, each individual needs to take part in about 1600 games.