
A Little Graph Theory for the Busy Developer


Dr. Jim Webber, Chief Scientist, Neo Technology
@jimwebber

Roadmap
Imprisoned data
Graph models
Graph theory
Local properties, global behaviours
Predictive techniques
Graph matching
Predictive, real-time analytics for fun and profit
Fin

http://www.flickr.com/photos/crazyneighborlady/355232758/

http://gallery.nen.gov.uk/image82582-.html

http://www.xtranormal.com/watch/6995033/mongo-db-is-web-scale

Aggregate-Oriented Data
http://martinfowler.com/bliki/AggregateOrientedDatabase.html

There is a significant downside - the whole approach works really well when data access is aligned with the aggregates, but what if you want to look at the data in a different way? Order entry naturally stores orders as aggregates, but analyzing product sales cuts across the aggregate structure. The advantage of not using an aggregate structure in the database is that it allows you to slice and dice your data different ways for different audiences. This is why aggregate-oriented stores talk so much about map-reduce.

complexity = f(size, connectedness, uniformity)

http://www.bbc.co.uk/london/travel/downloads/tube_map.html

Property graphs
Property graph model:
Nodes with properties
Named, directed relationships with properties
Relationships have exactly one start and end node
Which may be the same node
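The model above can be sketched as a small in-memory data structure. This is only an illustration in Python, not how Neo4j stores data. The class names, the `relate` and `outgoing` helpers, the direction of `LOVES`, and the self-referencing `KNOWS` relationship are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Node:
    # Arbitrary key/value properties attached to the node.
    properties: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Relationship:
    name: str    # relationships are named, e.g. "LOVES"
    start: Node  # exactly one start node
    end: Node    # exactly one end node (may be the same node as start)
    properties: Dict[str, Any] = field(default_factory=dict)


@dataclass
class PropertyGraph:
    nodes: List[Node] = field(default_factory=list)
    relationships: List[Relationship] = field(default_factory=list)

    def add_node(self, **properties: Any) -> Node:
        node = Node(dict(properties))
        self.nodes.append(node)
        return node

    def relate(self, start: Node, name: str, end: Node, **properties: Any) -> Relationship:
        rel = Relationship(name, start, end, dict(properties))
        self.relationships.append(rel)
        return rel

    def outgoing(self, node: Node, name: str) -> List[Node]:
        # Follow relationships of the given name in their stored direction.
        return [r.end for r in self.relationships if r.start is node and r.name == name]


graph = PropertyGraph()
rose = graph.add_node(first_name="Rose", last_name="Tyler")
doctor = graph.add_node(name="the Doctor", age=907, species="Time Lord")
graph.relate(rose, "LOVES", doctor)           # direction assumed for illustration
loop = graph.relate(doctor, "KNOWS", doctor)  # hypothetical: start and end are the same node
```

Direction is stored once, on the relationship, so `graph.outgoing(rose, "LOVES")` finds the Doctor while `graph.outgoing(doctor, "LOVES")` finds nothing.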

Property Graph Model
Relationships: TRAVELS_WITH, LOVES, TRAVELS_WITH
Node: first name: Rose, last name: Tyler
Node: name: the Doctor, age: 907, species: Time Lord
Node: vehicle: tardis, model: Type 40

Labeled Property Graph Model (Neo4j 2.0)
Relationships: TRAVELS_WITH, LOVES, TRAVELS_WITH
Node (Label: human): first name: Rose, last name: Tyler
Node (Label: timelord): name: the Doctor, age: 907, species: Time Lord
Node: vehicle: tardis, model: Type 40

Property graphs are very whiteboard-friendly

http://blogs.adobe.com/digitalmarketing/analytics/predictive-analytics/predictive-analytics-and-the-digital-marketer/

Meet Leonhard Euler


Swiss mathematician
Inventor of Graph Theory (1736)

http://en.wikipedia.org/wiki/File:Leonhard_Euler_2.jpg

http://en.wikipedia.org/wiki/Seven_Bridges_of_Königsberg

Triadic Closure
Nodes: name: Kyle, name: Stan, name: Kenny
Relationship added when the triangle closes: FRIEND

Structural Balance
Nodes: name: Cartman, name: Craig, name: Tweek. Closing relationship: FRIEND
Nodes: name: Cartman, name: Craig, name: Tweek. Closing relationship: ENEMY
Nodes: name: Kyle, name: Stan, name: Kenny. Closing relationship: FRIEND

Structural Balance is a key predictive technique
And it's domain-agnostic
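As a rough sketch of why structural balance is predictive: in the usual formulation a triangle is balanced when the product of its relationship signs is positive, meaning three mutual friends, or two friends with a common enemy. The Python below is illustrative only. The function names are made up for the example, and the FRIEND/ENEMY assignments between the characters are assumptions, since the slides do not spell them out.

```python
from itertools import combinations

FRIEND, ENEMY = 1, -1


def sign(edges, a, b):
    # Relationships are treated as undirected, so look them up by unordered pair.
    return edges[frozenset((a, b))]


def is_balanced(edges, a, b, c):
    # Balanced means an even number of ENEMY edges, i.e. a positive product of signs.
    return sign(edges, a, b) * sign(edges, b, c) * sign(edges, a, c) > 0


def closing_sign(edges, a, b, c):
    # Given a-b and a-c, predict the b-c relationship that keeps the triangle balanced.
    return sign(edges, a, b) * sign(edges, a, c)


def unbalanced_triangles(edges):
    nodes = sorted({n for pair in edges for n in pair})
    return [
        t for t in combinations(nodes, 3)
        if all(frozenset(p) in edges for p in combinations(t, 2))
        and not is_balanced(edges, *t)
    ]


# Hypothetical signs: Cartman is an enemy of both Craig and Tweek.
edges = {
    frozenset(("Cartman", "Craig")): ENEMY,
    frozenset(("Cartman", "Tweek")): ENEMY,
}
predicted = closing_sign(edges, "Cartman", "Craig", "Tweek")  # FRIEND

edges[frozenset(("Craig", "Tweek"))] = ENEMY  # three mutual enemies
unstable = unbalanced_triangles(edges)
```

Nothing in the check depends on what the nodes represent, which is why the same test applies to people, countries or anything else with positive and negative ties.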

Allies and Enemies
UK, Austria, France, Germany, Russia, Italy

Predicting WWI
[Easley and Kleinberg]

Strong Triadic Closure


If a node has strong relationships to two neighbours, then these neighbours must have at least a weak relationship between them. [Wikipedia]
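The property can be checked mechanically. The sketch below, in Python, reports every node that has strong relationships to two neighbours with no relationship at all between them. The function name and the choice of Stan as the shared friend are assumptions for illustration, and self-relationships are assumed not to occur.

```python
from itertools import combinations


def stc_violations(strong, weak):
    """Return (node, n1, n2) triples that violate Strong Triadic Closure.

    strong and weak are lists of undirected (a, b) pairs with a != b.
    """
    strong_edges = {frozenset(e) for e in strong}
    all_edges = strong_edges | {frozenset(e) for e in weak}

    # Only strong ties count towards a node's neighbourhood here.
    strong_neighbours = {}
    for edge in strong_edges:
        a, b = tuple(edge)
        strong_neighbours.setdefault(a, set()).add(b)
        strong_neighbours.setdefault(b, set()).add(a)

    violations = []
    for node in sorted(strong_neighbours):
        for n1, n2 in combinations(sorted(strong_neighbours[node]), 2):
            # A strong or a weak tie between n1 and n2 satisfies the property.
            if frozenset((n1, n2)) not in all_edges:
                violations.append((node, n1, n2))
    return violations


# Hypothetical data: Stan has strong ties to both Kenny and Cartman.
strong = [("Stan", "Kenny"), ("Stan", "Cartman")]
missing = stc_violations(strong, weak=[])
closed = stc_violations(strong, weak=[("Kenny", "Cartman")])
```

Each reported triple is a candidate prediction: the property suggests that at least a weak relationship between `n1` and `n2` is likely to form.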

Triadic Closure (weak relationship)
Nodes: name: Kenny, name: Stan, name: Cartman
Relationship added when the triangle closes: FRIEND 50%

Weak relationships
Relationships can have strength as well as intent
Think: weighting on a relationship in a property graph

Weak links play another super-important structural role in graph theory
They bridge neighbourhoods

Local Bridges
Nodes: name: Cartman, name: Kenny, name: Kyle, name: Stan, name: Sally, name: Bebe, name: Wendy
Relationships: FRIEND (several), ENEMY, FRIEND 50%

Local Bridge Property


If a node A in a network satisfies the Strong Triadic Closure Property and is involved in at least two strong relationships, then any local bridge it is involved in must be a weak relationship. [Easley and Kleinberg]
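For context, Easley and Kleinberg define a local bridge as a relationship whose two endpoints have no neighbours in common. A minimal Python sketch of that test follows. The function name and the edge list are hypothetical and do not reproduce the relationships drawn on the slide.

```python
def local_bridges(edges):
    """Return the edges whose endpoints share no common neighbour."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    # An empty intersection means removing the edge leaves its endpoints
    # more than two hops apart (or disconnected).
    return [(a, b) for a, b in edges if not (adjacency[a] & adjacency[b])]


# Hypothetical network: two friendship triangles joined by a single tie.
edges = [
    ("Kyle", "Stan"), ("Kyle", "Kenny"), ("Stan", "Kenny"),
    ("Wendy", "Bebe"), ("Wendy", "Sally"), ("Bebe", "Sally"),
    ("Stan", "Wendy"),
]
bridges = local_bridges(edges)
```

In this example only the Stan to Wendy tie qualifies, and by the property above it would be expected to be a weak relationship.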

University Karate Club

Graph Partitioning
(NP) Hard problem
Recursively remove the spanning links between dense regions
Or recursively merge nodes into ever larger subgraph nodes
Choose your algorithm carefully: some are better than others for a given domain

Can use to (almost exactly) predict the break up of the karate club!
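One divisive approach in the spirit of "remove the spanning links" is to repeatedly delete the edge with the highest shortest-path betweenness until the graph falls apart, in the style of Girvan-Newman. The Python below is a small sketch of a single split on an unweighted graph, not necessarily the method behind the karate club prediction. The function names and the toy graph are assumptions for illustration, and self-relationships are assumed not to occur.

```python
from collections import deque


def components(adj):
    seen, parts = set(), []
    for start in adj:
        if start in seen:
            continue
        part, queue = {start}, deque([start])
        while queue:
            node = queue.popleft()
            for nxt in adj[node]:
                if nxt not in part:
                    part.add(nxt)
                    queue.append(nxt)
        seen |= part
        parts.append(part)
    return parts


def edge_betweenness(adj):
    # Brandes-style accumulation. Every pair is counted from both ends, so
    # scores are doubled, which does not change which edge ranks highest.
    score = {}
    for source in adj:
        dist, paths = {source: 0}, {source: 1}
        preds = {node: [] for node in adj}
        order, queue = [], deque([source])
        while queue:
            node = queue.popleft()
            order.append(node)
            for nxt in adj[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    paths[nxt] = 0
                    queue.append(nxt)
                if dist[nxt] == dist[node] + 1:
                    paths[nxt] += paths[node]
                    preds[nxt].append(node)
        credit = {node: 0.0 for node in order}
        for node in reversed(order):
            for pred in preds[node]:
                share = paths[pred] / paths[node] * (1 + credit[node])
                edge = frozenset((pred, node))
                score[edge] = score.get(edge, 0.0) + share
                credit[pred] += share
    return score


def split_once(edges):
    """Remove highest-betweenness edges until the component count grows."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    start = len(components(adj))
    while len(components(adj)) == start:
        score = edge_betweenness(adj)
        if not score:  # no edges left to remove
            break
        a, b = tuple(max(score, key=score.get))
        adj[a].discard(b)
        adj[b].discard(a)
    return components(adj)


# Toy graph: two dense triangles joined by one spanning link (3, 4).
toy_edges = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 5), (4, 6), (5, 6)]
parts = split_once(toy_edges)
```

Calling `split_once` again on each resulting part gives the recursive version. Recomputing betweenness after every removal is what makes this approach expensive on large graphs.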

University Karate Clubs


(predicted by Graph Theory)

University Karate Clubs


(what actually happened!)

Cypher
Declarative graph pattern matching language
SQL for graphs
Columnar results
Supports graph matching commands and queries
Find me stuff like this
Aggregation, ordering and limit, etc.

Categories: baby, alcoholic drinks, consumer electronics, nappies, beer, console
Products:
SKU: 49d102bc, Product: Baby Dry Nights
SKU: 2555f258, Product: Peewee Pilsner
SKU: 49d102bc, Product: XBox 360
SKU: 5e175641, Product: Badgers Nadgers Ale
Customer: Firstname: Mickey, Surname: Smith, DoB: 19781006
Relationships: MEMBER_OF (product to category), BOUGHT (customer to product)

Pattern: Category: nappies, Category: beer, Category: game console; relationships MEMBER_OF and BOUGHT; customer with Firstname: *, Surname: *, DoB: 1996 > x > 1972
Pattern: the same categories and customer, with !BOUGHT

(nappies)

(beer)

(console) () ()

()<-[b:BOUGHT]- (daddy)

Flatten the graph

(daddy)-[:BOUGHT]->()-[:MEMBER_OF]->(nappies)
(daddy)-[:BOUGHT]->()-[:MEMBER_OF]->(beer)
(daddy)-[b:BOUGHT]->()-[:MEMBER_OF]->(console)

Wrap in a Cypher MATCH clause

MATCH (daddy)-[:BOUGHT]->()-[:MEMBER_OF]->(nappies),
      (daddy)-[:BOUGHT]->()-[:MEMBER_OF]->(beer),
      (daddy)-[b:BOUGHT]->()-[:MEMBER_OF]->(console)

Cypher WHERE clause

MATCH (daddy)-[:BOUGHT]->()-[:MEMBER_OF]->(nappies),
      (daddy)-[:BOUGHT]->()-[:MEMBER_OF]->(beer),
      (daddy)-[b:BOUGHT]->()-[:MEMBER_OF]->(console)
WHERE b is null

Full Cypher query

START beer=node:categories(category='beer'),
      nappies=node:categories(category='nappies'),
      xbox=node:products(product='xbox 360')
MATCH (daddy)-[:BOUGHT]->()-[:MEMBER_OF]->(beer),
      (daddy)-[:BOUGHT]->()-[:MEMBER_OF]->(nappies),
      (daddy)-[b?:BOUGHT]->(xbox)
WHERE b is null
RETURN distinct daddy

Results
==> +---------------------------------------------+
==> | daddy                                       |
==> +---------------------------------------------+
==> | Node[15]{name:"Rory Williams",dob:19880121} |
==> +---------------------------------------------+
==> 1 row
==> 6 ms
==> neo4j-sh (0)$
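To make the shape of the query concrete, the same question (who bought something in the beer category and something in the nappies category, but not the XBox 360?) can be sketched over plain Python collections. The purchase data and the function name below are invented for illustration. The slides name the customers and products but do not say who bought what.

```python
# Hypothetical purchases, keyed by customer name.
purchases = {
    "Rory Williams": {"Baby Dry Nights", "Peewee Pilsner"},
    "Mickey Smith": {"Baby Dry Nights", "Badgers Nadgers Ale", "XBox 360"},
}

# Product to category, standing in for the MEMBER_OF relationships.
categories = {
    "Baby Dry Nights": "nappies",
    "Peewee Pilsner": "beer",
    "Badgers Nadgers Ale": "beer",
    "XBox 360": "console",
}


def daddies_without_console(purchases, categories, console="XBox 360"):
    matches = []
    for customer, products in purchases.items():
        bought_categories = {categories[p] for p in products}
        # Both required patterns must match, and the optional BOUGHT
        # relationship to the console must be absent ("b is null").
        if {"nappies", "beer"} <= bought_categories and console not in products:
            matches.append(customer)
    return sorted(matches)


result = daddies_without_console(purchases, categories)
```

The declarative Cypher version states the same pattern without the explicit loop and leaves the traversal to the database.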

What are graphs good for?
Recommendations
Business intelligence
Social computing
Geospatial
MDM
Data centre management
Web of things
Genealogy
Time series data
Product catalogue
Web analytics
Scientific computing (especially bioinformatics)
Indexing your slow RDBMS
And much more!

Graph Databases
Ian Robinson, Jim Webber & Emil Eifrem
Compliments of Neo Technology

Free O'Reilly eBook!
Visit: http://GraphDatabases.com

Thanks for listening
Neo4j: http://neo4j.org
Me: @jimwebber
