Workshop: Building with Open Data

Javier Arturo Rodríguez
javier.rodriguez@ascii164.com
scribd.com/javierrgz
@codehead

Jez Page
Thursday, September 15, 2011

A quick overview

Workflow
• Obtain
• Clean
• Visualize
• Refine

Where can you get open data?
• Public institutions
• Private initiatives
• Do it yourself

Public institutions
• http://aporta.es
• http://opendata.euskadi.net
• http://dadesobertes.gencat.cat
• http://w20.bcn.cat/opendata
• http://data.gov
• http://data.gov.uk

Private initiatives
• Google Public Data Explorer: http://www.google.com/publicdata/
• Amazon Web Services Public Data Sets: http://aws.amazon.com/datasets
• Socrata: http://opendata.socrata.com/

Do it yourself
https://github.com/codehead/OpenParlament

Thursday, September 15, 2011

7

Hágalo Usted Mismo
https://github.com/codehead/OpenParlament
    #!/usr/bin/perl
    use strict;
    use WWW::Mechanize;
    use Data::Dumper;
    use JSON::Any;
    use XML::Atom::SimpleFeed;
    use File::Slurp;
    use List::Util;
    use DateTime;
    use POSIX qw/strftime/;
    use Text::vCard::Addressbook;
    use pQuery;

    my $DIR  = "data";
    my $mech = WWW::Mechanize->new();
    # The Parlament site rejects the default libwww-perl user agent, so we pose as a browser.
    $mech->agent_alias('Windows IE 6');

    # Current UTC time as an Atom-style timestamp.
    sub now { strftime('%Y-%m-%dT%H:%M:%SZ', gmtime()) };
    my $now  = now();
    my $json = JSON::Any->new();

    if (!-d $DIR) {
        mkdir($DIR) || die("Error creating data dir $DIR: $!");
    }

    $mech->get(q{http://www.parlament.cat/web/composicio/ple-parlament/diputats-fotos?p_pant=CO});
    if (!$mech->response->is_success()) {
        die("Error getting parlament data");
    }
    my $html = $mech->content();

    # In list context the match returns (id, name, id, name, ...), which fills the hash.
    my %dip = ($html =~ m,<a .*?href="/web/composicio/diputats-fitxa\?p_codi=(\d+)".*?>(.*?)</a>,igs);
    write_file("$DIR/diputats.json", $json->encode(\%dip));

    my $feed = XML::Atom::SimpleFeed->new(
        title   => 'Parlament de Catalunya',
        link    => { rel => 'via', href => q{http://www.parlament.cat/web/composicio/ple-parlament/diputats-fotos?p_pant=CO} },
        updated => $now,
        author  => 'opendatabcn.org',
        id      => "tag:cat.parlament.diputat.list",
    );

    foreach my $id (sort { $a <=> $b } keys %dip) {
        $feed->add_entry(
            title => $dip{$id},
            link  => { rel => 'via', href => qq{http://www.parlament.cat/web/composicio/diputats-fitxa?p_codi=$id} },
            link  => { rel => 'related', href => qq{diputats/$id.vcard}, type => 'text/x-vcard', title => 'vCard' },
            link  => { rel => 'related', href => qq{diputats/$id.atom}, type => 'application/atom+xml', title => 'Atom' },
            # (the slide cuts the listing off here)
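
The hash assignment in the listing above relies on how Perl evaluates a global match in list context. A minimal offline sketch of that one step, using the script's own pattern against made-up sample markup (the names and ids below are hypothetical, not Parlament data):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample markup, shaped like the links the scraper looks for.
my $html = <<'HTML';
<ul>
  <li><a class="nom" href="/web/composicio/diputats-fitxa?p_codi=12">Example Deputy A</a></li>
  <li><a class="nom" href="/web/composicio/diputats-fitxa?p_codi=7">Example Deputy B</a></li>
</ul>
HTML

# In list context, a /g match returns every captured group in order:
# (id1, name1, id2, name2, ...). Assigning that list to a hash pairs
# each id with the name that follows it.
my %dip = ($html =~ m,<a .*?href="/web/composicio/diputats-fitxa\?p_codi=(\d+)".*?>(.*?)</a>,igs);

# Walk the pairs in numeric id order, as the scraper's foreach loop does.
foreach my $id (sort { $a <=> $b } keys %dip) {
    print "$id\t$dip{$id}\n";
}
```

Matching HTML with a regular expression works here because the target links are uniform; on messier pages an HTML parser is probably the safer choice.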

Do it yourself
• Google Spreadsheets: https://docs.google.com/
• ScraperWiki: https://scraperwiki.com/
• NeedleBase: http://needlebase.com

1. Extracting data from Wikipedia
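
One possible way to script this step, offered as a sketch and not as the workshop's own method: pull the rows out of an HTML table and write them as CSV, ready for upload to a visualization tool. The table below is an inline, hypothetical sample in the shape of a Wikipedia "wikitable"; the helper name clean_cell is also an assumption made for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample table; a real run would fetch the page HTML first.
my $html = <<'HTML';
<table class="wikitable">
<tr><th>City</th><th>Population</th></tr>
<tr><td><a href="/wiki/Example_A">Example A</a></td><td>1,500,000</td></tr>
<tr><td><a href="/wiki/Example_B">Example B</a></td><td>230,000</td></tr>
</table>
HTML

# Turn the contents of one table cell into a quoted CSV field.
sub clean_cell {
    my ($cell) = @_;
    $cell =~ s/<[^>]+>//g;             # drop nested tags such as links
    $cell =~ s/^\s+|\s+$//g;           # trim surrounding whitespace
    $cell =~ s/(?<=\d),(?=\d{3})//g;   # remove thousands separators
    $cell =~ s/"/""/g;                 # escape embedded quotes for CSV
    return qq{"$cell"};
}

my @csv;
# Each <tr>...</tr> is a row; each <th> or <td> inside it is a cell.
while ($html =~ m{<tr\b[^>]*>(.*?)</tr>}gis) {
    my $row   = $1;
    my @cells = ($row =~ m{<t[hd]\b[^>]*>(.*?)</t[hd]>}gis);
    push @csv, join(',', map { clean_cell($_) } @cells);
}

print "$_\n" for @csv;
```

Removing the thousands separators during extraction covers part of the "Clean" stage of the workflow, so the numeric column arrives as plain numbers.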

2. Visualizing with Fusion Tables

@!?
