Documentos de Académico
Documentos de Profesional
Documentos de Cultura
Big Data Template
Big Data Template
TEMPLATE OUTLINE
1 OVERALL PROJECT DESCRIPTION .............................................................................................................................. 2
2 BIG DATA CHARACTERISTICS .................................................................................................................................... 4
3 BIG DATA SCIENCE ................................................................................................................................................... 5
4 GENERAL SECURITY AND PRIVACY ........................................................................................................................... 7
5 CLASSIFY USE CASES WITH TAGS .............................................................................................................................. 9
6 OVERALL BIG DATA ISSUES .................................................................................................................................... 11
7 WORKFLOW PROCESSES ........................................................................................................................................ 12
8 DETAILED SECURITY AND PRIVACY ......................................................................................................................... 16
General Instructions:
Brief instructions are provided with each question requesting an answer in a text field. For the questions offering
check boxes, please check any that apply to the use case. .
No fields are required to be filled in. Please fill in the fields that you are comfortable answering. The fields that are
particularly important to the work of the NBD-PWG are marked with *.
Submit Form
BIG DATA USE CASE TEMPLATE 2 SECTION: Overall Project Description
Top
En la actualidad esta claro que la mayoría de las personas sufre de algún trastorno mental,
debido al ritmo al que viven la vida, ya sea por preocupaciones tanto en sus vidas familiares
como en el trabajo, es notorio el como al pasar los años han aumentado en gran cantidad los
suicidios en el mundo debido a distintos trastornos mentales ya sea estrés, depresión,
anorexia y bulimia, daños causados por el uso de drogas, preocupación, nervios, adicciones,
entre otros.
Es por esto que el trabajo realizado toma sentido en cuanto a mi vida cotidiana, basándome
en los distintos datos que se han encontrado solo en mi país y a través de distintos métodos
1.3 USE Cde
de análisis ASE CONTACTS
datos *
he decidido basarme en los trastornos mentales y en tipos de muerte
Add names, phone number, and email of key people
como ahorcamiento, estrangulamiento associated disparo
o sofocación, with this use
de case.
armaPlease designate who is
de fuego,
authorized to edit this use case.
envenenamiento, entre otras causas, para saber a que edad puedo sufrir un trastorno mental
yName
por ende sufrir algún tipo de suicidio.Email
Phone PI / Author Edit rights? Primary
Author
Author
1.5 APPLICATION *
Summarize the use case applications.
Conocer la posibilidad de que alguien muera debido a un trastorno mental sufrido por su ritmo
de vida, para tratar de evitarlo en lo mayormente posible.
Lograr evitar muertes en mi país, porque ninguna persona está exenta de sufrir algún
trastorno mental, así que la información, como los datos que se pueden obtener por la
variedad de suicidios que han ocurrido, nos pueden ayudar a encontrar un patrón y evitar
sufrir algún trastorno.
2 June 6, 2017
BIG DATA USE CASE TEMPLATE 2 SECTION: Overall Project Description
Top
3 June 6, 2017
BIG DATA USE CASE TEMPLATE 2 SECTION: Big Data Characteristics
Top
2.3 VOLUME
Size 5 GB
4 June 6, 2017
BIG DATA USE CASE TEMPLATE 2 SECTION: Big Data Science
Top
2.4 VELOCITY
Enter if real time or streaming data is important. Be quantitative: this number qualified by 3 fields below: units,
time period, proviso. Refers to the rate of flow at which the data is created, stored, analyzed, and visualized. For
example, big velocity means that a large quantity of data is being processed in a short amount of time.
Unit of measure Registros de muertes a causa de trastornos mentales por día
Proviso https://www.inegi.org.mx/temas/salud/
https://www.inegi.org.mx/temas/salud/default.html#Tabulados
https://www.inegi.org.mx/app/tabulados/pxweb/inicio.html?
Unit of Measure: Units of Velocity size given above. What is measured such as "New Tweets gathered per second", etc.?
Time Period: Time described and interval such as September 2015; items per minute
Proviso: The criterion (e.g., data gathered by a particular organization) used to get Velocity measure with units in time period
in three fields above
2.5 VARIETY
Variety refers to data from multiple repositories, domains, or types. Please indicate if the data is from multiple
datasets, mashups, etc.
Repositorios encontrados en la Web sobre noticias y datos de muertes por trastornos mentales, registros que se
hacen sobre el tema, blogs sobre causas de los trastornos mentales.
- .xlsx
- .csv
- json
2.6 VARIABILITY
Variability refers to changes in rate and nature of data gathered by use case. It captures a broader range of
changes than Velocity which is just change in size. Please describe the use case data variability.
Pasan de muertes por día a muertes por mes, o por año, rangos de edad, tipos de trastornos, causas del suicidio.
5 June 6, 2017
BIG DATA USE CASE TEMPLATE 2 SECTION: Big Data Science
Top
3.2 VISUALIZATION
Describe the way the data is viewed by an analyst making decisions based on the data. Typically visualization is
the final stage of a technical data analysis pipeline and follows the data analytics stage.
Esta representado en la gráfica con referencia de Gráfica 1 en el documento Proyecto Final.docx, para poder ver
los tipos de suicidios y en que frecuencia ocurren de un total de datos de muertes en 10 años.
Esto fue hecho con el programa Statgraphics.
3.4 METADATA
Please comment on quality and richness of metadata.
Porcentaje de depresión hombres
Porcentaje de depresión mujeres
Año
Entidad federativa
Trastorno mental sufrido
Rangos de edad
Causa del suicidio
Sexo
Hombres con nervios o preocupación
Mujeres con nervios o preocupación
3.5 CURATION AND GOVERNANCE
Note that we have a separate section for security and privacy. Comment on process to ensure good data quality
and who is responsible.
Es necesario verificar las bases de datos, esto se puede lograr con una herramienta Gestora de Bases de Datos
y con ella hacer DATA CLEANING.
Para obtener los datasets se solicita reconocimiento de la persona que descarga el dataset, y para abrirlos se
piden las credenciales de Google, Facebook u otras.
Los datos solo deben permitir el acceso al autor, y al profesor que revisará el trabajo.
El autor del trabajo sera el encargado de dar veracidad a los datos y evitar la divulgación del trabajo.
6 June 6, 2017
BIG DATA USE CASE TEMPLATE 2 SECTION: General Security and Privacy
Top
4.4 IS THERE AN EXPLICIT DATA GOVERNANCE PLAN OR FRAMEWORK FOR THE EFFORT?
Data governance refers to the overall management of the availability, usability, integrity, and security of the data
employed in an enterprise.
Explicit data governance plan
✔ No data governance plan, but could use one
Data governance does not appear to be necessary
Other:
7 June 6, 2017
BIG DATA USE CASE TEMPLATE 2 SECTION: General Security and Privacy
Top
4.5 DO YOU FORESEE ANY POTENTIAL RISKS FROM PUBLIC OR PRIVATE OPEN DATA PROJECTS?
Transparency and data sharing initiatives can release into public use datasets that can be used to undermine
privacy (and, indirectly, security.)
Risks are known.
✔ Currently no known risks, but it is conceivable.
Not sure
Unlikely that this will ever be an issue (e.g., no PII, human-agent related data or subsystems.)
Other:
4.7 UNDER WHAT CONDITIONS DO YOU GIVE PEOPLE ACCESS TO YOUR DATA?
En caso de que me la soliciten con la finalidad de hacer un análisis similar y para entrega de
mi proyecto en la escuela.
4.8 UNDER WHAT CONDITIONS DO YOU GIVE PEOPLE ACCESS TO YOUR SOFTWARE?
Para fin de entrega del proyecto de la escuela en caso de que sea solicitado, por cualquier
asunto legal en el que este involucrado o para ayudar en un análisis o investigación de un
tema similar.
8 June 6, 2017
BIG DATA USE CASE TEMPLATE 2 SECTION: Classify Use Cases with Tags
Top
9 June 6, 2017
BIG DATA USE CASE TEMPLATE 2 SECTION: Classify Use Cases with Tags
Top
10 June 6, 2017
BIG DATA USE CASE TEMPLATE 2 SECTION: Overall Big Data Issues
Top
11 June 6, 2017
BIG DATA USE CASE TEMPLATE 2 SECTION: Workflow Processes
Top
7 WORKFLOW PROCESSES
Please answer this question if the use case contains multiple steps where Big Data characteristics, recorded in
this template, vary across steps. If possible flesh out workflow in the separate set of questions. Only use this
section if your use case has multiple stages where Big Data issues differ significantly between stages.
1. Se comenzó con la pregunta planteada: ¿A que edad puedo sufrir algún trastorno mental
basándome en estadísticas de mi país con el ritmo de vida que llevo hasta el momento?
2. Al conocer lo que se desea saber, se busco que lenguaje seria más apropiado, en este caso python, para
poder representar gráficamente la información, a través de un algoritmo..
3. Se comenzó la codificación leyendo directamente un archivo tipo csv que contiene la información que se
desea analizar.
4. Con estos datos se demuestra a través de distintas gráficas los resultados obtenidos.
12 June 6, 2017
BIG DATA USE CASE TEMPLATE 2 SECTION: Workflow Processes
Top
Data Source(s)
Nature of Data
Software Used
Data Analytics
Infrastructure
Percentage of Use
Case Effort
Other Comments
Data Source(s)
Nature of Data
Software Used
Data Analytics
Infrastructure
Percentage of Use
Case Effort
Other Comments
13 June 6, 2017
BIG DATA USE CASE TEMPLATE 2 SECTION: Workflow Processes
Top
Data Source(s)
Nature of Data
Software Used
Data Analytics
Infrastructure
Percentage of Use
Case Effort
Other Comments
Data Source(s)
Nature of Data
Software Used
Data Analytics
Infrastructure
Percentage of Use
Case Effort
Other Comments
14 June 6, 2017
BIG DATA USE CASE TEMPLATE 2 SECTION: Workflow Processes
Top
Data Source(s)
Nature of Data
Software Used
Data Analytics
Infrastructure
Percentage of Use
Case Effort
Other Comments
15 June 6, 2017
BIG DATA USE CASE TEMPLATE 2 SECTION: Detailed Security and Privacy
Top
8.1 ROLES
Roles may be associated with multiple functions within a big data ecosystem.
Los roles de los que actué en mi proyecto fueron de analista e investigador de datos y como desarrollador del
algoritmo para obtener los resultados.
16 June 6, 2017
BIG DATA USE CASE TEMPLATE 2 SECTION: Detailed Security and Privacy
Top
8.1.3 Sponsors
Include disclosure requirements mandated by sponsors, funders, etc.
Obtenidos de la web dominio: https://www.inegi.org.mx/temas/salud/
8.1.6 Curation
List and describe roles associated with data quality and curation, independent of any specific Big Data
component. Example: Role responsible for identifying US government data as FOUO or Controlled Unclassified
Information, etc.
La curación de los datos se realizo por parte mía, eliminando información que fuera ruido, como datos nulos,
datos que en su contexto fueran cualitativas, modificación y creación de nuevos datasets.
8.1.7 Classified Data, Code or Protocols (Read only, question answered in Section 4.1)
Intellectual property protections
Military classifications, e.g., FOUO, or Controlled Classified
✔ Not applicable
Creative commons/ open source
Other:
17 June 6, 2017
BIG DATA USE CASE TEMPLATE 2 SECTION: Detailed Security and Privacy
Top
8.2.1 Does the System Maintain PII? * (Read only, question answered in Section 4.2)
Yes, PII is part of this Big Data system
No, and none can be inferred from 3rd party sources
✔ No, but it is possible that individuals could be identified via third party databases
Other:
18 June 6, 2017
BIG DATA USE CASE TEMPLATE 2 SECTION: Detailed Security and Privacy
Top
19 June 6, 2017
BIG DATA USE CASE TEMPLATE 2 SECTION: Detailed Security and Privacy
Top
No aplica.
20 June 6, 2017
BIG DATA USE CASE TEMPLATE 2 SECTION: Detailed Security and Privacy
Top
21 June 6, 2017
BIG DATA USE CASE TEMPLATE 2 SECTION: Detailed Security and Privacy
Top
22 June 6, 2017
BIG DATA USE CASE TEMPLATE 2 SECTION: Detailed Security and Privacy
Top
8.5.4 Is there an explicit data governance plan or framework for the effort?
Data governance refers to the overall management of the availability, usability, integrity, and security of the data
employed in an enterprise. (Read only, question answered in Section 4.4)
Explicit data governance plan
✔ No data governance plan, but could use one
Data governance does not appear to be necessary
Other:
8.5.6 Do you foresee any potential risks from public or private open data projects?
Transparency and data sharing initiatives can release into public use datasets that can be used to undermine
privacy (and, indirectly, security.) (Read only, question answered in Section 4.5)
Risks are known.
✔ Currently no known risks, but it is conceivable.
Not sure
Unlikely that this will ever be an issue (e.g., no PII, human-agent related data or subsystems.)
Other:
23 June 6, 2017
BIG DATA USE CASE TEMPLATE 2 SECTION: Detailed Security and Privacy
Top
8.6.2 If a metadata management system is present, what measures are in place to verify
and protect its integrity?
No aplica
24 June 6, 2017
BIG DATA USE CASE TEMPLATE 2 SECTION: Detailed Security and Privacy
Top
8.8.1 Current audit needs * (Read only, question answered in Section 4.6)
We have third party registrar or other audits, such as for ISO 9001
We have internal enterprise audit requirements
Audit is only for system health or other management requirements
✔ No audit, not needed or does not apply
Other:
25 June 6, 2017
BIG DATA USE CASE TEMPLATE 2 SECTION: Detailed Security and Privacy
Top
8.12.2 Paywall
A paywall is in use at some stage in the workflow
✔ Not applicable
26 June 6, 2017
BIG DATA USE CASE TEMPLATE 2 SECTION: Detailed Security and Privacy
Top
27 June 6, 2017