<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1941 entries, 0 to 1940
Data columns (total 34 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   V1      1941 non-null   int64
 1   V2      1941 non-null   int64
 2   V3      1941 non-null   int64
 3   V4      1941 non-null   int64
 4   V5      1941 non-null   int64
 5   V6      1941 non-null   int64
 6   V7      1941 non-null   int64
 7   V8      1941 non-null   int64
 8   V9      1941 non-null   int64
 9   V10     1941 non-null   int64
 10  V11     1941 non-null   int64
 11  V12     1941 non-null   int64
 12  V13     1941 non-null   int64
 13  V14     1941 non-null   int64
 14  V15     1941 non-null   float64
 15  V16     1941 non-null   float64
 16  V17     1941 non-null   float64
 17  V18     1941 non-null   float64
 18  V19     1941 non-null   float64
 19  V20     1941 non-null   float64
 20  V21     1941 non-null   float64
 21  V22     1941 non-null   float64
 22  V23     1941 non-null   float64
 23  V24     1941 non-null   float64
 24  V25     1941 non-null   float64
 25  V26     1941 non-null   float64
 26  V27     1941 non-null   float64
 27  V28     1941 non-null   int64
 28  V29     1941 non-null   int64
 29  V30     1941 non-null   int64
 30  V31     1941 non-null   int64
 31  V32     1941 non-null   int64
 32  V33     1941 non-null   int64
 33  Class   1941 non-null   int64
dtypes: float64(13), int64(21)
memory usage: 515.7 KB
In [50]: df_fall_placa_acero.head()
In [53]: n=len(df_fall_placa_acero)
In [54]: sturges = int(round(1 + m.log2(n), 0)); print(sturges)
12
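The bin count above follows Sturges' rule, k = 1 + log2(n). As a minimal sketch (the function name `sturges_bins` is hypothetical, and the standard `math` module is assumed to be what the session aliases as `m`), the same calculation can be wrapped in a small helper. Like the cell above it rounds to the nearest integer; other formulations of the rule round up instead.

```python
import math


def sturges_bins(n: int) -> int:
    """Bin count suggested by Sturges' rule, k = 1 + log2(n), rounded to nearest."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return int(round(1 + math.log2(n)))


# 1941 is the number of rows reported by df.info() in this notebook.
k = sturges_bins(1941)
print(k)
```

For n = 1941, log2(n) is about 10.92, so the rule gives 11.92, which rounds to the 12 bins used in the next step.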
1. Equal-width discretization of the variable "V9" (bar chart), using the number of bins given by Sturges' rule
In [55]: amplitud = KBinsDiscretizer(n_bins=sturges, encode="ordinal", strategy="uniform")
In [56]: df_fall_placa_acero["V9"] = amplitud.fit_transform(df_fall_placa_acero[["V9"]])
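To make the effect of `strategy="uniform"` concrete, here is a small self-contained sketch on made-up data. The frame `demo` and its columns `x` and `x_bin` are hypothetical stand-ins, not part of the steel-plate data. It writes the codes to a new column so the original values are kept, whereas the cell above overwrites "V9" in place.

```python
import pandas as pd
from sklearn.preprocessing import KBinsDiscretizer

# Hypothetical toy column; values chosen so the bin edges are easy to check by hand.
demo = pd.DataFrame({"x": [0.0, 1.0, 2.0, 5.0, 9.0, 10.0]})

# Equal-width binning: the range [min, max] is split into n_bins intervals of equal length.
equal_width = KBinsDiscretizer(n_bins=5, encode="ordinal", strategy="uniform")

# fit_transform returns a 2-D array of shape (n_samples, 1); ravel() flattens it to a column.
demo["x_bin"] = equal_width.fit_transform(demo[["x"]]).ravel()

# Edges learned for the single input feature.
edges = equal_width.bin_edges_[0]
print(edges)
print(demo)
```

With a range of 0 to 10 and five bins, each interval is 2 units wide. Interior intervals are closed on the left, so the value 2.0 falls in bin 1, and the maximum value is assigned to the last bin.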
In [61]: df_fall_placa_acero.groupby("V9").V9.count()
Out[61]:
V9
0.0      16
1.0     117
2.0     274
3.0     127
4.0     283
5.0     496
6.0     435
7.0     140
8.0      14
9.0      19
10.0     12
11.0      8
Name: V9, dtype: int64
In [60]: df_fall_placa_acero.groupby("V9").V9.count()/len(df_fall_placa_acero)*100
Out[60]:
V9
0.0      0.824317
1.0      6.027821
2.0     14.116435
3.0      6.543019
4.0     14.580113
5.0     25.553838
6.0     22.411128
7.0      7.212777
8.0      0.721278
9.0      0.978877
10.0     0.618238
11.0     0.412159
Name: V9, dtype: float64
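The two cells above compute absolute counts and percentages per bin with `groupby(...).count()`. An alternative sketch, on a hypothetical series `codes` rather than the real column, builds both in one table with `Series.value_counts`, which may be easier to read or to pass to a bar chart.

```python
import pandas as pd

# Hypothetical bin codes standing in for a discretized column.
codes = pd.Series([0.0, 1.0, 1.0, 2.0, 2.0, 2.0, 2.0, 3.0], name="bin")

# Absolute frequency per bin, ordered by bin code rather than by frequency.
counts = codes.value_counts().sort_index()

# Relative frequency per bin, expressed as a percentage.
percent = codes.value_counts(normalize=True).sort_index() * 100

summary = pd.DataFrame({"count": counts, "percent": percent.round(2)})
print(summary)
```

Calling `summary["count"].plot(kind="bar")` would then draw the bar chart mentioned in the section heading, assuming matplotlib is available.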
In [62]: df_fall_placa_acero.head()
In [65]: df_fall_placa_acero["V19"]=cuartil.fit_transform(df_fall_placa_acero[["V19"]])
In [66]: df_fall_placa_acero.groupby("V19").V19.count()
Out[66]:
V19
0.0    485
1.0    479
2.0    470
3.0    507
Name: V19, dtype: int64
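The cell that defines `cuartil` is not shown in this export. Judging by its name and the four roughly equal groups in the output, it is plausibly a `KBinsDiscretizer` with `strategy="quantile"` and four bins, but that is an assumption. The sketch below illustrates quantile binning on made-up data; `demo`, `x`, `x_bin`, and `quartile` are hypothetical names.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import KBinsDiscretizer

# Hypothetical toy column: the integers 1..100 as floats, with no repeated values.
demo = pd.DataFrame({"x": np.arange(1.0, 101.0)})

# Quantile binning: edges are placed at the quartiles, so each bin
# receives approximately the same number of observations.
quartile = KBinsDiscretizer(n_bins=4, encode="ordinal", strategy="quantile")
demo["x_bin"] = quartile.fit_transform(demo[["x"]]).ravel()

counts = demo["x_bin"].value_counts().sort_index()
print(counts)
```

On this toy column every bin receives exactly 25 values. In the V19 output above the groups are only approximately equal (470 to 507), which is typical when several observations share a value at a bin edge, since tied values all land in the same bin.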
In [67]: df_fall_placa_acero.head(n=3)
In [70]: df_fall_placa_acero["V15"]=kmeans.fit_transform(df_fall_placa_acero[["V15"]])
In [71]: df_fall_placa_acero.groupby("V15").V15.count()
Out[71]:
V15
0.0    867
1.0    413
2.0    326
3.0    335
Name: V15, dtype: int64
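The definition of `kmeans` is likewise missing from this export. Given the name and the four codes in the output, it is probably a `KBinsDiscretizer` with `strategy="kmeans"` and `n_bins=4`, though this is an assumption. A minimal sketch on made-up, well-separated data (`demo`, `x`, `x_bin`, and `kmeans_bins` are hypothetical names) shows how this strategy groups values by proximity rather than by equal width or equal frequency.

```python
import pandas as pd
from sklearn.preprocessing import KBinsDiscretizer

# Hypothetical toy column with four clearly separated groups of values.
demo = pd.DataFrame(
    {"x": [1.0, 1.1, 1.2, 4.0, 4.1, 4.2, 7.0, 7.1, 7.2, 10.0, 10.1, 10.2]}
)

# k-means binning: a 1-D k-means is run on the column, and the bin edges
# are placed midway between neighbouring cluster centres.
kmeans_bins = KBinsDiscretizer(n_bins=4, encode="ordinal", strategy="kmeans")
demo["x_bin"] = kmeans_bins.fit_transform(demo[["x"]]).ravel()

print(kmeans_bins.bin_edges_[0])
print(demo["x_bin"].value_counts().sort_index())
```

Because the edges follow the cluster centres, bin sizes can be quite uneven on real data, as in the V15 output above, where the first bin holds 867 of the 1941 rows.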
In [46]: df_fall_placa_acero.head(n=3)