Overview

Dataset statistics

Number of variables: 10
Number of observations: 2887
Missing cells: 5
Missing cells (%): < 0.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 228.5 KiB
Average record size in memory: 81.0 B

Variable types

Numeric: 1
Text: 3
Categorical: 5
DateTime: 1

Dataset

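Description: Provides the current status of facilities in Gwangju Metropolitan City (광주광역시) that are managed under the Facility Safety Act (시설물안전법).
Author: 광주광역시 (Gwangju Metropolitan City)
URL: https://www.data.go.kr/data/15002180/fileData.do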

Alerts

연번 is highly overall correlated with 종별 and 1 other field (High correlation)
관리기관 is highly overall correlated with 구분 (High correlation)
시설물종류 is highly overall correlated with 종별 and 1 other field (High correlation)
종별 is highly overall correlated with 연번 and 2 other fields (High correlation)
구분 is highly overall correlated with 연번 and 3 other fields (High correlation)
시설물종류 is highly imbalanced (61.6%) (Imbalance)
종별 is highly imbalanced (60.0%) (Imbalance)
구분 is highly imbalanced (53.6%) (Imbalance)
등급 is highly imbalanced (54.4%) (Imbalance)
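
The imbalance percentages in these alerts appear consistent with a normalized-entropy score, 1 - H / log2(k), where H is the Shannon entropy of the class frequencies and k is the number of classes. The sketch below is an assumption about how the figure is derived, not a quotation of the profiler's code, and the function name `imbalance_score` is hypothetical. It uses the category counts listed in this report for 구분 and 종별, counting `<NA>` as a category as the report does.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k): 0 for perfectly balanced classes, near 1 when one class dominates."""
    k = len(counts)
    if k < 2:
        # Assumption: the score is taken as 0 when there is only one class.
        return 0.0
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)

gubun_counts = [2296, 590, 1]          # 구분: 민간, 공공, <NA>
jongbyeol_counts = [2416, 295, 175, 1]  # 종별: 2종, 3종, 1종, <NA>

print(f"구분: {imbalance_score(gubun_counts):.1%}")      # 53.6%, as in the alert
print(f"종별: {imbalance_score(jongbyeol_counts):.1%}")  # 60.0%, as in the alert
```

The scores for 시설물종류 and 등급 cannot be checked this way from the report alone, because their full category counts are not listed.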

Reproduction

Analysis started: 2023-12-12 23:11:06.502701
Analysis finished: 2023-12-12 23:11:08.419300
Duration: 1.92 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
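
As a quick consistency check, the reported duration can be recomputed from the two timestamps. This is only a sketch using the Python standard library; the format string is an assumption based on how the timestamps are printed here.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"  # assumed layout of the timestamps shown above

started = datetime.strptime("2023-12-12 23:11:06.502701", FMT)
finished = datetime.strptime("2023-12-12 23:11:08.419300", FMT)

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")  # 1.92 seconds
```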

Variables

연번 (serial number)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 2886
Distinct (%): 100.0%
Missing: 1
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 1443.5
Minimum: 1
Maximum: 2886
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 25.5 KiB

Quantile statistics

Minimum: 1
5-th percentile: 145.25
Q1: 722.25
Median: 1443.5
Q3: 2164.75
95-th percentile: 2741.75
Maximum: 2886
Range: 2885
Interquartile range (IQR): 1442.5
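
These quantiles are what linear interpolation between closest ranks gives for the integers 1 to 2886. The sketch below assumes that 연번 holds exactly those values, which is suggested by the minimum, maximum, distinct count and sum but is not stated outright; the helper `quantile` is hypothetical and written with the standard library only.

```python
def quantile(sorted_values, q):
    """Quantile by linear interpolation, using position = q * (n - 1)."""
    position = q * (len(sorted_values) - 1)
    lower = int(position)
    if lower + 1 >= len(sorted_values):
        return float(sorted_values[-1])
    fraction = position - lower
    step = sorted_values[lower + 1] - sorted_values[lower]
    return sorted_values[lower] + fraction * step

values = list(range(1, 2887))  # assumption: 연번 is exactly 1..2886

for label, q in [("5%", 0.05), ("Q1", 0.25), ("median", 0.5), ("Q3", 0.75), ("95%", 0.95)]:
    print(label, round(quantile(values, q), 2))

iqr = quantile(values, 0.75) - quantile(values, 0.25)
print("IQR", round(iqr, 2))
```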

Descriptive statistics

Standard deviation: 833.26076
Coefficient of variation (CV): 0.57725027
Kurtosis: -1.2
Mean: 1443.5
Median Absolute Deviation (MAD): 721.5
Skewness: 0
Sum: 4165941
Variance: 694323.5
Monotonicity: Strictly increasing
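
Several of these figures can be reproduced by hand. The sketch below again assumes that 연번 is the sequence 1 to 2886 and that the variance is the sample variance (divisor n - 1); both are assumptions that happen to match the reported numbers, not statements about the profiler's internals.

```python
import math
import statistics

values = list(range(1, 2887))  # assumption: 연번 is exactly 1..2886
n = len(values)

total = sum(values)
mean = total / n
# Sample variance (divisor n - 1), assumed to be what the report shows.
variance = sum((v - mean) ** 2 for v in values) / (n - 1)
std = math.sqrt(variance)
cv = std / mean  # coefficient of variation = standard deviation / mean

median = statistics.median(values)
# Median absolute deviation: median of |x - median(x)|.
mad = statistics.median(abs(v - median) for v in values)

print(total, mean, variance, round(std, 5), round(cv, 8), mad)
```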
Histogram with fixed size bins (bins=50)
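
Fixed-size binning divides the range between the minimum and maximum into equal-width intervals and counts the values in each. A minimal sketch follows, assuming 50 bins and the values 1 to 2886; the function `fixed_width_histogram` is hypothetical and is not the plotting code used for the report.

```python
def fixed_width_histogram(values, bins):
    """Count values in `bins` equal-width intervals spanning [min, max]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # The maximum value belongs to the last bin (right edge inclusive).
        index = min(int((v - lo) / width), bins - 1)
        counts[index] += 1
    return counts, width

values = list(range(1, 2887))  # assumption: 연번 is exactly 1..2886
counts, width = fixed_width_histogram(values, bins=50)
print(len(counts), round(width, 1), min(counts), max(counts))
```

Because the values are evenly spaced, every bin ends up with nearly the same count, which is why the histogram of a serial-number column looks flat.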
Common values

Value | Count | Frequency (%)
1 | 1 | < 0.1%
1919 | 1 | < 0.1%
1921 | 1 | < 0.1%
1922 | 1 | < 0.1%
1923 | 1 | < 0.1%
1924 | 1 | < 0.1%
1925 | 1 | < 0.1%
1926 | 1 | < 0.1%
1927 | 1 | < 0.1%
1928 | 1 | < 0.1%
Other values (2876) | 2876 | 99.6%

Minimum 10 values

Value | Count | Frequency (%)
1 | 1 | < 0.1%
2 | 1 | < 0.1%
3 | 1 | < 0.1%
4 | 1 | < 0.1%
5 | 1 | < 0.1%
6 | 1 | < 0.1%
7 | 1 | < 0.1%
8 | 1 | < 0.1%
9 | 1 | < 0.1%
10 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
2886 | 1 | < 0.1%
2885 | 1 | < 0.1%
2884 | 1 | < 0.1%
2883 | 1 | < 0.1%
2882 | 1 | < 0.1%
2881 | 1 | < 0.1%
2880 | 1 | < 0.1%
2879 | 1 | < 0.1%
2878 | 1 | < 0.1%
2877 | 1 | < 0.1%

Distinct: 2878
Distinct (%): 99.7%
Missing: 1
Missing (%): < 0.1%
Memory size: 22.7 KiB

Length

Max length: 39
Median length: 26
Mean length: 12.750173
Min length: 2

Characters and Unicode

Total characters: 36797
Distinct characters: 467
Distinct categories: 12
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
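
To illustrate what these character properties are, the sketch below counts Unicode general categories with Python's standard `unicodedata` module, using one of the sample values from this report. The report's labels map to the two-letter codes as follows: Other Letter is `Lo`, Uppercase Letter is `Lu`, Dash Punctuation is `Pd`, Space Separator is `Zs`. The helper name `category_counts` is hypothetical.

```python
import unicodedata
from collections import Counter

def category_counts(text):
    """Count characters in `text` by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)

sample = "M-PLUS 광주"  # a sample value shown in this report
counts = category_counts(sample)
print(counts)
# Five uppercase letters (Lu), two Hangul syllables (Lo),
# one hyphen (Pd) and one space (Zs).
```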

Unique

Unique: 2871
Unique (%): 99.5%
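
The report distinguishes Distinct from Unique. As far as can be inferred from the numbers, Distinct is the number of different values and Unique is the number of values that occur exactly once, so Unique can never exceed Distinct. The sketch below shows that reading on a small made-up list; the data and the function name `distinct_and_unique` are hypothetical.

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique

toy = ["101동", "101동", "A동", "1공장"]  # made-up example, not the real column
print(distinct_and_unique(toy))  # (3, 2)
```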

Sample

1st row: (주)위니아대우매뉴팩처링 세탁기, 냉장고 동 건축물
2nd row: 1공장
3rd row: 2공장
4th row: A동
5th row: M-PLUS 광주

Value | Count | Frequency (%)
101동 | 140 | 2.5%
102동 | 134 | 2.4%
103동 | 119 | 2.1%
광주 | 98 | 1.7%
104동 | 91 | 1.6%
아파트 | 90 | 1.6%
105동 | 81 | 1.4%
힐스테이트 | 76 | 1.3%
106동 | 68 | 1.2%
107동 | 51 | 0.9%
Other values (2180) | 4734 | 83.3%

Most occurring characters

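Value | Count | Frequency (%)
(space) | 2805 | 7.6%
(not preserved) | 2544 | 6.9%
1 | 2342 | 6.4%
0 | 1875 | 5.1%
(not preserved) | 1366 | 3.7%
(not preserved) | 1209 | 3.3%
(not preserved) | 1200 | 3.3%
2 | 987 | 2.7%
3 | 662 | 1.8%
(not preserved) | 636 | 1.7%
Other values (457) | 21171 | 57.5%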

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 25624 | 69.6%
Decimal Number | 7262 | 19.7%
Space Separator | 2805 | 7.6%
Uppercase Letter | 388 | 1.1%
Lowercase Letter | 206 | 0.6%
Dash Punctuation | 132 | 0.4%
Open Punctuation | 122 | 0.3%
Close Punctuation | 122 | 0.3%
Other Punctuation | 113 | 0.3%
Math Symbol | 21 | 0.1%
Other values (2) | 2 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not preserved) | 2544 | 9.9%
(not preserved) | 1366 | 5.3%
(not preserved) | 1209 | 4.7%
(not preserved) | 1200 | 4.7%
(not preserved) | 636 | 2.5%
(not preserved) | 574 | 2.2%
(not preserved) | 504 | 2.0%
(not preserved) | 404 | 1.6%
(not preserved) | 385 | 1.5%
(not preserved) | 385 | 1.5%
Other values (399) | 16417 | 64.1%

Uppercase Letter
Value | Count | Frequency (%)
S | 59 | 15.2%
A | 37 | 9.5%
I | 33 | 8.5%
R | 30 | 7.7%
C | 30 | 7.7%
G | 27 | 7.0%
P | 24 | 6.2%
E | 22 | 5.7%
T | 21 | 5.4%
K | 21 | 5.4%
Other values (13) | 84 | 21.6%

Lowercase Letter
Value | Count | Frequency (%)
m | 38 | 18.4%
a | 36 | 17.5%
p | 35 | 17.0%
e | 23 | 11.2%
t | 23 | 11.2%
h | 17 | 8.3%
s | 17 | 8.3%
d | 8 | 3.9%
o | 4 | 1.9%
i | 4 | 1.9%

Decimal Number
Value | Count | Frequency (%)
1 | 2342 | 32.3%
0 | 1875 | 25.8%
2 | 987 | 13.6%
3 | 662 | 9.1%
5 | 355 | 4.9%
4 | 314 | 4.3%
6 | 266 | 3.7%
7 | 187 | 2.6%
8 | 153 | 2.1%
9 | 121 | 1.7%

Other Punctuation
Value | Count | Frequency (%)
; | 36 | 31.9%
& | 25 | 22.1%
# | 22 | 19.5%
. | 15 | 13.3%
, | 14 | 12.4%
/ | 1 | 0.9%

Math Symbol
Value | Count | Frequency (%)
+ | 12 | 57.1%
~ | 9 | 42.9%

Space Separator
Value | Count | Frequency (%)
(space) | 2805 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 132 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 122 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 122 | 100.0%

Other Symbol
Value | Count | Frequency (%)
(not preserved) | 1 | 100.0%

Connector Punctuation
Value | Count | Frequency (%)
_ | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 25625 | 69.6%
Common | 10578 | 28.7%
Latin | 594 | 1.6%

Most frequent character per script

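Hangul
Value | Count | Frequency (%)
(not preserved) | 2544 | 9.9%
(not preserved) | 1366 | 5.3%
(not preserved) | 1209 | 4.7%
(not preserved) | 1200 | 4.7%
(not preserved) | 636 | 2.5%
(not preserved) | 574 | 2.2%
(not preserved) | 504 | 2.0%
(not preserved) | 404 | 1.6%
(not preserved) | 385 | 1.5%
(not preserved) | 385 | 1.5%
Other values (400) | 16418 | 64.1%

Latin
Value | Count | Frequency (%)
S | 59 | 9.9%
m | 38 | 6.4%
A | 37 | 6.2%
a | 36 | 6.1%
p | 35 | 5.9%
I | 33 | 5.6%
R | 30 | 5.1%
C | 30 | 5.1%
G | 27 | 4.5%
P | 24 | 4.0%
Other values (24) | 245 | 41.2%

Common
Value | Count | Frequency (%)
(space) | 2805 | 26.5%
1 | 2342 | 22.1%
0 | 1875 | 17.7%
2 | 987 | 9.3%
3 | 662 | 6.3%
5 | 355 | 3.4%
4 | 314 | 3.0%
6 | 266 | 2.5%
7 | 187 | 1.8%
8 | 153 | 1.4%
Other values (13) | 632 | 6.0%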

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 25624 | 69.6%
ASCII | 11172 | 30.4%
None | 1 | < 0.1%

Most frequent character per block

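ASCII
Value | Count | Frequency (%)
(space) | 2805 | 25.1%
1 | 2342 | 21.0%
0 | 1875 | 16.8%
2 | 987 | 8.8%
3 | 662 | 5.9%
5 | 355 | 3.2%
4 | 314 | 2.8%
6 | 266 | 2.4%
7 | 187 | 1.7%
8 | 153 | 1.4%
Other values (47) | 1226 | 11.0%

Hangul
Value | Count | Frequency (%)
(not preserved) | 2544 | 9.9%
(not preserved) | 1366 | 5.3%
(not preserved) | 1209 | 4.7%
(not preserved) | 1200 | 4.7%
(not preserved) | 636 | 2.5%
(not preserved) | 574 | 2.2%
(not preserved) | 504 | 2.0%
(not preserved) | 404 | 1.6%
(not preserved) | 385 | 1.5%
(not preserved) | 385 | 1.5%
Other values (399) | 16417 | 64.1%

None
Value | Count | Frequency (%)
(not preserved) | 1 | 100.0%
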
Distinct: 667
Distinct (%): 23.1%
Missing: 1
Missing (%): < 0.1%
Memory size: 22.7 KiB

Length

Max length: 27
Median length: 22
Mean length: 11.485793
Min length: 3

Characters and Unicode

Total characters: 33148
Distinct characters: 400
Distinct categories: 10
Distinct scripts: 4
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 297
Unique (%): 10.3%

Sample

1st row: (주)위니아대우매뉴팩처링
2nd row: 금호타이어 광주공장
3rd row: 금호타이어 광주공장
4th row: 앰코테크놀로지코리아
5th row: M-PLUS 광주

Value | Count | Frequency (%)
관리사무소 | 511 | 11.9%
종합건설본부 | 186 | 4.3%
건설과 | 103 | 2.4%
광산구청 | 71 | 1.7%
남구청 | 45 | 1.1%
광주광역시도시공사 | 41 | 1.0%
광주광역시 | 37 | 0.9%
동구청 | 37 | 0.9%
아파트 | 36 | 0.8%
도로팀 | 35 | 0.8%
Other values (770) | 3183 | 74.3%

Most occurring characters

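Value | Count | Frequency (%)
(space) | 1399 | 4.2%
(not preserved) | 1347 | 4.1%
(not preserved) | 1336 | 4.0%
(not preserved) | 1216 | 3.7%
(not preserved) | 1186 | 3.6%
(not preserved) | 1184 | 3.6%
(not preserved) | 1168 | 3.5%
(not preserved) | 1162 | 3.5%
(not preserved) | 1075 | 3.2%
(not preserved) | 887 | 2.7%
Other values (390) | 21188 | 63.9%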

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 29932 | 90.3%
Space Separator | 1399 | 4.2%
Decimal Number | 1094 | 3.3%
Uppercase Letter | 286 | 0.9%
Close Punctuation | 109 | 0.3%
Open Punctuation | 107 | 0.3%
Lowercase Letter | 97 | 0.3%
Other Punctuation | 73 | 0.2%
Dash Punctuation | 48 | 0.1%
Other Symbol | 3 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not preserved) | 1347 | 4.5%
(not preserved) | 1336 | 4.5%
(not preserved) | 1216 | 4.1%
(not preserved) | 1186 | 4.0%
(not preserved) | 1184 | 4.0%
(not preserved) | 1168 | 3.9%
(not preserved) | 1162 | 3.9%
(not preserved) | 1075 | 3.6%
(not preserved) | 887 | 3.0%
(not preserved) | 778 | 2.6%
Other values (337) | 18593 | 62.1%

Uppercase Letter
Value | Count | Frequency (%)
S | 105 | 36.7%
L | 28 | 9.8%
A | 23 | 8.0%
C | 19 | 6.6%
K | 16 | 5.6%
I | 13 | 4.5%
G | 12 | 4.2%
E | 12 | 4.2%
H | 9 | 3.1%
W | 9 | 3.1%
Other values (11) | 40 | 14.0%

Lowercase Letter
Value | Count | Frequency (%)
m | 23 | 23.7%
p | 22 | 22.7%
a | 22 | 22.7%
s | 16 | 16.5%
e | 5 | 5.2%
k | 2 | 2.1%
d | 2 | 2.1%
b | 1 | 1.0%
c | 1 | 1.0%
t | 1 | 1.0%
Other values (2) | 2 | 2.1%

Decimal Number
Value | Count | Frequency (%)
2 | 368 | 33.6%
1 | 315 | 28.8%
3 | 185 | 16.9%
4 | 58 | 5.3%
5 | 55 | 5.0%
6 | 43 | 3.9%
0 | 23 | 2.1%
9 | 18 | 1.6%
8 | 17 | 1.6%
7 | 12 | 1.1%

Other Punctuation
Value | Count | Frequency (%)
; | 34 | 46.6%
& | 23 | 31.5%
# | 11 | 15.1%
. | 4 | 5.5%
, | 1 | 1.4%

Space Separator
Value | Count | Frequency (%)
(space) | 1399 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 109 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 107 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 48 | 100.0%

Other Symbol
Value | Count | Frequency (%)
(not preserved) | 3 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 29932 | 90.3%
Common | 2830 | 8.5%
Latin | 383 | 1.2%
Han | 3 | < 0.1%

Most frequent character per script

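Hangul
Value | Count | Frequency (%)
(not preserved) | 1347 | 4.5%
(not preserved) | 1336 | 4.5%
(not preserved) | 1216 | 4.1%
(not preserved) | 1186 | 4.0%
(not preserved) | 1184 | 4.0%
(not preserved) | 1168 | 3.9%
(not preserved) | 1162 | 3.9%
(not preserved) | 1075 | 3.6%
(not preserved) | 887 | 3.0%
(not preserved) | 778 | 2.6%
Other values (337) | 18593 | 62.1%

Latin
Value | Count | Frequency (%)
S | 105 | 27.4%
L | 28 | 7.3%
A | 23 | 6.0%
m | 23 | 6.0%
p | 22 | 5.7%
a | 22 | 5.7%
C | 19 | 5.0%
s | 16 | 4.2%
K | 16 | 4.2%
I | 13 | 3.4%
Other values (23) | 96 | 25.1%

Common
Value | Count | Frequency (%)
(space) | 1399 | 49.4%
2 | 368 | 13.0%
1 | 315 | 11.1%
3 | 185 | 6.5%
) | 109 | 3.9%
( | 107 | 3.8%
4 | 58 | 2.0%
5 | 55 | 1.9%
- | 48 | 1.7%
6 | 43 | 1.5%
Other values (9) | 143 | 5.1%

Han
Value | Count | Frequency (%)
(not preserved) | 3 | 100.0%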

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 29929 | 90.3%
ASCII | 3213 | 9.7%
None | 3 | < 0.1%
CJK | 3 | < 0.1%

Most frequent character per block

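ASCII
Value | Count | Frequency (%)
(space) | 1399 | 43.5%
2 | 368 | 11.5%
1 | 315 | 9.8%
3 | 185 | 5.8%
) | 109 | 3.4%
( | 107 | 3.3%
S | 105 | 3.3%
4 | 58 | 1.8%
5 | 55 | 1.7%
- | 48 | 1.5%
Other values (42) | 464 | 14.4%

Hangul
Value | Count | Frequency (%)
(not preserved) | 1347 | 4.5%
(not preserved) | 1336 | 4.5%
(not preserved) | 1216 | 4.1%
(not preserved) | 1186 | 4.0%
(not preserved) | 1184 | 4.0%
(not preserved) | 1168 | 3.9%
(not preserved) | 1162 | 3.9%
(not preserved) | 1075 | 3.6%
(not preserved) | 887 | 3.0%
(not preserved) | 778 | 2.6%
Other values (336) | 18590 | 62.1%

None
Value | Count | Frequency (%)
(not preserved) | 3 | 100.0%

CJK
Value | Count | Frequency (%)
(not preserved) | 3 | 100.0%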

관리기관 (managing agency)
Categorical

HIGH CORRELATION 

Distinct: 7
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 22.7 KiB
광주광역시 북구청: 708
광주광역시 광산구청: 648
광주광역시 서구청: 544
광주광역시청: 403
광주광역시 남구청: 360
Other values (2): 224

Length

Max length: 10
Median length: 9
Mean length: 8.8039487
Min length: 4

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 광주광역시 광산구청
2nd row: 광주광역시 광산구청
3rd row: 광주광역시 광산구청
4th row: 광주광역시 북구청
5th row: 광주광역시 서구청

Common Values

Value | Count | Frequency (%)
광주광역시 북구청 | 708 | 24.5%
광주광역시 광산구청 | 648 | 22.4%
광주광역시 서구청 | 544 | 18.8%
광주광역시청 | 403 | 14.0%
광주광역시 남구청 | 360 | 12.5%
광주광역시 동구청 | 223 | 7.7%
<NA> | 1 | < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

[Figure: common values plot]

Value | Count | Frequency (%)
광주광역시 | 2483 | 46.2%
북구청 | 708 | 13.2%
광산구청 | 648 | 12.1%
서구청 | 544 | 10.1%
광주광역시청 | 403 | 7.5%
남구청 | 360 | 6.7%
동구청 | 223 | 4.2%
na | 1 | < 0.1%

시설물종류 (facility type)
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 35
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Memory size: 22.7 KiB
공동주택: 2048
도로교량: 261
대형건축물: 85
의료시설: 70
다중이용건축물: 58
Other values (30): 365

Length

Max length: 9
Median length: 4
Mean length: 4.157603
Min length: 2

Unique

Unique: 6
Unique (%): 0.2%

Sample

1st row: 대형건축물
2nd row: 대형건축물
3rd row: 대형건축물
4th row: 대형건축물
5th row: 대형건축물

Common Values

Value | Count | Frequency (%)
공동주택 | 2048 | 70.9%
도로교량 | 261 | 9.0%
대형건축물 | 85 | 2.9%
의료시설 | 70 | 2.4%
다중이용건축물 | 58 | 2.0%
육교 | 54 | 1.9%
수문 및 통문 | 36 | 1.2%
판매시설 | 29 | 1.0%
건축물옹벽 | 26 | 0.9%
도로터널 | 26 | 0.9%
Other values (25) | 194 | 6.7%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
공동주택 | 2048 | 68.1%
도로교량 | 261 | 8.7%
대형건축물 | 85 | 2.8%
의료시설 | 70 | 2.3%
(not preserved) | 61 | 2.0%
다중이용건축물 | 58 | 1.9%
육교 | 54 | 1.8%
수문 | 36 | 1.2%
통문 | 36 | 1.2%
판매시설 | 29 | 1.0%
Other values (28) | 271 | 9.0%

종별 (facility class)
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 4
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 22.7 KiB
2종: 2416
3종: 295
1종: 175
<NA>: 1

Length

Max length: 4
Median length: 2
Mean length: 2.0006928
Min length: 2

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 1종
2nd row: 1종
3rd row: 1종
4th row: 1종
5th row: 1종

Common Values

Value | Count | Frequency (%)
2종 | 2416 | 83.7%
3종 | 295 | 10.2%
1종 | 175 | 6.1%
<NA> | 1 | < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

[Figure: common values plot]

Value | Count | Frequency (%)
2종 | 2416 | 83.7%
3종 | 295 | 10.2%
1종 | 175 | 6.1%
na | 1 | < 0.1%

구분 (category: 민간 = private, 공공 = public)
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 3
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 22.7 KiB
민간: 2296
공공: 590
<NA>: 1

Length

Max length: 4
Median length: 2
Mean length: 2.0006928
Min length: 2

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 민간
2nd row: 민간
3rd row: 민간
4th row: 민간
5th row: 민간

Common Values

Value | Count | Frequency (%)
민간 | 2296 | 79.5%
공공 | 590 | 20.4%
<NA> | 1 | < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

[Figure: common values plot]

Value | Count | Frequency (%)
민간 | 2296 | 79.5%
공공 | 590 | 20.4%
na | 1 | < 0.1%

주소 (address)
Text

Distinct: 1227
Distinct (%): 42.5%
Missing: 1
Missing (%): < 0.1%
Memory size: 22.7 KiB