Overview

Dataset statistics

Number of variables: 6
Number of observations: 1140
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 58.0 KiB
Average record size in memory: 52.1 B
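
The average record size follows from the total size and the row count. A minimal sketch of that arithmetic, assuming the report divides the total in-memory size by the number of observations (the function name is illustrative, not part of the profiling tool):

```python
# Figures reported under "Dataset statistics".
TOTAL_SIZE_KIB = 58.0   # total size in memory, as printed (already rounded)
N_OBSERVATIONS = 1140   # number of rows


def average_record_size_bytes(total_kib: float, n_rows: int) -> float:
    """Return the average in-memory size of one row, in bytes (1 KiB = 1024 B)."""
    return total_kib * 1024 / n_rows


avg = average_record_size_bytes(TOTAL_SIZE_KIB, N_OBSERVATIONS)
print(f"{avg:.1f} B")
```

Because the 58.0 KiB input is itself rounded, the result should be read as approximate.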

Variable types

Categorical: 2
Text: 1
Numeric: 3

Dataset

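Description: One of the statistical indices that Gimhae City developed to assess urban conditions from statistics. It consists of the statistics year (통계연도), province name (시도명), city/county/district name (시군구명), number of private academies per 1,000 population (인구 천명당 사설학원 수(개)), number of private academies (사설학원수(개)), and resident-registered population (주민등록인구수(명)). Because the index is centered on Gimhae City, information for regions outside Gimhae may be missing owing to difficulties in data collection and processing.
Author: Gimhae City, Gyeongsangnam-do (경상남도 김해시)
URL: https://bigdata.gyeongnam.go.kr/index.gn?menuCd=DOM_000000114002001000&publicdatapk=15110186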
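
The per-1,000 figure is a ratio of the other two numeric columns. A minimal sketch, assuming the published value is rounded to one decimal place (every value shown in this report has one decimal, but the rounding rule itself is not documented here); the function name and the input numbers are hypothetical:

```python
def academies_per_thousand(academies: int, population: int) -> float:
    """Private academies per 1,000 registered residents, rounded to 0.1."""
    if population <= 0:
        raise ValueError("population must be positive")
    return round(academies / population * 1000, 1)


# Hypothetical district: 300 academies, 200,000 residents.
example = academies_per_thousand(300, 200_000)
print(example)
```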

Alerts

인구 천명당 사설학원 수(개) is highly overall correlated with 사설학원수(개) and 1 other field (High correlation)
사설학원수(개) is highly overall correlated with 인구 천명당 사설학원 수(개) and 1 other field (High correlation)
주민등록인구수(명) is highly overall correlated with 인구 천명당 사설학원 수(개) and 1 other field (High correlation)
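
The alerts do not state which coefficient or threshold triggered them. As an illustration only, the sketch below computes Spearman's rank correlation, one common measure for numeric pairs, in plain Python. The input rows are hypothetical, not taken from the dataset.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical rows: population and academy count rise together.
population = [10_000, 50_000, 150_000, 400_000, 900_000]
academies = [5, 40, 200, 450, 2_000]
rho = spearman(population, academies)
print(rho)
```

A strictly monotonic relationship like this one gives a coefficient of 1; real data would fall somewhere below that.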

Reproduction

Analysis started: 2023-12-11 00:23:51.351937
Analysis finished: 2023-12-11 00:23:52.860782
Duration: 1.51 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
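
A report of this kind can be regenerated with the tool named above. The following is a sketch under assumptions rather than the exact invocation used here: the file names and the report title are hypothetical, and the original run's settings live in config.json, which is not reproduced in this text.

```python
def build_report(csv_path: str, html_path: str) -> None:
    """Profile a CSV file and write the result as an HTML report."""
    # Imported lazily so this module loads even without the third-party packages.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    # The title is a placeholder; the original report's title is not shown.
    profile = ProfileReport(df, title="Private academies per 1,000 population")
    profile.to_file(html_path)


# Example call with hypothetical file names:
# build_report("academies.csv", "report.html")
```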

Variables

통계연도
Categorical

Distinct: 5
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 9.0 KiB

Value  Count
2017   228
2018   228
2019   228
2020   228
2021   228

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2017
2nd row: 2017
3rd row: 2017
4th row: 2017
5th row: 2017

Common Values

Value  Count  Frequency (%)
2017   228    20.0%
2018   228    20.0%
2019   228    20.0%
2020   228    20.0%
2021   228    20.0%
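
The counts and percentages in this table can be checked with a frequency count. A small sketch that rebuilds the 통계연도 column from the counts reported above (228 rows per year) and derives the frequency and distinct-value percentages:

```python
from collections import Counter

# Rebuild the column from the reported counts: five years, 228 rows each.
years = [year for year in ("2017", "2018", "2019", "2020", "2021") for _ in range(228)]

counts = Counter(years)
total = len(years)

for value, count in counts.most_common():
    print(f"{value}  {count}  {count / total:.1%}")

# Share of distinct values among all rows, as shown in "Distinct (%)".
distinct_pct = len(counts) / total
print(f"Distinct (%): {distinct_pct:.1%}")
```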

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

시도명
Categorical

Distinct: 16
Distinct (%): 1.4%
Missing: 0
Missing (%): 0.0%
Memory size: 9.0 KiB

Value              Count
경기도             155
서울특별시         125
경상북도           115
전라남도           110
강원도             90
Other values (11)  545

Length

Max length: 7
Median length: 5
Mean length: 4.1359649
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 서울특별시
2nd row: 서울특별시
3rd row: 서울특별시
4th row: 서울특별시
5th row: 서울특별시

Common Values

Value             Count  Frequency (%)
경기도            155    13.6%
서울특별시        125    11.0%
경상북도          115    10.1%
전라남도          110    9.6%
강원도            90     7.9%
경상남도          90     7.9%
부산광역시        80     7.0%
충청남도          75     6.6%
전라북도          70     6.1%
충청북도          55     4.8%
Other values (6)  175    15.4%

Length

[Figure: Histogram of lengths of the category]

시군구명
Text

Distinct: 206
Distinct (%): 18.1%
Missing: 0
Missing (%): 0.0%
Memory size: 9.0 KiB

Length

Max length: 4
Median length: 3
Mean length: 2.9307018
Min length: 2

Characters and Unicode

Total characters: 3341
Distinct characters: 132
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
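
These character properties can be inspected with Python's standard `unicodedata` module. The sketch below applies it to the five sample rows listed in this section. The standard library exposes the general category directly but not the script or block, so the code point name prefix is used here as a rough stand-in for "Hangul".

```python
import unicodedata
from collections import Counter

# The five sample rows shown for this variable.
sample = ["종로구", "중구", "용산구", "성동구", "광진구"]
text = "".join(sample)

# Frequency of each character across the sample.
chars = Counter(text)

# General category of every character; "Lo" means "Letter, other".
categories = Counter(unicodedata.category(ch) for ch in text)

# Proxy for the script: every code point name starts with "HANGUL SYLLABLE".
is_hangul = all(unicodedata.name(ch).startswith("HANGUL SYLLABLE") for ch in chars)

print(chars.most_common(3))
print(categories)
print(is_hangul)
```

On the full column the same approach would yield the single category (Other Letter) and single script (Hangul) that the report lists.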

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 종로구
2nd row: 중구
3rd row: 용산구
4th row: 성동구
5th row: 광진구

Common Values

Value               Count  Frequency (%)
동구                30     2.6%
중구                30     2.6%
서구                25     2.2%
남구                22     1.9%
북구                20     1.8%
고성군              10     0.9%
강서구              10     0.9%
완주군              5      0.4%
무주군              5      0.4%
진안군              5      0.4%
Other values (196)  978    85.8%

Most occurring characters

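Value               Count  Frequency (%)
[missing]           425    12.7%
[missing]           390    11.7%
[missing]           370    11.1%
[missing]           110    3.3%
[missing]           100    3.0%
[missing]           90     2.7%
[missing]           90     2.7%
[missing]           85     2.5%
[missing]           80     2.4%
[missing]           65     1.9%
Other values (122)  1536   46.0%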

Most occurring categories

Value         Count  Frequency (%)
Other Letter  3341   100.0%

Most frequent character per category

Other Letter: same counts as in the "Most occurring characters" table above (3341 characters, 100.0%).

Most occurring scripts

Value   Count  Frequency (%)
Hangul  3341   100.0%

Most frequent character per script

Hangul: same counts as in the "Most occurring characters" table above (3341 characters, 100.0%).

Most occurring blocks

Value   Count  Frequency (%)
Hangul  3341   100.0%

Most frequent character per block

Hangul: same counts as in the "Most occurring characters" table above (3341 characters, 100.0%).

인구 천명당 사설학원 수(개)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 41
Distinct (%): 3.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.2729825
Minimum: 0.1
Maximum: 4.5
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 10.1 KiB

Quantile statistics

Minimum: 0.1
5th percentile: 0.5
Q1: 0.8
Median: 1.2
Q3: 1.6
95th percentile: 2.205
Maximum: 4.5
Range: 4.4
Interquartile range (IQR): 0.8

Descriptive statistics

Standard deviation: 0.59876069
Coefficient of variation (CV): 0.47036052
Kurtosis: 3.6419817
Mean: 1.2729825
Median Absolute Deviation (MAD): 0.4
Skewness: 1.2843176
Sum: 1451.2
Variance: 0.35851436
Monotonicity: Not monotonic
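
Several of these statistics are tied together by simple identities, which makes the printed values easy to cross-check. A short sketch using only the figures reported for this variable (the tolerance allows for the rounding in the printed values):

```python
from math import isclose

# Figures reported for this variable.
n = 1140
total = 1451.2
mean = 1.2729825
std = 0.59876069
q1, q3 = 0.8, 1.6
minimum, maximum = 0.1, 4.5

checks = {
    "mean = sum / n": isclose(total / n, mean, rel_tol=1e-6),
    "variance = std ** 2": isclose(std ** 2, 0.35851436, rel_tol=1e-6),
    "CV = std / mean": isclose(std / mean, 0.47036052, rel_tol=1e-6),
    "IQR = Q3 - Q1": isclose(q3 - q1, 0.8),
    "range = max - min": isclose(maximum - minimum, 4.4),
}

for name, ok in checks.items():
    print(f"{name}: {'ok' if ok else 'mismatch'}")
```
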
[Figure: Histogram with fixed size bins (bins=41)]

Common values

Value              Count  Frequency (%)
0.7                88     7.7%
0.9                87     7.6%
1.3                85     7.5%
1.0                82     7.2%
0.8                81     7.1%
1.4                78     6.8%
1.2                77     6.8%
1.1                66     5.8%
1.7                61     5.4%
1.5                60     5.3%
Other values (31)  375    32.9%

Minimum 10 values

Value  Count  Frequency (%)
0.1    3      0.3%
0.2    12     1.1%
0.3    17     1.5%
0.4    20     1.8%
0.5    20     1.8%
0.6    46     4.0%
0.7    88     7.7%
0.8    81     7.1%
0.9    87     7.6%
1.0    82     7.2%

Maximum 10 values

Value  Count  Frequency (%)
4.5    1      0.1%
4.4    1      0.1%
4.2    2      0.2%
4.0    2      0.2%
3.9    1      0.1%
3.8    1      0.1%
3.7    1      0.1%
3.6    1      0.1%
3.3    2      0.2%
3.2    3      0.3%

사설학원수(개)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 597
Distinct (%): 52.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 352.7693
Minimum: 1
Maximum: 2383
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 10.1 KiB

Quantile statistics

Minimum: 1
5th percentile: 14
Q1: 51
Median: 188
Q3: 472.75
95th percentile: 1143.15
Maximum: 2383
Range: 2382
Interquartile range (IQR): 421.75
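
Fractional quantiles such as Q3 = 472.75 on integer counts suggest that the quantiles are interpolated between neighbouring observations. The sketch below illustrates linear interpolation with the standard library's `statistics.quantiles` and its `"inclusive"` method. That this matches the profiler's internal computation is an assumption, and the input values are hypothetical.

```python
from statistics import median, quantiles


def quartile_summary(values):
    """Quartiles (linear interpolation), IQR, and median absolute deviation."""
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    mad = median(abs(v - q2) for v in values)
    return {"Q1": q1, "median": q2, "Q3": q3, "IQR": q3 - q1, "MAD": mad}


# Hypothetical academy counts for four districts.
counts = [10, 20, 40, 80]
summary = quartile_summary(counts)
print(summary)
```

With four observations, Q1 falls three quarters of the way between the first and second values, which is why it comes out as a non-integer.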

Descriptive statistics

Standard deviation: 432.76674
Coefficient of variation (CV): 1.2267699
Kurtosis: 5.0841117
Mean: 352.7693
Median Absolute Deviation (MAD): 162
Skewness: 2.1214637
Sum: 402157
Variance: 187287.05
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value               Count  Frequency (%)
51                  13     1.1%
13                  12     1.1%
20                  12     1.1%
16                  12     1.1%
19                  12     1.1%
28                  10     0.9%
50                  10     0.9%
14                  10     0.9%
26                  10     0.9%
18                  9      0.8%
Other values (587)  1030   90.4%

Minimum 10 values

Value  Count  Frequency (%)
1      2      0.2%
2      3      0.3%
4      5      0.4%
6      6      0.5%
7      3      0.3%
8      6      0.5%
10     5      0.4%
11     2      0.2%
12     8      0.7%
13     12     1.1%

Maximum 10 values

Value  Count  Frequency (%)
2383   1      0.1%
2361   1      0.1%
2279   1      0.1%
2263   1      0.1%
2246   1      0.1%
2091   1      0.1%
2057   1      0.1%
2052   1      0.1%
2050   1      0.1%
2042   1      0.1%

주민등록인구수(명)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 1139
Distinct (%): 99.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 225666.46
Minimum: 8867
Maximum: 1202628
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 10.1 KiB

Quantile statistics

Minimum: 8867
5th percentile: 26852.25
Q1: 52752.5
Median: 148246
Q3: 341557
95th percentile: 654420.05
Maximum: 1202628
Range: 1193761
Interquartile range (IQR): 288804.5

Descriptive statistics

Standard deviation: 221758.75
Coefficient of variation (CV): 0.98268368
Kurtosis: 3.218356
Mean: 225666.46
Median Absolute Deviation (MAD): 108090.5
Skewness: 1.6655025
Sum: 2.5725977 × 10^8
Variance: 4.9176943 × 10^10
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value                Count  Frequency (%)
122499               2      0.2%
154770               1      0.1%
1186078              1      0.1%
537307               1      0.1%
298599               1      0.1%
818383               1      0.1%
550027               1      0.1%
461710               1      0.1%
940064               1      0.1%
222538               1      0.1%
Other values (1129)  1129   99.0%

Minimum 10 values

Value  Count  Frequency (%)
8867   1      0.1%
9077   1      0.1%
9617   1      0.1%
9832   1      0.1%
9975   1      0.1%
16320  1      0.1%
16692  1      0.1%
16993  1      0.1%
17356  1      0.1%
17479  1      0.1%

Maximum 10 values

Value    Count  Frequency (%)
1202628  1      0.1%
1201166  1      0.1%
1194465  1      0.1%
1186078  1      0.1%
1183714  1      0.1%
1079353  1      0.1%
1079216  1      0.1%
1077508  1      0.1%
1074176  1      0.1%
1066351  1      0.1%

Interactions

[Figures: interaction plots between the numeric variables]