Overview

Dataset statistics

Number of variables: 7
Number of observations: 500
Missing cells: 153
Missing cells (%): 4.4%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 28.4 KiB
Average record size in memory: 58.3 B
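The missing-cell figures above can be cross-checked against the rest of the report. The sketch below assumes that "Missing cells (%)" is the number of missing cells divided by variables × observations, and that the two per-variable counts listed under Alerts (8 and 145) are the only missing values; the variable names are illustrative.

```python
# Figures copied from the Dataset statistics and Alerts sections of this report.
n_variables = 7
n_observations = 500
missing_per_variable = [8, 145]  # the two variables flagged as Missing

missing_cells = sum(missing_per_variable)
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(missing_cells)          # 153
print(f"{missing_pct:.1f}%")  # 4.4%
```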

Variable types

Numeric: 1
Categorical: 3
Text: 3

Dataset

Description: Sample data
Author: KB국민은행
URL: https://bigdata.seoul.go.kr/data/selectSampleData.do?sample_data_seq=1

Alerts

is highly imbalanced (83.4%) [Imbalance]
동읍면 has 8 (1.6%) missing values [Missing]
has 145 (29.0%) missing values [Missing]
법정동코드 has unique values [Unique]

Reproduction

Analysis started: 2023-12-10 14:51:16.514600
Analysis finished: 2023-12-10 14:51:17.365609
Duration: 0.85 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
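As a small check, the reported duration follows from the two timestamps above. This sketch uses only the Python standard library and assumes both timestamps are in the same time zone.

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2023-12-10 14:51:16.514600")
finished = datetime.fromisoformat("2023-12-10 14:51:17.365609")

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")  # 0.85 seconds
```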

Variables

법정동코드
Real number (ℝ)

UNIQUE 

Distinct: 500
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.3360896 × 10^9
Minimum: 1.1170109 × 10^9
Maximum: 5.0130253 × 10^9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.5 KiB

Quantile statistics

Minimum: 1.1170109 × 10^9
5-th percentile: 2.7110157 × 10^9
Q1: 4.223025 × 10^9
Median: 4.4827817 × 10^9
Q3: 4.717039 × 10^9
95-th percentile: 4.8850321 × 10^9
Maximum: 5.0130253 × 10^9
Range: 3.8960144 × 10^9
Interquartile range (IQR): 4.94014 × 10^8
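The range and IQR follow directly from the quantiles listed above. A minimal sketch, using the rounded values as printed in this report (so the results are only as precise as those roundings):

```python
# Quantiles of 법정동코드 as shown in the report (rounded to 8 significant digits).
minimum = 1.1170109e9
q1 = 4.223025e9
q3 = 4.717039e9
maximum = 5.0130253e9

value_range = maximum - minimum  # Range = Maximum - Minimum
iqr = q3 - q1                    # IQR = Q3 - Q1

print(f"{value_range:.7e}")
print(f"{iqr:.5e}")
```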

Descriptive statistics

Standard deviation: 6.799384 × 10^8
Coefficient of variation (CV): 0.15680912
Kurtosis: 8.1440326
Mean: 4.3360896 × 10^9
Median Absolute Deviation (MAD): 2.3725085 × 10^8
Skewness: -2.6750845
Sum: 2.1680448 × 10^12
Variance: 4.6231622 × 10^17
Monotonicity: Not monotonic
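Several of the descriptive statistics are tied together by simple identities: CV = standard deviation / mean, variance = standard deviation squared, and sum = mean × number of observations. The sketch below checks these using the rounded figures from this report; it does not recompute anything from the raw data.

```python
n = 500               # number of observations
mean = 4.3360896e9    # Mean, as printed
std = 6.799384e8      # Standard deviation, as printed

cv = std / mean       # coefficient of variation
variance = std ** 2   # variance is the squared standard deviation
total = mean * n      # sum of all values

print(f"{cv:.8f}")        # 0.15680912
print(f"{variance:.4e}")  # 4.6232e+17
print(f"{total:.7e}")
```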
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
4717025021          1      0.2%
4420010500          1      0.2%
4423036029          1      0.2%
4372037023          1      0.2%
4575035531          1      0.2%
4136025326          1      0.2%
4223036026          1      0.2%
4514012000          1      0.2%
3611034047          1      0.2%
4672039027          1      0.2%
Other values (490)  490    98.0%

Minimum 10 values

Value               Count  Frequency (%)
1117010900          1      0.2%
1117013100          1      0.2%
1123010900          1      0.2%
1141010500          1      0.2%
1141011700          1      0.2%
1144000000          1      0.2%
1147010300          1      0.2%
1156012200          1      0.2%
1159010600          1      0.2%
2614010100          1      0.2%

Maximum 10 values

Value               Count  Frequency (%)
5013025321          1      0.2%
5013012000          1      0.2%
5013000000          1      0.2%
5011025326          1      0.2%
5011012300          1      0.2%
4972025026          1      0.2%
4972025023          1      0.2%
4971025900          1      0.2%
4971025624          1      0.2%
4913011700          1      0.2%

시도
Categorical

Distinct: 18
Distinct (%): 3.6%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB

경상북도  82
전라남도  67
경상남도  62
경기도  61
충청남도  52
Other values (13)  176

Length

Max length: 7
Median length: 4
Mean length: 3.938
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 전라남도
2nd row: 강원도
3rd row: 전라북도
4th row: 강원도
5th row: 경상남도

Common Values

Value  Count  Frequency (%)
경상북도  82  16.4%
전라남도  67  13.4%
경상남도  62  12.4%
경기도  61  12.2%
충청남도  52  10.4%
강원도  40  8.0%
전라북도  37  7.4%
충청북도  35  7.0%
서울특별시  12  2.4%
부산광역시  9  1.8%
Other values (8)  43  8.6%

Length

Histogram of lengths of the category
Value  Count  Frequency (%)
경상북도  82  16.4%
전라남도  67  13.4%
경상남도  62  12.4%
경기도  61  12.2%
충청남도  52  10.4%
강원도  40  8.0%
전라북도  37  7.4%
충청북도  35  7.0%
서울특별시  12  2.4%
부산광역시  9  1.8%
Other values (8)  43  8.6%

Distinct: 160
Distinct (%): 32.0%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB

Length

Max length: 7
Median length: 3
Mean length: 3.006
Min length: 2

Characters and Unicode

Total characters: 1503
Distinct characters: 116
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
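To illustrate what these character properties look like, here is a minimal sketch using Python's standard `unicodedata` module on the five sample values shown for this variable. The general category `Lo` is the short code for "Other Letter". The standard library does not expose the Script or Block properties, so the first word of each character's Unicode name is used here as a rough stand-in; that shortcut is an assumption of this sketch, not how the report computes scripts.

```python
import unicodedata
from collections import Counter

# Sample rows copied from this report.
sample = ["북제주군", "청주시", "제주시", "안산시", "중구"]
text = "".join(sample)

# General category of each character ("Lo" = Other Letter).
categories = Counter(unicodedata.category(ch) for ch in text)

# Rough script proxy: first word of the character name, e.g. "HANGUL SYLLABLE ...".
name_prefixes = Counter(unicodedata.name(ch).split()[0] for ch in text)

print(len(text))      # 15
print(categories)     # Counter({'Lo': 15})
print(name_prefixes)  # Counter({'HANGUL': 15})
```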

Unique

Unique: 35
Unique (%): 7.0%

Sample

1st row: 북제주군
2nd row: 청주시
3rd row: 제주시
4th row: 안산시
5th row: 중구

Value  Count  Frequency (%)
청주시  12  2.4%
중구  10  2.0%
포항시  10  2.0%
창원시  9  1.8%
영천시  9  1.8%
상주시  8  1.6%
제천시  8  1.6%
화성시  7  1.4%
경주시  7  1.4%
예산군  7  1.4%
Other values (150)  413  82.6%

Most occurring characters

Value               Count  Frequency (%)
                    235    15.6%
                    234    15.6%
                    78     5.2%
                    61     4.1%
                    49     3.3%
                    48     3.2%
                    46     3.1%
                    36     2.4%
                    32     2.1%
                    28     1.9%
Other values (106)  656    43.6%

Most occurring categories

Value  Count  Frequency (%)
Other Letter  1503  100.0%

Most frequent character per category

Other Letter
Value               Count  Frequency (%)
                    235    15.6%
                    234    15.6%
                    78     5.2%
                    61     4.1%
                    49     3.3%
                    48     3.2%
                    46     3.1%
                    36     2.4%
                    32     2.1%
                    28     1.9%
Other values (106)  656    43.6%

Most occurring scripts

Value  Count  Frequency (%)
Hangul  1503  100.0%

Most frequent character per script

Hangul
Value               Count  Frequency (%)
                    235    15.6%
                    234    15.6%
                    78     5.2%
                    61     4.1%
                    49     3.3%
                    48     3.2%
                    46     3.1%
                    36     2.4%
                    32     2.1%
                    28     1.9%
Other values (106)  656    43.6%

Most occurring blocks

Value  Count  Frequency (%)
Hangul  1503  100.0%

Most frequent character per block

Hangul
Value               Count  Frequency (%)
                    235    15.6%
                    234    15.6%
                    78     5.2%
                    61     4.1%
                    49     3.3%
                    48     3.2%
                    46     3.1%
                    36     2.4%
                    32     2.1%
                    28     1.9%
Other values (106)  656    43.6%


Categorical

IMBALANCE 

Distinct: 19
Distinct (%): 3.8%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB

<NA>  461
남구  5
상당구  4
북구  3
흥덕구  3
Other values (14)  24

Length

Max length: 5
Median length: 4
Mean length: 3.92
Min length: 2

Unique

Unique: 8
Unique (%): 1.6%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value  Count  Frequency (%)
<NA>  461  92.2%
남구  5  1.0%
상당구  4  0.8%
북구  3  0.6%
흥덕구  3  0.6%
서원구  3  0.6%
처인구  3  0.6%
동남구  3  0.6%
마산합포구  3  0.6%
청원구  2  0.4%
Other values (9)  10  2.0%

Length

Histogram of lengths of the category
Value  Count  Frequency (%)
na  461  92.2%
남구  5  1.0%
상당구  4  0.8%
북구  3  0.6%
흥덕구  3  0.6%
서원구  3  0.6%
처인구  3  0.6%
동남구  3  0.6%
마산합포구  3  0.6%
서북구  2  0.4%
Other values (9)  10  2.0%
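The 83.4% imbalance figure in the alert for this variable can be reproduced from the counts above. The sketch below assumes the score is one minus the Shannon entropy of the value counts divided by the maximum possible entropy, log2(number of classes); that definition is an assumption of this example, though it does reproduce the reported number. The tail of the count list is also inferred: eight values occur once (Unique: 8) and two values (청원구, 서북구) occur twice, which is consistent with 19 distinct values and 500 rows.

```python
import math

# Value counts read from the frequency tables above; the last ten entries are
# inferred from "Unique: 8" and the two values shown with a count of 2.
counts = [461, 5, 4, 3, 3, 3, 3, 3, 3, 2, 2] + [1] * 8

def imbalance_score(counts):
    """Return 1 - H / log2(k): 0 for a uniform distribution, near 1 when one class dominates."""
    n_classes = len(counts)
    if n_classes <= 1:
        return 0.0
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(n_classes)

score = imbalance_score(counts)
print(len(counts), sum(counts))  # 19 500
print(f"{score:.3f}")            # 0.834
```

A score this close to 1 reflects that a single value (`<NA>`, 461 of 500 rows) accounts for 92.2% of the column.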

동읍면
Text

MISSING 

Distinct: 415
Distinct (%): 84.3%
Missing: 8
Missing (%): 1.6%
Memory size: 4.0 KiB
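Note that 415 / 500 would be 83.0%, not 84.3%. The reported percentages are consistent with distinct and unique counts being divided by the number of non-missing observations rather than by all 500 rows. The sketch below checks that reading against figures from this report; the helper name is hypothetical.

```python
N_OBSERVATIONS = 500

def pct_of_non_missing(count, missing, n=N_OBSERVATIONS):
    """Percentage of `count` relative to the rows that are not missing."""
    return 100 * count / (n - missing)

# 동읍면: 415 distinct and 356 unique values, 8 missing
print(f"{pct_of_non_missing(415, 8):.1f}%")    # 84.3%
print(f"{pct_of_non_missing(356, 8):.1f}%")    # 72.4%
# Last variable in the report: 341 distinct values, 145 missing
print(f"{pct_of_non_missing(341, 145):.1f}%")  # 96.1%
```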

Length

Max length: 5
Median length: 3
Mean length: 3.0223577
Min length: 2

Characters and Unicode

Total characters: 1487
Distinct characters: 209
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 356
Unique (%): 72.4%

Sample

1st row: 가례면
2nd row: 한경면
3rd row: 용궁면
4th row: 반남면
5th row: 덕적면

Value  Count  Frequency (%)
성산읍  5  1.0%
동면  4  0.8%
남면  4  0.8%
입장면  4  0.8%
모동면  3  0.6%
옥천읍  3  0.6%
가덕면  3  0.6%
현도면  3  0.6%
광석면  3  0.6%
옥산면  3  0.6%
Other values (405)  457  92.9%

Most occurring characters

Value               Count  Frequency (%)
                    337    22.7%
                    112    7.5%
                    76     5.1%
                    34     2.3%
                    33     2.2%
                    27     1.8%
                    24     1.6%
                    22     1.5%
                    21     1.4%
                    20     1.3%
Other values (199)  781    52.5%

Most occurring categories

Value           Count  Frequency (%)
Other Letter    1480   99.5%
Decimal Number  7      0.5%

Most frequent character per category

Other Letter
Value               Count  Frequency (%)
                    337    22.8%
                    112    7.6%
                    76     5.1%
                    34     2.3%
                    33     2.2%
                    27     1.8%
                    24     1.6%
                    22     1.5%
                    21     1.4%
                    20     1.4%
Other values (195)  774    52.3%

Decimal Number
Value  Count  Frequency (%)
2      3      42.9%
1      2      28.6%
7      1      14.3%
3      1      14.3%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  1480   99.5%
Common  7      0.5%

Most frequent character per script

Hangul
Value               Count  Frequency (%)
                    337    22.8%
                    112    7.6%
                    76     5.1%
                    34     2.3%
                    33     2.2%
                    27     1.8%
                    24     1.6%
                    22     1.5%
                    21     1.4%
                    20     1.4%
Other values (195)  774    52.3%

Common
Value  Count  Frequency (%)
2      3      42.9%
1      2      28.6%
7      1      14.3%
3      1      14.3%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  1480   99.5%
ASCII   7      0.5%

Most frequent character per block

Hangul
Value               Count  Frequency (%)
                    337    22.8%
                    112    7.6%
                    76     5.1%
                    34     2.3%
                    33     2.2%
                    27     1.8%
                    24     1.6%
                    22     1.5%
                    21     1.4%
                    20     1.4%
Other values (195)  774    52.3%

ASCII
Value  Count  Frequency (%)
2      3      42.9%
1      2      28.6%
7      1      14.3%
3      1      14.3%


Text

MISSING 

Distinct: 341
Distinct (%): 96.1%
Missing: 145
Missing (%): 29.0%
Memory size: 4.0 KiB