Overview

Dataset statistics

Number of variables: 31
Number of observations: 1135
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 278.3 KiB
Average record size in memory: 251.1 B
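
The average record size can be cross-checked against the total memory figure and the number of observations. A minimal Python sketch, assuming the report uses binary kibibytes (1 KiB = 1024 B); the variable names are illustrative and not part of the report:

```python
# Figures copied from the dataset statistics above.
total_size_kib = 278.3
n_observations = 1135

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations
print(f"{avg_record_bytes:.1f} B")
```

Under that assumption the result rounds to the reported 251.1 B.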

Variable types

Numeric: 2
Categorical: 13
Text: 15
DateTime: 1

Alerts

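데이터기준 (data reference date) has constant value "" [Constant]
자료출처 (data source) has constant value "" [Constant]
공개여부 (disclosure status) has constant value "" [Constant]
작성일 (creation date) has constant value "" [Constant]
갱신주기 (update cycle) has constant value "" [Constant]
시군명 (city/county name) is highly imbalanced (73.3%) [Imbalance]
강좌내용 (course content) is highly imbalanced (86.2%) [Imbalance]
교육방법 (teaching method) is highly imbalanced (90.6%) [Imbalance]
선정방법 (selection method) is highly imbalanced (64.0%) [Imbalance]
훈련비지원 (training fee support) is highly imbalanced (80.6%) [Imbalance]
학점은행제 (academic credit bank system) is highly imbalanced (87.1%) [Imbalance]
평생학습 (lifelong learning) is highly imbalanced (69.8%) [Imbalance]
순번 (serial number) has unique values [Unique]
강좌정원수 (course capacity) has 133 (11.7%) zeros [Zeros]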
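
The report does not define how the imbalance percentages are computed. One reading that reproduces the 시군명 figure is one minus the Shannon entropy of the category shares, normalised by the maximum entropy for that number of categories. The sketch below is written under that assumption and is not taken from the profiling tool's source. The first ten counts come from the 시군명 Common Values table; the eleventh (5) is inferred so that the counts sum to 1135.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log(k), where H is the Shannon entropy of the class shares."""
    total = sum(counts)
    k = len(counts)
    if k <= 1:
        return 0.0
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log(k)

# 시군명 counts: ten values from the report, the last one (5) inferred from the total.
sigun_counts = [995, 27, 23, 22, 18, 11, 10, 9, 9, 6, 5]
score = imbalance_score(sigun_counts)
print(f"{score:.1%}")
```

A score of 0 means all categories are equally frequent; a score near 1 means a single category dominates, as 익산시 does here with 87.7% of the rows.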

Reproduction

Analysis started: 2024-03-14 00:36:50.399639
Analysis finished: 2024-03-14 00:36:51.393963
Duration: 0.99 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
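
The reported duration can be recomputed from the two timestamps. A small sketch using only the Python standard library; the variable names are illustrative:

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2024-03-14 00:36:50.399639")
finished = datetime.fromisoformat("2024-03-14 00:36:51.393963")

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```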

Variables

순번
Real number (ℝ)

UNIQUE 

Distinct: 1135
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 575.19648
Minimum: 1
Maximum: 1155
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 10.1 KiB

Quantile statistics

Minimum: 1
5-th percentile: 57.7
Q1: 288.5
Median: 572
Q3: 861.5
95-th percentile: 1098.3
Maximum: 1155
Range: 1154
Interquartile range (IQR): 573

Descriptive statistics

Standard deviation: 332.20091
Coefficient of variation (CV): 0.57754336
Kurtosis: -1.1935733
Mean: 575.19648
Median Absolute Deviation (MAD): 287
Skewness: 0.0061570182
Sum: 652848
Variance: 110357.44
Monotonicity: Strictly increasing
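
Several of the 순번 statistics are derived from one another, which gives a quick consistency check. The sketch below recomputes them from the reported figures using the standard definitions (mean = sum / count, CV = standard deviation / mean, variance = standard deviation squared). Small differences are expected because the inputs are already rounded.

```python
# Figures copied from the 순번 statistics in the report.
n, total = 1135, 652848
std = 332.20091
q1, q3 = 288.5, 861.5
minimum, maximum = 1, 1155

mean = total / n                 # arithmetic mean = Sum / number of observations
cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range

print(round(mean, 5), round(cv, 5), round(variance, 2), iqr, value_range)
```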
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
1                     1       0.1%
766                   1       0.1%
772                   1       0.1%
771                   1       0.1%
770                   1       0.1%
769                   1       0.1%
768                   1       0.1%
767                   1       0.1%
765                   1       0.1%
774                   1       0.1%
Other values (1125)   1125    99.1%

Value   Count   Frequency (%)
1       1       0.1%
2       1       0.1%
3       1       0.1%
4       1       0.1%
5       1       0.1%
6       1       0.1%
7       1       0.1%
8       1       0.1%
9       1       0.1%
10      1       0.1%

Value   Count   Frequency (%)
1155    1       0.1%
1154    1       0.1%
1153    1       0.1%
1152    1       0.1%
1151    1       0.1%
1150    1       0.1%
1149    1       0.1%
1148    1       0.1%
1147    1       0.1%
1146    1       0.1%

시군명
Categorical

IMBALANCE 

Distinct: 11
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 9.0 KiB

익산시: 995
김제시: 27
부안군: 23
전주시: 22
군산시: 18
Other values (6): 50

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 전주시
2nd row: 전주시
3rd row: 전주시
4th row: 전주시
5th row: 전주시

Common Values

Value    Count   Frequency (%)
익산시   995     87.7%
김제시   27      2.4%
부안군   23      2.0%
전주시   22      1.9%
군산시   18      1.6%
남원시   11      1.0%
정읍시   10      0.9%
진안군   9       0.8%
순창군   9       0.8%
장수군   6       0.5%

Length

Histogram of lengths of the category

Distinct: 999
Distinct (%): 88.0%
Missing: 0
Missing (%): 0.0%
Memory size: 9.0 KiB

Length

Max length: 36
Median length: 30
Mean length: 11.436123
Min length: 1

Characters and Unicode

Total characters: 12980
Distinct characters: 601
Distinct categories: 11
Distinct scripts: 4
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
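
To illustrate how character properties can be tallied for a text variable, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It is an illustration only and is not claimed to be how the profiling tool computes its figures. `unicodedata.category` returns two-letter codes, so `Lo` corresponds to "Other Letter", `Nd` to "Decimal Number", and `Po` to "Other Punctuation" in the tables of this report. The helper name `category_counts` is hypothetical.

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Count Unicode general categories (e.g. 'Lo', 'Nd') over all characters."""
    return Counter(unicodedata.category(ch) for value in values for ch in value)

# Two strings that appear in this report: a sample row and a frequent word.
counts = category_counts(["우쿨렐레", "50%할인"])
print(counts)
```

Hangul syllables fall under `Lo`, which is why "Other Letter" dominates the category tables for the Korean text columns.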

Unique

Unique: 933
Unique (%): 82.2%

Sample

1st row: 우쿨렐레
2nd row: 여행스케치
3rd row: 서예문인화작품반
4th row: 문인화
5th row: 사군자입문반

Value                 Count   Frequency (%)
트니트니              19      0.8%
토요                  18      0.8%
교실                  17      0.7%
50%할인               15      0.6%
목요                  14      0.6%
퍼포먼스              14      0.6%
노래교실              14      0.6%
저녁반                14      0.6%
요가                  13      0.5%
주민행복              12      0.5%
Other values (1408)   2230    93.7%

Most occurring characters

ValueCountFrequency (%)
1258
 
9.7%
) 525
 
4.0%
( 382
 
2.9%
2 276
 
2.1%
1 221
 
1.7%
- 186
 
1.4%
181
 
1.4%
179
 
1.4%
168
 
1.3%
167
 
1.3%
Other values (591) 9437
72.7%

Most occurring categories

Value               Count   Frequency (%)
Other Letter        8602    66.3%
Space Separator     1258    9.7%
Decimal Number      1201    9.3%
Close Punctuation   527     4.1%
Open Punctuation    384     3.0%
Uppercase Letter    319     2.5%
Other Punctuation   279     2.1%
Dash Punctuation    186     1.4%
Lowercase Letter    112     0.9%
Math Symbol         110     0.8%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
181
 
2.1%
179
 
2.1%
168
 
2.0%
167
 
1.9%
166
 
1.9%
153
 
1.8%
141
 
1.6%
127
 
1.5%
122
 
1.4%
119
 
1.4%
Other values (525) 7079
82.3%

Uppercase Letter
Value               Count   Frequency (%)
W                   37      11.6%
E                   36      11.3%
A                   29      9.1%
N                   28      8.8%
H                   27      8.5%
B                   26      8.2%
S                   24      7.5%
P                   20      6.3%
L                   16      5.0%
Y                   14      4.4%
Other values (12)   62      19.4%

Lowercase Letter
Value               Count   Frequency (%)
o                   23      20.5%
h                   14      12.5%
c                   11      9.8%
t                   10      8.9%
l                   8       7.1%
u                   8       7.1%
i                   7       6.2%
p                   4       3.6%
y                   4       3.6%
s                   3       2.7%
Other values (8)    20      17.9%

Decimal Number
Value   Count   Frequency (%)
2       276     23.0%
1       221     18.4%
3       165     13.7%
5       126     10.5%
6       101     8.4%
4       87      7.2%
0       81      6.7%
7       73      6.1%
8       39      3.2%
9       32      2.7%

Other Punctuation
ValueCountFrequency (%)
/ 122
43.7%
, 48
 
17.2%
! 39
 
14.0%
% 33
 
11.8%
& 30
 
10.8%
. 3
 
1.1%
2
 
0.7%
· 2
 
0.7%
Close Punctuation
ValueCountFrequency (%)
) 525
99.6%
2
 
0.4%
Open Punctuation
ValueCountFrequency (%)
( 382
99.5%
2
 
0.5%
Space Separator
ValueCountFrequency (%)
1258
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 186
100.0%
Math Symbol
ValueCountFrequency (%)
~ 110
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 2
100.0%

Most occurring scripts

Value    Count   Frequency (%)
Hangul   8601    66.3%
Common   3947    30.4%
Latin    431     3.3%
Han      1       < 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
181
 
2.1%
179
 
2.1%
168
 
2.0%
167
 
1.9%
166
 
1.9%
153
 
1.8%
141
 
1.6%
127
 
1.5%
122
 
1.4%
119
 
1.4%
Other values (524) 7078
82.3%
Latin
ValueCountFrequency (%)
W 37
 
8.6%
E 36
 
8.4%
A 29
 
6.7%
N 28
 
6.5%
H 27
 
6.3%
B 26
 
6.0%
S 24
 
5.6%
o 23
 
5.3%
P 20
 
4.6%
L 16
 
3.7%
Other values (30) 165
38.3%
Common
ValueCountFrequency (%)
1258
31.9%
) 525
13.3%
( 382
 
9.7%
2 276
 
7.0%
1 221
 
5.6%
- 186
 
4.7%
3 165
 
4.2%
5 126
 
3.2%
/ 122
 
3.1%
~ 110
 
2.8%
Other values (16) 576
14.6%
Han
ValueCountFrequency (%)
1
100.0%

Most occurring blocks

Value    Count   Frequency (%)
Hangul   8601    66.3%
ASCII    4370    33.7%
None     8       0.1%
CJK      1       < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1258
28.8%
) 525
12.0%
( 382
 
8.7%
2 276
 
6.3%
1 221
 
5.1%
- 186
 
4.3%
3 165
 
3.8%
5 126
 
2.9%
/ 122
 
2.8%
~ 110
 
2.5%
Other values (52) 999
22.9%
Hangul
ValueCountFrequency (%)
181
 
2.1%
179
 
2.1%
168
 
2.0%
167
 
1.9%
166
 
1.9%
153
 
1.8%
141
 
1.6%
127
 
1.5%
122
 
1.4%
119
 
1.4%
Other values (524) 7078
82.3%
None
ValueCountFrequency (%)
2
25.0%
· 2
25.0%
2
25.0%
2
25.0%
CJK
ValueCountFrequency (%)
1
100.0%

Distinct: 500
Distinct (%): 44.1%
Missing: 0
Missing (%): 0.0%
Memory size: 9.0 KiB

Length

Max length: 18
Median length: 3
Mean length: 3.0246696
Min length: 1

Characters and Unicode

Total characters: 3433
Distinct characters: 207
Distinct categories: 5
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 300
Unique (%): 26.4%

Sample

1st row: 김현주
2nd row: 김성욱
3rd row: 김연
4th row: 김연
5th row: 김연

ValueCountFrequency (%)
비공개 95
 
8.2%
38
 
3.3%
김연화 27
 
2.3%
최희자 13
 
1.1%
김세희 12
 
1.0%
정은희 11
 
1.0%
임창현 11
 
1.0%
박은실 10
 
0.9%
센터직원 9
 
0.8%
노성숙 9
 
0.8%
Other values (500) 917
79.6%

Most occurring characters

ValueCountFrequency (%)
229
 
6.7%
139
 
4.0%
115
 
3.3%
114
 
3.3%
108
 
3.1%
95
 
2.8%
95
 
2.8%
95
 
2.8%
90
 
2.6%
83
 
2.4%
Other values (197) 2270
66.1%

Most occurring categories

Value               Count   Frequency (%)
Other Letter        3364    98.0%
Dash Punctuation    39      1.1%
Space Separator     17      0.5%
Uppercase Letter    9       0.3%
Other Punctuation   4       0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
229
 
6.8%
139
 
4.1%
115
 
3.4%
114
 
3.4%
108
 
3.2%
95
 
2.8%
95
 
2.8%
95
 
2.8%
90
 
2.7%
83
 
2.5%
Other values (188) 2201
65.4%
Uppercase Letter
ValueCountFrequency (%)
D 2
22.2%
A 2
22.2%
N 2
22.2%
I 1
11.1%
L 1
11.1%
U 1
11.1%
Dash Punctuation
ValueCountFrequency (%)
- 39
100.0%
Space Separator
ValueCountFrequency (%)
17
100.0%
Other Punctuation
ValueCountFrequency (%)
, 4
100.0%

Most occurring scripts

Value    Count   Frequency (%)
Hangul   3364    98.0%
Common   60      1.7%
Latin    9       0.3%

Most frequent character per script

Hangul
ValueCountFrequency (%)
229
 
6.8%
139
 
4.1%
115
 
3.4%
114
 
3.4%
108
 
3.2%
95
 
2.8%
95
 
2.8%
95
 
2.8%
90
 
2.7%
83
 
2.5%
Other values (188) 2201
65.4%
Latin
ValueCountFrequency (%)
D 2
22.2%
A 2
22.2%
N 2
22.2%
I 1
11.1%
L 1
11.1%
U 1
11.1%
Common
ValueCountFrequency (%)
- 39
65.0%
17
28.3%
, 4
 
6.7%

Most occurring blocks

Value    Count   Frequency (%)
Hangul   3364    98.0%
ASCII    69      2.0%

Most frequent character per block

Hangul
ValueCountFrequency (%)
229
 
6.8%
139
 
4.1%
115
 
3.4%
114
 
3.4%
108
 
3.2%
95
 
2.8%
95
 
2.8%
95
 
2.8%
90
 
2.7%
83
 
2.5%
Other values (188) 2201
65.4%
ASCII
ValueCountFrequency (%)
- 39
56.5%
17
24.6%
, 4
 
5.8%
D 2
 
2.9%
A 2
 
2.9%
N 2
 
2.9%
I 1
 
1.4%
L 1
 
1.4%
U 1
 
1.4%

Distinct: 132
Distinct (%): 11.6%
Missing: 0
Missing (%): 0.0%
Memory size: 9.0 KiB