Overview

Dataset statistics

Number of variables: 5
Number of observations: 5610
Missing cells: 3
Missing cells (%): < 0.1%
Duplicate rows: 2
Duplicate rows (%): < 0.1%
Total size in memory: 224.7 KiB
Average record size in memory: 41.0 B
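
The derived figures in this block can be cross-checked from the raw counts. The sketch below is only a sanity check, assuming the percentages are taken over all cells (rows times columns) and over all rows, and that 1 KiB is 1024 bytes; the variable names are illustrative and not part of the report.

```python
# Figures copied from the "Dataset statistics" block.
n_variables = 5
n_observations = 5610
missing_cells = 3
duplicate_rows = 2
total_size_kib = 224.7

# Assumption: missing cells are measured against all cells (rows x columns),
# duplicate rows against all rows.
missing_pct = 100 * missing_cells / (n_variables * n_observations)
duplicate_pct = 100 * duplicate_rows / n_observations

# Assumption: 1 KiB = 1024 bytes, averaged over the number of rows.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"missing cells:       {missing_pct:.4f}%")
print(f"duplicate rows:      {duplicate_pct:.4f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```

Both percentages come out well below 0.1%, which is why the report prints them as "< 0.1%".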

Variable types

Text: 2
Numeric: 1
DateTime: 2

Dataset

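Description: Data on the status of solar power plants in Chungcheongbuk-do (North Chungcheong Province), providing the plant name, installation location, installed capacity, permit date, business start date, and related fields.
URL: https://www.data.go.kr/data/15043363/fileData.do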

Alerts

Dataset has 2 (< 0.1%) duplicate rows (alert type: Duplicates)

Reproduction

Analysis started: 2023-12-12 00:04:29.413398
Analysis finished: 2023-12-12 00:04:30.832575
Duration: 1.42 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
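
The reported duration follows directly from the two timestamps. A minimal check using only the standard library (the format string is an assumption about how the timestamps are written):

```python
from datetime import datetime

# Assumed timestamp layout: date, time, microseconds.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2023-12-12 00:04:29.413398", FMT)
finished = datetime.strptime("2023-12-12 00:04:30.832575", FMT)

# Elapsed wall-clock time between the two report timestamps.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```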

Variables

Distinct: 5273
Distinct (%): 94.0%
Missing: 0
Missing (%): 0.0%
Memory size: 44.0 KiB

Length

Max length: 31
Median length: 29
Mean length: 10.415686
Min length: 1

Characters and Unicode

Total characters: 58432
Distinct characters: 612
Distinct categories: 12
Distinct scripts: 3
Distinct blocks: 5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
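
As a rough illustration of how such character properties can be read, the sketch below tallies Unicode general categories with Python's standard unicodedata module. It is not necessarily how the report computes its figures; the standard library exposes the general category but not the script or block, so only categories are shown. The strings are three of the sample values listed for this variable.

```python
import unicodedata
from collections import Counter

# Three sample values from this variable's "Sample" rows.
samples = [
    "행복발전소",
    "STAR태양광발전소",
    "하청태양광발전소(하청태양광발전영농조합)",
]

# unicodedata.category returns the two-letter general category, e.g.
# "Lo" (Other Letter), "Lu" (Uppercase Letter), "Ps" (Open Punctuation),
# "Pe" (Close Punctuation).
category_counts = Counter(
    unicodedata.category(ch) for value in samples for ch in value
)
print(category_counts)
```

Hangul syllables fall under "Lo" (Other Letter), which is consistent with that category dominating the tables for this variable.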

Unique

Unique: 5023
Unique (%): 89.5%
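
"Distinct" and "Unique" are easy to confuse. The sketch below assumes the usual reading: distinct counts the different values, while unique counts the values that occur exactly once. The toy list is hypothetical; the two percentages are recomputed from the counts reported for this variable and the 5610 observations.

```python
from collections import Counter

# Hypothetical toy values standing in for the column.
values = ["A", "B", "B", "C", "C", "C", "D"]
counts = Counter(values)

n_distinct = len(counts)                              # different values: A, B, C, D
n_unique = sum(1 for c in counts.values() if c == 1)  # seen exactly once: A, D

# Cross-check of the reported percentages (5273 distinct, 5023 unique).
distinct_pct = 100 * 5273 / 5610
unique_pct = 100 * 5023 / 5610
print(n_distinct, n_unique, f"{distinct_pct:.1f}%", f"{unique_pct:.1f}%")
```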

Sample

1st row: 행복발전소
2nd row: 문백태양광발전소
3rd row: 하청태양광발전소(하청태양광발전영농조합)
4th row: STAR태양광발전소
5th row: 지혜태양광발전소
Value                     Count   Frequency (%)
태양광발전소                4034    38.7%
발전소                      100     1.0%
태양광                      57      0.5%
2호                        49      0.5%
1호                        37      0.4%
주식회사                    29      0.3%
3호                        26      0.2%
충북태양광사업협동조합        23      0.2%
마을회                      21      0.2%
협동조합                    18      0.2%
Other values (5148)       6025    57.8%
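
The counts in this table look like word frequencies rather than whole-value frequencies. The sketch below assumes whitespace tokenisation, which is an inference from the numbers and not something the report states. The names in it are hypothetical; the shares are recomputed from the counts in the table.

```python
from collections import Counter

# Hypothetical names (not taken from the dataset) to show the tokenisation.
names = ["가나 태양광발전소", "다라 태양광발전소 2호", "마바발전소"]
word_counts = Counter(word for name in names for word in name.split())

# Counts copied from the table; shares are recomputed from their total.
reported = {"태양광발전소": 4034, "발전소": 100, "Other values (5148)": 6025}
total_words = 4034 + 100 + 57 + 49 + 37 + 29 + 26 + 23 + 21 + 18 + 6025
shares = {k: round(100 * v / total_words, 1) for k, v in reported.items()}

print(word_counts.most_common(1))
print(total_words, shares)
```

The recomputed shares agree with the percentages shown in the table.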

Most occurring characters

ValueCountFrequency (%)
5315
 
9.1%
5263
 
9.0%
5255
 
9.0%
5238
 
9.0%
5231
 
9.0%
5207
 
8.9%
4824
 
8.3%
2263
 
3.9%
1 984
 
1.7%
2 945
 
1.6%
Other values (602) 17907
30.6%

Most occurring categories

Value               Count   Frequency (%)
Other Letter        49747   85.1%
Space Separator     4824    8.3%
Decimal Number      3091    5.3%
Uppercase Letter    293     0.5%
Open Punctuation    151     0.3%
Close Punctuation   151     0.3%
Lowercase Letter    70      0.1%
Dash Punctuation    37      0.1%
Other Symbol        35      0.1%
Other Number        16      < 0.1%
Other values (2)    17      < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
5315
 
10.7%
5263
 
10.6%
5255
 
10.6%
5238
 
10.5%
5231
 
10.5%
5207
 
10.5%
2263
 
4.5%
399
 
0.8%
388
 
0.8%
332
 
0.7%
Other values (531) 14856
29.9%
Uppercase Letter
ValueCountFrequency (%)
S 44
15.0%
J 33
11.3%
H 26
 
8.9%
K 25
 
8.5%
Y 18
 
6.1%
C 16
 
5.5%
B 16
 
5.5%
M 15
 
5.1%
G 15
 
5.1%
D 11
 
3.8%
Other values (14) 74
25.3%
Lowercase Letter
ValueCountFrequency (%)
n 9
12.9%
i 8
11.4%
e 8
11.4%
o 6
 
8.6%
s 5
 
7.1%
u 4
 
5.7%
a 4
 
5.7%
g 4
 
5.7%
l 3
 
4.3%
r 3
 
4.3%
Other values (10) 16
22.9%
Decimal Number
ValueCountFrequency (%)
1 984
31.8%
2 945
30.6%
3 445
14.4%
4 227
 
7.3%
5 158
 
5.1%
6 103
 
3.3%
7 72
 
2.3%
0 71
 
2.3%
8 46
 
1.5%
9 40
 
1.3%
Other Number
ValueCountFrequency (%)
5
31.2%
3
18.8%
3
18.8%
2
 
12.5%
2
 
12.5%
1
 
6.2%
Other Punctuation
ValueCountFrequency (%)
& 7
63.6%
, 3
27.3%
. 1
 
9.1%
Letter Number
ValueCountFrequency (%)
3
50.0%
2
33.3%
1
 
16.7%
Space Separator
ValueCountFrequency (%)
4824
100.0%
Open Punctuation
ValueCountFrequency (%)
( 151
100.0%
Close Punctuation
ValueCountFrequency (%)
) 151
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 37
100.0%
Other Symbol
ValueCountFrequency (%)
35
100.0%

Most occurring scripts

Value    Count   Frequency (%)
Hangul   49782   85.2%
Common   8281    14.2%
Latin    369     0.6%

Most frequent character per script

Hangul
ValueCountFrequency (%)
5315
 
10.7%
5263
 
10.6%
5255
 
10.6%
5238
 
10.5%
5231
 
10.5%
5207
 
10.5%
2263
 
4.5%
399
 
0.8%
388
 
0.8%
332
 
0.7%
Other values (532) 14891
29.9%
Latin
ValueCountFrequency (%)
S 44
 
11.9%
J 33
 
8.9%
H 26
 
7.0%
K 25
 
6.8%
Y 18
 
4.9%
C 16
 
4.3%
B 16
 
4.3%
M 15
 
4.1%
G 15
 
4.1%
D 11
 
3.0%
Other values (37) 150
40.7%
Common
ValueCountFrequency (%)
4824
58.3%
1 984
 
11.9%
2 945
 
11.4%
3 445
 
5.4%
4 227
 
2.7%
5 158
 
1.9%
( 151
 
1.8%
) 151
 
1.8%
6 103
 
1.2%
7 72
 
0.9%
Other values (13) 221
 
2.7%

Most occurring blocks

Value               Count   Frequency (%)
Hangul              49747   85.1%
ASCII               8628    14.8%
None                35      0.1%
Enclosed Alphanum   16      < 0.1%
Number Forms        6       < 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
5315
 
10.7%
5263
 
10.6%
5255
 
10.6%
5238
 
10.5%
5231
 
10.5%
5207
 
10.5%
2263
 
4.5%
399
 
0.8%
388
 
0.8%
332
 
0.7%
Other values (531) 14856
29.9%
ASCII
ValueCountFrequency (%)
4824
55.9%
1 984
 
11.4%
2 945
 
11.0%
3 445
 
5.2%
4 227
 
2.6%
5 158
 
1.8%
( 151
 
1.8%
) 151
 
1.8%
6 103
 
1.2%
7 72
 
0.8%
Other values (51) 568
 
6.6%
None
ValueCountFrequency (%)
35
100.0%
Enclosed Alphanum
ValueCountFrequency (%)
5
31.2%
3
18.8%
3
18.8%
2
 
12.5%
2
 
12.5%
1
 
6.2%
Number Forms
ValueCountFrequency (%)
3
50.0%
2
33.3%
1
 
16.7%
Distinct: 4175
Distinct (%): 74.4%
Missing: 0
Missing (%): 0.0%
Memory size: 44.0 KiB

Length

Max length: 122
Median length: 84
Mean length: 25.298217
Min length: 1

Characters and Unicode

Total characters: 141923
Distinct characters: 415
Distinct categories: 10
Distinct scripts: 4
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3504
Unique (%): 62.5%

Sample

1st row: 충북 음성군 원남면 보천리486-6
2nd row: 충청북도 진천군 문백면 구곡리 산9
3rd row: 충청북도 충주시 소태면 중청리 565-2
4th row: 진천군 초평면 은암리 290-1
5th row: 충청북도 진천군 초평면 은암리 433-27
Value                 Count   Frequency (%)
충청북도               4083    13.1%
충북                   899     2.9%
옥천군                 816     2.6%
음성군                 730     2.3%
충주시                 725     2.3%
보은군                 648     2.1%
청주시                 556     1.8%
영동군                 464     1.5%
괴산군                 444     1.4%
진천군                 425     1.4%
Other values (5929)   21412   68.6%

Most occurring characters

ValueCountFrequency (%)
25935
 
18.3%
5750
 
4.1%
1 5504
 
3.9%
5374
 
3.8%
5216
 
3.7%
4269
 
3.0%
4032
 
2.8%
- 3923
 
2.8%
3899
 
2.7%
3815
 
2.7%
Other values (405) 74206
52.3%

Most occurring categories

Value               Count   Frequency (%)
Other Letter        79063   55.7%
Decimal Number      27116   19.1%
Space Separator     25935   18.3%
Dash Punctuation    3923    2.8%
Other Punctuation   2600    1.8%
Open Punctuation    1631    1.1%
Close Punctuation   1630    1.1%
Uppercase Letter    20      < 0.1%
Math Symbol         4       < 0.1%
Other Symbol        1       < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
5750
 
7.3%
5374
 
6.8%
5216
 
6.6%
4269
 
5.4%
4032
 
5.1%
3899
 
4.9%
3815
 
4.8%
2159
 
2.7%
1945
 
2.5%
1641
 
2.1%
Other values (373) 40963
51.8%
Decimal Number
ValueCountFrequency (%)
1 5504
20.3%
2 3553
13.1%
3 3326
12.3%
4 2696
9.9%
5 2521
9.3%
6 2442
9.0%
7 1937
 
7.1%
9 1771
 
6.5%
8 1755
 
6.5%
0 1611
 
5.9%
Uppercase Letter
ValueCountFrequency (%)
B 6
30.0%
C 3
15.0%
U 2
 
10.0%
D 2
 
10.0%
G 2
 
10.0%
A 2
 
10.0%
K 1
 
5.0%
F 1
 
5.0%
E 1
 
5.0%
Other Punctuation
ValueCountFrequency (%)
, 2558
98.4%
/ 35
 
1.3%
. 5
 
0.2%
: 2
 
0.1%
Open Punctuation
ValueCountFrequency (%)
( 1548
94.9%
[ 83
 
5.1%
Close Punctuation
ValueCountFrequency (%)
) 1547
94.9%
] 83
 
5.1%
Math Symbol
ValueCountFrequency (%)
+ 2
50.0%
~ 2
50.0%
Space Separator
ValueCountFrequency (%)
25935
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 3923
100.0%
Other Symbol
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

Value    Count   Frequency (%)
Hangul   79054   55.7%
Common   62839   44.3%
Latin    20      < 0.1%
Han      10      < 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
5750
 
7.3%
5374
 
6.8%
5216
 
6.6%
4269
 
5.4%
4032
 
5.1%
3899
 
4.9%
3815
 
4.8%
2159
 
2.7%
1945
 
2.5%
1641
 
2.1%
Other values (370) 40954
51.8%
Common
ValueCountFrequency (%)
25935
41.3%
1 5504
 
8.8%
- 3923
 
6.2%
2 3553
 
5.7%
3 3326
 
5.3%
4 2696
 
4.3%
, 2558
 
4.1%
5 2521
 
4.0%
6 2442
 
3.9%
7 1937
 
3.1%
Other values (12) 8444
 
13.4%
Latin
ValueCountFrequency (%)
B 6
30.0%
C 3
15.0%
U 2
 
10.0%
D 2
 
10.0%
G 2
 
10.0%
A 2
 
10.0%
K 1
 
5.0%
F 1
 
5.0%
E 1
 
5.0%
Han
ValueCountFrequency (%)
4
40.0%
4
40.0%
1
 
10.0%
1
 
10.0%

Most occurring blocks

Value    Count   Frequency (%)
Hangul   79053   55.7%
ASCII    62859   44.3%
CJK      10      < 0.1%
None     1       < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
25935
41.3%
1 5504
 
8.8%
- 3923
 
6.2%
2 3553
 
5.7%
3 3326
 
5.3%
4 2696
 
4.3%
, 2558
 
4.1%
5 2521
 
4.0%
6 2442
 
3.9%
7 1937
 
3.1%
Other values (21) 8464
 
13.5%
Hangul
ValueCountFrequency (%)
5750
 
7.3%
5374
 
6.8%
5216
 
6.6%
4269
 
5.4%
4032
 
5.1%
3899
 
4.9%
3815
 
4.8%
2159
 
2.7%
1945
 
2.5%
1641
 
2.1%
Other values (369) 40953
51.8%
CJK
ValueCountFrequency (%)
4
40.0%
4
40.0%
1
 
10.0%
1
 
10.0%
None
ValueCountFrequency (%)
1
100.0%

설비용량(kW)
Real number (ℝ)

Distinct: 1392
Distinct (%): 24.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 150.28324
Minimum: 7
Maximum: 3000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 49.4 KiB

Quantile statistics

Minimum: 7
5th percentile: 19
Q1: 57.195
Median: 99
Q3: 99.72
95th percentile: 498.96
Maximum: 3000
Range: 2993
Interquartile range (IQR): 42.525
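
The last two figures in this block are derived from the others: the range is the maximum minus the minimum, and the IQR is Q3 minus Q1. A short check with the reported values (variable names are illustrative):

```python
# Quantile figures copied from the block above (unit: kW).
minimum, q1, q3, maximum = 7, 57.195, 99.72, 3000

value_range = maximum - minimum  # spread between the extremes
iqr = q3 - q1                    # spread of the middle half of the data

print(f"range = {value_range}, IQR = {iqr:.3f}")
```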

Descriptive statistics

Standard deviation: 289.63409
Coefficient of variation (CV): 1.9272548
Kurtosis: 38.627206
Mean: 150.28324
Median Absolute Deviation (MAD): 2.52
Skewness: 5.6798792
Sum: 843088.95
Variance: 83887.904
Monotonicity: Not monotonic
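
Several of these statistics are tied together by simple identities: CV is the standard deviation divided by the mean, the variance is the squared standard deviation, and the sum is the mean times the number of observations. The sketch below recomputes them from the rounded figures in this report, so small differences in the last digits are expected.

```python
# Figures copied from this report (capacity in kW, 5610 observations).
n = 5610
mean = 150.28324
std = 289.63409

cv = std / mean      # coefficient of variation
variance = std ** 2  # variance as squared standard deviation
total = mean * n     # sum recovered from the mean

print(f"CV       = {cv:.7f}")
print(f"variance = {variance:.3f}")
print(f"sum      = {total:.2f}")
```

A CV close to 2, together with the large positive skewness, indicates a long right tail: most plants sit near or below 100 kW while a few reach about 3000 kW.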
Histogram with fixed size bins (bins=50)
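
For readers unfamiliar with fixed-size binning, here is a minimal sketch of the idea: the span between the minimum and maximum is cut into equally wide bins and each value is counted in the bin it falls into. The function name and the sample capacities are hypothetical; only the range (7 to 3000 kW) and the bin count (50) come from the report, and the report's own plotting code may handle bin edges differently.

```python
def fixed_width_histogram(values, bins):
    """Count values in `bins` equally wide bins (assumes max > min)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # The maximum lies on the right edge, so clamp it into the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return width, counts

# With the reported range and 50 bins, each bin spans about 59.86 kW.
bin_width = (3000 - 7) / 50

# Hypothetical capacities, chosen only to exercise the function.
width, counts = fixed_width_histogram([7, 19.8, 99.0, 99.9, 498.96, 3000], bins=50)
print(bin_width, counts[0], counts[1], counts[8], counts[49])
```

With bins this wide, everything from 7 kW up to roughly 67 kW lands in the first bin, so fine structure around the common values near 99 kW is not visible in such a plot.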
Value                 Count   Frequency (%)
99.0                  414     7.4%
99.9                  238     4.2%
99.6                  205     3.7%
99.45                 140     2.5%
19.8                  123     2.2%
99.36                 122     2.2%
99.84                 121     2.2%
99.96                 116     2.1%
97.92                 115     2.0%
99.2                  103     1.8%
Other values (1382)   3913    69.8%
Minimum 10 values
Value   Count   Frequency (%)
7.0     1       < 0.1%
8.72    1       < 0.1%
9.0     7       0.1%
9.3     3       0.1%
9.57    1       < 0.1%
9.6     1       < 0.1%
9.7     2       < 0.1%
9.72    1       < 0.1%
9.9     1       < 0.1%
9.99    10      0.2%
Maximum 10 values
Value     Count   Frequency (%)
3000.0    1       < 0.1%
2997.0    1       < 0.1%
2995.47   1       < 0.1%
2994.84   1       < 0.1%
2990.0    1       < 0.1%
2988.6    1       < 0.1%
2984.85   2       < 0.1%
2965.92   1       < 0.1%
2821.09   1       < 0.1%
2812.32   1       < 0.1%
Distinct: 1467
Distinct (%): 26.1%
Missing: 0
Missing (%): 0.0%
Memory size: 44.0 KiB
Minimum: 2006-09-06 00:00:00
Maximum: 2023-02-08 00:00:00
Histogram with fixed size bins (bins=50)
Distinct: 1568
Distinct (%): 28.0%
Missing: 3
Missing (%): 0.1%
Memory size: 44.0 KiB
Minimum: 2007-11-23 00:00:00
Maximum: 2023-04-17 00:00:00
Histogram with fixed size bins (bins=50)

Interactions

[Figure: interactions plot]

Missing values

A simple visualization of nullity by column.
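
A nullity-by-column view amounts to counting, for each column, how many cells are present and how many are missing. The sketch below does this for a few hypothetical rows using plain Python; the column names and values are illustrative and are not taken from the dataset.

```python
# Hypothetical rows; None marks a missing cell.
rows = [
    {"name": "A", "capacity_kw": 99.0, "start_date": "2020-01-02"},
    {"name": "B", "capacity_kw": 19.8, "start_date": None},
    {"name": "C", "capacity_kw": 99.9, "start_date": None},
]

columns = list(rows[0])
missing_per_column = {c: sum(r[c] is None for r in rows) for c in columns}
present_per_column = {c: len(rows) - missing_per_column[c] for c in columns}

# Text rendering: one bar per column, one '#' per present cell.
for c in columns:
    print(f"{c:<12} {'#' * present_per_column[c]} ({present_per_column[c]}/{len(rows)})")
```

In this report the only column with missing values is the second DateTime variable, which has 3 missing cells out of 5610.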