Overview

Dataset statistics

Number of variables: 12
Number of observations: 963
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 92.3 KiB
Average record size in memory: 98.1 B
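
The figures above can be cross-checked with a little arithmetic. The sketch below assumes that 1 KiB is 1024 bytes and that the average record size is simply total memory divided by the number of observations. Because the report rounds the total to 92.3 KiB, the result only agrees approximately.

```python
# Figures copied from the dataset statistics above.
n_variables = 12
n_observations = 963
missing_cells = 0
total_size_kib = 92.3  # rounded in the report, so results are approximate

# Every observation has one cell per variable.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: average record size = total bytes / number of observations.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```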

Variable types

Text: 3
Categorical: 5
Numeric: 1
DateTime: 3

Dataset

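Description: Solar power plant information for Seosan-si, Chungcheongnam-do. The fields are plant name (발전소명), installation method (설치방식), installation location (설치장소), permitted capacity (허가용량), energy source (에너지원), permit date (허가일), area (면적), and data reference date (데이터기준일자).
Author: 충청남도 (Chungcheongnam-do)
URL: https://alldam.chungnam.go.kr/index.chungnam?menuCd=DOM_000000201001001001&st=&cds=&orgCd=&apiType=&isOpen=Y&pageIndex=100&beforeMenuCd=DOM_000000201001001000&publicdatapk=15103282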

Alerts

영업구분 (business type) is highly imbalanced (95.2%) | Imbalance
원동력의종류 (type of prime mover) is highly imbalanced (98.8%) | Imbalance
공급전압(V) (supply voltage) is highly imbalanced (82.7%) | Imbalance
주파수(Hz) (frequency) is highly imbalanced (98.8%) | Imbalance
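
The percentages in these alerts are imbalance scores. The sketch below shows one common definition, one minus the Shannon entropy normalised by the maximum entropy for the number of classes. It is an assumption that this matches what the profiling tool computes, and `imbalance_score` is an illustrative name. It uses the 법인여부 counts (842 and 121) reported further down in this report.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy of the counts.

    0 means perfectly balanced classes; values near 1 mean one class dominates.
    """
    counts = [c for c in counts if c > 0]
    k = len(counts)
    if k <= 1:
        return 0.0  # a single class has no balance to measure
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts)
    return 1 - entropy / math.log2(k)


# Counts for 법인여부 taken from this report: 개인 842, 법인 121.
score = imbalance_score([842, 121])
print(f"imbalance: {score:.1%}")
```

Under this definition an 87.4% / 12.6% split scores well below the 82.7% to 98.8% reported for the flagged columns, which would be consistent with 법인여부 not appearing among the alerts.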

Reproduction

Analysis started: 2024-01-09 19:46:28.247325
Analysis finished: 2024-01-09 19:46:29.090940
Duration: 0.84 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
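
As a quick consistency check, the reported duration can be recomputed from the two timestamps above. This is a minimal sketch; the format string is an assumption about how the timestamps are laid out.

```python
from datetime import datetime

# Timestamp layout assumed from the values shown in the report.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2024-01-09 19:46:28.247325", FMT)
finished = datetime.strptime("2024-01-09 19:46:29.090940", FMT)

# Elapsed wall-clock time in seconds.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```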

Variables

상호 (business name)
Text

Distinct: 905
Distinct (%): 94.0%
Missing: 0
Missing (%): 0.0%
Memory size: 7.7 KiB

Length

Max length: 31
Median length: 26
Mean length: 10.895119
Min length: 2
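
Length statistics of this kind can be reproduced by measuring each value in characters. The sketch below is illustrative only: `length_summary` is a hypothetical helper, and it runs on the five sample rows shown later in this section, not on the full column. Python's `len` counts code points, so each precomposed Hangul syllable counts as one character. The last line checks that the reported mean equals total characters divided by observations (10492 / 963).

```python
import statistics


def length_summary(values):
    """Summarise string lengths, measured in Unicode code points."""
    lengths = [len(v) for v in values]
    return {
        "max": max(lengths),
        "median": statistics.median(lengths),
        "mean": statistics.fmean(lengths),
        "min": min(lengths),
    }


# The five sample rows listed for this variable (not the full column).
sample = [
    "주식회사 에스엘솔라",
    "학섭 태양광발전소",
    "용주에너지 태양광발전소",
    "지혜에너지 태양광발전소",
    "대경 태양광발전소",
]
summary = length_summary(sample)
print(summary)

# Cross-check of the reported mean: total characters / observations.
reported_mean = 10492 / 963
print(f"{reported_mean:.6f}")
```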

Characters and Unicode

Total characters: 10492
Distinct characters: 345
Distinct categories: 11
Distinct scripts: 3
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
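
To illustrate how characters map to the general categories listed in this report, the following sketch uses Python's standard `unicodedata` module. The long category names are spelled out by hand for a few codes only, and `category_counts` is an illustrative helper, not part of the profiling tool.

```python
import unicodedata
from collections import Counter

# Long names for a few general-category codes (not exhaustive).
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Zs": "Space Separator",
    "Nd": "Decimal Number",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
}


def category_counts(text):
    """Count characters in text by Unicode general category."""
    codes = Counter(unicodedata.category(ch) for ch in text)
    return {CATEGORY_NAMES.get(code, code): n for code, n in codes.items()}


# First sample row of this variable: nine Hangul syllables and one space.
counts = category_counts("주식회사 에스엘솔라")
print(counts)
```

Hangul syllables fall under Other Letter, which is why that category dominates the tables for this variable.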

Unique

Unique: 851
Unique (%): 88.4%
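
Distinct and Unique measure different things: Distinct counts how many different values occur, while Unique counts the values that occur exactly once. That is why Unique (851) is smaller than Distinct (905) here. A minimal sketch on a hypothetical toy column (the names below are made up for illustration):

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for n in counts.values() if n == 1)
    return distinct, unique


# Hypothetical toy data: one name is repeated, two appear once.
names = ["A 태양광발전소", "B 태양광발전소", "A 태양광발전소", "C 태양광발전소"]
distinct, unique = distinct_and_unique(names)
print(distinct, unique)
```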

Sample

1st row: 주식회사 에스엘솔라
2nd row: 학섭 태양광발전소
3rd row: 용주에너지 태양광발전소
4th row: 지혜에너지 태양광발전소
5th row: 대경 태양광발전소
Value | Count | Frequency (%)
태양광발전소 | 854 | 45.2%
발전소 | 23 | 1.2%
태양광 | 18 | 1.0%
서산 | 6 | 0.3%
주식회사 | 4 | 0.2%
해미1호 | 3 | 0.2%
해나라4호 | 3 | 0.2%
(not preserved) | 3 | 0.2%
기은 | 3 | 0.2%
운산1호 | 3 | 0.2%
Other values (909) | 970 | 51.3%
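
The table above counts whitespace-separated words, not whole values, and the percentages appear to be taken over all words: the counts sum to 1890, and 854 / 1890 is about 45.2%. The sketch below reproduces that kind of count on the five sample rows only. `word_frequencies` is an illustrative helper, and plain whitespace splitting is an assumption about the tokenisation.

```python
from collections import Counter


def word_frequencies(values):
    """Return (word, count, percent of all words) rows, most frequent first."""
    words = Counter(word for value in values for word in value.split())
    total = sum(words.values())
    return [(w, n, 100 * n / total) for w, n in words.most_common()]


# The five sample rows listed for this variable (not the full column).
sample = [
    "주식회사 에스엘솔라",
    "학섭 태양광발전소",
    "용주에너지 태양광발전소",
    "지혜에너지 태양광발전소",
    "대경 태양광발전소",
]
rows = word_frequencies(sample)
for word, count, pct in rows:
    print(f"{word} | {count} | {pct:.1f}%")

# Cross-check of the first table row: 854 occurrences out of 1890 words.
top_share = 100 * 854 / 1890
```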

Most occurring characters

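Value | Count | Frequency (%)
(not preserved) | 938 | 8.9%
(not preserved) | 937 | 8.9%
(space) | 932 | 8.9%
(not preserved) | 931 | 8.9%
(not preserved) | 931 | 8.9%
(not preserved) | 931 | 8.9%
(not preserved) | 906 | 8.6%
(not preserved) | 462 | 4.4%
1 | 205 | 2.0%
(not preserved) | 155 | 1.5%
Other values (335) | 3164 | 30.2%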

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 8680 | 82.7%
Space Separator | 932 | 8.9%
Decimal Number | 683 | 6.5%
Close Punctuation | 60 | 0.6%
Open Punctuation | 59 | 0.6%
Uppercase Letter | 46 | 0.4%
Lowercase Letter | 19 | 0.2%
Other Symbol | 6 | 0.1%
Letter Number | 3 | < 0.1%
Dash Punctuation | 2 | < 0.1%

Most frequent character per category

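Other Letter
Value | Count | Frequency (%)
(not preserved) | 938 | 10.8%
(not preserved) | 937 | 10.8%
(not preserved) | 931 | 10.7%
(not preserved) | 931 | 10.7%
(not preserved) | 931 | 10.7%
(not preserved) | 906 | 10.4%
(not preserved) | 462 | 5.3%
(not preserved) | 155 | 1.8%
(not preserved) | 87 | 1.0%
(not preserved) | 80 | 0.9%
Other values (286) | 2322 | 26.8%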
Uppercase Letter
Value | Count | Frequency (%)
G | 4 | 8.7%
E | 4 | 8.7%
C | 4 | 8.7%
N | 4 | 8.7%
P | 4 | 8.7%
S | 4 | 8.7%
J | 3 | 6.5%
M | 3 | 6.5%
R | 2 | 4.3%
W | 2 | 4.3%
Other values (10) | 12 | 26.1%
Decimal Number
Value | Count | Frequency (%)
1 | 205 | 30.0%
2 | 134 | 19.6%
3 | 88 | 12.9%
4 | 72 | 10.5%
5 | 51 | 7.5%
6 | 39 | 5.7%
7 | 29 | 4.2%
9 | 22 | 3.2%
0 | 22 | 3.2%
8 | 21 | 3.1%
Lowercase Letter
Value | Count | Frequency (%)
n | 4 | 21.1%
o | 4 | 21.1%
r | 2 | 10.5%
s | 2 | 10.5%
u | 2 | 10.5%
t | 1 | 5.3%
e | 1 | 5.3%
y | 1 | 5.3%
a | 1 | 5.3%
l | 1 | 5.3%
Letter Number
Value | Count | Frequency (%)
(not preserved) | 2 | 66.7%
(not preserved) | 1 | 33.3%
Other Punctuation
Value | Count | Frequency (%)
. | 1 | 50.0%
& | 1 | 50.0%
Space Separator
Value | Count | Frequency (%)
(space) | 932 | 100.0%
Close Punctuation
Value | Count | Frequency (%)
) | 60 | 100.0%
Open Punctuation
Value | Count | Frequency (%)
( | 59 | 100.0%
Other Symbol
Value | Count | Frequency (%)
(not preserved) | 6 | 100.0%
Dash Punctuation
Value | Count | Frequency (%)
- | 2 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 8686 | 82.8%
Common | 1738 | 16.6%
Latin | 68 | 0.6%

Most frequent character per script

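Hangul
Value | Count | Frequency (%)
(not preserved) | 938 | 10.8%
(not preserved) | 937 | 10.8%
(not preserved) | 931 | 10.7%
(not preserved) | 931 | 10.7%
(not preserved) | 931 | 10.7%
(not preserved) | 906 | 10.4%
(not preserved) | 462 | 5.3%
(not preserved) | 155 | 1.8%
(not preserved) | 87 | 1.0%
(not preserved) | 80 | 0.9%
Other values (287) | 2328 | 26.8%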
Latin
Value | Count | Frequency (%)
G | 4 | 5.9%
E | 4 | 5.9%
n | 4 | 5.9%
C | 4 | 5.9%
N | 4 | 5.9%
P | 4 | 5.9%
o | 4 | 5.9%
S | 4 | 5.9%
J | 3 | 4.4%
M | 3 | 4.4%
Other values (22) | 30 | 44.1%
Common
Value | Count | Frequency (%)
(space) | 932 | 53.6%
1 | 205 | 11.8%
2 | 134 | 7.7%
3 | 88 | 5.1%
4 | 72 | 4.1%
) | 60 | 3.5%
( | 59 | 3.4%
5 | 51 | 2.9%
6 | 39 | 2.2%
7 | 29 | 1.7%
Other values (6) | 69 | 4.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 8680 | 82.7%
ASCII | 1803 | 17.2%
None | 6 | 0.1%
Number Forms | 3 | < 0.1%

Most frequent character per block

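Hangul
Value | Count | Frequency (%)
(not preserved) | 938 | 10.8%
(not preserved) | 937 | 10.8%
(not preserved) | 931 | 10.7%
(not preserved) | 931 | 10.7%
(not preserved) | 931 | 10.7%
(not preserved) | 906 | 10.4%
(not preserved) | 462 | 5.3%
(not preserved) | 155 | 1.8%
(not preserved) | 87 | 1.0%
(not preserved) | 80 | 0.9%
Other values (286) | 2322 | 26.8%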
ASCII
Value | Count | Frequency (%)
(space) | 932 | 51.7%
1 | 205 | 11.4%
2 | 134 | 7.4%
3 | 88 | 4.9%
4 | 72 | 4.0%
) | 60 | 3.3%
( | 59 | 3.3%
5 | 51 | 2.8%
6 | 39 | 2.2%
7 | 29 | 1.6%
Other values (36) | 134 | 7.4%
None
Value | Count | Frequency (%)
(not preserved) | 6 | 100.0%
Number Forms
Value | Count | Frequency (%)
(not preserved) | 2 | 66.7%
(not preserved) | 1 | 33.3%
Distinct: 762
Distinct (%): 79.1%
Missing: 0
Missing (%): 0.0%
Memory size: 7.7 KiB

Length

Max length: 85
Median length: 70
Mean length: 25.13811
Min length: 15

Characters and Unicode

Total characters: 24208
Distinct characters: 185
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 687
Unique (%): 71.3%

Sample

1st row: 충청남도 서산시 수석산업로 93(수석동)
2nd row: 충청남도 서산시 고북면 정자2길 93
3rd row: 충청남도 서산시 성연면 명천리 888, 888-1, 889, 890-2, 893-1
4th row: 충청남도 서산시 성연면 명천리 896, 898, 900, 901
5th row: 충청남도 서산시 대산읍 운산리 890-24
Value | Count | Frequency (%)
서산시 | 964 | 18.3%
충청남도 | 963 | 18.2%
대산읍 | 160 | 3.0%
부석면 | 158 | 3.0%
지곡면 | 118 | 2.2%
고북면 | 107 | 2.0%
팔봉면 | 88 | 1.7%
장동 | 81 | 1.5%
음암면 | 69 | 1.3%
해미면 | 65 | 1.2%
Other values (1120) | 2509 | 47.5%

Most occurring characters

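Value | Count | Frequency (%)
(space) | 4322 | 17.9%
(not preserved) | 1347 | 5.6%
1 | 1044 | 4.3%
(not preserved) | 1006 | 4.2%
- | 993 | 4.1%
(not preserved) | 979 | 4.0%
(not preserved) | 969 | 4.0%
(not preserved) | 968 | 4.0%
(not preserved) | 964 | 4.0%
(not preserved) | 963 | 4.0%
Other values (175) | 10653 | 44.0%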

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 12858 | 53.1%
Decimal Number | 5549 | 22.9%
Space Separator | 4322 | 17.9%
Dash Punctuation | 993 | 4.1%
Other Punctuation | 400 | 1.7%
Open Punctuation | 43 | 0.2%
Close Punctuation | 43 | 0.2%

Most frequent character per category

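Other Letter
Value | Count | Frequency (%)
(not preserved) | 1347 | 10.5%
(not preserved) | 1006 | 7.8%
(not preserved) | 979 | 7.6%
(not preserved) | 969 | 7.5%
(not preserved) | 968 | 7.5%
(not preserved) | 964 | 7.5%
(not preserved) | 963 | 7.5%
(not preserved) | 748 | 5.8%
(not preserved) | 676 | 5.3%
(not preserved) | 393 | 3.1%
Other values (158) | 3845 | 29.9%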
Decimal Number
Value | Count | Frequency (%)
1 | 1044 | 18.8%
4 | 652 | 11.7%
2 | 633 | 11.4%
8 | 588 | 10.6%
3 | 488 | 8.8%
5 | 477 | 8.6%
9 | 464 | 8.4%
7 | 438 | 7.9%
6 | 423 | 7.6%
0 | 342 | 6.2%
Other Punctuation
Value | Count | Frequency (%)
, | 386 | 96.5%
* | 11 | 2.8%
. | 3 | 0.8%
Space Separator
Value | Count | Frequency (%)
(space) | 4322 | 100.0%
Dash Punctuation
Value | Count | Frequency (%)
- | 993 | 100.0%
Open Punctuation
Value | Count | Frequency (%)
( | 43 | 100.0%
Close Punctuation
Value | Count | Frequency (%)
) | 43 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 12858 | 53.1%
Common | 11350 | 46.9%

Most frequent character per script

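Hangul
Value | Count | Frequency (%)
(not preserved) | 1347 | 10.5%
(not preserved) | 1006 | 7.8%
(not preserved) | 979 | 7.6%
(not preserved) | 969 | 7.5%
(not preserved) | 968 | 7.5%
(not preserved) | 964 | 7.5%
(not preserved) | 963 | 7.5%
(not preserved) | 748 | 5.8%
(not preserved) | 676 | 5.3%
(not preserved) | 393 | 3.1%
Other values (158) | 3845 | 29.9%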
Common
Value | Count | Frequency (%)
(space) | 4322 | 38.1%
1 | 1044 | 9.2%
- | 993 | 8.7%
4 | 652 | 5.7%
2 | 633 | 5.6%
8 | 588 | 5.2%
3 | 488 | 4.3%
5 | 477 | 4.2%
9 | 464 | 4.1%
7 | 438 | 3.9%
Other values (7) | 1251 | 11.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 12858 | 53.1%
ASCII | 11350 | 46.9%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 4322 | 38.1%
1 | 1044 | 9.2%
- | 993 | 8.7%
4 | 652 | 5.7%
2 | 633 | 5.6%
8 | 588 | 5.2%
3 | 488 | 4.3%
5 | 477 | 4.2%
9 | 464 | 4.1%
7 | 438 | 3.9%
Other values (7) | 1251 | 11.0%
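Hangul
Value | Count | Frequency (%)
(not preserved) | 1347 | 10.5%
(not preserved) | 1006 | 7.8%
(not preserved) | 979 | 7.6%
(not preserved) | 969 | 7.5%
(not preserved) | 968 | 7.5%
(not preserved) | 964 | 7.5%
(not preserved) | 963 | 7.5%
(not preserved) | 748 | 5.8%
(not preserved) | 676 | 5.3%
(not preserved) | 393 | 3.1%
Other values (158) | 3845 | 29.9%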

법인여부 (corporation status: 개인 = individual, 법인 = corporation)
Categorical

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 7.7 KiB
개인: 842
법인: 121

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 법인
2nd row: 개인
3rd row: 개인
4th row: 개인
5th row: 개인

Common Values

Value | Count | Frequency (%)
개인 | 842 | 87.4%
법인 | 121 | 12.6%

Length

Histogram of lengths of the category

Common Values (Plot)

Distinct: 849
Distinct (%): 88.2%
Missing: 0
Missing (%): 0.0%
Memory size: 7.7 KiB