Overview

Dataset statistics

Number of variables: 8
Number of observations: 604
Missing cells: 597
Missing cells (%): 12.4%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 37.9 KiB
Average record size in memory: 64.2 B
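The missing-cell percentage above can be reproduced from the other figures in this block. The following is a minimal sketch, assuming the percentage is the number of missing cells divided by the total cell count (variables × observations). The helper name `missing_cell_pct` is hypothetical and is not part of ydata-profiling.

```python
def missing_cell_pct(missing_cells: int, n_variables: int, n_observations: int) -> float:
    """Return missing cells as a percentage of all cells in the table."""
    total_cells = n_variables * n_observations
    return 100 * missing_cells / total_cells


# Figures taken from the Dataset statistics block of this report.
pct = missing_cell_pct(missing_cells=597, n_variables=8, n_observations=604)
print(f"{pct:.1f}%")  # 12.4%
```

With 8 variables and 604 observations there are 4832 cells in total, so 597 missing cells round to the 12.4% shown in the report.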

Variable types

Text: 6
Categorical: 1
DateTime: 1

Dataset

Description: Information on manufacturers in Geumsan-gun, Chungcheongnam-do, including company name, representative name, company address, company phone number, and fax number.
Author: Chungcheongnam-do (충청남도)
URL: https://alldam.chungnam.go.kr/index.chungnam?menuCd=DOM_000000201001001001&st=&cds=&orgCd=&apiType=&isOpen=Y&pageIndex=395&beforeMenuCd=DOM_000000201001001000&publicdatapk=15034906

Alerts

데이터기준일자 (data reference date) has constant value "" [Constant]
전화번호 (phone number) has 128 (21.2%) missing values [Missing]
팩스번호 (fax number) has 468 (77.5%) missing values [Missing]

Reproduction

Analysis started: 2024-01-09 22:58:20.794239
Analysis finished: 2024-01-09 22:58:21.749883
Duration: 0.96 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
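The reported duration follows directly from the two timestamps above. The following is a small sketch using only the Python standard library; the format string is an assumption about how the timestamps are written in this report.

```python
from datetime import datetime

# Assumed timestamp layout, matching the two lines printed in the report.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2024-01-09 22:58:20.794239", FMT)
finished = datetime.strptime("2024-01-09 22:58:21.749883", FMT)

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")  # 0.96 seconds
```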

Variables

Distinct: 593
Distinct (%): 98.2%
Missing: 0
Missing (%): 0.0%
Memory size: 4.8 KiB

Length

Max length: 20
Median length: 17
Mean length: 7.3874172
Min length: 2

Characters and Unicode

Total characters: 4462
Distinct characters: 374
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
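As an illustration of these character properties, the general category of each code point can be looked up with Python's standard `unicodedata` module. This is a sketch, not the profiler's own implementation: the function name `category_counts` is hypothetical, and the inputs are three values from this variable's Sample rows. The two-letter codes map to the labels used in this report (`Lo` is Other Letter, `Lu` is Uppercase Letter, `Ps` is Open Punctuation, `Pe` is Close Punctuation). The standard library does not expose scripts or blocks, so those breakdowns are not covered here.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over every character in `values`."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


# Three values from the Sample rows of this variable.
samples = ["(유)다함", "(주)BDC", "(주)EG"]
counts = category_counts(samples)
print(dict(counts))
```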

Unique

Unique: 582
Unique (%): 96.4%
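Distinct and Unique measure different things: a distinct count is the number of different values, while a unique count is the number of values that occur exactly once. The sketch below illustrates that reading on toy data; the function name `distinct_and_unique` and the input list are hypothetical and are not taken from the dataset.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for n in counts.values() if n == 1)
    return distinct, unique


# Toy data: "a" appears once, "b" twice, "c" three times.
toy = ["a", "b", "b", "c", "c", "c"]
print(distinct_and_unique(toy))  # (3, 1)
```

Under this reading, 593 distinct values with 582 unique ones would mean that 11 values appear more than once.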

Sample

1st row: (사)충청남도 장애인부모회 금산지회
2nd row: (유)다함
3rd row: (주)BDC
4th row: (주)EG
5th row: (주)갈산
Value | Count | Frequency (%)
농업회사법인 | 15 | 2.3%
제2공장 | 10 | 1.5%
주)휴온스네이처 | 4 | 0.6%
영농조합법인 | 4 | 0.6%
금산공장 | 4 | 0.6%
중부대학교 | 3 | 0.5%
제1공장 | 3 | 0.5%
동아산업 | 2 | 0.3%
주)대한홍삼진흥공사 | 2 | 0.3%
금산지점 | 2 | 0.3%
Other values (592) | 614 | 92.6%

Most occurring characters

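Value | Count | Frequency (%)
(not shown) | 357 | 8.0%
( | 350 | 7.8%
) | 350 | 7.8%
(not shown) | 137 | 3.1%
(not shown) | 131 | 2.9%
(not shown) | 102 | 2.3%
(not shown) | 86 | 1.9%
(not shown) | 71 | 1.6%
(not shown) | 64 | 1.4%
(not shown) | 62 | 1.4%
Other values (364) | 2752 | 61.7%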

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 3621 | 81.2%
Open Punctuation | 350 | 7.8%
Close Punctuation | 350 | 7.8%
Space Separator | 59 | 1.3%
Decimal Number | 27 | 0.6%
Other Symbol | 26 | 0.6%
Uppercase Letter | 26 | 0.6%
Other Punctuation | 2 | < 0.1%
Dash Punctuation | 1 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not shown) | 357 | 9.9%
(not shown) | 137 | 3.8%
(not shown) | 131 | 3.6%
(not shown) | 102 | 2.8%
(not shown) | 86 | 2.4%
(not shown) | 71 | 2.0%
(not shown) | 64 | 1.8%
(not shown) | 62 | 1.7%
(not shown) | 60 | 1.7%
(not shown) | 58 | 1.6%
Other values (340) | 2493 | 68.8%

Uppercase Letter
Value | Count | Frequency (%)
E | 5 | 19.2%
G | 3 | 11.5%
P | 2 | 7.7%
I | 2 | 7.7%
H | 2 | 7.7%
T | 2 | 7.7%
C | 2 | 7.7%
B | 2 | 7.7%
D | 2 | 7.7%
A | 1 | 3.8%
Other values (3) | 3 | 11.5%

Decimal Number
Value | Count | Frequency (%)
2 | 16 | 59.3%
8 | 4 | 14.8%
1 | 4 | 14.8%
9 | 2 | 7.4%
3 | 1 | 3.7%

Open Punctuation
Value | Count | Frequency (%)
( | 350 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 350 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 59 | 100.0%

Other Symbol
Value | Count | Frequency (%)
(not shown) | 26 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
& | 2 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 3647 | 81.7%
Common | 789 | 17.7%
Latin | 26 | 0.6%

Most frequent character per script

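Hangul
Value | Count | Frequency (%)
(not shown) | 357 | 9.8%
(not shown) | 137 | 3.8%
(not shown) | 131 | 3.6%
(not shown) | 102 | 2.8%
(not shown) | 86 | 2.4%
(not shown) | 71 | 1.9%
(not shown) | 64 | 1.8%
(not shown) | 62 | 1.7%
(not shown) | 60 | 1.6%
(not shown) | 58 | 1.6%
Other values (341) | 2519 | 69.1%

Latin
Value | Count | Frequency (%)
E | 5 | 19.2%
G | 3 | 11.5%
P | 2 | 7.7%
I | 2 | 7.7%
H | 2 | 7.7%
T | 2 | 7.7%
C | 2 | 7.7%
B | 2 | 7.7%
D | 2 | 7.7%
A | 1 | 3.8%
Other values (3) | 3 | 11.5%

Common
Value | Count | Frequency (%)
( | 350 | 44.4%
) | 350 | 44.4%
(space) | 59 | 7.5%
2 | 16 | 2.0%
8 | 4 | 0.5%
1 | 4 | 0.5%
9 | 2 | 0.3%
& | 2 | 0.3%
- | 1 | 0.1%
3 | 1 | 0.1%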

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 3621 | 81.2%
ASCII | 815 | 18.3%
None | 26 | 0.6%

Most frequent character per block

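Hangul
Value | Count | Frequency (%)
(not shown) | 357 | 9.9%
(not shown) | 137 | 3.8%
(not shown) | 131 | 3.6%
(not shown) | 102 | 2.8%
(not shown) | 86 | 2.4%
(not shown) | 71 | 2.0%
(not shown) | 64 | 1.8%
(not shown) | 62 | 1.7%
(not shown) | 60 | 1.7%
(not shown) | 58 | 1.6%
Other values (340) | 2493 | 68.8%

ASCII
Value | Count | Frequency (%)
( | 350 | 42.9%
) | 350 | 42.9%
(space) | 59 | 7.2%
2 | 16 | 2.0%
E | 5 | 0.6%
8 | 4 | 0.5%
1 | 4 | 0.5%
G | 3 | 0.4%
9 | 2 | 0.2%
P | 2 | 0.2%
Other values (13) | 20 | 2.5%

None
Value | Count | Frequency (%)
(not shown) | 26 | 100.0%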

주소 (address)
Text

Distinct: 568
Distinct (%): 94.0%
Missing: 0
Missing (%): 0.0%
Memory size: 4.8 KiB

Length

Max length: 53
Median length: 47
Mean length: 26.360927
Min length: 15

Characters and Unicode

Total characters: 15922
Distinct characters: 287
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
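The Mean length figures in this report are consistent with dividing Total characters by the number of non-missing observations. The sketch below checks that for the three text variables whose totals appear in the report. The helper name `mean_length` is hypothetical, and the division rule is an inference from the printed numbers rather than a documented formula.

```python
def mean_length(total_chars: int, n_observations: int, n_missing: int) -> float:
    """Average string length over the non-missing values of one variable."""
    return total_chars / (n_observations - n_missing)


# (total characters, observations, missing, mean length as printed in the report)
reported = {
    "first variable": (4462, 604, 0, 7.3874172),
    "주소": (15922, 604, 0, 26.360927),
    "전화번호": (5713, 604, 128, 12.002101),
}

for name, (total, n_obs, n_missing, printed_mean) in reported.items():
    computed = mean_length(total, n_obs, n_missing)
    print(f"{name}: computed={computed:.6f} reported={printed_mean}")
```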

Unique

Unique: 540
Unique (%): 89.4%

Sample

1st row: 충청남도 금산군 금성면 금산로 2044, (마수리 204-3)
2nd row: 충청남도 금산군 금산읍 후곤천길 112, 103호
3rd row: 충청남도 금산군 추부면 신평공단1로 62
4th row: 충청남도 금산군 추부면 서대산로 459 (㈜EG)
5th row: 충청남도 금산군 복수면 복수공단길 37, (용진리 115-18) (풍국타올)
Value | Count | Frequency (%)
충청남도 | 604 | 16.9%
금산군 | 604 | 16.9%
추부면 | 246 | 6.9%
복수면 | 146 | 4.1%
(not shown) | 93 | 2.6%
금성면 | 63 | 1.8%
1필지 | 49 | 1.4%
추풍로 | 44 | 1.2%
군북면 | 42 | 1.2%
다복로 | 41 | 1.1%
Other values (828) | 1645 | 46.0%

Most occurring characters

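Value | Count | Frequency (%)
(space) | 2977 | 18.7%
(not shown) | 765 | 4.8%
(not shown) | 749 | 4.7%
(not shown) | 675 | 4.2%
(not shown) | 633 | 4.0%
(not shown) | 608 | 3.8%
(not shown) | 607 | 3.8%
(not shown) | 604 | 3.8%
(not shown) | 581 | 3.6%
1 | 524 | 3.3%
Other values (277) | 7199 | 45.2%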

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 9599 | 60.3%
Space Separator | 2977 | 18.7%
Decimal Number | 2425 | 15.2%
Close Punctuation | 254 | 1.6%
Open Punctuation | 254 | 1.6%
Dash Punctuation | 243 | 1.5%
Other Punctuation | 121 | 0.8%
Uppercase Letter | 28 | 0.2%
Other Symbol | 21 | 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not shown) | 765 | 8.0%
(not shown) | 749 | 7.8%
(not shown) | 675 | 7.0%
(not shown) | 633 | 6.6%
(not shown) | 608 | 6.3%
(not shown) | 607 | 6.3%
(not shown) | 604 | 6.3%
(not shown) | 581 | 6.1%
(not shown) | 328 | 3.4%
(not shown) | 297 | 3.1%
Other values (247) | 3752 | 39.1%

Uppercase Letter
Value | Count | Frequency (%)
E | 5 | 17.9%
C | 4 | 14.3%
H | 3 | 10.7%
R | 3 | 10.7%
M | 2 | 7.1%
B | 2 | 7.1%
S | 2 | 7.1%
T | 2 | 7.1%
D | 1 | 3.6%
I | 1 | 3.6%
Other values (3) | 3 | 10.7%

Decimal Number
Value | Count | Frequency (%)
1 | 524 | 21.6%
2 | 361 | 14.9%
3 | 267 | 11.0%
4 | 227 | 9.4%
5 | 224 | 9.2%
0 | 203 | 8.4%
6 | 190 | 7.8%
7 | 151 | 6.2%
8 | 144 | 5.9%
9 | 134 | 5.5%

Other Punctuation
Value | Count | Frequency (%)
, | 120 | 99.2%
& | 1 | 0.8%

Space Separator
Value | Count | Frequency (%)
(space) | 2977 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 254 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 254 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 243 | 100.0%

Other Symbol
Value | Count | Frequency (%)
(not shown) | 21 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 9620 | 60.4%
Common | 6274 | 39.4%
Latin | 28 | 0.2%

Most frequent character per script

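Hangul
Value | Count | Frequency (%)
(not shown) | 765 | 8.0%
(not shown) | 749 | 7.8%
(not shown) | 675 | 7.0%
(not shown) | 633 | 6.6%
(not shown) | 608 | 6.3%
(not shown) | 607 | 6.3%
(not shown) | 604 | 6.3%
(not shown) | 581 | 6.0%
(not shown) | 328 | 3.4%
(not shown) | 297 | 3.1%
Other values (248) | 3773 | 39.2%

Common
Value | Count | Frequency (%)
(space) | 2977 | 47.4%
1 | 524 | 8.4%
2 | 361 | 5.8%
3 | 267 | 4.3%
) | 254 | 4.0%
( | 254 | 4.0%
- | 243 | 3.9%
4 | 227 | 3.6%
5 | 224 | 3.6%
0 | 203 | 3.2%
Other values (6) | 740 | 11.8%

Latin
Value | Count | Frequency (%)
E | 5 | 17.9%
C | 4 | 14.3%
H | 3 | 10.7%
R | 3 | 10.7%
M | 2 | 7.1%
B | 2 | 7.1%
S | 2 | 7.1%
T | 2 | 7.1%
D | 1 | 3.6%
I | 1 | 3.6%
Other values (3) | 3 | 10.7%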

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 9599 | 60.3%
ASCII | 6302 | 39.6%
None | 21 | 0.1%

Most frequent character per block

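ASCII
Value | Count | Frequency (%)
(space) | 2977 | 47.2%
1 | 524 | 8.3%
2 | 361 | 5.7%
3 | 267 | 4.2%
) | 254 | 4.0%
( | 254 | 4.0%
- | 243 | 3.9%
4 | 227 | 3.6%
5 | 224 | 3.6%
0 | 203 | 3.2%
Other values (19) | 768 | 12.2%

Hangul
Value | Count | Frequency (%)
(not shown) | 765 | 8.0%
(not shown) | 749 | 7.8%
(not shown) | 675 | 7.0%
(not shown) | 633 | 6.6%
(not shown) | 608 | 6.3%
(not shown) | 607 | 6.3%
(not shown) | 604 | 6.3%
(not shown) | 581 | 6.1%
(not shown) | 328 | 3.4%
(not shown) | 297 | 3.1%
Other values (247) | 3752 | 39.1%

None
Value | Count | Frequency (%)
(not shown) | 21 | 100.0%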

전화번호 (phone number)
Text

MISSING 

Distinct: 440
Distinct (%): 92.4%
Missing: 128
Missing (%): 21.2%
Memory size: 4.8 KiB

Length

Max length: 13
Median length: 12
Mean length: 12.002101
Min length: 9

Characters and Unicode

Total characters: 5713
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 407
Unique (%): 85.5%

Sample

1st row: 041-753-7887
2nd row: 042-335-9092
3rd row: 041-751-6262
4th row: 041-750-7777
5th row: 070-7602-7895
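The sample rows all follow a hyphenated area-code layout, which suggests a simple format check for this column. The sketch below is an assumption rather than a rule stated by the dataset: the pattern (a leading 0, one or two more digits, then groups of 3 to 4 and 4 digits) and the helper name `looks_like_phone` are hypothetical. Since the reported minimum length is 9 characters, some values in the column probably do not fit this pattern and would be flagged for review.

```python
import re

# Hypothetical pattern inferred from the sample rows, e.g. 041-753-7887 or 070-7602-7895.
PHONE_RE = re.compile(r"0\d{1,2}-\d{3,4}-\d{4}")


def looks_like_phone(value: str) -> bool:
    """Return True if the whole string matches the assumed hyphenated layout."""
    return PHONE_RE.fullmatch(value) is not None


samples = ["041-753-7887", "042-335-9092", "041-751-6262", "041-750-7777", "070-7602-7895"]
print(all(looks_like_phone(s) for s in samples))  # True
```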
Value | Count | Frequency (%)
041-753-6981 | 4 | 0.8%
041-754-5421 | 3 | 0.6%
041-753-7685 | 2 | 0.4%
041-752-4160 | 2 | 0.4%
041-752-5583 | 2 | 0.4%
041-752-9992 | 2 | 0.4%
041-753-6913 | 2 | 0.4%
041-752-1750 | 2 | 0.4%
041-753-3366 | 2 | 0.4%
041-754-1551 | 2 | 0.4%
Other values (430) | 453 | 95.2%

Most occurring characters

Value | Count | Frequency (%)
- | 947 | 16.6%
0 | 785 | 13.7%
1 | 777 | 13.6%
4 | 730 | 12.8%
5 | 618 | 10.8%
7 | 606 | 10.6%
3 | 344 | 6.0%
2 | 316 | 5.5%
8 | 222 | 3.9%
6 | 191 | 3.3%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 4766 | 83.4%
Dash Punctuation | 947 | 16.6%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 785 | 16.5%
1 | 777 | 16.3%
4 | 730 | 15.3%
5 | 618 | 13.0%
7 | 606 | 12.7%
3 | 344 | 7.2%
2 | 316 | 6.6%
8 | 222 | 4.7%
6 | 191 | 4.0%
9 | 177 | 3.7%

Dash Punctuation
Value | Count | Frequency (%)
- | 947 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 5713 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
- | 947 | 16.6%
0 | 785 | 13.7%
1 | 777 | 13.6%
4 | 730 | 12.8%
5 | 618 | 10.8%
7 | 606 | 10.6%
3 | 344 | 6.0%
2 | 316 | 5.5%
8 | 222 | 3.9%
6 | 191 | 3.3%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 5713 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
- | 947 | 16.6%
0 | 785 | 13.7%
1 | 777 | 13.6%
4 | 730 | 12.8%
5 | 618 | 10.8%
7 | 606 | 10.6%
3 | 344 | 6.0%
2 | 316 | 5.5%
8 | 222 | 3.9%
6 | 191 | 3.3%

팩스번호 (fax number)
Text

MISSING 

Distinct: 131
Distinct (%): 96.3%
Missing: 468
Missing (%): 77.5%
Memory size: 4.8 KiB