Overview

Dataset statistics

Number of variables: 8
Number of observations: 659
Missing cells: 600
Missing cells (%): 11.4%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 41.3 KiB
Average record size in memory: 64.2 B
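
The two derived figures in these statistics can be reproduced from the raw counts. The sketch below assumes that the missing-cell percentage is taken over all cells (variables × observations) and that 1 KiB is 1024 bytes; the variable names are illustrative and do not come from the report.

```python
# Raw figures copied from the dataset statistics.
n_variables = 8
n_observations = 659
missing_cells = 600
total_size_kib = 41.3

# Assumption: the missing-cell share is computed over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 KiB = 1024 bytes, averaged over the observations.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```

Under these assumptions the results round to the reported 11.4% and 64.2 B.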

Variable types

Text: 6
Categorical: 1
DateTime: 1

Dataset

Description: Information about manufacturing companies in Geumsan-gun, Chungcheongnam-do, including company names, representative names, company addresses, phone numbers, and fax numbers.
Author: Chungcheongnam-do
URL: https://alldam.chungnam.go.kr/index.chungnam?menuCd=DOM_000000201001001001&st=&cds=&orgCd=&apiType=&isOpen=Y&pageIndex=395&beforeMenuCd=DOM_000000201001001000&publicdatapk=15034906

Alerts

데이터기준일자 (data reference date) has constant value "" (Constant)
전화번호 (phone number) has 182 (27.6%) missing values (Missing)
팩스번호 (fax number) has 418 (63.4%) missing values (Missing)

Reproduction

Analysis started: 2024-01-09 22:58:26.869064
Analysis finished: 2024-01-09 22:58:27.880700
Duration: 1.01 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
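
The reported duration is the difference between the two timestamps. A minimal check using only the Python standard library, with the timestamps copied from the report (the format string is an assumption based on how they are printed):

```python
from datetime import datetime

# Timestamp layout as printed in the report (assumed format string).
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2024-01-09 22:58:26.869064", FMT)
finished = datetime.strptime("2024-01-09 22:58:27.880700", FMT)

# Elapsed wall-clock time in seconds.
elapsed = (finished - started).total_seconds()
print(f"{elapsed:.2f} s")
```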

Variables

Distinct: 648
Distinct (%): 98.3%
Missing: 0
Missing (%): 0.0%
Memory size: 5.3 KiB

Length

Max length: 20
Median length: 16
Mean length: 7.3793627
Min length: 2

Characters and Unicode

Total characters: 4863
Distinct characters: 396
Distinct categories: 11
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
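
As an illustration of such character properties, the sketch below looks up the Unicode general category of each code point in the first sample value, using Python's standard `unicodedata` module. It is not how the report itself was computed. The `LONG_NAMES` mapping is a hand-written helper covering only the codes that occur in this string. The standard library does not expose script or block properties, so those are left out.

```python
import unicodedata
from collections import Counter

# First sample value of this variable, copied from the report.
sample = "(주)BDC"

# Two-letter general category code of each code point, tallied.
categories = Counter(unicodedata.category(ch) for ch in sample)

# Long names as used in the report, for the codes that occur here only.
LONG_NAMES = {
    "Lo": "Other Letter",
    "Lu": "Uppercase Letter",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
}

for code, count in categories.most_common():
    print(LONG_NAMES.get(code, code), count)
```

The Hangul syllable falls under Other Letter, which is why that category dominates the tables for this variable.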

Unique

Unique: 637
Unique (%): 96.7%

Sample

1st row: (주)BDC
2nd row: (주)BDC 제2공장
3rd row: (주)EG
4th row: (주)가나텍
5th row: (주)갈산

Value | Count | Frequency (%)
농업회사법인 | 19 | 2.6%
제2공장 | 13 | 1.8%
영농조합법인 | 5 | 0.7%
금산공장 | 4 | 0.6%
제1공장 | 3 | 0.4%
중부대학교 | 3 | 0.4%
주)휴온스네이처 | 3 | 0.4%
2공장 | 3 | 0.4%
지점 | 2 | 0.3%
주)믿음의나무 | 2 | 0.3%
Other values (645) | 670 | 92.2%

Most occurring characters

Value | Count | Frequency (%)
[?] | 396 | 8.1%
( | 389 | 8.0%
) | 389 | 8.0%
[?] | 150 | 3.1%
[?] | 138 | 2.8%
[?] | 98 | 2.0%
[?] | 92 | 1.9%
[?] | 82 | 1.7%
[?] | 68 | 1.4%
[?] | 68 | 1.4%
Other values (386) | 2993 | 61.5%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 3908 | 80.4%
Open Punctuation | 389 | 8.0%
Close Punctuation | 389 | 8.0%
Space Separator | 68 | 1.4%
Uppercase Letter | 40 | 0.8%
Decimal Number | 36 | 0.7%
Other Symbol | 22 | 0.5%
Other Punctuation | 4 | 0.1%
Lowercase Letter | 4 | 0.1%
Dash Punctuation | 2 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[?] | 396 | 10.1%
[?] | 150 | 3.8%
[?] | 138 | 3.5%
[?] | 98 | 2.5%
[?] | 92 | 2.4%
[?] | 82 | 2.1%
[?] | 68 | 1.7%
[?] | 68 | 1.7%
[?] | 67 | 1.7%
[?] | 67 | 1.7%
Other values (354) | 2682 | 68.6%

Uppercase Letter
Value | Count | Frequency (%)
E | 7 | 17.5%
D | 5 | 12.5%
C | 4 | 10.0%
I | 4 | 10.0%
N | 4 | 10.0%
P | 3 | 7.5%
G | 3 | 7.5%
B | 3 | 7.5%
H | 2 | 5.0%
A | 1 | 2.5%
Other values (4) | 4 | 10.0%

Decimal Number
Value | Count | Frequency (%)
2 | 22 | 61.1%
1 | 8 | 22.2%
7 | 2 | 5.6%
8 | 2 | 5.6%
6 | 1 | 2.8%
3 | 1 | 2.8%

Lowercase Letter
Value | Count | Frequency (%)
c | 1 | 25.0%
t | 1 | 25.0%
h | 1 | 25.0%
e | 1 | 25.0%

Other Punctuation
Value | Count | Frequency (%)
& | 2 | 50.0%
: | 2 | 50.0%

Open Punctuation
Value | Count | Frequency (%)
( | 389 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 389 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 68 | 100.0%

Other Symbol
Value | Count | Frequency (%)
[?] | 22 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 2 | 100.0%

Math Symbol
Value | Count | Frequency (%)
+ | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 3930 | 80.8%
Common | 889 | 18.3%
Latin | 44 | 0.9%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
[?] | 396 | 10.1%
[?] | 150 | 3.8%
[?] | 138 | 3.5%
[?] | 98 | 2.5%
[?] | 92 | 2.3%
[?] | 82 | 2.1%
[?] | 68 | 1.7%
[?] | 68 | 1.7%
[?] | 67 | 1.7%
[?] | 67 | 1.7%
Other values (355) | 2704 | 68.8%

Latin
Value | Count | Frequency (%)
E | 7 | 15.9%
D | 5 | 11.4%
C | 4 | 9.1%
I | 4 | 9.1%
N | 4 | 9.1%
P | 3 | 6.8%
G | 3 | 6.8%
B | 3 | 6.8%
H | 2 | 4.5%
A | 1 | 2.3%
Other values (8) | 8 | 18.2%

Common
Value | Count | Frequency (%)
( | 389 | 43.8%
) | 389 | 43.8%
(space) | 68 | 7.6%
2 | 22 | 2.5%
1 | 8 | 0.9%
- | 2 | 0.2%
7 | 2 | 0.2%
8 | 2 | 0.2%
& | 2 | 0.2%
: | 2 | 0.2%
Other values (3) | 3 | 0.3%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 3908 | 80.4%
ASCII | 933 | 19.2%
None | 22 | 0.5%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
[?] | 396 | 10.1%
[?] | 150 | 3.8%
[?] | 138 | 3.5%
[?] | 98 | 2.5%
[?] | 92 | 2.4%
[?] | 82 | 2.1%
[?] | 68 | 1.7%
[?] | 68 | 1.7%
[?] | 67 | 1.7%
[?] | 67 | 1.7%
Other values (354) | 2682 | 68.6%

ASCII
Value | Count | Frequency (%)
( | 389 | 41.7%
) | 389 | 41.7%
(space) | 68 | 7.3%
2 | 22 | 2.4%
1 | 8 | 0.9%
E | 7 | 0.8%
D | 5 | 0.5%
C | 4 | 0.4%
I | 4 | 0.4%
N | 4 | 0.4%
Other values (21) | 33 | 3.5%

None
Value | Count | Frequency (%)
[?] | 22 | 100.0%

주소 (address)
Text

Distinct: 618
Distinct (%): 93.8%
Missing: 0
Missing (%): 0.0%
Memory size: 5.3 KiB

Length

Max length: 51
Median length: 46
Mean length: 26.176024
Min length: 8

Characters and Unicode

Total characters: 17250
Distinct characters: 289
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 583
Unique (%): 88.5%

Sample

1st row: 충청남도 금산군 추부면 신평공단1로 62
2nd row: 충청남도 금산군 추부면 신평공단1로 85
3rd row: 충청남도 금산군 추부면 서대산로 459 (㈜EG)
4th row: 충청남도 금산군 복수면 다복로 537-18
5th row: 충청남도 금산군 복수면 복수공단길 37+(용진리 115-18) (풍국타올)

Value | Count | Frequency (%)
금산군 | 653 | 17.2%
충청남도 | 652 | 17.2%
추부면 | 264 | 7.0%
복수면 | 153 | 4.0%
[?] | 115 | 3.0%
금성면 | 84 | 2.2%
1필지 | 64 | 1.7%
군북면 | 58 | 1.5%
다복로 | 49 | 1.3%
추풍로 | 42 | 1.1%
Other values (907) | 1652 | 43.6%

Most occurring characters

Value | Count | Frequency (%)
(space) | 3127 | 18.1%
[?] | 842 | 4.9%
[?] | 796 | 4.6%
[?] | 739 | 4.3%
[?] | 677 | 3.9%
[?] | 657 | 3.8%
[?] | 655 | 3.8%
[?] | 653 | 3.8%
[?] | 643 | 3.7%
1 | 562 | 3.3%
Other values (279) | 7899 | 45.8%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 10475 | 60.7%
Space Separator | 3127 | 18.1%
Decimal Number | 2627 | 15.2%
Open Punctuation | 283 | 1.6%
Close Punctuation | 283 | 1.6%
Dash Punctuation | 271 | 1.6%
Math Symbol | 128 | 0.7%
Uppercase Letter | 35 | 0.2%
Other Symbol | 20 | 0.1%
Other Punctuation | 1 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[?] | 842 | 8.0%
[?] | 796 | 7.6%
[?] | 739 | 7.1%
[?] | 677 | 6.5%
[?] | 657 | 6.3%
[?] | 655 | 6.3%
[?] | 653 | 6.2%
[?] | 643 | 6.1%
[?] | 351 | 3.4%
[?] | 325 | 3.1%
Other values (248) | 4137 | 39.5%

Uppercase Letter
Value | Count | Frequency (%)
C | 5 | 14.3%
E | 5 | 14.3%
D | 3 | 8.6%
R | 3 | 8.6%
M | 3 | 8.6%
H | 3 | 8.6%
J | 2 | 5.7%
S | 2 | 5.7%
A | 2 | 5.7%
T | 2 | 5.7%
Other values (4) | 5 | 14.3%

Decimal Number
Value | Count | Frequency (%)
1 | 562 | 21.4%
2 | 356 | 13.6%
3 | 282 | 10.7%
4 | 261 | 9.9%
5 | 240 | 9.1%
6 | 216 | 8.2%
0 | 195 | 7.4%
7 | 187 | 7.1%
8 | 175 | 6.7%
9 | 153 | 5.8%

Space Separator
Value | Count | Frequency (%)
(space) | 3127 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 283 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 283 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 271 | 100.0%

Math Symbol
Value | Count | Frequency (%)
+ | 128 | 100.0%

Other Symbol
Value | Count | Frequency (%)
[?] | 20 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
& | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 10495 | 60.8%
Common | 6720 | 39.0%
Latin | 35 | 0.2%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
[?] | 842 | 8.0%
[?] | 796 | 7.6%
[?] | 739 | 7.0%
[?] | 677 | 6.5%
[?] | 657 | 6.3%
[?] | 655 | 6.2%
[?] | 653 | 6.2%
[?] | 643 | 6.1%
[?] | 351 | 3.3%
[?] | 325 | 3.1%
Other values (249) | 4157 | 39.6%

Common
Value | Count | Frequency (%)
(space) | 3127 | 46.5%
1 | 562 | 8.4%
2 | 356 | 5.3%
( | 283 | 4.2%
) | 283 | 4.2%
3 | 282 | 4.2%
- | 271 | 4.0%
4 | 261 | 3.9%
5 | 240 | 3.6%
6 | 216 | 3.2%
Other values (6) | 839 | 12.5%

Latin
Value | Count | Frequency (%)
C | 5 | 14.3%
E | 5 | 14.3%
D | 3 | 8.6%
R | 3 | 8.6%
M | 3 | 8.6%
H | 3 | 8.6%
J | 2 | 5.7%
S | 2 | 5.7%
A | 2 | 5.7%
T | 2 | 5.7%
Other values (4) | 5 | 14.3%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 10475 | 60.7%
ASCII | 6755 | 39.2%
None | 20 | 0.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 3127 | 46.3%
1 | 562 | 8.3%
2 | 356 | 5.3%
( | 283 | 4.2%
) | 283 | 4.2%
3 | 282 | 4.2%
- | 271 | 4.0%
4 | 261 | 3.9%
5 | 240 | 3.6%
6 | 216 | 3.2%
Other values (20) | 874 | 12.9%

Hangul
Value | Count | Frequency (%)
[?] | 842 | 8.0%
[?] | 796 | 7.6%
[?] | 739 | 7.1%
[?] | 677 | 6.5%
[?] | 657 | 6.3%
[?] | 655 | 6.3%
[?] | 653 | 6.2%
[?] | 643 | 6.1%
[?] | 351 | 3.4%
[?] | 325 | 3.1%
Other values (248) | 4137 | 39.5%

None
Value | Count | Frequency (%)
[?] | 20 | 100.0%

전화번호 (phone number)
Text

MISSING 

Distinct: 448
Distinct (%): 93.9%
Missing: 182
Missing (%): 27.6%
Memory size: 5.3 KiB

Length

Max length: 13
Median length: 12
Mean length: 12.077568
Min length: 9

Characters and Unicode

Total characters: 5761
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 421
Unique (%): 88.3%
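
The distinct and unique percentages and the mean length reported for 전화번호 are consistent with the 477 non-missing values being used as the denominator, while the missing percentage is taken over all 659 observations. The check below uses figures copied from the report. The interpretation is inferred from the arithmetic and has not been confirmed against the ydata-profiling source.

```python
# Figures copied from the report for the 전화번호 variable.
n_observations = 659
missing = 182
distinct = 448
unique = 421
total_characters = 5761

# Values actually present in the column.
present = n_observations - missing

# Assumption: these three statistics are computed over present values only.
distinct_pct = 100 * distinct / present
unique_pct = 100 * unique / present
mean_length = total_characters / present

# The missing share is computed over all observations.
missing_pct = 100 * missing / n_observations

print(f"present values: {present}")
print(f"distinct: {distinct_pct:.1f}%  unique: {unique_pct:.1f}%")
print(f"mean length: {mean_length:.6f}")
print(f"missing: {missing_pct:.1f}%")
```

Dividing by 659 instead would give a distinct share of about 68%, which does not match the reported 93.9%.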

Sample

1st row: 041-751-6262
2nd row: 041-751-6262
3rd row: 041-0750-7777
4th row: 041-752-1197
5th row: 070-7602-7895

Value | Count | Frequency (%)
041-753-7141 | 3 | 0.6%
041-753-6981 | 3 | 0.6%
041-751-6111 | 2 | 0.4%
041-752-3243 | 2 | 0.4%
041-753-4291 | 2 | 0.4%
041-752-9992 | 2 | 0.4%
041-752-5583 | 2 | 0.4%
041-752-9945 | 2 | 0.4%
041-754-1551 | 2 | 0.4%
041-752-5304 | 2 | 0.4%
Other values (438) | 455 | 95.4%