Overview

Dataset statistics

Number of variables: 8
Number of observations: 738
Missing cells: 880
Missing cells (%): 14.9%
Duplicate rows: 1
Duplicate rows (%): 0.1%
Total size in memory: 46.3 KiB
Average record size in memory: 64.2 B
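
The percentage and average figures above follow from the raw counts. A minimal sketch of that arithmetic, assuming the missing-cell share is taken over all cells (variables × observations), the duplicate share over all rows, and 1 KiB = 1024 bytes. The variable names are illustrative and not part of the report.

```python
# Figures copied from the "Dataset statistics" block.
n_variables = 8
n_observations = 738
missing_cells = 880
duplicate_rows = 1
total_size_kib = 46.3

# Assumption: percentages are computed over all cells / all rows.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Under these assumptions the recomputed values round to the same figures the report shows.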

Variable types

Text: 6
Categorical: 2

Dataset

Description: Information on manufacturers in Geumsan-gun, Chungcheongnam-do, including company name, representative name, company address, company phone number, and fax number.
Author: Chungcheongnam-do (충청남도)
URL: https://alldam.chungnam.go.kr/index.chungnam?menuCd=DOM_000000201001001001&st=&cds=&orgCd=&apiType=&isOpen=Y&pageIndex=395&beforeMenuCd=DOM_000000201001001000&publicdatapk=15034906

Alerts

데이터기준일자 (data reference date) has constant value "" (Constant)
Dataset has 1 (0.1%) duplicate rows (Duplicates)
전화번호 (phone number) has 156 (21.1%) missing values (Missing)
팩스번호 (fax number) has 724 (98.1%) missing values (Missing)
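
The missing-value shares in these alerts can be recomputed from the counts and the 738 observations. The sketch below does that and flags mostly empty columns. The 50% cut-off is a hypothetical threshold chosen for illustration, not a documented setting of the profiling tool.

```python
# Counts copied from the alerts; 738 is the number of observations.
n_rows = 738
missing_counts = {"전화번호": 156, "팩스번호": 724}

# Hypothetical cut-off for calling a column "mostly missing".
HIGH_MISSING_PCT = 50.0

missing_pct = {col: 100 * n / n_rows for col, n in missing_counts.items()}
mostly_missing = [col for col, pct in missing_pct.items() if pct > HIGH_MISSING_PCT]

for col, pct in missing_pct.items():
    print(f"{col}: {missing_counts[col]} ({pct:.1f}%) missing")
print("Mostly missing:", mostly_missing)
```

With these figures only 팩스번호 exceeds the illustrative threshold, which suggests the fax column carries little usable information.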

Reproduction

Analysis started: 2024-01-09 22:58:34.139834
Analysis finished: 2024-01-09 22:58:35.173105
Duration: 1.03 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
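
The reported duration can be checked against the two timestamps. A small sketch using only the Python standard library; the format string is an assumption based on how the timestamps are printed here.

```python
from datetime import datetime

# Assumed timestamp layout, matching the values printed in the report.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2024-01-09 22:58:34.139834", FMT)
finished = datetime.strptime("2024-01-09 22:58:35.173105", FMT)

# Elapsed wall-clock time in seconds.
duration = (finished - started).total_seconds()
print(f"Duration: {duration:.2f} seconds")
```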

Variables

Distinct: 665
Distinct (%): 90.1%
Missing: 0
Missing (%): 0.0%
Memory size: 5.9 KiB

Length

Max length: 20
Median length: 17
Mean length: 7.2601626
Min length: 2
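
As an illustration of how such length statistics could be derived, the sketch below measures the five sample values listed for this variable. It assumes length means the number of Unicode code points (Python's `len`), which may differ from how the profiling tool counts. The results describe only these five rows, not the full column.

```python
from statistics import mean, median

# The five sample rows shown for this variable.
sample = [
    "(사)충청남도 장애인부모회 금산지회",
    "(유)다함",
    "(유)신화상사",
    "(유)자연의길",
    "(주)BDC",
]

# len() counts Unicode code points, so each Hangul syllable counts as 1.
lengths = [len(value) for value in sample]

stats = {
    "max": max(lengths),
    "median": median(lengths),
    "mean": mean(lengths),
    "min": min(lengths),
}
print(lengths, stats)
```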

Characters and Unicode

Total characters: 5358
Distinct characters: 378
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
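
A minimal sketch of a per-category character count, using Python's standard `unicodedata` module on two of the sample values. The mapping from two-letter general-category codes to the long names used in this report is written by hand, and `category_counts` is a hypothetical helper, not a function of the profiling tool. The standard library does not expose script or block properties, so those are not covered here.

```python
import unicodedata
from collections import Counter

# Hand-written mapping from general-category codes to the names in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Lu": "Uppercase Letter",
    "Nd": "Decimal Number",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Pd": "Dash Punctuation",
    "Po": "Other Punctuation",
    "Zs": "Space Separator",
    "So": "Other Symbol",
}

def category_counts(values):
    """Count characters per Unicode general category across all values."""
    counts = Counter()
    for value in values:
        counts.update(unicodedata.category(ch) for ch in value)
    return counts

counts = category_counts(["(주)BDC", "(유)다함"])
named = {CATEGORY_NAMES.get(code, code): n for code, n in counts.items()}
print(named)
```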

Unique

Unique: 595
Unique (%): 80.6%

Sample

1st row: (사)충청남도 장애인부모회 금산지회
2nd row: (유)다함
3rd row: (유)신화상사
4th row: (유)자연의길
5th row: (주)BDC

Value | Count | Frequency (%)
농업회사법인 | 11 | 1.4%
제2공장 | 9 | 1.1%
영농조합법인 | 6 | 0.7%
금산공장 | 6 | 0.7%
진우산업 | 4 | 0.5%
주)휴온스네이처 | 4 | 0.5%
주)에스코알티에스 | 3 | 0.4%
중부대학교 | 3 | 0.4%
농업회사법인(주 | 3 | 0.4%
제1공장 | 3 | 0.4%
Other values (667) | 753 | 93.5%

Most occurring characters

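Value | Count | Frequency (%)
[character lost] | 407 | 7.6%
( | 402 | 7.5%
) | 402 | 7.5%
[character lost] | 178 | 3.3%
[character lost] | 161 | 3.0%
[character lost] | 109 | 2.0%
[character lost] | 98 | 1.8%
[character lost] | 97 | 1.8%
[character lost] | 80 | 1.5%
[character lost] | 80 | 1.5%
Other values (368) | 3344 | 62.4%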

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 4401 | 82.1%
Open Punctuation | 402 | 7.5%
Close Punctuation | 402 | 7.5%
Space Separator | 67 | 1.3%
Uppercase Letter | 32 | 0.6%
Decimal Number | 28 | 0.5%
Other Symbol | 20 | 0.4%
Dash Punctuation | 3 | 0.1%
Other Punctuation | 3 | 0.1%
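
The category counts should add up to the variable's total of 5358 characters, and each share is the count divided by that total. A short sketch of this consistency check, with the counts copied from the category table for this variable; the dictionary name is illustrative.

```python
# Category counts copied from the "Most occurring categories" table.
categories = {
    "Other Letter": 4401,
    "Open Punctuation": 402,
    "Close Punctuation": 402,
    "Space Separator": 67,
    "Uppercase Letter": 32,
    "Decimal Number": 28,
    "Other Symbol": 20,
    "Dash Punctuation": 3,
    "Other Punctuation": 3,
}

# The sum should equal the reported total number of characters.
total = sum(categories.values())

# Share of each category, formatted to one decimal place as in the report.
shares = {name: f"{100 * n / total:.1f}%" for name, n in categories.items()}

print(total)
for name, share in shares.items():
    print(name, share)
```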

Most frequent character per category

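Other Letter
Value | Count | Frequency (%)
[character lost] | 407 | 9.2%
[character lost] | 178 | 4.0%
[character lost] | 161 | 3.7%
[character lost] | 109 | 2.5%
[character lost] | 98 | 2.2%
[character lost] | 97 | 2.2%
[character lost] | 80 | 1.8%
[character lost] | 80 | 1.8%
[character lost] | 77 | 1.7%
[character lost] | 76 | 1.7%
Other values (345) | 3038 | 69.0%

Uppercase Letter
Value | Count | Frequency (%)
E | 5 | 15.6%
I | 4 | 12.5%
D | 4 | 12.5%
B | 3 | 9.4%
C | 3 | 9.4%
G | 3 | 9.4%
P | 2 | 6.2%
N | 2 | 6.2%
H | 2 | 6.2%
T | 1 | 3.1%
Other values (3) | 3 | 9.4%

Decimal Number
Value | Count | Frequency (%)
2 | 20 | 71.4%
1 | 7 | 25.0%
3 | 1 | 3.6%

Other Punctuation
Value | Count | Frequency (%)
& | 2 | 66.7%
, | 1 | 33.3%

Open Punctuation
Value | Count | Frequency (%)
( | 402 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 402 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 67 | 100.0%

Other Symbol
Value | Count | Frequency (%)
[character lost] | 20 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 3 | 100.0%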

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 4421 | 82.5%
Common | 905 | 16.9%
Latin | 32 | 0.6%

Most frequent character per script

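Hangul
Value | Count | Frequency (%)
[character lost] | 407 | 9.2%
[character lost] | 178 | 4.0%
[character lost] | 161 | 3.6%
[character lost] | 109 | 2.5%
[character lost] | 98 | 2.2%
[character lost] | 97 | 2.2%
[character lost] | 80 | 1.8%
[character lost] | 80 | 1.8%
[character lost] | 77 | 1.7%
[character lost] | 76 | 1.7%
Other values (346) | 3058 | 69.2%

Latin
Value | Count | Frequency (%)
E | 5 | 15.6%
I | 4 | 12.5%
D | 4 | 12.5%
B | 3 | 9.4%
C | 3 | 9.4%
G | 3 | 9.4%
P | 2 | 6.2%
N | 2 | 6.2%
H | 2 | 6.2%
T | 1 | 3.1%
Other values (3) | 3 | 9.4%

Common
Value | Count | Frequency (%)
( | 402 | 44.4%
) | 402 | 44.4%
(space) | 67 | 7.4%
2 | 20 | 2.2%
1 | 7 | 0.8%
- | 3 | 0.3%
& | 2 | 0.2%
, | 1 | 0.1%
3 | 1 | 0.1%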

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 4401 | 82.1%
ASCII | 937 | 17.5%
None | 20 | 0.4%

Most frequent character per block

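Hangul
Value | Count | Frequency (%)
[character lost] | 407 | 9.2%
[character lost] | 178 | 4.0%
[character lost] | 161 | 3.7%
[character lost] | 109 | 2.5%
[character lost] | 98 | 2.2%
[character lost] | 97 | 2.2%
[character lost] | 80 | 1.8%
[character lost] | 80 | 1.8%
[character lost] | 77 | 1.7%
[character lost] | 76 | 1.7%
Other values (345) | 3038 | 69.0%

ASCII
Value | Count | Frequency (%)
( | 402 | 42.9%
) | 402 | 42.9%
(space) | 67 | 7.2%
2 | 20 | 2.1%
1 | 7 | 0.7%
E | 5 | 0.5%
I | 4 | 0.4%
D | 4 | 0.4%
- | 3 | 0.3%
B | 3 | 0.3%
Other values (12) | 20 | 2.1%

None
Value | Count | Frequency (%)
[character lost] | 20 | 100.0%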

주소 (Address)
Text

Distinct: 682
Distinct (%): 92.4%
Missing: 0
Missing (%): 0.0%
Memory size: 5.9 KiB

Length

Max length: 49
Median length: 42
Mean length: 18.132791
Min length: 9
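
The mean lengths in this report appear to equal each variable's total character count divided by its number of non-missing values. The sketch below checks that reading against the figures reported for three variables. It is an observation about these numbers, not a statement of how the profiling tool is implemented, and the label used for the first variable is a placeholder because its name is not shown here.

```python
# (total characters, observations, missing values), copied from the report.
variables = {
    "first variable (name not shown)": (5358, 738, 0),
    "주소": (13382, 738, 0),
    "전화번호": (7000, 738, 156),
}

# Assumption: mean length = total characters / non-missing values.
means = {
    name: total_chars / (rows - missing)
    for name, (total_chars, rows, missing) in variables.items()
}

for name, value in means.items():
    print(f"{name}: {value:.6f}")
```

For all three variables the quotient agrees with the reported mean length to the printed precision.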

Characters and Unicode

Total characters: 13382
Distinct characters: 288
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 638
Unique (%): 86.4%

Sample

1st row: 금성면 금산로 2044, (마수리 204-3)
2nd row: 금산읍 후곤천길 112, 102동 103호
3rd row: 남이면 강변길 49-21, (흑암리 915)
4th row: 금성면 금성공단로 19-18, (하신리 770)
5th row: 추부면 신평공단1로 62

Value | Count | Frequency (%)
추부면 | 300 | 10.0%
복수면 | 162 | 5.4%
금산군 | 123 | 4.1%
충청남도 | 122 | 4.1%
금성면 | 78 | 2.6%
추풍로 | 58 | 1.9%
다복로 | 55 | 1.8%
진산면 | 54 | 1.8%
군북면 | 47 | 1.6%
금산읍 | 46 | 1.5%
Other values (849) | 1944 | 65.0%

Most occurring characters

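Value | Count | Frequency (%)
(space) | 2404 | 18.0%
[character lost] | 697 | 5.2%
1 | 540 | 4.0%
2 | 418 | 3.1%
[character lost] | 407 | 3.0%
[character lost] | 364 | 2.7%
[character lost] | 349 | 2.6%
[character lost] | 328 | 2.5%
( | 313 | 2.3%
[character lost] | 310 | 2.3%
Other values (278) | 7252 | 54.2%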

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 7096 | 53.0%
Decimal Number | 2779 | 20.8%
Space Separator | 2404 | 18.0%
Open Punctuation | 313 | 2.3%
Close Punctuation | 310 | 2.3%
Dash Punctuation | 274 | 2.0%
Other Punctuation | 136 | 1.0%
Uppercase Letter | 46 | 0.3%
Other Symbol | 24 | 0.2%

Most frequent character per category

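Other Letter
Value | Count | Frequency (%)
[character lost] | 697 | 9.8%
[character lost] | 407 | 5.7%
[character lost] | 364 | 5.1%
[character lost] | 349 | 4.9%
[character lost] | 328 | 4.6%
[character lost] | 310 | 4.4%
[character lost] | 280 | 3.9%
[character lost] | 244 | 3.4%
[character lost] | 234 | 3.3%
[character lost] | 231 | 3.3%
Other values (248) | 3652 | 51.5%

Uppercase Letter
Value | Count | Frequency (%)
C | 8 | 17.4%
E | 7 | 15.2%
M | 5 | 10.9%
S | 4 | 8.7%
H | 4 | 8.7%
R | 3 | 6.5%
J | 3 | 6.5%
T | 3 | 6.5%
O | 2 | 4.3%
I | 2 | 4.3%
Other values (3) | 5 | 10.9%

Decimal Number
Value | Count | Frequency (%)
1 | 540 | 19.4%
2 | 418 | 15.0%
5 | 282 | 10.1%
3 | 280 | 10.1%
4 | 271 | 9.8%
6 | 223 | 8.0%
0 | 217 | 7.8%
7 | 195 | 7.0%
8 | 186 | 6.7%
9 | 167 | 6.0%

Other Punctuation
Value | Count | Frequency (%)
, | 135 | 99.3%
& | 1 | 0.7%

Space Separator
Value | Count | Frequency (%)
(space) | 2404 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 313 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 310 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 274 | 100.0%

Other Symbol
Value | Count | Frequency (%)
[character lost] | 24 | 100.0%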

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 7120 | 53.2%
Common | 6216 | 46.5%
Latin | 46 | 0.3%

Most frequent character per script

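Hangul
Value | Count | Frequency (%)
[character lost] | 697 | 9.8%
[character lost] | 407 | 5.7%
[character lost] | 364 | 5.1%
[character lost] | 349 | 4.9%
[character lost] | 328 | 4.6%
[character lost] | 310 | 4.4%
[character lost] | 280 | 3.9%
[character lost] | 244 | 3.4%
[character lost] | 234 | 3.3%
[character lost] | 231 | 3.2%
Other values (249) | 3676 | 51.6%

Common
Value | Count | Frequency (%)
(space) | 2404 | 38.7%
1 | 540 | 8.7%
2 | 418 | 6.7%
( | 313 | 5.0%
) | 310 | 5.0%
5 | 282 | 4.5%
3 | 280 | 4.5%
- | 274 | 4.4%
4 | 271 | 4.4%
6 | 223 | 3.6%
Other values (6) | 901 | 14.5%

Latin
Value | Count | Frequency (%)
C | 8 | 17.4%
E | 7 | 15.2%
M | 5 | 10.9%
S | 4 | 8.7%
H | 4 | 8.7%
R | 3 | 6.5%
J | 3 | 6.5%
T | 3 | 6.5%
O | 2 | 4.3%
I | 2 | 4.3%
Other values (3) | 5 | 10.9%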

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 7096 | 53.0%
ASCII | 6262 | 46.8%
None | 24 | 0.2%

Most frequent character per block

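ASCII
Value | Count | Frequency (%)
(space) | 2404 | 38.4%
1 | 540 | 8.6%
2 | 418 | 6.7%
( | 313 | 5.0%
) | 310 | 5.0%
5 | 282 | 4.5%
3 | 280 | 4.5%
- | 274 | 4.4%
4 | 271 | 4.3%
6 | 223 | 3.6%
Other values (19) | 947 | 15.1%

Hangul
Value | Count | Frequency (%)
[character lost] | 697 | 9.8%
[character lost] | 407 | 5.7%
[character lost] | 364 | 5.1%
[character lost] | 349 | 4.9%
[character lost] | 328 | 4.6%
[character lost] | 310 | 4.4%
[character lost] | 280 | 3.9%
[character lost] | 244 | 3.4%
[character lost] | 234 | 3.3%
[character lost] | 231 | 3.3%
Other values (248) | 3652 | 51.5%

None
Value | Count | Frequency (%)
[character lost] | 24 | 100.0%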

전화번호 (Phone number)
Text

MISSING 

Distinct: 514
Distinct (%): 88.3%
Missing: 156
Missing (%): 21.1%
Memory size: 5.9 KiB

Length

Max length: 13
Median length: 12
Mean length: 12.027491
Min length: 9

Characters and Unicode

Total characters: 7000
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
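
Eleven distinct characters in two categories is what one would expect from phone numbers written with the ten digits and a hyphen. A small sketch that checks this on the five sample rows of this variable, using the standard `unicodedata` module. It shows that the samples are consistent with that reading; it does not prove it for the whole column.

```python
import unicodedata

# The five sample rows shown for this variable.
sample = [
    "041-753-7887",
    "042-335-9092",
    "041-753-7979",
    "041-751-6262",
    "041-751-6262",
]

# Distinct characters across all sample values.
distinct_chars = sorted(set("".join(sample)))

# Unicode general categories of those characters:
# "Nd" is Decimal Number, "Pd" is Dash Punctuation.
categories = {unicodedata.category(ch) for ch in distinct_chars}

print(distinct_chars)
print(categories)
```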

Unique

Unique: 454
Unique (%): 78.0%
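
"Distinct" and "Unique" measure different things: distinct counts the different values that occur, while unique counts the values that occur exactly once. The sketch below illustrates the difference on the five sample rows of this variable, then recomputes the two percentages. It assumes the percentages are taken over non-missing values (738 − 156), which is an inference from the reported numbers rather than a documented rule.

```python
from collections import Counter

# The five sample rows shown for this variable (the last two are identical).
sample = [
    "041-753-7887",
    "042-335-9092",
    "041-753-7979",
    "041-751-6262",
    "041-751-6262",
]

freq = Counter(sample)
distinct = len(freq)                               # different values
unique = sum(1 for n in freq.values() if n == 1)   # values seen exactly once

# Reported counts for the full column.
rows, missing = 738, 156
present = rows - missing  # assumption: percentages use non-missing rows

distinct_pct = 100 * 514 / present
unique_pct = 100 * 454 / present

print(distinct, unique)
print(f"{distinct_pct:.1f}% distinct, {unique_pct:.1f}% unique")
```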

Sample

1st row: 041-753-7887
2nd row: 042-335-9092
3rd row: 041-753-7979
4th row: 041-751-6262
5th row: 041-751-6262

Value | Count | Frequency (%)
041-752-9945 | 4 | 0.7%
041-754-5421 | 4 | 0.7%
041-751-6870 | 3 | 0.5%
041-754-0072 | 3 | 0.5%
041-753-7327 | 3 | 0.5%
041-752-3243 | 3 | 0.5%
041-752-0872 | 2 | 0.3%
041-753-6102 | 2 | 0.3%
041-753-7685 | 2 | 0.3%
041-751-8091 | 2 | 0.3%
Other values (504) | 554 | 95.2%