Overview

Dataset statistics

Number of variables: 9
Number of observations: 778
Missing cells: 606
Missing cells (%): 8.7%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 55.6 KiB
Average record size in memory: 73.2 B
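
The derived figures in this table follow from the raw counts. A minimal sketch, assuming the missing percentage is taken over all cells (variables × observations) and that 1 KiB = 1024 B; the variable names are illustrative:

```python
# Raw counts as reported in the dataset statistics.
n_variables = 9
n_observations = 778
missing_cells = 606
total_size_kib = 55.6

# Missing cells are expressed as a share of all cells in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Average record size: total memory divided by the number of rows.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"{total_cells} cells, {missing_pct:.1f}% missing")
print(f"{avg_record_bytes:.1f} B per record")
```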

Variable types

Numeric: 1
Text: 7
DateTime: 1

Dataset

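Description: Information on manufacturing companies in Geumsan-gun, Chungcheongnam-do, including company name, representative name, company address, company phone number, and fax number.
Author: Geumsan-gun, Chungcheongnam-do (충청남도 금산군)
URL: https://www.data.go.kr/data/15034906/fileData.do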

Alerts

데이터기준일자 (data reference date) has constant value "" (Constant)
전화번호 (phone number) has 192 (24.7%) missing values (Missing)
팩스번호 (fax number) has 412 (53.0%) missing values (Missing)
연번 (serial number) has unique values (Unique)
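
The missing-value percentages in these alerts are the missing count divided by the 778 observations. A small check, with the counts copied from the alerts; the helper name `missing_pct` is illustrative and not part of ydata-profiling:

```python
N_OBSERVATIONS = 778

def missing_pct(n_missing, n_rows=N_OBSERVATIONS):
    """Share of rows with a missing value, as a percentage."""
    return 100 * n_missing / n_rows

alerts = {"전화번호": 192, "팩스번호": 412}
for column, n_missing in alerts.items():
    print(f"{column}: {n_missing} missing ({missing_pct(n_missing):.1f}%)")

# The two flagged columns account for 604 of the 606 missing cells
# reported in the overview, so 2 missing cells lie in other columns.
remaining = 606 - sum(alerts.values())
```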

Reproduction

Analysis started: 2024-03-14 12:13:30.205976
Analysis finished: 2024-03-14 12:13:33.040517
Duration: 2.83 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
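
A report like this one can be regenerated with ydata-profiling along the following lines. This is a sketch, not the original invocation: the CSV path, the report title, and the `cp949` encoding are assumptions (the source file name and encoding are not recorded in this report), and `report_kwargs` and `build_report` are illustrative helper names. The `dataset` metadata mirrors the Dataset section, with the description paraphrased in English.

```python
def report_kwargs():
    """Keyword arguments for ProfileReport; metadata taken from the Dataset section."""
    return {
        "title": "Geumsan-gun manufacturers",  # assumed title
        "dataset": {
            "description": "Manufacturing companies in Geumsan-gun, Chungcheongnam-do",
            "author": "충청남도 금산군",
            "url": "https://www.data.go.kr/data/15034906/fileData.do",
        },
    }


def build_report(csv_path, encoding="cp949"):
    """Load the CSV and build the profile (needs pandas and ydata-profiling installed)."""
    # Imported lazily so this module loads without the optional dependencies.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)
    return ProfileReport(df, **report_kwargs())


# Usage (hypothetical path):
# build_report("geumsan_manufacturers.csv").to_file("report.html")
```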

Variables

연번 (serial number)
Real number (ℝ)

UNIQUE 

Distinct: 778
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 389.5
Minimum: 1
Maximum: 778
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 7.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 39.85
Q1: 195.25
Median: 389.5
Q3: 583.75
95-th percentile: 739.15
Maximum: 778
Range: 777
Interquartile range (IQR): 388.5
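
These quantiles are consistent with linear interpolation between order statistics. That can be checked directly under the assumption that 연번 takes each integer from 1 to 778 exactly once, which the reported distinct count, minimum and maximum suggest. A minimal sketch; `quantile` is an illustrative helper, not the profiler's own code:

```python
def quantile(sorted_values, q):
    """Linearly interpolated quantile of an already sorted sequence."""
    pos = q * (len(sorted_values) - 1)   # fractional index into the data
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] + frac * (sorted_values[hi] - sorted_values[lo])

values = list(range(1, 779))  # assumed contents: 1, 2, ..., 778

p5, q1, median, q3, p95 = (quantile(values, q) for q in (0.05, 0.25, 0.5, 0.75, 0.95))
iqr = q3 - q1
value_range = values[-1] - values[0]

print(round(p5, 2), q1, median, q3, round(p95, 2), iqr, value_range)
```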

Descriptive statistics

Standard deviation: 224.73355
Coefficient of variation (CV): 0.57697958
Kurtosis: -1.2
Mean: 389.5
Median Absolute Deviation (MAD): 194.5
Skewness: 0
Sum: 303031
Variance: 50505.167
Monotonicity: Strictly increasing
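
The descriptive statistics can be reproduced under the assumption that 연번 is the sequence 1, 2, ..., 778, which the reported sum, minimum and maximum are consistent with. The sketch below uses the sample variance (dividing by n - 1), which matches the reported figure:

```python
import statistics

values = list(range(1, 779))  # assumed contents of 연번: 1..778

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)   # sample variance (divides by n - 1)
std = statistics.stdev(values)
cv = std / mean                          # coefficient of variation

# Median absolute deviation: median of |x - median(x)|.
med = statistics.median(values)
mad = statistics.median(abs(x - med) for x in values)

print(total, mean, round(variance, 3), round(std, 5), round(cv, 8), mad)
```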
Histogram with fixed size bins (bins=50)
Value  Count  Frequency (%)
1  1  0.1%
536  1  0.1%
514  1  0.1%
515  1  0.1%
516  1  0.1%
517  1  0.1%
518  1  0.1%
519  1  0.1%
520  1  0.1%
521  1  0.1%
Other values (768)  768  98.7%

Value  Count  Frequency (%)
1  1  0.1%
2  1  0.1%
3  1  0.1%
4  1  0.1%
5  1  0.1%
6  1  0.1%
7  1  0.1%
8  1  0.1%
9  1  0.1%
10  1  0.1%

Value  Count  Frequency (%)
778  1  0.1%
777  1  0.1%
776  1  0.1%
775  1  0.1%
774  1  0.1%
773  1  0.1%
772  1  0.1%
771  1  0.1%
770  1  0.1%
769  1  0.1%

Distinct: 764
Distinct (%): 98.2%
Missing: 0
Missing (%): 0.0%
Memory size: 6.2 KiB

Length

Max length: 21
Median length: 17
Mean length: 7.4640103
Min length: 2

Characters and Unicode

Total characters: 5807
Distinct characters: 419
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
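
As an illustration of these character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below tallies categories for one of the sample values listed for this variable. The two-letter codes correspond to the names used in the report (Lo = Other Letter, Lu = Uppercase Letter, Nd = Decimal Number, Ps and Pe = Open and Close Punctuation, Zs = Space Separator). It is an illustration only, not the profiler's implementation, and `category_counts` is a hypothetical helper name.

```python
import unicodedata
from collections import Counter

def category_counts(text):
    """Count Unicode general categories over the characters of `text`."""
    return Counter(unicodedata.category(ch) for ch in text)

sample = "(주)BDC 제2공장"
counts = category_counts(sample)
print(dict(counts))
```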

Unique

Unique: 751
Unique (%): 96.5%
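
Distinct and Unique measure different things: Distinct counts the different values present (764 here), while Unique counts the values that occur exactly once (751). A toy illustration with made-up values, not rows from this dataset:

```python
from collections import Counter

names = ["A", "B", "B", "C", "C", "C", "D"]   # made-up values
value_counts = Counter(names)

n_distinct = len(value_counts)                               # different values
n_unique = sum(1 for c in value_counts.values() if c == 1)   # values seen exactly once

# The reported percentages are both relative to the 778 observations.
distinct_pct = round(100 * 764 / 778, 1)
unique_pct = round(100 * 751 / 778, 1)

print(n_distinct, n_unique, distinct_pct, unique_pct)
```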

Sample

1st row: (유)다함
2nd row: (주)BDC
3rd row: (주)BDC 제2공장
4th row: (주)가나텍
5th row: (주)갈산
Value  Count  Frequency (%)
농업회사법인  27  3.1%
제2공장  15  1.7%
금산공장  7  0.8%
주식회사  7  0.8%
영농조합법인  5  0.6%
금산지점  4  0.5%
주)에스코알티에스  3  0.3%
중부대학교  3  0.3%
주)믿음의나무  3  0.3%
주)휴온스푸디언스  3  0.3%
Other values (769)  797  91.2%
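
The tokens in this table (for example `주)에스코알티에스`, which has lost its leading parenthesis) are consistent with splitting each value on whitespace and stripping punctuation only from the ends of each token. A sketch of that tokenisation, offered as an approximation rather than the profiler's exact code; the lowercasing step and the helper name `word_counts` are assumptions:

```python
import string
from collections import Counter

STRIP_CHARS = string.punctuation + string.whitespace

def word_counts(values):
    """Lowercase, split on whitespace, strip edge punctuation, count tokens."""
    counts = Counter()
    for value in values:
        for token in value.lower().split():
            token = token.strip(STRIP_CHARS)
            if token:
                counts[token] += 1
    return counts

# Sample values taken from the rows shown for this variable.
sample = ["(주)BDC", "(주)BDC 제2공장", "(주)가나텍"]
counts = word_counts(sample)
print(counts.most_common(3))
```

Because only the edges are stripped, the closing parenthesis inside `주)bdc` survives while the opening one is removed.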

Most occurring characters

ValueCountFrequency (%)
492
 
8.5%
( 484
 
8.3%
) 484
 
8.3%
177
 
3.0%
154
 
2.7%
125
 
2.2%
105
 
1.8%
102
 
1.8%
96
 
1.7%
88
 
1.5%
Other values (409) 3500
60.3%

Most occurring categories

Value  Count  Frequency (%)
Other Letter  4657  80.2%
Open Punctuation  484  8.3%
Close Punctuation  484  8.3%
Space Separator  96  1.7%
Uppercase Letter  40  0.7%
Decimal Number  38  0.7%
Lowercase Letter  4  0.1%
Other Punctuation  2  < 0.1%
Dash Punctuation  1  < 0.1%
Other Symbol  1  < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
492
 
10.6%
177
 
3.8%
154
 
3.3%
125
 
2.7%
105
 
2.3%
102
 
2.2%
88
 
1.9%
85
 
1.8%
78
 
1.7%
73
 
1.6%
Other values (376) 3178
68.2%
Uppercase Letter
ValueCountFrequency (%)
E 7
17.5%
I 4
10.0%
C 4
10.0%
N 3
7.5%
D 3
7.5%
B 3
7.5%
G 3
7.5%
P 3
7.5%
H 2
 
5.0%
T 1
 
2.5%
Other values (7) 7
17.5%
Decimal Number
ValueCountFrequency (%)
2 26
68.4%
1 6
 
15.8%
3 2
 
5.3%
8 2
 
5.3%
4 1
 
2.6%
5 1
 
2.6%
Lowercase Letter
ValueCountFrequency (%)
t 1
25.0%
e 1
25.0%
c 1
25.0%
h 1
25.0%
Open Punctuation
ValueCountFrequency (%)
( 484
100.0%
Close Punctuation
ValueCountFrequency (%)
) 484
100.0%
Space Separator
ValueCountFrequency (%)
96
100.0%
Other Punctuation
ValueCountFrequency (%)
& 2
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%
Other Symbol
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

Value  Count  Frequency (%)
Hangul  4658  80.2%
Common  1105  19.0%
Latin  44  0.8%

Most frequent character per script

Hangul
ValueCountFrequency (%)
492
 
10.6%
177
 
3.8%
154
 
3.3%
125
 
2.7%
105
 
2.3%
102
 
2.2%
88
 
1.9%
85
 
1.8%
78
 
1.7%
73
 
1.6%
Other values (377) 3179
68.2%
Latin
ValueCountFrequency (%)
E 7
15.9%
I 4
 
9.1%
C 4
 
9.1%
N 3
 
6.8%
D 3
 
6.8%
B 3
 
6.8%
G 3
 
6.8%
P 3
 
6.8%
H 2
 
4.5%
T 1
 
2.3%
Other values (11) 11
25.0%
Common
ValueCountFrequency (%)
( 484
43.8%
) 484
43.8%
96
 
8.7%
2 26
 
2.4%
1 6
 
0.5%
3 2
 
0.2%
8 2
 
0.2%
& 2
 
0.2%
- 1
 
0.1%
4 1
 
0.1%

Most occurring blocks

Value  Count  Frequency (%)
Hangul  4657  80.2%
ASCII  1149  19.8%
None  1  < 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
492
 
10.6%
177
 
3.8%
154
 
3.3%
125
 
2.7%
105
 
2.3%
102
 
2.2%
88
 
1.9%
85
 
1.8%
78
 
1.7%
73
 
1.6%
Other values (376) 3178
68.2%
ASCII
ValueCountFrequency (%)
( 484
42.1%
) 484
42.1%
96
 
8.4%
2 26
 
2.3%
E 7
 
0.6%
1 6
 
0.5%
I 4
 
0.3%
C 4
 
0.3%
N 3
 
0.3%
D 3
 
0.3%
Other values (22) 32
 
2.8%
None
ValueCountFrequency (%)
1
100.0%

주소 (address)
Text

Distinct: 721
Distinct (%): 92.7%
Missing: 0
Missing (%): 0.0%
Memory size: 6.2 KiB

Length

Max length: 53
Median length: 47
Mean length: 25.615681
Min length: 18

Characters and Unicode

Total characters: 19929
Distinct characters: 301
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3

Unique

Unique: 670
Unique (%): 86.1%

Sample

1st row: 충청남도 금산군 금산읍 뒷담말4길 24 102~103호
2nd row: 충청남도 금산군 추부면 신평공단1로 62
3rd row: 충청남도 금산군 추부면 신평공단1로 85
4th row: 충청남도 금산군 복수면 다복로 537-18
5th row: 충청남도 금산군 복수면 복수공단길 37, (용진리 115-18) (풍국타올)
ValueCountFrequency (%)
충청남도 778
 
17.2%
금산군 778
 
17.2%
추부면 316
 
7.0%
복수면 190
 
4.2%
126
 
2.8%
금성면 93
 
2.1%
1필지 67
 
1.5%
추풍로 53
 
1.2%
다복로 52
 
1.1%
금산읍 50
 
1.1%
Other values (943) 2023
44.7%

Most occurring characters

ValueCountFrequency (%)
3762
18.9%
1012
 
5.1%
978
 
4.9%
859
 
4.3%
801
 
4.0%
783
 
3.9%
782
 
3.9%
778
 
3.9%
741
 
3.7%
1 636
 
3.2%
Other values (291) 8797
44.1%

Most occurring categories

Value  Count  Frequency (%)
Other Letter  12188  61.2%
Space Separator  3762  18.9%
Decimal Number  2968  14.9%
Dash Punctuation  287  1.4%
Open Punctuation  273  1.4%
Close Punctuation  273  1.4%
Other Punctuation  116  0.6%
Uppercase Letter  38  0.2%
Other Symbol  23  0.1%
Math Symbol  1  < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1012
 
8.3%
978
 
8.0%
859
 
7.0%
801
 
6.6%
783
 
6.4%
782
 
6.4%
778
 
6.4%
741
 
6.1%
442
 
3.6%
376
 
3.1%
Other values (260) 4636
38.0%
Uppercase Letter
ValueCountFrequency (%)
C 6
15.8%
E 5
13.2%
S 4
10.5%
R 4
10.5%
M 3
7.9%
H 3
7.9%
D 3
7.9%
T 3
7.9%
A 2
 
5.3%
O 2
 
5.3%
Other values (2) 3
7.9%
Decimal Number
ValueCountFrequency (%)
1 636
21.4%
2 406
13.7%
3 297
10.0%
4 291
9.8%
5 285
9.6%
6 260
8.8%
0 230
 
7.7%
7 196
 
6.6%
8 191
 
6.4%
9 176
 
5.9%
Other Punctuation
ValueCountFrequency (%)
, 113
97.4%
/ 2
 
1.7%
& 1
 
0.9%
Space Separator
ValueCountFrequency (%)
3762
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 287
100.0%
Open Punctuation
ValueCountFrequency (%)
( 273
100.0%
Close Punctuation
ValueCountFrequency (%)
) 273
100.0%
Other Symbol
ValueCountFrequency (%)
23
100.0%
Math Symbol
ValueCountFrequency (%)
~ 1
100.0%

Most occurring scripts

Value  Count  Frequency (%)
Hangul  12211  61.3%
Common  7680  38.5%
Latin  38  0.2%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1012
 
8.3%
978
 
8.0%
859
 
7.0%
801
 
6.6%
783
 
6.4%
782
 
6.4%
778
 
6.4%
741
 
6.1%
442
 
3.6%
376
 
3.1%
Other values (261) 4659
38.2%
Common
ValueCountFrequency (%)
3762
49.0%
1 636
 
8.3%
2 406
 
5.3%
3 297
 
3.9%
4 291
 
3.8%
- 287
 
3.7%
5 285
 
3.7%
( 273
 
3.6%
) 273
 
3.6%
6 260
 
3.4%
Other values (8) 910
 
11.8%
Latin
ValueCountFrequency (%)
C 6
15.8%
E 5
13.2%
S 4
10.5%
R 4
10.5%
M 3
7.9%
H 3
7.9%
D 3
7.9%
T 3
7.9%
A 2
 
5.3%
O 2
 
5.3%
Other values (2) 3
7.9%

Most occurring blocks

Value  Count  Frequency (%)
Hangul  12188  61.2%
ASCII  7718  38.7%
None  23  0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3762
48.7%
1 636
 
8.2%
2 406
 
5.3%
3 297
 
3.8%
4 291
 
3.8%
- 287
 
3.7%
5 285
 
3.7%
( 273
 
3.5%
) 273
 
3.5%
6 260
 
3.4%
Other values (20) 948
 
12.3%
Hangul
ValueCountFrequency (%)
1012
 
8.3%
978
 
8.0%
859
 
7.0%
801
 
6.6%
783
 
6.4%
782
 
6.4%
778
 
6.4%
741
 
6.1%
442
 
3.6%
376
 
3.1%
Other values (260) 4636
38.0%
None
ValueCountFrequency (%)
23
100.0%

전화번호 (phone number)
Text

MISSING 

Distinct: 541
Distinct (%): 92.3%
Missing: 192
Missing (%): 24.7%
Memory size: 6.2 KiB

Length

Max length: 18
Median length: 12
Mean length: 12.093857
Min length: 9

Characters and Unicode

Total characters: 7087
Distinct characters: 14
Distinct categories: 5
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 500
Unique (%): 85.3%

Sample

1st row: 042-335-9092
2nd row: 041-751-6262
3rd row: 041-751-6262
4th row: 041-752-1197
5th row: 070-7602-7895
Value  Count  Frequency (%)
041-753-7141  4  0.7%
041-753-4291  3  0.5%
041-754-5421  3  0.5%
041-753-8803  2  0.3%
041-752-1029  2  0.3%
041-753-0253  2  0.3%
041-752-5304  2  0.3%
041-751-6301  2  0.3%
041-751-2722  2  0.3%
041-752-9945  2  0.3%
Other values (532)  564  95.9%