Overview

Dataset statistics

Number of variables: 7
Number of observations: 699
Missing cells: 148
Missing cells (%): 3.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 39.0 KiB
Average record size in memory: 57.2 B

Variable types

Numeric: 1
Categorical: 1
Text: 4
DateTime: 1

Dataset

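Description: Tobacco retailer data for Dangjin City. Columns: 연번 (serial number), 민원구분 (civil petition type), 업소명 (business name), 업소지번주소 (business lot-number address), 업소도로명주소 (business road-name address), 업소주소우편번호 (business postal code), 지정일자 (designation date).
URL: https://www.data.go.kr/data/15021412/fileData.do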

Alerts

연번 is highly overall correlated with 민원구분 (High correlation)
민원구분 is highly overall correlated with 연번 (High correlation)
업소주소우편번호 has 148 (21.2%) missing values (Missing)
연번 has unique values (Unique)
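
The two missing-value percentages in this report use different denominators: the overview figure divides by every cell in the dataset, while the alert divides by the number of rows in one column. A minimal sketch of that arithmetic, using only figures quoted in this report (the variable names are illustrative):

```python
# Figures copied from the Overview and Alerts sections of this report.
n_variables = 7
n_observations = 699
missing_cells = 148  # the dataset total equals the 업소주소우편번호 count

# Overview percentage: missing cells over all cells in the table.
total_cells = n_variables * n_observations
overall_missing_pct = 100 * missing_cells / total_cells

# Alert percentage: missing cells over the rows of a single column.
column_missing_pct = 100 * missing_cells / n_observations

print(f"total cells: {total_cells}")
print(f"missing, whole dataset: {overall_missing_pct:.1f}%")
print(f"missing, 업소주소우편번호: {column_missing_pct:.1f}%")
```

Both rounded values agree with the report (3.0% and 21.2%), which suggests that all missing cells sit in the postal-code column.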

Reproduction

Analysis started: 2023-12-12 01:21:40.542310
Analysis finished: 2023-12-12 01:21:41.664110
Duration: 1.12 seconds
Software version: ydata-profiling v4.5.1

Variables

연번
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 699
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 350
Minimum: 1
Maximum: 699
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.3 KiB

Quantile statistics

Minimum: 1
5-th percentile: 35.9
Q1: 175.5
Median: 350
Q3: 524.5
95-th percentile: 664.1
Maximum: 699
Range: 698
Interquartile range (IQR): 349
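
The quantile figures above can be checked by hand. The sketch below assumes that 연번 is exactly the sequence 1, 2, ..., 699 (the report only states that it is unique, strictly increasing, and bounded by 1 and 699) and that the percentiles are linearly interpolated between neighbouring observations, which is what `method="inclusive"` does in Python's `statistics.quantiles`.

```python
from statistics import quantiles

# Assumption: the serial-number column is exactly 1..699.
values = list(range(1, 700))

# "inclusive" treats the data as containing its own min and max and
# interpolates linearly between the two nearest observations.
q1, median, q3 = quantiles(values, n=4, method="inclusive")
twentieths = quantiles(values, n=20, method="inclusive")
p5, p95 = twentieths[0], twentieths[-1]

value_range = max(values) - min(values)
iqr = q3 - q1

print("5th percentile:", p5)
print("Q1, median, Q3:", q1, median, q3)
print("95th percentile:", p95)
print("range:", value_range, "IQR:", iqr)
```

Under those assumptions the computed values agree with the table: 35.9, 175.5, 350, 524.5, 664.1, a range of 698, and an IQR of 349.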

Descriptive statistics

Standard deviation: 201.92821
Coefficient of variation (CV): 0.57693773
Kurtosis: -1.2
Mean: 350
Median Absolute Deviation (MAD): 175
Skewness: 0
Sum: 244650
Variance: 40775
Monotonicity: Strictly increasing
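
Most of the descriptive statistics above follow directly from the column being a run of consecutive integers. As a rough check, the sketch below again assumes 연번 is exactly 1..699 and uses the sample (n - 1) variance; kurtosis and skewness are left out because their exact estimators are not stated in this report.

```python
import statistics

# Assumption: the serial-number column is exactly 1..699.
values = list(range(1, 700))

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance, n - 1 denominator
std_dev = statistics.stdev(values)
cv = std_dev / mean                     # coefficient of variation
median = statistics.median(values)
# Median absolute deviation: median distance from the median.
mad = statistics.median(abs(v - median) for v in values)

print("sum:", total, "mean:", mean)
print("variance:", variance, "std:", round(std_dev, 5))
print("CV:", round(cv, 8), "MAD:", mad)
```

The sample variance of 1..n is n(n + 1)/12, which for n = 699 gives 40775, the figure reported above.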
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
1                   1      0.1%
471                 1      0.1%
463                 1      0.1%
464                 1      0.1%
465                 1      0.1%
466                 1      0.1%
467                 1      0.1%
468                 1      0.1%
469                 1      0.1%
470                 1      0.1%
Other values (689)  689    98.6%

Value  Count  Frequency (%)
1      1      0.1%
2      1      0.1%
3      1      0.1%
4      1      0.1%
5      1      0.1%
6      1      0.1%
7      1      0.1%
8      1      0.1%
9      1      0.1%
10     1      0.1%

Value  Count  Frequency (%)
699    1      0.1%
698    1      0.1%
697    1      0.1%
696    1      0.1%
695    1      0.1%
694    1      0.1%
693    1      0.1%
692    1      0.1%
691    1      0.1%
690    1      0.1%

민원구분
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 5.6 KiB
제7조의3제2항에따른경우: 478
(blank): 184
제7조의3제3항에따른경우: 37

Length

Max length: 13
Median length: 13
Mean length: 9.8412017
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 제7조의3제2항에따른경우
2nd row: 제7조의3제2항에따른경우
3rd row: 제7조의3제2항에따른경우
4th row: 제7조의3제2항에따른경우
5th row: 제7조의3제2항에따른경우

Common Values

Value                      Count  Frequency (%)
제7조의3제2항에따른경우    478    68.4%
(blank)                    184    26.3%
제7조의3제3항에따른경우    37     5.3%
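
The frequency shares and the mean length reported for 민원구분 can be recomputed from the three counts alone. The sketch below assumes the unlabeled 184-row value is a single blank character, which is consistent with the reported minimum length of 1 but is not stated explicitly in this report.

```python
# Counts copied from the Common Values table. The 184-row value is
# assumed to be one blank character (reported minimum length is 1).
counts = {
    "제7조의3제2항에따른경우": 478,
    " ": 184,
    "제7조의3제3항에따른경우": 37,
}

n_rows = sum(counts.values())

# Share of each value, rounded to one decimal as in the report.
shares = {value: round(100 * count / n_rows, 1) for value, count in counts.items()}

# Mean string length, weighting each value's length by its row count.
mean_length = sum(len(value) * count for value, count in counts.items()) / n_rows

print("rows:", n_rows)
print("shares (%):", list(shares.values()))
print("mean length:", round(mean_length, 7))
```

With the blank counted as one character, the weighted mean is 6879 / 699, which agrees with the reported mean length of 9.8412017.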

Length

Histogram of lengths of the category

Common Values (Plot)

Value                      Count  Frequency (%)
제7조의3제2항에따른경우    478    92.8%
제7조의3제3항에따른경우    37     7.2%

업소명
Text

Distinct: 675
Distinct (%): 96.6%
Missing: 0
Missing (%): 0.0%
Memory size: 5.6 KiB

Length

Max length: 29
Median length: 19
Mean length: 8.0400572
Min length: 1

Characters and Unicode

Total characters: 5620
Distinct characters: 415
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
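
As an illustration of what a per-category character count involves, the sketch below classifies the characters of one business name from this report's sample rows using Python's standard `unicodedata` module. It is not necessarily how the profiling tool computes its tables; the two-letter codes map to the names used in this report (`Lo` is Other Letter, `Lu` is Uppercase Letter, `Nd` is Decimal Number, `Zs` is Space Separator).

```python
import unicodedata
from collections import Counter

# One business name taken from the sample rows of this report.
sample = "GS25 당진제철점"

# unicodedata.category returns the two-letter general category of a code point.
category_counts = Counter(unicodedata.category(ch) for ch in sample)

for category, count in category_counts.most_common():
    print(category, count)
```

Hangul syllables fall under Other Letter, which is why that category dominates the tables for the Korean text columns.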

Unique

Unique: 664
Unique (%): 95.0%

Sample

1st row: GS25 당진제철점
2nd row: 지에스25당진명지점
3rd row: 씨유 당진호반써밋 1차점
4th row: 두원 당진 반품샵
5th row: 지에스25수청아린점
Value               Count  Frequency (%)
세븐일레븐          54     5.3%
씨유                48     4.7%
gs25                36     3.5%
이마트24            28     2.7%
잡화상              13     1.3%
미니스톱            11     1.1%
지에스(gs)25        11     1.1%
지에스25            10     1.0%
슈퍼                8      0.8%
cu                  8      0.8%
Other values (696)  793    77.7%

Most occurring characters

ValueCountFrequency (%)
324
 
5.8%
308
 
5.5%
285
 
5.1%
280
 
5.0%
150
 
2.7%
145
 
2.6%
2 127
 
2.3%
123
 
2.2%
90
 
1.6%
5 87
 
1.5%
Other values (405) 3701
65.9%

Most occurring categories

ValueCountFrequency (%)
Other Letter 4701
83.6%
Space Separator 324
 
5.8%
Decimal Number 254
 
4.5%
Uppercase Letter 207
 
3.7%
Close Punctuation 58
 
1.0%
Open Punctuation 58
 
1.0%
Lowercase Letter 9
 
0.2%
Other Punctuation 8
 
0.1%
Dash Punctuation 1
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
308
 
6.6%
285
 
6.1%
280
 
6.0%
150
 
3.2%
145
 
3.1%
123
 
2.6%
90
 
1.9%
85
 
1.8%
82
 
1.7%
82
 
1.7%
Other values (365) 3071
65.3%
Uppercase Letter
ValueCountFrequency (%)
S 67
32.4%
G 65
31.4%
C 21
 
10.1%
U 14
 
6.8%
O 6
 
2.9%
K 4
 
1.9%
I 4
 
1.9%
A 4
 
1.9%
D 3
 
1.4%
E 3
 
1.4%
Other values (9) 16
 
7.7%
Decimal Number
ValueCountFrequency (%)
2 127
50.0%
5 87
34.3%
4 32
 
12.6%
1 4
 
1.6%
3 1
 
0.4%
6 1
 
0.4%
9 1
 
0.4%
8 1
 
0.4%
Lowercase Letter
ValueCountFrequency (%)
o 3
33.3%
k 1
 
11.1%
g 1
 
11.1%
e 1
 
11.1%
s 1
 
11.1%
l 1
 
11.1%
m 1
 
11.1%
Other Punctuation
ValueCountFrequency (%)
. 6
75.0%
& 2
 
25.0%
Space Separator
ValueCountFrequency (%)
324
100.0%
Close Punctuation
ValueCountFrequency (%)
) 58
100.0%
Open Punctuation
ValueCountFrequency (%)
( 58
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 4701
83.6%
Common 703
 
12.5%
Latin 216
 
3.8%

Most frequent character per script

Hangul
ValueCountFrequency (%)
308
 
6.6%
285
 
6.1%
280
 
6.0%
150
 
3.2%
145
 
3.1%
123
 
2.6%
90
 
1.9%
85
 
1.8%
82
 
1.7%
82
 
1.7%
Other values (365) 3071
65.3%
Latin
ValueCountFrequency (%)
S 67
31.0%
G 65
30.1%
C 21
 
9.7%
U 14
 
6.5%
O 6
 
2.8%
K 4
 
1.9%
I 4
 
1.9%
A 4
 
1.9%
D 3
 
1.4%
o 3
 
1.4%
Other values (16) 25
 
11.6%
Common
ValueCountFrequency (%)
324
46.1%
2 127
 
18.1%
5 87
 
12.4%
) 58
 
8.3%
( 58
 
8.3%
4 32
 
4.6%
. 6
 
0.9%
1 4
 
0.6%
& 2
 
0.3%
3 1
 
0.1%
Other values (4) 4
 
0.6%

Most occurring blocks

ValueCountFrequency (%)
Hangul 4701
83.6%
ASCII 919
 
16.4%

Most frequent character per block

ASCII
ValueCountFrequency (%)
324
35.3%
2 127
 
13.8%
5 87
 
9.5%
S 67
 
7.3%
G 65
 
7.1%
) 58
 
6.3%
( 58
 
6.3%
4 32
 
3.5%
C 21
 
2.3%
U 14
 
1.5%
Other values (30) 66
 
7.2%
Hangul
ValueCountFrequency (%)
308
 
6.6%
285
 
6.1%
280
 
6.0%
150
 
3.2%
145
 
3.1%
123
 
2.6%
90
 
1.9%
85
 
1.8%
82
 
1.7%
82
 
1.7%
Other values (365) 3071
65.3%

업소지번주소
Text

Distinct: 597
Distinct (%): 85.4%
Missing: 0
Missing (%): 0.0%
Memory size: 5.6 KiB

Length

Max length: 50
Median length: 43
Mean length: 22.236052
Min length: 1

Characters and Unicode

Total characters: 15543
Distinct characters: 300
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 591
Unique (%): 84.5%

Sample

1st row: 충청남도 당진시 송악읍 고대리 167-136
2nd row: 충청남도 당진시 송악읍 반촌리 46-4
3rd row: 충청남도 당진시 수청동 1551 호반써밋 시그니처 1차
4th row: 충청남도 당진시 읍내동 420-11
5th row: 충청남도 당진시 수청동 1362
Value               Count  Frequency (%)
충청남도            606    17.5%
당진시              457    13.2%
당진군              149    4.3%
송악읍              114    3.3%
읍내동              97     2.8%
석문면              67     1.9%
신평면              60     1.7%
합덕읍              56     1.6%
1호                 48     1.4%
송산면              47     1.4%
Other values (884)  1767   51.0%
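
The word-frequency table above counts tokens rather than whole addresses, which is why 충청남도 and 당진시 each appear hundreds of times. A minimal sketch of that kind of count, applied to the five sample rows only and assuming tokens are lowercased and split on whitespace (the tool's exact tokenization rules are not stated in this report):

```python
from collections import Counter

# The five sample rows listed for this variable.
rows = [
    "충청남도 당진시 송악읍 고대리 167-136",
    "충청남도 당진시 송악읍 반촌리 46-4",
    "충청남도 당진시 수청동 1551 호반써밋 시그니처 1차",
    "충청남도 당진시 읍내동 420-11",
    "충청남도 당진시 수청동 1362",
]

# Assumption: lowercase each row, then split on runs of whitespace.
word_counts = Counter(word for row in rows for word in row.lower().split())

for word, count in word_counts.most_common(4):
    print(word, count)
```

Even in this small sample the province and city tokens lead the ranking, followed by the 읍/동 district names, mirroring the order in the full table.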

Most occurring characters

ValueCountFrequency (%)
3084
19.8%
687
 
4.4%
682
 
4.4%
1 626
 
4.0%
623
 
4.0%
622
 
4.0%
612
 
3.9%
606
 
3.9%
497
 
3.2%
430
 
2.8%
Other values (290) 7074
45.5%

Most occurring categories

ValueCountFrequency (%)
Other Letter 9672
62.2%
Space Separator 3084
 
19.8%
Decimal Number 2621
 
16.9%
Dash Punctuation 122
 
0.8%
Other Punctuation 14
 
0.1%
Uppercase Letter 10
 
0.1%
Open Punctuation 8
 
0.1%
Close Punctuation 8
 
0.1%
Math Symbol 3
 
< 0.1%
Lowercase Letter 1
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
687
 
7.1%
682
 
7.1%
623
 
6.4%
622
 
6.4%
612
 
6.3%
606
 
6.3%
497
 
5.1%
430
 
4.4%
426
 
4.4%
403
 
4.2%
Other values (265) 4084
42.2%
Decimal Number
ValueCountFrequency (%)
1 626
23.9%
2 326
12.4%
3 288
11.0%
5 242
 
9.2%
4 242
 
9.2%
6 216
 
8.2%
9 190
 
7.2%
8 173
 
6.6%
0 173
 
6.6%
7 145
 
5.5%
Uppercase Letter
ValueCountFrequency (%)
G 2
20.0%
A 2
20.0%
B 2
20.0%
H 1
10.0%
L 1
10.0%
F 1
10.0%
S 1
10.0%
Other Punctuation
ValueCountFrequency (%)
. 12
85.7%
@ 2
 
14.3%
Space Separator
ValueCountFrequency (%)
3084
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 122
100.0%
Open Punctuation
ValueCountFrequency (%)
( 8
100.0%
Close Punctuation
ValueCountFrequency (%)
) 8
100.0%
Math Symbol
ValueCountFrequency (%)
~ 3
100.0%
Lowercase Letter
ValueCountFrequency (%)
s 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 9672
62.2%
Common 5860
37.7%
Latin 11
 
0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
687
 
7.1%
682
 
7.1%
623
 
6.4%
622
 
6.4%
612
 
6.3%
606
 
6.3%
497
 
5.1%
430
 
4.4%
426
 
4.4%
403
 
4.2%
Other values (265) 4084
42.2%
Common
ValueCountFrequency (%)
3084
52.6%
1 626
 
10.7%
2 326
 
5.6%
3 288
 
4.9%
5 242
 
4.1%
4 242
 
4.1%
6 216
 
3.7%
9 190
 
3.2%
8 173
 
3.0%
0 173
 
3.0%
Other values (7) 300
 
5.1%
Latin
ValueCountFrequency (%)
G 2
18.2%
A 2
18.2%
B 2
18.2%
H 1
9.1%
L 1
9.1%
s 1
9.1%
F 1
9.1%
S 1
9.1%

Most occurring blocks

ValueCountFrequency (%)
Hangul 9672
62.2%
ASCII 5871
37.8%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3084
52.5%
1 626
 
10.7%
2 326
 
5.6%
3 288
 
4.9%
5 242
 
4.1%
4 242
 
4.1%
6 216
 
3.7%
9 190
 
3.2%
8 173
 
2.9%
0 173
 
2.9%
Other values (15) 311
 
5.3%
Hangul
ValueCountFrequency (%)
687
 
7.1%
682
 
7.1%
623
 
6.4%
622
 
6.4%
612
 
6.3%
606
 
6.3%
497
 
5.1%
430
 
4.4%
426
 
4.4%
403
 
4.2%
Other values (265) 4084
42.2%

업소도로명주소
Text

Distinct: 645
Distinct (%): 92.3%
Missing: 0
Missing (%): 0.0%
Memory size: 5.6 KiB

Length

Max length: 59
Median length: 52
Mean length: 25.266094
Min length: 1

Characters and Unicode

Total characters: 17661
Distinct characters: 354
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 641
Unique (%): 91.7%

Sample

1st row: 충청남도 당진시 송악읍 대섬길 4-5
2nd row: 충청남도 당진시 송악읍 반촌로 288-1
3rd row: 충청남도 당진시 수청3로 20. 116동 103. 104호 (수청동. 호반써밋 시그니처 1차)
4th row: 충청남도 당진시 서해로 5740-6 (읍내동)
5th row: 충청남도 당진시 수청1길 47. 106호 (수청동)
Value                Count  Frequency (%)
충청남도             647    17.2%
당진시               644    17.1%
송악읍               131    3.5%
읍내동               103    2.7%
석문면               72     1.9%
1층                  68     1.8%
신평면               63     1.7%
합덕읍               55     1.5%
송산면               50     1.3%
101호                28     0.7%
Other values (1008)  1908   50.6%