Overview

Dataset statistics

Number of variables: 9
Number of observations: 1369
Missing cells: 1457
Missing cells (%): 11.8%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 99.1 KiB
Average record size in memory: 74.1 B
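
The two derived figures in this table follow from the raw counts. The sketch below recomputes them from the numbers reported above. The variable names are illustrative, and the formulas are the natural definitions (cells = rows × columns, record size = total memory / rows), not a quote of the profiler's source.

```python
# Values copied from the Dataset statistics table.
n_variables = 9
n_observations = 1369
missing_cells = 1457
total_size_kib = 99.1

# Missing cells (%) is the share of all cells (rows x columns) that are empty.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Average record size is the total memory footprint divided by the row count.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"{total_cells} cells, {missing_pct:.1f}% missing")
print(f"{avg_record_bytes:.1f} B per record")
```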

Variable types

Text: 5
Categorical: 2
Numeric: 2

Dataset

Description: Livestock and poultry farm information managed by Buyeo-gun, Chungcheongnam-do (farm name, farm size, head count, road-name address, representative's name, livestock type, etc.)
Author: Chungcheongnam-do
URL: https://alldam.chungnam.go.kr/index.chungnam?menuCd=DOM_000000201001001001&st=&cds=&orgCd=&apiType=&isOpen=Y&pageIndex=396&beforeMenuCd=DOM_000000201001001000&publicdatapk=15047891

Alerts

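데이터기준일자 (data reference date) has constant value "" [Constant]
사육두수 (head count) is highly overall correlated with 규모 (farm size) [High correlation]
규모 (farm size) is highly overall correlated with 사육두수 (head count) [High correlation]
축종 (livestock species) is highly imbalanced (62.5%) [Imbalance]
소재지(도로명) (road-name address) has 723 (52.8%) missing values [Missing]
연락처 (contact number) has 734 (53.6%) missing values [Missing]
사육두수 (head count) has 35 (2.6%) zeros [Zeros]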
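
The 62.5% imbalance figure for 축종 can be understood as one minus the normalized Shannon entropy of the category counts. The sketch below applies that definition to the counts listed in the 축종 Common Values table of this report. Treat the formula as an assumption about how the score is derived: it reproduces the reported figure, but the function name `imbalance_score` is illustrative and not taken from the profiler's API.

```python
import math

# Category counts for 축종, copied from the Common Values table in this report.
counts = [1088, 109, 46, 45, 41, 11, 10, 10, 7, 2]


def imbalance_score(counts):
    """Return 1 - H / H_max, where H is the Shannon entropy of the counts.

    0 means perfectly balanced classes; values near 1 mean one class dominates.
    """
    total = sum(counts)
    n_classes = len(counts)
    if n_classes <= 1:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(n_classes)


score = imbalance_score(counts)
print(f"imbalance = {score:.1%}")
```

A single class (한우, 1088 of 1369 rows) accounts for most of the score, which is why the alert fires even though ten distinct values exist.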

Reproduction

Analysis started: 2024-01-09 21:52:47.382240
Analysis finished: 2024-01-09 21:52:48.448337
Duration: 1.07 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
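
A report of this kind is typically produced by loading the source table into pandas and passing it to ydata-profiling. The following is a minimal sketch under stated assumptions: the helper name `build_report`, the file paths, the report title, and the default encoding are all hypothetical, and the configuration actually used for this report is the one in config.json, which is not reproduced here.

```python
def build_report(csv_path, html_path="report.html", encoding="utf-8"):
    """Profile a CSV file and write an HTML report (hypothetical helper)."""
    # Imported lazily so the helper can be defined without the packages installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # The encoding is an assumption; adjust it to match the source file.
    df = pd.read_csv(csv_path, encoding=encoding)

    # Default settings; the original run may have used different options.
    profile = ProfileReport(df, title="Dataset profile")
    profile.to_file(html_path)
    return html_path
```

Calling `build_report("farms.csv")` (a placeholder file name) would write `report.html` next to the script.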

Variables

Distinct: 1061
Distinct (%): 77.5%
Missing: 0
Missing (%): 0.0%
Memory size: 10.8 KiB

Length

Max length: 14
Median length: 4
Mean length: 4.2615047
Min length: 2

Characters and Unicode

Total characters: 5834
Distinct characters: 346
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 876
Unique (%): 64.0%

Sample

1st row: 신안농장
2nd row: 백제양돈영농조합법인
3rd row: 달성양돈
4th row: 동이농장
5th row: 부운팜
Value  Count  Frequency (%)
우리농장  15  1.1%
한우농장  13  0.9%
청송농장  10  0.7%
라복농장  8  0.6%
농장  8  0.6%
가중농장  7  0.5%
수목농장  7  0.5%
부부농장  6  0.4%
거전농장  6  0.4%
나령농장  6  0.4%
Other values (1062)  1310  93.8%

Most occurring characters

Count  Frequency (%)
1184  20.3%
1037  17.8%
189  3.2%
163  2.8%
130  2.2%
89  1.5%
75  1.3%
74  1.3%
72  1.2%
67  1.1%
Other values (336)  2754  47.2%

Most occurring categories

Value  Count  Frequency (%)
Other Letter  5784  99.1%
Space Separator  27  0.5%
Decimal Number  12  0.2%
Uppercase Letter  4  0.1%
Open Punctuation  3  0.1%
Close Punctuation  3  0.1%
Other Punctuation  1  < 0.1%

Most frequent character per category

Other Letter
Count  Frequency (%)
1184  20.5%
1037  17.9%
189  3.3%
163  2.8%
130  2.2%
89  1.5%
75  1.3%
74  1.3%
72  1.2%
67  1.2%
Other values (324)  2704  46.7%

Decimal Number
Value  Count  Frequency (%)
2  8  66.7%
3  2  16.7%
6  1  8.3%
5  1  8.3%

Uppercase Letter
Value  Count  Frequency (%)
K  1  25.0%
C  1  25.0%
S  1  25.0%
M  1  25.0%

Space Separator
Value  Count  Frequency (%)
(space)  27  100.0%

Open Punctuation
Value  Count  Frequency (%)
(  3  100.0%

Close Punctuation
Value  Count  Frequency (%)
)  3  100.0%

Other Punctuation
Value  Count  Frequency (%)
&  1  100.0%

Most occurring scripts

Value  Count  Frequency (%)
Hangul  5784  99.1%
Common  46  0.8%
Latin  4  0.1%

Most frequent character per script

Hangul
Count  Frequency (%)
1184  20.5%
1037  17.9%
189  3.3%
163  2.8%
130  2.2%
89  1.5%
75  1.3%
74  1.3%
72  1.2%
67  1.2%
Other values (324)  2704  46.7%

Common
Value  Count  Frequency (%)
(space)  27  58.7%
2  8  17.4%
(  3  6.5%
)  3  6.5%
3  2  4.3%
6  1  2.2%
5  1  2.2%
&  1  2.2%

Latin
Value  Count  Frequency (%)
K  1  25.0%
C  1  25.0%
S  1  25.0%
M  1  25.0%
25.0%

Most occurring blocks

Value  Count  Frequency (%)
Hangul  5784  99.1%
ASCII  50  0.9%

Most frequent character per block

Hangul
Count  Frequency (%)
1184  20.5%
1037  17.9%
189  3.3%
163  2.8%
130  2.2%
89  1.5%
75  1.3%
74  1.3%
72  1.2%
67  1.2%
Other values (324)  2704  46.7%

ASCII
Value  Count  Frequency (%)
(space)  27  54.0%
2  8  16.0%
(  3  6.0%
)  3  6.0%
3  2  4.0%
6  1  2.0%
5  1  2.0%
K  1  2.0%
&  1  2.0%
C  1  2.0%
Other values (2)  2  4.0%

축종
Categorical

IMBALANCE 

Distinct: 10
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 10.8 KiB
한우  1088
육계  109
염소  46
젖소  45
돼지  41
Other values (5)  40

Length

Max length: 3
Median length: 2
Mean length: 2.0080351
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 돼지
2nd row: 돼지
3rd row: 돼지
4th row: 돼지
5th row: 돼지

Common Values

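Value  Count  Frequency (%)
한우 (Hanwoo cattle)  1088  79.5%
육계 (broiler chicken)  109  8.0%
염소 (goat)  46  3.4%
젖소 (dairy cattle)  45  3.3%
돼지 (pig)  41  3.0%
산란계 (laying hen)  11  0.8%
사슴 (deer)  10  0.7%
산양 (sanyang goat)  10  0.7%
육우 (beef cattle)  7  0.5%
오리 (duck)  2  0.1%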

Length

Histogram of lengths of the category

Common Values (Plot)


사육두수
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 182
Distinct (%): 13.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3812.3492
Minimum: 0
Maximum: 300000
Zeros: 35
Zeros (%): 2.6%
Negative: 0
Negative (%): 0.0%
Memory size: 12.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 2
Q1: 8
Median: 20
Q3: 51
95-th percentile: 32600
Maximum: 300000
Range: 300000
Interquartile range (IQR): 43

Descriptive statistics

Standard deviation: 16572.168
Coefficient of variation (CV): 4.3469701
Kurtosis: 93.445267
Mean: 3812.3492
Median Absolute Deviation (MAD): 15
Skewness: 7.6947032
Sum: 5219106
Variance: 2.7463675 × 10^8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value  Count  Frequency (%)
10  71  5.2%
30  71  5.2%
20  68  5.0%
5  56  4.1%
7  45  3.3%
4  43  3.1%
6  42  3.1%
8  42  3.1%
50  42  3.1%
3  40  2.9%
Other values (172)  849  62.0%

Minimum 10 values
Value  Count  Frequency (%)
0  35  2.6%
1  9  0.7%
2  32  2.3%
3  40  2.9%
4  43  3.1%
5  56  4.1%
6  42  3.1%
7  45  3.3%
8  42  3.1%
9  21  1.5%

Maximum 10 values
Value  Count  Frequency (%)
300000  1  0.1%
160000  1  0.1%
142000  1  0.1%
140000  1  0.1%
100000  3  0.2%
99000  1  0.1%
90000  1  0.1%
88000  1  0.1%
80000  2  0.1%
75000  2  0.1%
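
The summary statistics reported for 사육두수 are tied together by simple identities, which makes them easy to cross-check. The sketch below recomputes the mean, coefficient of variation, variance, and IQR from the other reported values. All inputs are copied from the tables above, and the variable names are illustrative.

```python
# Values copied from the 사육두수 statistics tables.
n_observations = 1369
total = 5219106
std = 16572.168
q1, q3 = 8, 51
median = 20

mean = total / n_observations   # Mean = Sum / number of observations
cv = std / mean                 # Coefficient of variation = std / mean
variance = std ** 2             # Variance = standard deviation squared
iqr = q3 - q1                   # Interquartile range = Q3 - Q1

print(f"mean={mean:.4f} cv={cv:.4f} variance={variance:.4e} iqr={iqr}")

# A mean far above the median is what a strongly right-skewed column looks like.
print(f"mean / median = {mean / median:.1f}")
```

The gap between the mean and the median is consistent with the reported skewness of 7.69: a handful of very large flocks (up to 300000 head) pull the mean upward while most farms hold a few dozen animals.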

규모
Real number (ℝ)

HIGH CORRELATION 

Distinct: 991
Distinct (%): 72.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 903.78655
Minimum: 0
Maximum: 33810.52
Zeros: 2
Zeros (%): 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 12.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 48.2
Q1: 151.29
Median: 375.95
Q3: 1025
95-th percentile: 3269.3
Maximum: 33810.52
Range: 33810.52
Interquartile range (IQR): 873.71

Descriptive statistics

Standard deviation: 1712.1591
Coefficient of variation (CV): 1.8944286
Kurtosis: 142.91675
Mean: 903.78655
Median Absolute Deviation (MAD): 290.95
Skewness: 9.3210468
Sum: 1237283.8
Variance: 2931488.6
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value  Count  Frequency (%)
396.0  14  1.0%
330.0  13  0.9%
250.0  12  0.9%
350.0  11  0.8%
500.0  11  0.8%
50.0  10  0.7%
96.0  10  0.7%
80.0  9  0.7%
150.0  9  0.7%
48.0  9  0.7%
Other values (981)  1261  92.1%

Minimum 10 values
Value  Count  Frequency (%)
0.0  2  0.1%
13.0  1  0.1%
15.0  2  0.1%
18.0  2  0.1%
20.0  1  0.1%
21.2  1  0.1%
22.1  1  0.1%
22.55  1  0.1%
23.0  1  0.1%
24.0  1  0.1%

Maximum 10 values
Value  Count  Frequency (%)
33810.52  1  0.1%
26400.0  1  0.1%
14428.23  1  0.1%
14302.06  1  0.1%
11517.44  1  0.1%
10458.77  1  0.1%
9750.0  1  0.1%
7978.1  1  0.1%
7375.0  2  0.1%
7323.6  1  0.1%

소재지(도로명)
Text

MISSING 

Distinct: 611
Distinct (%): 94.6%
Missing: 723
Missing (%): 52.8%
Memory size: 10.8 KiB

Length

Max length: 29
Median length: 27
Mean length: 23.010836
Min length: 18

Characters and Unicode

Total characters: 14865
Distinct characters: 157
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 582
Unique (%): 90.1%
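
For a column with missing values, the distinct and unique percentages and the mean length only reconcile with the raw counts when the denominator is the number of non-missing rows rather than the full row count. The sketch below checks this with the figures reported for 소재지(도로명). The variable names are illustrative, and the choice of denominator is inferred from the numbers rather than taken from the profiler's documentation.

```python
# Values copied from the 소재지(도로명) tables and the dataset overview.
n_rows = 1369
n_missing = 723
n_distinct = 611
n_unique = 582
total_chars = 14865

# Only rows that actually hold an address take part in the text statistics.
n_present = n_rows - n_missing

missing_pct = 100 * n_missing / n_rows
distinct_pct = 100 * n_distinct / n_present
unique_pct = 100 * n_unique / n_present
mean_length = total_chars / n_present

print(f"present={n_present} missing={missing_pct:.1f}%")
print(f"distinct={distinct_pct:.1f}% unique={unique_pct:.1f}% mean length={mean_length:.6f}")
```

The 646 non-missing rows also match the counts of 충청남도 and 부여군 in the word-frequency table, since every recorded address begins with those two tokens.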

Sample

1st row: 충청남도 부여군 은산면 회곡저실로 55
2nd row: 충청남도 부여군 장암면 위덕로 588-23
3rd row: 충청남도 부여군 초촌면 선사로226번길 16
4th row: 충청남도 부여군 옥산면 옥산북로 285
5th row: 충청남도 부여군 홍산면 조현로43번길 85-53
Value  Count  Frequency (%)
충청남도  646  20.0%
부여군  646  20.0%
은산면  135  4.2%
규암면  56  1.7%
부여읍  54  1.7%
장암면  48  1.5%
초촌면  47  1.5%
남면  43  1.3%
외산면  41  1.3%
세도면  39  1.2%
Other values (839)  1475  45.7%

Most occurring characters

Count  Frequency (%)
2584  17.4%
760  5.1%
737  5.0%
711  4.8%
706  4.7%
700  4.7%
650  4.4%
646  4.3%
638  4.3%
592  4.0%
Other values (147)  6141  41.3%

Most occurring categories

Value  Count  Frequency (%)
Other Letter  9094  61.2%
Decimal Number  2821  19.0%
Space Separator  2584  17.4%
Dash Punctuation  366  2.5%

Most frequent character per category

Other Letter
Count  Frequency (%)
760  8.4%
737  8.1%
711  7.8%
706  7.8%
700  7.7%
650  7.1%
646  7.1%
638  7.0%
592  6.5%
305  3.4%
Other values (135)  2649  29.1%

Decimal Number
Value  Count  Frequency (%)
1  514  18.2%
2  362  12.8%
3  351  12.4%
4  276  9.8%
5  255  9.0%
6  246  8.7%
9  216  7.7%
7  215  7.6%
8  205  7.3%
0  181  6.4%

Space Separator
Value  Count  Frequency (%)
(space)  2584  100.0%

Dash Punctuation
Value  Count  Frequency (%)
-  366  100.0%

Most occurring scripts

Value  Count  Frequency (%)
Hangul  9094  61.2%
Common  5771  38.8%

Most frequent character per script

Hangul
Count  Frequency (%)
760  8.4%
737  8.1%
711  7.8%
706  7.8%
700  7.7%
650  7.1%
646  7.1%
638  7.0%
592  6.5%
305  3.4%
Other values (135)  2649  29.1%

Common
Value  Count  Frequency (%)
(space)  2584  44.8%
1  514  8.9%
-  366  6.3%
2  362  6.3%
3  351  6.1%
4  276  4.8%
5  255  4.4%
6  246  4.3%
9  216  3.7%
7  215  3.7%
Other values (2)  386  6.7%

Most occurring blocks

Value  Count  Frequency (%)
Hangul  9094  61.2%
ASCII  5771  38.8%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
(space)  2584  44.8%
1  514  8.9%
-  366  6.3%
2  362  6.3%
3  351  6.1%
4  276  4.8%
5  255  4.4%
6  246  4.3%
9  216  3.7%
7  215  3.7%
Other values (2)  386  6.7%

Hangul
Count  Frequency (%)
760  8.4%
737  8.1%
711  7.8%
706  7.8%
700  7.7%
650  7.1%
646  7.1%
638  7.0%
592  6.5%
305  3.4%
Other values (135)  2649  29.1%

Distinct: 1308
Distinct (%): 95.5%
Missing: 0
Missing (%): 0.0%
Memory size: 10.8 KiB