Overview

Dataset statistics

Number of variables: 12
Number of observations: 126
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 12.4 KiB
Average record size in memory: 101.0 B

Variable types

Numeric: 4
Categorical: 3
Text: 3
DateTime: 2

Dataset

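Description: Soil and stone extraction (토석채취) records for Chungcheongnam-do, published as open data and listed by soil/stone classification (토석구분), main administrative disposition (주된행정처분), location (소재지), lot number (지번), use (용도), area (면적), and other fields.
URL: https://www.data.go.kr/data/15032200/fileData.do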

Alerts

건별 is highly overall correlated with 사업장별 [High correlation]
사업장별 is highly overall correlated with 건별 [High correlation]
면적(천제곱미터) is highly overall correlated with 수량(천세제곱미터) and 1 other field [High correlation]
수량(천세제곱미터) is highly overall correlated with 면적(천제곱미터) and 1 other field [High correlation]
토석구분 is highly overall correlated with 면적(천제곱미터) and 2 other fields [High correlation]
용도 is highly overall correlated with 토석구분 [High correlation]
유형(주된처분) is highly imbalanced (55.5%) [Imbalance]
건별 has unique values [Unique]

Reproduction

Analysis started: 2023-12-12 05:31:18.507183
Analysis finished: 2023-12-12 05:31:21.640717
Duration: 3.13 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
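
The reported duration can be checked directly from the two timestamps. The following is a minimal sketch using only the Python standard library; the format string is an assumption about how the timestamps are laid out, inferred from the values shown above.

```python
from datetime import datetime

# Timestamps copied from the Reproduction block of the report.
FMT = "%Y-%m-%d %H:%M:%S.%f"  # assumed layout, inferred from the values
started = datetime.strptime("2023-12-12 05:31:18.507183", FMT)
finished = datetime.strptime("2023-12-12 05:31:21.640717", FMT)

# Elapsed wall-clock time in seconds.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```

Rounded to two decimals, the difference agrees with the Duration line of the report.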

Variables

건별 (per case)
Real number (ℝ)

HIGH CORRELATION, UNIQUE

Distinct: 126
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 63.5
Minimum: 1
Maximum: 126
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 7.25
Q1: 32.25
Median: 63.5
Q3: 94.75
95-th percentile: 119.75
Maximum: 126
Range: 125
Interquartile range (IQR): 62.5

Descriptive statistics

Standard deviation: 36.517119
Coefficient of variation (CV): 0.57507274
Kurtosis: -1.2
Mean: 63.5
Median Absolute Deviation (MAD): 31.5
Skewness: 0
Sum: 8001
Variance: 1333.5
Monotonicity: Strictly increasing
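
The statistics above are what one would expect if 건별 were simply the running index 1 to 126. The report does not say this outright, so the sketch below treats it as an assumption and recomputes the main figures with the standard library. The `inclusive` quantile method is chosen here because it interpolates linearly between the minimum and maximum; whether the profiler uses the same rule is also an assumption.

```python
import statistics

# Assumption: 건별 is the sequence 1..126 (unique, strictly increasing,
# min 1, max 126, 126 observations).
values = list(range(1, 127))

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance (n - 1 denominator)
std = statistics.stdev(values)
cv = std / mean                         # coefficient of variation
median = statistics.median(values)
mad = statistics.median(abs(v - median) for v in values)

# Quartiles and the 5th/95th percentiles with linear interpolation.
quartiles = statistics.quantiles(values, n=4, method="inclusive")
twentieths = statistics.quantiles(values, n=20, method="inclusive")
p5, p95 = twentieths[0], twentieths[-1]

print(total, mean, variance, round(std, 6), round(cv, 8), mad)
print(quartiles, p5, p95, quartiles[2] - quartiles[0])
```

Under that assumption the recomputed sum, mean, variance, MAD, quartiles, and percentiles match the reported values.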
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
1                   1      0.8%
81                  1      0.8%
94                  1      0.8%
93                  1      0.8%
92                  1      0.8%
91                  1      0.8%
90                  1      0.8%
89                  1      0.8%
88                  1      0.8%
87                  1      0.8%
Other values (116)  116    92.1%

Value               Count  Frequency (%)
1                   1      0.8%
2                   1      0.8%
3                   1      0.8%
4                   1      0.8%
5                   1      0.8%
6                   1      0.8%
7                   1      0.8%
8                   1      0.8%
9                   1      0.8%
10                  1      0.8%

Value               Count  Frequency (%)
126                 1      0.8%
125                 1      0.8%
124                 1      0.8%
123                 1      0.8%
122                 1      0.8%
121                 1      0.8%
120                 1      0.8%
119                 1      0.8%
118                 1      0.8%
117                 1      0.8%

사업장별 (per business site)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 98
Distinct (%): 77.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 48.349206
Minimum: 1
Maximum: 98
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 7.25
Q1: 27.25
Median: 44.5
Q3: 71.75
95-th percentile: 91.75
Maximum: 98
Range: 97
Interquartile range (IQR): 44.5

Descriptive statistics

Standard deviation: 26.719975
Coefficient of variation (CV): 0.55264558
Kurtosis: -1.0456782
Mean: 48.349206
Median Absolute Deviation (MAD): 21
Skewness: 0.15261541
Sum: 6092
Variance: 713.95708
Monotonicity: Increasing
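
Several of the figures above are derived from others, which gives a quick internal consistency check without access to the raw column. This is a sketch only: it uses the summary values printed in the report and the standard definitions (mean = sum / n, standard deviation = square root of variance, CV = standard deviation / mean, IQR = Q3 - Q1, range = maximum - minimum).

```python
import math

# Summary figures copied from the 사업장별 statistics in the report.
n = 126                      # number of observations, none missing
total = 6092                 # Sum
variance = 713.95708         # Variance
q1, q3 = 27.25, 71.75        # Q1, Q3
minimum, maximum = 1, 98     # Minimum, Maximum

mean = total / n
std = math.sqrt(variance)
cv = std / mean
iqr = q3 - q1
value_range = maximum - minimum

print(round(mean, 6), round(std, 5), round(cv, 6), iqr, value_range)
```

Each derived value agrees with the corresponding line of the report to within the rounding shown there.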
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
40                  7      5.6%
39                  5      4.0%
50                  3      2.4%
72                  3      2.4%
32                  3      2.4%
25                  3      2.4%
87                  3      2.4%
86                  2      1.6%
52                  2      1.6%
49                  2      1.6%
Other values (88)   93     73.8%

Value               Count  Frequency (%)
1                   1      0.8%
2                   1      0.8%
3                   1      0.8%
4                   1      0.8%
5                   1      0.8%
6                   1      0.8%
7                   1      0.8%
8                   1      0.8%
9                   1      0.8%
10                  1      0.8%

Value               Count  Frequency (%)
98                  1      0.8%
97                  1      0.8%
96                  1      0.8%
95                  1      0.8%
94                  1      0.8%
93                  1      0.8%
92                  1      0.8%
91                  1      0.8%
90                  1      0.8%
89                  1      0.8%

토석구분 (soil/stone classification)
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): 2.4%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

석재 (stone): 84
토사 (earth and sand): 41
토석 (soil and stone): 1

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 1
Unique (%): 0.8%

Sample

1st row: 토사
2nd row: 토사
3rd row: 토사
4th row: 석재
5th row: 토사

Common Values

Value  Count  Frequency (%)
석재   84     66.7%
토사   41     32.5%
토석   1      0.8%
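
The Distinct, Unique, and Frequency (%) figures for a categorical variable all follow from the value counts. The sketch below recomputes them from the counts in the table above, assuming "Unique" means the number of values that occur exactly once, which is consistent with the numbers shown here.

```python
from collections import Counter

# Value counts copied from the Common Values table for 토석구분.
counts = Counter({"석재": 84, "토사": 41, "토석": 1})
n = sum(counts.values())  # total observations

# Relative frequency of each value, most common first.
for value, count in counts.most_common():
    print(f"{value} {count} {count / n:.1%}")

distinct = len(counts)                               # number of different values
unique = sum(1 for c in counts.values() if c == 1)   # values seen exactly once
print(f"Distinct {distinct} ({distinct / n:.1%})")
print(f"Unique {unique} ({unique / n:.1%})")
```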

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

유형(주된처분) (type, main disposition)
Categorical

IMBALANCE 

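Distinct: 6
Distinct (%): 4.8%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

허가 (permit): 93
부수적(산지전용) (incidental, mountainous district conversion): 25
채석단지(채석신고) (quarry complex, quarrying report): 4
부수적(산지일시사용-채광) (incidental, temporary mountainous district use for mining): 2
협의 (consultation): 1

Length

Max length: 14
Median length: 2
Mean length: 3.8333333
Min length: 2

Unique

Unique: 2
Unique (%): 1.6%

Sample

1st row: 부수적(산지전용)
2nd row: 부수적(산지전용)
3rd row: 부수적(산지전용)
4th row: 부수적(산지전용)
5th row: 부수적(산지전용)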

Common Values

Value                      Count  Frequency (%)
허가                       93     73.8%
부수적(산지전용)           25     19.8%
채석단지(채석신고)         4      3.2%
부수적(산지일시사용-채광)  2      1.6%
협의                       1      0.8%
신고                       1      0.8%
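
The 55.5% in the IMBALANCE alert for this variable is consistent with a score of one minus the normalized Shannon entropy of the class distribution. The sketch below assumes that definition (the report itself does not state the formula) and applies it to the counts in the table above.

```python
import math

# Class counts copied from the Common Values table for 유형(주된처분).
counts = [93, 25, 4, 2, 1, 1]
n = sum(counts)

# Shannon entropy of the class distribution, in bits.
entropy = -sum((c / n) * math.log2(c / n) for c in counts)

# Assumed definition: 0 for a perfectly balanced variable,
# approaching 1 as a single class dominates.
imbalance = 1 - entropy / math.log2(len(counts))
print(f"{imbalance:.1%}")
```

With six classes the maximum possible entropy is log2(6), so dividing by it puts the score on a 0 to 1 scale regardless of the number of categories.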

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

Distinct: 88
Distinct (%): 69.8%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

Length

Max length: 15
Median length: 11
Mean length: 11.111111
Min length: 7

Characters and Unicode

Total characters: 1400
Distinct characters: 134
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 70
Unique (%): 55.6%

Sample

1st row: 천안시 동남구 구성동
2nd row: 천안시 서북구 성거읍 오목리
3rd row: 천안시 동남구 목천읍 응원리
4th row: 천안시 동남구 목천읍 소사리
5th row: 천안시 서북구 입장면 용정리

Value               Count  Frequency (%)
서산시              18     4.7%
보령시              16     4.2%
당진시              15     3.9%
천안시              14     3.7%
금산군              13     3.4%
아산시              11     2.9%
웅천읍              10     2.6%
태안군              10     2.6%
부여군              10     2.6%
평리                9      2.4%
Other values (149)  255    66.9%

Most occurring characters

Value               Count  Frequency (%)
(space)             261    18.6%
                    119    8.5%
                    96     6.9%
                    83     5.9%
                    73     5.2%
                    46     3.3%
                    35     2.5%
                    31     2.2%
                    30     2.1%
                    26     1.9%
Other values (124)  600    42.9%

Most occurring categories

Value               Count  Frequency (%)
Other Letter        1139   81.4%
Space Separator     261    18.6%

Most frequent character per category

Other Letter
Value               Count  Frequency (%)
                    119    10.4%
                    96     8.4%
                    83     7.3%
                    73     6.4%
                    46     4.0%
                    35     3.1%
                    31     2.7%
                    30     2.6%
                    26     2.3%
                    21     1.8%
Other values (123)  579    50.8%

Space Separator
Value               Count  Frequency (%)
(space)             261    100.0%

Most occurring scripts

Value               Count  Frequency (%)
Hangul              1139   81.4%
Common              261    18.6%

Most frequent character per script

Hangul
Value               Count  Frequency (%)
                    119    10.4%
                    96     8.4%
                    83     7.3%
                    73     6.4%
                    46     4.0%
                    35     3.1%
                    31     2.7%
                    30     2.6%
                    26     2.3%
                    21     1.8%
Other values (123)  579    50.8%

Common
Value               Count  Frequency (%)
(space)             261    100.0%

Most occurring blocks

Value               Count  Frequency (%)
Hangul              1139   81.4%
ASCII               261    18.6%

Most frequent character per block

ASCII
Value               Count  Frequency (%)
(space)             261    100.0%

Hangul
Value               Count  Frequency (%)
                    119    10.4%
                    96     8.4%
                    83     7.3%
                    73     6.4%
                    46     4.0%
                    35     3.1%
                    31     2.7%
                    30     2.6%
                    26     2.3%
                    21     1.8%
Other values (123)  579    50.8%

지번 (lot number)
Text

Distinct: 125
Distinct (%): 99.2%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

Length

Max length: 14
Median length: 10
Mean length: 8.547619
Min length: 2

Characters and Unicode

Total characters: 1077
Distinct characters: 18
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
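
As an illustration of those character properties, the sketch below classifies each character of the five 지번 sample values shown in this section by its Unicode general category, using Python's standard `unicodedata` module. The `LABELS` mapping from category codes to the names used in the report is introduced here for illustration. Only the five samples are counted, so the totals are not expected to match the full-column figures.

```python
import unicodedata
from collections import Counter

# The five sample 지번 values listed in this section of the report.
samples = ["1-22 외 4필지", "산22-1 외 8필지", "166-1외", "187-13", "산12-1 외 4필지"]

# Illustrative mapping from general-category codes to the report's labels.
LABELS = {
    "Nd": "Decimal Number",
    "Lo": "Other Letter",
    "Zs": "Space Separator",
    "Pd": "Dash Punctuation",
    "Po": "Other Punctuation",
}

categories = Counter()
for text in samples:
    for ch in text:
        code = unicodedata.category(ch)          # e.g. "Nd" for a digit
        categories[LABELS.get(code, code)] += 1  # fall back to the raw code

for name, count in categories.most_common():
    print(name, count)
```

Hangul syllables such as 산 or 외 fall under Other Letter, while the hyphen in a lot number is Dash Punctuation, which is why a column of short lot numbers spans several categories.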

Unique

Unique: 124
Unique (%): 98.4%

Sample

1st row: 1-22 외 4필지
2nd row: 산22-1 외 8필지
3rd row: 166-1외
4th row: 187-13
5th row: 산12-1 외 4필지

Value               Count  Frequency (%)
                    76     25.9%
2필                 13     4.4%
1필                 11     3.7%
2필지               6      2.0%
6필                 5      1.7%
4필지               5      1.7%
1                   4      1.4%
산6                 3      1.0%
7필지               3      1.0%
10필                3      1.0%
Other values (148)  165    56.1%

Most occurring characters

Value               Count  Frequency (%)
(space)             169    15.7%
1                   135    12.5%
-                   99     9.2%
                    92     8.5%
                    78     7.2%
2                   78     7.2%
                    77     7.1%
6                   52     4.8%
4                   50     4.6%
3                   44     4.1%
Other values (8)    203    18.8%

Most occurring categories

Value               Count  Frequency (%)
Decimal Number      519    48.2%
Other Letter        288    26.7%
Space Separator     169    15.7%
Dash Punctuation    99     9.2%
Other Punctuation   2      0.2%

Most frequent character per category

Decimal Number
Value               Count  Frequency (%)
1                   135    26.0%
2                   78     15.0%
6                   52     10.0%
4                   50     9.6%
3                   44     8.5%
7                   42     8.1%
5                   40     7.7%
0                   27     5.2%
8                   26     5.0%
9                   25     4.8%

Other Letter
Value               Count  Frequency (%)
                    92     31.9%
                    78     27.1%
                    77     26.7%
35