Overview

Dataset statistics

Number of variables: 4
Number of observations: 10000
Missing cells: 6
Missing cells (%): < 0.1%
Duplicate rows: 141
Duplicate rows (%): 1.4%
Total size in memory: 400.4 KiB
Average record size in memory: 41.0 B
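
The percentage and per-record figures above follow from the raw counts by simple arithmetic. The sketch below rechecks them, taking the reported totals at face value and assuming 1 KiB = 1024 bytes; the variable names are illustrative only.

```python
# Values copied from the dataset statistics above.
n_rows = 10_000
n_columns = 4
missing_cells = 6
duplicate_rows = 141
total_size_kib = 400.4

# Share of missing cells and duplicate rows, as percentages.
missing_pct = missing_cells / (n_rows * n_columns) * 100
duplicate_pct = duplicate_rows / n_rows * 100

# Average record size in bytes, assuming 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_rows

print(f"missing cells:   {missing_pct:.3f}%")
print(f"duplicate rows:  {duplicate_pct:.1f}%")
print(f"avg record size: {avg_record_bytes:.1f} B")
```

The missing-cell share comes out well below 0.1%, which is why the report shows it as "< 0.1%".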

Variable types

Text: 3
Numeric: 1

Dataset

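Description: Catalog material-property information from the Ceramic Materials Information Bank (세라믹소재정보은행) of the Korea Institute of Ceramic Engineering and Technology (한국세라믹기술원). Integrated metals/chemicals/ceramics site: http://www.matcenter.org. Contact: 김경훈, principal researcher.
Author: Korea Institute of Ceramic Engineering and Technology (한국세라믹기술원)
URL: https://www.data.go.kr/data/15072102/fileData.do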

Alerts

Dataset has 141 (1.4%) duplicate rows [Duplicates]

Reproduction

Analysis started: 2023-12-12 12:00:51.909508
Analysis finished: 2023-12-12 12:00:53.077729
Duration: 1.17 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

Distinct: 1145
Distinct (%): 11.5%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 18
Median length: 17
Mean length: 14.6039
Min length: 10

Characters and Unicode

Total characters: 146039
Distinct characters: 21
Distinct categories: 6
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
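
As an illustration of what these character properties look like in practice, the sketch below classifies the characters of one sample value from this variable by Unicode general category. It uses Python's standard `unicodedata` module as a stand-in; the report does not show how ydata-profiling computes its counts, and the standard library does not expose script or block properties, so only categories are covered.

```python
import unicodedata
from collections import Counter

# One sample value from this variable (the "3rd row" in the sample list).
sample = "30series(우정무역)911"

# unicodedata.category returns a two-letter general category code per character.
categories = Counter(unicodedata.category(ch) for ch in sample)

for code, count in categories.most_common():
    print(code, count)
```

The codes correspond to the category names used in the report: `Nd` is Decimal Number, `Ll` is Lowercase Letter, `Lo` is Other Letter, `Ps` is Open Punctuation, and `Pe` is Close Punctuation.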

Unique

Unique: 1031
Unique (%): 10.3%

Sample

1st row: 10400000000000
2nd row: 1000004301
3rd row: 30series(우정무역)911
4th row: 105000000000
5th row: 10400000000000
Value                 Count    Frequency (%)
10400000000000        3340     33.4%
104000000000000       2525     25.2%
10500000000000        869      8.7%
1050000000000         793      7.9%
10200000000000        493      4.9%
1040000000000         293      2.9%
1020000000000         181      1.8%
105000000000          97       1.0%
106000000000          52       0.5%
104000000000          37       0.4%
Other values (1135)   1320     13.2%

Most occurring characters

Value                 Count    Frequency (%)
0                     111842   76.6%
1                     10275    7.0%
(space)               8769     6.0%
4                     6659     4.6%
5                     2264     1.6%
2                     1162     0.8%
3                     646      0.4%
7                     619      0.4%
8                     575      0.4%
6                     542      0.4%
Other values (11)     2686     1.8%

Most occurring categories

Value                 Count    Frequency (%)
Decimal Number        134978   92.4%
Space Separator       8769     6.0%
Lowercase Letter      1146     0.8%
Other Letter          764      0.5%
Open Punctuation      191      0.1%
Close Punctuation     191      0.1%

Most frequent character per category

Decimal Number
Value                 Count    Frequency (%)
0                     111842   82.9%
1                     10275    7.6%
4                     6659     4.9%
5                     2264     1.7%
2                     1162     0.9%
3                     646      0.5%
7                     619      0.5%
8                     575      0.4%
6                     542      0.4%
9                     394      0.3%

Lowercase Letter
Value                 Count    Frequency (%)
e                     382      33.3%
s                     382      33.3%
r                     191      16.7%
i                     191      16.7%

Other Letter
Value                 Count    Frequency (%)
[not preserved]       191      25.0%
[not preserved]       191      25.0%
[not preserved]       191      25.0%
[not preserved]       191      25.0%

Space Separator
Value                 Count    Frequency (%)
(space)               8769     100.0%

Open Punctuation
Value                 Count    Frequency (%)
(                     191      100.0%

Close Punctuation
Value                 Count    Frequency (%)
)                     191      100.0%

Most occurring scripts

Value                 Count    Frequency (%)
Common                144129   98.7%
Latin                 1146     0.8%
Hangul                764      0.5%

Most frequent character per script

Common
Value                 Count    Frequency (%)
0                     111842   77.6%
1                     10275    7.1%
(space)               8769     6.1%
4                     6659     4.6%
5                     2264     1.6%
2                     1162     0.8%
3                     646      0.4%
7                     619      0.4%
8                     575      0.4%
6                     542      0.4%
Other values (3)      776      0.5%

Latin
Value                 Count    Frequency (%)
e                     382      33.3%
s                     382      33.3%
r                     191      16.7%
i                     191      16.7%

Hangul
Value                 Count    Frequency (%)
[not preserved]       191      25.0%
[not preserved]       191      25.0%
[not preserved]       191      25.0%
[not preserved]       191      25.0%

Most occurring blocks

Value                 Count    Frequency (%)
ASCII                 145275   99.5%
Hangul                764      0.5%

Most frequent character per block

ASCII
Value                 Count    Frequency (%)
0                     111842   77.0%
1                     10275    7.1%
(space)               8769     6.0%
4                     6659     4.6%
5                     2264     1.6%
2                     1162     0.8%
3                     646      0.4%
7                     619      0.4%
8                     575      0.4%
6                     542      0.4%
Other values (7)      1922     1.3%

Hangul
Value                 Count    Frequency (%)
[not preserved]       191      25.0%
[not preserved]       191      25.0%
[not preserved]       191      25.0%
[not preserved]       191      25.0%

순번 (sequence number)
Real number (ℝ)

Distinct: 21
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.4636
Minimum: 1
Maximum: 24
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5th percentile: 2
Q1: 4
Median: 5
Q3: 11
95th percentile: 12
Maximum: 24
Range: 23
Interquartile range (IQR): 7

Descriptive statistics

Standard deviation: 3.9612428
Coefficient of variation (CV): 0.61285394
Kurtosis: -1.3321312
Mean: 6.4636
Median Absolute Deviation (MAD): 3
Skewness: 0.43486207
Sum: 64636
Variance: 15.691444
Monotonicity: Not monotonic
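
Several of the figures above are derived from one another. The sketch below rechecks those relationships using only the reported numbers; the definitions used (CV as standard deviation over mean, variance as the squared standard deviation, IQR as Q3 minus Q1) are the textbook ones and are assumed, not confirmed, to be what the report uses.

```python
# Values copied from the quantile and descriptive statistics of this variable.
n = 10_000
total = 64636
mean = 6.4636
std = 3.9612428
minimum, q1, q3, maximum = 1, 4, 11, 24

mean_from_sum = total / n      # mean = sum / number of observations
cv = std / mean                # coefficient of variation
variance = std ** 2            # variance = squared standard deviation
iqr = q3 - q1                  # interquartile range
value_range = maximum - minimum

print(f"mean={mean_from_sum}, cv={cv:.6f}, variance={variance:.4f}")
print(f"iqr={iqr}, range={value_range}")
```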
Histogram with fixed size bins (bins=21)

Most frequent values
Value                 Count    Frequency (%)
4                     2823     28.2%
12                    2387     23.9%
2                     1725     17.2%
5                     881      8.8%
11                    607      6.1%
8                     407      4.1%
7                     307      3.1%
1                     272      2.7%
6                     159      1.6%
9                     152      1.5%
Other values (11)     280      2.8%

10 smallest values
Value                 Count    Frequency (%)
1                     272      2.7%
2                     1725     17.2%
3                     137      1.4%
4                     2823     28.2%
5                     881      8.8%
6                     159      1.6%
7                     307      3.1%
8                     407      4.1%
9                     152      1.5%
10                    74       0.7%

10 largest values
Value                 Count    Frequency (%)
24                    1        < 0.1%
21                    1        < 0.1%
20                    2        < 0.1%
18                    4        < 0.1%
17                    6        0.1%
16                    5        0.1%
15                    10       0.1%
14                    9        0.1%
13                    31       0.3%
12                    2387     23.9%
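
Taken together, the value tables for this variable list all 21 distinct values, so the full frequency distribution can be reassembled and the summary statistics recomputed from it. The sketch below does that. The sample variance (dividing by n - 1) and the non-interpolating quantile rule are assumptions chosen because they reproduce the reported figures; the report itself does not state which definitions it uses.

```python
# Frequency table assembled from the value tables of this variable:
# the smallest values (1 to 10), the largest values (12 to 24),
# and the count for 11 from the most frequent values.
counts = {
    1: 272, 2: 1725, 3: 137, 4: 2823, 5: 881, 6: 159, 7: 307,
    8: 407, 9: 152, 10: 74, 11: 607, 12: 2387, 13: 31, 14: 9,
    15: 10, 16: 5, 17: 6, 18: 4, 20: 2, 21: 1, 24: 1,
}

n = sum(counts.values())
total = sum(value * count for value, count in counts.items())
mean = total / n

# Sample variance (divides by n - 1); assumed, not confirmed by the report.
variance = sum(c * (v - mean) ** 2 for v, c in counts.items()) / (n - 1)


def quantile(q):
    """Return the smallest value whose cumulative count reaches q * n."""
    target = q * n
    running = 0
    for value in sorted(counts):
        running += counts[value]
        if running >= target:
            return value
    return max(counts)


quantiles = [quantile(q) for q in (0.05, 0.25, 0.50, 0.75, 0.95)]
print(n, total, mean, round(variance, 6), quantiles)
```

The recomputed count, sum, mean, variance, and the five quantiles agree with the values listed under the quantile and descriptive statistics for this variable.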
Distinct: 469
Distinct (%): 4.7%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 59
Median length: 48
Mean length: 12.8915
Min length: 2

Characters and Unicode

Total characters: 128915
Distinct characters: 158
Distinct categories: 14
Distinct scripts: 4
Distinct blocks: 6
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 259
Unique (%): 2.6%

Sample

1st row: Packing Type
2nd row: Installation Method
3rd row: 분자식 (molecular formula)
4th row: Packing Type
5th row: Lead Type
Value                 Count    Frequency (%)
type                  5521     31.0%
packing               4743     26.6%
termination           2154     12.1%
dimension(l           483      2.7%
or                    483      2.7%
h)(mm                 483      2.7%
lead                  368      2.1%
terminal              323      1.8%
dimension             300      1.7%
interconnection       148      0.8%
Other values (517)    2793     15.7%
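
The entries in this table look like lowercase, whitespace-separated tokens ("type", "packing", and fragments such as "dimension(l") instead of whole cell values. The sketch below shows that kind of word count on the five sample values listed for this variable. The tokenisation rule (lowercase, then split on whitespace) is an assumption inferred from the table, not a confirmed description of ydata-profiling's behaviour.

```python
from collections import Counter

# The five sample values listed for this variable.
samples = [
    "Packing Type",
    "Installation Method",
    "분자식",
    "Packing Type",
    "Lead Type",
]

# Assumed tokenisation: lowercase each value, then split on whitespace.
word_counts = Counter(
    word for value in samples for word in value.lower().split()
)

for word, count in word_counts.most_common():
    print(word, count)
```

Because a single cell can contribute several tokens, the counts in such a table can sum to more than the number of observations.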

Most occurring characters

Value                 Count    Frequency (%)
n                     12473    9.7%
i                     12408    9.6%
(space)               12262    9.5%
e                     11094    8.6%
a                     8965     7.0%
p                     6122     4.7%
T                     6045     4.7%
t                     5967     4.6%
y                     5693     4.4%
c                     5644     4.4%
Other values (148)    42242    32.8%

Most occurring categories

Value                 Count    Frequency (%)
Lowercase Letter      96699    75.0%
Uppercase Letter      15441    12.0%
Space Separator       12262    9.5%
Close Punctuation     1402     1.1%
Open Punctuation      1402     1.1%
Other Letter          604      0.5%
Other Punctuation     491      0.4%
Decimal Number        394      0.3%
Other Symbol          115      0.1%
Math Symbol           43       < 0.1%
Other values (4)      62       < 0.1%

Most frequent character per category

Other Letter
Value                 Count    Frequency (%)
[not preserved]       85       14.1%
[not preserved]       72       11.9%
[not preserved]       72       11.9%
[not preserved]       66       10.9%
[not preserved]       63       10.4%
[not preserved]       28       4.6%
[not preserved]       22       3.6%
[not preserved]       22       3.6%
[not preserved]       22       3.6%
[not preserved]       12       2.0%
Other values (48)     140      23.2%

Uppercase Letter
Value                 Count    Frequency (%)
T                     6045     39.1%
P                     4885     31.6%
L                     912      5.9%
H                     573      3.7%
D                     547      3.5%
C                     454      2.9%
R                     298      1.9%
I                     264      1.7%
M                     239      1.5%
A                     238      1.5%
Other values (18)     986      6.4%

Lowercase Letter
Value                 Count    Frequency (%)
n                     12473    12.9%
i                     12408    12.8%
e                     11094    11.5%
a                     8965     9.3%
p                     6122     6.3%
t                     5967     6.2%
y                     5693     5.9%
c                     5644     5.8%
g                     5048     5.2%
o                     4906     5.1%
Other values (17)     18379    19.0%

Decimal Number
Value                 Count    Frequency (%)
2                     101      25.6%
0                     83       21.1%
1                     73       18.5%
3                     53       13.5%
5                     46       11.7%
4                     14       3.6%
7                     10       2.5%
6                     6        1.5%
8                     4        1.0%
9                     4        1.0%

Other Punctuation
Value                 Count    Frequency (%)
.                     187      38.1%
%                     169      34.4%
/                     117      23.8%
&                     6        1.2%
#                     5        1.0%
*                     3        0.6%
@                     3        0.6%
'                     1        0.2%

Other Symbol
Value                 Count    Frequency (%)
[not preserved]       52       45.2%
[not preserved]       17       14.8%
[not preserved]       12       10.4%
°                     11       9.6%
[not preserved]       10       8.7%
[not preserved]       9        7.8%
[not preserved]       3        2.6%
[not preserved]       1        0.9%

Math Symbol
Value                 Count    Frequency (%)
×                     17       39.5%
+                     7        16.3%
=                     7        16.3%
[not preserved]       4        9.3%
±                     4        9.3%
~                     2        4.7%
<                     1        2.3%
>                     1        2.3%

Other Number
Value                 Count    Frequency (%)
[not preserved]       12       85.7%
³                     1        7.1%
[not preserved]       1        7.1%

Close Punctuation
Value                 Count    Frequency (%)
)                     1370     97.7%
]                     32       2.3%

Open Punctuation
Value                 Count    Frequency (%)
(                     1370     97.7%
[                     32       2.3%

Space Separator
Value                 Count    Frequency (%)
(space)               12262    100.0%

Dash Punctuation
Value                 Count    Frequency (%)
-                     42       100.0%

Connector Punctuation
Value                 Count    Frequency (%)
_                     4        100.0%

Modifier Symbol
Value                 Count    Frequency (%)
^                     2        100.0%

Most occurring scripts

Value                 Count    Frequency (%)
Latin                 112130   87.0%
Common                16171    12.5%
Hangul                604      0.5%
Greek                 10       < 0.1%

Most frequent character per script

Hangul
Value                 Count    Frequency (%)
[not preserved]       85       14.1%
[not preserved]       72       11.9%
[not preserved]       72       11.9%
[not preserved]       66       10.9%
[not preserved]       63       10.4%
[not preserved]       28       4.6%
[not preserved]       22       3.6%
[not preserved]       22       3.6%
[not preserved]       22       3.6%
[not preserved]       12       2.0%
Other values (48)     140      23.2%

Latin
Value                 Count    Frequency (%)
n                     12473    11.1%
i                     12408    11.1%
e                     11094    9.9%
a                     8965     8.0%
p                     6122     5.5%
T                     6045     5.4%
t                     5967     5.3%
y                     5693     5.1%
c                     5644     5.0%
g                     5048     4.5%
Other values (41)     32671    29.1%

Common
Value                 Count    Frequency (%)
(space)               12262    75.8%
)                     1370     8.5%
(                     1370     8.5%
.                     187      1.2%
%                     169      1.0%
/                     117      0.7%
2                     101      0.6%
0                     83       0.5%
1                     73       0.5%
3                     53       0.3%
Other values (35)     386      2.4%

Greek
Value                 Count    Frequency (%)
μ                     6        60.0%
Ω                     2        20.0%
σ                     1        10.0%
[not preserved]       1        10.0%

Most occurring blocks

Value                 Count    Frequency (%)
ASCII                 128147   99.4%
Hangul                604      0.5%
None                  55       < 0.1%
Letterlike Symbols    53       < 0.1%
CJK Compat            52       < 0.1%
Math Operators        4        < 0.1%

Most frequent character per block

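ASCII
Value                 Count    Frequency (%)
n                     12473    9.7%
i                     12408    9.7%
(space)               12262    9.6%
e                     11094    8.7%
a                     8965     7.0%
p                     6122     4.8%
T                     6045     4.7%
t                     5967     4.7%
y                     5693     4.4%
c                     5644     4.4%
Other values (72)     41474    32.4%

Hangul
Value                 Count    Frequency (%)
[not preserved]       85       14.1%
[not preserved]       72       11.9%
[not preserved]       72       11.9%
[not preserved]       66       10.9%
[not preserved]       63       10.4%
[not preserved]       28       4.6%
[not preserved]       22       3.6%
[not preserved]       22       3.6%
[not preserved]       22       3.6%
[not preserved]       12       2.0%
Other values (48)     140      23.2%

Letterlike Symbols
Value                 Count    Frequency (%)
[not preserved]       52       98.1%
[not preserved]       1        1.9%

CJK Compat
Value                 Count    Frequency (%)
[not preserved]       17       32.7%
[not preserved]       12       23.1%
[not preserved]       10       19.2%
[not preserved]       9        17.3%
[not preserved]       3        5.8%
[not preserved]       1        1.9%

None
Value                 Count    Frequency (%)
×                     17       30.9%
[not preserved]       12       21.8%
°                     11       20.0%
μ                     6        10.9%
±                     4        7.3%
Ω                     2        3.6%
σ                     1        1.8%
³                     1        1.8%
[not preserved]       1        1.8%

Math Operators
Value                 Count    Frequency (%)
[not preserved]       4        100.0%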
Distinct: 983
Distinct (%): 9.8%
Missing: 6
Missing (%): 0.1%
Memory size: 156.2 KiB

Length

Max length: 108
Median length: 94
Mean length: 8.6478887
Min length: 1

Characters and Unicode

Total characters: 86427
Distinct characters: 153
Distinct categories: 12
Distinct scripts: 4
Distinct blocks: 8
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 740
Unique (%): 7.4%

Sample

1st row: Reeled
2nd row: Casting. Troweling
3rd row: BaC2O4
4th row: Reel taping
5th row: Hockey
Value                 Count    Frequency (%)
reeled                2181     15.6%
reel                  1634     11.7%
taping                1609     11.5%
ag/ni/sn              1089     7.8%
cu/ni/sn              1065     7.6%
4.50±0.40             483      3.5%
bulk                  428      3.1%
radial                217      1.6%
smd                   213      1.5%
in                    186      1.3%
Other values (1157)   4856     34.8%