Overview

Dataset statistics

Number of variables: 4
Number of observations: 72
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.6 KiB
Average record size in memory: 36.8 B

Variable types

Text: 1
Numeric: 3

Dataset

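Description: As of December 31, 2021: the number of research projects at public research institutions by research project budget, and the status of women in science and technology serving as principal investigators. Column names: 구분 (category), 과제 수 (number of projects), 여성 과제책임자 (female principal investigators), 전체 과제책임자 (all principal investigators)
URL: https://www.data.go.kr/data/15053984/fileData.do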

Alerts

과제 수 is highly overall correlated with 여성 과제책임자 and 1 other field (High correlation)
여성 과제책임자 is highly overall correlated with 과제 수 and 1 other field (High correlation)
전체 과제책임자 is highly overall correlated with 과제 수 and 1 other field (High correlation)
구분 has unique values (Unique)
전체 과제책임자 has unique values (Unique)

Reproduction

Analysis started: 2023-12-12 05:51:45.315765
Analysis finished: 2023-12-12 05:51:46.484871
Duration: 1.17 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

구분
Text

UNIQUE 

Distinct: 72
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 708.0 B

Length

Max length: 15
Median length: 13.5
Mean length: 13.5
Min length: 12

Characters and Unicode

Total characters: 972
Distinct characters: 20
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
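
As an illustration of this kind of analysis, the sketch below counts characters by Unicode general category with Python's standard `unicodedata` module. It assumes the 72 values of 구분 are the years 2004 to 2021 combined with the four budget bands seen in the sample rows; the list name `BANDS` and the reconstruction itself are assumptions for illustration, not part of the report. Under that assumption the counts agree with the totals reported here (972 characters, 20 distinct characters, 5 categories).

```python
import unicodedata
from collections import Counter

# Assumption: 18 years (2004-2021) x 4 budget bands, following the
# pattern of the sample rows shown in this report.
BANDS = ["10억원 이상", "1억~10억원 미만", "3천만~1억원 미만", "3천만원 미만"]
values = [f"{year}-{band}" for year in range(2021, 2003, -1) for band in BANDS]

# Count every character across all values.
chars = Counter("".join(values))

# Aggregate the character counts by Unicode general category
# (Nd = Decimal Number, Lo = Other Letter, Pd = Dash Punctuation,
#  Zs = Space Separator, Sm = Math Symbol).
categories = Counter()
for ch, n in chars.items():
    categories[unicodedata.category(ch)] += n

print("values:", len(values))
print("total characters:", sum(chars.values()))
print("distinct characters:", len(chars))
for category, n in categories.most_common():
    print(category, n)
```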

Unique

Unique: 72
Unique (%): 100.0%

Sample

1st row: 2021-10억원 이상
2nd row: 2021-1억~10억원 미만
3rd row: 2021-3천만~1억원 미만
4th row: 2021-3천만원 미만
5th row: 2020-10억원 이상

Words

Value               Count   Frequency (%)
미만                54      37.5%
이상                18      12.5%
2010-10억원         1       0.7%
2009-3천만원        1       0.7%
2009-3천만~1억원    1       0.7%
2009-1억~10억원     1       0.7%
2009-10억원         1       0.7%
2010-3천만원        1       0.7%
2008-1억~10억원     1       0.7%
2010-3천만~1억원    1       0.7%
Other values (64)   64      44.4%

Most occurring characters

Value               Count   Frequency (%)
0                   140     14.4%
1                   120     12.3%
만                  90      9.3%
2                   84      8.6%
-                   72      7.4%
(space)             72      7.4%
억                  72      7.4%
원                  72      7.4%
미                  54      5.6%
3                   40      4.1%
Other values (10)   156     16.0%

Most occurring categories

Value               Count   Frequency (%)
Decimal Number      432     44.4%
Other Letter        360     37.0%
Dash Punctuation    72      7.4%
Space Separator     72      7.4%
Math Symbol         36      3.7%

Most frequent character per category

Decimal Number

Value     Count   Frequency (%)
0         140     32.4%
1         120     27.8%
2         84      19.4%
3         40      9.3%
7         8       1.9%
9         8       1.9%
8         8       1.9%
6         8       1.9%
5         8       1.9%
4         8       1.9%

Other Letter

Value     Count   Frequency (%)
만        90      25.0%
억        72      20.0%
원        72      20.0%
미        54      15.0%
천        36      10.0%
이        18      5.0%
상        18      5.0%

Dash Punctuation

Value     Count   Frequency (%)
-         72      100.0%

Space Separator

Value     Count   Frequency (%)
(space)   72      100.0%

Math Symbol

Value     Count   Frequency (%)
~         36      100.0%

Most occurring scripts

Value     Count   Frequency (%)
Common    612     63.0%
Hangul    360     37.0%

Most frequent character per script

Common

Value              Count   Frequency (%)
0                  140     22.9%
1                  120     19.6%
2                  84      13.7%
-                  72      11.8%
(space)            72      11.8%
3                  40      6.5%
~                  36      5.9%
7                  8       1.3%
9                  8       1.3%
8                  8       1.3%
Other values (3)   24      3.9%

Hangul

Value              Count   Frequency (%)
만                 90      25.0%
억                 72      20.0%
원                 72      20.0%
미                 54      15.0%
천                 36      10.0%
이                 18      5.0%
상                 18      5.0%

Most occurring blocks

Value     Count   Frequency (%)
ASCII     612     63.0%
Hangul    360     37.0%

Most frequent character per block

ASCII

Value              Count   Frequency (%)
0                  140     22.9%
1                  120     19.6%
2                  84      13.7%
-                  72      11.8%
(space)            72      11.8%
3                  40      6.5%
~                  36      5.9%
7                  8       1.3%
9                  8       1.3%
8                  8       1.3%
Other values (3)   24      3.9%

Hangul

Value              Count   Frequency (%)
만                 90      25.0%
억                 72      20.0%
원                 72      20.0%
미                 54      15.0%
천                 36      10.0%
이                 18      5.0%
상                 18      5.0%

과제 수
Real number (ℝ)

HIGH CORRELATION 

Distinct: 71
Distinct (%): 98.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4655
Minimum: 532
Maximum: 10420
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 780.0 B

Quantile statistics

Minimum: 532
5-th percentile: 844.7
Q1: 2881
Median: 4649
Q3: 6630
95-th percentile: 8845.25
Maximum: 10420
Range: 9888
Interquartile range (IQR): 3749

Descriptive statistics

Standard deviation: 2599.7099
Coefficient of variation (CV): 0.55847689
Kurtosis: -0.72926709
Mean: 4655
Median Absolute Deviation (MAD): 2003
Skewness: 0.03116156
Sum: 335160
Variance: 6758491.7
Monotonicity: Not monotonic
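
Several of these figures follow from one another. The short check below recomputes the range, IQR, mean, coefficient of variation and variance from the other values reported for 과제 수; the variable names are illustrative, and small differences in the last digits are expected because the inputs are already rounded.

```python
import math

# Values copied from the 과제 수 statistics in this report.
minimum, maximum = 532, 10420
q1, q3 = 2881, 6630
total, n = 335160, 72          # Sum and number of observations
std = 2599.7099                # Standard deviation

value_range = maximum - minimum    # reported Range: 9888
iqr = q3 - q1                      # reported IQR: 3749
mean = total / n                   # reported Mean: 4655
cv = std / mean                    # reported CV: 0.55847689
variance = std ** 2                # reported Variance: 6758491.7

print("range:", value_range)
print("IQR:", iqr)
print("mean:", mean)
print("CV close to reported:", math.isclose(cv, 0.55847689, rel_tol=1e-6))
print("variance close to reported:", math.isclose(variance, 6758491.7, rel_tol=1e-6))
```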
Histogram with fixed size bins (bins=50)

Common values

Value               Count   Frequency (%)
1104                2       2.8%
1507                1       1.4%
5350                1       1.4%
5523                1       1.4%
887                 1       1.4%
5257                1       1.4%
6011                1       1.4%
6744                1       1.4%
982                 1       1.4%
7700                1       1.4%
Other values (61)   61      84.7%

Minimum 10 values

Value    Count   Frequency (%)
532      1       1.4%
664      1       1.4%
740      1       1.4%
793      1       1.4%
887      1       1.4%
916      1       1.4%
982      1       1.4%
998      1       1.4%
1060     1       1.4%
1104     2       2.8%

Maximum 10 values

Value    Count   Frequency (%)
10420    1       1.4%
10280    1       1.4%
9436     1       1.4%
9024     1       1.4%
8699     1       1.4%
8107     1       1.4%
7800     1       1.4%
7700     1       1.4%
7511     1       1.4%
7448     1       1.4%

여성 과제책임자
Real number (ℝ)

HIGH CORRELATION 

Distinct: 66
Distinct (%): 91.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 463.125
Minimum: 18
Maximum: 1297
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 780.0 B

Quantile statistics

Minimum: 18
5-th percentile: 33.1
Q1: 110.5
Median: 387
Q3: 747.75
95-th percentile: 1145.65
Maximum: 1297
Range: 1279
Interquartile range (IQR): 637.25

Descriptive statistics

Standard deviation: 362.00187
Coefficient of variation (CV): 0.78165045
Kurtosis: -0.54391115
Mean: 463.125
Median Absolute Deviation (MAD): 301
Skewness: 0.65224378
Sum: 33345
Variance: 131045.35
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value               Count   Frequency (%)
530                 2       2.8%
20                  2       2.8%
384                 2       2.8%
55                  2       2.8%
297                 2       2.8%
76                  2       2.8%
420                 1       1.4%
379                 1       1.4%
361                 1       1.4%
42                  1       1.4%
Other values (56)   56      77.8%

Minimum 10 values

Value    Count   Frequency (%)
18       1       1.4%
20       2       2.8%
32       1       1.4%
34       1       1.4%
41       1       1.4%
42       1       1.4%
53       1       1.4%
55       2       2.8%
69       1       1.4%
76       2       2.8%

Maximum 10 values

Value    Count   Frequency (%)
1297     1       1.4%
1283     1       1.4%
1213     1       1.4%
1210     1       1.4%
1093     1       1.4%
1069     1       1.4%
1053     1       1.4%
1042     1       1.4%
988      1       1.4%
966      1       1.4%

전체 과제책임자
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 72
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4860.4028
Minimum: 533
Maximum: 10620
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 780.0 B

Quantile statistics

Minimum: 533
5-th percentile: 849.35
Q1: 2979
Median: 4888
Q3: 6844.5
95-th percentile: 9296.05
Maximum: 10620
Range: 10087
Interquartile range (IQR): 3865.5

Descriptive statistics

Standard deviation: 2697.6003
Coefficient of variation (CV): 0.55501579
Kurtosis: -0.78464171
Mean: 4860.4028
Median Absolute Deviation (MAD): 1983
Skewness: -0.030836662
Sum: 349949
Variance: 7277047.3
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value               Count   Frequency (%)
1515                1       1.4%
6818                1       1.4%
5609                1       1.4%
893                 1       1.4%
5383                1       1.4%
6024                1       1.4%
6781                1       1.4%
986                 1       1.4%
5865                1       1.4%
8019                1       1.4%
Other values (62)   62      86.1%

Minimum 10 values

Value    Count   Frequency (%)
533      1       1.4%
667      1       1.4%
747      1       1.4%
796      1       1.4%
893      1       1.4%
949      1       1.4%
986      1       1.4%
1009     1       1.4%
1115     1       1.4%
1117     1       1.4%

Maximum 10 values

Value    Count   Frequency (%)
10620    1       1.4%
10551    1       1.4%
9483     1       1.4%
9466     1       1.4%
9157     1       1.4%
8417     1       1.4%
8019     1       1.4%
7951     1       1.4%
7945     1       1.4%
7864     1       1.4%

Interactions

[Nine interaction plots, one per pairing of the three numeric variables; images not preserved]

Correlations

                  구분     과제 수   여성 과제책임자   전체 과제책임자
구분              1.000    1.000     1.000             1.000
과제 수           1.000    1.000     0.788             0.963
여성 과제책임자   1.000    0.788     1.000             0.751
전체 과제책임자   1.000    0.963     0.751             1.000

                  과제 수   여성 과제책임자   전체 과제책임자
과제 수           1.000     0.856             0.979
여성 과제책임자   0.856     1.000             0.891
전체 과제책임자   0.979     0.891             1.000
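
The report does not say which coefficient each matrix uses. As a generic illustration of how a pairwise correlation is computed, the sketch below implements the Pearson coefficient and applies it to the first ten sample rows of 과제 수 and 전체 과제책임자. The function name `pearson` and the choice of Pearson are assumptions, and because only 10 of the 72 rows are used, the result is not expected to match the values above.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    # Sum of cross-products of the deviations from each mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Square roots of the sums of squared deviations.
    dev_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (dev_x * dev_y)

# First ten sample rows of this report (rows 0-9).
projects = [1507, 10420, 7423, 4634, 1310, 9436, 6427, 4555, 1261, 8699]
all_pis = [1515, 10620, 7554, 4679, 1355, 9483, 6473, 4638, 1182, 7951]

print(f"Pearson r on 10 sample rows: {pearson(projects, all_pis):.3f}")
```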

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

     구분                     과제 수   여성 과제책임자   전체 과제책임자
0    2021-10억원 이상         1507      112               1515
1    2021-1억~10억원 미만     10420     1297              10620
2    2021-3천만~1억원 미만    7423      1283              7554
3    2021-3천만원 미만        4634      725               4679
4    2020-10억원 이상         1310      90                1355
5    2020-1억~10억원 미만     9436      1069              9483
6    2020-3천만~1억원 미만    6427      966               6473
7    2020-3천만원 미만        4555      803               4638
8    2019-10억원 이상         1261      82                1182
9    2019-1억~10억원 미만     8699      872               7951

Last rows

     구분                     과제 수   여성 과제책임자   전체 과제책임자
62   2006-3천만~1억원 미만    5396      442               5631
63   2006-3천만원 미만        4370      277               4428
64   2005-10억원 이상         664       20                667
65   2005-1억~10억원 미만     4617      178               4873
66   2005-3천만~1억원 미만    4664      297               4772
67   2005-3천만원 미만        4850      281               4903
68   2004-10억원 이상         532       18                533
69   2004-1억~10억원 미만     3544      106               3545
70   2004-3천만~1억원 미만    3463      217               3467
71   2004-3천만원 미만        4331      297               4332
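
As a usage sketch for these columns, the snippet below splits a 구분 label into its year and budget band and computes the share of female principal investigators (여성 과제책임자 divided by 전체 과제책임자) for a few of the sample rows. The helper names `parse_label` and `female_share` are hypothetical and not part of the dataset or of ydata-profiling.

```python
# A few rows copied from the sample rows of this report:
# (구분, 과제 수, 여성 과제책임자, 전체 과제책임자)
rows = [
    ("2021-10억원 이상", 1507, 112, 1515),
    ("2021-1억~10억원 미만", 10420, 1297, 10620),
    ("2021-3천만~1억원 미만", 7423, 1283, 7554),
    ("2021-3천만원 미만", 4634, 725, 4679),
    ("2004-10억원 이상", 532, 18, 533),
    ("2004-1억~10억원 미만", 3544, 106, 3545),
    ("2004-3천만~1억원 미만", 3463, 217, 3467),
    ("2004-3천만원 미만", 4331, 297, 4332),
]

def parse_label(label):
    """Split a label such as '2021-10억원 이상' into (year, budget band)."""
    year, band = label.split("-", 1)
    return int(year), band

def female_share(female_pis, all_pis):
    """Fraction of principal investigators who are women."""
    return female_pis / all_pis

for label, _projects, female_pis, all_pis in rows:
    year, band = parse_label(label)
    print(year, band, f"{female_share(female_pis, all_pis):.1%}")
```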