Source: Allison, Truett and Cicchetti, Domenic V. (1976), "Sleep in Mammals: Ecological and Constitutional Correlates", _Science_, November 12, vol. 194, pp. 732-734.
Variables below (from left to right) for Mammals Data Set:
species of animal
body weight in kg
brain weight in g
slow wave ("nondreaming") sleep (hrs/day)
paradoxical ("dreaming") sleep (hrs/day)
total sleep (hrs/day) (sum of slow wave and paradoxical sleep)
maximum life span (years)
gestation time (days)
predation index (1-5)
    1 = minimum (least likely to be preyed upon)
    5 = maximum (most likely to be preyed upon)
sleep exposure index (1-5)
    1 = least exposed (e.g. animal sleeps in a well-protected den)
    5 = most exposed
overall danger index (1-5)
    (based on the above two indices and other information)
    1 = least danger (from other animals)
    5 = most danger (from other animals)
Note: Missing values denoted by -999.0
The above data set can be freely used for non-commercial purposes and can be freely distributed (permission in writing obtained from Dr. Truett Allison). Submitted by Roger Johnson, rwjohnso@silver.sdsmt.edu
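The note on missing values matters for any analysis of this file: unless -999.0 is converted to NA, R treats it as a real measurement. Below is a minimal sketch of one way to read such a file. The layout (tab-separated, no header row), the column names, and the two demo records are assumptions for illustration; the demo values are made up and are not taken from the data set.

```r
# Hypothetical column names for the eleven fields listed above.
col_names <- c("species", "body_weight", "brain_weight", "slow_wave",
               "paradoxical", "total_sleep", "maximum_life_span",
               "gestation_time", "predation_index",
               "sleep_exposure_index", "overall_danger_index")

# Two made-up, tab-separated records (not real data) standing in for the file.
demo_text <- paste(
  "species_a\t12.5\t80\t-999.0\t-999.0\t9.5\t20\t120\t2\t1\t2",
  "species_b\t0.3\t2.5\t10.1\t2.3\t12.4\t4.5\t-999.0\t4\t2\t4",
  sep = "\n"
)

# header = FALSE keeps the first record as data;
# na.strings turns the -999 sentinel into NA while reading.
demo <- read.delim(
  text = demo_text,
  header = FALSE,
  col.names = col_names,
  na.strings = c("-999", "-999.0")
)

# Count the missing values per column.
colSums(is.na(demo))
```

For a file on disk, the `text = demo_text` argument would be replaced by the file path.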
Category: Cluster Analysis
R-code.r
JaeseongYoo — Jun 2, 2014, 10:57 PM
rm(list = ls())
setwd("D:/Dropbox/DataSets/Blog Posts/20140602 Ecological Correlates of Sleep in mammals")
set.seed(1)
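# read.csv() defaults to header = TRUE, so if sleep.txt has no header line its first record is consumed as column names.
# The -999.0 missing-value codes are not converted to NA here, so they enter every result below as ordinary numbers.
data = read.csv("sleep.txt", sep="\t")
# Use the species names (first column) as row names and keep only the numeric columns.
names = data[,1]
data = data[,-1]
dimnames(data)[[1]] = names
dimnames(data)[[2]] = c("body_weight", "brain_weight", "slow_wave", "paradoxical", "total_sleep",
                        "maximum_life_span", "gestation_time", "predation_index", "sleep_exposure_index", "overall_danger_index")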
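# Eigenvalues of the correlation matrix of the ten variables
eig = eigen(cor(data))
round(eig$values, 2)
[1] 3.35 2.12 1.71 1.06 0.76 0.60 0.27 0.05 0.04 0.04
# Number of eigenvalues greater than 1, reused below as the number of clusters
n_factor = sum(eig$values > 1)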
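The count above follows the "eigenvalue greater than 1" rule (often called the Kaiser criterion) for deciding how many principal components to keep. The script reuses that count as the number of clusters, which is a heuristic choice rather than a standard rule. As a small check, the sketch below applies the rule to the rounded eigenvalues printed above; the variable names are illustrative only.

```r
# Rounded eigenvalues copied from the output above.
eigenvalues <- c(3.35, 2.12, 1.71, 1.06, 0.76, 0.60, 0.27, 0.05, 0.04, 0.04)

# Count the eigenvalues above 1.
n_kept <- sum(eigenvalues > 1)
n_kept

# Share of total variance carried by the retained components.
explained <- cumsum(eigenvalues / sum(eigenvalues))[n_kept]
round(explained, 2)
```

With these rounded values, four eigenvalues exceed 1 and together account for about 82% of the total.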
# Single Linkage
hclust_result = hclust(dist(data, method="euclidean"), method="single")
plot(hclust_result, hang=-1, main="Single Linkage")
rect.hclust(hclust_result, n_factor)
cutree_result = cutree(hclust_result, n_factor)
table(cutree_result)
cutree_result
1 2 3 4
47 11 1 2
# Complete Linkage
hclust_result = hclust(dist(data, method="euclidean"), method="complete")
plot(hclust_result, hang=-1, main="Complete Linkage")
rect.hclust(hclust_result, n_factor)
cutree_result = cutree(hclust_result, n_factor)
table(cutree_result)
cutree_result
1 2 3 4
41 15 1 4
# Average Linkage
hclust_result = hclust(dist(data, method="euclidean"), method="average")
plot(hclust_result, hang=-1, main="Average Linkage")
rect.hclust(hclust_result, n_factor)
cutree_result = cutree(hclust_result, n_factor)
table(cutree_result)
cutree_result
1 2 3 4
47 11 1 2
# Centroid Method
hclust_result = hclust(dist(data, method="euclidean"), method="centroid")
plot(hclust_result, hang=-1, main="Centroid")
rect.hclust(hclust_result, n_factor)
cutree_result = cutree(hclust_result, n_factor)
table(cutree_result)
cutree_result
1 2 3 4
57 1 1 2
# Ward's Method
hclust_result = hclust(dist(data, method="euclidean"), method="ward.D2")
plot(hclust_result, hang=-1, main="Ward")
rect.hclust(hclust_result, n_factor)
cutree_result = cutree(hclust_result, n_factor)
table(cutree_result)
cutree_result
1 2 3 4
41 13 1 6
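All five hierarchical runs above use Euclidean distance on the raw columns. Because body weight and brain weight reach into the thousands while the sleep variables stay below 24 and the indices below 6, the large-valued columns (and the -999 codes) dominate the distances. One common remedy is to standardize each column with `scale()` before calling `dist()`. The sketch below illustrates the effect on a made-up toy data frame (hypothetical values, not from the data set).

```r
# Made-up toy data: animals 1-3 sleep a lot, animals 4-6 sleep little,
# while body weight varies on a much larger numeric scale.
toy <- data.frame(
  body_weight = c(10, 200, 500, 20, 210, 510),
  total_sleep = c(14, 13, 15, 3, 4, 2)
)
rownames(toy) <- paste0("animal_", seq_len(nrow(toy)))

# Ward clustering on the raw values: body weight drives the grouping.
raw_groups <- cutree(hclust(dist(toy), method = "ward.D2"), k = 2)

# Ward clustering on standardized columns (mean 0, sd 1):
# both variables now contribute on a comparable scale.
scaled_groups <- cutree(hclust(dist(scale(toy)), method = "ward.D2"), k = 2)

rbind(raw = raw_groups, scaled = scaled_groups)
```

In this toy case the raw run separates the two heaviest animals from the rest, while the standardized run separates long sleepers from short sleepers.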
# K-means
kmeanclust_result = kmeans(data, n_factor, nstart=500)
kmeanclust_result
K-means clustering with 4 clusters of sizes 43, 4, 1, 13
Cluster means:
body_weight brain_weight slow_wave paradoxical total_sleep
1 39.27 106.81 8.877 1.935 10.81
2 15.51 25.48 8.125 3.075 11.20
3 2547.00 4603.00 2.100 1.800 3.90
4 105.64 195.74 -999.000 -845.208 -300.15
maximum_life_span gestation_time predation_index sleep_exposure_index
1 -29.15 114.2 2.93 2.233
2 -238.62 -999.0 1.75 1.250
3 69.00 624.0 3.00 5.000
4 -57.14 159.9 3.00 3.000
overall_danger_index
1 2.628
2 1.500
3 4.000
4 2.769
Clustering vector:
African giant pouched rat Arctic Fox
1 4
Arctic ground squirrel Asian elephant
4 3
Baboon Big brown bat
1 1
Brazilian tapir Cat
1 1
Chimpanzee Chinchilla
1 1
Cow Desert hedgehog
1 2
Donkey Eastern American mole
4 1
Echidna European hedgehog
1 1
Galago Genet
1 2
Giant armadillo Giraffe
2 4
Goat Golden hamster
1 1
Gorilla Gray seal
4 1
Gray wolf Ground squirrel
4 1
Guinea pig Horse
1 1
Jaguar Kangaroo
4 4
Lesser short-tailed shrew Little brown bat
1 1
Man Mole rat
1 1
Mountain beaver Mouse
1 1
Musk shrew N. American opossum
1 1
Nine-banded armadillo Okapi
1 4
Owl monkey Patas monkey
1 1
Phanlanger Pig
1 1
Rabbit Raccoon
1 4
Rat Red fox
1 1
Rhesus monkey Rock hyrax (Hetero. b)
1 1
Rock hyrax (Procavia hab) Roe deer
1 4
Sheep Slow loris
1 4
Star nosed mole Tenrec
2 1
Tree hyrax Tree shrew
1 1
Vervet Water opossum
1 1
Yellow-bellied marmot
4
Within cluster sum of squares by cluster:
[1] 5273523 778489 0 6636877
(between_SS / total_SS = 79.8 %)
Available components:
[1] "cluster" "centers" "totss" "withinss"
[5] "tot.withinss" "betweenss" "size" "iter"
[9] "ifault"
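The cluster means above show the effect of the missing-value code. Cluster 4 has a slow_wave mean of exactly -999, so every one of its 13 members is missing that value, and cluster 2 has a gestation_time mean of exactly -999. These two clusters are therefore defined largely by which measurement is missing. A minimal sketch of one alternative follows: recode -999 as NA, keep complete rows, standardize, then run k-means. The toy values are made up, and dropping incomplete rows is only one of several possible ways to handle missing data.

```r
# Made-up toy data with -999 as the missing-value code.
toy <- data.frame(
  slow_wave      = c(8, 9, -999, 2, 3, 2.5, 8.5),
  gestation_time = c(30, 25, 40, 600, -999, 650, 28)
)

# Recode the sentinel as NA and keep only complete rows.
toy[toy == -999] <- NA
complete <- na.omit(toy)

# Standardize the columns, then cluster.
set.seed(1)
fit <- kmeans(scale(complete), centers = 2, nstart = 25)

fit$size
fit$cluster
```

Rows 3 and 5 are dropped here because each contains a missing value, so the clusters reflect only measured quantities.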
require(cluster)
Loading required package: cluster
clusplot(data, kmeanclust_result$cluster, color=TRUE, shade=TRUE, labels=2, lines=0)
require(fpc)
Loading required package: fpc
Loading required package: MASS
Loading required package: mclust
Package 'mclust' version 4.3
Loading required package: flexmix
Loading required package: lattice
plotcluster(data, kmeanclust_result$cluster)