This is a short lesson on how to create heatmaps. You will create two heatmaps in this session: one with step-by-step instructions and the other on your own in the exercises.
You are provided with two datasets:
Average monthly high temperatures for cities in the US
RNA-seq data from the ENCODE Project
While there are several ways to create heatmaps, we are going to use the heatmap.2() function from the gplots package:
library(gplots)
We begin by importing our dataset:
Temperatures = read.delim("../dataSets/Temperatures.txt")
We can then preview our dataset:
head(Temperatures)
X Boston New.York Miami Los.Angeles Aspen
1 January 36 36 74 68 36
2 February 39 40 75 69 39
3 March 45 48 76 70 46
4 April 56 58 79 73 53
5 May 66 68 83 74 63
6 June 76 77 87 78 73
Anchorage New.Orleans San.Diego San.Francisco Austin
1 22 62 65 57 62
2 26 65 65 60 65
3 34 72 66 62 72
4 44 78 68 63 60
5 56 85 68 64 87
6 63 90 71 67 92
Phoenix Chicago Honolulu Atlanta Washington.D.C. Seattle
1 67 32 81 51 43 47
2 71 38 81 56 47 51
3 77 47 82 63 56 55
4 85 59 83 72 67 59
5 95 70 85 79 75 65
6 104 80 87 85 84 70
St..Louis Minneapolis
1 40 24
2 45 29
3 55 41
4 67 58
5 77 69
6 85 79
Notice that the first column, X, holds the month names. These should be the row names of the data frame rather than a column of data. To fix this:
row.names(Temperatures) = Temperatures[,1]
Temperatures = Temperatures[,-1]
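As an alternative to the two-step fix above, read.delim() can assign the row names at import time through its row.names argument. The following is a minimal sketch; the temporary file and its contents are a made-up miniature of the temperature table (an assumption for illustration), not the lesson's actual file.

```r
# Hypothetical miniature of a tab-delimited temperature file (not the lesson's data file).
# The header line starts with a tab, i.e. the first column has no name.
tmp <- tempfile(fileext = ".txt")
writeLines(c("\tBoston\tMiami",
             "January\t36\t74",
             "February\t39\t75"), tmp)

# row.names = 1 tells read.delim() to use the first column as the row names
Mini <- read.delim(tmp, row.names = 1)

rownames(Mini)   # the month names
colnames(Mini)   # city names only; the months are no longer a data column
```

Both approaches should give the same data frame; the two-step version simply makes each step visible.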
Now when we preview the dataset, the data frame has the appropriate row names.
head(Temperatures)
Boston New.York Miami Los.Angeles Aspen Anchorage
January 36 36 74 68 36 22
February 39 40 75 69 39 26
March 45 48 76 70 46 34
April 56 58 79 73 53 44
May 66 68 83 74 63 56
June 76 77 87 78 73 63
New.Orleans San.Diego San.Francisco Austin Phoenix
January 62 65 57 62 67
February 65 65 60 65 71
March 72 66 62 72 77
April 78 68 63 60 85
May 85 68 64 87 95
June 90 71 67 92 104
Chicago Honolulu Atlanta Washington.D.C. Seattle
January 32 81 51 43 47
February 38 81 56 47 51
March 47 82 63 56 55
April 59 83 72 67 59
May 70 85 79 75 65
June 80 87 85 84 70
St..Louis Minneapolis
January 40 24
February 45 29
March 55 41
April 67 58
May 77 69
June 85 79
heatmap.2() expects a numeric matrix, so you must first convert the data frame into a matrix. This can be done with the data.matrix() function:
TempMatrix=data.matrix(Temperatures)
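A quick way to confirm that the conversion worked is to inspect the result. The sketch below uses a small made-up data frame, Mini (an assumption for illustration), in place of the full Temperatures table.

```r
# Hypothetical two-city, two-month subset standing in for Temperatures
Mini <- data.frame(Boston = c(36, 39),
                   Miami  = c(74, 75),
                   row.names = c("January", "February"))

# Convert the data frame to a numeric matrix
MiniMatrix <- data.matrix(Mini)

is.matrix(MiniMatrix)    # is the result a matrix?
is.numeric(MiniMatrix)   # does it hold numbers?
dim(MiniMatrix)          # rows, then columns
rownames(MiniMatrix)     # row names carry over from the data frame
```

Note that data.matrix() turns character columns into integer codes, which is why the month names had to be moved into the row names before converting.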
We are now ready to make our heatmap:
heatmap.2(TempMatrix)
Now you are going to do the same actions for your ENCODE RNA-seq dataset:
Congratulations! You have learned how to make a heatmap! However, it looks rather awful. This section is going to cover several options which you can change to improve your heatmap.
Clustering
By default the heatmap.2() function will cluster both the rows and the columns. This means that it will place rows and columns that are more similar closer together. Depending on what you are plotting, you may want to turn this off.
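To make "more similar" concrete: according to its documentation, heatmap.2() by default measures distances with dist() and builds the tree with hclust(). The sketch below applies the same two functions by hand to the columns of a three-city subset of the table above (January to March). It illustrates the principle only and is not the exact internal code of heatmap.2().

```r
# January to March values for three cities, taken from the table above
Mini <- matrix(c(36, 39, 45,    # Boston
                 74, 75, 76,    # Miami
                 32, 38, 47),   # Chicago
               nrow = 3,
               dimnames = list(c("January", "February", "March"),
                               c("Boston", "Miami", "Chicago")))

# dist() compares rows, so transpose to compare cities (columns)
city_dist <- dist(t(Mini))
city_tree <- hclust(city_dist)

print(city_dist)   # Boston and Chicago have the smallest distance

# Cut the tree into two groups: Boston and Chicago share one, Miami is alone
groups <- cutree(city_tree, k = 2)
groups
```

Clustering the rows works the same way, using dist(Mini) without the transpose.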
To remove column clustering:
heatmap.2(TempMatrix, Colv=NA)
To remove row clustering:
heatmap.2(TempMatrix, Rowv=NA)
To remove all clustering:
heatmap.2(TempMatrix, Colv=NA, Rowv=NA)
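The heatmap.2() documentation describes FALSE (or NULL) as the value that turns reordering off for Rowv and Colv. When clustering is disabled while the dendrogram argument keeps its default of "both", heatmap.2() may warn about the discrepancy; setting dendrogram="none" states the intent explicitly. A minimal sketch, assuming gplots is installed and using a small matrix taken from the table above in place of TempMatrix:

```r
library(gplots)

# Three-city, three-month subset standing in for TempMatrix
Mini <- matrix(c(36, 39, 45,    # Boston
                 74, 75, 76,    # Miami
                 32, 38, 47),   # Chicago
               nrow = 3,
               dimnames = list(c("January", "February", "March"),
                               c("Boston", "Miami", "Chicago")))

# No clustering and no dendrograms on either axis
hm <- heatmap.2(Mini, Rowv = FALSE, Colv = FALSE, dendrogram = "none")

# heatmap.2() invisibly returns a list; colInd holds the column order it used
hm$colInd
```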
Dendrograms
By default the heatmap.2() function will add dendrograms to your rows and columns, showing how they are clustered. If you disable clustering you will also remove the dendrograms. However, you may want to keep the clustering and just remove the dendrograms.
To only have row dendrograms:
heatmap.2(TempMatrix, dendrogram="row")