
Homework Problem #2 - Arctic Ice

Hand-in format: IPython Notebook or Python program. Submit via email.

As a reminder: please make sure your code is clean, documented, and understandable. Make sure it runs without errors.



The purpose of this problem is to become familiar with loading, manipulating, analyzing, and plotting image-like data. We will use a dataset collected by the AMSR-E instrument on the Aqua satellite.

The data consist of maps of the concentration of ice in the Arctic collected between 2006 and 2011. The data were obtained from the AMSR database and converted into a single HDF5 file.

Part 1 - Examining a single map


Begin by examining the HDF5 file. You can use h5ls at the command line, or h5py inside the notebook.

If you don't remember how to open HDF5 files and read datasets from them, look at our Day 2 lecture.

There are many datasets, each with a name of the format YYYYMMDD, giving the date. Each dataset is a single map (i.e. a 2D array), where the values give the ice concentration (as a percentage, from 0.0 to 100.0) in that pixel of the map. Be careful of NaN values!
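To illustrate why NaN values need care, here is a minimal sketch using a small synthetic array in place of a real map. The array values are made up for illustration; only the NumPy calls are standard.

```python
import numpy as np

# Synthetic stand-in for one ice-concentration map (percent, 0.0 to 100.0).
# NaN marks pixels with no valid measurement; these values are made up.
ice_map = np.array([[0.0, 55.0, np.nan],
                    [100.0, np.nan, 20.0]])

valid = np.isfinite(ice_map)       # True where a real value exists
n_valid = int(valid.sum())         # number of usable pixels
plain_mean = ice_map.mean()        # NaN, because NaN propagates through mean()
safe_mean = np.nanmean(ice_map)    # mean over the valid pixels only

print(n_valid, plain_mean, safe_mean)
```

Any statistic computed on a map should either mask the NaN pixels first or use a NaN-aware function such as np.nanmean.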


Read one of the maps, and plot it with Matplotlib.


Note: to get the correct orientation, you need to pass the origin='lower' argument to imshow(). Include a colorbar. Remove the tick labels (0, 100, and so on, indicating pixel number) since they are not useful.
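The plotting steps in the note can be sketched as follows. This is only an illustration on a synthetic random array (demo_map is a made-up stand-in, not real data); in the homework the array would come from one dataset of the HDF5 file.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit this line in a notebook
import matplotlib.pyplot as plt
import numpy as np

# Synthetic stand-in for one map; replace with an array read from the HDF5 file.
rng = np.random.default_rng(0)
demo_map = rng.uniform(0.0, 100.0, size=(40, 30))
demo_map[:5, :5] = np.nan  # NaN pixels are left blank by imshow by default

fig, ax = plt.subplots()
# origin="lower" puts row 0 at the bottom, giving the correct orientation
image = ax.imshow(demo_map, origin="lower", vmin=0.0, vmax=100.0)
colorbar = fig.colorbar(image, ax=ax)
colorbar.set_label("Ice concentration (%)")
# Pixel-number ticks carry no useful information, so remove them
ax.set_xticks([])
ax.set_yticks([])
```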


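import h5py

def read_hdf5(path):
    """Print the name, shape, and dtype of every dataset in the HDF5 file."""
    # The with-block closes the file even if an error occurs
    with h5py.File(path, 'r') as f:
        for key in f.keys():
            dataset = f[key]
            print(key, dataset.shape, dataset.dtype, sep="\t")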

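import h5py
import matplotlib.pyplot as plt
import numpy as np

def read_hdf5(path):
    """Plot one summary value per dataset against the dataset name."""
    year_lst = []
    dataset_lst = []
    with h5py.File(path, 'r') as f:
        for key in f.keys():
            dataset = f[key]
            year_lst.append(key)
            # Assumption: plot the NaN-aware mean concentration of each map
            dataset_lst.append(np.nanmean(dataset[...]))
    # "g" means green; markersize sets the size of the 'D' (diamond) markers
    plt.plot(year_lst, dataset_lst, "g", marker='D', markersize=5, label="year")
    # Draw the axis labels
    plt.xlabel("Date (YYYYMMDD)")
    plt.ylabel("Mean ice concentration (%)")
    # Show the legend
    plt.legend(loc="lower right")
    # To annotate each point, call text(): x1, y1 give the text position,
    # ha and va control horizontal and vertical alignment, str(y1) is the text
    # for x1, y1 in zip(year_lst, dataset_lst):
    #     plt.text(x1, y1, str(y1), ha='center', va='bottom', fontsize=10)
    # Show the figure
    plt.show()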


Part 2 - Ice concentration versus time


We want to make a plot of the ice concentration over time.


First, write a loop to read all the datasets of the HDF5 file (e.g. into a dict).


Then, write an analysis function frac_pixels_above(dict, value) which, for each array in the input dict, computes the fraction of pixels above the input value. Use this to make a plot of the fraction of pixels with concentration above 50% versus time.
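One possible shape for this function is sketched below. Two choices here are assumptions rather than requirements of the assignment: the first parameter is named maps (to avoid shadowing Python's built-in dict), and NaN pixels are counted in the total but never as "above". The demo data are made up.

```python
import numpy as np

def frac_pixels_above(maps, value):
    """Return {name: fraction of pixels whose concentration exceeds value}.

    Assumption: NaN pixels never count as "above", but they do count
    in the total, so the denominator is the full array size.
    """
    fractions = {}
    for name, arr in maps.items():
        arr = np.asarray(arr, dtype=float)
        valid = np.isfinite(arr)                      # skip NaN pixels
        n_above = np.count_nonzero(arr[valid] > value)
        fractions[name] = n_above / arr.size
    return fractions

# Synthetic two-map example (values are made up).
demo_maps = {
    "20060101": np.array([[10.0, 60.0], [np.nan, 80.0]]),
    "20060701": np.array([[10.0, 20.0], [np.nan, 51.0]]),
}
demo_fractions = frac_pixels_above(demo_maps, 50.0)
print(demo_fractions)
```

Dividing by the number of valid pixels instead of arr.size is an equally defensible convention; whichever is used should be stated alongside the plot.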

Note: to include "time" on the x-axis of a plot, you may want to write a helper function to convert the dict keys from their YYYYMMDD string format into a 3-tuple of (year, month, day) integer values.


This can then be converted into fractional years (e.g. 1 July 2012 is 2012.5). For simplicity you can assume each month has 30 days.
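A minimal sketch of such helpers, using the 30-day-month simplification, could look like this. The function names parse_date_key and fractional_year are hypothetical, chosen only for illustration.

```python
def parse_date_key(key):
    """Split a 'YYYYMMDD' string into (year, month, day) integers."""
    return int(key[0:4]), int(key[4:6]), int(key[6:8])

def fractional_year(key):
    """Convert 'YYYYMMDD' to a fractional year, assuming 30-day months."""
    year, month, day = parse_date_key(key)
    # 12 months of 30 days each gives a 360-day year
    return year + ((month - 1) * 30 + (day - 1)) / 360.0

print(parse_date_key("20120701"))    # (2012, 7, 1)
print(fractional_year("20120701"))   # 2012.5
```

With this simplification, day 31 of a month maps to the same value as day 1 of the following month, which is harmless for plotting.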


Try experimenting with matplotlib's set_major_formatter to get a good representation of dates in the tick labels.
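As one way to experiment, the sketch below formats fractional-year ticks as year-month labels with matplotlib.ticker.FuncFormatter. The helper year_month_label and the plotted numbers are made up for illustration, and the conversion assumes twelve equal-length months.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit this line in a notebook
import matplotlib.pyplot as plt
from matplotlib.ticker import FuncFormatter

def year_month_label(x, pos=None):
    """Format a fractional year such as 2012.5 as '2012-07'."""
    year = int(x)
    # The small offset guards against floating-point round-off at month edges
    month = min(int((x - year) * 12 + 1e-9) + 1, 12)
    return f"{year}-{month:02d}"

# Made-up values standing in for the real time series.
times = [2006.0, 2006.5, 2007.0, 2007.5]
fractions = [0.40, 0.15, 0.42, 0.14]

fig, ax = plt.subplots()
ax.plot(times, fractions, marker="D")
ax.xaxis.set_major_formatter(FuncFormatter(year_month_label))
ax.set_xlabel("Date (year-month)")
ax.set_ylabel("Fraction of pixels above 50%")
```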

Describe what you see in the plot.


import h5py

def read_hdf5(path):
    """Read every dataset in the HDF5 file into a dict keyed by its YYYYMMDD name."""
    datasets = {}
    with h5py.File(path, 'r') as f:
        for key in f.keys():
            # [...] copies the full 2D map into memory as a NumPy array
            datasets[key] = f[key][...]
    return datasets

Part 3 - Physical units


To be more quantitative, we will compute the actual surface area of Earth in $\rm{km}^2$ over which the ice concentration is above a given threshold.


However, these maps are projections of a spherical surface, so the pixels have different areas.

Every map uses the same projection, so a given pixel has the same area in every map.


The areas (in $\rm{km}^2$) are available in the file named data/p2_icedata_area.hdf5. Inspect, then load, this datafile. Plot it (with colorbar and units).
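Once the area map is loaded, the area above a threshold is the sum of the pixel areas where the concentration exceeds that threshold. The sketch below shows this on small made-up arrays; the function name area_above is hypothetical, and treating NaN pixels as contributing no area is an assumption.

```python
import numpy as np

def area_above(ice_map, pixel_area, threshold):
    """Sum the area (km^2) of pixels whose concentration exceeds threshold."""
    ice_map = np.asarray(ice_map, dtype=float)
    pixel_area = np.asarray(pixel_area, dtype=float)
    # Assumption: pixels with NaN concentration or NaN area contribute nothing
    valid = np.isfinite(ice_map) & np.isfinite(pixel_area)
    above = np.zeros(ice_map.shape, dtype=bool)
    above[valid] = ice_map[valid] > threshold
    return float(pixel_area[above].sum())

# Made-up 2x2 example: concentrations in percent, areas in km^2.
demo_map = np.array([[10.0, 60.0], [np.nan, 80.0]])
demo_area = np.array([[100.0, 200.0], [300.0, 400.0]])
print(area_above(demo_map, demo_area, 50.0))  # 600.0
```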






