Chapter 5. Groups, Links, and Iteration: The “H” in HDF5

So far we’ve seen how to create Dataset objects by giving them a name in the file like myfile["dataset1"] or myfile["dataset2"]. Unless you’re one of those people who stores all their documents on the desktop, you can probably see the flaw in this approach.

Groups are the HDF5 container object, analogous to folders in a filesystem. They can hold datasets and other groups, allowing you to build up a hierarchical structure with objects neatly organized in groups and subgroups.

The Root Group and Subgroups

You may have guessed by now that the File object is itself a group. In this case, it also serves as the root group, named /, our entry point into the file.

The more general group object is h5py.Group, of which h5py.File is a subclass. Other groups are easily created by the method create_group:

>>> import h5py
>>> f = h5py.File("Groups.hdf5", "w")  # explicit mode; recent h5py versions open read-only by default
>>> subgroup = f.create_group("SubGroup")
>>> subgroup
<HDF5 group "/SubGroup" (0 members)>
>>> subgroup.name
'/SubGroup'
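
To make the File/Group relationship concrete, here is a minimal sketch. The file name is made up for illustration, and the in-memory "core" driver with backing_store=False is an assumption chosen only so that nothing is written to disk:

```python
import h5py

# Hypothetical file name; the in-memory "core" driver keeps this off disk.
with h5py.File("sketch_root.hdf5", "w", driver="core", backing_store=False) as f:
    file_is_group = isinstance(f, h5py.Group)   # File is a subclass of Group
    root_name = f.name                          # name of the root group
    subgroup = f.create_group("SubGroup")
    has_subgroup = "SubGroup" in f              # membership test on the root group
    parent_name = subgroup.parent.name          # the subgroup's parent is the root group
```

Because File is a Group, anything you can do with a group (creating members, testing membership, indexing by name) also works directly on the file object.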

Of course, groups can also be nested. The create_group method exists on all Group objects, not just File:

>>> subsubgroup = subgroup.create_group("AnotherGroup")
>>> subsubgroup.name
'/SubGroup/AnotherGroup'
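
As a rough illustration of nesting, the sketch below reaches the same nested group in two ways: by a path from the root and by a name relative to its parent. The file name and the in-memory driver settings are assumptions for the example, not part of the session above:

```python
import h5py

# Hypothetical file name; in-memory so the example leaves no file behind.
with h5py.File("sketch_nested.hdf5", "w", driver="core", backing_store=False) as f:
    subgroup = f.create_group("SubGroup")
    subsubgroup = subgroup.create_group("AnotherGroup")  # name is relative to subgroup

    nested_name = subsubgroup.name
    via_root = f["SubGroup/AnotherGroup"].name    # path starting at the root group
    via_parent = subgroup["AnotherGroup"].name    # name relative to the parent group
    root_members = list(f.keys())                 # only direct members are listed
```

Note that listing the root group shows only its direct members; AnotherGroup appears under SubGroup, not under the root.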

By the way, you don’t have to manually create nested groups one at a time. If you supply a full path, HDF5 will create all the intermediate groups for you:

>>> out = f.create_group('/some/big/path')
>>> out
<HDF5 group "/some/big/path" (0 members)>

The same goes for creating datasets; just ...
