h5py basics: dynamically resizing a dataset

Kiwi lee
Jun 22, 2018


Abstract

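h5py is commonly used to store large amounts of data. Compared with CSV it reads and writes faster, and working with it feels more natural in Python.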

How to use

Prepare the data. Here the data is a sequence of image frames.

# setting
import h5py
import numpy as np
frame = np.zeros((1, 60, 80))  # one 60x80 frame; axis 0 is the frame index

Create a dataset and configure it so that it can grow later.

# initial
with h5py.File("mytestfile.hdf5", "w") as f:
    # maxshape=None on axis 0 lets the frame count grow; resizing requires chunked storage
    dset = f.create_dataset('video', data=frame,
                            maxshape=(None, 60, 80), chunks=True)
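To confirm that the dataset really was created as resizable, its properties can be inspected right after creation. The following is a minimal sketch; the temporary path and the file name `inspect_demo.hdf5` are assumptions made for illustration so that no real file is overwritten.

```python
import os
import tempfile

import h5py
import numpy as np

frame = np.zeros((1, 60, 80))

# Hypothetical scratch path, used only for this example.
path = os.path.join(tempfile.mkdtemp(), "inspect_demo.hdf5")

with h5py.File(path, "w") as f:
    dset = f.create_dataset("video", data=frame,
                            maxshape=(None, 60, 80), chunks=True)
    shape = dset.shape          # current size of the dataset
    maxshape = dset.maxshape    # None marks the axis that is allowed to grow
    chunk_shape = dset.chunks   # picked automatically because chunks=True

print(shape, maxshape, chunk_shape)
```

A `None` in `maxshape` means that axis has no upper limit, which is what allows frames to be appended later.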

List every dataset in the file. At this point there is only one, "video".

# get key
with h5py.File("mytestfile.hdf5", "r") as f:
    print(list(f.keys()))  # list() shows the plain names instead of a keys view
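When a file holds more than a handful of datasets, it can help to collect each dataset's shape along with its name. The sketch below uses `visititems` for that; the helper name `record`, the dictionary `shapes`, and the scratch file are assumptions introduced only for this example.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical scratch file containing the same 'video' dataset.
path = os.path.join(tempfile.mkdtemp(), "keys_demo.hdf5")
with h5py.File(path, "w") as f:
    f.create_dataset("video", data=np.zeros((1, 60, 80)),
                     maxshape=(None, 60, 80), chunks=True)

shapes = {}

def record(name, obj):
    # visititems calls this for every group and dataset in the file;
    # returning None tells h5py to keep walking.
    if isinstance(obj, h5py.Dataset):
        shapes[name] = obj.shape

with h5py.File(path, "r") as f:
    names = list(f.keys())   # top-level names only
    f.visititems(record)     # walks nested groups as well

print(names, shapes)
```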

Enlarge the dataset first, then write the new frame into the added slot.

# extend dataset
with h5py.File("mytestfile.hdf5", "a") as hf:
    # grow axis 0 by one frame, then fill the new last slot
    hf['video'].resize(hf['video'].shape[0] + 1, axis=0)
    hf['video'][-1:] = frame
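The same resize-then-write pattern extends to appending several frames at once, which avoids reopening the file for every frame. This is a minimal sketch under that assumption; `append_frames` is a hypothetical helper, not part of h5py, and the scratch file is used only for illustration.

```python
import os
import tempfile

import h5py
import numpy as np


def append_frames(path, name, frames):
    """Append `frames` (shape (n, 60, 80)) to the end of dataset `name`."""
    with h5py.File(path, "a") as hf:
        dset = hf[name]
        old_len = dset.shape[0]
        # Grow axis 0 by the number of new frames, then fill the new rows.
        dset.resize(old_len + frames.shape[0], axis=0)
        dset[old_len:] = frames
        return dset.shape


# Hypothetical scratch file that starts with one all-zero frame.
path = os.path.join(tempfile.mkdtemp(), "append_demo.hdf5")
with h5py.File(path, "w") as f:
    f.create_dataset("video", data=np.zeros((1, 60, 80)),
                     maxshape=(None, 60, 80), chunks=True)

new_shape = append_frames(path, "video", np.ones((5, 60, 80)))

with h5py.File(path, "r") as hf:
    video = hf["video"][:]

print(new_shape)
```

Writing to `dset[old_len:]` rather than `dset[-n:]` keeps the target range explicit, which is easier to reason about when the batch size varies.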

Read the dataset. Indexing the file by name returns an h5py dataset object, but I want to work with a NumPy array directly, so I add [:].

# get data
with h5py.File("mytestfile.hdf5", 'r') as hf:
    data = hf['video']     # h5py Dataset object
    data = hf['video'][:]  # NumPy array holding a copy of all frames
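Because `[:]` copies the whole dataset into memory, a long video may be better read a few frames at a time. The sketch below slices the dataset instead; the scratch file and the frame values written into it are assumptions made for illustration.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical scratch file with four frames; frame i is filled with the value i.
path = os.path.join(tempfile.mkdtemp(), "read_demo.hdf5")
frames = np.arange(4, dtype=np.float64).reshape(4, 1, 1) * np.ones((4, 60, 80))
with h5py.File(path, "w") as f:
    f.create_dataset("video", data=frames,
                     maxshape=(None, 60, 80), chunks=True)

with h5py.File(path, "r") as hf:
    first = hf["video"][0]      # a single frame, shape (60, 80)
    middle = hf["video"][1:3]   # frames 1 and 2 only, shape (2, 60, 80)

# Both results are ordinary NumPy arrays and stay usable after the file closes.
print(first.shape, middle.shape)
```

The dataset object itself is tied to the open file, so any slicing has to happen inside the `with` block.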
