Abstract
h5py is commonly used to store large datasets. Compared with CSV, it reads and writes faster, and its interface feels more Pythonic.
How to use
Prepare the data. Here the data is a sequence of image frames.
# setup: one blank 60x80 frame, with a leading axis for the frame count
import h5py
import numpy as np
frame = np.zeros((1, 60, 80))
Create a dataset and set maxshape so that it can grow along the first axis later.
# initial
with h5py.File("mytestfile.hdf5", "w") as f:
    dset = f.create_dataset('video', data=frame,
                            maxshape=(None, 60, 80), chunks=True)
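To check that the dataset really is resizable, a minimal sketch like the one below can inspect its `shape`, `maxshape`, and `chunks` right after creation. The temporary path and the `created_*` variable names are assumptions made for illustration, not part of the original example.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical scratch path so the sketch does not overwrite mytestfile.hdf5.
path = os.path.join(tempfile.mkdtemp(), "resizable_demo.hdf5")

frame = np.zeros((1, 60, 80))

with h5py.File(path, "w") as f:
    dset = f.create_dataset("video", data=frame,
                            maxshape=(None, 60, 80), chunks=True)
    created_shape = dset.shape        # current size of the dataset
    created_maxshape = dset.maxshape  # None marks the axis with no upper limit
    created_chunks = dset.chunks      # chunk shape chosen automatically by h5py

print(created_shape, created_maxshape, created_chunks)
```

A `None` in `maxshape` means that axis has no upper limit, and resizable datasets must use chunked storage, which is why `chunks=True` is passed.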
List all datasets in this file. At this point there is only "video".
# get keys (wrap in list() so the names are printed, not just a view object)
with h5py.File("mytestfile.hdf5", "r") as f:
    print(list(f.keys()))
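Beyond the names, it can be useful to see the shape and dtype of each dataset. The following is a small sketch under the assumption that the file only holds top-level datasets. The scratch path and the `summary` dictionary are hypothetical names introduced here.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical scratch file with the same layout as the example above.
path = os.path.join(tempfile.mkdtemp(), "keys_demo.hdf5")
with h5py.File(path, "w") as f:
    f.create_dataset("video", data=np.zeros((1, 60, 80)),
                     maxshape=(None, 60, 80), chunks=True)

summary = {}
with h5py.File(path, "r") as f:
    # items() yields (name, object) pairs for the top level of the file.
    for name, obj in f.items():
        if isinstance(obj, h5py.Dataset):
            summary[name] = (obj.shape, str(obj.dtype))

print(summary)
```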
Enlarge the dataset first, then write the new frame into the added slot.
# extend dataset: grow axis 0 by one, then fill the last slot
with h5py.File("mytestfile.hdf5", "a") as hf:
    hf['video'].resize(hf['video'].shape[0] + 1, axis=0)
    hf['video'][-1:] = frame
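The resize-then-write step can be wrapped in a helper that appends several frames at once. This is only a sketch: `append_frames` is a hypothetical helper name, and it assumes the dataset was created with an unlimited first axis as shown earlier.

```python
import os
import tempfile

import h5py
import numpy as np


def append_frames(path, name, frames):
    """Append frames of shape (n, 60, 80) to an existing resizable dataset."""
    with h5py.File(path, "a") as hf:
        dset = hf[name]
        old_len = dset.shape[0]
        # Grow the first axis by the number of new frames.
        dset.resize(old_len + frames.shape[0], axis=0)
        # Write the new frames into the freshly added slots.
        dset[old_len:] = frames
        return dset.shape


# Hypothetical scratch file that starts with one blank frame.
path = os.path.join(tempfile.mkdtemp(), "append_demo.hdf5")
with h5py.File(path, "w") as f:
    f.create_dataset("video", data=np.zeros((1, 60, 80)),
                     maxshape=(None, 60, 80), chunks=True)

final_shape = append_frames(path, "video", np.ones((3, 60, 80)))

with h5py.File(path, "r") as f:
    first_frame_sum = float(f["video"][0].sum())  # original blank frame
    last_frame_sum = float(f["video"][-1].sum())  # appended frame of ones

print(final_shape, first_frame_sum, last_frame_sum)
```

Appending in batches rather than one frame per file open avoids reopening the file for every frame.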
Read the dataset. Indexing the file returns an h5py dataset object, and since I want to work with a numpy array directly, I append [:].
# get data
with h5py.File("mytestfile.hdf5", 'r') as hf:
    data = hf['video']     # <HDF5 dataset>
    data = hf['video'][:]  # <np.array>
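When the dataset is large, `[:]` loads everything into memory. As a sketch of an alternative, slicing the dataset object reads only the requested part from disk and still returns a numpy array. The scratch path, the test data, and the variable names below are assumptions for illustration.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical scratch file holding 5 frames with distinct values.
path = os.path.join(tempfile.mkdtemp(), "slice_demo.hdf5")
frames = np.arange(5 * 60 * 80, dtype=np.float64).reshape(5, 60, 80)
with h5py.File(path, "w") as f:
    f.create_dataset("video", data=frames,
                     maxshape=(None, 60, 80), chunks=True)

with h5py.File(path, "r") as hf:
    dset = hf["video"]
    first_two = dset[:2]            # only the first two frames are read
    patch = dset[3, 10:20, 30:40]   # a 10x10 region of frame 3

# Both results are plain numpy arrays and stay valid after the file closes.
print(type(first_two), first_two.shape, patch.shape)
```

The slicing has to happen inside the `with` block, because the dataset object itself can no longer be read once the file is closed.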
Reference
- http://docs.h5py.org/en/latest/index.html
- https://stackoverflow.com/questions/25655588/incremental-writes-to-hdf5-with-h5py?rq=1
- https://stackoverflow.com/questions/47072859/how-to-append-data-to-one-specific-dataset-in-a-hdf5-file-with-h5py
- http://jeff-leaf.site/2017/09/29/Python%E5%A4%84%E7%90%86HDF5%E6%96%87%E4%BB%B6/
- https://stackoverflow.com/questions/22998248/how-to-resize-an-hdf5-array-with-h5py