hdf5N2Write

class hdf5N2Write : public kotekan::Stage

Buffered-transpose writer: buffers sequential time frames and writes HDF5 files.

This stage groups frames into windows of num_file_t frames based on their absolute time index, buffers a complete (num_file_f * num_file_t) block per output file in memory (via N2FileData), and writes all arrays to disk in large contiguous slabs. Missing frames stay zero-filled in the output; /frames_added records which (f, t) pairs were present, and frac_lost remains 1.0 where frames were absent.

Files are first written to <base_dir>/.partial/vis_<abs_idx>.h5 and then renamed to vis_<abs_idx>_<YYYYMMDDThhmmss>_<nsec>.h5, where the timestamp comes from the earliest fpga_start_tick in the file. If that finalized name already exists when a new window is opened, late frames are dropped. Partial files are finalized when full, after late_frame_grace_seconds of inactivity once later windows arrive, or during shutdown.

Buffers

  • in_buf Input visibility buffer

    • Format: VisBuffer

    • Metadata: N2Metadata

Configuration

Metrics

  • kotekan_hdf5N2Write_write_time_seconds Duration to write the last flush

  • kotekan_hdf5N2Write_n_datasets Number of datasets currently open

  • kotekan_hdf5N2Write_open_file_info Gauge=1 for each open file {abs_file_idx, partial_path, file_mode}

  • kotekan_hdf5N2Write_open_file_age_seconds Wall time since file open {abs_file_idx}

  • kotekan_hdf5N2Write_file_completion_fraction Added (f,t) pairs / expected per file {abs_file_idx}

  • kotekan_hdf5N2Write_add_frame_errors_total Counter of add_frame failures {reason}

  • kotekan_hdf5N2Write_last_add_frame_error_seconds Timestamp of last add_frame failure {reason, abs_file_idx, freq_id, t_index}

  • kotekan_hdf5N2Write_finalize_failures_total Counter of finalize failures {reason}

  • kotekan_hdf5N2Write_unfinalized_file Gauge=1 for files left partial/quarantined {abs_file_idx, partial_path}

Example
hdf5_vis_write:
    kotekan_stage: hdf5N2Write
    in_buf: n2_merge_buffer      # Input N2-type buffer
    base_dir: ./vis_data         # Output directory root
    num_file_t: 10               # Frames per file window
    blocksize_f: 4               # Chunk cap (frequency)
    blocksize_p: 8               # Chunk cap (products/elements)
    blocksize_t: 5               # Chunk cap (time)
    compression: deflate         # none|deflate|zstd|lz4 (with bitshuffle)
    compression_level: 4         # Codec level (0=stage default)
    use_bitshuffle: true         # Enable bitshuffle filter
    late_frame_grace_seconds: 30 # Grace before finalizing partials
    max_frames: -1               # Stop after N frames (-1 disables)

Note

N2FileData documents the per-file layout, chunking, and compression details.

Note

User-level documentation lives in docs/sphinx/user/processes/hdf5N2Write.rst.

Param in_buf:

String. N2 buffer supplying frames (buffer_type must be “N2”).

Param base_dir:

String. Output directory (absolute or relative to the process working directory where kotekan was invoked). An acquisition subdirectory acq_YYYYMMDD_HHMMSS_NNNNNNNNN is appended automatically at startup; <base_dir>/<acq>/ and <base_dir>/<acq>/.partial are created.

Param num_file_t:

UInt. Number of time frames per file (t_index = abs_time_idx % num_file_t).
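
As a rough illustration of this windowing rule, the sketch below splits an absolute time index into a file window and a slot within that window. The t_index formula is the one given above. Deriving the window index by integer division is an assumption made here for illustration, and FrameSlot and locate_frame are hypothetical names, not part of kotekan.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type: where a frame lands for a given num_file_t.
struct FrameSlot {
    uint64_t abs_file_idx; // file window (assumed: abs_time_idx / num_file_t)
    uint64_t t_index;      // slot in the window (abs_time_idx % num_file_t)
};

// Split an absolute time index into (window, slot).
inline FrameSlot locate_frame(uint64_t abs_time_idx, uint64_t num_file_t) {
    assert(num_file_t > 0); // a window must hold at least one frame
    return {abs_time_idx / num_file_t, abs_time_idx % num_file_t};
}
```

Under these assumptions, with num_file_t = 10, absolute index 27 lands in window 2 at slot 7.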

Param blocksize_f:

UInt. Chunk cap for the frequency dimension (default: 16).

Param blocksize_p:

UInt. Chunk cap for product/element dimensions (default: 16).

Param blocksize_t:

UInt. Chunk cap for the time dimension (default: num_file_t).

Param compression:

String. “none” (default) or “deflate” for zlib; also “zstd” or “lz4” when paired with bitshuffle.

Param compression_level:

UInt. Codec level (0 picks 4 for deflate, 9 for bitshuffle).

Param use_bitshuffle:

Bool. Enable bitshuffle with the selected backend codec (default: false).

Param baseband_gain_file:

String. Path to the digital gains HDF5 file. If empty (default), gains are not written. Mutually exclusive with baseband_gain_host_info.

Param baseband_gain_host_info:

String. Absolute config path (e.g. /fpga_controller) to a block that exposes host, port, and an optional gains_endpoint (default /get-current-gain-file). At startup the gains HDF5 file is fetched once over plain HTTP from <host>:<port><gains_endpoint> into <base_dir>/.partial/baseband_gains.h5 and then used as if it had been provided via baseband_gain_file. Mutually exclusive with baseband_gain_file.
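
The two gain sources and the fetch URL can be pictured with the minimal sketch below. It is not the stage's code: check_gain_sources and gains_url are hypothetical helpers, and the choice to throw on a conflicting configuration is an assumption. Only the mutual exclusivity, the plain-HTTP scheme, and the default endpoint come from the descriptions above.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical check: at most one gain source may be configured.
// An empty string means "not set".
inline void check_gain_sources(const std::string& baseband_gain_file,
                               const std::string& baseband_gain_host_info) {
    if (!baseband_gain_file.empty() && !baseband_gain_host_info.empty()) {
        throw std::invalid_argument(
            "baseband_gain_file and baseband_gain_host_info are mutually exclusive");
    }
}

// Hypothetical helper: build <host>:<port><gains_endpoint> as a plain-HTTP URL.
inline std::string gains_url(const std::string& host, unsigned port,
                             const std::string& gains_endpoint = "/get-current-gain-file") {
    assert(!host.empty()); // the referenced config block must supply a host
    return "http://" + host + ":" + std::to_string(port) + gains_endpoint;
}
```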

Param baseband_gain_update_idx:

Int. Index along the update_time axis to read from the gains file (-1 = latest, default: -1).

Param late_frame_grace_seconds:

UInt. Seconds of inactivity to wait for late frames before a partial file is finalized, once later windows have arrived (default: 60).
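
The finalization rule described for this stage (full, idle past the grace period once later windows arrive, or shutdown) can be sketched as a single predicate. WindowState and should_finalize are hypothetical names, and using >= for the grace comparison is an assumption; the stage's actual bookkeeping may differ.

```cpp
#include <cstdint>

// Hypothetical snapshot of one open file window.
struct WindowState {
    bool full;                 // every (f, t) pair has been added
    bool later_window_seen;    // a frame for a later window has arrived
    double last_update_wall_s; // wall time of the most recent accepted frame
};

// Decide whether a partial file should be finalized now.
inline bool should_finalize(const WindowState& w, double now_wall_s,
                            uint32_t late_frame_grace_seconds, bool shutting_down) {
    if (shutting_down || w.full) return true;
    const double idle_s = now_wall_s - w.last_update_wall_s;
    // The grace period only applies once a later window has shown up.
    return w.later_window_seen && idle_s >= static_cast<double>(late_frame_grace_seconds);
}
```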

Param max_frames:

Int. Stop writing after this many frames (-1 = unlimited).

Public Functions

hdf5N2Write(kotekan::Config &config, const std::string &unique_name, kotekan::bufferContainer &buffer_container)
virtual ~hdf5N2Write()
virtual void main_thread() override

Buffered writer for CHORD/CHIME N2 visibilities. Inline Doxygen covers the file layout, chunking/compression, configuration, metrics, and an example YAML snippet.

class N2FileData

Buffer a full file worth of arrays in memory, then flush to disk all at once.

A file contains num_file_t time frames across num_file_f frequencies. Rather than writing frames individually as they arrive, this class holds all relevant arrays in their target on-disk layout, “transposed” so that the time axis is fastest-varying, which allows large contiguous writes. Additional bookkeeping for the file (HDF5 handle, paths, open time) is also stored here. The file start time, and therefore the final file name, is determined by the timestamp of the earliest frame in the file. Because frames of different frequencies may arrive out of temporal order, it cannot be set until all frames have arrived.

File layout

  • Attributes: version, file_mode (CHORD/CHIME), abs_file_idx, num_file_t, num_elements, num_prod, num_ev, num_freq, n2_layout, telescope geometry (origin, orientations, dish maps), EOP tables, num_file_f.

  • Index maps: /index_map/freq (MHz + width per file frequency), /index_map/prod, /index_map/grid_x_idx, /index_map/grid_y_idx, /index_map/feed_pos_disp_m, /index_map/coelev_disp_deg, /index_map/type, /index_map/dish_positions_in_grid_coords.

  • Per-(f, p, t)/(f, t) datasets: /vis, /eval, /evec, /erms, /gain, /radiometer_chi2, /frames_added; vis_weight, flags, frac_lost, and frac_rfi live at the root (CHORD) or under /flags/{vis_weight, flags, frac_lost, frac_rfi} when file_mode == CHIME.

  • Per-time metadata: /fpga_start_tick, /frame_length_fpga_ticks, /time_center_ut1_ns, /bin_ut1_ns, /bin_start_ERA_deg, /bin_end_ERA_deg, /bin_start_ERAL, /bin_end_ERAL, /rfi_frame_excision_enabled, /rfi_frame_excision_num, /rfi_frame_excision_threshold, /rfi_frame_excision_fraction.

  • /config_json grows on flush with snapshots from configTracker.
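
For readers consuming these files, the mode-dependent location of vis_weight, flags, frac_lost, and frac_rfi can be sketched as below. The Mode enum is a stand-in for the documented FileMode, and flag_dataset_path is a hypothetical helper written for illustration.

```cpp
#include <string>

// Stand-in for the documented FileMode values.
enum class Mode { CHORD, CHIME };

// Resolve the HDF5 path of one of vis_weight, flags, frac_lost, frac_rfi:
// at the root for CHORD, under /flags/ for CHIME.
inline std::string flag_dataset_path(Mode mode, const std::string& name) {
    const std::string prefix = (mode == Mode::CHIME) ? "/flags/" : "/";
    return prefix + name;
}
```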

Chunking and compression

  • Chunk caps: blocksize_f (frequency), blocksize_p (product/element), blocksize_t (time). A value of 0 leaves the dimension size unchanged.

  • Compression: compression=“none” (default) or “deflate” (zlib; default level 4 when compression_level=0).

  • use_bitshuffle=true adds the bitshuffle filter and uses compression as the backend (“none”, “zstd”, “lz4”); compression_level=0 maps to level 9. Required HDF5 plugins must be available at runtime.
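
One way to read the chunk caps and level defaults above is sketched here. Both functions are hypothetical. Treating a nonzero cap as min(dimension, cap) is an interpretation of the word "cap", and the level mapping simply restates the documented defaults for compression_level=0.

```cpp
#include <algorithm>
#include <cstddef>

// Chunk extent along one dimension: a cap of 0 keeps the full dimension,
// otherwise the extent is limited to the cap (assumed interpretation).
inline std::size_t chunk_extent(std::size_t dim_size, std::size_t blocksize) {
    return blocksize == 0 ? dim_size : std::min(dim_size, blocksize);
}

// Effective codec level: 0 selects the documented default,
// 9 with bitshuffle and 4 for plain deflate.
inline std::size_t effective_level(std::size_t compression_level, bool use_bitshuffle) {
    if (compression_level != 0) return compression_level;
    return use_bitshuffle ? 9 : 4;
}
```

With the example configuration (num_file_t: 10, blocksize_t: 5), this reading gives a time chunk extent of 5.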

Subclassed by TestVisFileData

Public Types

enum FileMode

Values:

enumerator CHORD
enumerator CHIME
enum class AddFrameStatus

Values:

enumerator Success
enumerator OutOfBounds
enumerator Duplicate
enumerator MetadataMismatch

Public Functions

N2FileData(FileMode file_mode_, uint64_t num_file_t_, const N2FrameView &fv, const double open_wall_s_, const uint64_t abs_file_idx_, const size_t blocksize_f_, const size_t blocksize_p_, const size_t blocksize_t_, const std::string compression_, const size_t compression_level_, const bool use_bitshuffle_, const std::string base_dir_, const std::string baseband_gain_file_, const int baseband_gain_update_idx_ = -1)
AddFrameStatus add_frame(const N2FrameView &fv, size_t t_index)

Add a frame of data at the computed time index.

Parameters:
  • fv – Frame view containing data.

  • t_index – Time index within this file block (0..num_file_t-1).

Returns:

status describing whether the frame was accepted.
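
The status values can be illustrated with a minimal slot tracker. This is a sketch under stated assumptions, not the class's implementation: SlotTracker is a hypothetical name, it tracks only which (f, t) slots are filled, it takes the frequency index directly instead of a frame view, and it does not model MetadataMismatch.

```cpp
#include <cstddef>
#include <vector>

// Mirrors the documented AddFrameStatus enumerators.
enum class AddFrameStatus { Success, OutOfBounds, Duplicate, MetadataMismatch };

// Hypothetical stand-in that records which (f, t) slots have been filled.
class SlotTracker {
public:
    SlotTracker(std::size_t num_file_f, std::size_t num_file_t)
        : num_f_(num_file_f), num_t_(num_file_t), added_(num_file_f * num_file_t, false) {}

    AddFrameStatus add(std::size_t f, std::size_t t_index) {
        if (f >= num_f_ || t_index >= num_t_) return AddFrameStatus::OutOfBounds;
        const std::size_t slot = f * num_t_ + t_index;
        if (added_[slot]) return AddFrameStatus::Duplicate; // slot already filled
        added_[slot] = true;
        return AddFrameStatus::Success;
    }

private:
    std::size_t num_f_;
    std::size_t num_t_;
    std::vector<bool> added_;
};
```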

std::optional<std::string> _get_final_filename()

Get the final filename for this file, based on the earliest FPGA tick among the frames added so far. Returns std::nullopt if no frames have been added yet, since the earliest time is then unknown. If frames with earlier ticks are added later, this method may return a different filename.
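
A sketch of the naming pattern vis_<abs_idx>_<YYYYMMDDThhmmss>_<nsec>.h5 follows. Several details are assumptions: the earliest start is taken as an already-converted Unix time (the tick-to-time conversion is not shown), the timestamp is formatted in UTC, and the nanosecond field is zero-padded to nine digits. final_filename is a hypothetical free function, not the member documented above.

```cpp
#include <cstdint>
#include <cstdio>
#include <ctime>
#include <optional>
#include <string>

// Build the finalized name, or std::nullopt when no frame has set a start time.
inline std::optional<std::string> final_filename(uint64_t abs_file_idx,
                                                 std::optional<std::timespec> earliest_start) {
    if (!earliest_start) return std::nullopt;
    const std::time_t secs = earliest_start->tv_sec;
    const std::tm* utc = std::gmtime(&secs); // not thread-safe; fine for a sketch
    if (utc == nullptr) return std::nullopt;
    char stamp[32];
    if (std::strftime(stamp, sizeof(stamp), "%Y%m%dT%H%M%S", utc) == 0) return std::nullopt;
    char name[96];
    std::snprintf(name, sizeof(name), "vis_%llu_%s_%09ld.h5",
                  static_cast<unsigned long long>(abs_file_idx), stamp,
                  static_cast<long>(earliest_start->tv_nsec));
    return std::string(name);
}
```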

bool flush_to_disk()

Flush buffered data to the associated file, always writing the entire time range [0 .. num_file_t-1] regardless of which frames were populated. Returns true if a write occurred, false on error. Errors are logged, but this class takes no further action and still attempts to write the data.

void close()

Close the associated dataset handle if open.

inline bool full() const

Check if all (f, t) pairs have been added.

inline double completion_fraction() const
inline size_t idx_fpt(size_t f, size_t p, size_t t) const

Get the index for a (f, p, t) triplet.

inline size_t idx_fet(size_t f, size_t e, size_t t) const

Get the index for a (f, e, t) triplet.

inline size_t idx_feit(size_t f, size_t e, size_t i, size_t t) const

Get the index for a (f, e, i, t) quadruplet.

inline size_t idx_fit(size_t f, size_t i, size_t t) const

Get the index for a (f, i, t) triplet.

inline size_t idx_ft(size_t f, size_t t) const

Get the index for a (f, t) pair.
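
With the time axis fastest-varying, the flat offsets might be computed as in the sketch below. The row-major ordering with axes in the order given by each function name is an assumption; only the fastest-varying time axis comes from the class description. TransposedShape is a hypothetical stand-in for the relevant dimensions.

```cpp
#include <cstddef>

// Hypothetical stand-in holding only the dimensions needed for indexing.
struct TransposedShape {
    std::size_t num_file_f;
    std::size_t num_prod;
    std::size_t num_file_t;

    // Offset of (f, p, t) in an array laid out as [f][p][t], time fastest.
    std::size_t idx_fpt(std::size_t f, std::size_t p, std::size_t t) const {
        return (f * num_prod + p) * num_file_t + t;
    }

    // Offset of (f, t) in an array laid out as [f][t].
    std::size_t idx_ft(std::size_t f, std::size_t t) const {
        return f * num_file_t + t;
    }
};
```

Under this layout, consecutive time samples of one (f, p) pair are adjacent in memory, which is what makes a per-file flush a small number of large contiguous writes.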

inline size_t get_added_count() const

Get the number of added frames (i.e., (f, t) pairs).

inline size_t get_expected_count() const

Get the number of expected frames (i.e., (f, t) pairs) for a complete file.
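
The relationship between these counters, full(), and completion_fraction() can be sketched as follows. FileCounts is a hypothetical struct, the expected count is taken to be num_file_f * num_file_t (the block size given in the stage description), and returning 0.0 for an empty shape is an assumption.

```cpp
#include <cstddef>

// Hypothetical stand-in for the per-file frame counters.
struct FileCounts {
    std::size_t num_file_f;
    std::size_t num_file_t;
    std::size_t added; // number of (f, t) pairs accepted so far

    std::size_t expected() const { return num_file_f * num_file_t; }
    bool full() const { return added == expected(); }
    double completion_fraction() const {
        return expected() == 0
                   ? 0.0
                   : static_cast<double>(added) / static_cast<double>(expected());
    }
};
```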

Public Members

const size_t num_elements
const size_t num_prod
const size_t num_ev
const size_t num_file_f
const size_t num_file_t
const FileMode file_mode
const size_t blocksize_f
const size_t blocksize_p
const size_t blocksize_t
const std::string compression
const size_t compression_level
const bool use_bitshuffle
const double open_wall_s
const uint64_t abs_file_idx
const std::string base_dir
const std::string baseband_gain_file
const int baseband_gain_update_idx
const std::string partial_filepath
const N2Layout n2_layout
double last_update_wall_s
std::unique_ptr<HighFive::File> h5_file
std::optional<size_t> gains_file_hash = std::nullopt

See tests/boost/test_hdf5N2Write.cpp for end-to-end expectations and file layout checks.