API Documentation

We highly recommend that newcomers walk through the Onda Tour before diving into this reference documentation.

Support For Generic Path-Like Types

Onda.jl attempts to be as agnostic as possible with respect to the storage system that sample data, Arrow files, etc. are read from/written to. As such, any path-like argument accepted by an Onda.jl API function should generically "work" as long as the argument's type supports:

  • Base.read(path)::Vector{UInt8} (return the bytes stored at path)
  • Base.write(path, bytes::Vector{UInt8}) (write bytes to the location specified by path)
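As a rough illustration of the two requirements listed above, the following sketch defines a toy path type backed by an in-memory dictionary. `InMemoryPath` and its fields are hypothetical names invented for this example and are not part of Onda.jl; a real backend would talk to its own storage system instead.

```julia
# Hypothetical path type backed by an in-memory Dict; not part of Onda.jl.
struct InMemoryPath
    backing::Dict{String,Vector{UInt8}}
    key::String
end

# Return the bytes stored at `path`.
Base.read(path::InMemoryPath) = copy(path.backing[path.key])

# Write `bytes` to the location specified by `path`.
function Base.write(path::InMemoryPath, bytes::Vector{UInt8})
    path.backing[path.key] = copy(bytes)
    return length(bytes)  # mirror Base.write's convention of returning a byte count
end

backing = Dict{String,Vector{UInt8}}()
path = InMemoryPath(backing, "recording/eeg.lpcm")
write(path, UInt8[0x01, 0x02, 0x03])
roundtrip = read(path)
```

Copying on both read and write keeps callers from accidentally aliasing the stored bytes; a real backend may not need that precaution.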

For backends which support direct byte range access (e.g. S3), Onda.read_byte_range may be overloaded for the backend's corresponding path type to enable further optimizations:

Onda.read_byte_rangeFunction
read_byte_range(path, byte_offset, byte_count)

Return the equivalent of read(path)[(byte_offset + 1):(byte_offset + byte_count)], but try to avoid reading unreturned intermediate bytes. Note that the effectiveness of this method depends on the type of path.
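The equivalence stated above can be checked on a local file. This is a minimal sketch, assuming a seekable file on disk; `read_byte_range_from_file` is a hypothetical helper written for this example, and Onda.jl's own methods may be implemented differently.

```julia
# Hypothetical helper for local files; not Onda.jl's implementation.
function read_byte_range_from_file(path::AbstractString, byte_offset::Integer, byte_count::Integer)
    open(path, "r") do io
        seek(io, byte_offset)        # skip the leading bytes without reading them
        return read(io, byte_count)  # read at most `byte_count` bytes
    end
end

path, io = mktemp()
write(io, UInt8.(0:9))
close(io)

ranged = read_byte_range_from_file(path, 3, 4)
naive = read(path)[(3 + 1):(3 + 4)]  # reads the whole file, then slices
```

Both expressions yield the same four bytes, but only the naive version loads the entire file first.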

source

onda.annotation

Onda.MergedAnnotationV1Type
@version MergedAnnotationV1 > AnnotationV1 begin
    from::Vector{UUID}
end

A Legolas-generated record type representing an annotation derived from "merging" one or more existing annotations.

This record type extends AnnotationV1 with a single additional required field, from::Vector{UUID}, whose entries are the ids of the annotation's source annotation(s).

See https://github.com/beacon-biosignals/Legolas.jl for details regarding Legolas record types.

source
Onda.merge_overlapping_annotationsFunction
merge_overlapping_annotations([predicate=TimeSpans.overlaps,] annotations)

Given the onda.annotation@1-compliant table annotations, return a Vector{MergedAnnotationV1} where "overlapping" consecutive entries of annotations have been merged using TimeSpans.shortest_timespan_containing.

Two consecutive annotations a and b are determined to be "overlapping" if a.recording == b.recording && predicate(a.span, b.span). Merged annotations' span fields are generated via calling TimeSpans.shortest_timespan_containing on the overlapping set of source annotations.

Note that every annotation in the returned table has a freshly generated id field and a non-empty from field. An output annotation whose from field contains only a single element corresponds to an individual non-overlapping annotation in the provided annotations.

Note that this function internally works with Tables.columns(annotations) rather than annotations directly, so it may be slower and/or require more memory if !Tables.columnaccess(annotations).

See also TimeSpans.merge_spans for similar functionality on generic time spans (instead of annotations).
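The merging rule described above can be sketched without Onda.jl or TimeSpans.jl. The example below is an illustration only: spans are plain (start, stop) tuples treated as half-open intervals, the input is assumed to be already ordered, and `overlaps`, `containing`, and `merge_consecutive` are hypothetical stand-ins rather than the package's functions.

```julia
using UUIDs

# Simplified stand-ins: spans are (start, stop) tuples in arbitrary integer units.
overlaps(a, b) = a[1] < b[2] && b[1] < a[2]
containing(a, b) = (min(a[1], b[1]), max(a[2], b[2]))

function merge_consecutive(annotations; predicate=overlaps)
    merged = NamedTuple[]
    for ann in annotations
        if !isempty(merged)
            prev = merged[end]
            if prev.recording == ann.recording && predicate(prev.span, ann.span)
                # Extend the previous merged row instead of starting a new one.
                merged[end] = merge(prev, (span=containing(prev.span, ann.span),
                                           from=vcat(prev.from, ann.id)))
                continue
            end
        end
        # Every output row gets a fresh id and a non-empty `from`.
        push!(merged, (recording=ann.recording, id=uuid4(), span=ann.span, from=[ann.id]))
    end
    return merged
end

rec = uuid4()
a = (recording=rec, id=uuid4(), span=(0, 10))
b = (recording=rec, id=uuid4(), span=(5, 15))
c = (recording=rec, id=uuid4(), span=(20, 30))
result = merge_consecutive([a, b, c])
```

Here `a` and `b` collapse into one row spanning (0, 15), while `c` passes through as a single-source row.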

source

onda.signal

Onda.SignalV2Type
@version SignalV2 > SamplesInfoV2 begin
    recording::UUID
    file_path::(<:Any)
    file_format::String
    span::TimeSpan
    sensor_label::String
    sensor_type::String
    channels::Vector{String}
    sample_unit::String
end

A Legolas-generated record type representing an onda.signal as described by the Onda Format Specification.

Note that some fields documented as required fields of onda.signal@2 in the Onda Format Specification are captured via this schema version's extension of SamplesInfoV2.

See https://github.com/beacon-biosignals/Legolas.jl for details regarding Legolas record types.

source
Onda.SamplesInfoV2Type
@version SamplesInfoV2 begin
    sensor_type::String
    channels::Vector{String}
    sample_unit::String
    sample_resolution_in_unit::Float64
    sample_offset_in_unit::Float64
    sample_type::String = onda_sample_type_from_julia_type(sample_type)
    sample_rate::Float64
end

A Legolas-generated record type representing the bundle of onda.signal fields that are intrinsic to a signal's sample data, leaving out extrinsic file or recording information. This is useful when the latter information is irrelevant or does not yet exist (e.g. if sample data is being constructed/manipulated in-memory without yet having been serialized).

See https://github.com/beacon-biosignals/Legolas.jl for details regarding Legolas record types.

source
Onda.sample_countMethod
sample_count(x, duration::Period)

Return the number of multichannel samples that fit within duration given x.sample_rate.
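A small worked example of this arithmetic, assuming that a partial trailing sample does not count (i.e. the result is floored). `sample_count_sketch` is a hypothetical helper for illustration, not the package's implementation.

```julia
using Dates

# Assumption: only whole samples that fit within `duration` are counted.
function sample_count_sketch(sample_rate::Real, duration::Period)
    nanoseconds = Dates.value(convert(Nanosecond, duration))
    return floor(Int, sample_rate * nanoseconds / 1_000_000_000)
end

ten_seconds = sample_count_sketch(128.0, Second(10))         # 128 Hz for 10 s
twenty_millis = sample_count_sketch(128.0, Millisecond(20))  # 2.56 samples, floored
```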

source
Onda.sizeof_samplesMethod
sizeof_samples(x, duration::Period)

Returns the expected size (in bytes) of an encoded Samples object corresponding to x and duration:

sample_count(x, duration) * channel_count(x) * sizeof(x.sample_type)
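To make the product above concrete, here is a sketch that evaluates it by hand for a hypothetical three-channel Int16 signal sampled at 256 Hz over two seconds. The `info` NamedTuple is an illustrative stand-in for a SamplesInfoV2-like value, and the sample count is computed under the assumption that partial samples are floored.

```julia
using Dates

# Hypothetical stand-in for a SamplesInfoV2-like value.
info = (channels=["c3", "c4", "o1"], sample_type=Int16, sample_rate=256.0)

duration_ns = Dates.value(convert(Nanosecond, Second(2)))
samples = floor(Int, info.sample_rate * duration_ns / 1_000_000_000)  # 512 samples
expected_bytes = samples * length(info.channels) * sizeof(info.sample_type)
```

With 512 samples, 3 channels, and 2 bytes per value, the expected size is 3072 bytes.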
source
Onda.sample_typeMethod
sample_type(x)

Return x.sample_type as an Onda.LPCM_SAMPLE_TYPE_UNION subtype. If x.sample_type is an Onda-specified sample_type string (e.g. "int16"), it will be converted to the corresponding Julia type. If x.sample_type <: Onda.LPCM_SAMPLE_TYPE_UNION, this function simply returns x.sample_type as-is.
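The string-to-type conversion described above might look roughly like the following. The lookup table and `julia_sample_type` are hypothetical names for this sketch; the set of strings is assumed from the Onda Format Specification's LPCM sample types.

```julia
# Hypothetical lookup table; the keys are assumed Onda sample type strings.
const SAMPLE_TYPE_NAMES = Dict(
    "int8" => Int8, "int16" => Int16, "int32" => Int32, "int64" => Int64,
    "uint8" => UInt8, "uint16" => UInt16, "uint32" => UInt32, "uint64" => UInt64,
    "float32" => Float32, "float64" => Float64,
)

# A string is looked up; a Julia type is returned as-is.
julia_sample_type(name::AbstractString) = SAMPLE_TYPE_NAMES[name]
julia_sample_type(T::DataType) = T

from_string = julia_sample_type("int16")
from_type = julia_sample_type(Float32)
```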

source

Samples

Onda.SamplesType
Samples(data::AbstractMatrix, info::SamplesInfoV2, encoded::Bool;
        validate::Bool=Onda.VALIDATE_SAMPLES_DEFAULT[])

Return a Samples instance with the following fields:

  • data::AbstractMatrix: A matrix of sample data. The i th row of the matrix corresponds to the ith channel in info.channels, while the jth column corresponds to the jth multichannel sample.

  • info::SamplesInfoV2: The SamplesInfoV2-compliant value that describes the Samples instance.

  • encoded::Bool: If true, the values in data are LPCM-encoded as prescribed by the Samples instance's info. If false, the values in data have been decoded into the info's canonical units.

If validate is true, Onda.validate_samples is called on the constructed Samples instance before it is returned.

Note that getindex and view are defined on Samples to accept normal integer indices, but also accept channel names or a regex to match channel names for row indices, and TimeSpan values for column indices; see Onda/examples/tour.jl for a comprehensive set of indexing examples.

Note also that "slices" copied from s::Samples via getindex(s, ...) may alias s.info in order to avoid excessive overhead. This means one should generally avoid directly mutating s.info, especially s.info.channels.

See also: load, store, encode, encode!, decode, decode!

source
Base.:==Method
==(a::Samples, b::Samples)

Returns a.encoded == b.encoded && a.info == b.info && a.data == b.data.

source
Onda.channelFunction
channel(x, name)

Return i where x.channels[i] == name.

source
channel(x, i::Integer)

Return x.channels[i].

source
channel(samples::Samples, name)

Return channel(samples.info, name).

This function is useful for indexing rows of samples.data by channel names.

source
channel(samples::Samples, i::Integer)

Return channel(samples.info, i).

source
Onda.channel_countFunction
channel_count(x)

Return length(x.channels).

source
channel_count(samples::Samples)

Return channel_count(samples.info).

source
Onda.sample_countFunction
sample_count(x, duration::Period)

Return the number of multichannel samples that fit within duration given x.sample_rate.

source
sample_count(samples::Samples)

Return the number of multichannel samples in samples (i.e. size(samples.data, 2)).

source
Onda.encodeFunction
encode(sample_type::DataType, sample_resolution_in_unit, sample_offset_in_unit,
       sample_data, dither_storage=nothing)

Return a copy of sample_data quantized according to sample_type, sample_resolution_in_unit, and sample_offset_in_unit. sample_type must be a concrete subtype of Onda.LPCM_SAMPLE_TYPE_UNION. Quantization of an individual sample s is performed via:

round(S, (s - sample_offset_in_unit) / sample_resolution_in_unit)

with additional special casing to clip values exceeding the encoding's dynamic range.

If dither_storage isa Nothing, no dithering is applied before quantization.

If dither_storage isa Missing, dither storage is allocated automatically and triangular dithering is applied to sample_data prior to quantization.

Otherwise, dither_storage must be a container of similar shape and type to sample_data. This container is then used to store the random noise needed for the triangular dithering process, which is applied to sample_data prior to quantization.

If:

sample_type === eltype(sample_data) &&
sample_resolution_in_unit == 1 &&
sample_offset_in_unit == 0

then this function will simply return sample_data directly without copying/dithering.
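The quantization formula and clipping behavior described above can be sketched in plain Julia. `encode_sketch` is a hypothetical function restricted to small integer types; the dither shown (the difference of two uniform draws, which has a triangular distribution) is an assumption about what "triangular dithering" means here and may not match the package's exact noise generation.

```julia
const SmallInt = Union{Int8,Int16,Int32,UInt8,UInt16,UInt32}

function encode_sketch(::Type{S}, resolution, offset, data; dither::Bool=false) where {S<:SmallInt}
    return map(data) do s
        scaled = (s - offset) / resolution
        # Assumed dither: rand() - rand() is triangularly distributed on (-1, 1).
        noise = dither ? rand() - rand() : 0.0
        # Clip to the dynamic range of S before rounding to avoid an InexactError.
        round(S, clamp(scaled + noise, typemin(S), typemax(S)))
    end
end

decoded = [0.0, 0.26, -0.5, 1.0e6]
encoded = encode_sketch(Int16, 0.25, 0.0, decoded)
dithered = encode_sketch(Int16, 0.25, 0.0, decoded; dither=true)
```

The last input value exceeds the Int16 range after scaling, so it is clipped to typemax(Int16) rather than raising an error.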

source
encode(samples::Samples, dither_storage=nothing)

If samples.encoded is false, return a Samples instance that wraps:

encode(sample_type(samples.info),
       samples.info.sample_resolution_in_unit,
       samples.info.sample_offset_in_unit,
       samples.data, dither_storage)

If samples.encoded is true, this function is the identity.

source
Onda.encode!Function
encode!(result_storage, sample_type::DataType, sample_resolution_in_unit,
        sample_offset_in_unit, sample_data, dither_storage=nothing)
encode!(result_storage, sample_resolution_in_unit, sample_offset_in_unit,
        sample_data, dither_storage=nothing)

Similar to encode(sample_type, sample_resolution_in_unit, sample_offset_in_unit, sample_data, dither_storage), but write encoded values to result_storage rather than allocating new storage.

sample_type defaults to eltype(result_storage) if it is not provided.

If:

sample_type === eltype(sample_data) &&
sample_resolution_in_unit == 1 &&
sample_offset_in_unit == 0

then this function will simply copy sample_data directly into result_storage without dithering.

source
encode!(result_storage, samples::Samples, dither_storage=nothing)

If samples.encoded is false, return a Samples instance that wraps:

encode!(result_storage,
        sample_type(samples.info),
        samples.info.sample_resolution_in_unit,
        samples.info.sample_offset_in_unit,
        samples.data, dither_storage)

If samples.encoded is true, return a Samples instance that wraps copyto!(result_storage, samples.data).

source
Onda.decodeFunction
decode(sample_resolution_in_unit, sample_offset_in_unit, sample_data)

Return fma.(sample_resolution_in_unit, sample_data, sample_offset_in_unit).

If:

sample_data isa AbstractArray &&
sample_resolution_in_unit == 1 &&
sample_offset_in_unit == 0

then this function is the identity and will return sample_data directly without copying.
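A short worked example of the fused multiply-add above, using only Base. `decode_sketch` is a hypothetical name; the final line re-applies the quantization formula from encode (without clipping or dithering) to show that the two operations are inverses for in-range values.

```julia
# decoded = resolution * encoded + offset, computed with a fused multiply-add.
decode_sketch(resolution, offset, data) = fma.(resolution, data, offset)

encoded = Int16[0, 1, -2, 400]
decoded = decode_sketch(0.25, 10.0, encoded)

# Re-quantizing the decoded values recovers the original integers.
reencoded = round.(Int16, (decoded .- 10.0) ./ 0.25)
```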

source
decode(samples::Samples, ::Type{T}=Float64)

If samples.encoded is true, return a Samples instance that wraps

decode(convert(T, samples.info.sample_resolution_in_unit),
       convert(T, samples.info.sample_offset_in_unit),
       samples.data)

If samples.encoded is false, this function is the identity.

source
Onda.decode!Function
decode!(result_storage, sample_resolution_in_unit, sample_offset_in_unit, sample_data)

Similar to decode(sample_resolution_in_unit, sample_offset_in_unit, sample_data), but write decoded values to result_storage rather than allocating new storage.

source
decode!(result_storage, samples::Samples)

If samples.encoded is true, return a Samples instance that wraps

decode!(result_storage, samples.info.sample_resolution_in_unit, samples.info.sample_offset_in_unit, samples.data)

If samples.encoded is false, return a Samples instance that wraps copyto!(result_storage, samples.data).

source
Onda.loadFunction
load(signal[, span_relative_to_loaded_samples]; encoded::Bool=false)
load(file_path, file_format::Union{AbstractString,AbstractLPCMFormat},
     info[, span_relative_to_loaded_samples]; encoded::Bool=false)

Return the Samples object described by signal/file_path/file_format/info.

If span_relative_to_loaded_samples is present, return load(...)[:, span_relative_to_loaded_samples], but attempt to avoid reading unreturned intermediate sample data. Note that the effectiveness of this optimized method versus the naive approach depends on the types of file_path (i.e. if there is a fast method defined for Onda.read_byte_range(::typeof(file_path), ...)) and file_format (i.e. does the corresponding format support random or chunked access).

If encoded is true, do not decode the Samples object before returning it.

source
Onda.mmapFunction
Onda.mmap(signal)

Return Onda.mmap(signal.file_path, SamplesInfoV2(signal)), throwing an ArgumentError if signal.file_format != "lpcm".

source
Onda.mmap(mmappable, info)

Return Samples(data, info, true) where data is created via Mmap.mmap(mmappable, ...).

mmappable is assumed to reference memory that is formatted according to the Onda Format's canonical interleaved LPCM representation in accordance with sample_type(info) and channel_count(info). No explicit checks are performed to ensure that this is true.
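The layout assumption above can be illustrated with the Mmap standard library alone. This sketch writes six interleaved Int16 values to a temporary file and maps them as a channels-by-samples matrix; it assumes a little-endian host, and the variable names are illustrative rather than taken from Onda.jl.

```julia
using Mmap

channel_count, sample_count = 2, 3
# Interleaved layout: all channels of sample 1, then all channels of sample 2, ...
interleaved = Int16[1, -1, 2, -2, 3, -3]

path, io = mktemp()
write(io, interleaved)  # written in host byte order; assumed little-endian here
close(io)

data = open(path, "r") do file
    Mmap.mmap(file, Matrix{Int16}, (channel_count, sample_count))
end
```

Because Julia matrices are column-major, each column of `data` is one multichannel sample and each row is one channel. Nothing verifies that the file really holds Int16 data of that shape, which is the same caveat the text above states.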

source
Onda.storeFunction
store(file_path, file_format::Union{AbstractString,AbstractLPCMFormat}, samples::Samples)

Serialize the given samples to file_format and write the output to file_path.

source
store(file_path, file_format::Union{AbstractString,AbstractLPCMFormat},
      samples::Samples, recording::UUID, start::Period,
      sensor_label::AbstractString = samples.info.sensor_type)

Serialize the given samples to file_format and write the output to file_path, returning a SignalV2 instance constructed from the provided arguments.

source

LPCM (De)serialization API

Onda.jl's LPCM (De)serialization API facilitates low-level streaming sample data (de)serialization and provides a storage-agnostic abstraction layer that can be overloaded to support new file/byte formats for (de)serializing LPCM-encodeable sample data.

Onda.LPCMFormatType
LPCMFormat(channel_count::Int, sample_type::Type)
LPCMFormat(info::SamplesInfoV2)

Return an LPCMFormat<:AbstractLPCMFormat instance corresponding to Onda's default interleaved LPCM format assumed for sample data files with the "lpcm" extension.

channel_count corresponds to length(info.channels), while sample_type corresponds to sample_type(info).

Note that bytes (de)serialized to/from this format are little-endian (per the Onda specification).
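A minimal sketch of what interleaved little-endian LPCM bytes look like for a small matrix, written with Base functions only. `serialize_sketch` and `deserialize_sketch` are hypothetical helpers for illustration; they copy data and do not attempt the zero-copy behavior the package may use.

```julia
# `samples` is channels-by-timesteps. Julia stores matrices column-major, so
# `vec(samples)` already walks the values in interleaved order.
serialize_sketch(samples::AbstractMatrix) = collect(reinterpret(UInt8, htol.(vec(samples))))

function deserialize_sketch(::Type{T}, channel_count::Integer, bytes::AbstractVector{UInt8}) where {T}
    values = ltoh.(reinterpret(T, bytes))  # little-endian bytes -> host-order values
    return reshape(collect(values), channel_count, :)
end

samples = Int16[1 2 3; 256 -1 0]
bytes = serialize_sketch(samples)
restored = deserialize_sketch(Int16, 2, bytes)
```

The first four bytes are 0x01 0x00 0x00 0x01: channel 1 of sample 1 (the value 1) followed by channel 2 of sample 1 (the value 256), each with its least significant byte first.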

source
Onda.LPCMZstFormatType
LPCMZstFormat(lpcm::LPCMFormat; level=3)
LPCMZstFormat(info; level=3)

Return an LPCMZstFormat<:AbstractLPCMFormat instance that corresponds to Onda's default interleaved LPCM format compressed by zstd. This format is assumed for sample data files with the "lpcm.zst" extension.

The level keyword argument sets the same compression level parameter as the corresponding flag documented by the zstd command line utility.

See https://facebook.github.io/zstd/ for details about zstd.

source
Onda.deserialize_lpcmFunction
deserialize_lpcm(format::AbstractLPCMFormat, bytes,
                 samples_offset::Integer=0,
                 samples_count::Integer=typemax(Int))
deserialize_lpcm(stream::AbstractLPCMStream,
                 samples_offset::Integer=0,
                 samples_count::Integer=typemax(Int))

Return a channels-by-timesteps AbstractMatrix of interleaved LPCM-encoded sample data by deserializing the provided bytes in the given format, or from the given stream constructed by deserializing_lpcm_stream.

Note that this operation may be performed in a zero-copy manner such that the returned sample matrix directly aliases bytes.

The returned segment is offset by at most samples_offset samples from the start of stream/bytes and contains at most samples_count samples. This ensures that overrun behavior is generally similar to the behavior of Base.skip(io, n) and Base.read(io, n).
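The "at most" semantics can be sketched on an in-memory matrix. `clamped_segment` is a hypothetical helper that mimics the described overrun behavior; it is not the package's implementation.

```julia
function clamped_segment(samples::AbstractMatrix, samples_offset::Integer, samples_count::Integer)
    total = size(samples, 2)
    first_column = min(samples_offset, total) + 1
    # `min(samples_count, total)` also guards against overflow for typemax(Int).
    last_column = min(total, first_column - 1 + min(samples_count, total))
    return view(samples, :, first_column:last_column)
end

samples = [1 2 3 4 5; 10 20 30 40 50]
middle = clamped_segment(samples, 1, 2)             # skip 1 sample, take 2
tail = clamped_segment(samples, 3, typemax(Int))    # skip 3 samples, take the rest
past_end = clamped_segment(samples, 9, 2)           # offset overruns: empty result
```

As with skipping and reading past the end of an I/O stream, overrunning yields fewer (possibly zero) samples instead of an error.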

This function is the inverse of the corresponding serialize_lpcm method, i.e.:

serialize_lpcm(format, deserialize_lpcm(format, bytes)) == bytes
source
Onda.serialize_lpcmFunction
serialize_lpcm(format::AbstractLPCMFormat, samples::AbstractMatrix)
serialize_lpcm(stream::AbstractLPCMStream, samples::AbstractMatrix)

Return the AbstractVector{UInt8} of bytes that results from serializing samples to the given format (or serialize those bytes directly to stream) where samples is a channels-by-timesteps matrix of interleaved LPCM-encoded sample data.

Note that this operation may be performed in a zero-copy manner such that the returned AbstractVector{UInt8} directly aliases samples.

This function is the inverse of the corresponding deserialize_lpcm method, i.e.:

deserialize_lpcm(format, serialize_lpcm(format, samples)) == samples
source
Onda.deserialize_lpcm_callbackFunction
deserialize_lpcm_callback(format::AbstractLPCMFormat, samples_offset, samples_count)

Return (callback, required_byte_offset, required_byte_count) where callback accepts the byte block specified by required_byte_offset and required_byte_count and returns the samples specified by samples_offset and samples_count.

As a fallback, this function returns (callback, missing, missing), where callback requires all available bytes. AbstractLPCMFormat subtypes that support partial/block-based deserialization (e.g. the basic LPCMFormat) can overload this function to only request exactly the byte range that is required for the sample range requested by the caller.

This allows callers to handle the byte block retrieval themselves while keeping Onda's LPCM Serialization API agnostic to the caller's storage layer of choice.
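For an uncompressed interleaved format, the required byte range follows directly from the sample range. The sketch below shows that calculation together with a callback that decodes only the requested block; `byte_range_callback_sketch` is a hypothetical function and assumes fixed-size little-endian samples.

```julia
function byte_range_callback_sketch(::Type{T}, channel_count::Integer,
                                    samples_offset::Integer, samples_count::Integer) where {T}
    bytes_per_sample = sizeof(T) * channel_count  # one multichannel sample
    byte_offset = samples_offset * bytes_per_sample
    byte_count = samples_count * bytes_per_sample
    # The callback turns exactly that byte block into a channels-by-timesteps matrix.
    callback = bytes -> reshape(collect(ltoh.(reinterpret(T, bytes))), channel_count, :)
    return callback, byte_offset, byte_count
end

all_samples = Int16[1 2 3 4; 10 20 30 40]
all_bytes = collect(reinterpret(UInt8, htol.(vec(all_samples))))

callback, byte_offset, byte_count = byte_range_callback_sketch(Int16, 2, 1, 2)
# The caller fetches the block however its storage layer allows.
block = all_bytes[(byte_offset + 1):(byte_offset + byte_count)]
segment = callback(block)
```

A compressed format generally cannot compute such a range up front, which is why the fallback described above requests all available bytes.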

source
Onda.deserializing_lpcm_streamFunction
deserializing_lpcm_stream(format::AbstractLPCMFormat, io)

Return a stream::AbstractLPCMStream that wraps io to enable direct LPCM deserialization from io via deserialize_lpcm.

Note that stream must be finalized after usage via finalize_lpcm_stream. Until stream is finalized, io should be considered to be part of the internal state of stream and should not be directly interacted with by other processes.

source
Onda.serializing_lpcm_streamFunction
serializing_lpcm_stream(format::AbstractLPCMFormat, io)

Return a stream::AbstractLPCMStream that wraps io to enable direct LPCM serialization to io via serialize_lpcm.

Note that stream must be finalized after usage via finalize_lpcm_stream. Until stream is finalized, io should be considered to be part of the internal state of stream and should not be directly interacted with by other processes.

source
Onda.finalize_lpcm_streamFunction
finalize_lpcm_stream(stream::AbstractLPCMStream)::Bool

Finalize stream, returning true if the underlying I/O object used to construct stream is still open and usable. Otherwise, return false to indicate that the underlying I/O object was closed as a result of finalization.

source
Onda.register_lpcm_format!Function
Onda.register_lpcm_format!(create_constructor)

Register an AbstractLPCMFormat constructor so that it can automatically be used when format is called. Authors of new AbstractLPCMFormat subtypes should call this function for their subtype.

create_constructor should be a unary function that accepts a single file_format::AbstractString argument and returns either a matching AbstractLPCMFormat constructor or nothing. Any returned AbstractLPCMFormat constructor f should be of the form f(info; kwargs...)::AbstractLPCMFormat where info is a SamplesInfoV2-compliant value.

Note that if Onda.register_lpcm_format! is called in a downstream package, it must be called within the __init__ function of the package's top-level module to ensure that the function is always invoked when the module is loaded (not just during precompilation). For details, see https://docs.julialang.org/en/v1/manual/modules/#Module-initialization-and-precompilation.

source
Onda.file_format_stringFunction
file_format_string(format::AbstractLPCMFormat)

Return the String representation of format to be written to the file_format field of a *.signals file.

source

Utilities

Onda.VALIDATE_SAMPLES_DEFAULTConstant
VALIDATE_SAMPLES_DEFAULT[]

Defaults to true.

When set to true, Samples objects will be validated upon construction for compliance with the Onda specification.

Users may interactively set this reference to false in order to disable this extra layer of validation, which can be useful when working with malformed Onda datasets.
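Since the default is a mutable reference, it is easy to forget to switch it back. One possible pattern, sketched here with a hypothetical stand-in `VALIDATE_DEFAULT` rather than the real constant, is to disable the flag only for the duration of a function call and restore it afterwards.

```julia
# Hypothetical stand-in for a Ref-based package default like the one described above.
const VALIDATE_DEFAULT = Ref(true)

function without_validation(f)
    previous = VALIDATE_DEFAULT[]
    VALIDATE_DEFAULT[] = false
    try
        return f()
    finally
        VALIDATE_DEFAULT[] = previous  # restore even if `f` throws
    end
end

seen_inside = without_validation(() -> VALIDATE_DEFAULT[])
seen_after = VALIDATE_DEFAULT[]
```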

See also: Onda.validate_samples

source
Onda.upgradeFunction
Onda.upgrade(from::SignalV1, ::SignalV2SchemaVersion)

Return a SignalV2 instance that represents from in the SignalV2SchemaVersion format.

The fields of the output will match from's fields, except:

  • The kind field will be removed.
  • The sensor_label=from.kind field will be added.
  • The sensor_type=from.kind field will be added.
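The field mapping listed above can be illustrated on a plain NamedTuple. The row `v1` and the function `upgrade_sketch` are hypothetical and show only a few fields; real SignalV1 and SignalV2 values are Legolas records with more fields and validation.

```julia
# Hypothetical SignalV1-like row; only a few fields are shown.
v1 = (recording="rec-1", file_path="eeg.lpcm", kind="eeg", channels=["c3", "c4"])

function upgrade_sketch(from::NamedTuple)
    kept_names = filter(!=(:kind), keys(from))  # drop the `kind` field
    kept = NamedTuple{kept_names}(from)
    # Both new fields are populated from the old `kind` value.
    return merge(kept, (sensor_label=from.kind, sensor_type=from.kind))
end

v2 = upgrade_sketch(v1)
```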
source

Developer Installation

To install Onda for development, run:

julia -e 'using Pkg; Pkg.develop("Onda")'

This will install Onda to the default package development directory, ~/.julia/dev/Onda.