API Documentation
Import EDF to Onda
OndaEDF.jl prefers "self-service" import over "automagic", and provides functionality to extract Onda.Samples and EDFAnnotationV1s (which extend Onda.AnnotationV1s) from an EDF.File. These can be written to disk (with Onda.store / Legolas.write) or manipulated in memory as desired.
Import signal data as Samples
OndaEDF.edf_to_onda_samples — Function
edf_to_onda_samples(edf::EDF.File, plan_table; validate=true, dither_storage=missing)
Convert Signals found in an EDF File to Onda.Samples according to the plan specified in plan_table (e.g., as generated by plan_edf_to_onda_samples), returning an iterable of the generated Onda.Samples and the plan as actually executed.
The input plan is transformed by using merge_samples_info to combine rows with the same :onda_signal_index into a common Onda.SamplesInfo. Then OndaEDF.onda_samples_from_edf_signals is used to combine the EDF signals data into a single Onda.Samples per group.
Any errors that occur are shown as Strings (with backtrace) and inserted into the :error column for the corresponding rows from the plan.
Samples are returned in the order of :onda_signal_index. Signals that could not be matched or otherwise caused an error during execution are not returned.
If validate=true (the default), the plan is validated against the FilePlanV2 schema and against the signal headers in the EDF.File.
If dither_storage=missing (the default), dither storage is allocated automatically as specified in the docstring for Onda.encode. dither_storage=nothing disables dithering.
Returned samples are integer-encoded. If these samples are being serialized out (e.g. via Onda.store!) this is not an issue, but if the samples are being immediately analyzed in memory, call Onda.decode to decode them to recover the time-series voltages.
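To make "integer-encoded" concrete, the sketch below applies the affine relationship between encoded integers and unit values by hand, using only Base Julia. The resolution, offset, and data are made-up illustrative values, and the formula is written out as an assumption about what decoding amounts to; in real code, call Onda.decode on the returned samples instead.

```julia
# Hypothetical encoding parameters; real values come from the samples' info.
sample_resolution_in_unit = 0.25   # unit value of one integer step
sample_offset_in_unit = 1.0        # unit value that the encoded value 0 maps to

# Integer-encoded data: rows are channels, columns are time points.
encoded = Int16[0 4 -4; 8 0 2]

# Assumed affine decoding, applied elementwise.
decoded = sample_resolution_in_unit .* encoded .+ sample_offset_in_unit
```

With these values the first channel decodes to 1.0, 2.0, 0.0 and the second to 3.0, 1.0, 1.5.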
edf_to_onda_samples(edf::EDF.File; kwargs...)
Read signals from an EDF.File into a vector of Onda.Samples. This is a convenience function that first formulates an import plan via plan_edf_to_onda_samples, and then immediately executes this plan with edf_to_onda_samples.
The samples and executed plan are returned; it is strongly advised that you review the plan for un-extracted signals (where :sensor_type or :channel is missing) and errors (non-nothing values in :error).
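The review step described above can be sketched with plain Julia filters. The rows below are hypothetical stand-ins that carry only the columns being inspected; an executed plan returned by edf_to_onda_samples has many more columns, and how you iterate its rows depends on the table type you hold.

```julia
# Hypothetical executed-plan rows (illustration only).
plan = [
    (label="EEG C3",  sensor_type="eeg",   channel="c3",    error=nothing),
    (label="Mystery", sensor_type=missing, channel=missing, error=nothing),
    (label="ECG II",  sensor_type="ecg",   channel="ii",    error="ArgumentError: ..."),
]

# Signals that matched no label specification.
unextracted = filter(r -> ismissing(r.sensor_type) || ismissing(r.channel), plan)

# Signals whose conversion raised an error.
errored = filter(r -> !isnothing(r.error), plan)
```

Checking both lists before storing anything is a cheap way to catch signals that would otherwise be dropped silently.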
Collections of EDF.Signals are mapped as channels to Onda.Samples via plan_edf_to_onda_samples. The caller of this function can control the plan via the labels and units keyword arguments, all of which are forwarded to plan_edf_to_onda_samples.
EDF.Signal labels that are converted into Onda channel names undergo the following transformations:
- the label is whitespace-stripped, parens-stripped, and lowercased
- trailing generic EDF references (e.g. "ref", "ref2", etc.) are dropped
- any instance of + is replaced with _plus_ and / with _over_
- all component names are converted to their "canonical names" when possible (e.g. "m1" in an EEG-matched channel name will be converted to "a1").
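The first three transformations in the list above can be approximated in a few lines of Base Julia. This is a rough sketch and not OndaEDF's implementation: the function name normalize_label_sketch, the regular expressions, and the order of steps are all assumptions, and the canonical-name substitution is left out.

```julia
# Illustrative approximation only; see match_edf_label for the real behavior.
function normalize_label_sketch(label::AbstractString)
    s = lowercase(strip(label))               # whitespace-strip and lowercase
    s = replace(s, r"[()]" => "")             # drop parentheses
    s = replace(s, r"[\s\-]*ref\d*$" => "")   # drop a trailing "ref", "ref2", ...
    s = replace(s, "+" => "_plus_")
    s = replace(s, "/" => "_over_")
    return String(strip(s))
end

a = normalize_label_sketch("  (C3)-Ref2 ")
b = normalize_label_sketch("A+B/C")
```

Here a is "c3" and b is "a_plus_b_over_c".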
See the OndaEDF README for additional details regarding EDF formatting expectations.
Returned samples are integer-encoded. If these samples are being serialized out (e.g. via Onda.store!) this is not an issue, but if the samples are being immediately analyzed in memory, call Onda.decode to decode them to recover the time-series voltages.
OndaEDF.plan_edf_to_onda_samples — Function
plan_edf_to_onda_samples(header, seconds_per_record; labels=STANDARD_LABELS,
units=STANDARD_UNITS)
plan_edf_to_onda_samples(signal::EDF.Signal, args...; kwargs...)
Formulate a plan for converting an EDF signal into Onda format. This returns a Tables.jl row with all the columns from the signal header, plus additional columns for the Onda.SamplesInfo for this signal, and the seconds_per_record that is passed in here.
If no labels match, then the channel and kind columns are missing; the behavior of other SamplesInfo columns is undefined; they are currently set to missing but that may change in future versions.
Any errors that are thrown in the process will be wrapped as SampleInfoErrors and then printed with backtrace to a String in the error column.
Matching EDF label to Onda labels
The labels keyword argument determines how Onda channel and signal kind are extracted from the EDF label.
Labels are specified as an iterable of signal_names => channel_names pairs. signal_names should be an iterable of signal names, the first of which is the canonical name used as the Onda kind. Each element of channel_names gives the specification for one channel, which can either be a string, or a canonical_name => alternates pair. Occurrences of alternates will be replaced with canonical_name in the generated channel label.
Matching is determined solely by the channel names. When matching, the signal names are only used to remove signal names occurring as prefixes (e.g., "[ECG] AVL") before matching channel names. See match_edf_label for details, and see OndaEDF.STANDARD_LABELS for the default labels.
As an example, here is (a subset of) the default labels for ECG signals:
["ecg", "ekg"] => ["i" => ["1"], "ii" => ["2"], "iii" => ["3"],
"avl"=> ["ecgl", "ekgl", "ecg", "ekg", "l"],
"avr"=> ["ekgr", "ecgr", "r"], ...]
Matching is done in the order that labels iterates pairs, and will stop at the first match, with no warning if signals are ambiguous (although this may change in a future version).
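The shape of a labels specification and the first-match rule can be illustrated without OndaEDF. The specification below is a small hypothetical one, and first_match_sketch is an invented helper that only compares an already-normalized channel name for equality; the real matching (see match_edf_label) also strips signal-name prefixes and references.

```julia
# Hypothetical labels in the documented shape: signal_names => channel_names,
# where each channel is a String or a canonical_name => alternates pair.
labels = [
    ["ecg", "ekg"] => ["i" => ["1"], "ii" => ["2"], "avl" => ["l"]],
    ["eeg"] => ["c3", "c4", "a1" => ["m1"]],
]

canonical(spec::AbstractString) = spec
canonical(spec::Pair) = first(spec)
alternates(spec::AbstractString) = String[]
alternates(spec::Pair) = last(spec)

# Return (kind, channel) for the first specification that matches, else nothing.
function first_match_sketch(name, labels)
    for (signal_names, channel_names) in labels
        for spec in channel_names
            if name == canonical(spec) || name in alternates(spec)
                return (first(signal_names), canonical(spec))
            end
        end
    end
    return nothing
end

m1 = first_match_sketch("m1", labels)
two = first_match_sketch("2", labels)
none = first_match_sketch("zzz", labels)
```

Because iteration stops at the first hit, reordering the pairs in labels changes which signal kind claims an ambiguous name.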
plan_edf_to_onda_samples(edf::EDF.File;
labels=STANDARD_LABELS,
units=STANDARD_UNITS,
onda_signal_groupby=(:sensor_type, :sample_unit, :sample_rate))
Formulate a plan for converting an EDF.File to Onda Samples. This applies plan_edf_to_onda_samples to each individual signal contained in the file, storing edf_signal_index as an additional column.
The resulting rows are then passed to plan_edf_to_onda_samples_groups and grouped according to onda_signal_groupby (by default, the :sensor_type, :sample_unit, and :sample_rate columns), and the group index is added as an additional column in onda_signal_index.
The resulting plan is returned as a table. No signal data is actually read from the EDF file; to execute this plan and generate Onda.Samples, use edf_to_onda_samples. The index of the EDF signal (after filtering out signals that are not EDF.Signals, e.g. annotation channels) for each row is stored in the :edf_signal_index column, and the rows are sorted in order of :onda_signal_index, and then by :edf_signal_index.
OndaEDF.plan_edf_to_onda_samples_groups — Function
plan_edf_to_onda_samples_groups(plan_rows; onda_signal_groupby=(:sensor_type, :sample_unit, :sample_rate))
Group together plan_rows based on the values of the onda_signal_groupby columns, creating the :onda_signal_index column and promoting the Onda encodings for each group using OndaEDF.promote_encodings.
If the :edf_signal_index column is not present or otherwise missing, it will be filled in based on the order of the input rows.
The updated rows are returned, sorted first by the columns named in onda_signal_groupby and second by order of occurrence within the input rows.
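The grouping and ordering described above can be sketched in Base Julia. The rows are hypothetical and carry only the grouping columns, and this sketch skips encoding promotion entirely; it only shows how :edf_signal_index and :onda_signal_index could be derived under the stated sort order.

```julia
# Hypothetical plan rows (illustration only).
rows = [
    (label="C3", sensor_type="eeg", sample_unit="microvolt", sample_rate=256.0),
    (label="II", sensor_type="ecg", sample_unit="microvolt", sample_rate=512.0),
    (label="C4", sensor_type="eeg", sample_unit="microvolt", sample_rate=256.0),
]
groupby = (:sensor_type, :sample_unit, :sample_rate)

# Fill in edf_signal_index from input order, then compute each row's group key.
indexed = [merge(row, (; edf_signal_index=i)) for (i, row) in enumerate(rows)]
key(row) = Tuple(getproperty(row, col) for col in groupby)

# Stable sort by key keeps input order within a group; number the distinct keys.
sorted = sort(indexed; by=key)
group_ids = Dict(k => i for (i, k) in enumerate(unique(key.(sorted))))
plan = [merge(row, (; onda_signal_index=group_ids[key(row)])) for row in sorted]
```

The ECG row sorts first and forms group 1, while the two EEG rows share group 2 and keep their original relative order.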
Import annotations
OndaEDF.edf_to_onda_annotations — Function
edf_to_onda_annotations(edf::EDF.File, uuid::UUID)
Extract EDF+ annotations from an EDF.File for recording with ID uuid and return them as a vector of Onda.Annotations. Each returned annotation has a value field that contains the string value of the corresponding EDF+ annotation.
If no EDF+ annotations are found in edf, then an empty Vector{Annotation} is returned.
OndaEDFSchemas.EDFAnnotationV1 — Type
@version EDFAnnotationV1 > AnnotationV1 begin
value::String
end
A Legolas-generated record type that represents a single annotation imported from an EDF Annotation signal. The value field contains the annotation value as a string.
Import plan table schemas
OndaEDFSchemas.PlanV2 — Type
@version PlanV2 begin
# EDF.SignalHeader fields
label::String
transducer_type::String
physical_dimension::String
physical_minimum::Float32
physical_maximum::Float32
digital_minimum::Float32
digital_maximum::Float32
prefilter::String
samples_per_record::Int16
# EDF.FileHeader field
seconds_per_record::Float64
# Onda.SignalV2 fields (channels -> channel), may be missing
recording::Union{UUID,Missing} = passmissing(UUID)
sensor_type::Union{Missing,AbstractString}
sensor_label::Union{Missing,AbstractString}
channel::Union{Missing,AbstractString}
sample_unit::Union{Missing,AbstractString}
sample_resolution_in_unit::Union{Missing,Float64}
sample_offset_in_unit::Union{Missing,Float64}
sample_type::Union{Missing,AbstractString}
sample_rate::Union{Missing,Float64}
# errors, use `nothing` to indicate no error
error::Union{Nothing,String}
end
A Legolas-generated record type describing a single EDF signal-to-Onda channel conversion. The columns are the union of
- fields from EDF.SignalHeader (all mandatory)
- the seconds_per_record field from EDF.FileHeader (mandatory)
- fields from Onda.SignalV2 (optional, may be missing to indicate failed conversion), except for file_path
- error, which is nothing for a conversion that is or is expected to be successful, and a String describing the source of the error (with backtrace) in the case of a caught error.
OndaEDFSchemas.FilePlanV2 — Type
@version FilePlanV2 > PlanV2 begin
edf_signal_index::Int
onda_signal_index::Int
end
A Legolas-generated record type representing one EDF signal-to-Onda channel conversion, which includes the columns of a PlanV2 and additional file-level context:
- edf_signal_index gives the index of the signals in the source EDF.File corresponding to this row
- onda_signal_index gives the index of the output Onda.Samples.
Note that while the EDF index does correspond to the actual index in edf.signals, some Onda indices may be skipped in the output, so onda_signal_index is only to indicate order and grouping.
OndaEDF.write_plan — Function
write_plan(io_or_path, plan_table; validate=true, kwargs...)
Write a plan table to io_or_path using Legolas.write, using the ondaedf.file-plan@1 schema.
Full-service import
For a more "full-service" experience, OndaEDF.jl also provides functionality to extract Onda.Samples and EDFAnnotationV1s and then write them to disk:
OndaEDF.store_edf_as_onda — Function
store_edf_as_onda(edf::EDF.File, onda_dir, recording_uuid::UUID=uuid4();
custom_extractors=STANDARD_EXTRACTORS, import_annotations::Bool=true,
postprocess_samples=identity,
signals_prefix="edf", annotations_prefix=signals_prefix)
Convert an EDF.File to Onda.Samples and Onda.Annotations, store the samples in $path/samples/, and write the Onda signals and annotations tables to $path/$(signals_prefix).onda.signals.arrow and $path/$(annotations_prefix).onda.annotations.arrow. The default prefix is "edf", and if a prefix is provided for signals but not annotations both will use the signals prefix. The prefixes cannot reference (sub)directories.
Returns (; recording_uuid, signals, annotations, signals_path, annotations_path, plan).
This is a convenience function that first formulates an import plan via plan_edf_to_onda_samples, and then immediately executes this plan with edf_to_onda_samples.
The samples and executed plan are returned; it is strongly advised that you review the plan for un-extracted signals (where :sensor_type or :channel is missing) and errors (non-nothing values in :error).
Groups of EDF.Signals are mapped as channels to Onda.Samples via plan_edf_to_onda_samples. The caller of this function can control the plan via the labels and units keyword arguments, all of which are forwarded to plan_edf_to_onda_samples.
EDF.Signal labels that are converted into Onda channel names undergo the following transformations:
- the label is whitespace-stripped, parens-stripped, and lowercased
- trailing generic EDF references (e.g. "ref", "ref2", etc.) are dropped
- any instance of + is replaced with _plus_ and / with _over_
- all component names are converted to their "canonical names" when possible (e.g. "3" in an ECG-matched channel name will be converted to "iii").
If more control (e.g. preprocessing signal labels) is required, callers should use plan_edf_to_onda_samples and edf_to_onda_samples directly, and Onda.store the resulting samples manually.
See the OndaEDF README for additional details regarding EDF formatting expectations.
Internal import utilities
OndaEDF.match_edf_label — Function
OndaEDF.match_edf_label(label, signal_names, channel_name, canonical_names)
Return a normalized label matched from an EDF label. The purpose of this function is to remove signal names from the label, and to canonicalize the channel name(s) that remain. So something like "[eCG] avl-REF" will be transformed to "avl" (given signal_names=["ecg"], and channel_name="avl").
This returns nothing if channel_name does not match after normalization.
Canonicalization
- ensures the given label is whitespace-stripped, lowercase, and parens-free
- strips trailing generic EDF references (e.g. "ref", "ref2", etc.)
- replaces all references with the appropriate name as specified by canonical_names
- replaces + with _plus_ and / with _over_
- returns the initial reference name (w/o prefix sign, if present) and the entire label; the initial reference name should match the canonical channel name, otherwise the channel extraction will be rejected.
Examples
match_edf_label("[ekG] avl-REF", ["ecg", "ekg"], "avl", []) == "avl"
match_edf_label("ECG 2", ["ecg", "ekg"], "ii", ["ii" => ["2", "two", "ecg2"]]) == "ii"
See the tests for more examples.
This is an internal function and is not meant to be called directly.
OndaEDF.merge_samples_info — Function
OndaEDF.merge_samples_info(plan_rows)
Create a single, merged SamplesInfo from plan rows, such as generated by plan_edf_to_onda_samples. Encodings are promoted with promote_encodings.
The input rows must have the same values for :sensor_type, :sample_unit, and :sample_rate; otherwise an ArgumentError is thrown.
If any of these values is missing, or any row's :channel value is missing, this returns missing to indicate it is not possible to determine a shared SamplesInfo.
The original EDF labels are included in the output in the :edf_channels column.
This is an internal function and is not meant to be called directly.
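The merge rules above (agreement on the grouping columns, missing as a signal that no shared info exists, and collection of the original labels) can be sketched as follows. merge_rows_sketch is an invented helper, not OndaEDF's code: it ignores encoding promotion, and the order in which it checks for missing values versus disagreement is an assumption.

```julia
function merge_rows_sketch(rows)
    cols = (:sensor_type, :sample_unit, :sample_rate)
    # Assumed order of checks: missing values first, then agreement.
    for row in rows
        ismissing(row.channel) && return missing
        any(c -> ismissing(getproperty(row, c)), cols) && return missing
    end
    for c in cols
        vals = unique([getproperty(row, c) for row in rows])
        length(vals) == 1 || throw(ArgumentError("rows disagree on $c: $vals"))
    end
    shared = first(rows)
    return (sensor_type = shared.sensor_type,
            sample_unit = shared.sample_unit,
            sample_rate = shared.sample_rate,
            channels = [row.channel for row in rows],
            edf_channels = [row.label for row in rows])
end

# Hypothetical plan rows belonging to one group.
rows = [
    (label="EEG C3-Ref", channel="c3", sensor_type="eeg", sample_unit="microvolt", sample_rate=256.0),
    (label="EEG C4-Ref", channel="c4", sensor_type="eeg", sample_unit="microvolt", sample_rate=256.0),
]
merged = merge_rows_sketch(rows)
```

The merged result lists both channels and keeps the original EDF labels alongside them.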
OndaEDF.onda_samples_from_edf_signals — Function
OndaEDF.onda_samples_from_edf_signals(target::Onda.SamplesInfo, edf_signals,
edf_seconds_per_record; dither_storage=missing)
Generate an Onda.Samples struct from an iterable of EDF.Signals, based on the Onda.SamplesInfo in target. This checks for matching sample rates in the source signals. If the encoding of target is the same as the encoding in a signal, its encoded (usually Int16) data is copied directly into the Samples data matrix; otherwise it is re-encoded.
If dither_storage=missing (the default), dither storage is allocated automatically as specified in the docstring for Onda.encode. dither_storage=nothing disables dithering. See Onda.encode's docstring for more details.
This function is not meant to be called directly, but through edf_to_onda_samples.
Returned samples are integer-encoded. If these samples are being serialized out (e.g. via Onda.store!) this is not an issue, but if the samples are being immediately analyzed in memory, call Onda.decode to decode them to recover the time-series voltages.
OndaEDF.promote_encodings — Function
promote_encodings(encodings; pick_offset=(_ -> 0.0), pick_resolution=minimum)
Return a common encoding for input encodings, as a NamedTuple with fields sample_type, sample_offset_in_unit, sample_resolution_in_unit, and sample_rate. If input encodings' sample_rates are not all equal, an error is thrown. If sample resolutions/offsets are not equal, then pick_offset and pick_resolution are used to combine them into a common offset/resolution.
This is an internal function and is not meant to be called directly.
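The role of pick_offset and pick_resolution can be seen in a stripped-down sketch. promote_sketch is an invented stand-in, not OndaEDF's function: it leaves out sample_type, and the exact inputs passed to the two callbacks in the real function are an assumption here (this sketch passes the distinct values found across the encodings).

```julia
function promote_sketch(encodings; pick_offset=(_ -> 0.0), pick_resolution=minimum)
    rates = unique([e.sample_rate for e in encodings])
    length(rates) == 1 || throw(ArgumentError("sample rates differ: $rates"))
    offsets = unique([e.sample_offset_in_unit for e in encodings])
    resolutions = unique([e.sample_resolution_in_unit for e in encodings])
    # Only call the pickers when the encodings actually disagree.
    offset = length(offsets) == 1 ? first(offsets) : pick_offset(offsets)
    resolution = length(resolutions) == 1 ? first(resolutions) : pick_resolution(resolutions)
    return (sample_offset_in_unit = offset,
            sample_resolution_in_unit = resolution,
            sample_rate = first(rates))
end

# Hypothetical encodings for two signals in the same group.
encodings = [
    (sample_rate=256.0, sample_offset_in_unit=0.0, sample_resolution_in_unit=0.5),
    (sample_rate=256.0, sample_offset_in_unit=1.0, sample_resolution_in_unit=0.25),
]
common = promote_sketch(encodings)
```

With the default pickers, the differing offsets collapse to 0.0 and the finer of the two resolutions (0.25) is kept.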
Export EDF from Onda
OndaEDF.onda_to_edf — Function
onda_to_edf(samples::AbstractVector{<:Samples}, annotations=[]; kwargs...)
Return an EDF.File containing signal data converted from a collection of Onda Samples and (optionally) annotations from an annotations table.
Following the Onda v0.5 format, annotations can be any Tables.jl-compatible table (DataFrame, Arrow.Table, NamedTuple of vectors, vector of NamedTuples) which follows the annotation schema.
Each EDF.Signal in the returned EDF.File corresponds to a channel of an input Onda.Samples.
The ordering of EDF.Signals in the output will match the order of the input collection of Samples (and within each channel grouping, the order of the samples' channels).
EDF signals are encoded as Int16, while Onda allows a range of different sample types, some of which provide considerably more resolution than Int16. During export, re-encoding may be necessary if the encoded Onda samples cannot be represented directly as Int16 values. In this case, new encoding (resolution and offset) will be chosen based on the minimum and maximum values actually present in each signal in the input Onda Samples. Thus, it may not always be possible to losslessly round trip Onda-formatted datasets to EDF and back.
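To see why the round trip can be lossy, the sketch below picks an Int16 encoding from the minimum and maximum of a signal and measures the resulting quantization error. The data are made up, and the particular formula (spreading the observed range over every Int16 code) is an assumption chosen for illustration; onda_to_edf may choose resolution and offset differently.

```julia
# Hypothetical decoded signal values in physical units.
signal = [-12.5, 0.0, 3.75, 40.0]

lo, hi = extrema(signal)
# Spread the observed range across all Int16 codes (computed in Float64 to avoid overflow).
resolution = (hi - lo) / (Float64(typemax(Int16)) - Float64(typemin(Int16)))
offset = lo - resolution * typemin(Int16)

# Encode to Int16, then decode again to measure the quantization error.
encoded = round.(Int16, (signal .- offset) ./ resolution)
decoded = resolution .* encoded .+ offset
max_error = maximum(abs.(decoded .- signal))
```

The extremes map to typemin(Int16) and typemax(Int16), and every value is recovered to within half a resolution step, which is small but generally not zero.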
Deprecations
To support deserializing plan tables generated with old versions of OndaEDF + Onda, the following schemas are provided. These are deprecated and will be removed in a future release.
OndaEDFSchemas.PlanV1 — Type
@version PlanV1 begin
# EDF.SignalHeader fields
label::String
transducer_type::String
physical_dimension::String
physical_minimum::Float32
physical_maximum::Float32
digital_minimum::Float32
digital_maximum::Float32
prefilter::String
samples_per_record::Int16
# EDF.FileHeader field
seconds_per_record::Float64
# Onda.SignalV1 fields (channels -> channel), may be missing
recording::Union{UUID,Missing} = passmissing(UUID)
kind::Union{Missing,AbstractString}
channel::Union{Missing,AbstractString}
sample_unit::Union{Missing,AbstractString}
sample_resolution_in_unit::Union{Missing,Float64}
sample_offset_in_unit::Union{Missing,Float64}
sample_type::Union{Missing,AbstractString}
sample_rate::Union{Missing,Float64}
# errors, use `nothing` to indicate no error
error::Union{Nothing,String}
end
A Legolas-generated record type describing a single EDF signal-to-Onda channel conversion. The columns are the union of
- fields from EDF.SignalHeader (all mandatory)
- the seconds_per_record field from EDF.FileHeader (mandatory)
- fields from Onda.SignalV1 (optional, may be missing to indicate failed conversion), except for file_path
- error, which is nothing for a conversion that is or is expected to be successful, and a String describing the source of the error (with backtrace) in the case of a caught error.
OndaEDFSchemas.FilePlanV1 — Type
@version FilePlanV1 > PlanV1 begin
edf_signal_index::Int
onda_signal_index::Int
end
A Legolas-generated record type representing one EDF signal-to-Onda channel conversion, which includes the columns of a PlanV1 and additional file-level context:
- edf_signal_index gives the index of the signals in the source EDF.File corresponding to this row
- onda_signal_index gives the index of the output Onda.Samples.
Note that while the EDF index does correspond to the actual index in edf.signals, some Onda indices may be skipped in the output, so onda_signal_index is only to indicate order and grouping.