API Documentation

The AbstractClassifier Interface

Lighthouse.AbstractClassifier (Type)
AbstractClassifier

An abstract type whose subtypes C<:AbstractClassifier must implement:

  • classes
  • train!
  • loss_and_prediction

Subtypes may additionally overload default implementations for:

  • onehot
  • onecold
  • is_early_stopping_exception

The AbstractClassifier interface is built upon the expectation that any multiclass label will be represented in one of two standardized forms:

  • "soft label": a probability distribution vector where the ith element is the probability assigned to the ith class in classes(classifier).
  • "hard label": the interger index of a corresponding class in classes(classifier).

Internally, Lighthouse converts hard labels to soft labels via onehot and soft labels to hard labels via onecold.
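
For example, the two label forms relate as follows for a minimal toy subtype (the ToyClassifier type and its class names are illustrative, not part of Lighthouse):

using Lighthouse

struct ToyClassifier <: Lighthouse.AbstractClassifier end
Lighthouse.classes(::ToyClassifier) = ["cat", "dog", "bird"]

c = ToyClassifier()
soft_label = [0.1, 0.7, 0.2]                    # probability assigned to each class
hard_label = Lighthouse.onecold(c, soft_label)  # 2, via the default argmax
Lighthouse.onehot(c, hard_label)                # one-hot probability vector, 1 at index 2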

See also: learn!

source
Lighthouse.classes (Function)
Lighthouse.classes(classifier::AbstractClassifier)

Return a Vector or Tuple of class values for classifier.

This method must be implemented for each AbstractClassifier subtype.

source
Lighthouse.train! (Function)
Lighthouse.train!(classifier::AbstractClassifier, batches, logger)

Train classifier on the iterable batches for a single epoch. This function is called once per epoch by learn!.

This method must be implemented for each AbstractClassifier subtype. Implementers should ensure that the training loss is properly logged to logger by calling Lighthouse.log_value!(logger, "train/loss_per_batch", batch_loss) for each batch in batches.

source
Lighthouse.loss_and_prediction (Function)
Lighthouse.loss_and_prediction(classifier::AbstractClassifier,
                               input_batch::AbstractArray,
                               args...)

Return (loss, soft_label_batch) given input_batch and any additional args provided by the caller; loss is a scalar, and soft_label_batch is a matrix with length(classes(classifier)) rows and one column per sample in input_batch.

Specifically, the ith column of soft_label_batch is classifier's soft label prediction for the ith sample in input_batch.

This method must be implemented for each AbstractClassifier subtype.

source
Lighthouse.onehot (Function)
Lighthouse.onehot(classifier::AbstractClassifier, hard_label)

Return the one-hot encoded probability distribution vector corresponding to the given hard_label. hard_label must be an integer index in the range 1:length(classes(classifier)).

source
Lighthouse.onecold (Function)
Lighthouse.onecold(classifier::AbstractClassifier, soft_label)

Return the hard label (integer index in the range 1:length(classes(classifier))) corresponding to the given soft_label (one-hot encoded probability distribution vector).

By default, this function returns argmax(soft_label).

source
Lighthouse.is_early_stopping_exception (Function)
Lighthouse.is_early_stopping_exception(classifier::AbstractClassifier, exception)

Return true if exception should be considered an "early-stopping exception" (e.g. Flux.Optimise.StopException), rather than rethrown from learn!.

This function returns false by default, but can be overloaded by subtypes of AbstractClassifier that employ exceptions as early-stopping mechanisms.

source
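
Putting the required methods together, the following sketch fills out the interface for a hypothetical fixed-weight classifier. The model internals (TrivialClassifier, its weights, and its loss) are illustrative assumptions; only the three overloaded methods and log_value! are part of the documented interface:

using Lighthouse
using Statistics: mean

struct TrivialClassifier <: Lighthouse.AbstractClassifier
    weights::Matrix{Float64}  # length(classes) × feature_count
end

Lighthouse.classes(::TrivialClassifier) = ["negative", "positive"]

function Lighthouse.loss_and_prediction(c::TrivialClassifier, input_batch::AbstractArray)
    logits = c.weights * input_batch                              # class_count × sample_count
    soft_label_batch = exp.(logits) ./ sum(exp.(logits); dims=1)  # columnwise softmax
    loss = -mean(log.(maximum(soft_label_batch; dims=1)))         # illustrative scalar loss
    return loss, soft_label_batch
end

function Lighthouse.train!(c::TrivialClassifier, batches, logger)
    for batch in batches  # here, each training batch is a tuple of loss_and_prediction args
        batch_loss, _ = Lighthouse.loss_and_prediction(c, batch...)
        # a real classifier would update its parameters here
        Lighthouse.log_value!(logger, "train/loss_per_batch", batch_loss)
    end
    return c
end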

The learn! Interface

Lighthouse.learn! (Function)
learn!(model::AbstractClassifier, logger,
       get_train_batches, get_test_batches, votes,
       elected=majority.(eachrow(votes), (1:length(classes(model)),));
       epoch_limit=100, post_epoch_callback=(_ -> nothing),
       optimal_threshold_class::Union{Nothing,Integer}=nothing,
       test_set_logger_prefix="test_set")

Return model after optimizing its parameters across multiple epochs of training and test, logging Lighthouse's standardized suite of classifier performance metrics to logger throughout the optimization process.

The following phases are executed at each epoch (note: in the below lists of logged values, $resource takes the values of the field names of Lighthouse.ResourceInfo):

  1. Train model by calling train!(model, get_train_batches(), logger). The following quantities are logged to logger during this phase:

    • train/loss_per_batch
    • any additional quantities logged by the relevant model/framework-specific implementation of train!.
  2. Compute model's predictions on test set provided by get_test_batches() (see below for details). The following quantities are logged to logger during this phase:

    • <test_set_logger_prefix>_prediction/loss_per_batch
    • <test_set_logger_prefix>_prediction/mean_loss_per_epoch
    • <test_set_logger_prefix>_prediction/$resource_per_batch
  3. Compute a battery of metrics to evaluate model's performance on the test set based on the test set prediction phase. The following quantities are logged to logger during this phase:

    • <test_set_logger_prefix>_evaluation/metrics_per_epoch
    • <test_set_logger_prefix>_evaluation/$resource_per_epoch
  4. Call post_epoch_callback(current_epoch).

Where...

  • get_train_batches is a zero-argument function that returns an iterable of training set batches. Internally, learn! uses this function when it calls train!(model, get_train_batches(), logger).

  • get_test_batches is a zero-argument function that returns an iterable of test set batches used during the current epoch's test phase. Each element of the iterable takes the form (batch, votes_locations). Internally, batch is passed to loss_and_prediction as loss_and_prediction(model, batch...), and votes_locations[i] is expected to yield the row index of votes that corresponds to the ith sample in batch.

  • votes is a matrix of hard labels whose columns correspond to voters and whose rows correspond to the samples in the test set that have been voted on. If votes[sample, voter] is not a valid hard label for model, then voter will simply be considered to have not assigned a hard label to sample.

  • elected is a vector of hard labels where the ith element is the hard label elected as "ground truth" out of votes[i, :].

  • optimal_threshold_class is the class index (1 or 2) for which to calculate an optimal threshold for converting predicted_soft_labels to predicted_hard_labels. This is only a valid parameter when length(classes) == 2. If optimal_threshold_class is present, test set evaluation will be based on predicted hard labels calculated with this threshold; if optimal_threshold_class is nothing, predicted hard labels will be calculated via onecold(classifier, soft_label).

source
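
A minimal sketch of wiring these arguments together, reusing the TrivialClassifier sketched above; the data shapes are illustrative, and the LearnLogger constructor call is an assumption (see LearnLogger below):

model = TrivialClassifier(randn(2, 4))                           # from the sketch above
logger = Lighthouse.LearnLogger("/tmp/lighthouse_logs", "demo")  # assumed constructor

votes = [1 2 1;  # rows: 3 voted-on test samples, columns: 3 voters
         2 2 2;
         1 1 2]

get_train_batches() = ((rand(4, 8),) for _ in 1:10)  # ten batches of 8 samples, 4 features
get_test_batches() = (((rand(4, 3),), 1:3),)         # one (batch, votes_locations) pair

Lighthouse.learn!(model, logger, get_train_batches, get_test_batches, votes;
                  epoch_limit=5)
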
Lighthouse.evaluate! (Function)
evaluate!(predicted_hard_labels::AbstractVector,
          predicted_soft_labels::AbstractMatrix,
          elected_hard_labels::AbstractVector,
          classes, logger;
          logger_prefix, logger_suffix,
          votes::Union{Nothing,AbstractMatrix}=nothing,
          thresholds=0.0:0.01:1.0,
          optimal_threshold_class::Union{Nothing,Integer}=nothing)

Return nothing after computing and logging a battery of classifier performance metrics that each compare predicted_soft_labels and/or predicted_hard_labels against elected_hard_labels.

The following quantities are logged to logger:

  • <logger_prefix>/metrics<logger_suffix>
  • <logger_prefix>/$resource<logger_suffix>

Where...

  • predicted_soft_labels is a matrix of soft labels whose columns correspond to classes and whose rows correspond to samples in the evaluation set.

  • predicted_hard_labels is a vector of hard labels where the ith element is the hard label predicted by the model for sample i in the evaluation set.

  • elected_hard_labels is a vector of hard labels where the ith element is the hard label elected as "ground truth" for sample i in the evaluation set.

  • thresholds are the range of thresholds used by metrics (e.g. PR curves) that are calculated on the predicted_soft_labels for a range of thresholds.

  • votes is a matrix of hard labels whose columns correspond to voters and whose rows correspond to the samples in the test set that have been voted on. If votes[sample, voter] is not a valid hard label for model, then voter will simply be considered to have not assigned a hard label to sample.

  • optimal_threshold_class is the class index (1 or 2) for which to calculate an optimal threshold for converting the predicted_soft_labels to predicted_hard_labels. If present, the input predicted_hard_labels will be ignored and new predicted_hard_labels will be recalculated from the new threshold. This is only a valid parameter when length(classes) == 2

source
Lighthouse.predict! (Function)
predict!(model::AbstractClassifier,
         predicted_soft_labels::AbstractMatrix,
         batches, logger::LearnLogger;
         logger_prefix::AbstractString)

Return mean_loss of all batches after using model to predict their soft labels and storing those results in predicted_soft_labels.

The following quantities are logged to logger:

  • <logger_prefix>/loss_per_batch
  • <logger_prefix>/mean_loss_per_epoch
  • <logger_prefix>/$resource_per_batch

Where...

  • model is a model that outputs soft labels when called on a batch from batches, i.e. model(batch).

  • predicted_soft_labels is a matrix whose columns correspond to classes and whose rows correspond to samples in batches, and which is filled in with soft-label predictions.

  • batches is an iterable of batches, where each element of the iterable takes the form (batch, votes_locations). Internally, batch is passed to loss_and_prediction as loss_and_prediction(model, batch...).

source
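
For instance, a sketch of a direct predict! call, with model and logger as in the earlier sketches and illustrative shapes:

predicted_soft_labels = zeros(Float32, 6, 2)  # rows: 6 samples, columns: 2 classes
batches = (((rand(4, 3),), 1:3),              # (batch, votes_locations) pairs...
           ((rand(4, 3),), 4:6))              # ...covering samples 1:3 and 4:6
mean_loss = Lighthouse.predict!(model, predicted_soft_labels, batches, logger;
                                logger_prefix="test_set_prediction")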

The logging interface

The following "primitives" must be defined for a logger to be used with Lighthouse:

Lighthouse.log_line_series! (Function)
log_line_series!(logger, field::AbstractString, curves, labels=1:length(curves))

Logs a series plot to logger under field, where...

  • curves is an iterable of the form Tuple{Vector{Real},Vector{Real}}, where each tuple contains (x-values, y-values), as in the Lighthouse.EvaluationV1 field per_class_roc_curves
  • labels is the class label for each curve, which defaults to the numeric index of each curve.
source
Lighthouse.log_plot! (Function)
log_plot!(logger, field::AbstractString, plot, plot_data)

Log a plot to logger under field field.

  • plot: the plot itself
  • plot_data: an unstructured dictionary of values used in creating plot.

See also log_line_series!.

source

in addition to Base.flush(logger) (which can be a no-op by defining Base.flush(::MyLoggingType) = nothing).

These primitives can be used in implementations of train!, evaluate!, and predict!, as well as in the following composite logging functions, which by default call the above primitives. Loggers may provide custom implementations of these.
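
As a concrete illustration of the primitives, here is a minimal sketch of a custom logger that satisfies the interface by caching everything in a Dict (the DictLogger type is hypothetical):

using Lighthouse

struct DictLogger
    logged::Dict{String,Vector{Any}}
end
DictLogger() = DictLogger(Dict{String,Vector{Any}}())

# log_value! is the scalar primitive that the composite functions fall back to
Lighthouse.log_value!(logger::DictLogger, field::AbstractString, value) =
    push!(get!(Vector{Any}, logger.logged, field), value)

Lighthouse.log_line_series!(logger::DictLogger, field::AbstractString, curves,
                            labels=1:length(curves)) =
    push!(get!(Vector{Any}, logger.logged, field), (curves, labels))

Lighthouse.log_plot!(logger::DictLogger, field::AbstractString, plot, plot_data) =
    push!(get!(Vector{Any}, logger.logged, field), plot_data)

Base.flush(::DictLogger) = nothing  # no transient state to persist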

Lighthouse.log_event! (Function)
log_event!(logger, value::AbstractString)

Logs a string event given by value to logger. Defaults to calling log_value! with a field named event.

source
Lighthouse.log_values! (Function)
log_values!(logger, values)

Logs an iterable of (field, value) pairs to logger. Falls back to calling log_value! in a loop. Loggers may specialize this method for improved performance.

source
Lighthouse.log_array! (Function)
log_array!(logger::Any, field::AbstractString, value)

Log an array value to field.

Defaults to log_value!(logger, field, mean(value)).

source
Lighthouse.log_arrays! (Function)
log_arrays!(logger, values)

Logs an iterable of (field, array) pairs to logger. Falls back to calling log_array! in a loop.

Loggers may specialize this method for improved performance.

source

LearnLoggers

LearnLogger is a TensorBoard-backed logger that complies with the above logging interface. It also supports additional callback functionality via upon:

Lighthouse.LearnLogger (Type)
LearnLogger

A struct that wraps a TensorBoardLogger.TBLogger in order to enforce the following:

  • all values logged to Tensorboard should be accessible to the post_epoch_callback argument to learn!
  • all values that are cached during learn! should be logged to Tensorboard

To access values logged to a LearnLogger instance, inspect the instance's logged field.

source
Lighthouse.upon (Function)
upon(logger::LearnLogger, field::AbstractString; condition, initial)
upon(logged::Dict{String,Any}, field::AbstractString; condition, initial)

Return a closure that can be called to check the most recent state of logger.logged[field] and trigger a caller-provided function when condition(recent_state, previously_chosen_state) is true.

For example:

upon_loss_decrease = upon(logger, "test_set_prediction/mean_loss_per_epoch";
                          condition=<, initial=Inf)

save_upon_loss_decrease = _ -> begin
    upon_loss_decrease(new_lowest_loss -> save_my_model(model, new_lowest_loss),
                       consecutive_failures -> consecutive_failures > 10 && Flux.stop())
end

learn!(model, logger, get_train_batches, get_test_batches, votes;
       post_epoch_callback=save_upon_loss_decrease)

Specifically, the form of the returned closure is f(on_true, on_false) where on_true(state) is called if condition(state, previously_chosen_state) is true. Otherwise, on_false(consecutive_falses) is called where consecutive_falses is the number of condition calls that have returned false since the last condition call returned true.

Note that the returned closure is a no-op if logger.logged[field] has not been updated since the most recent call.

source
Lighthouse.forward_logs (Function)
forwarding_task = forward_logs(channel, logger::LearnLogger)

Forwards logs with values supported by TensorBoardLogger to logger::LearnLogger:

  • string events of type AbstractString
  • scalars of type Union{Real,Complex}
  • plots that TensorBoardLogger can convert to raster images

Returns the forwarding_task::Task that does the forwarding. To cleanly stop forwarding, close(channel) and wait(forwarding_task).

channel is a Channel or RemoteChannel of Pair{String,Any} entries; pairs whose field names start with "plot" are forwarded to TensorBoardLogger.log_image.

source
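
A sketch of the intended lifecycle, assuming a LearnLogger constructed as in the earlier learn! sketch:

channel = Channel{Pair{String,Any}}(32)
forwarding_task = Lighthouse.forward_logs(channel, logger)

put!(channel, "train/loss_per_batch" => 0.37)  # a scalar value
put!(channel, "event" => "epoch finished")     # a string event

close(channel)         # stop forwarding cleanly...
wait(forwarding_task)  # ...and wait for the task to finish
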
Base.flush (Method)
Base.flush(logger::LearnLogger)

Persist possibly transient logger state.

source

Performance Metrics

Lighthouse.confusion_matrix (Function)
confusion_matrix(class_count::Integer, hard_label_pairs = ())

Given the iterable hard_label_pairs whose kth element takes the form (first_classifiers_label_for_sample_k, second_classifiers_label_for_sample_k), return the corresponding confusion matrix where matrix[i, j] is the number of samples that the first classifier labeled i and the second classifier labeled j.

Note that the returned confusion matrix can be updated in-place with new labels via Lighthouse.increment_at!(matrix, more_hard_label_pairs).

source
Lighthouse.accuracy (Function)
accuracy(confusion::AbstractMatrix)

Returns the percentage of matching classifications out of total classifications, or NaN if all(iszero, confusion).

Note that accuracy(confusion) is equivalent to overall percent agreement between confusion's row classifier and column classifier.

source
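
For example, building a confusion matrix from paired labels and scoring agreement (the label vectors are illustrative):

predicted = [1, 2, 2, 3, 3, 3]
elected   = [1, 2, 3, 3, 3, 1]

confusion = Lighthouse.confusion_matrix(3, zip(predicted, elected))
# confusion[i, j] counts samples labeled i by the first labeler and j by the second

Lighthouse.accuracy(confusion)  # 4 of the 6 labels match: 4/6 ≈ 0.67
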
Lighthouse.binary_statistics (Function)
binary_statistics(confusion::AbstractMatrix, class_index)

Treating the rows of confusion as corresponding to predicted classifications and the columns as corresponding to true classifications, return a NamedTuple with the following fields for the given class_index:

  • predicted_positives
  • predicted_negatives
  • actual_positives
  • actual_negatives
  • true_positives
  • true_negatives
  • false_positives
  • false_negatives
  • true_positive_rate
  • true_negative_rate
  • false_positive_rate
  • false_negative_rate
  • precision
  • f1
source
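
For instance, with a hand-built 2×2 confusion matrix:

confusion = [5 2;  # rows: predicted class, columns: elected class
             1 8]
stats = Lighthouse.binary_statistics(confusion, 2)
stats.true_positives    # 8,  confusion[2, 2]
stats.actual_positives  # 10, the second column's sum
stats.precision         # 8/9, true positives over predicted positives
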
Lighthouse.cohens_kappa (Function)
cohens_kappa(class_count, hard_label_pairs)

Return (κ, p₀) where κ is Cohen's kappa and p₀ is the percent agreement given class_count and hard_label_pairs (these arguments take the same form as their equivalents in confusion_matrix).

source
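
For example, reusing the hard_label_pairs form from confusion_matrix:

pairs = zip([1, 2, 2, 1, 1], [1, 2, 1, 1, 1])  # (first labeler, second labeler) pairs
κ, p₀ = Lighthouse.cohens_kappa(2, pairs)      # here p₀ == 4/5
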
Lighthouse.calibration_curve (Function)
calibration_curve(probabilities, bitmask; bin_count=10)

Given probabilities (the predicted probabilities of the positive class) and bitmask (a vector of Bools indicating whether or not the element actually belonged to the positive class), return (bins, fractions, totals, mean_squared_error) where:

  • bins: a vector with bin_count Pairs specifying the calibration curve's probability bins.
  • fractions: a vector where fractions[i] is the fraction of the values in probabilities falling within bin[i] whose corresponding bitmask entry is true (i.e., the observed frequency of the positive class), or NaN if bin[i] contains no values.
  • totals: a vector where totals[i] is the total number of values of probabilities within bin[i].
  • mean_squared_error: The mean squared error of fractions vs. an ideal calibration curve.

This method is similar to the corresponding scikit-learn method:

https://scikit-learn.org/stable/modules/generated/sklearn.calibration.calibration_curve.html

source
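
For example (the probabilities and outcomes are illustrative):

probabilities = [0.1, 0.3, 0.35, 0.8, 0.9, 0.95]  # predicted positive-class probability
bitmask = [false, false, true, true, true, true]  # true where the positive class occurred
bins, fractions, totals, mse = Lighthouse.calibration_curve(probabilities, bitmask;
                                                            bin_count=5)
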
Lighthouse.EvaluationV1 (Type)
@version EvaluationV1 begin
    class_labels::Union{Missing,Vector{String}}
    confusion_matrix::Union{Missing,Array{Int64,1},Array{Int64,2}} = vec_to_mat(confusion_matrix)
    discrimination_calibration_curve::Union{Missing,GenericCurve}
    discrimination_calibration_score::Union{Missing,Float64}
    multiclass_IRA_kappas::Union{Missing,Float64}
    multiclass_kappa::Union{Missing,Float64}
    optimal_threshold::Union{Missing,Float64}
    optimal_threshold_class::Union{Missing,Int64}
    per_class_IRA_kappas::Union{Missing,Vector{Float64}}
    per_class_kappas::Union{Missing,Vector{Float64}}
    stratified_kappas::Union{Missing,
                             Vector{@NamedTuple{per_class::Vector{Float64},
                                                multiclass::Float64,
                                                n::Int64}}}
    per_class_pr_curves::Union{Missing,Vector{GenericCurve}}
    per_class_reliability_calibration_curves::Union{Missing,Vector{GenericCurve}}
    per_class_reliability_calibration_scores::Union{Missing,Vector{Float64}}
    per_class_roc_aucs::Union{Missing,Vector{Float64}}
    per_class_roc_curves::Union{Missing,Vector{GenericCurve}}
    per_expert_discrimination_calibration_curves::Union{Missing,Vector{GenericCurve}}
    per_expert_discrimination_calibration_scores::Union{Missing,Vector{Float64}}
    spearman_correlation::Union{Missing,
                                @NamedTuple{ρ::Float64,  # Note: is rho not 'p' 😢
                                            n::Int64,
                                            ci_lower::Float64,
                                            ci_upper::Float64}}
    thresholds::Union{Missing,Vector{Float64}}
end

A Legolas record representing the output metrics computed by evaluation_metrics_record and evaluation_metrics.

See Legolas.jl for details regarding Legolas record types.

source
Lighthouse.ObservationV1 (Type)
@version ObservationV1 begin
    predicted_hard_label::Int64
    predicted_soft_labels::Vector{Float32}
    elected_hard_label::Int64
    votes::Union{Missing,Vector{Int64}}
end

A Legolas record representing the per-observation input values required to compute evaluation_metrics_record.

source
Lighthouse._evaluation_dict (Function)
_evaluation_row_dict(row::EvaluationV1) -> Dict{String,Any}

Convert an EvaluationV1 row into a Dict{String,Any} of results, as output by evaluation_metrics (the format that predated the use of EvaluationV1 in Lighthouse <v0.14.0).

source
Lighthouse.evaluation_metrics_record (Function)
evaluation_metrics_record(observation_table, classes, thresholds=0.0:0.01:1.0;
                          strata::Union{Nothing,AbstractVector{Set{T}} where T}=nothing,
                          optimal_threshold_class::Union{Missing,Nothing,Integer}=missing)
evaluation_metrics_record(predicted_hard_labels::AbstractVector,
                          predicted_soft_labels::AbstractMatrix,
                          elected_hard_labels::AbstractVector,
                          classes,
                          thresholds=0.0:0.01:1.0;
                          votes::Union{Nothing,Missing,AbstractMatrix}=nothing,
                          strata::Union{Nothing,AbstractVector{Set{T}} where T}=nothing,
                          optimal_threshold_class::Union{Missing,Nothing,Integer}=missing)

Returns an EvaluationV1 containing a battery of classifier performance metrics that each compare predicted_soft_labels and/or predicted_hard_labels against elected_hard_labels.

Where...

  • predicted_soft_labels is a matrix of soft labels whose columns correspond to classes and whose rows correspond to samples in the evaluation set.

  • predicted_hard_labels is a vector of hard labels where the ith element is the hard label predicted by the model for sample i in the evaluation set.

  • elected_hard_labels is a vector of hard labels where the ith element is the hard label elected as "ground truth" for sample i in the evaluation set.

  • thresholds are the range of thresholds used by metrics (e.g. PR curves) that are calculated on the predicted_soft_labels for a range of thresholds.

  • votes is a matrix of hard labels whose columns correspond to voters and whose rows correspond to the samples in the test set that have been voted on. If votes[sample, voter] is not a valid hard label for model, then voter will simply be considered to have not assigned a hard label to sample.

  • strata is a vector of sets of (arbitrarily typed) groups/strata for each sample in the evaluation set, or nothing. If not nothing, per-class and multiclass kappas will also be calculated per group/stratum.

  • optimal_threshold_class is the class index (1 or 2) for which to calculate an optimal threshold for converting the predicted_soft_labels to predicted_hard_labels. If present, the input predicted_hard_labels will be ignored and new predicted_hard_labels will be recalculated from the new threshold. This is only a valid parameter when length(classes) == 2

Alternatively, an observation_table consisting of rows of type ObservationV1 can be passed in place of predicted_soft_labels, predicted_hard_labels, elected_hard_labels, and votes. Supply a function to the keyword argument binarize which takes as input (soft_label, threshold) and outputs a Bool indicating whether or not the sample should be considered positive for the class of interest.

See also evaluation_metrics_plot.

source
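
A minimal sketch of the direct-arguments form; the toy labels are illustrative, and the returned field values will vary:

predicted_soft = [0.9 0.1; 0.4 0.6; 0.2 0.8; 0.7 0.3]  # rows: samples, columns: classes
predicted_hard = map(argmax, eachrow(predicted_soft))
elected_hard   = [1, 2, 2, 1]

evaluation = Lighthouse.evaluation_metrics_record(predicted_hard, predicted_soft,
                                                  elected_hard, ["negative", "positive"])
evaluation.multiclass_kappa  # one of the computed EvaluationV1 fields
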
Lighthouse.ClassV1 (Type)
@version ClassV1 begin
    class_index::Union{Int64,Symbol} = check_valid_class(class_index)
    class_labels::Union{Missing,Vector{String}}
end

A Legolas record representing a class_index column that holds either an integer or the value :multiclass, along with the class names associated with the integer class indices.

source
Lighthouse.TradeoffMetricsV1 (Type)
@version TradeoffMetricsV1 > ClassV1 begin
    roc_curve::Curve = lift(Curve, roc_curve)
    roc_auc::Float64
    pr_curve::Curve = lift(Curve, pr_curve)
    spearman_correlation::Union{Missing,Float64}
    spearman_correlation_ci_upper::Union{Missing,Float64}
    spearman_correlation_ci_lower::Union{Missing,Float64}
    n_samples::Union{Missing,Int}
    reliability_calibration_curve::Union{Missing,Curve} = lift(Curve,
                                                               reliability_calibration_curve)
    reliability_calibration_score::Union{Missing,Float64}
end

A Legolas record representing metrics calculated over predicted soft labels. See also get_tradeoff_metrics and get_tradeoff_metrics_binary_multirater.

source
Lighthouse.get_tradeoff_metrics (Function)
get_tradeoff_metrics(predicted_soft_labels, elected_hard_labels, class_index;
                     thresholds, binarize=binarize_by_threshold, class_labels=missing)

Return a TradeoffMetricsV1 record calculated for the given class_index, with the following fields guaranteed to be non-missing: roc_curve, roc_auc, pr_curve, reliability_calibration_curve, reliability_calibration_score. Supply a function to the keyword argument binarize which takes as input (soft_label, threshold) and outputs a Bool indicating whether or not the given soft_label should be considered positive for the class of interest (class_index).

source
Lighthouse.get_tradeoff_metrics_binary_multirater (Function)
get_tradeoff_metrics_binary_multirater(predicted_soft_labels, elected_hard_labels, class_index;
                                       thresholds, binarize=binarize_by_threshold, class_labels=missing)

Return a TradeoffMetricsV1 record calculated for the given class_index. In addition to the metrics calculated by get_tradeoff_metrics, also calculates spearman_correlation-based metrics. Supply a function to the keyword argument binarize which takes as input (soft_label, threshold) and outputs a Bool indicating whether or not the given soft_label should be considered positive for the class of interest (class_index).

source
Lighthouse.HardenedMetricsV1 (Type)
@version HardenedMetricsV1 > ClassV1 begin
    confusion_matrix::Union{Missing,Array{Int64,1},Array{Int64,2}} = vec_to_mat(confusion_matrix)
    discrimination_calibration_curve::Union{Missing,Curve} = lift(Curve,
                                                                  discrimination_calibration_curve)
    discrimination_calibration_score::Union{Missing,Float64}
    ea_kappa::Union{Missing,Float64}
end

A Legolas record representing metrics calculated over predicted hard labels. See also get_hardened_metrics, get_hardened_metrics_multirater, and get_hardened_metrics_multiclass.

source
Lighthouse.get_hardened_metrics (Function)
get_hardened_metrics(predicted_hard_labels, elected_hard_labels, class_index;
                     class_labels=missing)

Return a HardenedMetricsV1 record calculated for the given class_index, with the following field guaranteed to be non-missing: expert-algorithm agreement (ea_kappa).

source
Lighthouse.get_hardened_metrics_multirater (Function)
get_hardened_metrics_multirater(predicted_hard_labels, elected_hard_labels, class_index;
                                class_labels=missing)

Return a HardenedMetricsV1 record calculated for the given class_index. In addition to the metrics calculated by get_hardened_metrics, also calculates discrimination_calibration_curve and discrimination_calibration_score.

source
Lighthouse.get_hardened_metrics_multiclass (Function)
get_hardened_metrics_multiclass(predicted_hard_labels, elected_hard_labels,
                                class_count; class_labels=missing)

Return a HardenedMetricsV1 record calculated over all class_count classes. Calculates expert-algorithm agreement (ea_kappa) over all classes, as well as the multiclass confusion_matrix.

source
Lighthouse.LabelMetricsV1 (Type)
@version LabelMetricsV1 > ClassV1 begin
    ira_kappa::Union{Missing,Float64}
    per_expert_discrimination_calibration_curves::Union{Missing,Vector{Curve}} = lift(v -> Curve.(v),
                                                                                      per_expert_discrimination_calibration_curves)
    per_expert_discrimination_calibration_scores::Union{Missing,Vector{Float64}}
end

A Legolas record representing metrics calculated over labels provided by multiple labelers. See also get_label_metrics_multirater and get_label_metrics_multirater_multiclass.

source
Lighthouse.get_label_metrics_multirater (Function)
get_label_metrics_multirater(votes, class_index; class_labels=missing)

Return a LabelMetricsV1 record calculated for the given class_index, with the following fields guaranteed to be non-missing: per_expert_discrimination_calibration_curves, per_expert_discrimination_calibration_scores, and the interrater-agreement kappa (ira_kappa).

source
Lighthouse._evaluation_record (Function)
_evaluation_record(tradeoff_metrics_table, hardened_metrics_table, label_metrics_table;
                   optimal_threshold_class=missing, class_labels, thresholds,
                   optimal_threshold, stratified_kappas=missing)

Helper function to create an EvaluationV1 from tables of the constituent metrics schemas, in support of evaluation_metrics_record.

source
Lighthouse._calculate_ea_kappas (Function)
_calculate_ea_kappas(predicted_hard_labels, elected_hard_labels, classes)

Return NamedTuple with keys :per_class_kappas, :multiclass_kappa containing the Cohen's Kappa per-class and over all classes, respectively. The value of output key :per_class_kappas is an Array such that item i is the Cohen's kappa calculated for class i.

Where...

  • predicted_hard_labels is a vector of hard labels where the ith element is the hard label predicted by the model for sample i in the evaulation set.

  • elected_hard_labels is a vector of hard labels where the ith element is the hard label elected as "ground truth" for sample i in the evaulation set.

  • classes are all possible classes.

source
Lighthouse._calculate_ira_kappas (Function)
_calculate_ira_kappas(votes, classes)

Return NamedTuple with keys :per_class_IRA_kappas, :multiclass_IRA_kappas containing the Cohen's Kappa for inter-rater agreement (IRA) per-class and over all classes, respectively. The value of output key :per_class_IRA_kappas is an Array such that item i is the IRA kappa calculated for class i.

Where...

  • votes is a matrix of hard labels whose columns correspond to voters and whose rows correspond to the samples in the test set that have been voted on. If votes[sample, voter] is not a valid hard label for model, then voter will simply be considered to have not assigned a hard label to sample.

  • classes are all possible classes voted on.

Returns (per_class_IRA_kappas=missing, multiclass_IRA_kappas=missing) if votes has only a single voter (i.e., a single column) or if no two voters rated the same sample. Note that vote entries of 0 are taken to mean that the voter did not rate that sample.

source
Lighthouse._calculate_spearman_correlation (Function)
_calculate_spearman_correlation(predicted_soft_labels, votes, classes)

Return NamedTuple with keys :ρ, :n, :ci_lower, and :ci_upper, giving the Spearman correlation coefficient ρ and its 95% confidence interval bounds. Only valid for binary classification problems (i.e., length(classes) == 2).

Where...

  • predicted_soft_labels is a matrix of soft labels whose columns correspond to the two classes and whose rows correspond to the samples in the test set that have been classified. For a given sample, the two class column values must sum to 1 (i.e., softmax has been applied to the classification output).

  • votes is a matrix of hard labels whose columns correspond to voters and whose rows correspond to the samples in the test set that have been voted on. If votes[sample, voter] is not a valid hard label for model, then voter will simply be considered to have not assigned a hard label to sample. May contain a single voter (i.e., a single column).

  • classes are the two classes voted on.

source

Utilities

Lighthouse.majority (Function)
majority([rng::AbstractRNG=Random.GLOBAL_RNG], hard_labels, among::UnitRange)

Return the majority label within among out of hard_labels:

julia> majority([1, 2, 1, 3, 2, 2, 3], 1:3)
2

julia> majority([1, 2, 1, 3, 2, 2, 3, 4], 3:4)
3

In the event of a tie, a winner is randomly selected from the tied labels via rng.

source
Lighthouse.area_under_curve (Function)
area_under_curve(x, y)

Calculates the area under the curve specified by the x vector and y vector using the trapezoidal rule. If inputs are empty, return missing.

source
Lighthouse.area_under_curve_unit_square (Function)
area_under_curve_unit_square(x, y)

Calculates the area under the curve specified by the x vector and y vector for a unit square, using the trapezoidal rule. If inputs are empty, return missing.

source
Lighthouse.Curve (Type)
Curve(x, y)

Represents a (plot) curve of x and y points.

When constructing a Curve, missing's are replaced with NaN, and values are converted to Float64. Curve objects c support iteration, x, y = c, and indexing, x = c[1], y = c[2].

source
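
To illustrate the last two utilities together (the point values are arbitrary):

x = [0.0, 0.5, 1.0]
y = [0.0, 0.5, 1.0]
Lighthouse.area_under_curve(x, y)  # trapezoidal rule gives 0.5

curve = Lighthouse.Curve(x, y)
cx, cy = curve   # iteration yields the x and y vectors
curve[1] == cx   # indexing retrieves the same vectors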