Upgrading from Legolas v0.4 to v0.5
This guide is incomplete; please add to it if you encounter items which would help other upgraders along their journey.
See here for a comprehensive log of changes from Legolas v0.4 to Legolas v0.5.
Some main changes to be aware of
- In Legolas v0.4, every Legolas.Row field's type was available as a type parameter of Legolas.Row; for example, the type of a field y specified as y::Real in a Legolas.@row declaration would be surfaced like Legolas.Row{..., NamedTuple{(...,:y,...),Tuple{...,typeof(y),...}}}. In Legolas v0.5, the schema version author controls which fields have their types surfaced as type parameters in Legolas-generated record types via the field::(<:F) syntax in Legolas.@version.
  - Additionally, to include type parameters associated with fields in a parent schema, those fields must be re-declared in the child schema. For example, the package LegolasFlux declares a ModelV1 version with a field weights::(<:Union{Missing,Weights}). LegolasFlux includes an example with a schema extension DigitsRowV1 which extends ModelV1. This @version call must re-declare the field weights as parametric in order for the DigitsRowV1 struct to also have a type parameter for this field.
- In Legolas v0.4, @row-generated Legolas.Row constructors accepted and propagated any non-schema-declared fields provided by the caller. In Legolas v0.5, @version-generated record type constructors will discard any non-schema-declared fields provided by the caller. When upgrading code that formerly "implicitly extended" a given schema version by propagating non-declared fields, it is advisable to instead explicitly declare a new extension of the schema version to capture the propagated fields as declared fields; or, if it makes more sense for a given use case, one may instead define a new schema version that adds these propagated fields as declared fields directly to the schema (likely declared as ::Union{Missing,T} to allow them to be missing).
- Before Legolas v0.5, the documented guidance for schema authors surrounding new fields' impact on schema version breakage was misleading, implying that adding a new declared field to an existing schema version is non-breaking if the field's type allowed for Missing values. This is incorrect. For clarity, adding a new declared field to an existing schema version is a breaking change unless the field's type and value are both completely unconstrained in the declaration, i.e. the field's type constraint must be ::Any and may not feature a value-constraining or value-transforming assignment expression.
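The first two changes above can be illustrated with a minimal sketch. The schemas Parent and Child, their fields, and the extra field are hypothetical names chosen for illustration; they are not taken from LegolasFlux. The sketch also assumes that records are constructed from a NamedTuple row.

```julia
using Legolas
using Legolas: @schema, @version

# Hypothetical parent schema, for illustration only.
@schema "example.parent" Parent

@version ParentV1 begin
    x::Int          # plain constraint: no type parameter is generated for x
    y::(<:Real)     # parametric constraint: typeof(y) is surfaced as a type parameter
end

# Hypothetical child schema extending the parent.
@schema "example.child" Child

@version ChildV1 > ParentV1 begin
    y::(<:Real)                  # re-declared so that ChildV1 also carries a parameter for y
    z::Union{Missing,String}     # new declared field that may be missing
end

# Construct records from NamedTuple rows.
p = ParentV1((x=1, y=2.0))

# `extra` is not declared by ChildV1, so the constructor discards it.
c = ChildV1((x=1, y=2.0, z="hi", extra=:dropped))

@show typeof(p)
@show typeof(c)
@show hasproperty(c, :extra)
```

If the y::(<:Real) re-declaration were omitted from ChildV1, the field would still be inherited from ParentV1, but ChildV1 would not have a type parameter for it.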
Deserializing old tables with Legolas v0.5
Generally, tables serialized with earlier versions of Legolas can be deserialized with Legolas v0.5, making the upgrade a "code-breaking" change rather than a "data-breaking" change. However, to be sure, it is strongly suggested to keep reference tests that deserialize and verify checked-in tables serialized before Legolas v0.5.
Additionally, serialized Arrow tables containing nested Legolas-v0.4-defined Legolas.Row values (i.e. a table that contains a row that has a field that is, itself, a Legolas.Row value, or contains such values) require special handling to deserialize under Legolas v0.5 if you want users to be able to deserialize them with Legolas.read using the Legolas-v0.5-ready version of your package. Note that these tables are still deserializable as plain Arrow tables regardless, so it may not be worthwhile to provide a bespoke deprecation/compatibility pathway in the Legolas-v0.5-ready version of your package unless your use case merits it (i.e. the impact surface would be high for your package's users).
If you would like to provide such a pathway, though:
Recall that under Legolas v0.4, @row-generated Legolas.Row constructors may accept and propagate arbitrary non-schema-declared fields, whereas Legolas v0.5's @version-generated record types may only contain schema-declared fields. Therefore, one must decide what to do with any non-declared fields present in serialized Legolas.Row values upon deserialization. A common approach is to implement a deprecation/compatibility pathway within the relevant surrounding @version declaration. For example, this LegolasFlux example uses a function compat_config to handle old Legolas.Row values, but does not add any handling for non-declared fields, which will be discarded if present. If one did not want non-declared fields to be discarded, these fields could be handled by throwing an error or warning, or defining a schema version extension that captured them, or defining a new version of the relevant schema to capture them (e.g. adding a field like extras::Union{Missing, NamedTuple}).
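As a rough sketch of such a pathway, a field assignment in the @version declaration can normalize old nested values as records are constructed. The schemas Config and Run and the helper compat_config below are hypothetical stand-ins, not the LegolasFlux code. The sketch also assumes that an old nested value arrives as a NamedTuple of the nested row's fields; the shape actually produced when deserializing a Legolas-v0.4 Legolas.Row may differ, so check it against your own serialized tables.

```julia
using Legolas
using Legolas: @schema, @version

# Hypothetical nested schema, for illustration only.
@schema "example.config" Config

@version ConfigV1 begin
    rate::Float64
end

# Hypothetical compat helper: values that are already records pass through,
# and anything else row-like is converted, which drops non-declared fields.
compat_config(config::ConfigV1) = config
compat_config(config) = ConfigV1(config)

# Hypothetical outer schema whose `config` field holds a nested record.
@schema "example.run" Run

@version RunV1 begin
    name::String
    config::ConfigV1 = compat_config(config)
end

# Assumed shape of an old row: the nested value is a NamedTuple with one
# field (`note`) that ConfigV1 does not declare.
old = (name="baseline", config=(rate=0.1, note="legacy"))
run = RunV1(old)

@show run.config
```

To keep non-declared fields instead of discarding them, compat_config could raise an error or emit a warning when it encounters one, or ConfigV1 could be superseded by a version that declares a field such as extras::Union{Missing,NamedTuple}.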