📚 This module documents ICU4X constructor signatures.
One of the key differences between ICU4X and its parent projects, ICU4C and ICU4J, is in how it deals with locale data.
In ICU4X, data can always be explicitly passed to any function that requires data. This enables ICU4X to achieve the following value propositions:
- Configurable data sources (machine-readable data file, baked into code, JSON, etc.).
- Dynamic data loading at runtime (load data on demand).
- Reduced overhead and code size (data is resolved locally at each call site).
- Explicit support for multiple ICU4X instances sharing data.
However, as manual data management can be tedious, ICU4X also has a compiled_data default Cargo feature that includes data and makes ICU4X work out of the box.
Consequently, there are 4 versions of all Rust ICU4X functions that use data:
- *
- *_unstable
- *_with_any_provider
- *_with_buffer_provider
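The four constructor shapes can be sketched with a toy model. This is a minimal illustration, not the ICU4X API: the three traits below are simplified stand-ins that only borrow the names DataProvider, AnyProvider and BufferProvider, and FooFormatter, its constructor bodies, the key string and OnePatternProvider are all hypothetical.

```rust
use std::any::Any;

// Hypothetical key and compiled-in data, for illustration only.
const KEY: &str = "foo/pattern@1";
const COMPILED_PATTERN: &str = "#,##0";

/// Toy stand-in for a typed provider (NOT the real icu_provider trait).
trait DataProvider {
    fn load(&self, key: &str) -> Option<String>;
}

/// Toy stand-in for a type-erased provider.
trait AnyProvider {
    fn load_any(&self, key: &str) -> Option<Box<dyn Any>>;
}

/// Toy stand-in for a provider that returns serialized bytes.
trait BufferProvider {
    fn load_buffer(&self, key: &str) -> Option<Vec<u8>>;
}

struct FooFormatter {
    pattern: String,
}

impl FooFormatter {
    /// `*`: no provider argument, uses data compiled into the library.
    fn try_new() -> Option<Self> {
        Some(Self { pattern: COMPILED_PATTERN.to_string() })
    }

    /// `*_unstable`: the provider hands back the typed data directly.
    fn try_new_unstable<P: DataProvider + ?Sized>(provider: &P) -> Option<Self> {
        provider.load(KEY).map(|pattern| Self { pattern })
    }

    /// `*_with_any_provider`: the data arrives type-erased and is downcast.
    fn try_new_with_any_provider<P: AnyProvider + ?Sized>(provider: &P) -> Option<Self> {
        let boxed: Box<String> = provider.load_any(KEY)?.downcast::<String>().ok()?;
        Some(Self { pattern: *boxed })
    }

    /// `*_with_buffer_provider`: the data arrives as bytes and is deserialized
    /// (here: plain UTF-8 decoding stands in for a real deserializer).
    fn try_new_with_buffer_provider<P: BufferProvider + ?Sized>(provider: &P) -> Option<Self> {
        let bytes = provider.load_buffer(KEY)?;
        String::from_utf8(bytes).ok().map(|pattern| Self { pattern })
    }
}

/// Hypothetical provider that serves one pattern through all three toy traits.
struct OnePatternProvider(&'static str);

impl DataProvider for OnePatternProvider {
    fn load(&self, key: &str) -> Option<String> {
        (key == KEY).then(|| self.0.to_string())
    }
}

impl AnyProvider for OnePatternProvider {
    fn load_any(&self, key: &str) -> Option<Box<dyn Any>> {
        (key == KEY).then(|| Box::new(self.0.to_string()) as Box<dyn Any>)
    }
}

impl BufferProvider for OnePatternProvider {
    fn load_buffer(&self, key: &str) -> Option<Vec<u8>> {
        (key == KEY).then(|| self.0.as_bytes().to_vec())
    }
}
```

In this model, `FooFormatter::try_new()` needs no provider at all, while the other three constructors differ only in the trait bound they place on the provider and in how much decoding work they do before the data is usable.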
§Which constructor should I use?
§When to use *
If you don’t want to customize data at runtime (i.e. if you don’t care about code size, updating your data, etc.) you can use the compiled_data Cargo feature and don’t have to think about where your data comes from.
These constructors are sometimes const functions, which lets Rust optimize your usage of ICU4X most effectively.
§When to use *_unstable
Use this constructor if your data provider implements the DataProvider trait for all data structs in current and future ICU4X versions. Examples:
- BakedDataProvider generated for the specific ICU4X minor version
- Anything with a blanket DataProvider impl
Since the exact set of bounds may change at any time, including in minor SemVer releases, it is the client’s responsibility to guarantee that the requirement is upheld.
§When to use *_with_any_provider
Use this constructor if you need to use a provider that implements AnyProvider but not DataProvider. Examples:
- AnyPayloadProvider
- ForkByKeyProvider between two providers implementing AnyProvider
- Providers that cache or override certain keys but not others and therefore can’t implement DataProvider
§When to use *_with_buffer_provider
Use this constructor if your data originates as byte buffers that need to be deserialized.
All such providers should implement BufferProvider. Examples:
- BlobDataProvider
- FsDataProvider
- ForkByKeyProvider between two providers implementing BufferProvider
Please note that you must enable the serde Cargo feature on each crate in which you use the *_with_buffer_provider constructor.
§Data Versioning Policy
The *_with_any_provider and *_with_buffer_provider functions will compile and run successfully if given a data provider supporting all of the keys required for the object being constructed, in either the current or any previous version within the same SemVer major release.
For example, if a data file is built to support FooFormatter version 1.1, then FooFormatter version 1.2 will be able to read the same data file. Likewise, backwards-compatible keys can always be included by icu_datagen to support older library versions.
The *_unstable functions are only guaranteed to work on data built for the exact same minor version of ICU4X. The advantage of the *_unstable functions is that they result in the smallest code size and allow for automatic data slicing when BakedDataProvider is used. However, the type bounds of these functions may change over time, breaking SemVer guarantees. These functions should therefore only be used when you have full control over your data lifecycle at compile time.
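As a rough illustration of the policy above, the two guarantees can be written as predicates over version numbers. This is a simplified sketch under stated assumptions: the Version struct and both function names are hypothetical, and the actual guarantee is expressed in terms of the data keys a provider supports rather than a version comparison. It also models only the baseline guarantee and ignores the case where icu_datagen is asked to include backwards-compatible keys for older library versions.

```rust
/// Hypothetical (major, minor) version pair, for illustration only.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Version {
    major: u32,
    minor: u32,
}

/// Baseline guarantee for `*_with_any_provider` and `*_with_buffer_provider`:
/// data built for the library's own minor version, or for any earlier minor
/// version within the same major release, is guaranteed to work.
fn stable_constructors_guaranteed(data: Version, library: Version) -> bool {
    data.major == library.major && data.minor <= library.minor
}

/// Guarantee for `*_unstable`: only data built for the exact same
/// minor version is guaranteed to work.
fn unstable_constructors_guaranteed(data: Version, library: Version) -> bool {
    data == library
}
```

With data built for 1.1 and a library at 1.2, the first predicate holds and the second does not, which matches the FooFormatter example in the text.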
§Data Providers Over FFI
Over FFI, there is only one data provider type: ICU4XDataProvider. Internally, it is an enum between dyn BufferProvider and a unit compiled data variant.
To control for code size, there are two Cargo features, compiled_data and buffer_provider, that enable the corresponding items in the enum.
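The enum approach can be sketched as follows. This is a simplified model, not the actual FFI source: BufferProvider here is a toy trait that only borrows the name, and ProviderInner, FfiDataProvider, compiled_lookup, FixedBytes and the key string are hypothetical. The Cargo feature gates are shown only as comments so that the sketch compiles on its own.

```rust
/// Toy stand-in for a provider that yields serialized bytes
/// (NOT the real icu_provider trait).
trait BufferProvider {
    fn load_buffer(&self, key: &str) -> Option<Vec<u8>>;
}

/// Hypothetical inner enum with one variant per data source.
enum ProviderInner {
    // Would be gated with #[cfg(feature = "compiled_data")].
    Compiled,
    // Would be gated with #[cfg(feature = "buffer_provider")].
    Buffer(Box<dyn BufferProvider>),
}

/// Hypothetical single provider type exposed to foreign callers.
struct FfiDataProvider(ProviderInner);

impl FfiDataProvider {
    /// Provider backed by data compiled into the library.
    fn compiled() -> Self {
        Self(ProviderInner::Compiled)
    }

    /// Provider backed by any boxed buffer provider.
    fn from_buffer(provider: Box<dyn BufferProvider>) -> Self {
        Self(ProviderInner::Buffer(provider))
    }

    /// Dispatches on the variant at runtime.
    fn load(&self, key: &str) -> Option<Vec<u8>> {
        match &self.0 {
            ProviderInner::Compiled => compiled_lookup(key),
            ProviderInner::Buffer(provider) => provider.load_buffer(key),
        }
    }
}

/// Hypothetical compiled-in data table with a single entry.
fn compiled_lookup(key: &str) -> Option<Vec<u8>> {
    (key == "foo/pattern@1").then(|| b"#,##0".to_vec())
}

/// Hypothetical buffer provider that returns the same bytes for every key.
struct FixedBytes(Vec<u8>);

impl BufferProvider for FixedBytes {
    fn load_buffer(&self, _key: &str) -> Option<Vec<u8>> {
        Some(self.0.clone())
    }
}
```

A foreign caller only ever sees the one wrapper type; which variants exist, and therefore how much code is linked in, would be decided by the enabled Cargo features.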
In Rust ICU4X, a similar enum approach was not taken because:
- Feature-gating the enum branches gets complex across crates.
- Without feature gating, users need to carry Serde code even if they’re not using it, violating one of the core value propositions of ICU4X.
- We could reduce the number of constructors from 4 to 2 but not to 1, so the educational benefit is limited.