diesel/connection/statement_cache/mod.rs
//! Helper types for prepared statement caching
//!
//! A primer on prepared statement caching in Diesel
//! ------------------------------------------------
//!
//! Diesel uses prepared statements for virtually all queries. This is most
//! visible in our lack of any sort of "quoting" API. Values must always be
//! transmitted as bind parameters; we do not support direct interpolation. The
//! only method in the public API that doesn't require the use of prepared
//! statements is [`SimpleConnection::batch_execute`](super::SimpleConnection::batch_execute).
//!
//! In order to avoid the cost of re-parsing and planning subsequent queries,
//! Diesel caches the prepared statement whenever possible by default. This
//! can be customized by calling
//! [`Connection::set_cache_size`](super::Connection::set_cache_size).
//!
//! Queries will fall into one of three buckets:
//!
//! - Unsafe to cache
//! - Cached by SQL
//! - Cached by type
//!
//! A query is considered unsafe to cache if it represents a potentially
//! unbounded number of queries. This is communicated to the connection through
//! [`QueryFragment::is_safe_to_cache_prepared`]. While this is done as a full AST
//! pass, after monomorphisation and inlining this will usually be optimized to
//! a constant. Only boxed queries will need to do actual work to answer this
//! question.
//!
//! The majority of AST nodes are safe to cache if their components are safe to
//! cache. There are at least 4 cases where a query is unsafe to cache (a short
//! illustration follows the list):
//!
//! - queries containing `IN` with bind parameters
//!    - This requires 1 bind parameter per value, and is therefore unbounded
//!    - `IN` with subselects are cached (assuming the subselect is safe to
//!      cache)
//!    - `IN` statements for postgresql are cached as they use `= ANY($1)` instead,
//!      which does not cause an unbounded number of binds
//! - `INSERT` statements with a variable number of rows
//!    - The SQL varies based on the number of rows being inserted.
//! - `UPDATE` statements
//!    - Technically it's bounded on "number of optional values being passed to
//!      `SET` factorial" but that's still quite high, and not worth caching
//!      for the same reason as single row inserts
//! - `SqlLiteral` nodes
//!    - We have no way of knowing whether the SQL was generated dynamically or
//!      not, so we must assume that it's unbounded
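//!
//! As a short illustration (assuming a `users` table generated via
//! `diesel::table!`; the exact SQL shown is backend dependent):
//!
//! ```rust,ignore
//! // `eq_any` on SQLite/MySQL expands to `IN (?, ?, ?)`, so the SQL text and
//! // the number of binds depend on the length of the vector -> unsafe to cache
//! users::table.filter(users::id.eq_any(vec![1, 2, 3]));
//!
//! // a plain equality comparison always uses exactly one bind parameter,
//! // regardless of the value -> safe to cache
//! users::table.filter(users::id.eq(42));
//! ```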
//!
//! For queries which are unsafe to cache, the statement cache will never insert
//! them. They will be prepared and immediately released after use (or in the
//! case of PG they will use the unnamed prepared statement).
//!
//! For statements which are able to be cached, we then have to determine what
//! to use as the cache key. The standard method that virtually all ORMs or
//! database access layers use in the wild is to store the statements in a
//! hash map, using the SQL as the key.
//!
//! However, the majority of queries using Diesel that are safe to cache as
//! prepared statements will be uniquely identified by their type. For these
//! queries, we can bypass the query builder entirely. Since our AST is
//! generally optimized away by the compiler, for these queries the cost of
//! fetching a prepared statement from the cache is the cost of [`HashMap<u32,
//! _>::get`], where the key we're fetching by is a compile time constant. For
//! these types, the AST pass to gather the bind parameters will also be
//! optimized to accessing each parameter individually.
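//!
//! A hedged illustration (again assuming a `diesel::table!` generated `users`
//! table): the plain query below has a static query id and is cached by type,
//! while boxing it erases that type information, so the boxed version falls
//! back to being cached by its generated SQL string and bind types:
//!
//! ```rust,ignore
//! users::table.filter(users::id.eq(42));              // cached by type id
//! users::table.filter(users::id.eq(42)).into_boxed(); // cached by SQL + bind types
//! ```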
//!
//! Determining if a query can be cached by type is the responsibility of the
//! [`QueryId`] trait. This trait is quite similar to `Any`, but with a few
//! differences (a sketch follows the list):
//!
//! - No `'static` bound
//!    - Something being a reference never changes the SQL that is generated,
//!      so `&T` has the same query id as `T`.
//! - `Option<TypeId>` instead of `TypeId`
//!    - We need to be able to constrain on this trait being implemented, but
//!      not all types will actually have a static query id. Hopefully once
//!      specialization is stable we can remove the `QueryId` bound and
//!      specialize on it instead (or provide a blanket impl for all `T`)
//! - Implementors give a broader type than `Self`
//!    - This really only affects bind parameters. There are 6 different Rust
//!      types which can be used for a parameter of type `timestamp`. The same
//!      statement can be used regardless of the Rust type, so [`Bound<ST, T>`](crate::expression::bound::Bound)
//!      defines its [`QueryId`] as [`Bound<ST, ()>`](crate::expression::bound::Bound).
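//!
//! A rough, simplified sketch of the reference case above (see the actual
//! [`QueryId`] definitions for the precise form):
//!
//! ```rust,ignore
//! // `&T` delegates to `T`, so taking a reference never changes the query id
//! impl<'a, T: QueryId> QueryId for &'a T {
//!     type QueryId = T::QueryId;
//!     const HAS_STATIC_QUERY_ID: bool = T::HAS_STATIC_QUERY_ID;
//! }
//! ```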
//!
//! Whether a type returns `Some(id)` or `None` for its query ID depends on
//! whether the SQL it generates can change without the type changing. At the
//! moment, the only type which is safe to cache as a prepared statement but
//! does not have a static query ID is something which has been boxed.
//!
//! One potential optimization that we don't perform is storing the queries
//! which are cached by type ID in a separate map. Since a type ID is a u64,
//! this would allow us to use a specialized map which knows that there will
//! never be hashing collisions (also known as a perfect hashing function),
//! which would mean lookups are always constant time. However, this would save
//! nanoseconds on an operation that will take microseconds or even
//! milliseconds.

use std::any::TypeId;
use std::borrow::Cow;
use std::collections::hash_map::Entry;
use std::hash::Hash;
use std::ops::{Deref, DerefMut};

use strategy::{
    LookupStatementResult, StatementCacheStrategy, WithCacheStrategy, WithoutCacheStrategy,
};

use crate::backend::Backend;
use crate::connection::InstrumentationEvent;
use crate::query_builder::*;
use crate::result::QueryResult;

use super::{CacheSize, Instrumentation};

/// Various interfaces and implementations to control connection statement caching.
#[allow(unreachable_pub)]
pub mod strategy;

/// A prepared statement cache
#[allow(missing_debug_implementations, unreachable_pub)]
#[cfg_attr(
    docsrs,
    doc(cfg(feature = "i-implement-a-third-party-backend-and-opt-into-breaking-changes"))
)]
pub struct StatementCache<DB: Backend, Statement> {
    cache: Box<dyn StatementCacheStrategy<DB, Statement>>,
    // increment every time a query is cached
    // some backends might use it to create unique prepared statement names
    cache_counter: u64,
}

/// A helper type that indicates if a certain query
/// is cached inside of the prepared statement cache or not
///
/// This information can be used by the connection implementation
/// to signal this fact to the database while actually
/// preparing the statement
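///
/// A hedged sketch of how a backend might consume this (the statement name
/// format below is purely illustrative, not a guarantee of what any backend
/// actually does):
///
/// ```rust,ignore
/// let name = match prepare_for_cache {
///     // derive a unique prepared statement name from the counter
///     PrepareForCache::Yes { counter } => Some(format!("__diesel_stmt_{counter}")),
///     // e.g. fall back to the unnamed prepared statement on PostgreSQL
///     PrepareForCache::No => None,
/// };
/// ```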
#[derive(Debug, Clone, Copy)]
#[cfg_attr(
    docsrs,
    doc(cfg(feature = "i-implement-a-third-party-backend-and-opt-into-breaking-changes"))
)]
#[allow(unreachable_pub)]
pub enum PrepareForCache {
    /// The statement will be cached
    Yes {
        /// The counter might be used as a unique identifier for the prepared statement.
        #[allow(dead_code)]
        counter: u64,
    },
    /// The statement won't be cached
    No,
}

#[allow(clippy::new_without_default, unreachable_pub)]
impl<DB, Statement> StatementCache<DB, Statement>
where
    DB: Backend + 'static,
    Statement: Send + 'static,
    DB::TypeMetadata: Send + Clone,
    DB::QueryBuilder: Default,
    StatementCacheKey<DB>: Hash + Eq,
{
    /// Create a new prepared statement cache using [`CacheSize::Unbounded`] as the caching strategy.
    #[allow(unreachable_pub)]
    pub fn new() -> Self {
        StatementCache {
            cache: Box::new(WithCacheStrategy::default()),
            cache_counter: 0,
        }
    }

    /// Set the caching strategy from one of the predefined implementations
    pub fn set_cache_size(&mut self, size: CacheSize) {
        if self.cache.cache_size() != size {
            self.cache = match size {
                CacheSize::Unbounded => Box::new(WithCacheStrategy::default()),
                CacheSize::Disabled => Box::new(WithoutCacheStrategy::default()),
            }
        }
    }

    /// Set a custom caching strategy. This is used in tests to verify the caching logic
    #[allow(dead_code)]
    pub(crate) fn set_strategy<Strategy>(&mut self, s: Strategy)
    where
        Strategy: StatementCacheStrategy<DB, Statement> + 'static,
    {
        self.cache = Box::new(s);
    }

    /// Prepare a query as a prepared statement
    ///
    /// This function returns a prepared statement corresponding to the
    /// query passed as `source` with the bind values passed as `bind_types`.
    /// If the query is already cached inside this prepared statement cache
    /// the cached prepared statement will be returned, otherwise `prepare_fn`
    /// will be called to create a new prepared statement for this query source.
    /// The first parameter of the callback contains the query string, the second
    /// parameter indicates if the constructed prepared statement will be cached or not.
    /// See the [module](self) documentation for details
    /// about which statements are cached and which are not cached.
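    ///
    /// A hedged usage sketch (the `raw_connection`, `RawStatement::prepare` and
    /// `instrumentation` names below are illustrative placeholders, not part of
    /// this API):
    ///
    /// ```rust,ignore
    /// let statement = statement_cache.cached_statement(
    ///     &source,
    ///     &backend,
    ///     &bind_types,
    ///     &mut raw_connection,
    ///     |conn, sql, is_for_cache, types| RawStatement::prepare(conn, sql, is_for_cache, types),
    ///     &mut instrumentation,
    /// )?;
    /// ```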
    //
    // Notes:
    // This function explicitly takes a connection and a function pointer (rather than
    // a generic callback) as arguments to ensure that we don't leak generic query types
    // into the prepare function
    #[allow(unreachable_pub)]
    #[cfg(any(
        feature = "i-implement-a-third-party-backend-and-opt-into-breaking-changes",
        feature = "sqlite",
        feature = "mysql"
    ))]
    pub fn cached_statement<'a, T, R, C>(
        &'a mut self,
        source: &T,
        backend: &DB,
        bind_types: &[DB::TypeMetadata],
        conn: C,
        prepare_fn: fn(C, &str, PrepareForCache, &[DB::TypeMetadata]) -> R,
        instrumentation: &mut dyn Instrumentation,
    ) -> R::Return<'a>
    where
        T: QueryFragment<DB> + QueryId,
        R: StatementCallbackReturnType<Statement, C> + 'a,
    {
        self.cached_statement_non_generic(
            T::query_id(),
            source,
            backend,
            bind_types,
            conn,
            prepare_fn,
            instrumentation,
        )
    }

    /// Prepare a query as a prepared statement
    ///
    /// This function closely mirrors `Self::cached_statement` but
    /// eliminates the generic query type in favour of a trait object
    ///
    /// This can be easier to use in situations where you have already turned
    /// the query type into a concrete SQL string
    // Notes:
    // This function explicitly takes a connection and a function pointer (rather than
    // a generic callback) as arguments to ensure that we don't leak generic query types
    // into the prepare function
    #[allow(unreachable_pub)]
    #[allow(clippy::too_many_arguments)] // we need all of them
    pub fn cached_statement_non_generic<'a, R, C>(
        &'a mut self,
        maybe_type_id: Option<TypeId>,
        source: &dyn QueryFragmentForCachedStatement<DB>,
        backend: &DB,
        bind_types: &[DB::TypeMetadata],
        conn: C,
        prepare_fn: fn(C, &str, PrepareForCache, &[DB::TypeMetadata]) -> R,
        instrumentation: &mut dyn Instrumentation,
    ) -> R::Return<'a>
    where
        R: StatementCallbackReturnType<Statement, C> + 'a,
    {
        Self::cached_statement_non_generic_impl(
            self.cache.as_mut(),
            maybe_type_id,
            source,
            backend,
            bind_types,
            conn,
            |conn, sql, is_cached| {
                if is_cached {
                    instrumentation.on_connection_event(InstrumentationEvent::CacheQuery { sql });
                    self.cache_counter += 1;
                    prepare_fn(
                        conn,
                        sql,
                        PrepareForCache::Yes {
                            counter: self.cache_counter,
                        },
                        bind_types,
                    )
                } else {
                    prepare_fn(conn, sql, PrepareForCache::No, bind_types)
                }
            },
        )
    }

    /// Reduce the amount of monomorphized code by factoring this out via dynamic dispatch.
    /// There will be only one instance of `R` for diesel (and a different single instance
    /// for diesel-async). There will be only one instance per connection type `C` for each
    /// connection that uses this prepared statement impl; this closely correlates to the
    /// types `DB` and `Statement` for the overall statement cache impl
    fn cached_statement_non_generic_impl<'a, R, C>(
        cache: &'a mut dyn StatementCacheStrategy<DB, Statement>,
        maybe_type_id: Option<TypeId>,
        source: &dyn QueryFragmentForCachedStatement<DB>,
        backend: &DB,
        bind_types: &[DB::TypeMetadata],
        conn: C,
        prepare_fn: impl FnOnce(C, &str, bool) -> R,
    ) -> R::Return<'a>
    where
        R: StatementCallbackReturnType<Statement, C> + 'a,
    {
        // this function cannot use the `?` operator
        // as we want to abstract over returning `QueryResult<MaybeCached>` and
        // `impl Future<Output = QueryResult<MaybeCached>>` here
        // to share the prepared statement cache implementation between diesel and
        // diesel_async
        //
        // For this reason we need to match explicitly on each error and call `R::from_error()`
        // to construct the right error return variant
        let cache_key =
            match StatementCacheKey::for_source(maybe_type_id, source, bind_types, backend) {
                Ok(o) => o,
                Err(e) => return R::from_error(e),
            };
        let is_safe_to_cache_prepared = match source.is_safe_to_cache_prepared(backend) {
            Ok(o) => o,
            Err(e) => return R::from_error(e),
        };
        // early return if the statement cannot be cached
        if !is_safe_to_cache_prepared {
            let sql = match cache_key.sql(source, backend) {
                Ok(sql) => sql,
                Err(e) => return R::from_error(e),
            };
            return prepare_fn(conn, &sql, false).map_to_no_cache();
        }
        let entry = cache.lookup_statement(cache_key);
        match entry {
            // The statement is already cached
            LookupStatementResult::CacheEntry(Entry::Occupied(e)) => {
                R::map_to_cache(e.into_mut(), conn)
            }
            // The statement is not cached but there is capacity to cache it
            LookupStatementResult::CacheEntry(Entry::Vacant(e)) => {
                let sql = match e.key().sql(source, backend) {
                    Ok(sql) => sql,
                    Err(e) => return R::from_error(e),
                };
                let st = prepare_fn(conn, &sql, true);
                st.register_cache(|stmt| e.insert(stmt))
            }
            // The statement is not cached and there is no capacity to cache it
            LookupStatementResult::NoCache(cache_key) => {
                let sql = match cache_key.sql(source, backend) {
                    Ok(sql) => sql,
                    Err(e) => return R::from_error(e),
                };
                prepare_fn(conn, &sql, false).map_to_no_cache()
            }
        }
    }
}

/// Implemented for all `QueryFragment`s, dedicated to dynamic dispatch within the context of
/// `statement_cache`
///
/// We want the generated code to be as small as possible, so for each query passed to
/// [`StatementCache::cached_statement`] the generated assembly will just call a non generic
/// version with dynamic dispatch pointing to the VTABLE of this minimal trait
///
/// This preserves the opportunity for the compiler to entirely optimize the `construct_sql`
/// function as a function that simply returns a constant `String`.
#[allow(unreachable_pub)]
#[cfg_attr(
    docsrs,
    doc(cfg(feature = "i-implement-a-third-party-backend-and-opt-into-breaking-changes"))
)]
pub trait QueryFragmentForCachedStatement<DB> {
    /// Convert the query fragment into a SQL string for the given backend
    fn construct_sql(&self, backend: &DB) -> QueryResult<String>;

    /// Check whether it's safe to cache the query
    fn is_safe_to_cache_prepared(&self, backend: &DB) -> QueryResult<bool>;
}

impl<T, DB> QueryFragmentForCachedStatement<DB> for T
where
    DB: Backend,
    DB::QueryBuilder: Default,
    T: QueryFragment<DB>,
{
    fn construct_sql(&self, backend: &DB) -> QueryResult<String> {
        let mut query_builder = DB::QueryBuilder::default();
        self.to_sql(&mut query_builder, backend)?;
        Ok(query_builder.finish())
    }

    fn is_safe_to_cache_prepared(&self, backend: &DB) -> QueryResult<bool> {
        <T as QueryFragment<DB>>::is_safe_to_cache_prepared(self, backend)
    }
}

/// Wraps a possibly cached prepared statement
///
/// Essentially a customized version of [`Cow`]
/// that does not depend on [`ToOwned`]
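///
/// A small hedged illustration of why this shape is convenient: thanks to the
/// `Deref`/`DerefMut` impls below, callers can use the statement without caring
/// which variant they received (`S` stands in for a backend's statement type):
///
/// ```rust,ignore
/// fn use_statement<S>(stmt: &mut MaybeCached<'_, S>) {
///     let _raw: &mut S = &mut **stmt; // works for `Cached` and `CannotCache` alike
/// }
/// ```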
#[allow(missing_debug_implementations, unreachable_pub)]
#[cfg_attr(
    docsrs,
    doc(cfg(feature = "i-implement-a-third-party-backend-and-opt-into-breaking-changes"))
)]
#[non_exhaustive]
pub enum MaybeCached<'a, T: 'a> {
    /// Contains a prepared statement that is not cached
    CannotCache(T),
    /// Contains a reference to a cached prepared statement
    Cached(&'a mut T),
}

/// This trait abstracts over the type returned by the prepare statement function
///
/// The main use-case for this abstraction is to share the same statement cache implementation
/// between diesel and diesel-async.
#[cfg_attr(
    docsrs,
    doc(cfg(feature = "i-implement-a-third-party-backend-and-opt-into-breaking-changes"))
)]
#[allow(unreachable_pub)]
pub trait StatementCallbackReturnType<S: 'static, C> {
    /// The return type of `StatementCache::cached_statement`
    ///
    /// Either a `QueryResult<MaybeCached<S>>` or a future of that result type
    type Return<'a>;

    /// Create the return type from an error
    fn from_error<'a>(e: diesel::result::Error) -> Self::Return<'a>;

    /// Map the callback return type to the `MaybeCached::CannotCache` variant
    fn map_to_no_cache<'a>(self) -> Self::Return<'a>
    where
        Self: 'a;

    /// Map the cached statement to the `MaybeCached::Cached` variant
    fn map_to_cache(stmt: &mut S, conn: C) -> Self::Return<'_>;

    /// Insert the created statement into the cache via the provided callback
    /// and then turn the returned reference into `MaybeCached::Cached`
    fn register_cache<'a>(
        self,
        callback: impl FnOnce(S) -> &'a mut S + Send + 'a,
    ) -> Self::Return<'a>
    where
        Self: 'a;
}

impl<S, C> StatementCallbackReturnType<S, C> for QueryResult<S>
where
    S: 'static,
{
    type Return<'a> = QueryResult<MaybeCached<'a, S>>;

    fn from_error<'a>(e: diesel::result::Error) -> Self::Return<'a> {
        Err(e)
    }

    fn map_to_no_cache<'a>(self) -> Self::Return<'a> {
        self.map(MaybeCached::CannotCache)
    }

    fn map_to_cache(stmt: &mut S, _conn: C) -> Self::Return<'_> {
        Ok(MaybeCached::Cached(stmt))
    }

    fn register_cache<'a>(
        self,
        callback: impl FnOnce(S) -> &'a mut S + Send + 'a,
    ) -> Self::Return<'a>
    where
        Self: 'a,
    {
        Ok(MaybeCached::Cached(callback(self?)))
    }
}

impl<T> Deref for MaybeCached<'_, T> {
    type Target = T;

    fn deref(&self) -> &Self::Target {
        match *self {
            MaybeCached::CannotCache(ref x) => x,
            MaybeCached::Cached(ref x) => x,
        }
    }
}

impl<T> DerefMut for MaybeCached<'_, T> {
    fn deref_mut(&mut self) -> &mut Self::Target {
        match *self {
            MaybeCached::CannotCache(ref mut x) => x,
            MaybeCached::Cached(ref mut x) => x,
        }
    }
}

/// The lookup key used by [`StatementCache`] internally
///
/// This can contain either a type id known at compile time
/// (representing a statically known query) or a query string plus
/// bind parameter types calculated at runtime (for queries
/// that may change depending on their parameters)
#[allow(missing_debug_implementations, unreachable_pub)]
#[derive(Hash, PartialEq, Eq)]
#[cfg_attr(
    docsrs,
    doc(cfg(feature = "i-implement-a-third-party-backend-and-opt-into-breaking-changes"))
)]
pub enum StatementCacheKey<DB: Backend> {
    /// Represents a query known at compile time
    ///
    /// Calculated via [`QueryId::QueryId`]
    Type(TypeId),
    /// Represents a dynamically constructed query
    ///
    /// This variant is used if [`QueryId::HAS_STATIC_QUERY_ID`]
    /// is `false` and [`AstPass::unsafe_to_cache_prepared`] is not
    /// called for a given query.
    Sql {
        /// contains the sql query string
        sql: String,
        /// contains the types of any bind parameter passed to the query
        bind_types: Vec<DB::TypeMetadata>,
    },
}

impl<DB> StatementCacheKey<DB>
where
    DB: Backend,
    DB::QueryBuilder: Default,
    DB::TypeMetadata: Clone,
{
    /// Create a new statement cache key for the given query source
    // Note: Intentionally monomorphic over source.
    #[allow(unreachable_pub)]
    pub fn for_source(
        maybe_type_id: Option<TypeId>,
        source: &dyn QueryFragmentForCachedStatement<DB>,
        bind_types: &[DB::TypeMetadata],
        backend: &DB,
    ) -> QueryResult<Self> {
        match maybe_type_id {
            Some(id) => Ok(StatementCacheKey::Type(id)),
            None => {
                let sql = source.construct_sql(backend)?;
                Ok(StatementCacheKey::Sql {
                    sql,
                    bind_types: bind_types.into(),
                })
            }
        }
    }

    /// Get the sql for a given query source based on this cache key
    ///
    /// This is an optimization that may skip constructing the query string
    /// twice if it's already part of the current cache key
    // Note: Intentionally monomorphic over source.
    #[allow(unreachable_pub)]
    pub fn sql(
        &self,
        source: &dyn QueryFragmentForCachedStatement<DB>,
        backend: &DB,
    ) -> QueryResult<Cow<'_, str>> {
        match *self {
            StatementCacheKey::Type(_) => source.construct_sql(backend).map(Cow::Owned),
            StatementCacheKey::Sql { ref sql, .. } => Ok(Cow::Borrowed(sql)),
        }
    }
}