diesel/connection/statement_cache/mod.rs
//! Helper types for prepared statement caching
//!
//! A primer on prepared statement caching in Diesel
//! ------------------------------------------------
//!
//! Diesel uses prepared statements for virtually all queries. This is most
//! visible in our lack of any sort of "quoting" API. Values must always be
//! transmitted as bind parameters; we do not support direct interpolation. The
//! only method in the public API that doesn't require the use of prepared
//! statements is [`SimpleConnection::batch_execute`](super::SimpleConnection::batch_execute).
//!
//! In order to avoid the cost of re-parsing and planning subsequent queries,
//! by default Diesel caches the prepared statement whenever possible. This
//! behavior can be customized by calling
//! [`Connection::set_cache_size`](super::Connection::set_cache_size).
//!
//! Queries will fall into one of three buckets:
//!
//! - Unsafe to cache
//! - Cached by SQL
//! - Cached by type
//!
//! A query is considered unsafe to cache if it represents a potentially
//! unbounded number of queries. This is communicated to the connection through
//! [`QueryFragment::is_safe_to_cache_prepared`]. While this is done as a full AST
//! pass, after monomorphisation and inlining this will usually be optimized to
//! a constant. Only boxed queries will need to do actual work to answer this
//! question.
//!
//! The majority of AST nodes are safe to cache if their components are safe to
//! cache. There are at least 4 cases where a query is unsafe to cache:
//!
//! - queries containing `IN` with bind parameters
//!     - This requires 1 bind parameter per value, and is therefore unbounded
//!     - `IN` with subselects is cached (assuming the subselect is safe to
//!       cache)
//!     - `IN` statements for PostgreSQL are cached as they use `= ANY($1)` instead,
//!       which does not cause an unbounded number of binds
//! - `INSERT` statements with a variable number of rows
//!     - The SQL varies based on the number of rows being inserted.
//! - `UPDATE` statements
//!     - Technically it's bounded on "number of optional values being passed to
//!       `SET` factorial" but that's still quite high, and not worth caching
//!       for the same reason as single row inserts
//! - `SqlLiteral` nodes
//!     - We have no way of knowing whether the SQL was generated dynamically or
//!       not, so we must assume that it's unbounded
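/*!
As a rough illustration of why `IN` with bind parameters is treated as unbounded, the following sketch renders the SQL fragment for a given number of values. The helpers `in_clause_sql` and `any_clause_sql` are hypothetical and are not part of Diesel's query builder. They only assume `?` placeholders for the list form and the `= ANY($1)` form mentioned above for PostgreSQL.

```rust
/// Hypothetical helper: renders `column IN (?, ?, ...)` with one placeholder per value.
/// Every distinct `value_count` produces a distinct SQL string.
pub fn in_clause_sql(column: &str, value_count: usize) -> String {
    let placeholders = vec!["?"; value_count].join(", ");
    format!("{column} IN ({placeholders})")
}

/// Hypothetical helper: the array form binds all values as a single parameter,
/// so the SQL does not depend on how many values are passed.
pub fn any_clause_sql(column: &str) -> String {
    format!("{column} = ANY($1)")
}
```

Under these assumptions, `in_clause_sql("id", 2)` and `in_clause_sql("id", 3)` yield different statements, while `any_clause_sql("id")` stays the same for any number of values, which is what makes that form cacheable.
*/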
//!
//! For queries which are unsafe to cache, the statement cache will never insert
//! them. They will be prepared and immediately released after use (or in the
//! case of PG they will use the unnamed prepared statement).
//!
//! For statements which are able to be cached, we then have to determine what
//! to use as the cache key. The standard method that virtually all ORMs or
//! database access layers use in the wild is to store the statements in a
//! hash map, using the SQL as the key.
//!
//! However, the majority of queries using Diesel that are safe to cache as
//! prepared statements will be uniquely identified by their type. For these
//! queries, we can bypass the query builder entirely. Since our AST is
//! generally optimized away by the compiler, for these queries the cost of
//! fetching a prepared statement from the cache is the cost of [`HashMap<u32,
//! _>::get`], where the key we're fetching by is a compile time constant. For
//! these types, the AST pass to gather the bind parameters will also be
//! optimized to accessing each parameter individually.
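/*!
The difference between the two cacheable buckets can be sketched with a small, std-only model. `SketchKey` and `SketchCache` are hypothetical names, a plain `String` stands in for a real prepared statement, and the `build_sql` closure stands in for the query builder. This is a simplified illustration under those assumptions, not the implementation found further down in this module.

```rust
use std::any::TypeId;
use std::collections::hash_map::Entry;
use std::collections::HashMap;

/// Hypothetical cache key: either a type id or the rendered SQL.
#[derive(Hash, PartialEq, Eq)]
pub enum SketchKey {
    Type(TypeId),
    Sql(String),
}

/// Hypothetical cache that stores a `String` in place of a real prepared statement.
#[derive(Default)]
pub struct SketchCache {
    statements: HashMap<SketchKey, String>,
    /// Number of times a statement had to be prepared.
    pub prepare_calls: usize,
}

impl SketchCache {
    /// Looks up the statement for a query; `build_sql` stands in for the query builder.
    pub fn statement(
        &mut self,
        type_id: Option<TypeId>,
        build_sql: impl Fn() -> String,
    ) -> &String {
        // With a type id the key is available without rendering any SQL.
        let key = match type_id {
            Some(id) => SketchKey::Type(id),
            None => SketchKey::Sql(build_sql()),
        };
        let statement = match self.statements.entry(key) {
            // Hit: for type-keyed queries the query builder never ran.
            Entry::Occupied(entry) => entry.into_mut(),
            // Miss: reuse the SQL stored in the key when there is one.
            Entry::Vacant(entry) => {
                let sql = match entry.key() {
                    SketchKey::Sql(sql) => sql.clone(),
                    SketchKey::Type(_) => build_sql(),
                };
                self.prepare_calls += 1;
                entry.insert(format!("prepared: {sql}"))
            }
        };
        statement
    }
}
```

In this model a second lookup with the same `Some(type_id)` never invokes `build_sql`, whereas a lookup with `None` has to render the SQL every time just to form the key.
*/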
//!
//! Determining if a query can be cached by type is the responsibility of the
//! [`QueryId`] trait. This trait is quite similar to `Any`, but with a few
//! differences:
//!
//! - No `'static` bound
//!     - Something being a reference never changes the SQL that is generated,
//!       so `&T` has the same query id as `T`.
//! - `Option<TypeId>` instead of `TypeId`
//!     - We need to be able to constrain on this trait being implemented, but
//!       not all types will actually have a static query id. Hopefully once
//!       specialization is stable we can remove the `QueryId` bound and
//!       specialize on it instead (or provide a blanket impl for all `T`)
//! - Implementors give a more broad type than `Self`
//!     - This really only affects bind parameters. There are 6 different Rust
//!       types which can be used for a parameter of type `timestamp`. The same
//!       statement can be used regardless of the Rust type, so [`Bound<ST, T>`](crate::expression::bound::Bound)
//!       defines its [`QueryId`] as [`Bound<ST, ()>`](crate::expression::bound::Bound).
//!
//! A type returning `Some(id)` or `None` for its query ID is based on whether
//! the SQL it generates can change without the type changing. At the moment,
//! the only type which is safe to cache as a prepared statement but does not
//! have a static query ID is something which has been boxed.
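/*!
The three differences from `Any` listed above can be sketched with a simplified stand-in trait. `SketchQueryId`, `Timestamp`, `Bound` and `BoxedQuery` below are hypothetical types written for this illustration only. They mirror the ideas described here and are not the definitions Diesel actually uses.

```rust
use std::any::TypeId;
use std::marker::PhantomData;

/// Hypothetical, simplified stand-in for a query id trait.
pub trait SketchQueryId {
    /// A `'static` type that identifies the generated SQL; may be broader than `Self`.
    type Id: 'static;
    /// Whether the type alone determines the generated SQL.
    const HAS_STATIC_ID: bool;

    /// Returns `None` when the SQL can change without the type changing.
    fn sketch_query_id() -> Option<TypeId> {
        if Self::HAS_STATIC_ID {
            Some(TypeId::of::<Self::Id>())
        } else {
            None
        }
    }
}

/// Hypothetical SQL type marker.
pub struct Timestamp;

impl SketchQueryId for Timestamp {
    type Id = Timestamp;
    const HAS_STATIC_ID: bool = true;
}

/// Hypothetical bind parameter of SQL type `ST` holding a Rust value of type `T`.
pub struct Bound<ST, T> {
    pub value: T,
    _sql_type: PhantomData<ST>,
}

// The Rust value type is erased to `()`, so every `T` shares one id per SQL type.
impl<ST: SketchQueryId, T> SketchQueryId for Bound<ST, T> {
    type Id = Bound<ST::Id, ()>;
    const HAS_STATIC_ID: bool = ST::HAS_STATIC_ID;
}

// A reference never changes the SQL, so `&T` reuses the id of `T`.
impl<T: SketchQueryId + ?Sized> SketchQueryId for &T {
    type Id = T::Id;
    const HAS_STATIC_ID: bool = T::HAS_STATIC_ID;
}

/// Hypothetical boxed query: its SQL is only known at runtime.
pub struct BoxedQuery {
    pub sql: String,
}

impl SketchQueryId for BoxedQuery {
    type Id = ();
    const HAS_STATIC_ID: bool = false;
}
```

With these definitions, `Bound<Timestamp, u64>` and `Bound<Timestamp, String>` report the same id, `&Timestamp` reports the id of `Timestamp`, and `BoxedQuery` reports `None`.
*/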
//!
//! One potential optimization that we don't perform is storing the queries
//! which are cached by type ID in a separate map. Since a type ID is a u64,
//! this would allow us to use a specialized map which knows that there will
//! never be hashing collisions (also known as a perfect hashing function),
//! which would mean lookups are always constant time. However, this would save
//! nanoseconds on an operation that will take microseconds or even
//! milliseconds.

use std::any::TypeId;
use std::borrow::Cow;
use std::collections::hash_map::Entry;
use std::hash::Hash;
use std::ops::{Deref, DerefMut};

use strategy::{
    LookupStatementResult, StatementCacheStrategy, WithCacheStrategy, WithoutCacheStrategy,
};

use crate::backend::Backend;
use crate::connection::InstrumentationEvent;
use crate::query_builder::*;
use crate::result::QueryResult;

use super::{CacheSize, Instrumentation};

/// Various interfaces and implementations to control connection statement caching.
#[allow(unreachable_pub)]
pub mod strategy;

/// A prepared statement cache
#[allow(missing_debug_implementations, unreachable_pub)]
#[cfg_attr(
    docsrs,
    doc(cfg(feature = "i-implement-a-third-party-backend-and-opt-into-breaking-changes"))
)]
pub struct StatementCache<DB: Backend, Statement> {
    cache: Box<dyn StatementCacheStrategy<DB, Statement>>,
    // increment every time a query is cached
    // some backends might use it to create unique prepared statement names
    cache_counter: u64,
}

/// A helper type that indicates if a certain query
/// is cached inside of the prepared statement cache or not
///
/// This information can be used by the connection implementation
/// to signal this fact to the database while actually
/// preparing the statement
#[derive(Debug, Clone, Copy)]
#[cfg_attr(
    docsrs,
    doc(cfg(feature = "i-implement-a-third-party-backend-and-opt-into-breaking-changes"))
)]
#[allow(unreachable_pub)]
pub enum PrepareForCache {
    /// The statement will be cached
    Yes {
        /// Counter might be used as unique identifier for prepared statement.
        #[allow(dead_code)]
        counter: u64,
    },
    /// The statement won't be cached
    No,
}

#[allow(clippy::new_without_default, unreachable_pub)]
impl<DB, Statement> StatementCache<DB, Statement>
where
    DB: Backend + 'static,
    Statement: Send + 'static,
    DB::TypeMetadata: Send + Clone,
    DB::QueryBuilder: Default,
    StatementCacheKey<DB>: Hash + Eq,
{
    /// Create a new prepared statement cache using [`CacheSize::Unbounded`] as caching strategy.
    #[allow(unreachable_pub)]
    pub fn new() -> Self {
        StatementCache {
            cache: Box::new(WithCacheStrategy::default()),
            cache_counter: 0,
        }
    }

    /// Set caching strategy from predefined implementations
    pub fn set_cache_size(&mut self, size: CacheSize) {
        if self.cache.cache_size() != size {
            self.cache = match size {
                CacheSize::Unbounded => Box::new(WithCacheStrategy::default()),
                CacheSize::Disabled => Box::new(WithoutCacheStrategy::default()),
            }
        }
    }

    /// Set a custom caching strategy. It is used in tests to verify caching logic.
    #[allow(dead_code)]
    pub(crate) fn set_strategy<Strategy>(&mut self, s: Strategy)
    where
        Strategy: StatementCacheStrategy<DB, Statement> + 'static,
    {
        self.cache = Box::new(s);
    }

    /// Prepare a query as a prepared statement
    ///
    /// This function returns a prepared statement corresponding to the
    /// query passed as `source` with the bind values passed as `bind_types`.
    /// If the query is already cached inside this prepared statement cache,
    /// the cached prepared statement will be returned; otherwise `prepare_fn`
    /// will be called to create a new prepared statement for this query source.
    /// The first parameter of the callback is the connection, the second contains
    /// the query string, the third indicates whether the constructed prepared
    /// statement will be cached, and the fourth contains the bind types.
    /// See the [module](self) documentation for details
    /// about which statements are cached and which are not cached.
    //
    // Notes:
    // This function takes explicitly a connection and a function pointer (and no generic callback)
    // as argument to ensure that we don't leak generic query types into the prepare function
    #[allow(unreachable_pub)]
    pub fn cached_statement<'a, T, R, C>(
        &'a mut self,
        source: &T,
        backend: &DB,
        bind_types: &[DB::TypeMetadata],
        conn: C,
        prepare_fn: fn(C, &str, PrepareForCache, &[DB::TypeMetadata]) -> R,
        instrumentation: &mut dyn Instrumentation,
    ) -> R::Return<'a>
    where
        T: QueryFragment<DB> + QueryId,
        R: StatementCallbackReturnType<Statement, C> + 'a,
    {
        self.cached_statement_non_generic(
            T::query_id(),
            source,
            backend,
            bind_types,
            conn,
            prepare_fn,
            instrumentation,
        )
    }

    /// Prepare a query as a prepared statement
    ///
    /// This function closely mirrors `Self::cached_statement` but
    /// eliminates the generic query type in favour of a trait object.
    ///
    /// This can be easier to use in situations where you have already turned
    /// the query type into a concrete SQL string.
    //
    // Notes:
    // This function takes explicitly a connection and a function pointer (and no generic callback)
    // as argument to ensure that we don't leak generic query types into the prepare function
    #[allow(unreachable_pub)]
    #[allow(clippy::too_many_arguments)] // we need all of them
    pub fn cached_statement_non_generic<'a, R, C>(
        &'a mut self,
        maybe_type_id: Option<TypeId>,
        source: &dyn QueryFragmentForCachedStatement<DB>,
        backend: &DB,
        bind_types: &[DB::TypeMetadata],
        conn: C,
        prepare_fn: fn(C, &str, PrepareForCache, &[DB::TypeMetadata]) -> R,
        instrumentation: &mut dyn Instrumentation,
    ) -> R::Return<'a>
    where
        R: StatementCallbackReturnType<Statement, C> + 'a,
    {
        Self::cached_statement_non_generic_impl(
            self.cache.as_mut(),
            maybe_type_id,
            source,
            backend,
            bind_types,
            conn,
            |conn, sql, is_cached| {
                if is_cached {
                    instrumentation.on_connection_event(InstrumentationEvent::CacheQuery { sql });
                    self.cache_counter += 1;
                    prepare_fn(
                        conn,
                        sql,
                        PrepareForCache::Yes {
                            counter: self.cache_counter,
                        },
                        bind_types,
                    )
                } else {
                    prepare_fn(conn, sql, PrepareForCache::No, bind_types)
                }
            },
        )
    }

    /// Reduce the amount of monomorphized code by factoring this via dynamic dispatch.
    ///
    /// There will be only one instance of `R` for diesel (and a different single instance
    /// for diesel-async). There will be only one instance per connection type `C` for each
    /// connection that uses this prepared statement impl; this closely correlates to the
    /// types `DB` and `Statement` for the overall statement cache impl.
    fn cached_statement_non_generic_impl<'a, R, C>(
        cache: &'a mut dyn StatementCacheStrategy<DB, Statement>,
        maybe_type_id: Option<TypeId>,
        source: &dyn QueryFragmentForCachedStatement<DB>,
        backend: &DB,
        bind_types: &[DB::TypeMetadata],
        conn: C,
        prepare_fn: impl FnOnce(C, &str, bool) -> R,
    ) -> R::Return<'a>
    where
        R: StatementCallbackReturnType<Statement, C> + 'a,
    {
        // this function cannot use the `?` operator
        // as we want to abstract over returning `QueryResult<MaybeCached>` and
        // `impl Future<Output = QueryResult<MaybeCached>>` here
        // to share the prepared statement cache implementation between diesel and
        // diesel_async
        //
        // For this reason we need to match explicitly on each error and call `R::from_error()`
        // to construct the right error return variant
        let cache_key =
            match StatementCacheKey::for_source(maybe_type_id, source, bind_types, backend) {
                Ok(o) => o,
                Err(e) => return R::from_error(e),
            };
        let is_safe_to_cache_prepared = match source.is_safe_to_cache_prepared(backend) {
            Ok(o) => o,
            Err(e) => return R::from_error(e),
        };
        // early return if the statement cannot be cached
        if !is_safe_to_cache_prepared {
            let sql = match cache_key.sql(source, backend) {
                Ok(sql) => sql,
                Err(e) => return R::from_error(e),
            };
            return prepare_fn(conn, &sql, false).map_to_no_cache();
        }
        let entry = cache.lookup_statement(cache_key);
        match entry {
            // The statement is already cached
            LookupStatementResult::CacheEntry(Entry::Occupied(e)) => {
                R::map_to_cache(e.into_mut(), conn)
            }
            // The statement is not cached but there is capacity to cache it
            LookupStatementResult::CacheEntry(Entry::Vacant(e)) => {
                let sql = match e.key().sql(source, backend) {
                    Ok(sql) => sql,
                    Err(e) => return R::from_error(e),
                };
                let st = prepare_fn(conn, &sql, true);
                st.register_cache(|stmt| e.insert(stmt))
            }
            // The statement is not cached and there is no capacity to cache it
            LookupStatementResult::NoCache(cache_key) => {
                let sql = match cache_key.sql(source, backend) {
                    Ok(sql) => sql,
                    Err(e) => return R::from_error(e),
                };
                prepare_fn(conn, &sql, false).map_to_no_cache()
            }
        }
    }
}

/// Implemented for all `QueryFragment`s, dedicated to dynamic dispatch within the context of
/// `statement_cache`
///
/// We want the generated code to be as small as possible, so for each query passed to
/// [`StatementCache::cached_statement`] the generated assembly will just call a non generic
/// version with dynamic dispatch pointing to the VTABLE of this minimal trait
///
/// This preserves the opportunity for the compiler to entirely optimize the `construct_sql`
/// function as a function that simply returns a constant `String`.
#[allow(unreachable_pub)]
#[cfg_attr(
    docsrs,
    doc(cfg(feature = "i-implement-a-third-party-backend-and-opt-into-breaking-changes"))
)]
pub trait QueryFragmentForCachedStatement<DB> {
    /// Convert the query fragment into a SQL string for the given backend
    fn construct_sql(&self, backend: &DB) -> QueryResult<String>;

    /// Check whether it's safe to cache the query
    fn is_safe_to_cache_prepared(&self, backend: &DB) -> QueryResult<bool>;
}

impl<T, DB> QueryFragmentForCachedStatement<DB> for T
where
    DB: Backend,
    DB::QueryBuilder: Default,
    T: QueryFragment<DB>,
{
    fn construct_sql(&self, backend: &DB) -> QueryResult<String> {
        let mut query_builder = DB::QueryBuilder::default();
        self.to_sql(&mut query_builder, backend)?;
        Ok(query_builder.finish())
    }

    fn is_safe_to_cache_prepared(&self, backend: &DB) -> QueryResult<bool> {
        <T as QueryFragment<DB>>::is_safe_to_cache_prepared(self, backend)
    }
}

/// Wraps a possibly cached prepared statement
///
/// Essentially a customized version of [`Cow`]
/// that does not depend on [`ToOwned`]
#[allow(missing_debug_implementations, unreachable_pub)]
#[cfg_attr(
    docsrs,
    doc(cfg(feature = "i-implement-a-third-party-backend-and-opt-into-breaking-changes"))
)]
#[non_exhaustive]
pub enum MaybeCached<'a, T: 'a> {
    /// Contains a prepared statement that is not cached
    CannotCache(T),
    /// Contains a reference to a cached prepared statement
    Cached(&'a mut T),
}

/// This trait abstracts over the type returned by the prepare statement function
///
/// The main use-case for this abstraction is to share the same statement cache implementation
/// between diesel and diesel-async.
#[cfg_attr(
    docsrs,
    doc(cfg(feature = "i-implement-a-third-party-backend-and-opt-into-breaking-changes"))
)]
#[allow(unreachable_pub)]
pub trait StatementCallbackReturnType<S: 'static, C> {
    /// The return type of `StatementCache::cached_statement`
    ///
    /// Either a `QueryResult<MaybeCached<S>>` or a future of that result type
    type Return<'a>;

    /// Create the return type from an error
    fn from_error<'a>(e: diesel::result::Error) -> Self::Return<'a>;

    /// Map the callback return type to the `MaybeCached::CannotCache` variant
    fn map_to_no_cache<'a>(self) -> Self::Return<'a>
    where
        Self: 'a;

    /// Map the cached statement to the `MaybeCached::Cached` variant
    fn map_to_cache(stmt: &mut S, conn: C) -> Self::Return<'_>;

    /// Insert the created statement into the cache via the provided callback
    /// and then turn the returned reference into `MaybeCached::Cached`
    fn register_cache<'a>(
        self,
        callback: impl FnOnce(S) -> &'a mut S + Send + 'a,
    ) -> Self::Return<'a>
    where
        Self: 'a;
}

impl<S, C> StatementCallbackReturnType<S, C> for QueryResult<S>
where
    S: 'static,
{
    type Return<'a> = QueryResult<MaybeCached<'a, S>>;

    fn from_error<'a>(e: diesel::result::Error) -> Self::Return<'a> {
        Err(e)
    }

    fn map_to_no_cache<'a>(self) -> Self::Return<'a> {
        self.map(MaybeCached::CannotCache)
    }

    fn map_to_cache(stmt: &mut S, _conn: C) -> Self::Return<'_> {
        Ok(MaybeCached::Cached(stmt))
    }

    fn register_cache<'a>(
        self,
        callback: impl FnOnce(S) -> &'a mut S + Send + 'a,
    ) -> Self::Return<'a>
    where
        Self: 'a,
    {
        Ok(MaybeCached::Cached(callback(self?)))
    }
}

impl<T> Deref for MaybeCached<'_, T> {
    type Target = T;

    fn deref(&self) -> &Self::Target {
        match *self {
            MaybeCached::CannotCache(ref x) => x,
            MaybeCached::Cached(ref x) => x,
        }
    }
}

impl<T> DerefMut for MaybeCached<'_, T> {
    fn deref_mut(&mut self) -> &mut Self::Target {
        match *self {
            MaybeCached::CannotCache(ref mut x) => x,
            MaybeCached::Cached(ref mut x) => x,
        }
    }
}

/// The lookup key used by [`StatementCache`] internally
///
/// This can contain either a type id known at compile time
/// (representing a statically known query) or a query string
/// plus parameter types calculated at runtime (for queries
/// that may change depending on their parameters)
#[allow(missing_debug_implementations, unreachable_pub)]
#[derive(Hash, PartialEq, Eq)]
#[cfg_attr(
    docsrs,
    doc(cfg(feature = "i-implement-a-third-party-backend-and-opt-into-breaking-changes"))
)]
pub enum StatementCacheKey<DB: Backend> {
    /// Represents a query known at compile time
    ///
    /// Calculated via [`QueryId::QueryId`]
    Type(TypeId),
    /// Represents a dynamically constructed query
    ///
    /// This variant is used if [`QueryId::HAS_STATIC_QUERY_ID`]
    /// is `false` and [`AstPass::unsafe_to_cache_prepared`] is not
    /// called for a given query.
    Sql {
        /// contains the sql query string
        sql: String,
        /// contains the types of any bind parameter passed to the query
        bind_types: Vec<DB::TypeMetadata>,
    },
}

impl<DB> StatementCacheKey<DB>
where
    DB: Backend,
    DB::QueryBuilder: Default,
    DB::TypeMetadata: Clone,
{
    /// Create a new statement cache key for the given query source
    // Note: Intentionally monomorphic over source.
    #[allow(unreachable_pub)]
    pub fn for_source(
        maybe_type_id: Option<TypeId>,
        source: &dyn QueryFragmentForCachedStatement<DB>,
        bind_types: &[DB::TypeMetadata],
        backend: &DB,
    ) -> QueryResult<Self> {
        match maybe_type_id {
            Some(id) => Ok(StatementCacheKey::Type(id)),
            None => {
                let sql = source.construct_sql(backend)?;
                Ok(StatementCacheKey::Sql {
                    sql,
                    bind_types: bind_types.into(),
                })
            }
        }
    }

    /// Get the SQL for a given query source
    ///
    /// This is an optimization that avoids constructing the query string
    /// twice if it's already part of the current cache key
    // Note: Intentionally monomorphic over source.
    #[allow(unreachable_pub)]
    pub fn sql(
        &self,
        source: &dyn QueryFragmentForCachedStatement<DB>,
        backend: &DB,
    ) -> QueryResult<Cow<'_, str>> {
        match *self {
            StatementCacheKey::Type(_) => source.construct_sql(backend).map(Cow::Owned),
            StatementCacheKey::Sql { ref sql, .. } => Ok(Cow::Borrowed(sql)),
        }
    }
}
566}