// diesel/connection/statement_cache.rs
//! Helper types for prepared statement caching
//!
//! A primer on prepared statement caching in Diesel
//! ------------------------------------------------
//!
//! Diesel uses prepared statements for virtually all queries. This is most
//! visible in our lack of any sort of "quoting" API. Values must always be
//! transmitted as bind parameters; we do not support direct interpolation. The
//! only method in the public API that doesn't require the use of prepared
//! statements is [`SimpleConnection::batch_execute`](super::SimpleConnection::batch_execute).
//!
//! In order to avoid the cost of re-parsing and planning subsequent queries,
//! Diesel caches the prepared statement whenever possible. Queries will fall
//! into one of three buckets:
//!
//! - Unsafe to cache
//! - Cached by SQL
//! - Cached by type
//!
//! A query is considered unsafe to cache if it represents a potentially
//! unbounded number of queries. This is communicated to the connection through
//! [`QueryFragment::is_safe_to_cache_prepared`]. While this is done as a full AST
//! pass, after monomorphisation and inlining this will usually be optimized to
//! a constant. Only boxed queries will need to do actual work to answer this
//! question.
//!
//! The majority of AST nodes are safe to cache if their components are safe to
//! cache. There are at least 4 cases where a query is unsafe to cache:
//!
//! - queries containing `IN` with bind parameters
//!     - This requires 1 bind parameter per value, and is therefore unbounded
//!     - `IN` with subselects are cached (assuming the subselect is safe to
//!       cache)
//!     - `IN` statements for PostgreSQL are cached as they use `= ANY($1)` instead,
//!       which does not cause an unbounded number of binds
//! - `INSERT` statements with a variable number of rows
//!     - The SQL varies based on the number of rows being inserted.
//! - `UPDATE` statements
//!     - Technically it's bounded on "number of optional values being passed to
//!       `SET` factorial" but that's still quite high, and not worth caching
//!       for the same reason as single row inserts
//! - `SqlLiteral` nodes
//!     - We have no way of knowing whether the SQL was generated dynamically or
//!       not, so we must assume that it's unbounded
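/*!

As a rough illustration of why `IN` with bind parameters counts as unbounded,
the sketch below renders the placeholder list by hand. The helpers
`in_clause_sql` and `any_clause_sql` are hypothetical and not part of Diesel,
and the `?` placeholder style is an assumption. The only point is that the SQL
text changes with the number of values, while the `= ANY($1)` form does not.

```rust
/// Hypothetical helper: renders `column IN (?, ?, ...)` with one
/// placeholder per bound value.
fn in_clause_sql(column: &str, value_count: usize) -> String {
    let placeholders = vec!["?"; value_count].join(", ");
    format!("{column} IN ({placeholders})")
}

/// Hypothetical helper: renders the `= ANY($1)` form, which binds the
/// whole list as a single array parameter.
fn any_clause_sql(column: &str) -> String {
    format!("{column} = ANY($1)")
}
```

Under these assumptions, `in_clause_sql("id", 2)` and `in_clause_sql("id", 3)`
yield different statement texts, so caching them by SQL would add one cache
entry per list length. `any_clause_sql("id")` is the same text no matter how
many values end up in the array.
*/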
//!
//! For queries which are unsafe to cache, the statement cache will never insert
//! them. They will be prepared and immediately released after use (or in the
//! case of PG they will use the unnamed prepared statement).
//!
//! For statements which are able to be cached, we then have to determine what
//! to use as the cache key. The standard method that virtually all ORMs or
//! database access layers use in the wild is to store the statements in a
//! hash map, using the SQL as the key.
//!
//! However, the majority of queries using Diesel that are safe to cache as
//! prepared statements will be uniquely identified by their type. For these
//! queries, we can bypass the query builder entirely. Since our AST is
//! generally optimized away by the compiler, for these queries the cost of
//! fetching a prepared statement from the cache is the cost of [`HashMap<TypeId,
//! _>::get`], where the key we're fetching by is a compile time constant. For
//! these types, the AST pass to gather the bind parameters will also be
//! optimized to accessing each parameter individually.
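/*!

The two kinds of cache key described above can be sketched in isolation. This
is a minimal model, not the code in this module: `SketchKey` and `SketchCache`
are hypothetical names, and "preparing" a statement is reduced to handing out
a counter value so that the example stays self-contained.

```rust
use std::any::TypeId;
use std::collections::HashMap;

/// Simplified stand-in for a cache key: either a type id known at compile
/// time or the generated SQL text.
#[derive(Debug, Hash, PartialEq, Eq)]
enum SketchKey {
    Type(TypeId),
    Sql(String),
}

/// Hypothetical cache. A "prepared statement" is just a number here.
struct SketchCache {
    statements: HashMap<SketchKey, u32>,
    prepared: u32,
}

impl SketchCache {
    fn new() -> Self {
        SketchCache {
            statements: HashMap::new(),
            prepared: 0,
        }
    }

    /// Returns the statement handle for a query, preparing it only on a miss.
    /// `build_sql` is called only when no type id is available for the key.
    fn get_or_prepare(
        &mut self,
        type_id: Option<TypeId>,
        build_sql: impl FnOnce() -> String,
    ) -> u32 {
        let key = match type_id {
            Some(id) => SketchKey::Type(id),
            None => SketchKey::Sql(build_sql()),
        };
        let prepared = &mut self.prepared;
        *self.statements.entry(key).or_insert_with(|| {
            *prepared += 1;
            *prepared
        })
    }
}
```

In this model a lookup with `Some(type_id)` never runs `build_sql`, which
mirrors the idea of bypassing the query builder for statically known queries.
A lookup with `None` has to render the SQL first in order to have a key at all.
*/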
//!
//! Determining if a query can be cached by type is the responsibility of the
//! [`QueryId`] trait. This trait is quite similar to `Any`, but with a few
//! differences:
//!
//! - No `'static` bound
//!     - Something being a reference never changes the SQL that is generated,
//!       so `&T` has the same query id as `T`.
//! - `Option<TypeId>` instead of `TypeId`
//!     - We need to be able to constrain on this trait being implemented, but
//!       not all types will actually have a static query id. Hopefully once
//!       specialization is stable we can remove the `QueryId` bound and
//!       specialize on it instead (or provide a blanket impl for all `T`)
//! - Implementors give a more broad type than `Self`
//!     - This really only affects bind parameters. There are 6 different Rust
//!       types which can be used for a parameter of type `timestamp`. The same
//!       statement can be used regardless of the Rust type, so [`Bound<ST, T>`](crate::expression::bound::Bound)
//!       defines its [`QueryId`] as [`Bound<ST, ()>`](crate::expression::bound::Bound).
//!
//! A type returning `Some(id)` or `None` for its query ID is based on whether
//! the SQL it generates can change without the type changing. At the moment,
//! the only type which is safe to cache as a prepared statement but does not
//! have a static query ID is something which has been boxed.
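/*!

The three differences from `Any` listed above can be shown with a pared-down
trait. This is only a sketch under stated assumptions: `SketchQueryId`,
`SketchBound`, `SketchBoxed` and `SelectAllUsers` are hypothetical stand-ins
and do not reproduce the real [`QueryId`] trait or its implementations.

```rust
use std::any::TypeId;
use std::marker::PhantomData;

/// Hypothetical, simplified analogue of a query id trait.
trait SketchQueryId {
    /// A `'static` type that stands in for `Self` when computing the id.
    type Id: 'static;
    /// Whether the generated SQL is fully determined by the type.
    const HAS_STATIC_ID: bool;

    /// `Some(id)` for statically known queries, `None` otherwise.
    fn sketch_query_id() -> Option<TypeId> {
        if Self::HAS_STATIC_ID {
            Some(TypeId::of::<Self::Id>())
        } else {
            None
        }
    }
}

/// A statically known query fragment.
struct SelectAllUsers;

impl SketchQueryId for SelectAllUsers {
    type Id = SelectAllUsers;
    const HAS_STATIC_ID: bool = true;
}

/// A reference never changes the SQL, so `&T` reuses the id of `T`.
/// Note that `T` itself needs no `'static` bound.
impl<'a, T: SketchQueryId + ?Sized> SketchQueryId for &'a T {
    type Id = T::Id;
    const HAS_STATIC_ID: bool = T::HAS_STATIC_ID;
}

/// A bind parameter of SQL type `ST` holding a Rust value of type `T`.
struct SketchBound<ST, T>(PhantomData<ST>, T);

/// The Rust value type is erased from the id, so every Rust type bound to
/// the same SQL type shares one id.
impl<ST: 'static, T> SketchQueryId for SketchBound<ST, T> {
    type Id = SketchBound<ST, ()>;
    const HAS_STATIC_ID: bool = true;
}

/// A boxed query: its SQL can differ without the type changing.
struct SketchBoxed(Box<dyn Fn() -> String>);

impl SketchQueryId for SketchBoxed {
    type Id = ();
    const HAS_STATIC_ID: bool = false;
}
```

With these definitions, `SketchBound<ST, i64>` and `SketchBound<ST, String>`
report the same id for a given `ST`, `&SelectAllUsers` reports the id of
`SelectAllUsers`, and `SketchBoxed` reports `None`, which would push it onto
the SQL keyed path.
*/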
//!
//! One potential optimization that we don't perform is storing the queries
//! which are cached by type ID in a separate map. Since a type ID is a u64,
//! this would allow us to use a specialized map which knows that there will
//! never be hashing collisions (also known as a perfect hashing function),
//! which would mean lookups are always constant time. However, this would save
//! nanoseconds on an operation that will take microseconds or even
//! milliseconds.

use std::any::TypeId;
use std::borrow::Cow;
use std::collections::HashMap;
use std::hash::Hash;
use std::ops::{Deref, DerefMut};

use crate::backend::Backend;
use crate::connection::InstrumentationEvent;
use crate::query_builder::*;
use crate::result::QueryResult;

use super::Instrumentation;

/// A prepared statement cache
#[allow(missing_debug_implementations, unreachable_pub)]
#[cfg_attr(
    docsrs,
    doc(cfg(feature = "i-implement-a-third-party-backend-and-opt-into-breaking-changes"))
)]
pub struct StatementCache<DB: Backend, Statement> {
    pub(crate) cache: HashMap<StatementCacheKey<DB>, Statement>,
}

/// A helper type that indicates if a certain query
/// is cached inside of the prepared statement cache or not
///
/// This information can be used by the connection implementation
/// to signal this fact to the database while actually
/// preparing the statement
#[derive(Debug, Clone, Copy)]
#[cfg_attr(
    docsrs,
    doc(cfg(feature = "i-implement-a-third-party-backend-and-opt-into-breaking-changes"))
)]
#[allow(unreachable_pub)]
pub enum PrepareForCache {
    /// The statement will be cached
    Yes,
    /// The statement won't be cached
    No,
}

#[allow(
    clippy::len_without_is_empty,
    clippy::new_without_default,
    unreachable_pub
)]
impl<DB, Statement> StatementCache<DB, Statement>
where
    DB: Backend,
    DB::TypeMetadata: Clone,
    DB::QueryBuilder: Default,
    StatementCacheKey<DB>: Hash + Eq,
{
    /// Create a new prepared statement cache
    #[allow(unreachable_pub)]
    pub fn new() -> Self {
        StatementCache {
            cache: HashMap::new(),
        }
    }

    /// Get the current length of the statement cache
    #[allow(unreachable_pub)]
    #[cfg(any(
        feature = "i-implement-a-third-party-backend-and-opt-into-breaking-changes",
        feature = "postgres",
        all(feature = "sqlite", test)
    ))]
    #[cfg_attr(
        docsrs,
        doc(cfg(feature = "i-implement-a-third-party-backend-and-opt-into-breaking-changes"))
    )]
    pub fn len(&self) -> usize {
        self.cache.len()
    }

    /// Prepare a query as a prepared statement
    ///
    /// This function returns a prepared statement corresponding to the
    /// query passed as `source` with the bind values passed as `bind_types`.
    /// If the query is already cached inside this prepared statement cache,
    /// the cached prepared statement will be returned; otherwise `prepare_fn`
    /// will be called to create a new prepared statement for this query source.
    /// The first parameter of the callback contains the query string; the second
    /// parameter indicates if the constructed prepared statement will be cached or not.
    /// See the [module](self) documentation for details
    /// about which statements are cached and which are not cached.
    #[allow(unreachable_pub)]
    pub fn cached_statement<T, F>(
        &mut self,
        source: &T,
        backend: &DB,
        bind_types: &[DB::TypeMetadata],
        mut prepare_fn: F,
        instrumentation: &mut dyn Instrumentation,
    ) -> QueryResult<MaybeCached<'_, Statement>>
    where
        T: QueryFragment<DB> + QueryId,
        F: FnMut(&str, PrepareForCache) -> QueryResult<Statement>,
    {
        self.cached_statement_non_generic(
            T::query_id(),
            source,
            backend,
            bind_types,
            &mut prepare_fn,
            instrumentation,
        )
    }

    /// Reduce the amount of monomorphized code by factoring this via dynamic dispatch
    fn cached_statement_non_generic(
        &mut self,
        maybe_type_id: Option<TypeId>,
        source: &dyn QueryFragmentForCachedStatement<DB>,
        backend: &DB,
        bind_types: &[DB::TypeMetadata],
        prepare_fn: &mut dyn FnMut(&str, PrepareForCache) -> QueryResult<Statement>,
        instrumentation: &mut dyn Instrumentation,
    ) -> QueryResult<MaybeCached<'_, Statement>> {
        use std::collections::hash_map::Entry::{Occupied, Vacant};

        let cache_key = StatementCacheKey::for_source(maybe_type_id, source, bind_types, backend)?;

        if !source.is_safe_to_cache_prepared(backend)? {
            let sql = cache_key.sql(source, backend)?;
            return prepare_fn(&sql, PrepareForCache::No).map(MaybeCached::CannotCache);
        }

        let cached_result = match self.cache.entry(cache_key) {
            Occupied(entry) => entry.into_mut(),
            Vacant(entry) => {
                let statement = {
                    let sql = entry.key().sql(source, backend)?;
                    instrumentation
                        .on_connection_event(InstrumentationEvent::CacheQuery { sql: &sql });
                    prepare_fn(&sql, PrepareForCache::Yes)
                };

                entry.insert(statement?)
            }
        };

        Ok(MaybeCached::Cached(cached_result))
    }
}

/// Implemented for all `QueryFragment`s, dedicated to dynamic dispatch within the context of
/// `statement_cache`
///
/// We want the generated code to be as small as possible, so for each query passed to
/// [`StatementCache::cached_statement`] the generated assembly will just call a non generic
/// version with dynamic dispatch pointing to the VTABLE of this minimal trait
///
/// This preserves the opportunity for the compiler to entirely optimize the `construct_sql`
/// function as a function that simply returns a constant `String`.
#[allow(unreachable_pub)]
#[cfg_attr(
    docsrs,
    doc(cfg(feature = "i-implement-a-third-party-backend-and-opt-into-breaking-changes"))
)]
pub trait QueryFragmentForCachedStatement<DB> {
    /// Convert the query fragment into a SQL string for the given backend
    fn construct_sql(&self, backend: &DB) -> QueryResult<String>;
    /// Check whether it's safe to cache the query
    fn is_safe_to_cache_prepared(&self, backend: &DB) -> QueryResult<bool>;
}
impl<T, DB> QueryFragmentForCachedStatement<DB> for T
where
    DB: Backend,
    DB::QueryBuilder: Default,
    T: QueryFragment<DB>,
{
    fn construct_sql(&self, backend: &DB) -> QueryResult<String> {
        let mut query_builder = DB::QueryBuilder::default();
        self.to_sql(&mut query_builder, backend)?;
        Ok(query_builder.finish())
    }

    fn is_safe_to_cache_prepared(&self, backend: &DB) -> QueryResult<bool> {
        <T as QueryFragment<DB>>::is_safe_to_cache_prepared(self, backend)
    }
}

/// Wraps a possibly cached prepared statement
///
/// Essentially a customized version of [`Cow`]
/// that does not depend on [`ToOwned`]
#[allow(missing_debug_implementations, unreachable_pub)]
#[cfg_attr(
    docsrs,
    doc(cfg(feature = "i-implement-a-third-party-backend-and-opt-into-breaking-changes"))
)]
#[non_exhaustive]
pub enum MaybeCached<'a, T: 'a> {
    /// Contains a prepared statement that is not cached
    CannotCache(T),
    /// Contains a reference to a cached prepared statement
    Cached(&'a mut T),
}

impl<T> Deref for MaybeCached<'_, T> {
    type Target = T;

    fn deref(&self) -> &Self::Target {
        match *self {
            MaybeCached::CannotCache(ref x) => x,
            MaybeCached::Cached(ref x) => x,
        }
    }
}

impl<T> DerefMut for MaybeCached<'_, T> {
    fn deref_mut(&mut self) -> &mut Self::Target {
        match *self {
            MaybeCached::CannotCache(ref mut x) => x,
            MaybeCached::Cached(ref mut x) => x,
        }
    }
}

/// The lookup key used by [`StatementCache`] internally
///
/// This can contain either a type id known at compile time
/// (representing a statically known query) or a query string
/// calculated at runtime plus the parameter types (for queries
/// that may change depending on their parameters)
#[allow(missing_debug_implementations, unreachable_pub)]
#[derive(Hash, PartialEq, Eq)]
#[cfg_attr(
    docsrs,
    doc(cfg(feature = "i-implement-a-third-party-backend-and-opt-into-breaking-changes"))
)]
pub enum StatementCacheKey<DB: Backend> {
    /// Represents a query known at compile time
    ///
    /// Calculated via [`QueryId::QueryId`]
    Type(TypeId),
    /// Represents a dynamically constructed query
    ///
    /// This variant is used if [`QueryId::HAS_STATIC_QUERY_ID`]
    /// is `false` and [`AstPass::unsafe_to_cache_prepared`] is not
    /// called for a given query.
    Sql {
        /// Contains the SQL query string
        sql: String,
        /// Contains the types of any bind parameters passed to the query
        bind_types: Vec<DB::TypeMetadata>,
    },
}

impl<DB> StatementCacheKey<DB>
where
    DB: Backend,
    DB::QueryBuilder: Default,
    DB::TypeMetadata: Clone,
{
    /// Create a new statement cache key for the given query source
    // Note: Intentionally monomorphic over source.
    #[allow(unreachable_pub)]
    pub fn for_source(
        maybe_type_id: Option<TypeId>,
        source: &dyn QueryFragmentForCachedStatement<DB>,
        bind_types: &[DB::TypeMetadata],
        backend: &DB,
    ) -> QueryResult<Self> {
        match maybe_type_id {
            Some(id) => Ok(StatementCacheKey::Type(id)),
            None => {
                let sql = source.construct_sql(backend)?;
                Ok(StatementCacheKey::Sql {
                    sql,
                    bind_types: bind_types.into(),
                })
            }
        }
    }

    /// Get the SQL for a given query source
    ///
    /// This is an optimization that may skip constructing the query string
    /// twice if it's already part of the current cache key
    // Note: Intentionally monomorphic over source.
    #[allow(unreachable_pub)]
    pub fn sql(
        &self,
        source: &dyn QueryFragmentForCachedStatement<DB>,
        backend: &DB,
    ) -> QueryResult<Cow<'_, str>> {
        match *self {
            StatementCacheKey::Type(_) => source.construct_sql(backend).map(Cow::Owned),
            StatementCacheKey::Sql { ref sql, .. } => Ok(Cow::Borrowed(sql)),
        }
    }
}