Caching is an important technique in software development that can provide significant benefits to application performance and scalability. Some benefits of caching include:
- Faster access to data: Caching allows frequently accessed data to be stored in memory or on disk, which can reduce the time required to retrieve the data from a database or other source. This can lead to faster application performance and improved user experience.
- Reduced computation: When a program performs a heavy computation, caching can store the computed result so that the computation does not need to be repeated for the same inputs, conserving CPU and other resources (see the sketch after this list).
- Reduced load on backend systems: By caching frequently accessed data, caching can reduce the number of requests made to backend systems like databases or APIs, which can help to reduce the load on those systems and improve their scalability.
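As a small illustration of caching computed results, here is a plain-Java memoization sketch (the squaring computation is a hypothetical stand-in for any expensive operation):

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class Memoizer {

    private final Map<Integer, Long> cache = new ConcurrentHashMap<>();

    public long expensiveSquare(int n) {
        // computeIfAbsent runs the computation only the first time a key is seen;
        // later calls with the same input return the cached result
        return cache.computeIfAbsent(n, k -> (long) k * k);
    }
}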
However, there are also some risks associated with caching that need to be considered, including:
- Outdated or stale data: Cached data can become outdated or stale if it is not properly managed, which can lead to issues with consistency and accuracy. This risk can be mitigated by setting expiration times for cached data, using invalidation mechanisms to remove outdated data from the cache, and monitoring the underlying data source for changes.
- Cache coherence issues: If caching is used across multiple instances or nodes, there is a risk that different nodes may have different versions of the cached data, leading to inconsistencies in the results. This risk can be mitigated by using a distributed cache that can synchronize data across multiple instances, or by using other techniques like cache coherence protocols to ensure that all nodes have the same version of the cached data.
- Increased memory usage: Caching can increase the memory usage of an application, particularly if large amounts of data are cached. This risk can be mitigated by setting appropriate cache sizes and eviction policies, and by carefully monitoring memory usage to ensure that it remains within acceptable limits.
Java and Caching
The Java caching standard, JCache (JSR 107), was initiated in 2001 through the Java Community Process (JCP). It was developed by a group of industry experts to standardize caching interfaces and APIs in Java; after a long gestation, the final 1.0 specification was released in 2014 and has since been updated to version 1.1. JCache defines a standard set of interfaces and APIs for caching in Java and provides a consistent interface across different caching implementations, which can simplify the development of caching-enabled applications. The specification includes the following core interfaces:
- CachingProvider: The entry point of the API; it creates and manages the lifecycle of CacheManager instances.
- CacheManager: Provides access to one or more caches and manages their creation, configuration, and shutdown.
- Cache: Defines the core caching functionality, including methods for adding, retrieving, and removing cached entries.
- Entry: Represents a single cached key-value pair.
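A minimal sketch of these interfaces in use (it assumes a JSR 107 implementation such as Ehcache is on the classpath; the cache name "demo" is arbitrary):

import javax.cache.Cache;
import javax.cache.CacheManager;
import javax.cache.Caching;
import javax.cache.configuration.MutableConfiguration;
import javax.cache.spi.CachingProvider;

public class JCacheExample {

    public static void main(String[] args) {
        // Resolve the provider implementation found on the classpath
        CachingProvider provider = Caching.getCachingProvider();
        CacheManager cacheManager = provider.getCacheManager();

        // Create a typed cache with default settings
        MutableConfiguration<String, String> config =
                new MutableConfiguration<String, String>().setTypes(String.class, String.class);
        Cache<String, String> cache = cacheManager.createCache("demo", config);

        cache.put("key", "value");
        System.out.println(cache.get("key")); // prints "value"

        cacheManager.close();
    }
}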
Several caching providers implement the JCache specification, including:
- Apache Commons JCS (Java Caching System): An open-source caching implementation that provides a variety of caching features, including support for disk-based caching, memory-based caching, and clustered caching.
- Ehcache: An open-source caching implementation that provides a variety of caching features, including support for distributed caching, caching of large data sets, and advanced cache eviction policies.
- Hazelcast: An open-source in-memory data grid that provides distributed caching capabilities, as well as other features like distributed computing, messaging, and more.
- Infinispan: An open-source data grid platform that provides advanced caching capabilities, including support for distributed caching, caching of large data sets, and advanced cache eviction policies.
- Redis: A popular in-memory data structure store that can serve as a caching backend for Java applications. Redis offers features like data persistence, distributed caching, and advanced data structures, making it a powerful caching solution. Redis itself does not implement the JCache API, but it can be integrated with Java applications through client libraries such as Jedis and Lettuce, and the Redisson client additionally provides a JCache-compliant layer.
These caching providers can be used in a variety of applications to improve performance and scalability, and to simplify the management of cached data.
Caching using Spring Boot
To use caching with Spring Boot, you need to add the "spring-boot-starter-cache" starter package. This starter brings in the spring-context-support module, which provides the classes and interfaces needed to implement caching in a Spring Boot application.
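With Maven, for example, the starter is added like this:

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-cache</artifactId>
</dependency>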
By default, Spring Boot uses SimpleCacheConfiguration, which provides an in-memory cache implementation based on a ConcurrentHashMap. However, Spring Boot lets you easily configure and use a different caching provider such as Ehcache, Hazelcast, Caffeine, or Redis.
To determine which caching provider is being used in your Spring Boot application, you can inspect the condition evaluation report, printed at startup when the application is run with the --debug flag:
SimpleCacheConfiguration matched:
   - Cache org.springframework.boot.autoconfigure.cache.SimpleCacheConfiguration automatic cache type (CacheCondition)
Spring Boot also lets you configure and customize the caching behavior of your application through cache-related properties. For example, you can set the maximum size of the cache or the time-to-live (TTL) of cached entries.
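For instance, when Redis is the cache provider, a TTL can be set with a single property (a sketch; the 10-minute value is an arbitrary choice):

spring.cache.type=redis
spring.cache.redis.time-to-live=10m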
If you're not using Spring Boot, you have to manually define a bean to register the cache manager:
import org.springframework.cache.CacheManager;
import org.springframework.cache.annotation.EnableCaching;
import org.springframework.cache.concurrent.ConcurrentMapCacheManager;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
@EnableCaching
public class CachingConfig {

    @Bean
    public CacheManager cacheManager() {
        // Simple in-memory cache manager backed by a ConcurrentHashMap
        return new ConcurrentMapCacheManager("cacheName");
    }
}
However, with Spring Boot, you can register the ConcurrentMapCacheManager simply by including the starter package on the classpath and using the @EnableCaching annotation.
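For example (a sketch; the class name is arbitrary):

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.cache.annotation.EnableCaching;

@SpringBootApplication
@EnableCaching // switches on Spring's caching abstraction
public class Application {

    public static void main(String[] args) {
        SpringApplication.run(Application.class, args);
    }
}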
Caching the result of a method:
One of the easiest ways to enable caching for a method is to annotate it with @Cacheable and specify the name of the cache where the results should be stored as a parameter:
/**
 * Returns the input parameter after a simulated delay of 5 seconds and
 * caches the result for subsequent calls with the same input parameter.
 *
 * @param info input parameter to be processed and returned
 * @return the input parameter after a delay of 5 seconds
 */
@Cacheable(cacheNames = "info")
public String doSomeWork(String info) {
    try {
        Thread.sleep(5000); // Simulate an expensive computation
    } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // Restore the interrupt flag
        throw new RuntimeException(e);
    }
    return info; // The returned value is stored in the "info" cache
}
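Assuming doSomeWork() lives in a Spring bean (here a hypothetical workService injected into a caller), the effect looks like this:

String first = workService.doSomeWork("hello");  // takes ~5 seconds, result is stored in the "info" cache
String second = workService.doSomeWork("hello"); // served from the cache, returns almost immediately

Note that the annotation only takes effect when the method is invoked through the Spring proxy, i.e., from another bean; a direct call from within the same class bypasses the cache.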
Specify the cache size:
The cacheNames parameter of the @Cacheable annotation only names the cache; the cache's properties, including its size, are configured in the application.properties or application.yml file. In the example above, cacheNames is set to "info". Here's how to limit that cache to 100 entries when Caffeine is the cache provider:
spring.cache.cache-names=info
spring.cache.caffeine.spec=maximumSize=100
In this example, the spring.cache.cache-names property specifies the name of the cache, while the spring.cache.caffeine.spec property sets the cache properties. The maximumSize parameter is set to 100, which means the cache holds at most 100 entries. When the cache is full and a new entry is added, Caffeine evicts an entry that is least likely to be used again; its Window TinyLFU eviction policy approximates the classic "LRU" (Least Recently Used) behavior.
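Note that the Caffeine spec property only takes effect when the Caffeine library is on the classpath; with Maven, for example:

<dependency>
    <groupId>com.github.ben-manes.caffeine</groupId>
    <artifactId>caffeine</artifactId>
</dependency>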
If the cache becomes outdated:
There is a risk that cached data may become outdated if it is not properly managed. When data is cached, it is stored in memory or on disk for faster retrieval, but this means that it may not always be the most up-to-date version of the data. This can lead to issues if the cached data is used in place of the current version of the data, particularly if the data is changing frequently.
To mitigate this risk, it is important to ensure that the cache is properly managed and that it is refreshed or invalidated when necessary. This can involve setting expiration times for cached data, so that it is automatically refreshed after a certain period of time, or using invalidation mechanisms to remove outdated data from the cache. It can also involve monitoring the underlying data source for changes, and refreshing the cache as needed to ensure that the cached data is always up-to-date.
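With the Caffeine provider, for example, an expiration time can be appended to the spec string shown earlier (a sketch; the 10-minute value is an arbitrary choice):

spring.cache.caffeine.spec=maximumSize=100,expireAfterWrite=10m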
Manual cache clearing:
It is also advisable to provide a way to clear the cache manually:
The @CacheEvict annotation is used in Spring Framework to remove entries from a cache. It is often used in conjunction with the @Cacheable annotation, which caches the results of a method invocation in a cache.
When a method annotated with @CacheEvict is called, Spring Framework removes the specified entries from the cache, so that the next time the cached data is requested, the method is executed again and the cache is repopulated with the updated data.
Here’s an example of how to use the @CacheEvict annotation:
/**
 * Clears all entries from the cache named "info".
 *
 * @return a string indicating that the cache has been cleared
 */
@CacheEvict(cacheNames = "info", allEntries = true)
public String clearCache() {
    return "cache cleared"; // Confirms that the cache has been cleared
}
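Besides clearing everything with allEntries = true, a single entry can be evicted by key (a sketch; the SpEL expression #info refers to the method parameter):

@CacheEvict(cacheNames = "info", key = "#info")
public void evictEntry(String info) {
    // Removes only the cached value for this specific input
}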
It is also possible to clear caches at runtime through the Spring Boot Actuator caches endpoint. Once the endpoint is exposed (management.endpoints.web.exposure.include=caches), a GET request to /actuator/caches lists the available caches, a DELETE request to /actuator/caches/{cacheName} evicts a single cache, and a DELETE request to /actuator/caches evicts all of them:
DELETE /actuator/caches/info (evicts the "info" cache)
DELETE /actuator/caches (evicts all caches)
Note that the management.endpoint.caches.enabled property only controls the Actuator endpoint itself. To disable caching in the application altogether, set the spring.cache.type property to none in the application.properties or application.yml file.
Finally, remember the synchronization risk mentioned earlier: caching can lead to inconsistent or incorrect results if the cache is not properly synchronized across multiple instances or nodes, because different nodes may hold different versions of the cached data. To mitigate this, use a distributed cache that synchronizes data across instances, or rely on cache coherence protocols to ensure that all nodes see the same version of the cached data.
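In a Spring Boot application, for example, switching from the local in-memory cache to a distributed Redis cache can be as simple as adding the Redis starter and changing the cache type (a sketch; it assumes a Redis instance reachable with default connection settings):

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-redis</artifactId>
</dependency>

spring.cache.type=redis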