PostgreSQL is increasingly favored for new application development due to its robust features, scalability, and extensibility. It offers developers a potent open-source relational database management system, excelling in handling complex queries and transactions. The platform’s popularity is bolstered by an active community dedicated to continuous improvement and innovation. Leveraging a Docker image streamlines the creation of a PostgreSQL database for development purposes. However, in my observations, Java developers often underutilize PostgreSQL, opting for a basic setup where databases and users are created without optimizing for application-specific schemas. Typically, users are directly placed in the public schema, limiting the database to hosting a single application. In the upcoming sections, I’ll guide you through the process of creating a database user and a dedicated schema for your application, enhancing logical separation and optimizing PostgreSQL for Java development.
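For example (assuming Docker is installed; the container name, password, and image tag below are only placeholders), a disposable development instance can be started with a single command:
docker run --name dev-postgres -e POSTGRES_PASSWORD=devpassword -p 5432:5432 -d postgres:16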
Create a Schema (myschema):
CREATE SCHEMA myschema;
This command creates a new schema named myschema where you can organize your database objects. Create a User (myschema) with a Password:
CREATE USER myschema WITH PASSWORD 'myschema';
This command creates a user named myschema with the password 'myschema'. Replace 'myschema' with your desired password. Set myschema as the Default Schema for the User:
ALTER USER myschema SET search_path TO myschema;
This command configures the user myschema to use the myschema schema as its default schema. This means that any tables or objects created by this user will be in the myschema schema by default.
Change the Ownership of myschema to the User myschema:
ALTER SCHEMA myschema OWNER TO myschema;
This command changes the ownership of the myschema schema to the user myschema, ensuring that the user has full control over this schema.
Grant CREATE and USAGE Privileges on myschema to the User myschema:
GRANT CREATE, USAGE ON SCHEMA myschema TO myschema;
This command grants the user myschema the ability to create and use objects within the myschema schema. This is useful for development purposes.
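To sanity-check the setup, you can connect as the new user (for example psql -U myschema -d mydb, where mydb stands for your development database) and create a test table; it should land in myschema:
CREATE TABLE demo (id serial PRIMARY KEY, name text);
SELECT schemaname, tablename FROM pg_tables WHERE tablename = 'demo';
-- expected: schemaname = myschema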
By following these steps, you’ve created a development environment in PostgreSQL where the user myschema has its own schema and can create and manage database objects within that schema. This setup helps you keep your development work organized and secure.
In general, it's best practice to favor composition over inheritance in Java. This is known as the "Composition over Inheritance" principle.
The main reason for this is that inheritance can introduce tight coupling between classes, making it difficult to change the behavior of a class without affecting all of its subclasses. In contrast, composition allows for greater flexibility and modularity by separating the responsibilities of different classes and allowing them to be combined in different ways.
When using composition in Java, it’s a good practice to use interfaces to define the contract for the behavior of a component, as this allows for greater flexibility in the implementation of the component and makes it easier to swap out different implementations without affecting the rest of the system.
Let’s consider this example:
public class Server implements Runnable {
@Override
public void run() {
// do some work here...
System.out.println("do something");
}
}
The job done in the run method can be delegated to another class by using composition:
public class Server implements Runnable {
private final ServerImpl serverImpl;
public Server(ServerImpl serverImpl) {
this.serverImpl = serverImpl;
}
@Override
public void run() {
serverImpl.run();
}
}
public class ServerImpl implements Runnable {
@Override
public void run() {
// do some work here...
System.out.println("Doing work in ServerImpl");
}
}
So the work in the run() method is delegated to the ServerImpl class.
Lombok annotations can help to reduce the boilerplate code needed to define the class and make the code easier to read and maintain:
import lombok.RequiredArgsConstructor;
import lombok.experimental.Delegate;
@RequiredArgsConstructor
public class Server implements Runnable {
@Delegate
private final ServerImpl serverImpl;
}
We use the @Delegate annotation to generate delegate methods for all the methods in the Runnable interface, which means that we don't have to manually implement the run() method.
We’ve also used the @RequiredArgsConstructor annotation to generate a constructor that takes a ServerImpl object as an argument and assigns it to the serverImpl field.
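A minimal usage sketch (the wiring below is only an illustration and not part of the original example): because Lombok generates the delegating run() method and Server implements Runnable, a Server can be handed to a Thread like any other Runnable:
public class Main {
    public static void main(String[] args) {
        // The actual work is done by ServerImpl; Server only delegates.
        Server server = new Server(new ServerImpl());
        new Thread(server).start(); // prints "Doing work in ServerImpl"
    }
}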
Caching is an important technique in software development that can provide significant benefits to application performance and scalability. Some benefits of caching include:
Faster access to data: Caching allows frequently accessed data to be stored in memory or on disk, which can reduce the time required to retrieve the data from a database or other source. This can lead to faster application performance and improved user experience.
Reduced resource consumption: When a heavy computation is performed, caching can store the computed results so that the computation does not need to be repeated for the same inputs, which conserves resources.
Reduced load on backend systems: By caching frequently accessed data, caching can reduce the number of requests made to backend systems like databases or APIs, which can help to reduce the load on those systems and improve their scalability.
However, there are also some risks associated with caching that need to be considered, including:
Outdated or stale data: Cached data can become outdated or stale if it is not properly managed, which can lead to issues with consistency and accuracy. This risk can be mitigated by setting expiration times for cached data, using invalidation mechanisms to remove outdated data from the cache, and monitoring the underlying data source for changes.
Cache coherence issues: If caching is used across multiple instances or nodes, there is a risk that different nodes may have different versions of the cached data, leading to inconsistencies in the results. This risk can be mitigated by using a distributed cache that can synchronize data across multiple instances, or by using other techniques like cache coherence protocols to ensure that all nodes have the same version of the cached data.
Increased memory usage: Caching can increase the memory usage of an application, particularly if large amounts of data are cached. This risk can be mitigated by setting appropriate cache sizes and eviction policies, and by carefully monitoring memory usage to ensure that it remains within acceptable limits.
Java and Caching
The Java Caching Specification (JCS) was introduced in 2001, as part of the Java Community Process (JCP). It was developed by a group of industry experts and was designed to standardize caching interfaces and APIs in Java. The specification has been updated several times since its initial release, with the most recent version being JCS 2.3. The JCS defines a standard set of interfaces and APIs for caching in Java. It provides a consistent interface for caching across different caching implementations, which can simplify the development of caching-enabled applications. The JCS specification includes several core interfaces:
Cache: Defines the core caching functionality, including methods for adding, retrieving, and removing cached items.
CacheElement: Represents a single cached item, including its key, value, and other metadata.
CacheManager: Provides access to one or more caches, and manages their creation, initialization, and shutdown.
CacheStatistics: Provides statistics and other information about a cache.
Several caching providers have implemented the JCS specification, including:
Apache JCS: An open-source caching implementation that provides a variety of caching features, including support for disk-based caching, memory-based caching, and clustered caching.
Ehcache: An open-source caching implementation that provides a variety of caching features, including support for distributed caching, caching of large data sets, and advanced cache eviction policies.
Hazelcast: An open-source in-memory data grid that provides distributed caching capabilities, as well as other features like distributed computing, messaging, and more.
Infinispan: An open-source data grid platform that provides advanced caching capabilities, including support for distributed caching, caching of large data sets, and advanced cache eviction policies.
Redis: A popular in-memory data structure store that can be used as a caching provider in Java applications. Redis offers features like data persistence, distributed caching, and advanced data structures, making it a powerful caching solution. Redis can be integrated with Java applications using a variety of client libraries, including Jedis and Lettuce.
These caching providers can be used in a variety of applications to improve performance and scalability, and to simplify the management of cached data.
Caching using Spring Boot
To use caching with Spring Boot, you need to add the spring-boot-starter-cache starter package. This starter brings in Spring's caching abstraction (via spring-context-support), which provides the necessary classes and interfaces for implementing caching in your Spring Boot application.
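For a Maven project, that boils down to a dependency along these lines (the version is managed by the Spring Boot parent/BOM):
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-cache</artifactId>
</dependency>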
By default, Spring Boot uses the SimpleCacheConfiguration which provides an in-memory cache implementation based on the ConcurrentHashMap data structure. However, Spring Boot allows you to easily configure and use different caching providers such as Ehcache, Hazelcast, Redis, etc.
To determine which caching provider is being used in your Spring Boot application, you can look at the auto-configuration report that is printed when the application is started with the --debug option:
SimpleCacheConfiguration matched:
   - Cache org.springframework.boot.autoconfigure.cache.SimpleCacheConfiguration automatic cache type (CacheCondition)
Spring Boot allows you to configure and customize the caching behavior of your application using cache-related properties. For example, you can set the maximum size of the cache or the time-to-live (TTL) of cached entries using these properties.
If you're not using Spring Boot, you have to manually define a bean to register the cache manager.
import org.springframework.cache.CacheManager;
import org.springframework.cache.annotation.EnableCaching;
import org.springframework.cache.concurrent.ConcurrentMapCacheManager;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
@EnableCaching
public class CachingConfig {
@Bean
public CacheManager cacheManager() {
return new ConcurrentMapCacheManager("cacheName"); // simple in-memory cache backed by ConcurrentHashMap
}
}
However, with Spring Boot, you can register the ConcurrentMapCacheManager simply by including the starter package on the classpath and using the @EnableCaching annotation.
Caching the result of a method:
One of the easiest ways to enable caching for a method is to annotate it with @Cacheable and specify the name of the cache where the results should be stored as a parameter:
/**
 * Method that returns the input parameter after a simulated delay of 5 seconds,
 * and caches the result for subsequent requests with the same input parameter.
 *
 * @param info Input parameter to be processed and returned
 * @return The input parameter after a delay of 5 seconds
 */
@Cacheable(cacheNames = "info")
public String doSomeWork(String info) {
    try {
        Thread.sleep(5000); // Simulate a delay of 5 seconds
    } catch (InterruptedException e) {
        throw new RuntimeException(e); // Throw a RuntimeException if interrupted
    }
    return info; // Return the input parameter after the delay
}
Specify the cache size:
You reference the cache by the name given in the cacheNames parameter of the annotation and then configure the cache properties in the application.properties or application.yml file.
In this example, the @Cacheable annotation is used to cache the result of the doSomeWork() method. The cacheNames parameter is set to "info", which is the name of the cache to use. You can define the cache properties, including the cache size, in the application.properties or application.yml file. Note that the spring.cache.caffeine.spec property shown below only takes effect when the Caffeine library is on the classpath. Here's an example of how to set the cache size to 100 entries:
spring.cache.cache-names=info
spring.cache.caffeine.spec=maximumSize=100
In this example, the spring.cache.cache-names property specifies the name of the cache, while the spring.cache.caffeine.spec property sets the cache properties. The maximumSize parameter is set to 100, which means the cache can hold a maximum of 100 entries. When the cache is full and a new entry is added, the least recently used entry is evicted to make space for the new one. This is known as the "LRU" (Least Recently Used) eviction policy.
If the cache becomes outdated:
There is a risk that cached data may become outdated if it is not properly managed. When data is cached, it is stored in memory or on disk for faster retrieval, but this means that it may not always be the most up-to-date version of the data. This can lead to issues if the cached data is used in place of the current version of the data, particularly if the data is changing frequently.
To mitigate this risk, it is important to ensure that the cache is properly managed and that it is refreshed or invalidated when necessary. This can involve setting expiration times for cached data, so that it is automatically refreshed after a certain period of time, or using invalidation mechanisms to remove outdated data from the cache. It can also involve monitoring the underlying data source for changes, and refreshing the cache as needed to ensure that the cached data is always up-to-date.
Manual cache clearing:
It is also recommended to enable manual cache clearing:
The @CacheEvict annotation is used in Spring Framework to remove entries from a cache. It is often used in conjunction with the @Cacheable annotation, which caches the results of a method invocation in a cache.
When a method annotated with @CacheEvict is called, Spring Framework removes the specified entries from the cache, so that the next time the cached data is requested, the method is executed again and the cache is repopulated with the updated data.
Here’s an example of how to use the @CacheEvict annotation:
/**
* Method that clears the cache named "info".
*
* @return A string indicating that the cache has been cleared
*/
@CacheEvict(cacheNames = "info", allEntries = true)
public String clearCache() {
return "cache cleared"; // Return a message indicating that the cache has been cleared
}
It is also possible to manage caches at runtime using Actuator in a Spring Boot application: the /actuator/caches endpoint lists the available caches, and sending an HTTP DELETE request to it (or to /actuator/caches/{cacheName}) evicts the cached entries.
To disable the caches endpoint itself, set the management.endpoint.caches.enabled property to false; to turn off caching entirely, set spring.cache.type to none in the application.properties or application.yml file.
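For example, assuming the caches endpoint is exposed over HTTP and the application runs on port 8080, the "info" cache from the earlier example can be cleared with:
curl -X DELETE http://localhost:8080/actuator/caches/info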
Another risk with caching is that it can lead to inconsistent or incorrect results if the cache is not properly synchronized across multiple instances or nodes. This can occur if different nodes or instances have different versions of the cached data, leading to inconsistencies in the results. To mitigate this risk, it is important to use a distributed cache that can synchronize data across multiple instances, or to use other techniques like cache coherence protocols to ensure that all nodes have the same version of the cached data.
Actuator provides monitoring and metrics features that are ready for production use in Spring Boot applications. These features are accessible through various endpoints, which can be accessed both via HTTP and JMX MBeans.
Enabling Actuator in a Spring Boot application gives us visibility into the application: among other things, statistics such as the number of times a particular endpoint has been requested, the amount of memory the application is consuming, the logging levels of different packages within the application, and the available configuration properties in the application environment. Additionally, Actuator provides information about the application's health and any external services it is coordinating with.
Enabling Actuator in Spring Boot projects
To enable Actuator in a Spring Boot project, all you need to do is add the following dependency:
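(For Maven; the version is managed by the Spring Boot parent/BOM.)
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>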
Note: By default, only one endpoint, the health endpoint, is accessible for security reasons. This endpoint checks whether the service is up or not. To enable access to the other endpoints, you will need to set the management.endpoints.web.exposure.include parameter to '*'. In sensitive environments, it is essential to secure all the endpoints, as they may expose sensitive data.
The Actuator base endpoint (/actuator) provides HATEOAS links to all the available endpoints:
Now, let's take a look at the info endpoint. You have complete control over what is returned by this REST endpoint. The information presented by this endpoint is contributed by different info contributors:
build (BuildInfoContributor): Exposes build information.
env (EnvironmentInfoContributor): Exposes any property from the Environment whose name starts with info.
git (GitInfoContributor): Exposes Git related information.
java (JavaInfoContributor): Exposes Java runtime information.
Enabling the env contributor by setting the management.info.env.enabled parameter to true will display all properties with the prefix "info".
If your application.yml file looks like the following:
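# example 'info' properties picked up by the env info contributor (reconstructed from the output below)
info:
  creator: tramotech
  email: info@email.com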
then the info endpoint will provide the following information:
{"creator":"tramotech","email":"info@email.com"}
Build information is critical. The contributor that provides this information is enabled by default. If a valid META-INF/build-info.properties file is present, Spring Boot Actuator will display the build details. So all you need to do is to populate the build-info.properties file. Fortunately, the Spring Boot Maven plugin includes a build-info goal that can be used to create the build-info.properties file. Here’s an example with custom properties:
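A sketch of what that could look like in the pom.xml (the additional property below is just an example, mirroring the info output shown earlier):
<plugin>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-maven-plugin</artifactId>
    <executions>
        <execution>
            <goals>
                <goal>build-info</goal>
            </goals>
            <configuration>
                <additionalProperties>
                    <creator>tramotech</creator>
                </additionalProperties>
            </configuration>
        </execution>
    </executions>
</plugin>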
Another crucial piece of information is Git information, which is provided by the GitInfoContributor and enabled by default. The contents of the git.properties file will be exposed if it is present. To generate this file, you will need to add the git-commit-id-plugin plugin:
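A minimal sketch, using the older plugin coordinates (newer releases are published as io.github.git-commit-id:git-commit-id-maven-plugin); with the Spring Boot parent, declaring the plugin is typically enough, since its execution is pre-configured there:
<plugin>
    <groupId>pl.project13.maven</groupId>
    <artifactId>git-commit-id-plugin</artifactId>
</plugin>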
When you build your project, the git.properties file will be generated in the target/classes folder. Here is a snippet of this file:
#Generated by Git-Commit-Id-Plugin
#Mon Apr 03 11:30:06 CEST 2023
git.branch=main
git.build.host=Ahmeds-MacBook-Pro.fritz.box
git.build.time=2023-04-03T11\:30\:06+0200
git.build.user.email=info@email.com
git.build.user.name=Ahmed Fikri
git.build.version=0.0.1-SNAPSHOT
git.closest.tag.commit.count=
git.closest.tag.name=
git.commit.id=557bc0a5c3f128d0b2d42177be7e452acca1d108
git.commit.id.abbrev=557bc0a
Reading and updating configuration properties:
The env endpoint is another crucial endpoint that provides an overview of all configuration properties in a transparent manner. This endpoint not only allows you to read properties but also enables the modification of properties in the running application.
Actuator also includes an endpoint that enables the shutdown of the application (it is disabled by default and must be switched on via management.endpoint.shutdown.enabled=true). For instance:
curl -X POST http://localhost:8080/actuator/shutdown
As you can see, Actuator can alter the behavior of your application, which is why it is crucial to secure the exposed endpoints using Spring Boot Security.
Configuring properties in Spring Boot improves the application’s flexibility and adaptability. This can be achieved by leveraging environment variables or Java system properties, or by defining properties within the deployed package. However, modifying a property will necessitate restarting the application or even rebuilding and redeploying it. Additionally, it’s not very convenient to modify a property for all instances of the same microservice.
Fortunately, Spring Cloud Config Server offers centralized configuration through a server that all microservices within an application can rely on to retrieve their configuration. The microservices consume the configuration through a REST API exposed by the Config Server.
The configuration served by the Config Server is kept external to the server, usually in a source code control system like Git. Using a source code management system brings with it many benefits: Versioning, branching, reverting and so on.
Creating Config Server
The Config Server itself is a microservice with the sole task of managing the configuration. It can be built like any other Spring Boot application by adding this dependency:
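(For Maven; the version is managed via the Spring Cloud BOM.)
<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-config-server</artifactId>
</dependency>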
And you will need to enable the Config Server, for example by annotating the application's bootstrap class with the @EnableConfigServer annotation.
@EnableConfigServer
@SpringBootApplication
public class ConfigServerApplication {
public static void main(String[] args) {
SpringApplication.run(ConfigServerApplication.class, args);
}
}
As mentioned earlier, the shared configuration is stored in a source control system, and you must provide the URL to access the configuration. In addition, microservices that consume configuration from the Config Server assume by default that the server is listening on port 8888, so it may be beneficial to configure the server to use that port. If you are using GitHub as the source control system, the application.yml file could look like the following:
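A minimal sketch (the repository URL is a placeholder for your own configuration repository):
server:
  port: 8888
spring:
  cloud:
    config:
      server:
        git:
          uri: https://github.com/your-account/your-config-repo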
To access a profile other than the default, simply substitute the default profile name with the desired profile name in the URL.
Please note that the term application in the URL (after the port number) can be substituted with any other string. This corresponds to the spring.application.name of the Microservice that is consuming the configuration.
Additionally, it is possible to configure the Config Server to retrieve configuration information from a specific branch. To do so, simply add the branch's name using the default-label parameter.
To make use of the configuration served by the Spring Cloud Config Server, the microservice needs to include the Config Server client dependency in its project configuration:
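For Maven this is the spring-cloud-starter-config dependency (version managed via the Spring Cloud BOM):
<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-config</artifactId>
</dependency>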
Upon running the application, the autoconfiguration will automatically register a property source that fetches properties from a Config Server. By default, the autoconfiguration assumes that the Config Server is running on localhost and listening on port 8888. If the Config Server is not running at the default location, it can be configured by setting the spring.cloud.config.uri property. Your configuration might then look like the following:
spring:
  application:
    name: mymicroservice
  cloud:
    config:
      uri: http://localhost:8888
To specify properties that are specific to a particular microservice and not shared with other microservices, you can create a file with the same name as the microservice. For example, if the name of your microservice is "mymicroservice", you can create a file called "mymicroservice.yml". If there are duplicate property definitions between the common properties in application.yml and those in the microservice-specific configuration file, the microservice-specific properties will take precedence.
Refreshing configuration properties on the fly
Spring Cloud Config Server provides a feature to refresh configuration properties of running applications without requiring a redeployment or restart. This means that changes made to the external configuration repository can be immediately reflected in the running application, with zero downtime. This can be achieved in two ways: manually, by calling the Actuator endpoint enabled at the microservice client; or automatically, by defining a commit hook in the Git repository that triggers a refresh on all servers that are clients of the Config Server. While manual refresh gives more precise control over when services are updated with fresh configuration, automatic refresh can save time and effort.
To refresh the configuration explicitly using the Config Server, you can add the Spring Boot Actuator starter dependency to the client’s project.
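Assuming the refresh endpoint is exposed (management.endpoints.web.exposure.include=refresh) and the beans that consume the properties are annotated with @RefreshScope, a refresh can then be triggered on the client (the port is just an example):
curl -X POST http://localhost:8080/actuator/refresh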
Microservice architecture involves breaking down an application into smaller, independent applications that are developed and deployed separately. These individual microservices work together to deliver the overall functionality of the larger application. This brings many advantages compared to a monolithic application:
Microservices are relatively easy to comprehend
Microservices are easier to test
Each microservice can be scaled independently of others
Microservices are less prone to encountering issues with library incompatibilities.
Different technology choices can be made for each microservice.
Microservices can be deployed to production more frequently.
However, the distributed nature of microservice architecture presents its own set of difficulties. For example, microservices must contend with inter-service communication and ensuring that each service knows how to reach its coordinating counterparts. Hardcoding every microservice with the host and port information of all other associated microservices is not only tedious but also makes the microservices tightly coupled. To address this issue, we create a dedicated microservice, a service registry, whose sole task is to provide the information needed to access other microservices, given just the microservice name. Microservices register with the service registry, providing a name and the information about how to access them. Consequently, microservices only need to know the names of the associated services; the access information is retrieved from the service registry.
Fortunately, Spring Cloud offers an out-of-the-box service registry called Eureka (developed by Netflix).
All you need to do is create a microservice, for example using Spring Initializr, and add the Eureka Server dependency:
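(For Maven; the version is managed via the Spring Cloud BOM.)
<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-netflix-eureka-server</artifactId>
</dependency>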
Additionally, you need to enable the Eureka server by annotating the application’s main bootstrap class with @EnableEurekaServer
@SpringBootApplication
@EnableEurekaServer
public class ServiceRegistryApplication {
public static void main(String[] args) {
SpringApplication.run(ServiceRegistryApplication.class, args);
}
}
And that's it! If you start the application, you'll have a Eureka service registry running and listening on port 8080. However, by default, microservices that register with the service registry expect Eureka to be listening on port 8761. In the development environment, it is therefore more convenient to change the Eureka port to 8761 using this parameter:
server.port=8761
Here’s a configuration that can be used in development environment for the Eureka server’s application.yml file:
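# minimal standalone setup for local development
server:
  port: 8761
eureka:
  client:
    register-with-eureka: false
    fetch-registry: false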
eureka.client.register-with-eureka is set to false to prevent the Eureka server from registering itself as a service with other Eureka servers (the Eureka server is itself a microservice). The default value is true.
eureka.client.fetch-registry is set to false to prevent the server from fetching the registry from other Eureka instances.
After the microservice has been started, it registers itself with the Eureka service. It is assumed that Eureka is operating on the local machine and listening on port 8761. However, the microservice is registered with the default name "UNKNOWN". To rectify this, you must specify the microservice name using this parameter:
spring.application.name
As you can see, registration is automatic. Consequently, the ports of the various microservices no longer need to be hard-coded and can be chosen at random. Instead of specifying a port for each microservice, we can set the port to 0 for all of them: assigning a value of 0 to the port parameter causes the application to start on a randomly selected available port. This eliminates a lot of concerns, as there's no need to fret over port conflicts or which service will use which port.
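Putting this together, a client microservice's application.yml might look roughly like the following sketch (the defaultZone entry only spells out the default Eureka location explicitly):
spring:
  application:
    name: mymicroservice
server:
  port: 0   # pick a random free port
eureka:
  client:
    service-url:
      defaultZone: http://localhost:8761/eureka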
It's natural to ask which instance will be used when a microservice has numerous instances. Eureka offers client-side load balancing to prevent a single instance from becoming overwhelmed. Ribbon (from Netflix) is a client-side load balancer that is available once the Eureka client dependency is added to the project. All you need to do on the client side is annotate the RestTemplate or WebClient bean with the @LoadBalanced annotation, as sketched below.
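A small sketch of the client-side wiring (class and service names are just examples):
import org.springframework.cloud.client.loadbalancer.LoadBalanced;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.web.client.RestTemplate;

@Configuration
public class RestClientConfig {

    // Marking the RestTemplate as @LoadBalanced lets it resolve service names
    // registered in Eureka (e.g. http://mymicroservice/...) and balance the
    // calls across the available instances.
    @Bean
    @LoadBalanced
    public RestTemplate restTemplate() {
        return new RestTemplate();
    }
}
A call such as restTemplate.getForObject("http://mymicroservice/hello", String.class) is then routed to one of the registered instances of mymicroservice.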
Keycloak has proven itself in securing Java web applications. However, it does not offer multi-factor authentication beyond basic OTP out of the box.
Fortunately, it is possible to delegate authentication in Keycloak to an identity provider that offers MFA, e.g. Azure AD. The advantage of this approach is that no changes need to be made in the client applications; you don't even need to redeploy them.
In order to achieve this, the following two steps must be taken:
Add a new registration for the Keycloak instance in Azure portal
Define Azure Active Directory as an identity provider in Keycloak.
1- Add a new registration for the Keycloak instance in the Azure portal
Sign in to the Azure portal and then go to Azure Active Directory:
Then switch to "app registrations"
Then register the Keycloak instance as follows:
A meaningful name should be chosen, and the link must be that of the Keycloak instance, e.g.
Note: The link consists of the realm ("master") and the identity provider alias ("azure-ad"). The identity provider will be created later, but its name can already be taken from the redirect URI.
The client ID and the directory ID must be noted. These are used in Keycloak:
In order for AD to be accessible from the outside, a new client secret must be created: In the menu Certificates & secrets → New client secret
The password is generated and must be copied immediately (it cannot be retrieved later):
2- Define Azure Active Directory as Identity Provider in Keycloak
In Keycloak as an admin, add OpenID Connect as a provider in the Identity Providers menu:
Then:
Enter Alias and Display name.
Enter the following link in the „Discovery endpoint“: https://login.microsoftonline.com/{directoryID}/v2.0/.well-known/openid-configuration {directoryID} must be replaced with the directory ID from Azure AD.
Enter the Client ID and Client Secret from Azure AD (these have already been generated and noted, see above).
Today I want to discuss an effective way to get rid of most of the data in a large table in Oracle.
For example, you may need to delete 90% of the data from a logging table or something similar.
A large table:
• Black area: unneeded data that should be deleted
• Green area: data that is still needed
• White area: unused space
Using a DELETE statement would most likely take forever and would generate a huge amount of REDO and UNDO (as we will see). Unnecessary REDO and UNDO only put stress on the entire database.
A DELETE statement should usually be used only to remove a small amount of data.
Starting with the second release of Oracle 12 (12.2.0.1), Oracle introduced an elegant way to achieve this: the DDL statement ALTER TABLE ... MOVE was enhanced to move only part of the data using a WHERE clause.
So it is possible to move and reorganize only the data that you want to keep, for example: alter table large_table move including rows where id < 1000 (only data that meets this condition remains in the table).
Let’s create a test table
create table mytable as select level id from dual connect by level < 101;
Using this new feature:
alter table mytable move including rows where id < 50;
This reorgs the table segment by keeping only data that meets the where clause.
It is even possible to delete all data (like truncate statement):
alter table mytable move including rows where 1=2;
No row meets this condition, and consequently the table will be emptied.
This feature was only introduced in version 12.2.
Does this mean that we must use a DML statement to delete data in versions older than 12.2 e.g., in 12.1 or 11.2.0.4?
Certainly not.
There are other ways to delete data without resorting to DML.
I will show you another method based on the partition exchange.
Using partition exchange doesn't mean that the target table has to be partitioned. Instead, we create a partitioned work table, fill its single partition with only the data that should be kept, and then perform a partition exchange.
We will test the new method and compare it to the method that uses partition exchange and to the method that uses DELETE statement.
We create a big table in which the data we want to delete is uniformly spread over all blocks, just to make the test more realistic:
CREATE TABLE big_tbl AS
SELECT ROWNUM id,
       2022 + MOD(ROWNUM, 10) year,
       rpad(ROWNUM, 10, 'x') col1,
       rpad(ROWNUM, 20, 'y') col2,
       rpad(ROWNUM, 30, 'z') col3,
       SYSDATE + ROWNUM/24/60 col4,
       rpad(ROWNUM, 50, 'a') col5,
       rpad(ROWNUM, 100, 'b') col6,
       rpad(ROWNUM, 100, 'b') col7,
       rpad(ROWNUM, 100, 'b') col8,
       rpad(ROWNUM, 100, 'b') col9
FROM (SELECT 1 FROM (SELECT 1 FROM dual CONNECT BY LEVEL < 11)
      CROSS JOIN (SELECT 1 FROM dual CONNECT BY LEVEL < 100001));
Check the number of rows:
SELECT /*+ parallel(t 4) */ YEAR,COUNT(*) FROM big_tbl t GROUP BY ROLLUP(YEAR) ORDER BY 1
YEAR     COUNT(*)
2022        10000
2023        10000
2024        10000
2025        10000
2026        10000
2027        10000
2028        10000
2029        10000
2030        10000
2031        10000
Total      100000
A PL/SQL block to create nine test tables as CTAS copies of big_tbl:
DECLARE
TYPE t_numbers IS TABLE OF NUMBER;
v_years t_numbers := t_numbers(2022,2023,2024,2025,2026,2027,2028,2029,2030);
PROCEDURE ex_ignor_err(cmd VARCHAR2) IS BEGIN EXECUTE IMMEDIATE cmd; EXCEPTION WHEN OTHERS THEN NULL; END;
PROCEDURE LOG(msg VARCHAR2) IS BEGIN dbms_output.put_line(msg); END;
BEGIN
FOR i IN 1..v_years.count LOOP
LOG(v_years(i));
ex_ignor_err( 'drop table tbl_' ||v_years(i) || ' purge') ;
EXECUTE IMMEDIATE( 'create table tbl_' || v_years(i) || ' as select /*+ parallel(t 4) */ * from big_tbl t');
END LOOP;
END;
Check the size of the tables:
SELECT segment_name,bytes/1024/1024 size_in_mb FROM user_segments WHERE segment_name LIKE 'TBL%'
TABLE        SIZE IN MB
TBL_2029     608
TBL_2030     608
TBL_2031     608
TBL_2022     608
TBL_2023     607
TBL_2024     608
TBL_2025     608
TBL_2026     608
TBL_2027     608
TBL_2028     608
Deleting data using alter table move including rows
DECLARE
v_redo NUMBER;
v_undo NUMBER;
v_ela NUMBER;
TYPE t_numbers IS TABLE OF NUMBER;
v_years t_numbers := t_numbers(2022,2023,2024,2025,2026,2027,2028,2029,2030);
FUNCTION get_stat(p_name VARCHAR2) RETURN NUMBER IS
ret NUMBER;
BEGIN
EXECUTE IMMEDIATE 'select b.value from v$statname a join v$mystat b using(statistic#) where a.name = :1' INTO ret USING p_name;
RETURN ret;
END get_stat;
PROCEDURE LOG(msg VARCHAR2) IS BEGIN dbms_output.put_line(msg); END;
BEGIN
FOR i IN 1..v_years.count LOOP
v_redo := get_stat('redo size');
v_ela := dbms_utility.get_time;
v_undo := get_stat('undo change vector size');
EXECUTE IMMEDIATE 'alter table tbl_'||v_years(i) || ' move including rows where year > ' ||v_years(i);
v_ela := dbms_utility.get_time - v_ela;
v_redo := get_stat('redo size') - v_redo;
v_undo := get_stat('undo change vector size') -v_undo;
LOG(i*10 || '% ' || ' ' ||v_redo ||' ' || v_undo || ' ' || v_ela );
END LOOP;
END;
The above block iterates over the nine test tables and deletes each time 10% more data than in the previous iteration. The get_stat function is used to calculate the amount of generated REDO/UNDO.
Deleting data using DML
DECLARE
v_redo NUMBER;
v_undo NUMBER;
v_ela NUMBER;
TYPE t_numbers IS TABLE OF NUMBER;
v_years t_numbers := t_numbers(2022,2023,2024,2025,2026,2027,2028,2029,2030);
FUNCTION get_stat(p_name VARCHAR2) RETURN NUMBER IS
ret NUMBER;
BEGIN
EXECUTE IMMEDIATE 'select b.value from v$statname a join v$mystat b using(statistic#) where a.name = :1' INTO ret USING p_name;
RETURN ret;
END get_stat;
PROCEDURE LOG(msg VARCHAR2) IS BEGIN dbms_output.put_line(msg); END;
BEGIN
FOR i IN 1..v_years.count -1 LOOP
v_redo := get_stat('redo size');
v_ela := dbms_utility.get_time;
v_undo := get_stat('undo change vector size');
EXECUTE IMMEDIATE 'delete from tbl_'||v_years(i) || ' where year <= ' ||v_years(i);
commit;
v_ela := dbms_utility.get_time - v_ela;
v_redo := get_stat('redo size') - v_redo;
v_undo := get_stat('undo change vector size') -v_undo;
LOG(i*10 || '% ' || ' ' ||v_redo ||' ' || v_undo || ' ' || v_ela );
END LOOP;
END;
Deleting data using partition exchange
DECLARE
v_redo NUMBER;
v_undo NUMBER;
v_ela NUMBER;
TYPE t_numbers IS TABLE OF NUMBER;
v_years t_numbers := t_numbers(2022,2023,2024,2025,2026,2027,2028,2029,2030);
v_cols sys.odcivarchar2list;
v_get_cols VARCHAR2(4000);
FUNCTION get_cols RETURN VARCHAR2 IS
ret VARCHAR2(4000);
BEGIN
FOR i IN 1..v_cols.count LOOP ret := ret || v_cols(i) || ','; END LOOP;
RETURN RTRIM(ret,',');
END get_cols;
PROCEDURE LOG(msg VARCHAR2) IS BEGIN dbms_output.put_line(msg); END;
PROCEDURE pc_del_data_using_prtexchg(p_trgt_tbl VARCHAR2, p_where_cond VARCHAR2, p_tblspace VARCHAR2, p_dop VARCHAR2 DEFAULT '4') IS
v_sql VARCHAR2(32767);
v_tmp_tbl VARCHAR2(30) := 'tmp_tbl_to_purge_old_data';
BEGIN
BEGIN
EXECUTE IMMEDIATE 'drop table ' || v_tmp_tbl || ' purge';
EXCEPTION
WHEN OTHERS THEN
NULL;
END ;
v_sql := 'CREATE TABLE ' || v_tmp_tbl || ' ('||v_get_cols||')
partition by range ('||v_cols(1)||') (partition p0 values less than (maxvalue) tablespace '||p_tblspace||')
as ( select /*+ parallel(t '||p_dop||') */ * from '||p_trgt_tbl||' t '||p_where_cond||')';
EXECUTE IMMEDIATE v_sql;
v_sql := 'ALTER TABLE ' || v_tmp_tbl || ' EXCHANGE PARTITION p0 WITH TABLE ' || p_trgt_tbl ;
EXECUTE IMMEDIATE v_sql;
EXCEPTION
WHEN OTHERS THEN
LOG(SQLERRM);
RAISE;
END pc_del_data_using_prtexchg;
FUNCTION get_stat(p_name VARCHAR2) RETURN NUMBER IS
ret NUMBER;
BEGIN
EXECUTE IMMEDIATE 'select b.value from v$statname a join v$mystat b using(statistic#) where a.name = :1' INTO ret USING p_name;
RETURN ret;
END get_stat;
BEGIN
SELECT column_name BULK COLLECT INTO v_cols FROM user_tab_cols WHERE table_name = 'BIG_TBL' ORDER BY column_id;
v_get_cols := get_cols;
FOR i IN 1..v_years.count LOOP
v_redo := get_stat('redo size');
v_ela := dbms_utility.get_time;
v_undo := get_stat('undo change vector size');
pc_del_data_using_prtexchg('tbl_'||v_years(i), ' where year > ' ||v_years(i), 'USERS');
v_ela := dbms_utility.get_time - v_ela;
v_redo := get_stat('redo size') - v_redo;
v_undo := get_stat('undo change vector size') -v_undo;
LOG(i*10 || '% ' || ' ' ||v_redo ||' ' || v_undo || ' ' || v_ela );
END LOOP;
END;
Elapsed time in ms to delete from a table with 1 Mio rows (600MB). Alter table move vs partition exchange vs Delete
% deleted   MOVE_DOP 1   MOVE_DOP 4   PEXC_DOP 1   PEXC_DOP 4   DELETE
   10%          8070         4300        11740         9920       5420
   20%          7240         3800        10910         8860       9350
   30%          6460         3550        10180         8460      12610
   40%          5930         3130         9440         7820      16540
   50%          5130         2800         8230         6930      18580
   60%          4580         2640         7720         6270      23570
   70%          3910         2370         6920         5700      27950
   80%          3340         1850         6510         4900      32270
   90%          2890         1560         5770         4510      46880
(MOVE = alter table move including rows, PEXC = partition exchange, DOP = degree of parallelism)
Elapsed time to delete from a table with 1 Mio rows (600MB). Alter table move vs partition exchange vs Delete
We see that using DML to delete a big amount of data is not suitable. The new feature is very performant, especially when parallel processing is used.
Let’s check the amount of generated REDO/UNDO
Amount of generated REDO/UNDO in MB – Deletion from a table with 1 Mio rows: 600 MB
% deleted   Redo_Delete   Undo_Delete   Redo_MOVE   Undo_Move   Redo_PEXC   Undo_PEXC
   10%           81856         63481         481          61         486          67
   20%          166274        127545         453          58         408          64
   30%          246739        191644         426          54         379          60
   40%          328479        255715         401          52         356          58
   50%          410338        319785         374          49         326          54
   60%          491963        383787         348          45         301          51
   70%          574941        447902         320          42         275          48
   80%          656880        511998         295          39         249          45
   90%          738101        576195         265          35         222          42
The new feature and the method using partition exchange hardly generate REDO/UNDO in contrast to the method using DML.
Same test using a table with 10 Mio rows (6GB)
Elapsed time to delete from a table with 10 Mio rows (6GB). Alter table move vs partition exchange
% deleted   MOVE_DOP 1   MOVE_DOP 4   PEXC_DOP 1   PEXC_DOP 4
   10%          84040        60370       133170        86090
   20%          70120        48910       107490        85520
   30%          59560        45040        90850        78610
   40%          53760        42210        83130        69570
   50%          52450        36210        77590        66380
   60%          43260        32750        73830        60630
   70%          37120        27480        67570        55770
   80%          33120        21520        61460        49840
   90%          27320        14750        54860        44020
Elapsed time (ms) to delete from a table with 10 Mio rows (6GB). Alter table move vs partition exchange
Amount of generated REDO/UNDO (table with 10 Mio rows, 6GB):
% deleted   Redo_MOVE   Undo_Move   Redo_PEXC   Undo_PEXC
   10%          1289         115        1242         121
   20%          1213         111        1188         117
   30%          1139         107        1109         113
   40%          1066         103        1016         109
   50%           989          99         943         105
   60%           914          95         868         101
   70%           841          91         792          97
   80%           782          95         720          93
   90%           564          65         513          70
The two methods hardly generate REDO/UNDO and are very fast.
It appears that the VBoxManage API (version 6.1.16r140961) has a problem when changing the disk type to "shareable". I wanted to create a shareable hard drive to use it in a RAC installation. I created a fixed size virtual hard disk (as this is a requirement for using the same VDI in multiple virtual machines at the same time):
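The disk itself was created roughly like this (the size is just an example value, in MB):
VBoxManage createmedium disk --filename "L:\vms\rac\rac_FRA1.vdi" --size 20480 --variant Fixed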
However, I got the following error by issuing the following command:
VBoxManage modifymedium disk "L:\vms\rac\rac_FRA1.vdi" --type shareable
VBoxManage.exe: error: Cannot change type for medium 'L:\vms\rac\rac_FRA1.vdi' to 'Shareable' since it is a dynamic medium storage unit
VBoxManage.exe: error: Details: code VBOX_E_INVALID_OBJECT_STATE (0x80bb0007), component MediumWrap, interface IMedium, callee IUnknown
VBoxManage.exe: error: Context: "COMSETTER(Type)(enmMediumType)" at line 775 of file VBoxManageDisk.cpp
I got the same error for the second disk as well. In the Virtual Media Manager, I saw that the disks are fixed size storage:
Strangely enough, changing the type in the GUI worked:
After changing the type of one disk in the UI, I was able to change the type of the second disk using the API (this is strange).
When I tried to set the log_archive_dest_1, I got the error ORA-16179:
SQL> alter system set log_archive_dest_1='/u1/oracle/archivelogs/ora12db1/';
alter system set log_archive_dest_1='/u1/oracle/archivelogs/ora12db1/'
*
ERROR at line 1:
ORA-32017: failure in updating SPFILE
ORA-16179: incremental changes to "log_archive_dest_1" not allowed with SPFILE
The error message is not very meaningful; one would rather expect something like an incorrect path or a syntax error.
After checking the documentation, I saw that log_archive_dest_1 is more complex than a simple path (unlike, for example, db_create_file_dest): you have to specify whether the archive location is remote (on another server) or local, and there are a lot of other things that can be specified in this parameter. In my case, I just wanted to keep the archive logs on the same machine as the Oracle instance, so I only needed to add the LOCATION keyword at the beginning of the parameter value:
SQL> alter system set log_archive_dest_1='location=/u1/oracle/archivelogs/ora12db1/';
System altered.
Note:
If the path does not exist (or the Oracle instance owner has no access to it), the following error is displayed:
SQL> alter system set log_archive_dest_1='location=/u1/oracle/archivelogs/ora12db1/';
alter system set log_archive_dest_1='location=/u1/oracle/archivelogs/ora12db1/'
*
ERROR at line 1:
ORA-02097: parameter cannot be modified because specified value is invalid
ORA-16032: parameter LOG_ARCHIVE_DEST_1 destination string cannot be translated
ORA-07286: sksagdi: cannot obtain device information.
Linux-x86_64 Error: 2: No such file or directory