SQL: How to join versioned tables

In some cases, data versioning is used at the record level. This is especially useful because you can version tables independently. You only need to version those records, that have changed, without changing dependent records from other tables. This by using two date columns in each table „version_from“ and „version_to“. So that there is always exactly one version of a record for a given time from each table. This reduces the memory space and the IOs.

Because in any given time there is at most only one version of a record per table, joining two tables in a given time is easy. However to join the tables independently for a given time (all changes across two tables) is little bit tricky. In this Blog I will show you how to do that.

For instance let’s create two tables „tbl_a“ and „dim_1“. The two tables should be joined using the columns tbl_a.dim_1_id and dim_1.id.

-- create table tbl_a
create table tbl_a (
                      id number
                      ,valid_from timestamp(6)
                      ,valid_to timestamp(6)
                      ,dim_1_id number
                      ,some_info varchar2(10)
);


--create table dim_1
create table dim_1 (
                      id number
                      ,valid_from timestamp(6)
                      ,valid_to timestamp(6)
                      ,some_value number
);

Let’s generate some data:

insert into tbl_a
select 123456 id,
               systimestamp + numtodsinterval(1, 'SECOND') +
               numtodsinterval(rownum - 1, 'HOUR') valid_from,
               systimestamp + numtodsinterval(rownum, 'HOUR') valid_to,
               123 dim_1_id,
               decode(mod(rownum,3),1,'value1',2,'value2','value3') some_value
               
          from dual connect by level < 4
union all
select 123456 id,
               systimestamp + numtodsinterval(1, 'SECOND') +
               numtodsinterval(rownum - 1, 'HOUR') valid_from,
               systimestamp + numtodsinterval(rownum, 'HOUR') valid_to,
               124 dim_1_id,
               decode(mod(rownum,3),1,'value1',2,'value2','value3') some_value
               
          from dual connect by level < 4
union all
select 123456 id,
               systimestamp + numtodsinterval(1, 'SECOND') +
               numtodsinterval(rownum - 1, 'HOUR') valid_from,
               systimestamp + numtodsinterval(rownum, 'HOUR') valid_to,
               125 dim_1_id,
               decode(mod(rownum,3),1,'value1',2,'value2','value3') some_value
               
          from dual connect by level < 4
;
commit;
insert into dim_1
  select 123 id,
         systimestamp + numtodsinterval(1, 'SECOND') +
         numtodsinterval(30 * (rownum - 1), 'MINUTE') valid_from,
         systimestamp + numtodsinterval(30 * rownum, 'MINUTE') valid_to,
         decode(mod(rownum, 3), 1, 1000, 2, 2000, 3000) some_value
  
    from dual
  connect by level < 10
  union all
  select 124 id,
         systimestamp + numtodsinterval(1, 'SECOND') +
         numtodsinterval(30 * (rownum - 1), 'MINUTE') valid_from,
         systimestamp + numtodsinterval(30 * rownum, 'MINUTE') valid_to,
         decode(mod(rownum, 3), 1, 1000, 2, 2000, 3000) some_value
  
    from dual
  connect by level < 10
  union all
  select 125 id,
         systimestamp + numtodsinterval(1, 'SECOND') +
         numtodsinterval(30 * (rownum - 1), 'MINUTE') valid_from,
         systimestamp + numtodsinterval(30 * rownum, 'MINUTE') valid_to,
         decode(mod(rownum, 3), 1, 1000, 2, 2000, 3000) some_value
  
    from dual
  connect by level < 10;
commit;
select * from tbl_a order by valid_to desc
select * from dim_1 order by id,valid_to des

Joining the tables at given a time is straightforward

select *
  from tbl_a a
  join dim_1 b
    on a.dim_1_id = b.id
   and to_timestamp('16.03.2019 18:42:00', 'dd.mm.yyyy hh24:mi:ss') between
       a.valid_from and a.valid_to
   and to_timestamp('16.03.2019 18:42:00', 'dd.mm.yyyy hh24:mi:ss') between
       b.valid_from and b.valid_to
join result at to_timestamp(‚16.03.2019 18:42:00‘, ‚dd.mm.yyyy hh24:mi:ss‘)

But joining the tables without specifying a given time is little bit tricky:

select a.id,
       least(a.valid_from, b.valid_from) valid_from,
       greatest(a.valid_to, b.valid_to) valid_to,
       a.some_info,
       b.some_value
  from tbl_a a
  join dim_1 b
    on a.dim_1_id = b.id
   and greatest(a.valid_from, b.valid_from) < least(a.valid_to, b.valid_to)
 order by 3 desc
Join result without specifying a time (all versions change across both tables)

You can use the same to join several tables.