Java EE: Make an @Entity “empty” using a @Transient proxy

July 22, 2016

When optimizing a Java EE application to reduce its database footprint, we eventually reach the point of trying to get rid of duplicate data.
Processing flows often store partially processed data, or intermediate states of the data, at several steps, and this tends to generate unwanted duplicates. When the duplicated data consists of BLOBs, the footprint in the database can be huge, and over time we may discover that hundreds of GB of database space are wasted. The cost is not only the wasted space but also a decrease in database performance due to the extra overhead introduced by the duplicate data.

Let's take the concrete case of two entities defined as below:

An image entity:

And a container entity that refers to it.
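
The two starting entities could look roughly like the minimal sketch below. The field names (`data`, `image`) and the one-to-one mapping are assumptions made for illustration. The annotations are declared locally as stand-ins so the sketch compiles without a JPA jar; in a real application they would be imported from `javax.persistence`.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

// Stand-ins so the sketch compiles without a JPA jar on the classpath;
// a real application imports the same-named annotations from javax.persistence.
@Retention(RetentionPolicy.RUNTIME) @interface Entity {}
@Retention(RetentionPolicy.RUNTIME) @interface Id {}
@Retention(RetentionPolicy.RUNTIME) @interface Lob {}
@Retention(RetentionPolicy.RUNTIME) @interface OneToOne {}

// The image entity: one row per picture, with the binary stored as a BLOB.
@Entity
class Image {
    @Id
    private Long id;

    @Lob
    private byte[] data; // hypothetical field name for the picture bytes

    public Long getId() { return id; }
    public void setId(Long id) { this.id = id; }
    public byte[] getData() { return data; }
    public void setData(byte[] data) { this.data = data; }
}

// The container entity: holds a persistent reference to its Image.
@Entity
class CollectionItem {
    @Id
    private Long id;

    @OneToOne
    private Image image;

    public Image getImage() { return image; }
    public void setImage(Image image) { this.image = image; }
}
```

With this mapping every CollectionItem that receives a picture also causes a row, and a BLOB, to be written to the Image table.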

We also know that the actual image data also exists in another entity called Message, which represents an incoming data package that contains the image file binary as well. This is usually the case in enterprise systems, where the input data must be kept for audit purposes in the exact form in which it was received.

So in our case we have three entities that correspond to three tables in the database: Message, Image and CollectionItem, with redundant data in the Message and Image tables. To eliminate the data duplication we have to make the following changes.

Define a new proxy class that sits in front of the real Image entity.
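
A minimal sketch of such a proxy is shown below. The class name `ImageProxy` and its single `data` field are assumptions. The important property is that the class carries no @Entity annotation, so JPA never maps it to a table.

```java
// Plain Java class, deliberately NOT an @Entity: it only carries the picture
// bytes in memory on behalf of the real Image entity.
class ImageProxy {
    private final byte[] data; // hypothetical field name

    ImageProxy(byte[] data) {
        this.data = data;
    }

    byte[] getData() {
        return data;
    }
}
```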

Change the container entity by adding a new field marked with the @Transient annotation, which specifies that the property or field is not persistent. Then change the setter and getter of the image member to use the proxy object instead.
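
The changed container could be sketched as follows. This is an illustration under several assumptions, not the original listing: the field names, the `ImageProxy` class, the `hasPersistentImage` helper and the body of `getItemPicturesFromZip` are all hypothetical (the real helper presumably unpacks the archive carried by the message; here it simply wraps the payload). Legacy images are wrapped in a proxy so the getter has a single return type. The annotations are local stand-ins for the `javax.persistence` ones, and the small supporting classes are repeated so that the block compiles by itself.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

// Stand-ins so the sketch compiles without a JPA jar; a real application
// imports the same-named annotations from javax.persistence.
@Retention(RetentionPolicy.RUNTIME) @interface Entity {}
@Retention(RetentionPolicy.RUNTIME) @interface Id {}
@Retention(RetentionPolicy.RUNTIME) @interface Lob {}
@Retention(RetentionPolicy.RUNTIME) @interface ManyToOne {}
@Retention(RetentionPolicy.RUNTIME) @interface OneToOne {}
@Retention(RetentionPolicy.RUNTIME) @interface Transient {}

@Entity
class Image {                        // only legacy rows remain in this table
    @Id Long id;
    @Lob byte[] data;
}

@Entity
class Message {                      // incoming package, kept for audit
    @Id Long id;
    @Lob byte[] payload;
}

class ImageProxy {                   // plain class, never persisted
    private final byte[] data;
    ImageProxy(byte[] data) { this.data = data; }
    byte[] getData() { return data; }
}

@Entity
class CollectionItem {
    @Id private Long id;
    @OneToOne private Image image;             // non-null only for legacy items
    @ManyToOne private Message sourceMessage;  // where the bytes really live
    @Transient private ImageProxy imageProxy;  // in-memory cache, no column

    void setSourceMessage(Message message) { this.sourceMessage = message; }

    // "Fake" save: cache the data and leave the persistent reference untouched.
    public void setImage(ImageProxy proxy) { this.imageProxy = proxy; }

    public ImageProxy getImage() {
        if (imageProxy == null) {
            imageProxy = (image != null)
                    ? new ImageProxy(image.data)                    // legacy row
                    : getItemPicturesFromZip(sourceMessage, this);  // rebuild
        }
        return imageProxy;
    }

    // Hypothetical helper: true when a row exists in the Image table.
    boolean hasPersistentImage() { return image != null; }

    // Placeholder for the original helper of the same name; it only wraps
    // the message payload instead of unpacking an archive.
    static ImageProxy getItemPicturesFromZip(Message message, CollectionItem item) {
        return new ImageProxy(message.payload);
    }
}
```

Because the annotations sit on the fields, a JPA provider would use field access, so the extra helper methods are not treated as persistent properties.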

With the new setter we only “fake” saving the binary data into the entity: the data is just cached in a transient object. When a new CollectionItem is created, no data is saved under the Image entity.

With the new getter, if image == null then no row exists in the Image table for that image, so we extract the real data from the input message (getItemPicturesFromZip(sourceMessage, this)) and cache it in the transient field in case we need it for other operations in the current transaction.

For legacy images (image != null), which do have an entry in the Image table, the new getter keeps the old behaviour.

This is a very useful trick that can be applied in many cases where an entity duplicates data that can be rebuilt from another persistent source.