Jun 16, 2014

What Can Querydsl Do for Me Part 2: Multi-Tenant Application Support and Deep Initialization Paths

Bryan Chapman

WHAT IS MULTITENANCY?

Multitenancy refers to the ability of a single instance of an application to run on a single server (or distributed servers) and serve multiple organizations or clients (tenants) simultaneously. The alternative is a multi-instance architecture, in which a separate instance of the application is deployed for each tenant. Multitenancy, while more complex to develop and implement, can yield significant rewards if done correctly. These include:

–  Decreased hardware and software license requirements: machines can be reused.
–  Simplified data aggregation: customer data is stored in a single location and in a standard schema.
–  Simplified release management: one build, one production environment.

While the rewards are certainly worth the development investment, implementing a system in this fashion is not free of issues and concerns. One of the critical risks is data segregation, which is what this post focuses on. When multiple customers’ data is saved in a single data store and accessible through one application, security concerns are bound to surface. How can we ensure, in a general way that reduces the QA effort involved in securing this data, that users see only the data that applies to their user or organization? Let’s take a look at how Querydsl can help.

INTRODUCE THE NEED FOR MULTITENANCY

Let’s begin by introducing the need for multitenancy into our previous banking application. Up until this point, we weren’t exposing any customer-specific data; we were only displaying all of the available branches for various banks. Now let’s imagine that banks wish to store all of their banking transaction data in our application, and it is up to us to secure and filter this data for them.

Time to switch branches:

git checkout b2.1

From a quick glance at our new source, you’ll notice we have introduced a Customer and a Transaction model, as well as repositories for retrieving this data. In addition, there is a UserService, which we will use for mock authentication. All this does is place the current username into a ThreadLocal to simulate an authentication model (you should use your own solution, which is slightly more secure).
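For readers who do not have the project checked out, a mock authentication holder along these lines could look like the sketch below. Only the UserService.getCurrentUser() call appears in the tutorial code shown in this post; the class name UserServiceSketch, the setCurrentUser and clear helpers, and the username "fwood" are assumptions made for illustration, and the real class in the repository may differ.

```java
// Hypothetical stand-in for the tutorial's UserService; not the project's actual source.
final class UserServiceSketch {
    // One slot per thread, so concurrent requests never see each other's user.
    private static final ThreadLocal<String> CURRENT_USER = new ThreadLocal<>();

    private UserServiceSketch() {}

    // Assumed helper: "log in" by remembering the username for the calling thread.
    public static void setCurrentUser(String username) {
        CURRENT_USER.set(username);
    }

    // Mirrors the getCurrentUser() call used in this post; null if no user was set.
    public static String getCurrentUser() {
        return CURRENT_USER.get();
    }

    // Assumed helper: clear the slot so pooled threads do not keep a stale user.
    public static void clear() {
        CURRENT_USER.remove();
    }

    public static void main(String[] args) throws InterruptedException {
        setCurrentUser("fwood"); // made-up username for Frank
        System.out.println(getCurrentUser());

        // A different thread has its own empty slot and sees no user at all.
        Thread other = new Thread(() -> System.out.println(getCurrentUser()));
        other.start();
        other.join();

        clear();
    }
}
```

Because the value lives in a ThreadLocal, the second thread prints null rather than Frank's username. That isolation is what makes the trick usable as a stand-in for a per-request security context, and it is also why the slot should be cleared when a request finishes.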

When you run the code on this branch, you’ll notice that even though we “authenticated” using Frank’s account, we are able to view all of Alice’s and Matt’s transactions. I certainly don’t want anyone else accessing my personal data, thus the need for multitenancy has arisen.

Transactions found with findAll():
-------------------------------
Transaction[id=1, customer='Frank Wood', accountNumber='583942302', ...
Transaction[id=4, customer='Alice Lain', accountNumber='294032410', ...
Transaction[id=8, customer='Matt Lean', accountNumber='034502102', ...

HOW CAN WE SEPARATE OUR CUSTOMERS’ DATA USING QUERYDSL?

Time to switch branches:

git checkout b2.2

Querydsl lets us build our own dynamic queries, and the Transaction model has a reference to the customer, so why can’t we just build this restriction into the query and retrieve the current username from the thread local ourselves? All we have to do is append the following clause to all of our existing calls to the Transaction repository:

.and(tr.customer.username.eq(UserService.getCurrentUser()))

Now if we run our application again, it appears that it almost worked this time. What went wrong?

As you can see from the output, most of the searches restricted the results to Frank’s account based on the current username, but the final search request ignored the currently logged-in user and passed in Alice’s username directly, and nothing stopped it. The developer followed the rules and accessed only the current user’s transactions in most cases, then slipped on the last one. Ideally, our design would not even allow this scenario and would take that power away from the developer (ouch, that hurts to say).

Transactions found for Alice: transactionService.findAll(tr.customer.username.eq("alain"));
-------------------------------
Transaction[id=4, customer='Alice Lain', ...
Transaction[id=5, customer='Alice Lain', ...
...

Restricting our results with the clause from above, .and(tr.customer.username.eq(...)), was a good start, but there is a better place for this logic: one where the user of the repository needs no knowledge of the restriction and retrieves the expected filtered results without any explicit action on their part. This prevents accidental data access.

IS THERE A BETTER WAY TO DO THIS?

Time to switch branches:

git checkout b2.3

Up until this point we have been interacting with our repositories directly, whereas in most projects you would have some type of service layer between your controllers and your repositories. In this branch, you will notice we have introduced a TransactionService to do just that. For this example, we only expose the findAll() methods from the repository; however, you may need to expose many more, or use the JPA repository methods available, to suit your needs. If we take a look at the implementation of the findAll() methods below, we’ll notice that the no-arg findAll() method simply passes the dummy predicate tr.isNotNull() to findAll(BooleanExpression). This is done for the sake of clarity. You could also have passed null as the parameter, and the results would have been the same with our implementation, because the second method substitutes the same predicate when it receives null.

public Iterable<Transaction> findAll() {
    QTransaction tr = QTransaction.transaction;

    // include an always-true predicate as a building point
    return findAll(tr.isNotNull());
}

public Iterable<Transaction> findAll(BooleanExpression expression) {
    QTransaction tr = QTransaction.transaction;

    // need a valid predicate; include an always-true predicate
    // as a building point
    if (expression == null) {
        expression = tr.isNotNull();
    }

    // limit results to the current authenticated user without
    // the caller needing to know about this restriction
    expression = expression.and(tr.customer.username.eq(UserService.getCurrentUser()));

    return repository.findAll(expression);
}

Now when we run our application, our attempt to retrieve Alice’s transactions returns an empty set, even though the calling code never mentioned the current user. This is how Querydsl can support multitenancy without any explicit action from the caller.

Transactions found for Alice: transactionService.findAll(tr.customer.username.eq("alain"));
-------------------------------
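The difference between calling the repository directly and going through the service can be reproduced without a database. The sketch below is an in-memory approximation only: java.util.function.Predicate stands in for Querydsl’s BooleanExpression, and the class TenantFilterSketch, the Tx type, the method names, and every username except "alain" are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// In-memory stand-in for the service layer; no Querydsl or JPA involved.
final class TenantFilterSketch {
    // Minimal transaction row: id plus the owning customer's username.
    static final class Tx {
        final int id;
        final String username;

        Tx(int id, String username) {
            this.id = id;
            this.username = username;
        }

        @Override
        public String toString() {
            return "Tx[id=" + id + ", username=" + username + "]";
        }
    }

    // Sample rows; "fwood" and "mlean" are made-up usernames.
    static final List<Tx> ALL = Arrays.asList(
            new Tx(1, "fwood"), new Tx(4, "alain"), new Tx(5, "alain"), new Tx(8, "mlean"));

    // Assumed mock authentication, playing the role of UserService.getCurrentUser().
    static final ThreadLocal<String> CURRENT_USER = new ThreadLocal<>();

    // "Repository": applies whatever predicate the caller supplies, with no tenant check.
    static List<Tx> repositoryFindAll(Predicate<Tx> predicate) {
        List<Tx> out = new ArrayList<>();
        for (Tx tx : ALL) {
            if (predicate.test(tx)) {
                out.add(tx);
            }
        }
        return out;
    }

    // "Service": always ANDs the current-user restriction onto the caller's predicate.
    static List<Tx> serviceFindAll(Predicate<Tx> predicate) {
        Predicate<Tx> base = predicate;
        if (base == null) {
            base = tx -> true; // always-true starting point, like tr.isNotNull()
        }
        String user = CURRENT_USER.get();
        return repositoryFindAll(base.and(tx -> tx.username.equals(user)));
    }

    public static void main(String[] args) {
        CURRENT_USER.set("fwood");
        Predicate<Tx> alice = tx -> tx.username.equals("alain");
        System.out.println(repositoryFindAll(alice)); // direct call returns Alice's rows
        System.out.println(serviceFindAll(alice));    // empty, because Frank is logged in
        System.out.println(serviceFindAll(null));     // only Frank's row
        CURRENT_USER.remove();
    }
}
```

The design point is the same as in the TransactionService: the restriction is added in exactly one place, after the caller has handed over its predicate, so a caller cannot forget it or override it.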

DEEP INITIALIZATION PATHS

While developing this post and experimenting with different aspects of Querydsl, I came across an interesting issue when attempting to build queries on properties nested more than two levels deep. After a quick search through the documentation, I found a fairly straightforward solution: deep initialization paths.

The Querydsl documentation describes path initialization. By default, Querydsl initializes only direct reference properties (or second-level references in newer versions of Querydsl). If you require longer initialization paths, they must be annotated using the @QueryInit annotation. Below is an example from the documentation:

@Entity
class Event {
    @QueryInit("customer")
    Account account;
}

@Entity
class Account {
    Customer customer;
}

@Entity
class Customer {
    String name;
    // ...
}

In this example, the annotation will force initialization of the account.customer path whenever the Event path is initialized as a root path variable.

Note: The path initialization format supports wildcards as well, e.g., “customer.*” or just “*”.

Well, that’s all fine and great, but what does it mean? Looking at the example model above, if you did not include the @QueryInit annotation on the Event’s account, the following expression would fail with an NPE, since the account.customer path would not be initialized by default.

eventRepository.findAll(event.account.customer.name.eq("Bryan"));
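To see why an uninitialized path surfaces as a plain NullPointerException, it can help to model the query types by hand. The sketch below is a deliberately simplified stand-in: the real Q-classes are generated by Querydsl’s annotation processor and look different, and the class PathInitSketch, its nested types, and the canReachCustomerName helper are all hypothetical. It only models the behaviour described above, where a nested path field stays null unless it was asked for.

```java
import java.util.Collections;
import java.util.Set;

// Hand-written model of nested query paths; not generated Querydsl code.
final class PathInitSketch {
    static final class QCustomer {
        final String name; // placeholder for a string path

        QCustomer(String parentPath) {
            this.name = parentPath + ".name";
        }
    }

    static final class QAccount {
        final QCustomer customer; // stays null unless explicitly initialized

        QAccount(boolean initCustomer) {
            this.customer = initCustomer ? new QCustomer("event.account.customer") : null;
        }
    }

    static final class QEvent {
        final QAccount account; // direct reference, always initialized in this model

        // inits plays the role of the paths listed in @QueryInit on Event.account.
        QEvent(Set<String> inits) {
            this.account = new QAccount(inits.contains("customer"));
        }
    }

    // Walks event.account.customer.name and reports whether the walk succeeded.
    static boolean canReachCustomerName(QEvent event) {
        try {
            return event.account.customer.name != null;
        } catch (NullPointerException e) {
            return false; // customer was never initialized
        }
    }

    public static void main(String[] args) {
        // Without the init entry, dereferencing customer fails.
        System.out.println(canReachCustomerName(new QEvent(Collections.<String>emptySet())));
        // With "customer" listed, the full path is available.
        System.out.println(canReachCustomerName(new QEvent(Collections.singleton("customer"))));
    }
}
```

In this model the failure happens while the expression is being built, before any query is sent anywhere, which matches the stack trace pointing at application code rather than at the database layer.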

EXAMPLE

Let’s take a look at an example from our own application:

Time to switch branches:

git checkout b2.4

We have now added a reference to a parent/guardian customer to the account model, giving us three levels of references for example purposes. We attempt to search for transactions based on the first name of the account’s parent/guardian. Based on our previous discussion, the path to the account’s parent/guardian would not be initialized by default, so we can assume this request will fail. As expected, we receive an NPE when this example is run:

transactionService.findAll(tr.customer.account.custodialAccountHolder.firstName.eq("Alice"));

Caused by: java.lang.NullPointerException
    at com.credera.querydsl.Application.main(Application.java:82)

Time to switch branches:

git checkout b2.5

Now that we know the problem and have a better understanding of initialization path limitations in Querydsl, we have a one-line fix to the problem. Our annotation below will initialize the custodialAccountHolder field within the customer’s account, and we can now complete our search without any NPEs.

@ManyToOne
@QueryInit("account.custodialAccountHolder")
private Customer customer;

If you have any questions about the Querydsl tutorial project, please feel free to use the comments section below or contact us at info@credera.com. You can also follow us on Twitter at @CrederaOpen and connect with us on LinkedIn for additional Open Technologies insights.
