Tag Archives: Spring-Data

On a recent contract the team had decided to use Solr to implement full-text search over the product catalogue of an e-commerce platform. Naturally we were approaching development with a TDD mindset, and were keen to implement both unit tests for core business functionality and integration tests for a more end-to-end style of testing. The primary application stack consisted of Spring (Core, Data, MVC), MySQL and Solr 4.

Just a slight aside, but for anyone looking to implement full-text search the primary candidates are Solr and ElasticSearch. I won’t discuss the merits of either implementation further, as it’s best to evaluate each with respect to your use cases (and here is an excellent resource to help you decide: http://solr-vs-elasticsearch.com/).

With our chosen frameworks and datastores we found unit testing relatively straightforward, and decided to use JUnit (driven via the Maven Surefire plugin), Mockito for mocking external dependencies (persistence layer, API calls etc.), and PowerMock for the difficult mocking (for example, mocking static method calls of several reliable-but-decidedly-old-skool dependencies).

Integration testing was also relatively easy to set up. We again chose to drive tests via JUnit (this time via the Maven Failsafe plugin), and used Spring’s @ContextConfiguration and AbstractTransactionalJUnit4SpringContextTests to manage injected sub-components (@Autowired etc.) and instantiate various parts of the application for testing. We also ran an embedded H2 database to allow realistic simulation of a SQL datastore (just an aside: in ~99% of ‘standard’ use cases I have found H2 to behave identically to MySQL, but there are a couple of corner cases to watch out for; this will be another blog post :)).

The Problem – How do we run an embedded Solr?

When we first started using Solr 4 we naturally wanted to create integration tests running against this datastore, and we wanted to run this in the same manner as we did with H2 – executing as a light-weight in-memory (embedded) process that we could create, pre-load, and destroy relatively quickly.
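
To make the create, pre-load, destroy cycle concrete, here is a minimal sketch of a throwaway data directory using only the JDK (Java 7+). The TempDataDir class and the "solr-data-" prefix are hypothetical names invented for illustration; they are not part of Solr or of our codebase. The idea is simply that each test run gets a fresh directory beneath the system temporary directory and removes it afterwards.

```java
import java.io.Closeable;
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;

/** Hypothetical helper: a throwaway data directory that is deleted when closed. */
final class TempDataDir implements Closeable {

    private final Path dir;

    TempDataDir(String prefix) throws IOException {
        // Created beneath the system temporary directory (java.io.tmpdir).
        this.dir = Files.createTempDirectory(prefix);
    }

    Path path() {
        return dir;
    }

    /** Recursively deletes the directory and everything written into it. */
    @Override
    public void close() throws IOException {
        if (!Files.exists(dir)) {
            return; // already cleaned up, so closing twice is harmless
        }
        Files.walkFileTree(dir, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) throws IOException {
                Files.delete(file);
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult postVisitDirectory(Path d, IOException exc) throws IOException {
                if (exc != null) {
                    throw exc;
                }
                Files.delete(d); // children are gone by now, so the directory is empty
                return FileVisitResult.CONTINUE;
            }
        });
    }
}
```

Used in a try-with-resources block, the directory disappears at the end of the test. Its path could be supplied as the solr.data.dir system property so that each run starts from a clean location.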

We soon found the EmbeddedSolrServer class distributed within the Solr package, and although useful it didn’t fit exactly with the way we wanted to design and deploy the Solr communication layer within our Spring application. For production use we wanted to instantiate a SolrServer bean for which we supply the target endpoint on the network (under the hood this SolrServer bean would actually be instantiated using a custom HttpSolrServer class). We needed a way to create an ‘embedded’ version that extended the SolrServer base class, but also allowed us to override the Solr config and data directory (to load pre-canned indexes etc.).
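
The shape we were after can be sketched in plain Java. Everything here is a hypothetical stand-in: SearchServer, RemoteSearchServer, InProcessSearchServer, SearchServerFactory, the profile names and the endpoint URL are invented for illustration, and none of it is SolrJ or Spring API. The point is only that calling code depends on one type, and configuration decides whether it gets a network-backed or an in-process implementation. In the real application, Spring profiles do the choosing.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the common server type that callers depend on. */
interface SearchServer {
    void add(String document);

    List<String> query(String term);
}

/** Stand-in for the production client, which would talk to a remote endpoint. */
final class RemoteSearchServer implements SearchServer {

    private final String endpointUrl;

    RemoteSearchServer(String endpointUrl) {
        this.endpointUrl = endpointUrl;
    }

    String endpointUrl() {
        return endpointUrl;
    }

    @Override
    public void add(String document) {
        throw new UnsupportedOperationException("network call omitted in this sketch");
    }

    @Override
    public List<String> query(String term) {
        throw new UnsupportedOperationException("network call omitted in this sketch");
    }
}

/** Stand-in for the embedded server: everything lives inside the test JVM. */
final class InProcessSearchServer implements SearchServer {

    private final List<String> documents = new ArrayList<String>();

    @Override
    public void add(String document) {
        documents.add(document);
    }

    @Override
    public List<String> query(String term) {
        List<String> hits = new ArrayList<String>();
        for (String document : documents) {
            if (document.contains(term)) {
                hits.add(document);
            }
        }
        return hits;
    }
}

/** Picks an implementation by profile name, much as a Spring profile would. */
final class SearchServerFactory {

    private SearchServerFactory() {
    }

    static SearchServer forProfile(String profile, String endpointUrl) {
        if ("production".equals(profile)) {
            return new RemoteSearchServer(endpointUrl);
        }
        return new InProcessSearchServer(); // any other profile runs in-process
    }
}
```

Because both implementations share one type, the code under test never needs to know which one it was handed.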

After a fair bit of searching we stumbled over ZoomInfo’s excellent blog, in which they had shared their version of an embedded SolrServer that could easily be exposed as a Spring bean. They called the class InProcessSolrServer.

We would like to offer many thanks to ZoomInfo for sharing their great work, and this class provided us with many months of good service. However, with the latest releases of Solr (4.2+), ZoomInfo’s InProcessSolrServer will no longer compile due to an interface change within the Solr internals.

In the spirit of sharing the wealth I wanted to blog an update to the original ZoomInfo code, which addresses the interface change, and I’ve also included the Spring scaffolding in the gist below to give you an idea of how we run this code.

package uk.co.taidev.solrtesting.solr;

import com.google.common.base.Throwables;
import com.google.common.io.Files;
import com.iat.compassmassive.normalisedproductloader.springutils.SpringProfileName;
import org.apache.solr.client.solrj.SolrRequest;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.embedded.EmbeddedSolrServer;
import org.apache.solr.client.solrj.response.UpdateResponse;
import org.apache.solr.common.util.NamedList;
import org.apache.solr.core.CoreContainer;
import org.apache.solr.core.CoreDescriptor;
import org.apache.solr.core.SolrCore;
import org.apache.solr.core.SolrResourceLoader;
import org.apache.solr.schema.IndexSchema;
import org.apache.solr.search.SolrIndexSearcher;
import org.apache.solr.util.RefCounted;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.context.annotation.Profile;
import org.springframework.stereotype.Component;

import java.io.Closeable;
import java.io.File;
import java.io.IOException;
import java.util.Collection;

/**
 * SolrServer sub-class that manages the life-cycle of an in-process (embedded) Solr server.
 * <p/>
 * Modified from original source provided by ZoomInfo @ http://browse.feedreader.com/c/ZoomInfo_Blog/12021683
 * <p/>
 * Required dependencies: Spring 3.2.X, Solr 4.2.X+ (or 4.3.X), Guava 14+
 * <p/>
 * User: Daniel Bryant
 * Date: 01/07/13
 */
@Component
@Profile(SpringProfileName.DEVELOPMENT)
public class InProcessSolrServer extends SolrServer implements Closeable {

    //
    //------------------ static -------------------------
    //
    private static final Logger LOGGER = LoggerFactory.getLogger(InProcessSolrServer.class);
    private static final String DEFAULT_SOLR_HOME_DIR_PATH = "./src/test/resources/solr/";

    //
    //------------------ instance -------------------------
    //
    private File solrHomeDir = null;
    private File dataDir = null;
    private SolrServer delegate = null;
    private transient SolrCore core = null;

    //
    //------------------ constructor -------------------------
    //

    /**
     * Create an InProcessSolrServer using the default Solr Home Directory.
     */
    public InProcessSolrServer() {
        this(DEFAULT_SOLR_HOME_DIR_PATH);
    }

    /**
     * Create an InProcessSolrServer using the specified Solr Home Directory and a Solr Data Directory placed
     * beneath the system's temporary directory (as defined by the Guava method Files.createTempDir()).
     *
     * @param solrHomeDirPath path to Solr Root Directory
     */
    public InProcessSolrServer(String solrHomeDirPath) {
        try {
            System.setProperty("solr.solr.home", solrHomeDirPath);
            System.setProperty("solr.data.dir", Files.createTempDir().getAbsolutePath());
            CoreContainer.Initializer initializer = new CoreContainer.Initializer();
            CoreContainer coreContainer = initializer.initialize();
            delegate = new EmbeddedSolrServer(coreContainer, "");
        } catch (Exception e) {
            throw Throwables.propagate(e);
        }
    }

    //
    //------------------ public -------------------------
    //

    /**
     * This method passes all queries and indexing events on to an in-process delegate.
     *
     * @param req Solr Request
     * @return NamedList
     * @throws SolrServerException if an error occurs when processing the request
     * @throws IOException if an IOException occurs when processing the request
     */
    @Override
    public NamedList<Object> request(final SolrRequest req) throws SolrServerException, IOException {
        try {
            return getDelegate().request(req);
        } catch (final Exception e) {
            Throwables.propagateIfInstanceOf(e, SolrServerException.class);
            Throwables.propagateIfInstanceOf(e, IOException.class);
            throw Throwables.propagate(e);
        }
    }

    /**
     * Closes the Solr Core.
     */
    @Override
    public synchronized void close() {
        if (core != null) {
            core.close();
            core = null;
        }
    }

    /**
     * SolrIndexSearcher adds schema awareness and caching functionality over the Lucene IndexSearcher.
     * http://lucene.apache.org/solr/normalisedproductloader/org/apache/solr/search/SolrIndexSearcher.html
     *
     * @return RefCounted SolrIndexSearcher
     * @throws SolrServerException
     */
    public RefCounted<SolrIndexSearcher> getIndexSearcher() throws SolrServerException {
        getDelegate(); // force the delegate to be created
        return core.getSearcher();
    }

    /**
     * Returns the index schema used by this Solr server.
     *
     * @return delegate SolrServer primary core IndexSchema
     * @throws SolrServerException
     */
    public IndexSchema getIndexSchema() throws SolrServerException {
        getDelegate(); // force the delegate to be created
        return core.getSchema();
    }

    /**
     * Prepares this SolrServer for shutdown.
     */
    @Override
    public void shutdown() {
        LOGGER.debug("shutdown entry...");
        close();
        LOGGER.debug("...core closed...");
    }

    @Override
    public UpdateResponse addBeans(Collection<?> beans) throws SolrServerException, IOException {
        UpdateResponse updateResponse = super.addBeans(beans);
        super.commit();
        return updateResponse;
    }

    @Override
    public UpdateResponse deleteByQuery(String query) throws SolrServerException, IOException {
        UpdateResponse updateResponse = super.deleteByQuery(query);
        super.commit();
        return updateResponse;
    }

    //
    //------------------ protected -------------------------
    //
    @Override
    @SuppressWarnings("FinalizeDeclaration")
    protected void finalize() throws Throwable {
        close();
        super.finalize();
    }

    //
    //------------------ private -------------------------
    //

    /**
     * This method creates an in-process Solr server that otherwise behaves just as expected.
     */
    private synchronized SolrServer getDelegate() throws SolrServerException {
        if (delegate != null) {
            return delegate;
        }
        try {
            CoreContainer container = new CoreContainer(SolrResourceLoader.locateSolrHome());
            CoreDescriptor descriptor = new CoreDescriptor(container, "core1", solrHomeDir.getCanonicalPath());
            core = container.create(descriptor);
            container.register("core1", core, false);
            delegate = new EmbeddedSolrServer(container, "core1");
            return delegate;
        } catch (IOException ex) {
            throw new SolrServerException(ex);
        }
    }

    /**
     * Sets the Solr root directory. In Solr’s documentation, this is generally referred to as "/solr-root". The "conf"
     * directory (containing schema, stopwords, synonyms etc) will be a subdirectory of this.
     *
     * @param solrHomeDir Solr 'Home Directory'
     */
    private void setSolrHomeDir(final File solrHomeDir) {
        this.solrHomeDir = solrHomeDir;
        System.setProperty("solr.home", solrHomeDir.getPath());
        if (this.dataDir == null) {
            setDataDir(new File(solrHomeDir, "data"));
        }
    }

    /**
     * Sets the Solr data directory. This is the parent directory of the "index" and "spellchecker" directories.
     *
     * @param dataDir Solr 'Data Directory'
     */
    private void setDataDir(final File dataDir) {
        this.dataDir = dataDir;
        System.setProperty("solr.data.dir", dataDir.getPath());
    }
}
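
One detail of the class above that is easy to miss: addBeans and deleteByQuery commit immediately after each call. Solr only makes changes visible to searches once they have been committed, so committing on every write lets a test index a document and query for it on the very next line. The following is a stripped-down illustration of that idea in plain Java. BufferedIndex and AutoCommitIndex are hypothetical classes, not Solr types, and the ‘index’ is just a pair of lists.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, much-simplified index with commit semantics loosely modelled on Solr's. */
class BufferedIndex {

    private final List<String> pending = new ArrayList<String>();
    private final List<String> visible = new ArrayList<String>();

    /** Buffers a document; it is not searchable until commit() is called. */
    void add(String document) {
        pending.add(document);
    }

    /** Makes all buffered documents searchable. */
    void commit() {
        visible.addAll(pending);
        pending.clear();
    }

    /** Searches committed documents only. */
    List<String> search(String term) {
        List<String> hits = new ArrayList<String>();
        for (String document : visible) {
            if (document.contains(term)) {
                hits.add(document);
            }
        }
        return hits;
    }
}

/** Commits after every write, in the same spirit as the addBeans/deleteByQuery overrides. */
final class AutoCommitIndex extends BufferedIndex {

    @Override
    void add(String document) {
        super.add(document);
        super.commit(); // make the write visible straight away
    }
}
```

Committing on every write is convenient in tests, but it would be expensive against a production index, where commits are normally batched.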

I hope this helps, and if you have any questions then please feel free to comment or tweet 🙂

Spring Data – by Mark Pollack et al

Rating: 5 stars

TLDR: If you are working with Spring Data on a daily basis and want a complete and thorough overview of the framework then this book is all you will need. It covers all aspects of Spring Data without being overly verbose, and even if you have used Spring Data quite a lot already (like me), I still believe you’ll discover something useful in this book. You will also find bonus chapters, all in the context of Spring Data, on Spring Roo, the REST repo exporter (very cool!), ‘Big Data’ via Hadoop, Pig, Hive and Spring Batch/Integration, and also coverage of GemFire.

I’ve been working professionally with Spring Data for quite some time now, both for ‘old skool’ RDBMS and also a lot of NoSQL (primarily MongoDB and Redis). The company I was working for at the time the Spring Data projects were approaching release was something of an early adopter, and this, combined with the fact that their applications were firmly rooted in Spring, made the decision to use this framework an easy one. After some initial problems, which should be expected with a new technology (such as config issues and incompatibilities between libraries bundled in JARs etc.), Spring Data has provided a massive boost to productivity, and it is now my de facto choice when implementing persistence within Spring.

About the book itself: The first few chapters provide a great introduction to Spring Data, and describe the key motivations and techniques behind the framework. If you are simply modifying an already configured Spring Data app then this is all you need (but please do keep reading to learn more!). The next few chapters cover integration with an RDBMS, and also the popular NoSQL implementations – MongoDB, Neo4j and Redis. If you are working in one specific technology then reading the corresponding chapter will get you up and running quickly. Although Spring Data provides a common abstraction layer, it allows datastore-specific functionality to bleed through the interfaces (which is a good thing in my opinion, as it allows you to leverage specific features and strengths of the underlying technology), and this book will provide an excellent grounding and explanation of key concepts within each underlying datastore technology so that you can become productive quickly. Of course, you can also head over to the Spring Source website to learn the really advanced stuff (if you want to).

Part 4 of the book covers several interesting advanced features of the framework, such as using Spring Roo to auto-generate repository code, and also a brief guide on how to use the REST Repository Exporter. Metaprogramming and RAD tools like Spring Roo (and web frameworks such as Grails and Play) are becoming increasingly popular in the industry, and so this chapter is a nice addition to the book. The REST exporter is also a very cool feature, and essentially allows you to expose CRUD functionality on your repositories via a REST interface. For anyone building a SOA-based app (or using micro-services etc.), encapsulating datastores and exposing simple functionality via a well-defined HTTP-based API is very cool.

The final two parts of the book provide detailed coverage of using Spring Data to work with ‘Big Data’ through the use of Apache Hadoop, Spring Batch, Spring Integration and GemFire. Although this content wasn’t relevant to my initial decision to buy the book, the chapters are a complete bonus in my opinion, and upon reading them I was even happier with my purchase. The content provided is obviously quite high-level (as Big Data is a huge topic, no pun intended :)), but has enough detail to get you up and running with some Hadoop jobs, Hive, Pig etc., which is a great skill to add to your CV.

I chose this book over the only other real competition for Spring Data coverage, Petri Kainulainen’s Spring Data, purely because this book offered more content. Obviously the book under review has more pages, ~280 vs ~160, but more importantly it covers a greater number of topics, and Petri’s book focuses primarily on Redis (with which I was already familiar). My main motivation for buying a Spring Data book was to learn the ‘tips and tricks’, and I think either book would have met this need, but the coverage of other NoSQL technologies in the book under review, and the bonus chapters on Big Data technologies, swayed my final decision. Now that I’ve read the book I am very happy with the decision.

In summary: This book will be all you need to master Spring Data, from key concepts to advanced usage. You’ll learn all of the ‘tips and tricks’ along the way, and also become familiar with Spring Roo, the REST repo exporter and fundamental techniques within Spring Data’s ‘Big Data’ processing (Hadoop, Spring Batch/Integration etc.). I would recommend the book to any Spring developer, even one like myself who is happy learning about Spring from the excellent Spring Source website. This book is a little more ‘polished’ than the Spring Source docs, and also presents concepts in well-structured, bite-sized chunks of information.

Spring in Practice – by Willie Wheeler and Joshua White

Rating: 5 stars

TLDR: This is an excellent and comprehensive guide to advanced usage of the Spring framework. For anyone who is looking to build on knowledge gained from several years of Spring development in the trenches, this book will pay dividends. Although a Spring novice may be able to learn about Spring from this book, I would recommend picking up a copy of Spring in Action first, as the ‘in Practice’ books can be quite fast paced!

As a seasoned Java developer I have been working with the Spring framework for many years now. One of the first Spring books I read was Spring in Action, and in combination with Java Persistence with Hibernate it has helped me complete many successful projects (I seriously owe the authors a few beers!). From the grounding provided in these books, and in combination with the excellent Spring Source website, I have been able to explore and develop my skills as the Spring framework has expanded; for example, the Spring Data project is now my go-to framework for all things NoSQL related. However, I always enjoy learning from advanced Spring practitioners and reading stories about real-world use and abuse of the framework, and I had yet to find a good book that met this need until now. ‘Spring in Practice’ fills this gap in the market perfectly.

The book is ~500 pages, and it manages to cram in a lot of content. Advanced usage of all the main Spring components is covered, and covered well. The first nine chapters provide a great grounding and advanced look at topics such as data persistence (ORM), Spring MVC, Web Flow and Security. The remaining chapters deep-dive into topics such as Integration Testing and Enterprise Integration (REST, RabbitMQ and IMAP integration etc), and really focus on how to write good (high-quality) code for the common but difficult tasks.

As the title suggests, the book’s focus is very much about practical usage of Spring. It’s not quite in the ‘cookbook’ style you may have seen with other books, but IMHO, this book is better organised for general learning (i.e. reading the book from cover to cover). The obvious advantage with a cookbook style reference is that it’s easy to cherry-pick solutions to problems, but I find that cookbooks can be difficult to read through if you simply want to learn. ‘Spring in Practice’ is logically structured, the book is nicely paced for the advanced developer, and the discussions of real-world problems and the related code sample solutions seek to further your knowledge and encourage exploration of Spring.

As mentioned above, I have worked with Spring for several years, but this book has taught me a lot of new tricks. There’s nothing like finding a section of the book that leads to a ‘no way, Spring does that?’ moment 🙂 The authors clearly have their own style of developing in Spring, and I personally would choose to do some things differently (e.g. I code the production of XML/JSON differently), but I can’t argue that what they’ve done isn’t best practice, and with a framework as large and wide-scoped as Spring, there are bound to be many ways to do the same thing.

In summary, this is an excellent book, and one that should be on the bookshelf of any serious Spring developer. It will help deepen knowledge gained from ‘Spring in Action’, and also help to augment skills honed from time in the development trenches. I can almost guarantee that anyone who picks up a copy of this book, no matter how advanced they are, will learn something new. As you’ve no doubt guessed by now, I highly recommend this book, and I would like to offer my congratulations to the authors and Manning for writing a book which has long been needed by advanced Spring practitioners!