On a recent contract, the team had decided to use Solr to implement full-text search over a product catalogue for an e-commerce platform. Naturally we were approaching development with a TDD mindset, and were keen to implement both unit tests for core business functionality and integration tests for a more end-to-end style of testing. The primary application stack consisted of Spring (Core, Data, MVC), MySQL and Solr 4.

Just a slight aside: for anyone looking to implement full-text search, the primary candidates are Solr and ElasticSearch. I won’t discuss the merits of either implementation further, as it’s best to evaluate each with respect to your use cases (and here is an excellent resource to help you decide: http://solr-vs-elasticsearch.com/).

With our chosen frameworks and datastores we found the unit testing relatively straightforward, and decided to use JUnit (driven via the Maven surefire plugin), Mockito for mocking external dependencies (persistence layer, API calls etc.), and PowerMock for the difficult mocking (for example, mocking static method calls of several reliable-but-decidedly-old-skool dependencies).

Integration testing was also relatively easy to set up. We again chose to drive tests via JUnit (this time via the failsafe plugin), and used Spring’s @ContextConfiguration and AbstractTransactionalJUnit4SpringContextTests to manage injected sub-components (@Autowired etc.) and instantiate various parts of the application for testing. We also ran an embedded H2 database to allow realistic simulation of a SQL datastore (just an aside: in ~99% of ‘standard’ use cases I have found H2 to behave identically to MySQL, but there are a couple of corner cases to watch out for – this will be another blog post :)).
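
As a rough illustration of the failsafe side of this setup, the POM fragment below binds the plugin to the integration-test and verify goals. It is a sketch rather than our actual build file: the version is omitted on the assumption that it is managed elsewhere (for example in a pluginManagement section), and it relies on failsafe's default convention of picking up test classes named *IT.java.

```xml
<!-- Sketch only: goes inside <build><plugins> of the module's pom.xml -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-failsafe-plugin</artifactId>
  <!-- version assumed to be managed in pluginManagement -->
  <executions>
    <execution>
      <goals>
        <!-- runs the *IT.java classes after the package phase -->
        <goal>integration-test</goal>
        <!-- fails the build if any integration test failed -->
        <goal>verify</goal>
      </goals>
    </execution>
  </executions>
</plugin>
```

With something like this in place, mvn verify runs the unit tests through surefire first and the integration tests through failsafe afterwards.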

The Problem – How do we run an embedded Solr?

When we first started using Solr 4 we naturally wanted to create integration tests running against this datastore, and we wanted to run this in the same manner as we did with H2 – executing as a light-weight in-memory (embedded) process that we could create, pre-load, and destroy relatively quickly.

We soon found the EmbeddedSolrServer Class distributed within the Solr package, and although it was useful, it didn’t fit in exactly with the way we wanted to design and deploy the Solr communication layer within our Spring application. For production use we wanted to instantiate a SolrServer bean for which we supply the target endpoint on the network (under the hood, this SolrServer bean would actually be instantiated using a custom HttpSolrServer Class). We needed a way to create an ‘embedded’ version that implemented the SolrServer interface, but also allowed us to override the Solr config and data directory (to load pre-canned indexes etc.).

After a fair bit of searching we stumbled upon ZoomInfo’s excellent blog, in which they had shared their version of an embedded SolrServer that could easily be exposed as a Spring bean. They called the Class the InProcessSolrServer.

We would like to offer many thanks to ZoomInfo for sharing their great work, and this Class provided us with many months of good service. However, with the latest releases of Solr (4.2+) ZoomInfo’s InProcessSolrServer will no longer compile due to an interface change within the Solr internals.

In the spirit of sharing the wealth I wanted to blog an update to the original ZoomInfo code, which addresses the interface change, and I’ve also included the Spring scaffolding in the gist below to give you an idea of how we run this code.

I hope this helps, and if you have any questions then please feel free to comment or tweet 🙂

I’m currently working on a Java-based component which is utilising Solr heavily. We bundle the Solr core library dependency within a fat JAR (which we deploy standalone), and this dependency is managed via the de facto Maven approach.

The Problem

After a series of new features were added to this component by a member of the team, we suddenly noticed that the usual Solr logging to the console had stopped. At first we thought Solr had stopped working (which caused a little panic 🙂 ), but even though no logging was being displayed we could still access the web console, and everything appeared to be functioning correctly.

What most of the team didn’t realise was that, during the addition of the new features, one of the developers had also bumped the version of the Solr dependency from 4.2.0 to 4.3.0. The intentions were good – get the latest and greatest version – and the expectation is that a minor version number increase usually fixes bugs and adds a few pieces of new functionality.

However, this time around it was probably worth reading the release notes, as the Solr team have fundamentally altered their approach to logging in the 4.3.0 release (see http://wiki.apache.org/solr/SolrLogging).

The Solution

The fix for us was to include an appropriate log4j.properties config file within our component’s Maven resources directory. We were already including the slf4j and log4j dependencies within our component’s Maven POM, so we didn’t need to perform any additional steps to incorporate these into the deployment artifact (our fat JAR), as mentioned by the Solr team at http://wiki.apache.org/solr/SolrLogging.
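
For reference, a minimal log4j.properties along these lines should be enough to bring console logging back. Treat it as a sketch, not the exact file we shipped: the appender name CONSOLE, the INFO level and the output pattern are illustrative assumptions, and the file is assumed to live at src/main/resources/log4j.properties so that it ends up on the classpath of the fat JAR.

```properties
# Sketch only: level, appender name and pattern are assumptions.
# Send everything at INFO and above to the console appender.
log4j.rootLogger=INFO, CONSOLE

# Plain console appender with a timestamped, single-line layout.
log4j.appender.CONSOLE=org.apache.log4j.ConsoleAppender
log4j.appender.CONSOLE.layout=org.apache.log4j.PatternLayout
log4j.appender.CONSOLE.layout.ConversionPattern=%d{ISO8601} %-5p [%t] %c{1} - %m%n
```

If the logging is still silent with a file like this in place, it is worth checking that the file really is packaged at the root of the JAR, since that is where log4j looks for it by default.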