• Posted by Intent Media 18 Aug
  • 1 Comments

Improving Code Design by Replacing isA(…) Matchers in Tests

When I practice Test-Driven Development, I use the feedback from my tests to improve the design of the production code.  One particular source of feedback I find useful is the expectations and assertions used in the tests.  I’ve become particularly sensitive to the use of one particular expectation – the isA(…) matcher.  Whenever I see the isA(…) matcher used in an assertion, I’m usually suspicious that there might be a way to change the design that allows me to more precisely specify my assertion and reduce the coupling in my production code.

Here’s a brief introduction to matchers, the isA(…) matcher, and an example demonstrating how I used the feedback from a test to improve the design of the production code.

A Brief Introduction to Matchers

Test double libraries like JMock, Mockito, EasyMock, or RSpec Mocks allow you to make assertions about the arguments to a given method call using matchers.  For example, if we were building a mailing list application, you might find some code like this (using EasyMock):

Customer customer = aCustomer();  // A valid customer with some default configuration
Catalog catalog = createMock(Catalog.class);
expect(catalog.subscribe(same(customer), eq(Frequency.WEEKLY))).andReturn(...);

This test creates a test double for a Catalog object and expects the object under test to call subscribe with two arguments: the same customer instance listed above and a frequency equal to WEEKLY.

The isA(…) Matcher

There’s another type of matcher called isA(…) which takes a class as an argument.  For example:

Catalog catalog = createMock(Catalog.class);
expect(catalog.subscribe(isA(Customer.class), ...).andReturn(...);

In this case, you see that the Catalog test double expects to be called with its first argument to be something which is an instance of the Customer class. For example, this will make that expectation pass:

catalog.subscribe(new Customer(), ...);

Here the isA(…) matcher doesn’t tell me anything about the specifics of that parameter.  For example, I don’t know if there is any particular information about that specific Customer that is important – will any Customer do?  Are there specific properties that the Customer must have?  When I read the test and see the isA(…) matcher, it’s not clear to me.

I find the isA(…) matcher helpful if I’m interfacing with code that expects an anonymous inner class (such as Guava‘s Predicate) or if I’m interacting with some object whose interface I can’t change (often when using a third-party library, but I prefer to not mock objects I don’t own).  Sometimes I might have an object which dispatches sequences of commands (using the Command pattern) and I might use isA to test that the correct commands are executed.  But most of the time I am suspicious when I see this matcher.

An Example

Here’s an example of some code that I worked on where I noticed an isA(…) matcher and decided to do some more investigation.

In this story our analytics infrastructure team was working on adding a feature that would ask our analytics database to re-index its data after a batch job inserted new aggregated data.  We already had code that would insert the data and we were looking for a spot to add this new indexing behavior.  This code came from the BatchJobOutputTest which looked like this:

// In BatchJobOutputTest
  
@Test
public void testImportsOutputAndDeletesTemporaryFiles() throws Exception {
    File outputPart1 = aResultFile();
    File outputPart2 = aResultFile();
 
    ResultsStorage resultsStorage = storageIncluding(outputPart1, outputPart2);
 
    DatabaseImporter importer = createMock(DatabaseImporter.class);
    expect(importer.importData(
        isA(OutputCollector.class), ...))
        .andReturn(...);
 
    JobOutput subject = new BatchJobOutput(resultsStorage, importer);
    subject.importStepOutput(...);
 
    assertFilesDoNotExist(outputPart1, outputPart2);
}

I wasn’t clear on some of the details of this test, so I started by looking for what the OutputCollector was used for.  Looking through the test, I never saw the creation of OutputCollector at all!  This made me suspicious – I hope to try and find some clear connection between the expectations of a test and the setup/inputs.  There was also something going on with the ResultsStorage which required some File objects (outputPart1 and outputPart2).  I decided to look and see what the production code was doing:

public class BatchJobOutput implements JobOutput {
  
    public BatchJobOutput(JobStorage jobStorage, DatabaseImporter importer) { ... }
   
    @Override
    public void importStepOutput(...) {
        OutputCollector outputCollector = new OutputCollector();
        try {
            ...
            importer.importData(outputCollector, ...);
            ...
        }
        finally {
            for (File outputPart : outputCollector) {
                outputPart.delete();
            }
        }
    }
    ...
}

Notice that the OutputCollector is created and then passed as an argument into the DatabaseImporter.importData method, but I didn’t know what the importer does with that object.  The OutputCollector is used at the bottom of the method in that finally() block, but I was curious to know more.  I decided to look and see what the DatabaseImporter was doing with the OutputCollector:

public class DatabaseImporter {
    public DatabaseImporter(Transaction transaction, ...) { ... }
  
    public Result importData(OutputCollector outputCollector, ...) {
        TransactionCallback callback = callBackFor(
            outputCollector.getLocalCopies(),  // Here's where the OutputCollector is used
            ...
        );
        return transaction.execute(callback);
    }
}

The code calls getLocalCopies() on the OutputCollector and passes those results as arguments to callbackFor() – and that result is passed in to Transaction’s execute() method.  At this point in my examination I started to wonder if there might be a way to collapse these nested calls, but as a first step I wanted to see what exactly that OutputCollector was being used for.  It turns out getLocalCopies() returns a List of Files.  I remember seeing a list of files back up in the test setup of the ResultsStorage:

// Back in BatchJobOutputTest ...
  
@Test
public void testImportsOutputAndDeletesTemporaryFiles() throws Exception {
    File outputPart1 = aResultFile();        // These input files ...
    File outputPart2 = aResultFile();
    ResultsStorage resultsStorage = storageIncluding(outputPart1, outputPart2); // ... are referenced in ResultsStorage
 
    ...
 
    JobOutput subject = new BatchJobOutput(resultsStorage, importer);
    subject.importStepOutput(...);
 
    assertFilesDoNotExist(outputPart1, outputPart2);  // .. and later checked that they get deleted
}

Ah!  It turns out the DatabaseImporter wants a list of files to import!  I decided to pass that list of files directly to the DatabaseImporter rather than giving it the OutputCollector and requiring it to know to ask for the local file copies.

Here’s the DatabaseImporter now accepting a list of files rather than an OutputCollector:

public class DatabaseImporter {
    public DatabaseImporter(Transaction transaction, ...) { ... }
  
    public Result importData(List filesToImport, ....) {  // CHANGED was OutputCollector
        TransactionCallback callback = callBackFor(
            filesToImport,                                      // CHANGED from outputCollector
            ...
        );
        return transaction.execute(callback);
    }
}

Now BatchJobOutput must pass the list of Files obtained from getLocalCopies() instead of passing the OutputCollector:

// in BatchJobOutput
  
@Override
public void importStepOutput(...) {
    OutputCollector outputCollector = new OutputCollector();
    try {
        ...
        importer.importData(outputCollector.getLocalCopies(), ...);  // CHANGED from outputCollector
        ...
        }
        finally {
            for (File outputPart : outputCollector) {
                outputPart.delete()
            }
        }
    }
}

And here’s the updated test for BatchJobOutput:

// In BatchJobOutputTest
@Test
public void testImportsOutputAndDeletesTemporaryFiles() throws Exception {
    File outputPart1 = aResultFile();
    File outputPart2 = aResultFile();
 
    ResultsStorage resultsStorage = storageIncluding(outputPart1, outputPart2);
 
    DatabaseImporter importer = createMock(DatabaseImporter.class);
    expect(importer.importData(
        eq(asList(outputPart1, outputPart2)), ...)) // CHANGED was isA(OutputCollector.class)
        .andReturn(...);
 
    JobOutput subject = new BatchJobOutput(resultsStorage, importer);
    subject.importStepOutput(...);
 
    assertFilesDoNotExist(outputPart1, outputPart2);
}

Now the test shows that the ResultsStorage create some files (outputPart1 & outputPart2), expects them to be handed to the DatabaseImporter for importing, and after that they should not exist any more.

There’s still more we can do to improve the clarity of this code, but I feel that by passing the files to import rather than the OutputCollector I have a better understanding of the intent of this code.  I can now see that it does what the test name suggests – it should import batch output (files) and delete them when done.

So what?

By giving the DatabaseImporter the list of files to import rather than the OutputCollector, the DatabaseImporter knows less about its surrounding context.  Rather than knowing about OutputCollector (which collects information about batch data results), DatabaseImporter knows about a List of Files.  If the OutputCollector changes, the DatabaseImporter no longer has a reference to it and would not need to change.  In this case I prefer this since I’m passing the values the DatabaseImporter cares about rather than making the DatabaseImporter know that it can get a list of files from the OutputCollector.  As an added bonus, the use of the OutputCollector is now a private implementation detail that BatchJobOutput hides from its collaborators.  We could change how this code works and as long as it preserves the same functionality, the tests for BatchJobOutput would not need to change.

I might prefer passing in the OutputCollector in if there were multiple different clients that wanted to get information from that object and I preferred to keep that interface stable.  This would mean clients could choose the information that they needed from the OutputCollector.  In this case, there’s just one client (DatabaseImporter) so I preferred to pass exactly what the importer needed.

For More Information

For more on this subject, I highly recommend J. B. Rainsberger’s post “Beyond Mock Objects” which shows another example of making dependencies explicit and reducing an object’s knowledge of its surrounding context.  This section of his post stood out for me:

When we build modules (functions, objects, whatever) with less knowledge of their surroundings, we naturally build modules more fit for reuse. No more excuses. If you want more reuse, you have to make it happen, and building modules with context independence in mind achieves that aim. We got there by noticing indirection without abstraction, then inverting the dependency (which happened to remove a mock object), then spotting duplication, then extracting that duplication into a context-independent, reusable module.

Write code more independent of its context. I find this code easier to test, easier to understand, easier to maintain, more stable, and easier to reuse.

I couldn’t agree more.


Daniel Wellman is a software engineering practice lead at Intent Media where he helps teams build great software.  He has written and spoken about software development practices to help people build reliable, flexible code and work with ease.  You can find him on Twitter and GitHub.

Post Comments 1

Posted by J. B. Rainsberger on
  • Aug 18 2015
 
When I saw isA(OutputCollector.class), I got the picture of someone thinking, "I'm not sure what to check here, but I'd better check _something_. What can I check?" Even if you don't check "the right thing" or "something good", please check _something_. Eventually, someone will make an observation like the one in this article and improve the design. If you don't leave the breadcrumbs, then no-one will find you. I really like this example of finding subtle/implicit/hidden duplication (between OutputCollector.class and files in storageIncluding(files)), because it so clearly shows a kind of duplication that text processing doesn't seem likely to find.