• Posted by Intent Media 27 Jan

Random Assumptions can make an A– Part 1

Overview

Sometimes it's good to take an in-depth look at the mistakes we make. This is an examination of an issue we were addressing, and of how a couple of assumptions we made introduced a new and devious bug.

It ain’t pretty but looking back is how we get better. And hopefully you can read this and learn something without making the same mistake.

We split this into two posts. The first post covers the original problem we encountered and our attempt to solve it. The second post will cover the bug we introduced, and our final resolution.

Background on Multivariate Testing

We here at Intent Media believe in testing and data. We have a whole system dedicated to testing the effectiveness of our ads, the Multivariate Test System. This system allows us to test multiple attributes of our ads at the same time. It’s used throughout our platform and has proven to be a very effective tool for improving how our ads perform.

For example, let’s say we decide to test a new design, Design Awesome, against our current design, Design Sweet, with a 50/50 split. We also want to test a new Ad Copy, “Look at me!”, vs “I am a sweet ad copy”, also at a 50/50 split. Our system allows us to look at each attribute not only independently but also in combination (how did Design Awesome with “Look at me!” do vs Design Sweet with “I am a sweet ad copy”?).

I won’t go into too many details about Multivariate Testing, but if you want to read more about it, check out http://en.wikipedia.org/wiki/Multivariate_testing.

The original setup

Our old implementation worked by building a ‘splat’ table. Using our above test as an example, the splat table would contain four entries:

Design, Ad Copy, Bottom Range, Top Range
———————————————————————————
Design Awesome, “Look at me!”, 0.0, 0.25 
Design Awesome, “I am a sweet ad copy”, 0.25, 0.5 
Design Sweet, “Look at me!”, 0.5, 0.75 
Design Sweet, “I am a sweet ad copy”, 0.75, 1.0

When a user visits one of our publishers’ sites we assign them a single random number or ‘dice roll’. This number determines the values of all the attributes the user will see while looking at our ads: we find the row whose range contains the dice roll and serve that combination.
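To make the mechanics concrete, here is a minimal sketch of how such a table could be built and queried. It is not our production code: the class `SplatTable`, its `Row` type, and the methods `build`, `lookup` and `example` are names made up for this illustration, and each attribute is assumed to be a simple map from value to probability.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class SplatTable {
    /** One row: a combination of attribute values and the dice-roll range [bottom, top) it owns. */
    static final class Row {
        final List<String> values;
        final double bottom;
        final double top;

        Row(List<String> values, double bottom, double top) {
            this.values = values;
            this.bottom = bottom;
            this.top = top;
        }
    }

    /** Builds the table from attributes, each given as value -> probability (hypothetical shape). */
    static List<Row> build(List<Map<String, Double>> attributes) {
        // Start with one empty combination of weight 1, then cross it with each attribute.
        List<List<String>> combos = new ArrayList<>();
        combos.add(new ArrayList<>());
        List<Double> weights = new ArrayList<>();
        weights.add(1.0);
        for (Map<String, Double> attribute : attributes) {
            List<List<String>> nextCombos = new ArrayList<>();
            List<Double> nextWeights = new ArrayList<>();
            for (int i = 0; i < combos.size(); i++) {
                for (Map.Entry<String, Double> value : attribute.entrySet()) {
                    List<String> combo = new ArrayList<>(combos.get(i));
                    combo.add(value.getKey());
                    nextCombos.add(combo);
                    nextWeights.add(weights.get(i) * value.getValue());
                }
            }
            combos = nextCombos;
            weights = nextWeights;
        }
        // Lay the combinations end to end along [0, 1).
        List<Row> rows = new ArrayList<>();
        double bottom = 0.0;
        for (int i = 0; i < combos.size(); i++) {
            double top = bottom + weights.get(i);
            rows.add(new Row(combos.get(i), bottom, top));
            bottom = top;
        }
        return rows;
    }

    /** Returns the combination whose range contains the dice roll. */
    static List<String> lookup(List<Row> rows, double diceRoll) {
        for (Row row : rows) {
            if (diceRoll < row.top) {
                return row.values;
            }
        }
        // Guard against floating-point rounding in the final top value.
        return rows.get(rows.size() - 1).values;
    }

    /** The two-attribute example from this post, both at 50/50. */
    static List<Row> example() {
        Map<String, Double> design = new LinkedHashMap<>();
        design.put("Design Awesome", 0.5);
        design.put("Design Sweet", 0.5);
        Map<String, Double> adCopy = new LinkedHashMap<>();
        adCopy.put("Look at me!", 0.5);
        adCopy.put("I am a sweet ad copy", 0.5);
        return build(Arrays.asList(design, adCopy));
    }

    public static void main(String[] args) {
        List<Row> rows = example();
        System.out.println(rows.size());         // 4
        System.out.println(lookup(rows, 0.40));  // [Design Awesome, I am a sweet ad copy]
    }
}
```

Note that every attribute is crossed with every other one, so the whole table has to be regenerated whenever any single probability changes.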

But the dice must also be sticky. If the random number changes every time the user visits the site, the ad could potentially change color and design on a page refresh, and that would break the validity of our tests. We don’t want to use a cookie to record the random number because that would be a third-party cookie, which many users and browsers block, and we don’t want to rely on them.

To do this we use a user_id assigned by the publisher (not personally identifiable), add a salt string to the user_id, and use the hash code to seed the Random class to get a repeatable random number:


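import java.util.Random;
// Salt the publisher-assigned user id, then hash it to seed a repeatable generator.
int seed = String.format("%s some salt", userId).hashCode();
double diceRoll = new Random(seed).nextDouble();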

This was a quick way to generate a sticky random number, and the distribution of the numbers was quite good. This implementation worked well for us for quite a while. But over time we discovered two bugs with it.
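The stickiness comes from two facts about the Java standard library: String.hashCode() is computed the same way on every run, and a Random built from a given seed always produces the same sequence. A small sketch to check that, where the class name `StickyDice` and the user ids are made up for illustration:

```java
import java.util.Random;

class StickyDice {
    /** Same scheme as the snippet above: salt the user id, hash it, seed Random. */
    static double roll(String userId) {
        int seed = String.format("%s some salt", userId).hashCode();
        return new Random(seed).nextDouble();
    }

    public static void main(String[] args) {
        // "user-123" and "user-456" are hypothetical ids.
        double firstVisit = roll("user-123");
        double secondVisit = roll("user-123");
        System.out.println(firstVisit == secondVisit); // true: same user, same roll
        System.out.println(roll("user-456"));          // a different user gets their own roll
    }
}
```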

The original bugs

1. The splat table is really slow.

As we started testing more and more attributes, the splat table started getting really large, because it needs one row for every combination of values. In our above example we have only two attributes, each with two possible values, which leads to a splat table of 4 rows. For one of our current partners we are testing 13 separate attributes with 2-6 possible values for each attribute. The splat table for this partner would be 7,464,960 rows. Yikes!
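The row count is simply the product of the number of values per attribute, which is why it explodes so quickly. A minimal sketch of the arithmetic follows; the class `SplatTableSize` is a made-up name, and the 13-attribute breakdown in it is purely hypothetical (the real per-attribute counts are not listed here), chosen only because it stays within 2-6 values per attribute and happens to multiply out to the figure above.

```java
class SplatTableSize {
    /** Rows in a splat table: the product of the number of values of each attribute. */
    static long rowCount(int... valuesPerAttribute) {
        long rows = 1;
        for (int count : valuesPerAttribute) {
            rows = Math.multiplyExact(rows, (long) count); // throws on overflow
        }
        return rows;
    }

    public static void main(String[] args) {
        // The example from this post: two attributes with two values each.
        System.out.println(rowCount(2, 2)); // 4

        // Hypothetical mix of 13 attributes: six with 3 values, one with 5,
        // five with 4, and one with 2.
        System.out.println(rowCount(3, 3, 3, 3, 3, 3, 5, 4, 4, 4, 4, 4, 2)); // 7464960
    }
}
```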

While the dice roll lookup to get the attributes is relatively quick, building the table is really, really slow. We tried lots of tricks to speed up generation, but it was still taking a lot of time and effort. In addition, if we made one change to any of those attributes we had to rebuild the whole table (at one point a rebuild locked up our whole database for over an hour). That made updating any of these tests painful.

2. Moving fenceposts

Occasionally we fiddle with the percentages of our tests. This will shift some users from one value to another on the attribute we changed. This is acceptable for us because we are shifting the tests.

But because of the nature of splat tables, it can also shift users on multiple attributes not just one. Consider our original example splat table:

Design, Ad Copy, Bottom Range, Top Range 
———————————————————————————
Design Awesome, “Look at me!”, 0.0, 0.25 
Design Awesome, “I am a sweet ad copy”, 0.25, 0.5 
Design Sweet, “Look at me!”, 0.5, 0.75 
Design Sweet, “I am a sweet ad copy”, 0.75, 1.0

Now instead of doing a 50/50 split for the design, let’s do a 75/25 split in favor of Design Sweet. That would generate the following table:

Design, Ad Copy, Bottom Range, Top Range 
———————————————————————————
Design Awesome, “Look at me!”, 0.0, 0.125 
Design Awesome, “I am a sweet ad copy”, 0.125, 0.25 
Design Sweet, “Look at me!”, 0.25, 0.625 
Design Sweet, “I am a sweet ad copy”, 0.625, 1.0

Now imagine you are a user and your sticky random dice roll is 0.40. In the original table you would get Design Awesome with “I am a sweet ad copy”, but now you get Design Sweet with “Look at me!”. So even though we only changed the probability for one attribute, we shifted some users on multiple attributes. Not good!
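To get a feel for how many users are affected, the sketch below hard-codes the two tables above and compares them over evenly spaced dice rolls. The class and method names (`MovingFenceposts`, `rowFor`, `adCopyChanges`) are made up for this illustration, and the evenly spaced rolls stand in for a uniform spread of real users.

```java
class MovingFenceposts {
    // Top of each row's range, in row order, for the two tables above.
    static final double[] ORIGINAL_TOPS = {0.25, 0.5, 0.75, 1.0};
    static final double[] SHIFTED_TOPS = {0.125, 0.25, 0.625, 1.0};
    // Both tables list the same combinations in the same order.
    static final String[] DESIGN = {
        "Design Awesome", "Design Awesome", "Design Sweet", "Design Sweet"};
    static final String[] AD_COPY = {
        "Look at me!", "I am a sweet ad copy", "Look at me!", "I am a sweet ad copy"};

    /** Index of the row whose range contains the dice roll. */
    static int rowFor(double[] tops, double diceRoll) {
        for (int i = 0; i < tops.length; i++) {
            if (diceRoll < tops[i]) {
                return i;
            }
        }
        return tops.length - 1;
    }

    /** Out of `samples` evenly spaced dice rolls, how many end up with a different ad copy. */
    static int adCopyChanges(int samples) {
        int changed = 0;
        for (int i = 0; i < samples; i++) {
            double roll = (i + 0.5) / samples;
            String before = AD_COPY[rowFor(ORIGINAL_TOPS, roll)];
            String after = AD_COPY[rowFor(SHIFTED_TOPS, roll)];
            if (!before.equals(after)) {
                changed++;
            }
        }
        return changed;
    }

    public static void main(String[] args) {
        int before = rowFor(ORIGINAL_TOPS, 0.40);
        int after = rowFor(SHIFTED_TOPS, 0.40);
        System.out.println(DESIGN[before] + " / " + AD_COPY[before]); // Design Awesome / I am a sweet ad copy
        System.out.println(DESIGN[after] + " / " + AD_COPY[after]);   // Design Sweet / Look at me!
        System.out.println(adCopyChanges(1000));                      // 500
    }
}
```

In this example half of the dice-roll range lands on a different ad copy, even though the ad copy split itself was never touched.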

Enter the conquering/bumbling heroes…

Given all this, we set out to try to solve both issues. There were a number of strategies we could have employed to solve the first problem, but the second problem just seemed intractable. We just could not figure out a way to structure the splat table to get around this moving fencepost issue. So we decided to abandon the splat table and do a ‘dice roll’ for each attribute.

But this created another problem. How do we generate 17 different sticky random numbers? Looking at how we used to generate random numbers, we decided to modify our random number generator to include both the user id and the id of the attribute:


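// Seed from both the user id and the attribute id, so each attribute gets its own sticky roll.
int seed = String.format("user %s attribute_id %s", userId, attribute.getId().toString()).hashCode();
double diceRoll = new Random(seed).nextDouble();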

This would generate a unique repeatable seed for each user and attribute. We tested this with a single attribute and got a good distribution, and surprisingly it was faster than our older implementation. All is well, right?
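Here is a rough sketch of how a per-attribute roll could then pick a value from that attribute's own split. It mirrors the seeding scheme above as described, bug and all, so treat it as an illustration of the idea rather than a recommended implementation. The names `PerAttributeDice`, `pick` and `split`, the numeric attribute ids, and the user id are all made up.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Random;

class PerAttributeDice {
    /** One sticky roll per (user, attribute) pair, seeded as in the snippet above. */
    static double roll(String userId, long attributeId) {
        int seed = String.format("user %s attribute_id %s", userId, attributeId).hashCode();
        return new Random(seed).nextDouble();
    }

    /** Walks one attribute's values (value -> probability) and returns the one the roll falls in. */
    static String pick(Map<String, Double> values, double diceRoll) {
        double top = 0.0;
        String last = null;
        for (Map.Entry<String, Double> entry : values.entrySet()) {
            top += entry.getValue();
            last = entry.getKey();
            if (diceRoll < top) {
                return last;
            }
        }
        return last; // guards against rounding in the final cumulative sum
    }

    /** Convenience helper for a two-value attribute. */
    static Map<String, Double> split(String a, double pa, String b, double pb) {
        Map<String, Double> values = new LinkedHashMap<>();
        values.put(a, pa);
        values.put(b, pb);
        return values;
    }

    public static void main(String[] args) {
        // Hypothetical ids: attribute 1 is the design, attribute 2 is the ad copy.
        String userId = "user-123";
        Map<String, Double> adCopy = split("Look at me!", 0.5, "I am a sweet ad copy", 0.5);

        // Changing the design split only changes the ranges used for the design roll.
        Map<String, Double> designBefore = split("Design Awesome", 0.5, "Design Sweet", 0.5);
        Map<String, Double> designAfter = split("Design Awesome", 0.25, "Design Sweet", 0.75);

        System.out.println(pick(designBefore, roll(userId, 1)));
        System.out.println(pick(designAfter, roll(userId, 1)));
        // The ad copy lookup never sees the design split, so it cannot move.
        System.out.println(pick(adCopy, roll(userId, 2)));
    }
}
```

Because each attribute is looked up against its own ranges, there is no table to rebuild, and a fencepost moved for one attribute cannot shift users on another.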

Next time, the fall

There was actually a very large bug we introduced with this change. We’ll detail it in the next blog post, but see if you can guess what mistakes we made in this fix.

Orion Delwaterman
Software Engineer
