- Posted by Intent Media 14 Oct
I am a graduate student at Columbia University studying Operations Research (OR). I came to Intent Media (IM) as a data scientist intern this summer with excitement as well as a bit of nervousness. I have great interest in data science, and I wanted to see what it would be like to work in the area; however, I was anxious about whether I would be able to contribute well. Also, I had never worked at a startup before, so I was unsure what lay ahead of me. Plus, I am a little scared of dogs, so it did not help that I saw Ben as soon as I walked into the office.
I was given a Macintosh laptop, something I had never used before (I mean c’mon, Apple is a fruit!), and was asked to set up everything myself. I admit, a little sheepishly, that I have never been good at figuring out where wires must go. Yes, I have an engineering background, but as a software engineer I only learned how to efficiently fix bugs I had created myself. The enormous amount of technical jargon flying around also got me very worried. However, I am pleased to say, paraphrasing Linkin Park, who informed us at 1000 decibels and 600 words per minute: in the end, it didn’t matter.
I had a great summer at IM. My favourite part of IM was the buttered muffins that were always available. Second to that, I liked that IM has retained its startup flavour despite being reasonably well-sized. The processes are lean and the environment, in general, is very vibrant. Having said that, IM also has the steadiness of a well-established company. A very organized tech stack and strong technical infrastructure are its strengths. There is a very open culture that allows people to express themselves and also be heard. It is really inspiring to meet and work with people who are very passionate about what they build together.
During my internship, I worked on the problem of budget smoothing for active advertisers in the online travel space. Let me give you a bit of context before I tell you more about the problem. The web has become a major market for advertisements. Online advertisements drive the revenue of search giants such as Google, Yahoo and Bing. For every word or query that is looked up on a search engine, a finite set of advertisements is displayed. Where do these ads come from? There is a set of advertisers who wish to advertise their products. They begin by registering their products and providing their assessment of which categories each product belongs to. In this model, the people who display these advertisements are called publishers. Advertisers hope that publishers will display their ads whenever the publishers display content in the categories that the advertisers believe are relevant to their ads. There are also auctioneers, who act as intermediaries connecting advertisers and publishers for each query. Advertisers bid the price they are willing to pay when a web user clicks on their advertisement. Here is a video by Google’s chief economist that explains auction pricing if you’re keen to find out more.
Each advertiser runs several campaigns in order to maximize their presence on the web. Each campaign consists of a set of queries that the advertiser wants to bid on. All advertisers set a budget cap on the amount of money they want to spend on each campaign. Some advertisers can afford to spend a lot of money on online marketing, so they sustain their online presence throughout the day. A few other advertisers have limited budgets and tend to exhaust them before the end of the day. The natural question that arises here is: how can they spend their budget wisely throughout the day?
To pull you out of the stage 4 sleep I just put you in, let me state this problem in simpler terms. If you had $100 to spend in a day, and you kept getting limited-time (say < 5 min) offers, how would you know which offers to spend it on? One way would be to keep spending on every available opportunity until you exhaust your budget. In algorithmic parlance, we call this a ‘greedy’ approach, and greedy approaches are seldom optimal. After all, why not wait until a better opportunity comes along? However, if you waited too long, you might find that you had lost the best opportunities from earlier in the day. How then do you solve this problem in a holistic and principled manner?
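To make the trade-off concrete, here is a minimal Python sketch. It is an illustration only, not the approach used at IM or in any of the papers: the function names, the flat $10 offers and the simple "release the budget evenly over 24 hours" rule are all assumptions made up for this example.

```python
def greedy_spend(offers, budget):
    """Accept every affordable offer in arrival order until the budget runs out."""
    spent, accepted = 0.0, []
    for hour, cost in offers:
        if spent + cost <= budget:
            spent += cost
            accepted.append(hour)
    return accepted


def paced_spend(offers, budget, hours_in_day=24):
    """Accept an offer only if cumulative spend stays within the share of the
    budget released so far (assumed rule: budget * fraction of day elapsed)."""
    spent, accepted = 0.0, []
    for hour, cost in offers:
        allowance = budget * (hour + 1) / hours_in_day
        if spent + cost <= allowance:
            spent += cost
            accepted.append(hour)
    return accepted


# Hypothetical day: one $10 offer arrives every hour, and we have $100.
offers = [(hour, 10.0) for hour in range(24)]
greedy_hours = greedy_spend(offers, 100.0)
paced_hours = paced_spend(offers, 100.0)

print("greedy buys in hours:", greedy_hours)
print("paced buys in hours: ", paced_hours)
```

Both strategies buy the same number of offers here, but the greedy one is out of money after the tenth hour, while the paced one is still participating in the last hour of the day. Neither rule looks at how good an offer is, which is exactly the part a principled model has to add.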
I worked under the ‘super’vision of Sharath Rao, a Data Scientist at Intent Media. Based on the problem statement, a few of the questions that I started with were:
- What is the best strategy to spend budgets?
- Is a greedy approach optimal?
- Should we spend in proportion to traffic/size of opportunity, or in proportion to campaign performance?
- Is there a way to smooth the budget to ensure participation in auctions throughout the day?
- Is it possible to solve this problem in a dynamic environment with game theoretic constraints? If yes, is it practically feasible?
- Something may maximize auctioneer revenue but would that be fair to advertisers?
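As a rough illustration of the "spend in proportion to traffic" idea from the list above, the sketch below splits a daily budget across time slots according to a traffic forecast. The function name and the traffic numbers are hypothetical, and this is a toy allocation rule rather than any model from the literature.

```python
def allocate_by_traffic(budget, forecast_traffic):
    """Split a budget across time slots in proportion to forecast traffic."""
    total = sum(forecast_traffic)
    if total == 0:
        # No traffic expected anywhere: nothing sensible to allocate.
        return [0.0 for _ in forecast_traffic]
    return [budget * traffic / total for traffic in forecast_traffic]


# Hypothetical forecast: quiet morning, busier afternoon, busy evening.
forecast = [100, 300, 600]
allocation = allocate_by_traffic(100.0, forecast)
print(allocation)
```

Allocating by campaign performance instead would only change the weights, for example by replacing the traffic forecast with expected conversions per slot.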
I began by reading a few research papers on this topic. A lot of research in this field has been undertaken by software giants and by DSP/Real Time Bidding (RTB) platforms. The papers that I explored as part of my research were by Google, Microsoft, Turn and Yahoo. My job involved understanding the models presented in these papers and devising ad-hoc experiments to test the feasibility of such approaches. Specifically, the questions that I attempted to answer during my internship were:
- Do these models work well in our context?
- How efficient are these models? Do they scale well?
- Do we need to make any changes to the models?
- What will IM have to do to be able to provide the necessary data for a model like this? Specifically, what kind of data collection infrastructure is required for this?
- Would the model actually help budget-constrained advertisers? Clearly there is a conflict of interest between the advertisers and the auctioneers (discussed well in the Yahoo paper) that needs to be handled.
- Is there a best strategy that would benefit everyone?
On the whole, my internship was a very fulfilling experience. I got a lot of exposure to work that involves big data and predictive modelling. I came here with a genuine curiosity and an eagerness to learn how the industry employs principled mathematical analysis to drive business growth in dynamic environments such as the internet. I take back to school a lot of insights and learning from this experience. To me, the phrase “mathematical methods in decision making” perfectly sums up the field of Operations Research (OR). While the description is quite broad, it draws our attention to the important fact that, as opposed to qualitative analysis, OR deals with quantitative methods that are used in business and policy decisions. Data science/analysis, of the kind that I did at IM, clearly falls under this definition. It was apparent to me that concepts such as parameter and density estimation, linear and quadratic optimization, stochastic modeling and Bayes’ theorem have found applicability in the real world only because of the vastness of data available to us, and that data is a very potent resource for any kind of decision making. I also enjoyed being able to place academic knowledge in context. There is a very specific process by which academic research permeates engineering practice, and I was very lucky to have witnessed it.
I would like to take this opportunity to encourage everyone out there (YES! EVERYONE!) to be bold and pursue whatever interests you the most. Life is too short to spend on things that allow you to afford seat warmers in your car but don’t actually interest you. At graduate school, I have noticed that many of my female peers who want to pursue careers in technology are too intimidated to dive into fields with gender imbalance. Gender imbalance won’t be righted until women stop believing that they can’t. There are plenty of women who are extraordinarily accomplished professionals in highly technical fields. Case in point: my co-worker Betsy transitioned from ‘Business Analyst’ to ‘QA engineer’ and is now heading to dev boot camp and coming back as a software developer!
I am continuing to work at IM over the fall, exploring data analytics and learning lots more. Also, my fear of dogs has reduced greatly. I can now pet Ben without flinching 🙂
If you enjoyed my post, you know what to do: come work for Intent Media!
I am a graduate student at Columbia University studying Operations Research. I worked as a software engineer at Oracle before I began my soul-searching sojourn! I dabbled in a lot of things before I fell in love with data science. I interned at Intent Media over the summer of 2014 and am continuing to intern during the fall of 2014. You can connect with me on LinkedIn.