Curiosity-Driven Data Science - by Eric Colson @ HBR



Eric Colson - HBR

Data science can enable wholly new and innovative capabilities that can completely differentiate a company. But those innovative capabilities aren’t so much designed or envisioned as they are discovered and revealed through curiosity-driven tinkering by the data scientists. So, before you jump on the data science bandwagon, think less about how data science will support and execute your plans and think more about how to create an environment to empower your data scientists to come up with things you never dreamed of.

First, some context. I am the Chief Algorithms Officer at Stitch Fix, an online personalized styling service with 2.7 million clients in the U.S. and plans to enter the U.K. next year. The novelty of our service affords us exclusive and unprecedented data with nearly ideal conditions to learn from it. We have more than 100 data scientists that power algorithmic capabilities used throughout the company. We have algorithms for recommender systems, merchandise buying, inventory management, relationship management, logistics, operations — we even have algorithms for designing clothes! Each provides material and measurable returns, enabling us to better serve our clients, while providing a protective barrier against competition. Yet, virtually none of these capabilities were asked for by executives, product managers, or domain experts — and not even by a data science manager (and certainly not by me). Instead, they were born out of curiosity and extracurricular tinkering by data scientists.

Data scientists are a curious bunch, especially the good ones. They work towards clear goals, and they are focused on and accountable for achieving certain performance metrics. But they are also easily distracted, in a good way. In the course of doing their work they stumble on various patterns, phenomena, and anomalies that are unearthed during their data sleuthing. This goads the data scientist’s curiosity: “Is there a better way that we can characterize a client’s style?” “If we modeled clothing fit as a distance measure could we improve client feedback?” “Can successful features from existing styles be re-combined to create better ones?” To answer these questions, the data scientist turns to the historical data and starts tinkering. They don’t ask permission. In some cases, explanations can be found quickly, in only a few hours or so. Other times, it takes longer because each answer evokes new questions and hypotheses, leading to more testing and learning.

Are they wasting their time? No. Not only does data science enable rapid exploration, it’s also easier to measure the value of that exploration than it is in other domains. Statistical measures like AUC, RMSE, and R-squared quantify the amount of predictive power the data scientist’s exploration is adding. The combination of these measures and a knowledge of the business context allows the data scientist to assess the viability and potential impact of a solution that leverages their new insights. If there is no “there” there, they stop. But when there is compelling evidence and big potential, the data scientist moves on to more rigorous methods like randomized controlled trials or A/B testing, which can provide evidence of causal impact. They want to see how their new algorithm performs in real life, so they expose it to a small sample of clients in an experiment. They’re already confident it will improve the client experience and business metrics, but they need to know by how much. If the experiment yields a big enough gain, they’ll roll it out to all clients. In some cases, it may require additional work to build a robust capability around the new insights. This will almost surely go beyond what can be considered “side work” and they’ll need to collaborate with others for engineering and process changes.
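To make the measures named above concrete, here is a minimal Python sketch of how RMSE, R-squared, AUC, and the lift from a simple A/B test might be computed. The function names, the pooled two-proportion z-test, and all sample numbers are illustrative assumptions, not code or data from Stitch Fix.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error: the typical size of a prediction miss."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(y_true, y_pred):
    """Share of the variance in y_true that the predictions explain."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def auc(labels, scores):
    """Chance that a random positive outscores a random negative (ties count half)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def ab_lift(conv_a, n_a, conv_b, n_b):
    """Pooled two-proportion z-test; returns (absolute lift, z statistic)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return p_b - p_a, (p_b - p_a) / se


# Made-up numbers, for illustration only.
actual = [3.0, 5.0, 7.0, 9.0]
predicted = [2.0, 5.0, 8.0, 9.0]
print(rmse(actual, predicted), r_squared(actual, predicted))

print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))

# Control: 100 of 1,000 clients convert; new algorithm: 130 of 1,000.
lift, z = ab_lift(100, 1000, 130, 1000)
print(lift, z)
```

The first three functions score an idea against historical data, which is the cheap exploratory step. The last one answers the "by how much" question once the idea has been exposed to a sample of real clients. A z statistic above roughly 1.96 is the conventional threshold for significance at the 5% level in a two-sided test.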

The key here is that no one asked the data scientist to come up with these innovations. They saw an unexplained phenomenon, had a hunch, and started tinkering. They didn’t have to ask permission to explore because it’s relatively cheap to allow them to do so. Had they asked permission, managers and stakeholders probably would have said ‘no’.

These two things, low-cost exploration and the ability to measure the results, set data science apart from other business functions. Sure, other departments are curious too: “I wonder if clients would respond better to this type of creative?” a marketer might ask. “Would a new user interface be more intuitive?” a product manager inquires. But those questions can’t be answered with historical data. Exploring those ideas requires actually building something, which will be costly. And justifying the cost is often difficult since there’s no evidence that suggests the ideas will work. With its low-cost exploration and risk-reducing evidence, data science makes it possible to try more things, leading to more innovation.

Sounds great, right? It is! But you can’t just declare as an organization that “we’ll do this too.” This is a very different way of doing things. You need to create an environment in which it can thrive.

First, you have to position data science as its own entity. Don’t bury it under another department like marketing, product, finance, etc. Instead, make it its own department, reporting to the CEO. In some cases, the data science team will need to collaborate with other departments to provide solutions. But it will do so as equal partners, not as a support staff that merely executes on what is asked of them. Instead of positioning data science as a supportive team in service to other departments, make it responsible for business goals. Then, hold it accountable to hitting those goals — but let the data scientists come up with the solutions.

Next, you need to equip the data scientists with all the technical resources they need to be autonomous. They’ll need full access to data as well as the compute resources to process their explorations. Requiring them to ask permission or request resources will impose a cost and less exploration will occur. My recommendation is to leverage a cloud architecture where the compute resources are elastic and nearly infinite.

The data scientists will need to have the skills to provision their own processors and conduct their own exploration. They will have to be great generalists. Most companies divide their data scientists into teams of specialists (say, Modelers, Machine Learning Engineers, Data Engineers, Causal Inference Analysts, etc.) in order to get more focus. But this will require more people to be involved to pursue any exploration. Coordinating multiple people gets expensive quickly. Instead, leverage “full-stack data scientists” with the skills to do all the functions. This lowers the cost of trying things, as a single tinkering initiative may require each of the data science functions I mentioned. Of course, data scientists can’t be experts in everything. So, you’ll need to provide a data platform that abstracts away the intricacies of distributed processing, auto-scaling, etc. This way the data scientist focuses more on driving business value through testing and learning, and less on technology.

Finally, you need a culture that will support a steady process of learning and experimentation. This means the entire company must have common values for things like learning by doing, being comfortable with ambiguity, and balancing long- and short-term returns. These values need to be shared across the entire organization as they cannot survive in isolation.

But before you jump in and implement this at your company, be aware that it will be hard if not impossible to implement at an older company. I’m not sure it could have worked, even at Stitch Fix, if we hadn’t enabled data science to be autonomous from the very beginning. I’ve been at Stitch Fix for six and a half years and, with a seat at the executive table, data science never had to be “inserted” into the organization. Rather, data science was native to us in the formative years, and hence, the necessary ways-of-working are more natural to us.

This is not to say data science is destined for failure at older, more mature companies, though it is certainly harder than starting from scratch. Some companies have been able to pull off miraculous changes. And it’s too important not to try. The benefits of this model are substantial, and for any company that wants data science to be a competitive advantage, it’s worth considering whether this approach can work for you.

Eric Colson is Chief Algorithms Officer at Stitch Fix. Prior to that he was Vice President of Data Science and Engineering at Netflix. @ericcolson

RIP President George H.W. Bush

Our name, CAVU Global, was indeed inspired by President Bush. His life was one of public service and optimism. This excerpt from the NY Times captures his spirit best…

“You know Brian, you’ve got us pegged just right,” Mr. Mulroney said Mr. Bush had told him. Then he walked Mr. Mulroney down to a spot overlooking the ocean at Walker’s Point and showed him a plaque that bore the inscription “CAVU,” which as a young pilot he knew as the acronym for “ceiling and visibility unlimited.”

“Those were the words we hoped to hear before takeoff — it meant perfect flying — and that’s the way I feel about our life today,” Mr. Mulroney said Mr. Bush had told him. “CAVU: Everything is perfect. Barb and I could not have asked for better lives. We are truly happy, and truly at peace.”

5 Levels of Difficulty - Blockchain Video

Contextualization is always key when talking tech. Being smart isn’t about knowing tech; it’s about applying it in a way that is useful and meaningful for people, including the person you’re speaking with.

Blockchain, the key technology behind Bitcoin, is a new network that helps decentralize trade, and allows for more peer-to-peer transactions. WIRED challenged political scientist and blockchain researcher Bettina Warburg to explain blockchain technology to 5 different people: a child, a teen, a college student, a grad student, and an expert.

Data Science is in your future - case study interview with Alastair Woolcock

Click on the title to connect to the podcast/interview.


You have probably heard the term Big Data. What does that mean anyway? Today I'm joined by Alastair Woolcock (@AlastairKW) to talk about Data Science. We uncover the science of data analytics and how data is unlocking market-changing insights at all levels of business. Whether you're a small business trying to understand social media and its influence on your marketing efforts, or an enterprise organization trying to understand client behavior, data can be a powerful tool. There is a growing industry of data science that is trying to use the power of hypothesis to mine for insights in the mountains of data that our modern businesses produce.



Apollo Moment In Data

One hundred executives from Fortune 1000 businesses were polled and asked one question: “Do you have a competitive data strategy?”

That question requires a definition: What is strategy? Michael E. Porter, along with Jan Rivkin, published a paper in conjunction with Harvard Business School stating, “Strategy is not operational effectiveness”. While operations are necessary, they are neither sufficient nor a strategy. We agree.

For over 20 years, data produced by technology has been a vehicle to drive the operational efficiency of business processes. As an industry, we have become masters of using IT operations to provide scale and cost savings for both complex and simple tasks. Furthermore, using multi-discipline partnerships and segment technology stacks has worked extremely well for most IT executives; this approach has enabled the delivery of core applications to our business partners. However, it hasn’t provided answers. IT has made us more efficient, but now IT can lead revenue transformation with better use of core data sets and external data sets through analytics, custom algorithms, and data science.

Source image – Michael E. Porter

Operational Effectiveness vs. Strategic Positioning

CAVU is a flight acronym that stands for ‘ceiling and visibility unlimited’. At CAVU we would suggest we are in the midst of our Apollo moment in technology and data exploration.

1969 saw one of man’s greatest accomplishments achieved. Neil Armstrong walked on the moon a mere eight years after President Kennedy’s bold challenge in 1961. It was a seemingly insurmountable task when you consider we hadn’t even sent a man to space 9 years earlier, and yet that year we were broadcasting to the world man’s first steps on the moon.

Yet the 1970s were not as kind to our bold space explorers. It was not until the launch of the Space Shuttle Columbia in 1981 that a new vision and strategy was employed to again challenge the fabric of mankind… this time, could we live in space, not just travel in it?

Today we know the answers, successes, failures, tragedies and heroes of the global space community.

Those include, most recently, Chris Hadfield’s amazing journey, chronicled on such earthbound technologies as Twitter, Facebook, and Google+. However, it took reinvention: from Mercury, to Saturn, to Apollo, to the Shuttle, and now Expedition 35 on the ISS. Along the way efficiency was driven. The standing joke of millions of parts built by the lowest bidder is well known, and it is a fair comparison to how most IT departments are run today. Cost reduction is the goal, with less focus on revenue creation. However, data science, algorithms, and analytics are changing that.

So why is this our Apollo moment?

If general strategy isn’t efficiency, then it is:

  1. Unique competitive positioning of assets or services (internal and external)
  2. A focus on activities that drive business outcomes, not technology outcomes (customer insights, financial alignment, and results)
  3. Clear differentiation from competitors. Is your IT department differentiated from what can be consumed from the market? Are you providing strategic new services or just better efficiency? Are you leveraging data for competitive advantage?
  4. Sustainability: strategy is not a one-time process, but a series of activities and partnerships (hypothesis to analytics to custom algorithms, then insights, and then run the loop again)
  5. Operational effectiveness to deliver outcomes, which is a must, but quantifiable and qualitative in nature

Examples include virtual health care services enabled in the field, financial services firms using AI cognitive agents to detect market trends, and governments connecting traffic lights with traffic flow to enable connected cities through data science models.

If the last twenty years have been our Apollo mission in IT, then what’s our next frontier? Gartner would suggest Digital Media. McKinsey & Co. points to data science and analytics.

CAVU Global agrees with these analysts, but our goal isn’t merely academic; it is actual implementation.

Our goal is simple: not to tell people “what” needs to be done, but to help them with “how” to use their data and publicly available data for the betterment of society.

The answer to our opening question?

The majority of the Fortune 1000 executives polled said “no” to having a formalized data strategy and road map.

Have a question or a problem? Ready to unleash “The Power of Answers”? Email us:


CAVU Founder Recognized: Boulder Valley 40 Under 40

CAVU would like to congratulate our founder, Alastair Woolcock, on being recognized on Boulder Valley's 40 Under 40 list. Alastair's continued hard work, dedication and vision have made CAVU possible. In addition to CAVU, his role as a senior member of the team at Long View Systems and his church, as well as his contributions to the local economy and community, made him an easy selection.

About Boulder Valley 40 under 40:

These honorees have made an impact on their organizations, live or work in Boulder or Broomfield counties, have received professional recognition for significant achievements in the community and have worked to help others through community service and charitable giving.

The annual awards ceremony recognizes the best and brightest of emerging leaders in Boulder and Broomfield counties. They will be honored at an evening event set for 5 p.m., Tuesday, March 7, at the Lionsgate Event Center, 1055 S. 112th St., in Lafayette, and also will be profiled in BizWest.

Please join the CAVU Global team in congratulating Alastair on his achievement.

For more information about Boulder Valley 40 Under 40 or the awards ceremony: