Bottom Line Up Front

Great teams relentlessly pursue mastery of the basics, and this holds especially true when working with data. Drawing lessons from the U.S. Marine Corps, I suggest a series of foundational elements to master working with data.

U.S. Marine Corps Tenets:

  1. Orders & confirmation briefs

  2. Checks & inspections

  3. After action reviews & debriefs

  4. Guardian angels

  5. “Geometry of fires”

  6. Unity of command

How it Applies to Working With Data:

  1. Data dictionaries & contracts

  2. Testing & validation

  3. Blameless post-mortem

  4. Pair programming

  5. Governance & architecture reviews

  6. Assigning ownership

Purpose & Overview

There is an abundance of TED talks, books, and other material likening the conduct of business to the nature of war. The reasoning makes sense. Both environments are characterized by complexity, competition, confusion, and disruption. And more often than not, the techniques that work well in one environment work well in the other.

Given this dynamic, one would expect to see a lot of overlap between the military and business fields, certainly in their thinking, priorities, and ways of working. But this is not always the case, especially when working with data and technology.

All too often, businesses are placing a premium on new, flashy “solutions” and perceived efficiencies in their way of working. This stands in stark contrast to the prevailing philosophy of the U.S. military, where “Brilliance in the Basics,” not flash and buzzwords, is esteemed above all else.

I am firmly in the camp of “Brilliance in the Basics,” especially when it comes to working with data. I believe that championship quality teams are made through a relentless, fierce pursuit of getting the basics right.

The purpose of this white paper is to:

  1. Outline my belief in emphasizing the basics over showy “solutions”

  2. Give a few applicable examples of the basics emphasized by the military

  3. Draw inference from these examples to identify best practices in working with data and technology

First, get the basics right. Next, get the basics right...

On the eve of a fight with Evander Holyfield, Mike Tyson famously commented “Everyone has a plan… until they get punched in the mouth.” Unsurprisingly, this quote is widely disseminated among the ranks of the U.S. Military. But is that to say plans aren’t important? Of course not.

The essence of Mike Tyson’s quote is to say there is something more important than plans, something essential to be able to “roll with the punches” and win. That something is, of course, conquering the basics. For boxers, it is conditioning and mastering the fundamental elements of moving, punching, and defending.

For the U.S. Marine Corps, the basics are fighting, moving, and communicating. But what is more important than how they define the basics is the belief that mastering them is key. Such mastery creates teams that thrive under pressure, produce quality outputs, and adapt quickly to rapidly changing situations. I believe this philosophy holds especially true when working with data. In particular, the principles espoused in the Marine Corps carry over nicely to the world of data and technology.

Examples that inspire from the U.S. Marine Corps

The U.S. Marine Corps is a unique organization in the United States. The Corps is over 200 years old and has repeatedly been thrust into dangerous, austere environments wherein the cost of failure was catastrophic. The Corps has done a great job of institutionalizing a culture and building upon lessons learned from previous generations. As a result, the Corps is a paragon of American society. Marines are fast, tough, efficient, and command a great deal of respect. “Send in the Marines” is a phrase that means something.

I'm not arguing that the Corps is perfect. But it does a lot of things well, and a lot of things that could, and should, be imitated when working with data. A few examples are:

  1. Mission orders & confirmation briefs

  2. Pre-operational checks & inspections

  3. After action reviews & debriefs

  4. Guardian angels

  5. “Geometry of fires”

  6. Unity of command

Mission orders & confirmation briefs

When it comes to organizing, developing, and communicating a plan, the Marine Corps has one template: “OSMEAC,” which has been roughly outlined below (using less military-specific jargon).

  1. Orientation

    a. Team overview

    b. Ecosystem overview

  2. Situation

    a. The goal of your boss / client

    b. The problem being solved for

    c. The goal of adjacent teams

  3. Mission

    a. 5W’s

    b. Purpose

  4. Execution

    a. What success looks like

    b. Team & individual tasks

    c. Critical reporting requirements

    d. Meeting cadence

    e. Technical details

  5. Admin & logistics

    a. Meeting rooms

    b. Flights, cars, charge codes, etc.

  6. Communication plan

    a. Communication channels (Slack, Teams, etc.)

    b. Ranking of who is in charge

Every single individual throughout the entire Corps is expected to know, and use, this framework. There are no exceptions, and no deviations. Moreover, the contents of this planning framework are a requirement for any initiative, no matter how small it might seem. Whether you are invading a country, or simply taking your team out for a jog, all of the content of the “OSMEAC” planning template will be covered.

At first this process seems cumbersome, at least for small asks around the office. It is not intuitive that communicating all of this info would actually increase the speed of an organization. Yet, this practice is shockingly effective.

Giving individuals a more thorough orientation to a task, a clear articulation of the mission and purpose, their specific roles, and an explanation of the job specifics, enables them to better perform their duties. This applies to any team, especially those working with data.

Another area wherein this practice from the Marine Corps can be applied to working with data is in the use of data dictionaries. In sum, a data dictionary is a contract about what each of the terms within a data set means. For instance, it could define what is meant by a “user,” and how a user is recorded in the data.

Such a distinction might seem too detailed, but consider that multiple teams in a business might have widely different interpretations of what is meant by a “user.” One team might define a “user” as anyone who visits a website, whereas another team might only define a “user” as someone who subscribes to a service or buys a product. Such small misunderstandings can have major implications for the recording and use of data, so it is essential to have thoughtful data contracts that outline all of this in the first place.

I've given users only as an example, but the aspects of a data dictionary would cover any term or event in the data. Time zones, credits, and multiples are all examples of terms that would be explicitly defined in a data dictionary.

It is important to note that a data dictionary doesn’t have to be anything fancy. It could be something as simple as a Google document. The important thing is simply to maintain and use a data dictionary. In the same way the Marine Corps is disciplined about the use of operations orders, so too should businesses be disciplined about their data dictionaries.
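To make the idea concrete, here is a minimal sketch of a data dictionary kept in code rather than in a document. Everything in it is an assumption made for illustration: the field names, the definitions, and the helper find_violations are hypothetical, and a real team would agree on its own entries.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class FieldDefinition:
    """One entry in the data dictionary (hypothetical structure)."""
    name: str
    dtype: type
    description: str


# Hypothetical entries; the definitions are illustrative, not prescriptive.
DATA_DICTIONARY = {
    "user_id": FieldDefinition("user_id", str, "Someone with an active paid subscription."),
    "event_time_utc": FieldDefinition("event_time_utc", str, "Event timestamp, ISO 8601, always UTC."),
    "credits": FieldDefinition("credits", int, "Whole credits remaining on the account."),
}


def find_violations(record: dict) -> List[str]:
    """Return human-readable problems found in one record."""
    problems = []
    # Every field in the dictionary must be present with the agreed type.
    for name, definition in DATA_DICTIONARY.items():
        if name not in record:
            problems.append(f"missing field: {name}")
        elif not isinstance(record[name], definition.dtype):
            problems.append(f"wrong type for {name}: expected {definition.dtype.__name__}")
    # Fields nobody has defined are flagged rather than silently accepted.
    for name in record:
        if name not in DATA_DICTIONARY:
            problems.append(f"undefined field: {name}")
    return problems
```

The point is not the tooling. A shared document works just as well, provided the definitions are written down and records are actually checked against them.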

Checks and Inspections

“Inspect what you expect” is a widely used phrase in the Marine Corps. Indeed, the Marine Corps obsesses over what it terms “pre-combat checks & inspections.” These inspections cover all aspects of an operation, from the mission gear and equipment, to individuals' understanding of the plan, to group rehearsals.

Of course, these inspections take a lot of time and seldom fit nicely within an already busy schedule. Yet, the Marines are highly disciplined in doing them.

There is a clear parallel between the Marine Corps' “pre-combat inspections” and testing and validation in data. When working with data, it is critical to have automated tests anyone can easily run at the push of a button. Anyone can inspect data visually, but visual inspections don't scale. Automated tests can be run on a regular cadence, every time the data changes. This matters because data volumes grow quickly. As data grows, entropy sets in and problems creep into the data. Over time, businesses can find they are making decisions on bad data.
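As a rough illustration of a “push of a button” check, the sketch below validates a batch of records. The field names order_id and amount, and the rules applied to them, are assumptions chosen for the example rather than anything prescribed here.

```python
def validate_rows(rows):
    """Run simple checks on a list of dict records and return failure messages."""
    failures = []
    seen_ids = set()
    for index, row in enumerate(rows):
        # Check 1: every row needs an order_id, and it must be unique.
        order_id = row.get("order_id")
        if order_id is None:
            failures.append(f"row {index}: order_id is missing")
        elif order_id in seen_ids:
            failures.append(f"row {index}: duplicate order_id {order_id}")
        else:
            seen_ids.add(order_id)
        # Check 2: amount must be a number and cannot be negative.
        amount = row.get("amount")
        if not isinstance(amount, (int, float)) or amount < 0:
            failures.append(f"row {index}: amount must be a non-negative number")
    return failures


if __name__ == "__main__":
    sample = [{"order_id": 1, "amount": 10.0}, {"order_id": 1, "amount": -2}]
    for message in validate_rows(sample):
        print(message)
```

Because the check is a plain function, it can be run by hand, on a schedule, or every time new data lands, which is what makes it scale where visual inspection does not.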

After Action Reviews & Debriefs

Just as Marines obsess about inspections, so too do they obsess about debriefing operations. In fact, an operation isn’t considered complete until it has been properly debriefed to determine what lessons could be learned from it and how the organization could improve in future iterations. These debriefs are conducted for all operations, both successful and failed. The lessons from each operation are captured, retained, and disseminated to other units in a timely manner. This is an area wherein the data world could learn a lot from the Marines.

“Blameless post mortems” are a common technique for teams working in technology and data. Blameless post mortems are where an individual gives an open talk about something that went wrong so other developers and engineers can learn from their mistakes. While this is certainly worthwhile, it is an incomplete practice.

Data and engineering teams should follow the Marine Corps' example of conducting after action reviews and debriefs. At the end of each initiative, the entire team should set up time to talk through what went well, what could be improved, and what they would do differently. The lessons learned should be documented, retained, and disseminated to other teams working in similar functions. Furthermore, if a theme or trend is identified from debriefing several teams over time, leadership should be notified.

Guardian Angels

Another common expression in the Marines is the phrase “two is one, and one is none.” In short, every critical task or piece of equipment needs to be duplicated so as to not be a single point of failure. This methodology is certainly applied to the Marine Corps' strongest asset: the Marine. Marines never go anywhere alone, nor do they work on tasks alone.

In business, the concept of duplicating equipment and teaming up on everything probably sounds like nails on a chalkboard. Indeed, it is not intuitive that teams would actually be more efficient when everyone works in pairs and key equipment is duplicated. But the Marine Corps has thrived through hundreds of years of volatile, dynamic, and extremely high-pressure environments. This experience has left it with the clear belief that “guardian angels” are non-optional.

I believe similar concepts to “guardian angels” exist in data in the form of pair programming and reducing the “bus factor.”

Pair programming is when two people team up to complete a development / engineering task. One person drives, and the other acts as navigator, reviewing the work and suggesting what to do next. Like guardian angels, it is not intuitive this would work well. On paper, it looks like you are taking two highly paid people to do one job. This technique, however, keeps many more bugs from being introduced to code bases, which saves a lot of work further down the line.

The “bus factor” is a term meant to indicate especially vulnerable parts of a code base. The idea is that if the one person who wrote and understands a piece of code gets “hit by a bus,” there will be huge problems because nobody else understands it.

The “bus factor” risk behind each aspect of a code base should be minimized at any cost. Key infrastructure that is only understood by a small group of people, or even just a single person, exposes a company to significant risk.
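One simple way to spot this kind of exposure is to list the parts of a code base that only one person has ever touched. The sketch below assumes a hypothetical mapping from file path to contributor names, which in practice might be assembled from version control history; the function name and the sample data are illustrative only.

```python
def single_owner_files(authors_by_file):
    """Return files with at most one known contributor, sorted by path."""
    return sorted(
        path
        for path, authors in authors_by_file.items()
        # A set removes repeat entries for the same person.
        if len(set(authors)) <= 1
    )


if __name__ == "__main__":
    # Hypothetical contribution history for three files.
    history = {
        "billing/invoice.py": ["ana", "ana", "ana"],
        "etl/load_users.py": ["ben", "chris"],
        "etl/legacy_export.py": ["dee"],
    }
    for path in single_owner_files(history):
        print(path)
```

Files on this list are natural candidates for pair programming or a walkthrough with a second engineer.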

“Geometry of Fires”

While extremely useful to reference, and applicable to data, the term “geometry of fires” needs a little explanation for those not familiar with the military.

In short, geometry of fires refers to the understanding of each unit’s maneuvers, fires (guns, cannons, bombs, etc.) and how they impact the space they are operating in. More importantly, it refers to deconflicting the effects of these fires so units don’t end up firing on each other or civilians. And while this might sound straightforward, it is incredibly challenging in a theater of war, which can be characterized by speed, stress, confusion, and friction.

Once explained a little, it is clear to see how this concept applies to a tech development environment. It is increasingly important to deconflict work efforts as companies grow and continue to have more engineers working on projects.

Version control software is key for deconflicting work efforts. There are several version control systems; Git, often used through hosting platforms such as GitHub, is among the most popular. These systems allow developers to collaborate without interfering with each other's work. This is a critical function, and it is a major warning sign when companies are not using version control.

Another useful tool for deconflicting efforts is the use of Architecture Review Boards (ARBs). These boards are a good way to communicate across the teams who are working in tech development. ARBs go hand in hand with the data dictionaries mentioned earlier.

Unity of Command

The last of the concepts being gleaned from the Marine Corps is the easiest to explain. “Unity of Command” basically just means clearly delineating who is in charge and ensuring everyone is working towards a common purpose. The Marines do this very well. It should be done equally well when working with data.

It is quite common for services, databases, and other parts of a company’s tech stack to be operated by different teams. But it is crucial that ownership and accountability are explicitly defined for each component of the tech stack.

All too often, companies are in a position where multiple people are working on the same code base, but there is not a clear owner. Such situations represent a significant risk to a firm. Though it may not always be comfortable to do so, firms ought to maintain discipline in clearly assigning owners to the components of their code.
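A lightweight way to keep that discipline is to record owners in one place and check the record automatically. The sketch below is only an illustration under assumed inputs: the component names, team names, and the function ownership_gaps are all hypothetical.

```python
def ownership_gaps(components, owners_by_component):
    """Return a dict of components that lack exactly one clear owner."""
    gaps = {}
    for component in components:
        owners = owners_by_component.get(component, [])
        if len(owners) == 0:
            gaps[component] = "no owner assigned"
        elif len(owners) > 1:
            # More than one owner usually means nobody is truly accountable.
            gaps[component] = "multiple owners: " + ", ".join(sorted(owners))
    return gaps


if __name__ == "__main__":
    # Hypothetical tech stack and ownership record.
    stack = ["orders-db", "events-pipeline", "reporting-api"]
    owners = {
        "orders-db": ["platform-team"],
        "events-pipeline": ["data-eng", "analytics"],
    }
    for component, problem in ownership_gaps(stack, owners).items():
        print(f"{component}: {problem}")
```

Running a check like this whenever a component is added makes missing or contested ownership visible early, before it becomes a problem during an incident.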

Summary

If there is one lesson to be taken from this white paper, let it be this: obsess over the basics.

The points presented here have been gleaned from the U.S. Marine Corps, an organization I believe especially useful for benchmarking because of its well-earned reputation for toughness and discipline.

There are, of course, other points that could be considered the “basics” for working with data and technology. These are just a few.

More important than the definition of the basics, however, is the belief that they should be the key focus of teams who wish to succeed in their respective fields.

It is my firmly held belief that amateurs focus on new and flashy “solutions” and quick fixes for their teams/companies. True professionals and champions forgo such distractions entirely and obsess over mastering the basics.