Do you want to build software faster and release it more often without the risks of negatively impacting your user experience? Imagine a world where there is not only less fear around testing and releasing in production, but one where it becomes routine. That is the world of feature flags.
A feature flag lets you deliver different functionality to different users without maintaining feature branches and running different binary artifacts. Feature flags wrap certain parts of your code in conditional statements that you can turn on and off.
Feature flags are also sometimes called feature toggles, release toggles, feature switches, feature gates, or conditional features. In agile environments, you can use toggles during runtime to enable or disable a given feature on demand, for some or all users.
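As a minimal sketch of the idea, a flag can be a named boolean that a conditional checks at runtime. The flag store, the flag name `new-checkout`, and the `checkout` function are hypothetical names chosen for illustration.

```python
# Hypothetical in-memory flag store; a real system might load this from a
# config file or a flag management service.
FLAGS = {"new-checkout": False}


def is_enabled(flag_name: str) -> bool:
    """Return the flag's current value, treating unknown flags as off."""
    return FLAGS.get(flag_name, False)


def checkout() -> str:
    # Both code paths ship in the same artifact; the flag picks one at runtime.
    if is_enabled("new-checkout"):
        return "new checkout flow"
    return "old checkout flow"
```

Setting `FLAGS["new-checkout"] = True` switches callers to the new path without a new build.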
When to use feature flags
Because feature flags add a layer of complexity to a codebase, you should make sure to use them only when necessary. Feature flags can complicate your codebase in two ways. First, multiple variations of code are present, increasing the maintenance burden. Second, overlapping feature flags make it harder to have confidence in the state of production. Feature flags are like sugar in cooking: you don’t always need it, and when you do, don’t overdo it.
When it comes to software development, feature flags can change the way you build and release code. By segmenting user-facing features and ops functions into togglable flags, you enable experimentation, controlled rollouts, and the ability for non-developers to turn things on and off for customers.
However, when using feature flags, don’t keep every flag around indefinitely. Some feature flags continue to be useful long after they’re added, while others lose their utility when your entire userbase gets the corresponding feature.
Use cases
Because feature flags are so powerful, you can use them in a variety of ways, depending on your business goals and development environment. In software delivery, feature flags are particularly useful when trying to shorten time to production, roll out new functionality slowly, and release features before they’re finished:
- Shortening time to production with feature flags is simple. Keep features off for all users—except the developers and the Quality Assurance (QA) team—to make improvements before your users try them.
- Perform A/B testing by enabling a feature for some users, and not for others. You can get feedback from a specific population of your users, selected by attributes you’ve chosen, and see whether the change negatively impacts their experience.
- Even if a feature is not finished, deploy it behind a flag. Because the unfinished code already lives in the main codebase during any major refactoring, you don’t have to worry about long-lived branches that become harder to merge over time.
Other use cases include code management, percentage-based rollouts, beta releases, and rollbacks.
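One way to keep a feature off for everyone except developers and QA is to target the flag at groups. The sketch below assumes a hypothetical `User` record with a `groups` field and a hypothetical `FLAG_AUDIENCES` table. Your own user model will look different.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    user_id: str
    groups: set[str] = field(default_factory=set)


# Hypothetical rule table: each flag lists the groups allowed to see the feature.
FLAG_AUDIENCES = {"new-dashboard": {"developers", "qa"}}


def is_enabled_for(flag_name: str, user: User) -> bool:
    """A flag is on for a user who belongs to at least one allowed group."""
    allowed = FLAG_AUDIENCES.get(flag_name, set())
    return bool(allowed & user.groups)
```

Adding a group such as `"beta"` to the audience later widens the release without touching the feature code.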
What is observability?
Observability as a concept is the ability to ask any novel question about the state of your system and receive answers based on rich data providing context. Observable systems should be limitless in the number of questions you can ask, enabling engineers to iterate as they investigate and debug their complex and distributed systems. Observability engineering is the ability to collect data about a program’s execution, modules’ internal states, and the communication between all components. To discover and understand a problem within a service through observability, software engineers use observability platforms like Honeycomb to analyze all system telemetry like logs, metrics, and traces in one place.
Unlike monitoring, which only shows you that a random outage is taking place (the known), observability helps you figure out why it’s happening (the unknown) and who it impacts. In monitoring, you set up alerts based on your knowledge of how the system might fail, so they can tell you where the problem is. In observability, you examine the entire system and user experience in real time to surface anomalies and answer why something is happening before it degrades the user experience. Observability engineering lets you ask questions of the data, visualize anomalies, and pursue potential leads. The data shows you where to look and what to ask.
Monitoring can’t handle a “never seen that before” situation because it’s set up to only alert about known problems. This was great for predictable monolithic systems, but as noted, this is not the reality for modern environments. An observable system, meanwhile, lets you explore anything and everything that’s occurring during an incident (or even just out of curiosity). Monitoring is like a police officer spotting a known crime, while observability is a detective following the money.
Teams using observability can find and fix incidents faster than those relying only on monitoring, thus saving organizations time, resources, and their reputation. Furthermore, instead of poring over logs to resolve issues (which can take hours!), teams can work on higher-value projects like developing new features or increasing reliability. Observability engineering extracts all relevant information that can help resolve an outage, corralling data from disparate systems and managing the overwhelming amount of information to process. Most importantly, developers who roll out distributed tracing as part of an observability engineering effort can visualize the concrete benefits of their coding, leveraging rich data for more efficient application development.
The benefits of feature flags combined with observability
Feature flags and observability work well together for progressive delivery. The former lets you segment traffic, and the latter lets you visualize differences between hosts. With progressive delivery, you can decouple deploys from releases, validating code quickly and experimenting iteratively. Progressive delivery lets you see whether new changes benefit users before they’re broadly released. You can deploy to percentages of users, environments, and hosts. Once you’re satisfied with the results in production, you can expand the feature widely.
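To compare traffic segments, each telemetry event needs to carry the flag values that were in effect for that request. The sketch below uses plain dictionaries as events and a hypothetical `flag.<name>` field naming scheme. It is not the API of any particular observability platform.

```python
from collections import defaultdict
from statistics import mean


def build_event(route: str, duration_ms: float, flags: dict[str, bool]) -> dict:
    """Attach every evaluated flag to the event as a `flag.<name>` field."""
    event = {"route": route, "duration_ms": duration_ms}
    for name, value in flags.items():
        event[f"flag.{name}"] = value
    return event


def mean_duration_by_flag(events: list[dict], flag_name: str) -> dict[bool, float]:
    """Group events by one flag's value and average their durations."""
    groups = defaultdict(list)
    for event in events:
        groups[event.get(f"flag.{flag_name}", False)].append(event["duration_ms"])
    return {value: mean(durations) for value, durations in groups.items()}
```

With flag state on every event, a question like "is the flagged cohort slower?" becomes a simple group-by rather than guesswork.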
Honeycomb uses observability and feature flags itself. If you want to nerd out, read this progressive delivery story about how feature flags helped Honeycomb realize that correlation didn’t imply the obvious causation, and in fact, was actively misleading.
How to use feature flags
You can use feature flags in a variety of ways: to increase productivity, mitigate risk, test for bugs, A/B test changes, show demos to prospective clients, offer new features in beta programs, target features to audiences, and more.
More productivity, less risk
Feature flags let you deliver more features while mitigating risk. Wrapping different versions of code in conditional statements that you can turn on and off lets you work more efficiently under less stress.
Continuous deployment vs. continuous delivery
You can use feature flags to continuously ship new code to production while releasing new features to users only when they are ready. This minimizes risk by decoupling deployment to production from release to your userbase.
Testing in production
Feature flags let you test new features in production while mitigating the risk of a poor release. Testing with real, live users paints a much more accurate picture of a release’s behavior. Instead of trying to simulate the production environment in staging, you can validate the functionality of a new feature with your users and gather feedback. You can also gain insights into how changes impact your code’s performance.
A/B testing
Feature flags are ideal when using A/B testing to compare alternative versions of a feature. If you want to experiment and try different versions on a userbase, feature flags let you do so by flipping a switch to collect and observe usage data. Flip the switch all the way to enable the winning option for all users.
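A rough sketch of the mechanics: assign each user a stable variant, tally usage per variant, then compare. The variant names, the conversion metric, and the naive winner selection are assumptions made for illustration. A real experiment would also need a statistical significance check.

```python
import hashlib

VARIANTS = ("control", "treatment")


def assign_variant(experiment: str, user_id: str) -> str:
    """Give each user the same variant every time for a given experiment."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    return VARIANTS[int(digest, 16) % len(VARIANTS)]


def conversion_rates(
    exposures: dict[str, int], conversions: dict[str, int]
) -> dict[str, float]:
    """Conversion rate per variant; variants with no exposures are skipped."""
    return {
        variant: conversions.get(variant, 0) / count
        for variant, count in exposures.items()
        if count > 0
    }


def pick_winner(rates: dict[str, float]) -> str:
    """Naive choice: the highest rate, with no significance testing."""
    return max(rates, key=rates.get)
```

Hashing the experiment name together with the user ID keeps assignments independent across experiments, so the same users do not always land in the treatment group.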
Percentage-based rollout
Feature flags let you select a small number of users to test a new feature or a new design via a percentage-based rollout. You can increase or decrease the percentage as you observe how users behave under the changes. Once the changes are stable and user feedback is positive, you can scale up to 100%.
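A common way to implement this is to hash each user into a stable bucket and compare the bucket to the current percentage. This is a sketch under that assumption, and the function names are hypothetical.

```python
import hashlib


def bucket(flag_name: str, user_id: str) -> int:
    """Map a user to a stable bucket from 0 to 99 for a given flag."""
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % 100


def in_rollout(flag_name: str, user_id: str, percentage: int) -> bool:
    """True when the user's bucket falls below the rollout percentage."""
    return bucket(flag_name, user_id) < percentage
```

Because a user's bucket never changes, raising the percentage only adds users. Anyone who had the feature at 10% still has it at 50%.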
Beta releases
Feature flags let you test a new feature on a subgroup of users to see how it performs and gather feedback just from that cohort. If you observe quality results, you can roll it out to a broader audience. If you don’t, you’ve limited the risk by not launching it to your full userbase, and you can roll it back for the smaller audience. This can be useful for figuring out if a new feature has the desired impact on a userbase or to catch any late-stage bugs that snuck through.
Rollbacks
You can use feature flags as a kill switch. If you need to disable a new feature, there’s no need to re-deploy or push any code. Enabling or disabling a new feature is as simple as editing a config file. If a new feature causes a crash or if you discover a bug, you can use its feature flag to roll it back immediately, without touching your source code. If your code lives somewhere not under your complete control, like a public cloud or an app store, you can release or roll back new features without having to deploy code or get approval. Furthermore, you can have a “lite” or “saver” feature flag that you flip during high-demand periods.
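Here is one possible shape for a config-file kill switch. It assumes a JSON file that maps flag names to booleans. The file layout and function names are illustrative.

```python
import json
from pathlib import Path


def load_flags(path: Path) -> dict:
    """Read flags from disk on every call so edits apply without a deploy."""
    try:
        data = json.loads(path.read_text())
    except (FileNotFoundError, json.JSONDecodeError):
        # Fail safe: if the file is missing or malformed, treat all flags as off.
        return {}
    return data if isinstance(data, dict) else {}


def is_enabled(path: Path, flag_name: str) -> bool:
    """Look up one flag, defaulting to off when it is absent."""
    return bool(load_flags(path).get(flag_name, False))
```

Rereading the file on each check is the simplest option. A busier service would likely cache the result and refresh it on a timer.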
Flexible code management
You can leverage feature flags to disable a feature, even if you’re not a developer. If something is going wrong in the production environment, you can toggle a feature flag on or off, depending on the situation, without asking a developer to change code and go through a code review process. Anyone on the team who understands feature flags can immediately address bugs, outages, and other problems.
Who uses feature flags?
As hinted above, feature flags are not just for engineering teams. Yes, developers set up feature flags and are the biggest beneficiaries of their use. Other stakeholders also stand to benefit, however, including product, sales, customer support, ops, and management. Feature flags ensure that engineering is not the bottleneck for teams helping customers:
- Product managers and QA teams can use feature flags to manage rollouts and turn functionality on and off as needed.
- DevOps teams can use feature flags to help product managers better control releases, coordinate launch timings, and create feedback loops.
- Sales and support teams can use feature flags to manage unfinished features or new features for customers.
- Ops can use feature flags to quickly react to problems, for example by disabling code that is working inconsistently or causing a crash.
- Management can use feature flags to gain visibility into what’s happening in development, check how new features are being tested with users, or create and enforce governance and standardization.
When other parts of the company use feature flags, they don’t have to waste a developer’s time to get work done. This frees up developers to do more interesting work, like shipping new features.
Feature flag best practices
There are myriad ways to implement feature flags. Following these best practices will help you avoid future headaches:
Control access to flags
Set up logging so you can track who made which change. Such transparency is useful to reduce dependency between product and engineering teams.
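A minimal sketch of change tracking might record who flipped each flag and when. The `AuditedFlags` class and its fields are hypothetical, and a production system would persist the history rather than keep it in memory.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class FlagChange:
    flag_name: str
    old_value: bool
    new_value: bool
    changed_by: str
    changed_at: datetime


class AuditedFlags:
    """Flag store that records every change along with who made it."""

    def __init__(self) -> None:
        self._values: dict[str, bool] = {}
        self.history: list[FlagChange] = []

    def set(self, flag_name: str, value: bool, changed_by: str) -> None:
        old = self._values.get(flag_name, False)
        self._values[flag_name] = value
        self.history.append(
            FlagChange(flag_name, old, value, changed_by, datetime.now(timezone.utc))
        )

    def is_enabled(self, flag_name: str) -> bool:
        return self._values.get(flag_name, False)
```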
Use a standardized naming scheme
Set up a naming convention that makes the type of flag clear (release, experiment, permission, kill, etc.) and prevents flags with the same or similar names. You don’t want someone toggling the wrong flag because of a misunderstood name. Whether you use a feature flag management tool, a config file, or a database table, everyone who uses feature flags should be able to understand what a given flag does based on its name.
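A convention is easier to keep when a check enforces it. The sketch below assumes a made-up format, `<type>-<team>-<description>` in lowercase. Substitute whatever scheme your team agrees on.

```python
import re

# Assumed convention: "<type>-<team>-<description>", all lowercase.
FLAG_TYPES = ("release", "experiment", "permission", "kill")
NAME_PATTERN = re.compile(
    r"(?P<type>" + "|".join(FLAG_TYPES) + r")"
    r"-(?P<team>[a-z0-9]+)"
    r"-(?P<description>[a-z0-9]+(?:-[a-z0-9]+)*)"
)


def parse_flag_name(name: str) -> dict[str, str]:
    """Split a flag name into its parts, or raise ValueError if it is malformed."""
    match = NAME_PATTERN.fullmatch(name)
    if match is None:
        raise ValueError(f"flag name does not follow the convention: {name!r}")
    return match.groupdict()
```

Running a check like this when a flag is created catches ambiguous names before anyone has a chance to toggle the wrong one.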
Manage different flags differently
Set up a management system for your flags. Flags are not all created equal. You should maintain each one using standards based on its use, how critical it is, and who will use it.
Make flag settings visible
Set up a system to check what feature flag settings a specific user has. Store this information in the user’s profile in your database and analytics system. It will be useful later when troubleshooting issues and understanding A/B tests.
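One way to make settings visible is to evaluate every flag for a user and store the result as a snapshot. The rule table and user fields below are hypothetical placeholders.

```python
from typing import Callable

# Each hypothetical rule decides whether one flag is on for a given user.
Rule = Callable[[dict], bool]

RULES: dict[str, Rule] = {
    "beta-reports": lambda user: user.get("beta_tester", False),
    "new-billing": lambda user: user.get("plan") == "enterprise",
}


def flag_snapshot(user: dict) -> dict[str, bool]:
    """Evaluate every known flag for one user, for storage or troubleshooting."""
    return {name: bool(rule(user)) for name, rule in sorted(RULES.items())}
```

A snapshot like this can be saved alongside the user profile or attached to support tickets, so nobody has to guess which variant a customer saw.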
Avoid dependencies between flags
Set up each flag with a specific purpose that is independent from any other flag. You don’t want flags to rely on one another or conflict with other flags. Flags with dependencies aren’t just confusing for the team—they can also cause difficult-to-debug user issues.
Clean up your flags
Set up a system to regularly (say, monthly or quarterly) delete feature flags you no longer use. Some flags are useful for the life of the product. Others are temporary and you need to remove them to avoid technical debt. Often referred to as flag debt, this type of technical debt occurs when your code becomes littered with useless flags. If you’re using a standardized naming scheme or a service that can help you determine if a flag is still in use, flag cleanup should be straightforward.
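A cleanup pass could start from a report of flags that look finished. This sketch assumes you track, for each flag, whether it is meant to be permanent, its rollout percentage, and when it last changed. The 90-day threshold is an arbitrary example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class FlagRecord:
    name: str
    permanent: bool          # long-lived flags such as kill switches
    rollout_percentage: int  # 100 means every user already has the feature
    last_changed: date


def stale_flags(
    flags: list[FlagRecord], today: date, max_age_days: int = 90
) -> list[str]:
    """Names of temporary flags that are fully rolled out and untouched for a while."""
    return [
        flag.name
        for flag in flags
        if not flag.permanent
        and flag.rollout_percentage == 100
        and (today - flag.last_changed).days >= max_age_days
    ]
```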
How to implement feature flags
You can implement feature flags using a management service that defines the flag, a run-time query to figure out the value of the flag, or an if/else construct.
The easiest way to start is to use if/else statements in your code. For a more robust solution, leverage open-source projects and libraries for your preferred programming language. If you require more complex logical statements than just Boolean, use a feature flag management tool.
To figure out what works for your team, consider the pain points you’re trying to address. Are your use cases only for developers or for the entire company? Would you prefer to build or buy a feature flagging management system?
Since there are different types of feature flags, there is no one-size-fits-all method to implement a flag. That said, you should:
- Keep new features hidden behind feature flags so you can continuously push code.
- Segment your users for those features according to their device type, location, and other attributes, such as if they are in a beta test group.
- Determine which users and what percentage of them will get a given feature once it’s ready.
- Roll it out and observe how it performs.
- If there are problems, flip the flag off.
- If everything is sunshine and roses, ramp up to larger percentages until you’ve launched the feature to 100% of your users.
- Keep the flag only if you still might need it. Otherwise, make sure to remove it on flag cleanup day.
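The rollout steps above can be sketched as a small loop that ramps up in stages and flips the flag off if errors climb. The ramp schedule, the error threshold, and the two callbacks are all assumptions. In practice `error_rate` would query your observability tooling after a waiting period.

```python
from typing import Callable

# Assumed ramp schedule; pick steps that suit your own traffic.
RAMP_STEPS = (1, 10, 50, 100)


def ramp_up(
    set_percentage: Callable[[int], None],
    error_rate: Callable[[], float],
    max_error_rate: float = 0.01,
) -> int:
    """Walk through the ramp steps, flipping the flag off if errors climb.

    Returns the final rollout percentage (0 after a rollback).
    """
    for step in RAMP_STEPS:
        set_percentage(step)
        # In practice you would wait and watch telemetry here before deciding.
        if error_rate() > max_error_rate:
            set_percentage(0)
            return 0
    return RAMP_STEPS[-1]
```

Passing the setter and the error check in as callbacks keeps the loop independent of any specific flag service or telemetry backend.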
Recapping the benefits of feature flags and observability
Feature flags and observability work together to let you build code faster and more safely. Implement both to have greater control over releases, rollbacks, and everything in between. When something inevitably goes wrong, use feature flags to act, and observability to figure out what happened.
To learn more about how to control features in production using feature flags and observability, watch this talk with our partner, LaunchDarkly.