
Kameleoon offers two types of dynamic traffic allocation algorithms to help maximize experiment performance: multi-armed bandits (MABs) and contextual bandits. Both approaches use real-time performance data to allocate more traffic to better-performing variations, but they differ in how they treat user data. This article explains how these algorithms work, when to use them, and how to activate them in your experiments.

To enable dynamic traffic allocation:

1. Create a new experiment or open an existing one.
2. In the Variation to serve section, select the preferred allocation from the dropdown menu: Multi-armed bandit or Contextual bandit optimization.

Kameleoon updates the allocation based solely on the lift of the primary goal.

Multi-armed bandits

When using dynamic allocation (such as MABs), you cannot manually edit exposure rates. Instead, Kameleoon automatically measures improvement over the original variation and estimates the gain in total conversions using the Epsilon Greedy algorithm. Kameleoon repeats this process hourly. The MAB algorithm redirects traffic to higher-performing variations, even without statistical significance, which can drastically reduce the time required to identify winning or losing variations.
Auto-optimized experiments rely on the original variation (“off” for Feature Experiments) as the baseline for computing traffic deviations. If the original variation receives no traffic, the deviation cannot be updated, and the allocation may remain at 50/50 even when there is a clear winning variation.
MABs do not rely on a control or baseline experience. Unlike A/B tests, MABs start from an equal allocation and then dynamically shift traffic toward better performers based on real-time results. When statistical analysis is less important and you need to minimize “exploration” time, MABs are useful because they focus more on “exploitation.”
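To make the exploration/exploitation trade-off concrete, here is a minimal epsilon-greedy sketch. This is an illustration of the general technique, not Kameleoon's implementation; the function name and parameters are hypothetical, and Kameleoon recomputes its allocation hourly rather than per visitor.

```python
import random

def epsilon_greedy_allocate(conversions, visitors, epsilon=0.1):
    """Pick a variation index: explore a random variation with
    probability epsilon, otherwise exploit the variation with the
    best observed conversion rate so far."""
    arms = len(conversions)
    if random.random() < epsilon:
        return random.randrange(arms)  # explore: random variation
    rates = [c / v if v else 0.0 for c, v in zip(conversions, visitors)]
    return max(range(arms), key=rates.__getitem__)  # exploit: best rate
```

With a small epsilon, most traffic flows to the current leader while a slice is still reserved for re-checking the other variations, which is what lets the algorithm recover if the leader changes over time.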

Contextual bandits

Contextual bandits dynamically optimize traffic allocation in experiments using machine learning. They adapt in real time, redistributing traffic based on variation performance and user context to maximize effectiveness. The key differences between multi-armed bandits and contextual bandits are:
  • Multi-armed bandits: These optimize traffic distribution among multiple variations (arms) to maximize a defined goal, such as click rates or conversions. They treat all users equally, with no distinction based on user attributes. This makes them ideal for scenarios where user-specific data is unavailable or unnecessary, and the focus remains on finding the best-performing variation for the overall audience.
  • Contextual bandits: These incorporate additional user-specific data—such as device type, location, or behavior—into decision-making. They facilitate more personalized decisions by tailoring variations to specific users for improved outcomes. The variability introduced by user attributes allows contextual bandits to optimize decisions in dynamic environments.
While multi-armed bandits optimize traffic allocation uniformly across users, contextual bandits leverage contextual data to make more personalized, data-driven decisions. Read the Dynamic traffic allocation article to learn more about how MAB optimization works, or read the Kameleoon statistical paper to dive deeper into the technical details of the MAB algorithm.
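To illustrate how incorporating context changes the decision, the sketch below keeps separate statistics per user context (for example, device type) so each context can converge to its own best variation. This is a simplified per-context epsilon-greedy illustration, not Kameleoon's machine-learning algorithm; the class and method names are hypothetical.

```python
import random
from collections import defaultdict

class ContextualEpsilonGreedy:
    """Track conversion statistics separately for each user context,
    so e.g. mobile and desktop visitors can each get a different winner."""

    def __init__(self, n_variations, epsilon=0.1):
        self.n = n_variations
        self.epsilon = epsilon
        # stats[context][variation] = [conversions, visitors]
        self.stats = defaultdict(lambda: [[0, 0] for _ in range(n_variations)])

    def choose(self, context):
        """Pick a variation for this context: explore with probability
        epsilon, otherwise exploit this context's best-performing arm."""
        if random.random() < self.epsilon:
            return random.randrange(self.n)
        arms = self.stats[context]
        rates = [c / v if v else 0.0 for c, v in arms]
        return max(range(self.n), key=rates.__getitem__)

    def update(self, context, variation, converted):
        """Record one visitor's outcome for the served variation."""
        arm = self.stats[context][variation]
        arm[1] += 1
        if converted:
            arm[0] += 1
```

A plain multi-armed bandit would pool all of this data into a single set of statistics and pick one winner for everyone; keeping per-context arms is the simplest way to see why contextual bandits can serve different variations to different user segments.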