Observability-Driven Testing

Observability is the practice of instrumenting systems so that their internal state can be understood from outputs such as logs, metrics, and traces. It helps development and QA teams identify and resolve issues before they affect end users.
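
As a minimal sketch of what instrumenting can look like, the Python below emits one structured JSON log line per request, carrying a trace ID and a duration so that a log line can later be correlated with a trace. The handler name handle_checkout, the logger name, and the field names are illustrative assumptions, not part of any particular tool's API.

```python
import json
import logging
import time
import uuid
from typing import TextIO


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object so log tooling can index its fields."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": record.created,
            "level": record.levelname,
            "message": record.getMessage(),
            # Hypothetical correlation fields, attached by the caller via `extra=`.
            "trace_id": getattr(record, "trace_id", None),
            "duration_ms": getattr(record, "duration_ms", None),
        }
        return json.dumps(payload)


def build_logger(stream: TextIO) -> logging.Logger:
    """Create a logger that writes JSON lines to the given stream."""
    logger = logging.getLogger("checkout-demo")  # hypothetical logger name
    logger.handlers.clear()
    logger.propagate = False
    logger.setLevel(logging.INFO)
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger


def handle_checkout(logger: logging.Logger, trace_id: str) -> None:
    """Hypothetical request handler that reports its own duration."""
    start = time.perf_counter()
    # ... the real work of the request would happen here ...
    duration_ms = (time.perf_counter() - start) * 1000
    logger.info(
        "checkout completed",
        extra={"trace_id": trace_id, "duration_ms": round(duration_ms, 3)},
    )


if __name__ == "__main__":
    import sys

    handle_checkout(build_logger(sys.stderr), trace_id=uuid.uuid4().hex)
```

Because every line is a JSON object with the same keys, a tester can filter by trace_id to follow a single request through the logs.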

Why Observability Matters in Testing

Even the most thorough test suites can miss real-world problems:

  1. Tests pass with all green dashboards, yet a hotfix is needed after deployment.
  2. Users report rare or unusual errors via support tickets that are difficult to reproduce.
  3. Developers and testers spend hours investigating alerts rather than fixing issues.

Monitoring provides visibility into known issues, but observability helps you understand and debug the unknowns, making your testing more proactive and insightful.

Quoting "Test Automation University":
"Monitoring lets you track expected behaviour and alert on deviations. Observability empowers teams to understand and respond to the unexpected."

Observability in QA

Observability allows QA to extend testing beyond traditional pre-production environments:

  • Production-like Testing: While deep testing in production is limited due to security and data constraints, observability enables real-time monitoring of how your product behaves with actual users.
  • Predictive Analysis: Detect potential failures before they become critical.
  • Faster Debugging: Provides rich insights to reduce the time spent analysing errors.

Use Cases in QA

1. Exploratory Testing

  • Observability tools provide access to logs, metrics, and traces, helping testers dig deeper into unexpected behaviour.
  • Even without immediate solutions, detailed observability data allows better collaboration with developers.

2. Decision-Making

  • Testers can make data-driven decisions by exploring patterns, system behaviour, and performance insights.
  • Supports informed risk assessments and test prioritization.
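
One way to picture data-driven test prioritization is a small ranking over per-endpoint metrics. The sketch below is an assumption-laden illustration: the EndpointStats fields, the min_requests cut-off, and the sample numbers are all hypothetical, and real data would come from whatever export your observability tool offers.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class EndpointStats:
    """Hypothetical per-endpoint summary taken from observability data."""
    name: str
    requests: int
    errors: int


def prioritise(stats: Iterable[EndpointStats], min_requests: int = 100) -> List[str]:
    """Rank endpoints by error rate, highest first.

    Endpoints with fewer than `min_requests` requests are skipped because
    their error rate is too noisy to act on. Ties are broken by traffic.
    """
    floor = max(min_requests, 1)  # also guards against division by zero
    eligible = [s for s in stats if s.requests >= floor]
    ranked = sorted(
        eligible,
        key=lambda s: (s.errors / s.requests, s.requests),
        reverse=True,
    )
    return [s.name for s in ranked]


# Made-up sample data for illustration only.
sample = [
    EndpointStats("/checkout", requests=10_000, errors=250),
    EndpointStats("/search", requests=50_000, errors=100),
    EndpointStats("/profile", requests=20, errors=5),
    EndpointStats("/login", requests=8_000, errors=400),
]
ranking = prioritise(sample)
```

With this sample, /profile is excluded for low traffic and the remaining endpoints are ordered by error rate, which gives a tester a defensible starting order for exploratory sessions.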

Datadog: A Unified Observability Platform

Datadog is a SaaS product that combines metrics, logs, and traces in a single platform, offering comprehensive visibility across applications, infrastructure, and third-party services.

Key Features for QA

1. Real User Monitoring (RUM) & Session Replay

  • Track product usage across devices, browsers, and geographies.
  • Troubleshoot frontend errors and identify performance bottlenecks.
  • Session Replay: Replay user sessions to see exactly how users interact with the system.
  • Frustration Signals: Detect key UX pain points:
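    • Rage Clicks: Repeated clicks on the same element in a short time (Datadog's documentation describes this as more than three clicks within a one-second sliding window).
    • Dead Clicks: Clicks on static elements that produce no action on the page.
    • Error Clicks: Clicks on an element right before a JavaScript error occurs.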
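
To make the rage-click idea concrete, here is a small sliding-window sketch in Python. It is not Datadog's implementation; the function name has_rage_click and the default thresholds (min_clicks, window_s) are assumptions chosen for illustration.

```python
from collections import deque
from typing import Deque, Iterable


def has_rage_click(
    timestamps: Iterable[float],
    min_clicks: int = 4,
    window_s: float = 1.0,
) -> bool:
    """Return True if at least `min_clicks` clicks fall within any `window_s`-second span.

    `timestamps` are click times in seconds for a single element.
    """
    window: Deque[float] = deque()
    for t in sorted(timestamps):
        window.append(t)
        # Drop clicks that are too old to share a window with the current one.
        while t - window[0] > window_s:
            window.popleft()
        if len(window) >= min_clicks:
            return True
    return False
```

For example, four clicks at 0.0, 0.2, 0.4, and 0.6 seconds are flagged, while four clicks spread over two seconds are not.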

2. Synthetic Monitoring

  • Schedule automated tests in production or pre-prod environments.
  • Detect issues proactively before users encounter them.
  • Supports code-based and code-free browser/API tests.
  • Integrates with CI pipelines to shift testing left.
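
The core of a synthetic API test is a scheduled request plus assertions on status and latency. The sketch below shows that idea with the Python standard library only. It does not use the Datadog Synthetics API; run_check, CheckResult, the thresholds, and the example URL are hypothetical names introduced for illustration.

```python
import time
import urllib.error
import urllib.request
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class CheckResult:
    url: str
    passed: bool
    status: int  # 0 when no HTTP response was received
    elapsed_ms: float
    reason: str


def http_get_status(url: str, timeout_s: float = 5.0) -> int:
    """Fetch a URL and return its HTTP status code, including 4xx/5xx codes."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urlopen raises on error statuses; report the code instead.
        return err.code


def run_check(
    url: str,
    fetch_status: Callable[[str], int] = http_get_status,
    expected_status: int = 200,
    max_latency_ms: float = 1000.0,
    clock: Callable[[], float] = time.perf_counter,
) -> CheckResult:
    """Run one synthetic check: assert on status code, then on latency."""
    start = clock()
    try:
        status = fetch_status(url)
    except OSError as exc:  # covers URLError, timeouts, connection errors
        elapsed_ms = (clock() - start) * 1000
        return CheckResult(url, False, 0, elapsed_ms, f"request failed: {exc}")
    elapsed_ms = (clock() - start) * 1000
    if status != expected_status:
        return CheckResult(url, False, status, elapsed_ms,
                           f"expected {expected_status}, got {status}")
    if elapsed_ms > max_latency_ms:
        return CheckResult(url, False, status, elapsed_ms,
                           f"slower than {max_latency_ms} ms")
    return CheckResult(url, True, status, elapsed_ms, "ok")
```

In a CI job, a failing CheckResult could be turned into a non-zero exit code so the pipeline stops before a broken build is promoted. Injecting fetch_status and clock keeps the check itself testable without network access.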

3. Continuous Testing

  • Automates tests across the product lifecycle.
  • Enables codeless web recording, parallel test execution, and multi-location testing.
  • Accelerates delivery while ensuring high-quality features.

4. Error Tracking

  • Automatically aggregates similar errors into issues.
  • Provides stack traces linked to source code.
  • Reduces alert fatigue by avoiding duplicate notifications.
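
Grouping similar errors into issues can be pictured as fingerprinting: errors that share a type, a normalised message, and a top stack frame get the same key. The sketch below illustrates that general technique only. How Datadog actually groups errors is not described here, and ErrorEvent, fingerprint, and group_errors are hypothetical names.

```python
import hashlib
import re
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class ErrorEvent:
    """Hypothetical error record with the fields used for grouping."""
    error_type: str
    message: str
    top_frame: str


# Digits (IDs, counts, ports) vary between otherwise identical errors.
_VOLATILE = re.compile(r"\d+")


def fingerprint(event: ErrorEvent) -> str:
    """Build a short, stable key from the parts of an error that rarely vary."""
    normalised = _VOLATILE.sub("<n>", event.message)
    raw = f"{event.error_type}|{normalised}|{event.top_frame}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()[:12]


def group_errors(events: Iterable[ErrorEvent]) -> Dict[str, List[ErrorEvent]]:
    """Bucket events by fingerprint so each bucket represents one issue."""
    issues: Dict[str, List[ErrorEvent]] = defaultdict(list)
    for event in events:
        issues[fingerprint(event)].append(event)
    return dict(issues)
```

Sending one notification per fingerprint rather than one per event is what keeps duplicate alerts down.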

Getting Started with Datadog

Prerequisites

  1. Access Datadog via Microsoft Office 365 (Applications section).
  2. If you do not have access, contact Cloud & Hosting or post in the #ask-sysops Slack channel.
  3. Ensure access to CI and UX monitoring sections.
  4. Familiarize yourself with Datadog dashboards and console basics.

Summary

Observability transforms QA from reactive testing into proactive, insight-driven testing. It complements functional testing, supports non-functional testing, and enables teams to:

  • Detect and analyse production issues in real time.
  • Reduce debugging and incident resolution time.
  • Make informed testing and release decisions.

By integrating observability tools like Datadog, QA teams can monitor, analyse, and continuously improve product quality in environments closest to the end user.

References