Lagging Indicators - Deep Dive
What are Lagging Indicators?
Lagging indicators in engineering metrics are measurements of past performance. They show how well past actions led to stability, quality, customer satisfaction, and business success. By tracking lagging indicators, teams can gain valuable insight into their processes and make data-driven decisions to improve future outcomes.
Types of Lagging Indicators
Performance Metrics
Performance metrics track how efficiently software changes are delivered, as well as the team’s ability to respond to issues.
Deployment frequency (DF)
Lead time for changes (LT)
Mean Time to Recovery (MTTR)
Change failure rate (CFR)
Customer satisfaction scores (CSAT / NPS)
Quality Metrics
Quality metrics gauge the reliability, stability, and robustness of the product as experienced by users.
Number of production incidents
Production Bug Rate
Customer-reported issues
System uptime
Service level agreement (SLA) compliance
Business Metrics
Business metrics bridge the gap between engineering efforts and the company's broader goals, showing the impact of technical work on business success.
Feature adoption rates
Revenue impact
Customer retention
Market share
Cost per deployment
Performance Metrics - Explained
Deployment Frequency (DF)
Description: Measures how often new code or features are deployed to production.
Importance: Higher deployment frequency indicates faster delivery, which helps the team respond quickly to customer needs and feedback.
How to Improve:
Automate deployment processes to reduce manual work.
Foster a culture of continuous integration and delivery (CI/CD).
Break down large changes into smaller, more frequent deployments.
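As a rough illustration with hypothetical data, deployment frequency could be derived from deployment dates exported from a CI/CD system:

```python
from datetime import date

# Hypothetical deployment dates exported from a CI/CD system
deployments = [
    date(2024, 5, 1), date(2024, 5, 2), date(2024, 5, 2),
    date(2024, 5, 6), date(2024, 5, 9),
]

# Observation window spans from the first to the last deployment, inclusive
window_days = (max(deployments) - min(deployments)).days + 1
deploys_per_week = len(deployments) / window_days * 7
print(f"Deployment frequency: {deploys_per_week:.1f} deploys per week")
```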
Lead Time for Changes (LT)
Description: The time taken from code commit to production deployment.
Importance: Short lead times allow faster response to customer needs and reduce the cost of waiting for features or bug fixes.
How to Improve:
Streamline the code review and testing processes.
Automate as much of the CI/CD pipeline as possible.
Encourage smaller, iterative changes instead of large releases.
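A minimal sketch, using hypothetical commit and deploy timestamps, of how average lead time might be computed:

```python
from datetime import datetime

# Hypothetical (commit_time, deploy_time) pairs from version control and CI/CD logs
changes = [
    (datetime(2024, 5, 1, 9, 0), datetime(2024, 5, 1, 15, 30)),
    (datetime(2024, 5, 2, 11, 0), datetime(2024, 5, 3, 10, 0)),
]

# Lead time per change in hours, then the average across all changes
lead_times_hours = [(deploy - commit).total_seconds() / 3600 for commit, deploy in changes]
average_lead_time = sum(lead_times_hours) / len(lead_times_hours)
print(f"Average lead time for changes: {average_lead_time:.1f} hours")
```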
Mean Time to Recovery (MTTR)
Description: Average time taken to restore service after an incident.
Importance: A low MTTR indicates effective incident management and resilience, minimizing downtime and customer impact.
How to Improve:
Implement robust monitoring and alerting to detect issues quickly.
Invest in incident management training and processes.
Use post-incident reviews to identify and address root causes.
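With hypothetical incident records, MTTR could be calculated along these lines:

```python
from datetime import datetime

# Hypothetical (detected_at, restored_at) pairs from an incident tracker
incidents = [
    (datetime(2024, 5, 3, 14, 0), datetime(2024, 5, 3, 14, 45)),
    (datetime(2024, 5, 10, 2, 15), datetime(2024, 5, 10, 4, 0)),
]

# Time to restore each incident in minutes, averaged across incidents
recovery_minutes = [(restored - detected).total_seconds() / 60 for detected, restored in incidents]
mttr = sum(recovery_minutes) / len(recovery_minutes)
print(f"MTTR: {mttr:.0f} minutes")
```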
Change Failure Rate (CFR)
Description: The percentage of changes that result in production incidents or require rollback.
Importance: Lower failure rates indicate reliable deployments and effective testing, leading to more stable releases.
How to Improve:
Increase test coverage and automate quality checks.
Conduct regular code reviews and deploy in small, incremental batches.
Implement a rollback plan to quickly address failed changes.
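A small sketch with assumed counts shows the usual calculation:

```python
# Hypothetical counts from a deployment and incident tracker
total_deployments = 120
failed_deployments = 6  # deployments that caused an incident or required a rollback

change_failure_rate = failed_deployments / total_deployments * 100
print(f"Change failure rate: {change_failure_rate:.1f}%")  # 5.0%
```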
Customer Satisfaction Scores (CSAT / NPS)
Description: Customer feedback on satisfaction with the product, often collected through surveys like Net Promoter Score (NPS) or Customer Satisfaction (CSAT) scores.
Importance: High satisfaction reflects a positive customer experience and product alignment with customer needs.
How to Improve:
Address customer pain points through feedback loops.
Enhance usability and reliability in areas that affect user experience.
Provide timely support and communication to customers.
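NPS is commonly derived from 0-10 survey responses as the share of promoters (9-10) minus the share of detractors (0-6); a sketch with made-up responses:

```python
# Hypothetical 0-10 survey responses
responses = [10, 9, 8, 7, 10, 6, 9, 3, 10, 8]

promoters = sum(1 for r in responses if r >= 9)
detractors = sum(1 for r in responses if r <= 6)
nps = (promoters - detractors) / len(responses) * 100
print(f"NPS: {nps:.0f}")  # 30
```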
Quality Metrics - Explained
Number of Production Incidents
Description: Counts the incidents that occur in production impacting service availability or quality.
Importance: Lower incident counts indicate better code quality and more stable releases.
How to Improve:
Implement rigorous testing and staging environments.
Conduct root cause analyses and corrective actions post-incident.
Continuously monitor production to proactively address issues.
Production Bug Rate
Description: The number of bugs that are missed in internal testing and occurred in production environment.
Importance: A high production bug rate can indicate issues in testing or quality assurance processes.
How to Improve:
Increase test coverage, focusing on high-impact areas.
Improve automated testing, especially for regression and edge cases.
Encourage thorough code reviews and peer testing.
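Assuming the rate is expressed as the share of bugs that escaped to production (one common definition), a sketch with hypothetical counts:

```python
# Hypothetical counts for one release cycle
bugs_found_in_testing = 45
bugs_found_in_production = 5

# Share of all known bugs that escaped internal testing
escape_rate = bugs_found_in_production / (bugs_found_in_testing + bugs_found_in_production) * 100
print(f"Production bug (escape) rate: {escape_rate:.1f}%")  # 10.0%
```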
Customer-Reported Issues
Description: Issues reported directly by customers, indicating areas where the product is not meeting expectations.
Importance: Reducing customer-reported issues improves product satisfaction and trust.
How to Improve:
Regularly analyze customer feedback for trends.
Prioritize and resolve recurring issues promptly.
Maintain an open communication channel for customer feedback.
System Uptime
Description: Percentage of time the system is available and functioning as expected.
Importance: High uptime is essential for customer trust, particularly for mission-critical applications.
How to Improve:
Set up robust monitoring and alerting for early detection of issues.
Invest in infrastructure redundancy and disaster recovery planning.
Conduct regular maintenance to prevent downtime.
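With hypothetical downtime figures, the uptime percentage works out as follows:

```python
# Hypothetical figures for a 30-day month
total_minutes = 30 * 24 * 60
downtime_minutes = 45  # summed from incident records

uptime_pct = (total_minutes - downtime_minutes) / total_minutes * 100
print(f"Uptime: {uptime_pct:.3f}%")  # 99.896%
```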
Service Level Agreement (SLA) Compliance
Description: Tracks how well the team meets agreed service level agreements, such as response time and availability targets.
Importance: Consistently meeting SLAs demonstrates reliability and commitment to customer agreements.
How to Improve:
Regularly assess SLAs against operational metrics and adjust as necessary.
Improve processes and tooling to meet or exceed SLA targets.
Establish clear escalation paths and response protocols.
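Assuming a response-time SLA, compliance could be estimated from ticket data like this (hypothetical numbers):

```python
# Hypothetical support tickets with their response times and an SLA target
sla_target_minutes = 60
response_times_minutes = [12, 35, 58, 75, 40, 90, 20]

# Share of tickets answered within the agreed response time
within_sla = sum(1 for t in response_times_minutes if t <= sla_target_minutes)
compliance = within_sla / len(response_times_minutes) * 100
print(f"SLA compliance: {compliance:.1f}%")  # 71.4%
```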
Business Metrics - Explained
Feature Adoption Rates
Description: Tracks the percentage of users who adopt new features.
Importance: High adoption indicates features are aligned with customer needs and provide value.
How to Improve:
Conduct user research to understand customer needs before feature development.
Improve onboarding and feature discoverability within the product.
Use A/B testing to determine which features are most useful.
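A sketch of the adoption-rate calculation with hypothetical analytics counts:

```python
# Hypothetical product analytics counts for one feature
active_users = 4_000
users_who_used_feature = 900

adoption_rate = users_who_used_feature / active_users * 100
print(f"Feature adoption rate: {adoption_rate:.1f}%")  # 22.5%
```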
Revenue Impact
Description: Measures the effect of software changes on revenue, such as new features driving purchases or subscriptions.
Importance: Direct correlation with revenue highlights features that are valuable to the business.
How to Improve:
Track revenue metrics by feature to understand which are most impactful.
Align development priorities with business objectives.
Focus on customer-centric development for revenue-generating features.
Customer Retention
Description: Tracks how well the product retains users over time.
Importance: High retention rates indicate strong customer satisfaction and product stickiness.
How to Improve:
Address common reasons for customer churn proactively.
Implement engagement features that encourage regular product use.
Gather and act on feedback from churned users to improve retention.
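Using one common retention formula and hypothetical counts:

```python
# Hypothetical customer counts for one period
customers_at_start = 500
customers_at_end = 520
new_customers_acquired = 60

# Retention excludes customers acquired during the period
retention_rate = (customers_at_end - new_customers_acquired) / customers_at_start * 100
print(f"Customer retention rate: {retention_rate:.1f}%")  # 92.0%
```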
Market Share
Description: Measures the product’s share of its target market compared to competitors.
Importance: Higher market share reflects competitive advantage and business growth.
How to Improve:
Focus on unique value propositions and differentiation.
Maintain a pulse on competitor offerings and adapt as necessary.
Enhance product quality and innovation to stay relevant in the market.
Cost per Deployment
Description: Total cost of deploying code to production, including tools, personnel, and downtime.
Importance: Lower deployment costs improve development efficiency and return on investment.
How to Improve:
Optimize and automate deployment processes to reduce manual effort.
Use scalable infrastructure to reduce downtime and associated costs.
Regularly review the deployment pipeline to eliminate unnecessary steps.
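With assumed cost figures, the calculation might look like this:

```python
# Hypothetical monthly cost figures
tooling_cost = 1_200        # CI/CD and infrastructure tooling
engineering_hours = 40      # hours spent on deployment work
hourly_rate = 75
downtime_cost = 500         # estimated cost of deployment-related downtime
deployments_this_month = 30

total_cost = tooling_cost + engineering_hours * hourly_rate + downtime_cost
cost_per_deployment = total_cost / deployments_this_month
print(f"Cost per deployment: ${cost_per_deployment:.2f}")  # $156.67
```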
Summary
Lagging indicators collectively provide a retrospective view of how well engineering efforts have met quality, operational, and business goals. By continuously monitoring and improving these indicators, teams can create a stable, customer-centric, and financially successful product.