Monitoring Glossary
New to uptime monitoring or SRE? Explore our glossary of essential terms and concepts to help you build more reliable systems.
MTTR (Mean Time To Recovery)
The average time it takes to restore a system to full functionality after a failure, measured from the start of each outage to the moment service is restored.
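As a rough illustration, MTTR can be computed by averaging outage durations. The sketch below assumes each incident is recorded as a (failed_at, restored_at) pair; the function name and the sample incident log are hypothetical and not taken from any particular monitoring tool.

```python
from datetime import datetime, timedelta

def mean_time_to_recovery(incidents):
    """Average outage duration for a list of (failed_at, restored_at) pairs."""
    if not incidents:
        raise ValueError("need at least one incident")
    # Sum the individual outage durations, starting from a zero timedelta.
    total = sum((restored - failed for failed, restored in incidents), timedelta())
    return total / len(incidents)

# Hypothetical incident log: one 30-minute outage and one 90-minute outage.
incidents = [
    (datetime(2024, 1, 5, 10, 0), datetime(2024, 1, 5, 10, 30)),
    (datetime(2024, 2, 11, 22, 15), datetime(2024, 2, 11, 23, 45)),
]
print(mean_time_to_recovery(incidents))  # 1:00:00
```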
SLA (Service Level Agreement)
A formal contract between a service provider and a customer that defines the expected level of service, such as uptime percentage.
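An uptime percentage in an SLA translates directly into a downtime budget. The sketch below shows that arithmetic, assuming a 30-day measurement period by default; the function name and the default period are illustrative assumptions, and real SLAs define their own measurement windows and exclusions.

```python
from datetime import timedelta

def allowed_downtime(uptime_percent, period=timedelta(days=30)):
    """Maximum downtime permitted in `period` at the given uptime target."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    # The downtime budget is the fraction of the period not covered by the target.
    return period * (1 - uptime_percent / 100)

# A 99.9% target over 30 days leaves roughly 43 minutes of downtime.
print(allowed_downtime(99.9))
```

Each additional "nine" in the target shrinks the budget by a factor of ten, which is why the exact percentage matters so much in these agreements.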
High Availability (HA)
A system design approach that aims to sustain an agreed level of operational performance, usually uptime, for a longer-than-normal period, typically by removing single points of failure through redundancy and automatic failover.
Heartbeat Monitoring
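A monitoring method where the monitored system sends regular signals (pings) to the monitoring service to prove it is still alive. If an expected signal does not arrive on time, the monitoring service treats the system as down and raises an alert.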
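The receiving side of a heartbeat check can be sketched in a few lines. This is a minimal in-memory example, not how any specific monitoring service works; the class name, the grace period, and the injectable clock are assumptions made for illustration.

```python
import time

class HeartbeatMonitor:
    """Tracks the most recent ping and reports whether it is overdue."""

    def __init__(self, interval_seconds, grace_seconds=0.0, clock=time.monotonic):
        self.interval = interval_seconds   # how often a ping is expected
        self.grace = grace_seconds         # extra slack before declaring failure
        self.clock = clock                 # injectable so the logic can be tested
        self.last_ping = None

    def ping(self):
        """Called whenever the monitored system checks in."""
        self.last_ping = self.clock()

    def is_alive(self):
        """True if a ping arrived within interval + grace seconds."""
        if self.last_ping is None:
            return False  # no ping received yet
        return self.clock() - self.last_ping <= self.interval + self.grace
```

In this sketch a job expected every 60 seconds with a 10-second grace period would be reported as down once 70 seconds pass without a call to `ping()`.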
Latency
The time it takes for a data packet to travel from its source to its destination. In practice it is often reported as round-trip time, in milliseconds.
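One simple way to approximate network latency is to time a TCP handshake. The sketch below uses only the Python standard library; the function name is hypothetical, and the result is roughly one round trip plus connection overhead rather than a true one-way packet travel time.

```python
import socket
import time

def tcp_connect_latency(host, port, timeout=2.0):
    """Return the seconds taken to complete a TCP handshake with host:port."""
    start = time.perf_counter()
    # create_connection returns once the handshake has completed.
    with socket.create_connection((host, port), timeout=timeout):
        elapsed = time.perf_counter() - start
    return elapsed
```

Calling `tcp_connect_latency("example.com", 443)` would return the connect time in seconds. Taking several samples and looking at the median or a high percentile gives a steadier picture than a single measurement.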