Imagine your online store goes down on Friday evening, at the peak of orders. You call your IT support provider, but the contract only says "we provide technical support." Should you expect a response in 15 minutes or in 2 days? That question is exactly why SLAs exist.

What is an SLA

An SLA (Service Level Agreement) is a legally binding document between a client and a service provider that defines the quality of service in measurable terms: what will be delivered, when, and how. An SLA turns vague promises into measurable commitments.

Key SLA metrics

First Response Time

The time from when a request arrives until a specialist first responds. This is not resolution of the problem, only confirmation that the request has been received and is being worked on. Typical values: 15–30 minutes for critical incidents, 1–4 hours for standard requests.

Resolution Time

Time from ticket opening to complete problem resolution. Depends on priority: P1 (critical) — 1–4 hours, P2 (high) — 4–8 hours, P3 (medium) — 1–2 business days, P4 (low) — 3–5 business days.
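To make the idea of a resolution target concrete, here is a minimal sketch of a breach check. The `RESOLUTION_TARGET` table and the `resolution_breached` function are hypothetical names, and the targets are simply the upper bounds of the typical ranges above. P3 and P4 are left out because business-day targets need a working calendar.

```python
from datetime import datetime, timedelta

# Hypothetical targets: upper bounds of the typical P1/P2 ranges.
# P3/P4 are measured in business days and would need a working calendar.
RESOLUTION_TARGET = {
    "P1": timedelta(hours=4),
    "P2": timedelta(hours=8),
}

def resolution_breached(priority, opened_at, resolved_at):
    """Return True if the ticket took longer to resolve than its target."""
    return (resolved_at - opened_at) > RESOLUTION_TARGET[priority]

opened = datetime(2024, 3, 1, 18, 0)
print(resolution_breached("P1", opened, datetime(2024, 3, 1, 21, 30)))  # False (3.5 h)
print(resolution_breached("P1", opened, datetime(2024, 3, 1, 23, 0)))   # True (5 h)
```

The same check works for first response time if you swap in response targets and the timestamp of the first reply.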

Uptime (system availability)

The percentage of time a system or service is available. The most important metric for cloud services and critical systems:

  • 99% uptime — 87.6 hours of downtime per year (acceptable for non-critical systems)

  • 99.9% uptime — 8.76 hours of downtime per year (good level for business applications)

  • 99.95% uptime — 4.38 hours of downtime per year

  • 99.99% uptime — 52.6 minutes of downtime per year (enterprise level)
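The downtime figures above come from simple arithmetic: the share of time the system may be unavailable, multiplied by the length of the period. A short sketch, assuming a 365-day year (the function name `allowed_downtime_hours` is illustrative):

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years

def allowed_downtime_hours(uptime_percent, period_hours=HOURS_PER_YEAR):
    """Downtime budget, in hours, for a given uptime guarantee."""
    return period_hours * (1 - uptime_percent / 100)

for pct in (99, 99.9, 99.95, 99.99):
    hours = allowed_downtime_hours(pct)
    print(f"{pct}% uptime: {hours:.2f} hours/year ({hours * 60:.1f} minutes)")

# Many SLAs are measured per month rather than per year.
monthly = allowed_downtime_hours(99.9, period_hours=30 * 24)
print(f"99.9% over a 30-day month: {monthly * 60:.1f} minutes")
```

Check which measurement period the contract uses: a 99.9% guarantee measured monthly allows only about 43 minutes of downtime in any single month, which is stricter than 8.76 hours spread over a year.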

MTTR (Mean Time To Recovery)

The average time it takes to restore service after a failure. The lower, the better. For critical systems, the target is an MTTR under 1 hour.
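MTTR is total recovery time divided by the number of incidents. A minimal sketch, assuming each incident is recorded as a pair of timestamps (failure and recovery); the incident data here is made up for illustration:

```python
from datetime import datetime, timedelta

def mttr(incidents):
    """Mean time to recovery for a list of (failed_at, recovered_at) pairs."""
    if not incidents:
        raise ValueError("MTTR is undefined without incidents")
    total = sum((end - start for start, end in incidents), timedelta())
    return total / len(incidents)

# Illustrative incidents lasting 30, 45 and 90 minutes.
incidents = [
    (datetime(2024, 1, 5, 10, 0), datetime(2024, 1, 5, 10, 30)),
    (datetime(2024, 2, 11, 22, 15), datetime(2024, 2, 11, 23, 0)),
    (datetime(2024, 3, 20, 3, 0), datetime(2024, 3, 20, 4, 30)),
]
print(mttr(incidents))                       # 0:55:00
print(mttr(incidents) < timedelta(hours=1))  # True
```

Note that an average can hide outliers: here the 90-minute incident exceeds the 1-hour target even though the MTTR meets it.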

SLA tiers: what providers typically offer

  • Basic — support during business hours (9am–6pm, Mon–Fri), 4-hour response, 2 business-day resolution

  • Business — extended hours (8am–8pm, Mon–Sat), 1-hour response, 4-hour resolution for critical issues

  • Premium/24x7 — around the clock 7 days a week, 15–30 minute response, dedicated engineer, proactive monitoring

Penalty clauses: what happens when the SLA is breached

A good SLA contains a penalty mechanism (SLA credits). Typical schemes:

  • a partial refund of the monthly fee for each hour of downtime beyond the limit

  • fixed compensation for each violated P1 ticket

  • the client's right to terminate the contract without penalty after systematic violations (3 or more per quarter)
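As an illustration of how such a credit scheme can be calculated, here is a sketch with made-up numbers: 5% of the monthly fee for each started hour of downtime beyond the allowed limit, capped at 100% of the fee. The rate, the cap, and the function names are assumptions for the example; a real contract defines its own.

```python
import math

def sla_credit(monthly_fee, downtime_hours, allowed_hours,
               credit_percent_per_hour=5, cap_percent=100):
    """Refund owed for downtime beyond the allowed limit (hypothetical scheme)."""
    excess = max(0.0, downtime_hours - allowed_hours)
    billable_hours = math.ceil(excess)  # every started hour counts in full
    percent = min(cap_percent, billable_hours * credit_percent_per_hour)
    return monthly_fee * percent / 100

def may_terminate(violations_this_quarter, threshold=3):
    """Systematic violations (3 or more per quarter) allow penalty-free exit."""
    return violations_this_quarter >= threshold

# 3.2 h of downtime against a 0.72 h monthly budget: 3 started hours, 15% credit.
print(sla_credit(10_000, downtime_hours=3.2, allowed_hours=0.72))  # 1500.0
print(may_terminate(3))                                            # True
```

When negotiating, pay attention to the rounding rule and the cap: they often matter more than the headline percentage.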

How to negotiate a good SLA

Step 1: Identify your system criticality

Which systems are critical to you? Which system's failure would stop the business within an hour? These are the systems that need a strict 24/7 SLA with guarantees.

Step 2: Request reports

A good provider will share SLA reports from past engagements or its own uptime and response-time statistics. A refusal is a red flag.

Step 3: Check term definitions

"Critical incident" must have a clear definition in the contract. So must "business hours." What counts as "downtime" — only complete unavailability, or also performance degradation?

Step 4: Insist on an escalation matrix

Who calls whom if the first specialist does not solve the problem in the allotted time? An escalation matrix with names and phone numbers is mandatory for critical systems.
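An escalation matrix is essentially a small table: after how many minutes the incident moves up, and to whom. A sketch of one possible representation follows; the roles, time thresholds, and placeholder contacts are all hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class EscalationLevel:
    after_minutes: int  # minutes since the incident was reported
    role: str
    contact: str        # placeholder; a real matrix lists names and phone numbers

# Hypothetical matrix for P1 incidents.
P1_MATRIX = [
    EscalationLevel(0, "On-call engineer", "phone-placeholder-1"),
    EscalationLevel(30, "Team lead", "phone-placeholder-2"),
    EscalationLevel(120, "Service delivery manager", "phone-placeholder-3"),
]

def current_contact(matrix, minutes_since_report):
    """Return the highest escalation level already reached."""
    reached = [lvl for lvl in matrix if lvl.after_minutes <= minutes_since_report]
    if not reached:
        raise ValueError("minutes_since_report is before the first level")
    return max(reached, key=lambda lvl: lvl.after_minutes)

print(current_contact(P1_MATRIX, 45).role)  # Team lead
```

Whatever form the matrix takes in the contract, each level needs a time threshold and a named person, not just a department.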

Example of a good SLA for a small business

For an online store with a turnover of 500,000 UAH/month, a good SLA looks like this: for critical failures (site unavailable), a 30-minute response and 2-hour resolution, 24/7; for non-critical requests, a 2-hour response and resolution within 1 business day; a 99.9% uptime guarantee; and a monthly SLA performance report.
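The monthly performance report for such an SLA can be checked mechanically. The sketch below uses the targets from this example; the `Ticket` structure, the sample tickets, and the 30-day month are assumptions, and the business-day resolution target for non-critical requests is not checked because it needs a working calendar.

```python
from dataclasses import dataclass

@dataclass
class Ticket:
    critical: bool
    response_min: int    # minutes until first response
    resolution_min: int  # minutes until resolution (calendar time)

# Targets taken from the example SLA above.
CRITICAL_RESPONSE_MIN = 30
CRITICAL_RESOLUTION_MIN = 120
STANDARD_RESPONSE_MIN = 120
UPTIME_TARGET_PERCENT = 99.9

def monthly_report(tickets, downtime_min, days_in_month=30):
    """Summarize uptime and count tickets that missed their targets."""
    total_min = days_in_month * 24 * 60
    uptime = 100 * (1 - downtime_min / total_min)
    breaches = 0
    for t in tickets:
        if t.critical:
            if (t.response_min > CRITICAL_RESPONSE_MIN
                    or t.resolution_min > CRITICAL_RESOLUTION_MIN):
                breaches += 1
        elif t.response_min > STANDARD_RESPONSE_MIN:
            breaches += 1
    return {
        "uptime_percent": round(uptime, 3),
        "uptime_met": uptime >= UPTIME_TARGET_PERCENT,
        "ticket_breaches": breaches,
    }

# Illustrative month: 40 minutes of downtime and three tickets.
tickets = [
    Ticket(critical=True, response_min=20, resolution_min=95),
    Ticket(critical=True, response_min=45, resolution_min=110),  # late response
    Ticket(critical=False, response_min=90, resolution_min=300),
]
print(monthly_report(tickets, downtime_min=40))
```

In this made-up month the uptime target is met, but one critical ticket missed its 30-minute response window, which is exactly the kind of detail a monthly report should surface.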

An SLA is not just a piece of paper. It is a risk management tool that protects your business from unpredictable downtime and vague provider accountability. Demand a clear SLA from any IT partner.