Service Level Agreement (SLA) for Support and Maintenance
Key facts (answer-first)
- This SLA describes support and maintenance service targets for incidents, bugs, and security vulnerabilities.
- Default coverage is business hours in CET/CEST. After-hours / 24×7 on-call is available by support plan.
- Primary intake channel is a ticketing system. Requests from chats can be mirrored into tickets for SLA tracking.
- Service restoration is prioritized before deep root-cause work for critical incidents.
- Onboarding is required to set realistic targets for complex or inherited systems.
Document metadata and applicability
- Document type: Service Level Agreement (SLA) — Support & Maintenance
- Timezone baseline: CET/CEST
- Coverage model: business hours by default; extended coverage by plan
- Contract note: detailed targets and plan-specific values can be documented in a SOW / support plan appendix.
Purpose and audience
This page defines how support requests are classified, accepted, escalated, handled, and reported. The intended audience is procurement, CTO/engineering leadership, security stakeholders, and delivery owners who need predictable operational processes.
Scope of SLA: services in scope, exclusions, responsibility boundaries
This SLA covers custom software support and evolution for systems typically deployed on customer-managed infrastructure. We support production incidents, bug fixing for critical business flows (payments, checkout, account access), security vulnerabilities, and coordination with third-party providers. A shared responsibility model applies: we are responsible for application code, configuration, deployment processes, and diagnostics within our access scope; the customer is responsible for infrastructure, network, and third-party service availability unless explicitly included in the SOW. Hosting by Webdelo is available in limited cases for selected customers, but is not offered as a public hosting service.
Services in scope
This SLA applies to:
- Production incidents affecting availability or core business flows.
- Bug fixing for production issues (including core flows such as checkout, payments, account access).
- Security vulnerabilities and security incidents (handled with high priority).
- Coordination and technical assistance with third-party providers when included in the support plan.
Services out of scope (exclusions)
This SLA does not cover issues that are outside operational control, including:
- Data center / infrastructure outages not managed by our team.
- Third-party service outages or API changes (examples: payment providers, email marketing platforms).
- Failures inside microservices or subsystems not developed or operated by our team (we may assist in diagnosis and coordination).
- Cases where required access is not provided (limits apply; see “Access prerequisites” and “Onboarding”).
What we do when the root cause is outside our control
When an incident is caused by third-party providers or external dependencies, we can:
- Identify and confirm the dependency-related failure mode.
- Provide technical context and logs where available.
- Assist the customer as a technical mediator in communication with vendor support if the plan includes this support scope.
- Document findings in the ticket and propose mitigations or workarounds on our side when possible.
Shared responsibility model
Support and maintenance operate within a shared responsibility model:
- Our responsibility: application code and business logic, deployment and release processes, configuration of components under our management, diagnostics and incident coordination, monitoring within the agreed scope.
- Customer responsibility: infrastructure and hosting environment (servers, network, DNS, load balancers), access provisioning and credential management, third-party service contracts (payment providers, email services, CDN), business decisions on mitigation acceptance and change prioritization.
- Shared / coordinated: CI/CD pipeline management (depending on access), database administration (depending on scope), security practices (application vs. infrastructure layers).
The exact boundary is defined per project during onboarding and documented in the SOW.
Definitions: Incident, Bug, Service Request, Change Request
A ticket is the base unit for SLA tracking. An incident is a production failure (service outage, HTTP 500, infrastructure failure). A bug is incorrect behavior of an existing function (cart, payments, account). A Change Request is planned work, not an incident. Urgent changes driven by external deadlines (regulatory, campaign) can be prioritized outside the regular queue.
Ticket
A ticket is the primary object for tracking, triage, escalation, response, and reporting. SLA tracking is based on tickets.
Incident (production incident)
An incident is a service failure that impacts availability or core functionality in production. Examples:
- Site/service is unavailable (full or partial outage).
- Persistent server errors (e.g., HTTP 500) blocking usage.
- Critical deployment or infrastructure failure impacting production.
Bug (functional defect)
A bug is incorrect behavior in an existing function. Examples:
- Add-to-cart fails.
- Payment flow fails or returns errors.
- Account/cabinet functionality fails.
Change Request (planned improvement)
A change request is a request to modify or improve the system (new UI/UX, new feature, new business logic). This work is handled as a planned delivery flow:
- It is estimated.
- It is scheduled based on workload and contract conditions.
- It is not treated as an incident unless it blocks production operations.
Urgent changes with external deadlines
Some changes are urgent due to external deadlines (law/regulation, time-bound campaign, critical business event). These can be prioritized as urgent work even if they are not incidents. Prioritization depends on customer impact and support plan.
Definitions: First response, mitigation (restore), resolution
First response means ticket acknowledgement and actual work commencement, with escalation if needed. Mitigation (restore) means returning minimal viable service operation as quickly as possible — the default approach for critical incidents. Resolution (full fix) means eliminating the root cause and restoring full functionality; for complex cases, it follows after service restoration.
First response (acknowledgement + work started)
First response means:
- The ticket is acknowledged.
- The team confirms the issue has been received and is being worked on.
- Escalation is triggered if needed (right people are involved).
Mitigation / restore service
Mitigation means restoring minimal viable operation as quickly as possible (even if the full root cause fix requires additional time). This is the default approach for critical incidents.
Resolution (full fix)
Resolution means a complete fix addressing the cause and restoring full functionality. For complex cases, full resolution may follow after service restoration (especially for after-hours incidents).
Support hours, timezones, holidays
Default coverage is business hours in CET/CEST, with Europe and the US as key markets. Extended coverage includes after-hours response, on-call engineer assignment with customer timezone overlap (including US timezones), and deployment scheduling in customer-friendly windows. 24×7 on-call is available for Enterprise. For US customers, CET/CEST business hours provide a structural advantage: rapid incident response during US nighttime and release scheduling in low-traffic US windows. Support languages: English and Russian (guaranteed), German (available), other languages by agreement. Holiday calendars are aligned with the delivery location and documented in the SOW.
Standard coverage: business hours (CET/CEST)
Default support coverage is provided during business hours in CET/CEST.
Extended coverage: after-hours and on-call (by plan)
Extended coverage can be provided for selected plans:
- After-hours response for critical incidents.
- On-call engineer assignment for customer timezone overlap (including US timezones if required by plan).
- Scheduling of critical deployments in customer-friendly windows (e.g., night/early morning windows to reduce user impact).
Holidays and non-working days
Holiday calendars can be aligned with the delivery location and contract context (examples used in operations: Moldova, Germany, USA). The applicable calendar can be defined per support plan / SOW.
Timezone advantage for US customers
CET/CEST-based operations create a structural advantage for customers in US timezones:
- Nighttime incident response: critical incidents occurring during US nighttime overlap with our standard business hours, enabling immediate engineer engagement without on-call surcharges.
- Release windows: deployments and critical maintenance can be scheduled during US night/early morning hours to minimize impact on US business operations.
- Coverage for CIS timezones can also be arranged.
Supported languages
- English and Russian: fully supported for all communication, documentation, and reporting.
- German: available for customer communication and documentation.
- Other languages: can be arranged by agreement.
- AI-assisted translation is available as an option to reduce language barriers in operational communication, subject to the customer's data handling policy.
Intake channels and ticketing: official support channel, required request data
The official SLA tracking channel is the ticketing system. A request should include a description, time window, location (page/route/API), business impact, and, where possible, screenshots/logs/trace IDs. Support effectiveness depends on access: without production access, support may be limited to repository-level changes. Access requirements are established during onboarding.
Official support channel (SLA tracking starts here)
The official intake channel for SLA tracking is a ticketing system. Chat-based requests (e.g., messenger channels) can be used for operational communication, but should be mirrored into tickets to ensure consistent tracking and reporting.
Required information in a request (minimal fields)
To reduce triage time and avoid repeated clarifications, a request should include the following (see the sketch after this list):
- What happened (short description).
- When it happened (time window).
- Where it happened (page/route/API/function; environment if known).
- Business impact (what is blocked; who is affected).
- Initial severity assessment (if the customer can provide it).
- Relevant artifacts when possible: screenshots, error messages, logs, trace IDs.
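For teams that submit requests programmatically (for example, from their own monitoring), the minimal fields above can be expressed as a simple payload. The following Go sketch is illustrative only: the field names and JSON structure are assumptions for this example, not a published intake API.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// SupportRequest mirrors the minimal request fields listed above.
// Field names and JSON keys are illustrative, not a published API contract.
type SupportRequest struct {
	Summary        string    `json:"summary"`         // what happened (short description)
	OccurredFrom   time.Time `json:"occurred_from"`   // start of the observed time window
	OccurredUntil  time.Time `json:"occurred_until"`  // end of the observed time window
	Location       string    `json:"location"`        // page / route / API / function
	Environment    string    `json:"environment"`     // e.g. "production", if known
	BusinessImpact string    `json:"business_impact"` // what is blocked and who is affected
	Severity       string    `json:"severity"`        // optional initial assessment, e.g. "S1"
	Artifacts      []string  `json:"artifacts"`       // links to screenshots, logs, trace IDs
}

func main() {
	req := SupportRequest{
		Summary:        "Checkout returns HTTP 500 after payment submit",
		OccurredFrom:   time.Now().Add(-30 * time.Minute),
		OccurredUntil:  time.Now(),
		Location:       "/checkout/payment",
		Environment:    "production",
		BusinessImpact: "Customers cannot complete orders",
		Severity:       "S1",
		Artifacts:      []string{"https://example.com/trace/abc123"},
	}
	out, _ := json.MarshalIndent(req, "", "  ")
	fmt.Println(string(out))
}
```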
Access prerequisites and operational limits
Support effectiveness depends on access. Examples:
- Without production access, support may be limited to repository-level changes and guidance.
- Infrastructure failures (deployment pipeline broken, runtime environment issues) cannot be fully resolved without access to the relevant infrastructure layer.
Access requirements are established during onboarding.
Severity and priority model: mapping, definitions, examples
4-level severity model S1–S4, compatible with enterprise priority labels P0–P4. S1/P0 — full or partial service outage, core business flow blocked, critical security incident. S2/P1 — significant degradation with workarounds available. S3–S4 — limited impact, planned queue. Urgent changes by external deadlines (regulatory, campaign) can be prioritized outside the regular queue.
Severity levels and mapping to priority labels
We use a severity model compatible with standard enterprise practice:
- Severity S1–S4 (can be mapped to P0–P4 priority labels).
Severity definitions (business impact first)
S1 / P0 — Critical, business-blocking
- Full or partial service outage.
- Core business flow blocked (payments/orders/account access).
- Critical infrastructure/deployment failure impacting production.
- Security incident with risk of data exposure or major reputational damage.
S2 / P1 — High impact, not fully blocking
- Significant degradation or a major feature fails, but business can operate with limitations or workarounds.
- Examples: SEO/indexing impact, important pages broken, secondary but business-relevant flows degraded.
S3 / P2 — Medium impact
- Functional defects with limited impact or viable workarounds.
- Suitable for planned bug-fixing queue.
S4 / P3–P4 — Low impact / backlog
- Cosmetic issues, minor UX changes, non-urgent improvements.
- Scheduled based on workload and contract conditions.
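As a reference for tooling, the severity-to-priority mapping above can be captured in a small lookup. The Go sketch below is a non-authoritative illustration; the label strings are assumptions, and the definitions in this section remain authoritative.

```go
package main

import "fmt"

// Severity follows the S1–S4 model defined above.
type Severity int

const (
	S1 Severity = iota + 1 // critical, business-blocking
	S2                     // high impact, not fully blocking
	S3                     // medium impact, planned bug-fixing queue
	S4                     // low impact / backlog
)

// PriorityLabel maps a severity to the enterprise-style priority label
// (S1 -> P0, S2 -> P1, S3 -> P2, S4 -> P3/P4).
func PriorityLabel(s Severity) string {
	switch s {
	case S1:
		return "P0"
	case S2:
		return "P1"
	case S3:
		return "P2"
	case S4:
		return "P3/P4"
	default:
		return "unclassified"
	}
}

func main() {
	for _, s := range []Severity{S1, S2, S3, S4} {
		fmt.Printf("S%d -> %s\n", s, PriorityLabel(s))
	}
}
```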
Urgent changes by external deadline
- Can be treated as elevated priority when business deadline risk is explicit (regulatory compliance, time-bound campaign, event-specific requirements). Handling depends on the plan and impact.
Service targets: response, mitigation, resolution, communication
Service targets depend on the support plan (Basic / Extended / Enterprise) and are documented in the SOW. Typical first response target for critical incidents during business hours is ~20 minutes. Service restoration target for S1: up to 4 hours (when the cause is in our control scope; final targets in SOW). Status update frequency for an active S1: every 30 minutes (Enterprise), every 2 hours (Extended), every 4 hours (Basic). Milestone-based updates: acknowledgement, diagnosis, restoration, fix, recommendations.
Targets depend on support plan
Service targets depend on the selected support plan (Basic / Extended / Enterprise). Exact targets can be documented in a SOW / plan appendix.
First response targets (operational baseline)
- During business hours, the typical operational target for critical incidents is to start work within approximately 20 minutes of ticket receipt (faster for higher plans when resources are available).
- After-hours response is provided by plan; typical after-hours response may be within hours due to on-call notification and ramp-up.
Service restoration targets (mitigation vs. resolution)
The restore-first approach means restoring minimal critical functionality as quickly as possible, even if the full root cause fix requires additional time. Restoration targets apply when the root cause is within our area of responsibility (application code, configuration, deployment under our control).
| Target type | S1 (Critical) | S2 (High) | Notes |
|---|---|---|---|
| Service restoration (mitigation) | Target: up to 4 hours | Target: up to 8 hours | Applies when cause is in our control scope |
| Full resolution (root cause fix) | Best effort | Best effort | Depends on complexity; not guaranteed as a fixed timeframe |
Important qualifications:
- Final restoration targets are defined in the SOW per project and depend on: technology stack, system maturity, access level, infrastructure control scope, and number of third-party dependencies.
- If the root cause is outside our control (data center outage, network failure, hosting provider issue), we provide coordination and technical assistance but cannot guarantee restoration time for infrastructure we do not manage.
- Restoration means returning critical business flows to a functional state. This may include operating in degraded mode (limited functionality) while full restoration continues.
- Root cause analysis (RCA) is performed after service restoration, not instead of it.
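To make the distinction between mitigation and resolution concrete, the restoration targets from the table above can be represented as per-severity configuration, with full resolution deliberately left without a fixed duration. This Go sketch uses placeholder values; the binding targets are those documented in the SOW.

```go
package main

import (
	"fmt"
	"time"
)

// RestorationTarget describes the mitigation target for one severity level.
// It applies only when the root cause is within our control scope.
type RestorationTarget struct {
	Severity      string
	RestoreWithin time.Duration // service restoration (mitigation) target
	// Full resolution is best effort and intentionally has no fixed duration.
}

func main() {
	defaults := []RestorationTarget{
		{Severity: "S1", RestoreWithin: 4 * time.Hour},
		{Severity: "S2", RestoreWithin: 8 * time.Hour},
	}
	for _, t := range defaults {
		fmt.Printf("%s: restore within %s (full fix: best effort)\n", t.Severity, t.RestoreWithin)
	}
}
```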
Communication milestones (status updates)
We provide updates in tickets based on progress milestones:
- Acknowledged / work started.
- Initial diagnosis direction confirmed.
- Service restored (mitigation applied) — if applicable.
- Resolution deployed (full fix) — if applicable.
- Follow-up notes and recommendations.
Status update frequency depends on the support plan (see "Enterprise conditions and contractual appendices" → "Incident communication: status update cadence"). The minimum update frequency for an active S1 incident is every 4 hours (Basic), with a higher frequency for Extended and Enterprise plans.
Blocking factors: access and customer-provided inputs
If progress is blocked by missing access or missing customer input:
- We request the needed access/information in the ticket.
- Work may be paused until access/input is provided.
- Elapsed time includes the waiting period unless otherwise agreed, because the incident cannot be resolved without the required prerequisites; the treatment of customer-side waiting time against SLA targets can be adjusted in the SOW.
Escalation model: roles, ladder, leadership involvement
3-tier escalation: on-call engineer → tech lead / project lead → CTO / architect. Escalation is triggered immediately when domain context or access is lacking. For Enterprise, time-based thresholds are documented in the SOW: 4 hours → tech lead, 8 hours → CTO, 12 hours → CEO. Executive involvement — for direct financial losses, reputational incidents, or critical security incidents.
Escalation ladder (who is involved)
The standard escalation path is:
- On-call / assigned engineer (first responder)
- Project lead / tech lead (domain owner)
- Senior leadership / high-privilege roles (CTO, architect) — for complex or critical cases
When escalation happens
Escalation is triggered:
- Immediately, if the first responder lacks domain context or access (e.g., payments module, infrastructure layer).
- After investigation, if the incident leads into a subsystem without required access or expertise.
Executive escalation triggers
Senior leadership involvement can occur when:
- Business-critical customers face direct financial losses (e.g., banking/trading flows).
- A reputational incident requires rapid containment (e.g., visible compromise/defacement).
- A critical security incident requires coordinated response.
Time-based escalation thresholds (Enterprise)
For the Enterprise plan, an escalation matrix with time-based thresholds is documented in the SOW (see "Enterprise conditions and contractual appendices" → "Escalation matrix with time-based thresholds").
Incident management process for S1/S2: restore-first, mitigation toolbox, closure criteria
For S1/S2, a restore-first approach applies: restoring minimal viable service before deep root-cause analysis. Mitigation toolbox: disabling a problematic function, blocking an overloaded route, emergency load reduction, hotfix deployment. Milestone-based communication: received → localized → restored → full fix. An incident is closed when critical business flows are restored.
Operational priority: restore service first
For S1/S2 incidents, the first priority is to restore minimal viable service. Deep root-cause work follows after service restoration.
Mitigation toolbox (examples used in practice)
Mitigation actions depend on incident type and may include:
- Temporarily disabling or limiting a problematic function.
- Blocking an overloaded route/endpoint to stop cascade failures.
- Applying emergency controls to reduce load impact while investigating infrastructure or application issues.
- Deploying a targeted hotfix when appropriate.
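As one concrete example from the toolbox above, blocking an overloaded route can be implemented as thin HTTP middleware that answers with 503 while the incident is investigated. The sketch below is a generic Go illustration; the handler wiring and the blocked path are hypothetical and depend on the project's stack.

```go
package mitigation

import "net/http"

// BlockRoute returns middleware that short-circuits requests to a single
// problematic path with HTTP 503, preventing an overloaded endpoint from
// cascading into a wider outage while the root cause is investigated.
func BlockRoute(next http.Handler, blockedPath string) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.URL.Path == blockedPath {
			w.Header().Set("Retry-After", "300") // hint clients to back off
			http.Error(w, "temporarily disabled for maintenance", http.StatusServiceUnavailable)
			return
		}
		next.ServeHTTP(w, r)
	})
}
```

In practice, the same effect can often be achieved at the load balancer or reverse proxy level instead of in application code; the choice depends on access scope and the incident at hand.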
Communication approach during incidents
Communication is milestone-based:
- "We received the incident and started work."
- "We localized the failure area / confirmed suspected cause."
- "We restored critical function / reduced impact."
- "We are working on the full fix and verification." This keeps the customer informed without creating overhead that slows down recovery.
Incident closure criteria
An incident is considered closed when:
- The service is restored and operational for critical business flows.
- The primary failure mode is mitigated so it does not keep blocking business.
Full restoration of all functionality or deeper systemic improvements may continue as planned work after stabilization.
Root Cause Analysis (RCA) / Postmortem: preliminary explanation and structured follow-up
A preliminary operational explanation is provided during the incident. After resolution of major incidents, a structured RCA summary (postmortem) is prepared by the tech lead: what happened, why, what actions were taken, what is recommended to prevent recurrence. The format is focused on practical value, not formal timelines.
Preliminary explanation during mitigation
During incident handling, we provide a short operational explanation based on observed symptoms and early diagnosis.
RCA summary after resolution (major incidents)
After stabilization and resolution, a structured RCA summary can be prepared by the project lead / tech lead. The RCA summary focuses on:
- What happened (high-level).
- Why it happened (likely root cause).
- What was done to fix/mitigate.
- What we recommend to prevent recurrence (actions and improvements).
The RCA format is kept practical and aggregated; overly detailed timelines are not produced by default unless required by the customer.
Monitoring and proactive detection: coverage by plan, alerting, depth of checks
Monitoring by plan: baseline — synthetic availability checks, certificate and domain expiration monitoring; advanced — content-level checks (HTML markers), infrastructure and load monitoring. For Enterprise, 24×7 monitoring with on-call alerting is available. Proactive customer notifications for critical incidents include initial status and expected next steps.
Monitoring depends on support plan
Monitoring coverage and depth depend on the support plan. Typical capabilities include:
Baseline monitoring
- Synthetic availability checks for critical domains/pages at defined intervals.
- Alerting when endpoints are unavailable or return failure responses.
Operational expiry monitoring
- Certificate expiration monitoring.
- Domain expiration monitoring and related operational alerts.
Advanced monitoring (by plan)
- Content-level checks (HTML markers / expected text blocks) to detect "HTTP 200 but broken page" scenarios.
- Infrastructure and resource monitoring to detect load spikes or insufficient capacity.
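The checks described above (synthetic availability, content-level markers, certificate expiry) can be combined into a single probe. The following Go sketch is a simplified, non-authoritative illustration; the URL, marker string, and 14-day certificate warning threshold are assumptions, and production monitoring uses dedicated tooling agreed per plan.

```go
package monitoring

import (
	"fmt"
	"io"
	"net/http"
	"strings"
	"time"
)

// CheckEndpoint performs a synthetic availability check, verifies that an
// expected HTML marker is present (to catch "HTTP 200 but broken page"),
// and warns when the TLS certificate is close to expiry.
func CheckEndpoint(url, htmlMarker string) error {
	client := &http.Client{Timeout: 10 * time.Second}
	resp, err := client.Get(url)
	if err != nil {
		return fmt.Errorf("availability check failed: %w", err)
	}
	defer resp.Body.Close()

	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("unexpected status %d", resp.StatusCode)
	}

	body, err := io.ReadAll(io.LimitReader(resp.Body, 1<<20)) // read at most 1 MiB
	if err != nil {
		return fmt.Errorf("reading body: %w", err)
	}
	if !strings.Contains(string(body), htmlMarker) {
		return fmt.Errorf("content check failed: marker %q not found", htmlMarker)
	}

	if resp.TLS != nil && len(resp.TLS.PeerCertificates) > 0 {
		expiry := resp.TLS.PeerCertificates[0].NotAfter
		if time.Until(expiry) < 14*24*time.Hour {
			return fmt.Errorf("certificate expires soon: %s", expiry.Format(time.RFC3339))
		}
	}
	return nil
}
```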
Alert routing and response
- Alerts can be routed to engineering groups responsible for the supported service.
- For plans with 24×7 monitoring, alerts are available around the clock and can trigger on-call response.
Proactive customer notifications
- For critical incidents, customers are notified with an initial status and expected next steps.
- For non-critical issues resolved quickly, notification may be omitted to avoid unnecessary noise.
Notification rules can be customized by plan.
Security incidents and vulnerability handling: classification, response principles, customer notification
Security vulnerabilities and incidents are handled with high priority: containment, exposure window reduction, senior technical role involvement. For Enterprise, the confirmed incident notification timeline is up to 48 hours (down to 24 hours by separate agreement). Customers are notified for high-risk vulnerabilities even before exploitation is confirmed — to enable legal/operational preparedness.
Vulnerability classification and ticket tagging
Security issues are handled as standard tickets with explicit security classification (e.g., "vulnerability / security incident") and high priority.
Response principles for security issues
For security issues, the operational priority is containment and risk reduction:
- Rapid mitigation to close the exposure window.
- Verification that the vulnerability is blocked or controlled.
- Involvement of senior technical roles when required.
Publicly, security handling is defined as best-effort with high priority, with plan-based escalation and response options.
Customer notification policy for security
- If a security issue is observed in production or reported externally, the customer is notified and kept informed.
- If a potential vulnerability is found internally, notification is based on risk level. For high-impact potential exposure, the customer can be notified to prepare operational/legal/communications response even if exploitation is not confirmed.
Data handling, access controls, and contract termination
Customer data is processed on the customer's infrastructure by default. Staging environments at Webdelo are used only for non-critical projects or when production data access is not provided — in such cases, mock data is used. For systems handling sensitive data (personal data, financial data, medical records), the customer is responsible for specifying requirements; access controls and security measures are tightened accordingly. Upon contract termination, access revocation is controlled and verified. Specific data handling requirements are documented in the SOW.
Default model: customer infrastructure
- Data is processed and stored on the customer's infrastructure (hosting, cloud, on-premise).
- We access the customer's systems via agreed secure channels (VPN, SSH, bastion hosts) as defined during onboarding.
Staging environments and test data
- For non-critical projects, a staging environment may be hosted at Webdelo for development and testing purposes.
- When production data access is not available, mock data or anonymized datasets are used.
- Use of real production data in staging environments requires explicit customer approval and additional security measures.
Sensitive data handling
- The customer is responsible for notifying us about systems that process sensitive data (personal data, financial records, health data, regulated data).
- For sensitive data systems, additional measures apply:
- Restricted access scope and personnel.
- Separate agreements with team members can be arranged if required.
- Preference for working exclusively on the customer's infrastructure without local data copies.
- The definition of "sensitive data," access boundaries, and specific requirements are documented in the SOW.
Access revocation upon contract termination
- Upon contract termination, all granted accesses are revoked in a controlled manner.
- We verify that no data copies remain on Webdelo systems (if temporary copies were permitted during the engagement).
- An access revocation checklist is part of the offboarding process (see "Vendor continuity and knowledge management").
Security posture and development practices: applied measures, certification status, security questionnaire
Secure development practices are applied at a level corresponding to the support plan, project type, and customer budget. Core practices include code review, dependency management, access control, logging, and secrets management. Formal certification (SOC 2, ISO 27001) is not held at present; a roadmap toward certification is planned within a 12–15 month horizon (this is a stated intention, not a guarantee). A security questionnaire and description of applied controls are available on request.
Applied security practices
The following practices are part of the standard development and support process:
- Code review: all changes to production code are reviewed before deployment.
- Dependency management: regular monitoring and updates of third-party libraries and frameworks to address known vulnerabilities.
- Access control: role-based access, principle of least privilege, access revocation upon team changes.
- Logging and monitoring: application-level logging for diagnostics and incident investigation; monitoring within agreed scope.
- Secrets management: credentials, API keys, and tokens are stored securely and not committed to repositories.
- Framework and runtime updates: coordinated with the customer; security-driven updates can be prioritized.
Depth of security practices depends on context
The depth and rigor of security practices are balanced against project requirements and budget:
- Baseline practices (listed above) are applied to all projects.
- Enhanced measures (penetration testing, security audits, threat modeling, SAST/DAST tooling) are available by agreement and documented in the SOW.
- Planned security work (audits, hardening, infrastructure review) is handled as separate work, not as incident resolution.
Certification status
- No formal certification (SOC 2, ISO 27001) is currently held.
- A roadmap toward certification is planned within a 12–15 month timeframe. This is a stated intention and plan; the customer should inquire about the current status before contract signing.
- A security questionnaire describing applied controls, data handling practices, and organizational measures is available on request.
Code quality standards and test coverage: targets, critical path coverage, customer responsibility
We strongly recommend that customers allocate budget for automated test coverage as a core part of the development process. Test coverage directly affects the reliability, maintainability, and speed of future changes. Reducing or eliminating test coverage to save budget increases the risk of regressions, longer incident resolution times, and higher long-term maintenance costs.
Test coverage targets (operational reference)
- Standard target: not less than 90% automated test coverage for application code (unit and integration tests, as applicable to the technology stack).
- Critical business paths — 100% test coverage: all code paths related to payments, customer data handling, core business logic, and authentication must be covered by automated tests without exception.
- Test types and depth (unit, integration, end-to-end) are agreed per project and documented in the SOW.
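The coverage policy above can be enforced as a simple gate in CI. The Go sketch below assumes the coverage percentages are already measured by the project's own tooling; only the 90% and 100% thresholds come from this document.

```go
package main

import (
	"fmt"
	"log"
)

// checkCoverage fails when measured coverage is below the agreed targets:
// at least 90% overall, and 100% for critical business paths.
func checkCoverage(overallPct, criticalPathPct float64) error {
	if overallPct < 90.0 {
		return fmt.Errorf("overall coverage %.1f%% is below the 90%% target", overallPct)
	}
	if criticalPathPct < 100.0 {
		return fmt.Errorf("critical-path coverage %.1f%% must be 100%%", criticalPathPct)
	}
	return nil
}

func main() {
	if err := checkCoverage(92.4, 100.0); err != nil {
		log.Fatal(err)
	}
	fmt.Println("coverage targets met")
}
```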
Customer responsibility for test budget decisions
- If the customer decides to reduce or eliminate test coverage to optimize budget, this decision must be explicitly confirmed in writing (ticket, email, or SOW amendment).
- By confirming reduced test coverage, the customer acknowledges and accepts the associated risks: increased probability of regressions, longer diagnostic and resolution times for incidents, and higher cost of future changes.
- Webdelo will document the reduced coverage decision and its potential impact in the project risk register.
Static analysis and code review standards
- All production code changes are subject to code review before deployment.
- Static analysis tools (linters, code style checkers) are part of the standard CI/CD pipeline where applicable.
- For projects with enhanced security requirements, SAST/DAST tooling can be integrated by agreement (documented in the SOW).
Reporting and metrics: what can be provided, how it is generated, when it is delivered
Reporting is generated from ticket data: time to first response, time to restore, incident counts by severity, historical summaries. Monthly reporting is an option for Extended and Enterprise plans; format is agreed per project (PDF, CSV, dashboard). Tooling: a ticket portal is mandatory for all plans; Jira is used for Extended and Enterprise by default; integration with the customer's system (Jira Service Management, ServiceNow, PagerDuty) is available for Enterprise by agreement. Data is used to identify recurring patterns, improve monitoring, and reduce time to detect and time to restore.
Reporting availability (by request / by plan)
Reporting can be generated based on ticket data. By default it is provided on request, to avoid unnecessary overhead for lower-tier plans; for the Enterprise plan, periodic reporting can be agreed.
Metrics that can be produced from ticket history
When work is tracked via tickets, the following metrics can be extracted for selected periods (see the sketch after this list):
- Time to first response.
- Time to restore service (when applicable).
- Incident counts by severity and category.
- Historical incident lists and summaries.
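A minimal sketch of how these metrics can be derived from ticket timestamps is shown below; the struct fields are illustrative and the real source is the ticketing system export.

```go
package metrics

import "time"

// Ticket carries the timestamps needed for the SLA metrics listed above.
// Field names are illustrative.
type Ticket struct {
	CreatedAt    time.Time
	FirstReplyAt time.Time
	RestoredAt   time.Time // zero value when no restoration step applied
	Severity     string
}

// TimeToFirstResponse is the interval between ticket creation and the
// first acknowledgement.
func TimeToFirstResponse(t Ticket) time.Duration {
	return t.FirstReplyAt.Sub(t.CreatedAt)
}

// TimeToRestore is the interval between ticket creation and service
// restoration; the second return value is false when no restoration
// was recorded for the ticket.
func TimeToRestore(t Ticket) (time.Duration, bool) {
	if t.RestoredAt.IsZero() {
		return 0, false
	}
	return t.RestoredAt.Sub(t.CreatedAt), true
}
```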
Use of reporting for continuous improvement
Ticket-based reporting can be used to:
- Identify recurring incident patterns.
- Improve monitoring coverage and alert quality.
- Reduce time to detect and time to restore by focusing on weak points in the service.
Reporting format and frequency
- Basic: reporting on request.
- Extended: monthly reporting available as an option; format agreed per project.
- Enterprise: periodic reporting (monthly by default); format agreed in the SOW (PDF summary, CSV export, or dashboard access).
Tooling and integrations
- A ticket portal is mandatory for all plans.
- Extended and Enterprise: Jira is used by default for ticket management.
- Enterprise (by agreement): integration with the customer's existing system is possible. Examples of systems that can be coordinated with: Jira Service Management, ServiceNow, PagerDuty, Zendesk.
- Communication channels: ticketing system (mandatory), Slack / Microsoft Teams (by agreement for operational communication).
- Changing or integrating tools requires onboarding and process adjustment and may affect cost; details are documented in the SOW.
Dependencies and customer responsibilities: access, external services, delays
External services (payment providers, email providers, third-party APIs) are outside our control; we provide diagnosis, evidence, and vendor coordination. Customer responsibilities: access (production, logs, CI/CD), timely responses to support requests, business impact confirmation. Without access, work may be paused.
Third-party dependencies
If business impact is caused by external services (payment providers, email services, other APIs):
- We cannot fix the external service.
- We can diagnose, provide evidence, propose mitigation, and assist customer/vendor coordination as part of paid support scope.
Customer responsibilities affecting outcomes
To achieve the best possible outcomes, customers should provide:
- Required access (production, logs, CI/CD where applicable).
- Timely responses to support requests when information is needed.
- Confirmation of business impact and acceptance of mitigation steps when required.
Maintenance and planned updates: security patches and platform updates
Planned maintenance includes dependency updates to address vulnerabilities and framework/language updates (PHP, Go, etc.). Updates are coordinated with the customer; security-driven updates may be prioritized.
Maintenance actions may be required to keep systems stable and secure, including:
- Dependency updates to address vulnerabilities.
- Framework and language version updates (e.g., PHP/Go runtime updates) when needed.
When possible, updates are coordinated with the customer. For security-driven issues, updates may be prioritized to reduce exposure.
Change management: approval workflow, maintenance windows, rollback, emergency changes
Changes to production systems follow a structured approval process. A change includes deployments, bug fixes, dependency updates, configuration changes, and infrastructure modifications within our scope. For Enterprise, integration with the customer's change management process (sprints, approval boards) is available. Rollback plans are mandatory for critical changes. Emergency changes during full outages may be applied without prior approval; the customer is notified immediately after.
What constitutes a change
A change is any modification to the production environment or codebase that may affect system behavior:
- Code deployments (new features, bug fixes, refactoring).
- Dependency and framework updates.
- Configuration changes (environment variables, routing, access rules).
- Infrastructure changes within our management scope (container configuration, CI/CD pipeline modifications).
Approval workflow
- Standard changes: requested via ticket, estimated, scheduled, and deployed after customer confirmation.
- Enterprise / integrated process: changes can be coordinated through the customer's workflow — sprint planning, change advisory review, or approval gates. Integration details are documented in the SOW.
- Authorized contacts: the customer designates persons authorized to approve changes (business, technical, infrastructure roles). The list of authorized contacts is maintained in the SOW or project documentation.
Maintenance windows
- For smaller projects, changes are typically deployed during business hours without a formal maintenance window.
- For Enterprise and critical systems, maintenance windows are agreed in advance and documented in the SOW.
- Deployment scheduling can be aligned with customer business hours to minimize user impact (e.g., night or early morning windows).
Rollback plan
- A rollback plan is mandatory for critical changes (changes affecting core business flows, database migrations, infrastructure modifications).
- The rollback approach is documented in the ticket before deployment.
- For non-critical changes, rollback readiness is maintained as standard engineering practice.
Emergency changes
- In the event of a full service outage caused by a recent change, an emergency rollback or hotfix may be applied without prior customer approval to restore service.
- Emergency changes do not alter business requirements — they restore the previous working state or apply a targeted fix.
- The customer is notified immediately after an emergency change is applied, with an explanation of what was done and why.
Support plans overview (no pricing): Basic, Extended, Enterprise
3 support plans: Basic (business hours, baseline monitoring), Extended (extended hours, faster response, advanced monitoring, status updates every 2 hours for S1), Enterprise (24×7 on-call, highest priority, advanced monitoring with customizable metrics, escalation to CTO/CEO, status updates every 30 minutes for S1). Availability target: 99.5% (aspiration 99.8%) for systems with full application access. For Enterprise, the SOW documents: service credits (up to 30% of monthly fee), escalation matrix with time thresholds, update cadence up to every 30 minutes (S1), RPO/RTO, security notification timeline (up to 48h), quarterly service reviews (QBR).
Plan comparison (capabilities)
| Support plan | Coverage hours | First response approach | Monitoring depth | Escalation / senior involvement |
|---|---|---|---|---|
| Basic | Business hours (CET/CEST) | Best-effort; targets agreed per plan/SOW | Baseline availability checks | Standard escalation as needed |
| Extended | Business hours + optional extensions | Faster response targets for critical incidents (plan-based) | Baseline + selected advanced checks | Tech lead involvement for complex cases |
| Enterprise | 24×7 on-call for critical incidents (plan-based) | Highest priority and fastest operational response | Advanced monitoring; customizable metrics | Senior-level escalation for high-impact incidents |
Additional capabilities by plan (Enterprise conditions defined in SOW)
| Capability | Basic | Extended | Enterprise |
|---|---|---|---|
| Service credits for first response target breach | — | — | SOW |
| Escalation matrix with time-based thresholds | — | — | SOW |
| Defined status update cadence | Milestones | Per plan | SOW |
| RPO/RTO (when DR is included in service scope) | — | — | SOW |
| Security breach notification timeline | — | — | SOW |
| Regular service review meetings (QBR) | — | — | SOW |
| Availability target (when full application access is provided) | — | Per plan | SOW |
Availability targets and uptime framework
For systems where we have full access to the application layer and deployment process, and the infrastructure is stable and under adequate control, the operational availability target is 99.5%, with an aspiration of 99.8%. This applies to systems developed or significantly maintained by our team, with limited complex third-party dependencies.
Availability targets are not published as unconditional guarantees because uptime depends on factors outside our control:
- Infrastructure stability and hosting provider SLA.
- Third-party service reliability (payment gateways, external APIs, CDN).
- Timeliness of customer-provided access and approvals.
- System architectural maturity and technical debt level.
Project-specific availability targets are documented in the SOW after onboarding and system assessment. We commit to what we control; infrastructure-level uptime is the responsibility of the hosting provider.
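For orientation, an availability target translates into a downtime budget. The Go sketch below assumes a 30-day month purely for illustration: 99.5% allows roughly 3 hours 36 minutes of downtime per month, and 99.8% roughly 1 hour 26 minutes.

```go
package main

import (
	"fmt"
	"time"
)

// downtimeBudget returns how much downtime an availability target allows
// per 30-day month (the month length is assumed for illustration only).
func downtimeBudget(targetPct float64) time.Duration {
	month := 30 * 24 * time.Hour
	return time.Duration((1 - targetPct/100) * float64(month)).Round(time.Second)
}

func main() {
	fmt.Println(downtimeBudget(99.5)) // 3h36m0s
	fmt.Println(downtimeBudget(99.8)) // 1h26m24s
}
```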
Why pricing is not published: cost factors for support plans
Support pricing is not fixed publicly because the cost of maintaining reliable SLA commitments depends on multiple project-specific factors:
- Technology stack complexity (Go microservices, Laravel, Symfony, legacy systems, mixed environments).
- Number of integrations and third-party dependencies (payment systems, external APIs, data providers).
- Required engineer seniority (senior/lead involvement for critical systems vs. standard support for stable applications).
- Response and communication requirements (update cadence, 24×7 availability, number of communication channels).
- Reporting and tooling requirements (integration with customer systems such as Jira, ServiceNow; dashboard access; custom reporting).
- Security process requirements (depth of secure development practices, security audits, penetration testing).
- Data handling restrictions (prohibition of external AI tools, local model deployment, restricted access environments).
Scope and service targets are defined in the SOW. Team composition and engineer seniority depend on incident criticality and system complexity. Requirements for reporting, monitoring depth, and communication cadence directly affect delivery cost. We commit to what we control; project-specific targets are agreed contractually.
Contractual commitments documented in SOW
Detailed contractual commitments (including service credits, escalation time thresholds, RPO/RTO, security notification timelines, service review cadence, and availability targets) are documented in a SOW / support plan appendix, not on this public page.
Enterprise conditions and contractual appendices (SOW / Appendices)
The Enterprise plan includes contractual commitments across 6 areas: service credits (from 2% per incident, up to 30% of the monthly fee), escalation matrix with time thresholds (4/8/12 hours — tech lead/CTO/CEO), status update cadence (up to every 30 minutes for S1), RPO/RTO after audit (when DR is in scope), confirmed security incident notification (up to 48 hours, down to 24 hours by separate agreement), quarterly service reviews (QBR). All conditions are documented in the SOW and can be customized with service fee recalculation.
Service Credits
For the Enterprise plan, financial remedies (service credits) may apply for breach of first response / work commencement targets. Specific conditions, calculation methods, and limits are documented in the SOW / contract appendix.
Principles documented in the SOW:
- Basis: a service credit is applied when the Provider fails to commence work on an S1 incident (does not confirm acceptance and actual start of work) within the agreed first response target time.
- Amount: from 2% of the monthly support fee per incident qualifying for a service credit.
- Cap: no more than 30% of the monthly support fee per billing period (month) in total.
- Form of compensation: service credit (discount) applied to the next billing period.
- Service credits apply to first response / work commencement time, but not to full resolution time, as resolution time depends on the nature of the incident and is not always predictable.
- At the Customer's request, conditions (percentages, caps, metrics) can be adjusted; the service fee is recalculated based on the level of commitments.
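For illustration only, the credit arithmetic above (2% per qualifying incident, capped at 30% per billing month) works out as in the following Go sketch; the binding formula and values are those in the SOW.

```go
package main

import "fmt"

// monthlyCredit returns the service credit as a fraction of the monthly
// support fee: 2% per qualifying S1 incident, capped at 30% per billing month.
func monthlyCredit(qualifyingIncidents int) float64 {
	const perIncident = 0.02
	const monthlyCap = 0.30
	credit := float64(qualifyingIncidents) * perIncident
	if credit > monthlyCap {
		return monthlyCap
	}
	return credit
}

func main() {
	fmt.Printf("3 incidents: %.0f%% credit\n", monthlyCredit(3)*100)   // 6% credit
	fmt.Printf("20 incidents: %.0f%% credit\n", monthlyCredit(20)*100) // capped at 30%
}
```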
Escalation matrix with time-based thresholds (Escalation Matrix)
For the Enterprise plan, an escalation matrix with time-based thresholds (when leadership roles are engaged) is documented in the SOW / support plan appendix.
Principles documented in the SOW:
- Roles: On-call engineer → Project tech lead → CTO → CEO.
- Triggers (S1):
- 4 hours without work commenced on S1 → Project tech lead is notified.
- 8 hours without work commenced on S1 → CTO is notified.
- 12 hours without work commenced on S1 → CEO is notified.
- Notification channels: ticketing system (mandatory) + additional channels per contract (see "Communication channels").
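The time-based triggers above can be read as a simple threshold function: given how long an S1 ticket has gone without work commencing, it returns which roles should already have been notified. The Go sketch below restates the 4/8/12-hour principles and is not a contractual definition.

```go
package main

import (
	"fmt"
	"time"
)

// rolesToNotify lists the leadership roles already triggered for an S1
// incident, given the elapsed time without work commenced, in escalation order.
func rolesToNotify(withoutWorkCommenced time.Duration) []string {
	var roles []string
	if withoutWorkCommenced >= 4*time.Hour {
		roles = append(roles, "Project tech lead")
	}
	if withoutWorkCommenced >= 8*time.Hour {
		roles = append(roles, "CTO")
	}
	if withoutWorkCommenced >= 12*time.Hour {
		roles = append(roles, "CEO")
	}
	return roles
}

func main() {
	fmt.Println(rolesToNotify(9 * time.Hour)) // [Project tech lead CTO]
}
```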
Incident communication: status update cadence (Update Cadence)
Status update frequency depends on the support plan. The primary channel across all plans is the ticketing system.
- Enterprise: for S1 — every 30 minutes; for S2 — every 4 hours (during agreed support hours).
- Extended (Pro): for S1 — every 2 hours; for S2 — daily.
- Basic (Start): for S1 — every 4 hours; for S2 — at key milestones (accepted / resolved) or per contract.
Additional channels for Enterprise (per contract): ticket + chat + email and other agreed channels.
Note: higher requirements for update frequency and number of channels increase the cost of support — conditions are documented in the contract.
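As a compact reference, the cadences above can be kept as a lookup table keyed by plan and severity, as in the Go sketch below; a zero interval stands for milestone-based or per-contract updates rather than a fixed frequency.

```go
package main

import (
	"fmt"
	"time"
)

// updateInterval mirrors the status update cadences listed above for active
// incidents during agreed support hours. A zero value means milestone-based
// or per-contract updates rather than a fixed interval.
var updateInterval = map[string]map[string]time.Duration{
	"Enterprise": {"S1": 30 * time.Minute, "S2": 4 * time.Hour},
	"Extended":   {"S1": 2 * time.Hour, "S2": 24 * time.Hour},
	"Basic":      {"S1": 4 * time.Hour, "S2": 0},
}

func main() {
	fmt.Println(updateInterval["Enterprise"]["S1"]) // 30m0s
}
```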
Disaster Recovery: RPO/RTO and backups
Recovery targets (RPO/RTO) are documented in the SOW / DR appendix only if backups and disaster recovery are included in the agreed service scope. RPO/RTO values depend on data volume and architecture and are determined after an audit.
Principles documented in the SOW:
- RPO (Recovery Point Objective) and RTO (Recovery Time Objective) are determined per system after an audit and include:
- data and component composition (databases, file storage, caches, queues, etc.)
- backup method (full / incremental / differential)
- retention period
- consistency requirements (e.g., MySQL + Redis + MongoDB combinations, etc.)
- recovery test procedures and frequency
- If backups are performed by the Customer:
- The Provider can configure monitoring / verification of backup execution (per contract).
- The Provider can assist with recovery during an incident (per contract).
Disaster recovery testing
We recommend regular testing of the recovery process to verify that backups are functional and recovery procedures work as expected. A backup that has never been tested is not a reliable backup.
- Recommended frequency: not less than once per month for production systems.
- Scope of testing: recovery testing should cover both the file system and all databases (relational, NoSQL, caches, queues as applicable).
- Test data strategy: test markers (specific records or data snapshots) should be prepared in advance and left in the system over time. This allows verification that recovery works correctly for both older data and recent changes — confirming that no data loss occurs across the backup window.
- Testing environment: recovery tests should be performed against production backups (restored to a separate environment) to validate real-world recovery scenarios.
- For critical systems: the testing interval can be shortened (e.g., bi-weekly or weekly) if the customer requires higher assurance and is prepared to allocate budget for this.
- Testing frequency and scope are documented in the SOW / DR appendix.
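The test-marker idea above can be automated as a post-restore verification step. The Go sketch below is hypothetical: the recovery_markers table, column name, and SQL placeholder style are assumptions, and the actual check depends on the databases in scope and the driver in use.

```go
package drtest

import (
	"database/sql"
	"fmt"
)

// VerifyMarkers checks that every expected marker record exists in the
// restored database, confirming that neither old nor recent data was lost
// across the backup window. Table and column names are illustrative.
func VerifyMarkers(db *sql.DB, markerIDs []string) error {
	for _, id := range markerIDs {
		var count int
		err := db.QueryRow(
			"SELECT COUNT(*) FROM recovery_markers WHERE marker_id = ?", id,
		).Scan(&count)
		if err != nil {
			return fmt.Errorf("querying marker %s: %w", id, err)
		}
		if count == 0 {
			return fmt.Errorf("marker %s missing after restore", id)
		}
	}
	return nil
}
```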
Security breach notification
For the Enterprise plan, the notification timeline for a confirmed security incident is documented in the contract and defaults to up to 48 hours from confirmation of unauthorized access or confirmed data breach.
Principles documented in the SOW:
- Customer notification: within 48 hours of confirmation of unauthorized access or confirmed data breach.
- Stricter timelines (e.g., 24 hours) can be agreed separately with a recalculation of the service fee.
- Notification channel: ticketing system + email (and/or agreed secure channel).
Regular service reviews (Service Review / QBR)
For the Enterprise plan, regular service reviews (QBR) are available: analysis of incidents, trends, and recommendations for monitoring and stability improvements. The cadence is documented in the contract.
Principles documented in the SOW:
- Cadence: quarterly (included in Enterprise), monthly (optional, billed separately).
- Inputs: ticket report, top incidents, stability risks and recommendations.
- Outputs: action plan, monitoring improvement proposals, stabilization/refactoring recommendations (as a separate work stream).
Onboarding for support readiness: required steps, checklist, typical duration ranges
Onboarding is mandatory for establishing reliable service targets. It includes: access setup, architecture and dependency review, identification of critical business flows, incident history analysis, monitoring agreement. Typical timelines: standard web applications — from 1 week; complex DevOps environments (Kubernetes, CI/CD) — longer; large enterprise systems — up to several months. Full SLA target performance begins after onboarding prerequisites are completed.
Onboarding is required for reliable support targets
Onboarding is required because service complexity and operational maturity vary. Inherited systems (not originally developed by our team) require discovery before reliable targets can be applied.
Onboarding checklist (minimum)
- Access setup (production/logs/repositories/CI/CD as applicable).
- Architecture overview and dependency map (including external services).
- Identification of critical business flows and critical pages/endpoints.
- Review of incident history and known weak points.
- Agreement on monitoring scope and alert routing.
Onboarding depth and duration ranges
- Small, typical projects (e.g., standard web applications): initial readiness can be achieved within the first week.
- Complex DevOps environments (containers, Kubernetes, non-trivial CI/CD): onboarding requires more time and practical validation.
- Large enterprise systems: onboarding and stabilization can take multiple months.
Full SLA target performance is expected after onboarding prerequisites are completed.
Working with legacy and inherited systems: approach, risks, conditions
We are prepared to take on legacy and inherited systems for support and further development. We recognize that the majority of real-world software systems (approximately 90%) grow from legacy code, and this is a normal part of the software lifecycle. However, we are transparent about the differences in what we can guarantee for inherited systems compared to systems we have built from the ground up.
Key principles for legacy systems
- No blocking factor: legacy code is not a reason to decline a project. We have experience working with inherited systems across various technology stacks and levels of technical debt.
- Honest expectations: for inherited systems, we cannot provide the same level of guarantees (response times, estimation accuracy, restoration targets) that we provide for systems we have developed ourselves. The gap depends on: documentation quality, test coverage, architectural clarity, number of undocumented dependencies, and access to the original development team's knowledge.
- Onboarding is critical: for legacy systems, the onboarding phase is longer and more thorough. It includes architecture discovery, dependency mapping, identification of undocumented behavior, and risk assessment (see "Onboarding for support readiness").
- Gradual improvement: we recommend and can implement a phased approach to improving legacy systems: stabilization first (monitoring, critical bug fixes, test coverage for core flows), then targeted refactoring and architectural improvements as a planned work stream.
Recommendations for customers with legacy systems
- Allocate budget for discovery: the initial assessment phase is essential and should not be skipped. It determines realistic SLA targets and identifies the highest-risk areas.
- Expect adjusted SLA targets: service restoration times and estimation accuracy for legacy systems will be wider than for systems we control end-to-end. These adjusted targets are documented in the SOW after onboarding.
- Invest in test coverage and documentation: the fastest way to bring a legacy system to a higher service level is to increase automated test coverage for critical paths and create operational documentation.
- Consider a refactoring roadmap: we provide recommendations for refactoring priorities based on business impact, stability risk, and cost. Refactoring is handled as a planned work stream, not as incident resolution.
Vendor continuity and knowledge management: documentation, bus factor, AI-assisted support, offboarding
Service continuity is ensured through mandatory architecture documentation, knowledge distribution across the team, runbooks, and support checklists. For critical projects, multiple specialists are assigned to reduce key-person dependency. AI-assisted analysis (code, logs, integration review, incident history) is available as an option with human oversight. Usage of external AI services is agreed with the customer based on their data policy. Exit and offboarding procedures (documentation handover, access revocation, risk report) are available by contract.
Architecture documentation and knowledge base
- Architecture documentation is mandatory for all projects with active support.
- For inherited or legacy systems, dedicated time is allocated for documentation during onboarding.
- Documentation includes: system architecture overview, dependency map, critical flows, deployment procedures, known risks, and operational runbooks.
- Support checklists and runbooks are created and maintained for repeatable operational tasks.
Key-person risk mitigation
- For critical projects, at least two specialists are assigned with overlapping knowledge areas.
- Tech lead and/or CTO involvement is maintained for large projects to ensure architectural continuity.
- Internal knowledge sharing is part of the standard engineering process (code reviews, documentation, pair work on complex incidents).
AI-assisted support (optional)
AI-based analysis tools can be used to support operations:
- Code analysis, log analysis, integration review, incident history analysis.
- All AI-assisted actions are performed with human oversight (human-in-the-loop).
- Impact on delivery speed: depending on the project type, complexity, and criticality, the use of AI-assisted tools can accelerate routine engineering tasks (code analysis, dependency review, log parsing, documentation generation, test scaffolding) by an estimated factor of 5× to 20× compared to fully manual execution. The actual impact varies by task type and is not guaranteed as a fixed multiplier; it is provided as an operational reference based on internal experience.
AI usage is optional and subject to agreement:
- External AI services (cloud-based) are used only with the customer's explicit consent.
- Options include: EU/US region-based services, customer-hosted local models, or a complete prohibition of AI usage.
- A complete prohibition of AI tools is possible; this may affect operational timelines and costs — documented in the SOW.
Onboarding a new team member (replacement or team expansion)
The time required to bring a new team member to independent productivity depends on multiple factors:
- Project complexity: monolith vs. microservices, number of integrations, technology stack depth.
- Time since project takeover: projects maintained by our team for an extended period have better documentation and established knowledge transfer processes.
- Documentation and test coverage: well-documented projects with high test coverage enable faster onboarding.
- AI tooling permissions: when AI-assisted code analysis and documentation tools are permitted for the project, ramp-up is faster.
- Service size and scope: the number of services, modules, and business domains the engineer needs to work with.
Typical onboarding timelines (operational reference, not a guarantee):
- Minimum: from 2 weeks for well-documented projects with limited scope.
- Average: approximately 3–4 weeks for typical mid-complexity projects.
- Complex or legacy systems: may require longer, depending on the factors above.
During the onboarding period, the new team member works under supervision of the tech lead or a senior engineer to ensure quality and continuity.
Exit and offboarding procedures
Upon contract termination, the following can be provided by agreement:
- Complete project documentation and knowledge transfer materials.
- Support checklists, runbooks, and operational procedures.
- Full list of accesses, integrations, and dependencies.
- Risk report and list of known technical debt and weak points.
For lower-tier plans, structured offboarding may not be included by default and can be agreed separately.
FAQ: quick answers for procurement and engineering review
Key answers: official channel — ticketing system, coverage — CET/CEST with 24×7 option, severity model S1–S4 (P0–P4), first response ~20 minutes (critical), S1 restore target up to 4 hours (when in our control scope), availability target 99.5% (SOW), service credits for Enterprise (SOW), escalation thresholds at 4/8/12h, status updates from every 30 minutes (Enterprise S1) to every 2 hours (Extended S1), change management with rollback plans, data handling on customer infrastructure, no formal certification (roadmap 12–15 months), AI-assisted support optional, English/Russian/German, pricing defined in SOW.
What is the official support channel?
Which timezone do you operate in?
Do you offer 24×7 support?
What is the difference between first response and restore?
Do you pause SLA time if the customer does not provide access?
How do you classify incident severity?
Do you provide RCA / postmortem?
How do you handle security vulnerabilities?
Do you monitor services proactively?
Can you help with third-party outages (payments, email providers)?
Are service credits available for SLA breaches?
How does escalation work for prolonged incidents?
How often are status updates provided during an incident?
Do you provide RPO/RTO?
What is the security breach notification timeline?
Are regular service reviews conducted?
Do you provide an uptime / availability guarantee?
What is the service restoration target for critical incidents?
How do you handle change management?
How do you handle data and access security?
Do you hold security certifications (SOC 2, ISO 27001)?
Do you use AI in support operations?
What languages do you support?
What happens when the contract ends?
What are your code quality and test coverage standards?
Do you work with legacy / inherited systems?
How long does it take to onboard a new team member?
How often do you test disaster recovery?
Why are prices not listed on this page?
Legal Notice / SLA Disclaimer
THIS DOCUMENT DESCRIBES TARGET VALUES AND PROCESSES AND DOES NOT CONSTITUTE A GUARANTEE OR A BINDING OFFER. ALL INFORMATION IS PROVIDED "AS IS" WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. Binding service obligations, liability, and service credits arise exclusively from the individually executed agreement (Master Service Agreement / SOW).