Skip to end of metadata
Go to start of metadata

Service Management beyond OS-Operations

Observe, notify, brag of success or fix problem.
What you need
  • Heartbeat on each service.
  • Trace metrics for every service call, in prod. You must know your daily behavior to efficiently find your performance hickups.
  • Each service must show it's usage, and value.
  • Usable, and available logs
    • Info (add new user, new operation)
    • Error handling (user/consumer tried to do something that didn't work)
    • Exception (something unexpectedly wrong happened)
  • Audit Logs
% Average is a lie!
Use metrics like 80, 90, 95% percentile. Record all individual calls that are extremely slow.


Base elements
  • Business Value → Observe userstorries accomplished. Tell the success story. What value did you provide to your customers.
  • Operational Metrics → Observe functionality. Which functions are used, which functions are not used? Does upgrade increase or decrease end-user usage?
  • Technical Metrics → Observe usage, what is the number of requests. What is the rate of 200/404/5xx responses?
    • Performance Metrics → Observe everything that happen in production. Sumarize valuable metrics, then discard raw data. Create a basic understanding of how your services operate in the wild.

Tools available

Available Logs

  • Logback - log4j successor. Has web front for logs.


Metrics provider

Metrics consumer and GUI

Commercial tools

  • AppDynamics - Monitor each service, GUI, non-introsiv - expensive. Follow trace through chain of servers.
  • New Relic - Monitor performance metrics in production. Monitor all services.
Enter labels to add to this page:
Please wait 
Looking for a label? Just start typing.