Time series analysis of workload isolation strategies

By now you have figured out that I really like time series plots of transactional data with log scale latencies as a visual representation of workloads. Taking this a step further, we can overlay color-coded workloads with distinct latency values on the same plot to monitor their relative performance over time. Here’s an example of this type of plot using transactional data from an OLTP-style application servicing four distinct workloads:

In the main scatter plot, the x-axis is time and the y-axis is latency on a log scale. The time series plots below the main scatter plot show CPU consumption on the two supporting nodes servicing these workloads. This data set captured an intentional failover event, run to determine the effectiveness of an HA design. The vertical red line marks the failure event. The test was a success: as you can see, the workloads continued to process after the intentional failure of one of the database nodes. The plot is pretty cool too. Some of the key visualization strategies here were:

  • plotting the entire population of data points for a complete view of the workloads
  • temporal overlays of performance data from the application and database tiers
  • use of earth tones and gray scales
  • overlay of plots and data tables
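For readers who want to try this layout, here is a minimal sketch of the same idea in Python with matplotlib. It is not the code behind the original plot: the data is synthetic, and the workload names, colors, node labels, durations, and the `build_figure` helper are all assumptions made up for illustration. It plots every point, puts latency on a log scale, stacks two CPU panels under the scatter plot on a shared time axis, and marks the failure with a vertical red line.

```python
import random

import matplotlib
matplotlib.use("Agg")  # render off-screen; remove for interactive use
import matplotlib.pyplot as plt

random.seed(7)

# Synthetic stand-ins (assumed, not from the original test):
# (workload name, median latency in ms, earth-tone or gray color).
WORKLOADS = [
    ("workload_a", 2.0, "saddlebrown"),
    ("workload_b", 15.0, "olivedrab"),
    ("workload_c", 120.0, "darkgoldenrod"),
    ("workload_d", 900.0, "dimgray"),
]
DURATION_S = 600
FAILURE_AT_S = 300  # hypothetical time of the injected node failure


def synth_transactions(median_ms, n=2000):
    """Return (times, latencies) with lognormal latencies around median_ms."""
    times = [random.uniform(0, DURATION_S) for _ in range(n)]
    lats = [median_ms * random.lognormvariate(0, 0.35) for _ in times]
    return times, lats


def build_figure():
    """Latency scatter (log y) over two CPU panels, sharing the time axis."""
    fig, (ax_lat, ax_cpu1, ax_cpu2) = plt.subplots(
        3, 1, sharex=True, figsize=(10, 7),
        gridspec_kw={"height_ratios": [4, 1, 1]},
    )

    # Plot the entire population of points, one color per workload.
    for name, median_ms, color in WORKLOADS:
        t, lat = synth_transactions(median_ms)
        ax_lat.scatter(t, lat, s=2, color=color, alpha=0.5, label=name)
    ax_lat.set_yscale("log")
    ax_lat.set_ylabel("latency (ms)")
    ax_lat.legend(loc="upper right", markerscale=4)

    # Made-up CPU traces: node 1 goes dark, node 2 absorbs the load.
    seconds = list(range(DURATION_S))
    cpu_node1 = [0.0 if s >= FAILURE_AT_S else 35 + random.uniform(-5, 5)
                 for s in seconds]
    cpu_node2 = [(70 if s >= FAILURE_AT_S else 35) + random.uniform(-5, 5)
                 for s in seconds]
    for ax, series, label in ((ax_cpu1, cpu_node1, "node 1 CPU %"),
                              (ax_cpu2, cpu_node2, "node 2 CPU %")):
        ax.plot(seconds, series, color="gray", linewidth=0.8)
        ax.set_ylabel(label)
        ax.set_ylim(0, 100)
    ax_cpu2.set_xlabel("time (s)")

    # Mark the failure event on every panel.
    for ax in (ax_lat, ax_cpu1, ax_cpu2):
        ax.axvline(FAILURE_AT_S, color="red", linewidth=1.2)
    return fig
```

Calling `build_figure().savefig("workload_overlay.png", dpi=150)` writes the figure to disk. Small, semi-transparent markers are what make plotting the full population workable; with real data you would likely tune `s` and `alpha` to the point count.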

This data-rich plot summarized, in one comprehensive view, two weeks of effort to set up and execute a complex simulated failure event. Despite the information density, you can clearly observe the failure event in the callout plot and see that the workload was not interrupted by it, and you can still dig into the specific throughput and latency characteristics for each summary time period in the provided tables.
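The summary tables can be sketched in a few lines as well. The columns in the original tables are not spelled out here, so this example assumes throughput plus median and 95th percentile latency, split into the periods before and after the failure. The row layout, the `summarize` and `percentile` helpers, and the nearest-rank percentile method are illustrative choices, not a description of how the original tables were built.

```python
from math import ceil


def percentile(sorted_vals, p):
    """Nearest-rank percentile of an already sorted, non-empty list."""
    rank = max(1, ceil(p / 100 * len(sorted_vals)))
    return sorted_vals[rank - 1]


def summarize(transactions, start_s, failure_at_s, end_s):
    """Summarize (time_s, workload, latency_ms) rows per workload and period.

    Returns one dict per (workload, period) holding throughput in
    transactions per second and p50/p95 latency in milliseconds.
    """
    periods = {
        "before": (start_s, failure_at_s),
        "after": (failure_at_s, end_s),
    }
    groups = {}
    for t, workload, latency_ms in transactions:
        period = "before" if t < failure_at_s else "after"
        groups.setdefault((workload, period), []).append(latency_ms)

    rows = []
    for workload in sorted({w for w, _ in groups}):
        for period, (p_start, p_end) in periods.items():
            lats = sorted(groups.get((workload, period), []))
            if not lats:
                continue  # no transactions for this workload in this period
            rows.append({
                "workload": workload,
                "period": period,
                "tps": len(lats) / (p_end - p_start),
                "p50_ms": percentile(lats, 50),
                "p95_ms": percentile(lats, 95),
            })
    return rows


if __name__ == "__main__":
    # Tiny made-up sample: three transactions before the failure, one after.
    sample = [(0, "a", 1.0), (1, "a", 3.0), (2, "a", 2.0), (5, "a", 10.0)]
    for row in summarize(sample, start_s=0, failure_at_s=4, end_s=8):
        print(row)
```

Comparing the "before" and "after" rows for each workload is the tabular counterpart of eyeballing the scatter plot on either side of the red line.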