Evolution of a second-order performance metric

As with any application undergoing a performance profiling effort, you start with standard instrumentation pulling data from the supporting infrastructure stack and from application-specific metrics. In this case, the focus of the study was a probabilistic search that makes up a substantial portion of the application workload. The search has three basic phases: a “selection” phase that narrows the field of potential matches through a grouping strategy and retrieves the comparison strings; a scoring phase that performs deep scoring and matching against those strings; and a final phase that retrieves attribute data for the candidates that matched above the specified threshold for the search.
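
To make the three phases concrete, here is a toy sketch of that kind of pipeline; the record layout, grouping key, and scoring function (difflib's similarity ratio) are illustrative stand-ins, not the application's actual API:

```python
# Hypothetical three-phase probabilistic search; names and data shapes are
# illustrative stand-ins, not the real application's implementation.
from difflib import SequenceMatcher

# Toy "database": each record has a grouping key, a comparison string, and attributes.
RECORDS = {
    1: {"group": "smith", "cmp": "john smith", "attrs": {"dob": "1970-01-01"}},
    2: {"group": "smith", "cmp": "jon smyth",  "attrs": {"dob": "1982-05-17"}},
    3: {"group": "jones", "cmp": "mary jones", "attrs": {"dob": "1991-11-03"}},
}

def search(query_group, query_string, threshold):
    # Phase 1: selection -- narrow the candidates via the grouping strategy
    # and retrieve their comparison strings (the cache-sensitive phase).
    candidates = [rid for rid, r in RECORDS.items() if r["group"] == query_group]
    # Phase 2: deep scoring/matching of each candidate against the query.
    scored = [(rid, SequenceMatcher(None, query_string, RECORDS[rid]["cmp"]).ratio())
              for rid in candidates]
    # Phase 3: retrieve attribute data only for candidates above the threshold.
    return [(rid, s, RECORDS[rid]["attrs"]) for rid, s in scored if s >= threshold]

print(search("smith", "john smith", 0.8))
```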

Over time and much analysis, it became clear that the candidate selection phase was the key bottleneck governing overall search performance: if the objects involved in the selection process were not contained within the database cache, searches were much slower. This observation led to the production of our first interesting plot, a scatter plot with candidates selected on the x-axis and selection time on the y-axis.
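
A minimal sketch of that diagnostic plot follows. The synthetic data stands in for the real instrumentation feed; the roughly linear cost per candidate, the column choices, and the millisecond units are assumptions:

```python
# Scatter plot of candidates selected (x) vs. selection time (y).
# Synthetic data mimics the observed roughly-linear trend.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
candidates = rng.integers(100, 20_000, size=5_000)
selection_ms = 0.05 * candidates + rng.normal(0, 50, size=candidates.size)

plt.scatter(candidates, selection_ms, s=2, alpha=0.3)
plt.xlabel("Candidates selected")
plt.ylabel("Selection time (ms)")
plt.title("Candidate selection cost per search")
plt.show()
```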

This plot routinely showed strong linear trends, and periodically multiple distinct populations with different slopes. It was also noted over time that different customers exhibited different slope values. Both characteristics indicated a promising second-order metric that could potentially be used for site-to-site comparisons of the application's performance characteristics for this class of workload.
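
One straightforward way to extract that slope, continuing the synthetic sketch above, is an ordinary least-squares fit; scipy's linregress returns both the slope and an r-value that gauges how linear the population really is:

```python
# Fit a single slope over the whole population; r^2 indicates fit quality.
from scipy import stats

fit = stats.linregress(candidates, selection_ms)
print(f"slope = {fit.slope:.4f} ms/candidate, r^2 = {fit.rvalue**2:.3f}")
```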

Initial analyses focused on the slope of the entire population. In cases where multiple populations were present, however, that single slope value did not truly characterize the installation's performance. Further investigation into these deployments identified distinct slope values for search populations originating from distinct workload sources.

In this case, the original data set was split into two populations by the source of the search. Each population shows a strong linear trend on its own, and the slope values for these “isolated” populations demonstrate a much better fit, providing the portable metric we were after.
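
A sketch of that split-then-fit step, under the same synthetic assumptions (the "online" and "batch" source labels are hypothetical stand-ins for the real workload origins):

```python
# Split the sample by search source, then fit each population separately.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
candidates = rng.integers(100, 20_000, size=10_000)
source = rng.choice(["online", "batch"], size=candidates.size)
# Two workloads with genuinely different per-candidate costs.
per_candidate_ms = np.where(source == "online", 0.03, 0.08)
selection_ms = per_candidate_ms * candidates + rng.normal(0, 40, size=candidates.size)

for label in ("online", "batch"):
    mask = source == label
    fit = stats.linregress(candidates[mask], selection_ms[mask])
    print(f"{label}: slope = {fit.slope:.4f} ms/candidate, r^2 = {fit.rvalue**2:.3f}")
```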

With the fundamental model in place, we ramped the data set sizes up from tens of thousands of samples to tens of millions per customer site. This significant jump in data points required another change to the visualization strategy: two-dimensional frequency plots. These “heat plots” allowed rapid tracking of multiple search populations with distinct slopes over time. Coupling the 2D frequency plots with a micro-plotting strategy yielded a valuable diagnostic visualization that could summarize 20+ million data points in a highly condensed time-series view.
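
A 2D frequency plot scales where a scatter plot saturates: each cell counts the searches that fall into a (candidates, time) bin, and log-scaled counts keep sparse secondary populations visible. A minimal sketch, again with synthetic data standing in for the field samples:

```python
# 2D frequency ("heat") plot of two slope populations; log color scale keeps
# the less-frequent population visible against the dominant one.
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import LogNorm

rng = np.random.default_rng(2)
candidates = rng.integers(100, 20_000, size=1_000_000).astype(float)
slope = rng.choice([0.03, 0.08], size=candidates.size)  # two populations
selection_ms = slope * candidates + rng.normal(0, 40, size=candidates.size)

plt.hist2d(candidates, selection_ms, bins=200, norm=LogNorm(), cmap="inferno")
plt.colorbar(label="Search count")
plt.xlabel("Candidates selected")
plt.ylabel("Selection time (ms)")
plt.show()
```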

The development of this phase of the slope metric took the better part of a year when factoring in large-scale lab and field data collection, analysis, and refinement efforts (and while doing my day job…).