The percentileif aggregation function calculates the percentile of a numeric column, conditional on a specified boolean predicate. This function is useful for filtering data dynamically and determining percentile values based only on relevant subsets of data.

You can use percentileif to gain insights in various scenarios, such as:

  • Identifying response time percentiles for HTTP requests from specific regions.
  • Calculating percentiles of span durations for specific service types in OpenTelemetry traces.
  • Analyzing security events by percentile within defined risk categories.

For users of other query languages

If you come from other query languages, this section explains how to adjust your existing queries to achieve the same results in APL.

Usage

Syntax

summarize percentileif(Field, Percentile, Predicate)

Parameters

ParameterDescription
FieldThe numeric field from which to calculate the percentile.
PercentileA number between 0 and 100 that specifies the percentile to calculate.
PredicateA Boolean expression that filters rows to include in the calculation.

Returns

The function returns a single numeric value representing the specified percentile of the Field for rows where the Predicate evaluates to true.

Use case examples

You can use percentileif to analyze request durations for specific HTTP methods.

Query

['sample-http-logs']
| summarize post_p90 = percentileif(req_duration_ms, 90, method == "POST"), get_p90 = percentileif(req_duration_ms, 90, method == "GET") by bin_auto(_time)

Run in Playground

Output

post_p90get_p90
1.691 ms1.453 ms

This query calculates the 90th percentile of request durations for HTTP POST and GET methods.

  • percentile: Calculates the percentile for all rows without any filtering. Use percentile when you don’t need conditional filtering.
  • avgif: Calculates the average of a numeric column based on a condition. Use avgif for mean calculations instead of percentiles.
  • minif: Returns the minimum value of a numeric column where a condition is true. Use minif for identifying the lowest values within subsets.
  • maxif: Returns the maximum value of a numeric column where a condition is true. Use maxif for identifying the highest values within subsets.
  • sumif: Sums a numeric column based on a condition. Use sumif for conditional total calculations.