Salesforce Selective Queries

A basic understanding of the selective query concept is fundamental to scalability on the Salesforce platform. Non-selective queries have a significant performance impact on List Views, Reports and SOQL and can often result in Apex Trigger runtime exceptions as below – as the data volume increases over time.

[code language=”java”]
System.QueryException Cause: null Message: Non-selective query against large object type (more than 200000 rows)
[/code]

SOQL queries executed from Apex Triggers have to be selective where the record count in the queried object is 200,000 or more. The determination of a selective query is a function of volume plus field-level selectivity. It is therefore the case that the selectivity state of a given query is volatile meaning in practical terms that the initiating Apex Trigger may work one day and not the next.

Selective Query Definition

Selectivity is determined by the index state of the query filter conditions and the number of records the filter returns (selectivity) versus the object total records. The thresholds below show the difference between the selectivity calculation for a standard index versus a custom index.

Selectivity Thresholds:
Standard Index – 30% (first 1M) then 15%
Custom Index – 10% (first 1M) then 5%

Unary filter:

e.g. select Name from Account where IndexedField__c=’ABC’

With a custom index on IndexedField__c the filter must return <10% of the total records in the object to be considered selective – up to the first 1 million records from that point the threshold drops to 5%.

Multiple filters AND (exclusively):

e.g. select Name from Account where IndexedField__c=’ABC’ and SecondIndexedField__c=’123′

The Query Optimiser will set the leading operation on the basis of lowest cost. If no filters are selective a table scan is performed.

If all filters have an index then a Composite Index Join optimisation can be applied.

In this case each filter must be less than 2x (two-times) the selectivity threshold.
All filters combined must be less than selectivity threshold.

If all filter fields are standard then use the standard index selectivity threshold – otherwise use custom index selectivity threshold.

Multiple filters OR (at least one):

e.g. select Name from Account where IndexedField__c=’ABC’ or SecondIndexedField__c=’123′

Selective AND filter indexes could be set as the Leading Operation – if none exist, then a table scan occurs unless all filters have an index then a Composite Index Union optimisation becomes possible.

In this case each filter must be less than selectivity threshold.
All filters combined must be less than selectivity threshold.

If all fields are standard then use the standard index selectivity threshold – otherwise use custom index selectivity threshold.

Parent Field Filter:

e.g. select Name from Contact where IndexedField__c=’ABC’ and Account.IndexedField__c=’ABC’

Where parent object fields are referenced in a filter, each filter index is individually and the lowest cost option selected as the leading operation.

Note, the parent field is not indexed on the queried object, so Account.Id can incur a table scan on Opportunity whereas AccountId may allow the standard index to become the leading operation.

The notes above provide a basic outline of the concepts but should be sufficient to convey the key concepts.

 

Implementation Approach

As data volumes grow query behaviour can change dramatically, to mitigate this database queries originating in Apex code, list view and report definitions must consider the future peak data volume and field-level data characteristics (primarily selectivity). This considered approach can help identify an appropriate indexing strategy that maintains query performance by ensuring query selectivity. So, forward planning is absolutely key; queries should be designed to be selective up to the projected peak data volumes. Thinking ahead to this extent is very seldom applied in my experience, particularly where the Salesforce implementation evolves project-by-project and the longer term picture is not a priority.

In order to evaluate the selectivity of a given query – the following 2 approaches can be applied.

REST API Query Resource Feedback Parameter

The Force.com REST API exposes a Query resource that accepts an explain parameter which can set with a SOQL query, List View Id or Report Id. The results show the options considered by the Query Optimiser and the lowest cost option (leading operation) taken. A relative cost value of less than 1 indicates a selective query, anything higher indicates non-selective. The example below shows the output for a report where a report filter hits an indexed field.

selective-queries-api

Developer Console Query Plan

From Summer ’14 on the Developer Console can be enabled [Help>Preferences>Enable Query Plan] to display a Query Plan option on the Query Editor tab. The construct of the output is the same as the API approach. Note, this appproach is limited to SOQL queries.

selective-queries-developer-console

The references section below provides link to the technical detail of the 2 approaches introduced above.

 

References

Query & Search Optimisation Cheat Sheet

Developing Selective Force.com Queries through the Query Resource Feedback Parameter Beta

Developer Console Query Editor

Improve performance with Custom indexes using Selective SOQL Queries

Custom Index Request Checklist

%d bloggers like this: