Skip to content

Variant query

LAPIS offers a special query language to specify variants. A variant query can be used to filter sequences and be passed to the server through the query parameter variantQuery. It is not allowed to use the variantQuery parameter alongside other variant-defining parameters (e.g., pangoLineage or aaMutations). Don’t forget to encode/escape the query correctly (in JavaScript, this can be done with the encodeURIComponent() function)!

The formal specification of the query language is available here as an ANTLR v4 grammar. In following, we provide an informal description and examples. The respective unit test provides a full list of possible atomic queries.

The query language understands Boolean logic. Expressions can be connected with & (and), | (or) and ! (not). Parentheses ( and ) can be used to define the order of the operations. Further, there is a special syntax to match sequences for which at least or exactly n out of a list of expressions are fulfilled.

Examples

Get the sequences with the nucleotide mutation 300G, without a deletion at position 400 and either the AA change S:123T or the AA change S:234A:

300G & !400- & (S:123T | S:234A)

Get the sequences with at least 3 out of five mutations/deletions:

[3-of: 123A, 234T, S:345-, ORF1a:456K, ORF7:567-]

Get the sequences that fulfill exactly 2 out of 4 conditions:

[exactly-2-of: 123A & 234T, !234T, S:345- | S:346-, [2-of: 222T, 333G, 444A, 555C]]

For SARS-CoV-2, it is also possible to use Pango lineage queries (either called by pangolin or by Nextclade) and filter by Nextstrain clades:

BA.5* | nextcladePangoLineage:BA.5* | nextstrainClade:22B

LAPIS supports a ternary logic to query ambiguous nucleotide symbols.

MAYBE(123W)