VarSome API

For information about the API and pricing, please click here.

Introduction

The VarSome API allows developers to retrieve information from over 70 genomic databases in a single call. The data is returned as JSON, which is accessible from any language or platform. For example, it can be transformed into native Python objects if required for processing.

This is the same data as is visualised in the VarSome genomic search engine, and leverages Saphetor's proprietary high-performance genomic database.

Batch requests are also available, allowing data for up to 10,000 variants per batch query (we recommend 1,000 for optimal results) to be efficiently retrieved in a single call.

Please ensure you have registered as a VarSome user, then contact us with your user login in order to receive an authentication token to use the API.

Your feedback will be much appreciated.

Example Queries

Here are a few simple example queries:

All the annotation data is returned as a single dictionary, keyed by each source institution & database name. Example keys are "icgc_somatic", "iarc_tp53_germline" and "ncbi_clinvar".
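As a minimal sketch of what "a single dictionary, keyed by source" means in practice, the snippet below decodes a JSON body into a Python dict and checks which sources are present. The response text is a hypothetical, heavily truncated placeholder; only the three key names come from the paragraph above, and the nested contents are invented for illustration.

```python
import json

# Hypothetical, truncated response body. Real responses carry many more keys,
# and the nested contents shown here are placeholders, not real API output.
raw = '{"icgc_somatic": [{"placeholder": 1}], "ncbi_clinvar": [{"placeholder": 2}]}'

annotation = json.loads(raw)  # JSON text -> native Python dict


def sources_present(annotation, wanted):
    """Return the subset of `wanted` source keys that appear in the response."""
    return [key for key in wanted if key in annotation]


found = sources_present(
    annotation, ["icgc_somatic", "iarc_tp53_germline", "ncbi_clinvar"]
)
print(found)
```

Checking for a key before reading it is a sensible default here, since a given variant is unlikely to have entries in every source database.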

Variant endpoints

Retrieve variant related data.

Schema Lookup request

[GET] [https://api.varsome.com/lookup/schema]

Retrieves the schema of a variant response object, containing relevant information for each field included in the variant lookup response.

Example

https://api.varsome.com/lookup/schema
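One way to call this endpoint from Python, using only the standard library, might look like the sketch below. The `Accept` header and the helper names are assumptions made for illustration; the request is only sent when `fetch_schema` is called, which needs network access.

```python
import json
import urllib.request

SCHEMA_URL = "https://api.varsome.com/lookup/schema"


def build_schema_request():
    """Prepare (but do not send) the GET request for the variant schema."""
    # The Accept header is an assumption; the documentation does not require it.
    return urllib.request.Request(
        SCHEMA_URL, headers={"Accept": "application/json"}, method="GET"
    )


def fetch_schema(timeout=30):
    """Send the request and decode the JSON body. Needs network access."""
    with urllib.request.urlopen(build_schema_request(), timeout=timeout) as resp:
        return json.load(resp)


request = build_schema_request()
print(request.get_method(), request.full_url)
```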

ACMG annotation fields

For full documentation see Documentation

acmg_annotation - The JSON object containing the ACMG annotation.

  • version_name - String. VarSome's software version.
  • gene_symbol - String. The gene symbol of the variant.
  • transcript - String. The transcript used for the acmg_annotation.
  • transcript_reason - String. The reason for selecting the specific transcript. If override_transcript isn't used, the transcript selected by default is the one with the most severe coding impact, or the longest canonical transcript, or the longest one.
  • coding_impact - String. The variant's coding impact based on the aforementioned transcript.
  • verdict - JSON object containing the overall results of the ACMG classification.
    • ACMG_rules - JSON object containing a summary.
      • verdict - String. The ACMG classification based on the annotation data, obtained by combining the pathogenic & benign sub-scores per the ACMG guidelines.
      • pathogenic_subscore - String. The verdict derived using only pathogenic rules.
      • benign_subscore - String. The verdict derived using only benign rules.
      • approx_score - Integer. A numeric score that is used for sorting. Not used for deriving the verdict.
    • classifications - Array. The ACMG rules that succeeded.
  • classifications - Array. The rules that succeeded, as well as the ones that failed but have a clear user explanation.
    • name - String. The ACMG rule's name.
    • met_criteria - Boolean. True for success, false for failure.
    • user_explain - Array. The user explanations for a rule that succeeded.
    • user_explain_failed - Array. The user explanations for a rule that failed.
  • gene_id - Integer. Internal gene identification number.
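To illustrate how the nested fields above fit together, here is a small sketch that pulls the headline verdict and the names of the rules that were met out of an `acmg_annotation` object. The sample dictionary is a hand-written placeholder shaped after the field list, not real API output, and `summarize_acmg` is a hypothetical helper name.

```python
def summarize_acmg(acmg_annotation):
    """Pull the headline ACMG fields out of an acmg_annotation object."""
    rules = acmg_annotation.get("verdict", {}).get("ACMG_rules", {})
    met = [
        rule["name"]
        for rule in acmg_annotation.get("classifications", [])
        if rule.get("met_criteria")
    ]
    return {
        "gene_symbol": acmg_annotation.get("gene_symbol"),
        "transcript": acmg_annotation.get("transcript"),
        "verdict": rules.get("verdict"),
        "rules_met": met,
    }


# Placeholder values shaped after the field list above; not real API output.
sample = {
    "gene_symbol": "BRAF",
    "transcript": "NM_EXAMPLE",
    "verdict": {"ACMG_rules": {"verdict": "Pathogenic", "approx_score": 10}},
    "classifications": [
        {"name": "PM2", "met_criteria": True, "user_explain": ["..."]},
        {"name": "BP4", "met_criteria": False, "user_explain_failed": ["..."]},
    ],
}

summary = summarize_acmg(sample)
print(summary)
```

The helper reads the top-level `classifications` array because, per the list above, it is the one that carries `met_criteria` for both successful and failed rules.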

saphetor_known_pathogenicity field: for Saphetor internal use only - not supported

AMP annotation fields

For full documentation see Documentation

amp_annotation - The JSON object containing the AMP annotation.

  • version_name - String. VarSome's software version.
  • verdict - JSON object containing the overall results of the AMP classification.
    • tier - String. The AMP classification (Tier I - Tier IV) based on the annotation data, obtained by combining the pathogenic & benign sub-scores per the guidelines.
    • approx_score - Double. A numeric score that is used for sorting. Not used for deriving the verdict.
  • classifications - Array. The rules that succeeded, as well as the ones that failed but have a clear user explanation.
    • name - String. The AMP rule's name.
    • tier - The tier assigned to the rule (Tier I - Tier IV).
    • user_explain - Array. The user explanations for a rule that succeeded.
    • user_explain_failed - Array. The user explanations for a rule that failed.
    • total_samples - Integer. The total samples found for the specific variant. Only in the Somatic Rule.
  • sample_findings - JSON object containing the overall results of the findings for the specified sample.
    • sex - String. The sample findings for the specified sex.
    • age - String. The sample findings for the specified age.
    • age_match - String. The sample findings for the specified age.
    • tissue_type_match - String. The sample findings for the specified tissue type.
    • cancer_type_match - String. The sample findings for the specified cancer type.
    • ethnic_frequency - String. The sample findings for the specified ethnic frequency.
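As a rough sketch of consuming the `classifications` array described above, the helper below groups AMP rule names by the tier assigned to each rule. The sample data and rule names are placeholders invented for illustration, and `rules_by_tier` is a hypothetical helper, not part of the API.

```python
from collections import defaultdict


def rules_by_tier(amp_annotation):
    """Group rule names from amp_annotation['classifications'] by their tier."""
    grouped = defaultdict(list)
    for rule in amp_annotation.get("classifications", []):
        grouped[rule.get("tier")].append(rule.get("name"))
    return dict(grouped)


# Placeholder values shaped after the field list above; not real API output.
sample = {
    "verdict": {"tier": "Tier I", "approx_score": 7.5},
    "classifications": [
        {"name": "example_rule_a", "tier": "Tier I"},
        {"name": "example_rule_b", "tier": "Tier III"},
        {"name": "example_rule_c", "tier": "Tier I"},
    ],
}

tiers = rules_by_tier(sample)
print(sample["verdict"]["tier"], tiers)
```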

Variant lookup

[GET] [https://api.varsome.com/lookup/{query}/{ref_genome}{?add-all-data=1&add-varsome-user-entries=1&expand-pubmed-articles=1&add-region-databases=1&add-source-databases}]

The query parameter can be any of the following: HGVS Protein-level variant, HGVS DNA-level variant, rs_id, 4-part genomic variant specification and variant_id.

The response to all these types of queries has the same format and can be either an Array of variant response objects, or a single variant response object.

Parameters

  • query - A String representation of the variant to query. Can be any of the following:

    • HGVS Protein-level variant - Gene/transcript name followed by HGVS Protein level notation. Examples: BRAF:V600E, NM_001252678:I182T
    • HGVS DNA-level variant - Gene/transcript name followed by HGVS DNA level notation. Example: FTO:c.46-43098T>C
    • rs_id - the dbSNP accession number. String “rs” followed by one or more digits. Example: rs113488022
    • 4-part genomic variant specification - chromosome:position:reference_allele:alternate_allele or chromosome:position:reference_length:alternate_allele. The separator may be ':' or '-', the chromosome number is optionally preceded by the string “chr”, and position is the 1-based chromosomal position.
    • variant_id - our 20-digit integer value (example '10190150730273780002'). If you call “region_variants” below, it is faster to then obtain variant data using the variant_ids returned.
  • ref_genome (optional) - `hg19` or `hg38` Default: `hg19`
  • add-all-data (optional) - Can be 0 or 1. Include all data in the annotation (same as enabling all the parameters below)
  • add-ACMG-annotation (optional) - Can be 0 or 1. Include ACMG classification (only available in specific API plans).
  • minimum-clinvar-stars (optional) - Can be 0 to 4. Define the minimum ClinVar rating to take into account when calculating the ACMG Verdict.
  • expand-pubmed-articles (optional) - Can be 0 or 1. Include publication information (e.g. authors, abstract) for every PUBMED ID in the annotation result
  • add-region-databases (optional) - Can be 0 or 1. Include region databases data in the response
  • add-source-databases (optional) - 'all' or 'none' or an array of database names to be included in the result
  • override_transcript (optional) - Overrides the transcript used for ACMG annotation with the one specified by override_transcript. The transcript used by default is the one with the most severe coding impact, or the longest canonical transcript, or the longest one. Requires that the add-ACMG-annotation parameter is set to 1. If the specified transcript isn't valid for that variant, the API returns an error.
  • add-AMP-annotation (optional) - Can be 0 or 1. Enables the AMP annotation.
  • sex (optional) - String. Male or Female. The sample's sex to be used for matching. Requires that the AMP annotation is enabled.
  • age (optional) - Integer. The sample's age to be used for matching. Requires that the AMP annotation is enabled.
  • ethnicity (optional) - String. The sample's ethnicity to be used for matching. Requires that the AMP annotation is enabled.
  • cancer-type (optional) - String. The sample's cancer type to be used for matching. Requires that the AMP annotation is enabled.
  • tissue-type (optional) - String. The sample's tissue type to be used for matching. Requires that the AMP annotation is enabled.
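The 4-part genomic variant specification in the parameter list above can be checked on the client side before a query is sent. The sketch below is one possible reading of that description: it splits on ':' or '-', strips an optional "chr" prefix, and treats an all-digit third field as a reference length rather than a reference allele. That last heuristic and the function name are assumptions, not documented behaviour.

```python
import re


def parse_four_part(spec):
    """Split a 4-part variant spec into its components (client-side sketch)."""
    parts = re.split(r"[:-]", spec.strip())
    if len(parts) != 4:
        raise ValueError(f"expected 4 fields, got {len(parts)}: {spec!r}")
    chrom, pos, ref, alt = parts
    if chrom.lower().startswith("chr"):
        chrom = chrom[3:]  # the "chr" prefix is optional
    if not pos.isdigit():
        raise ValueError(f"position must be a 1-based integer: {pos!r}")
    # Assumption: an all-digit third field is a reference length, not an allele.
    ref_is_length = ref.isdigit()
    return {
        "chromosome": chrom,
        "position": int(pos),
        "reference_allele": None if ref_is_length else ref,
        "reference_length": int(ref) if ref_is_length else None,
        "alternate_allele": alt,
    }


insertion = parse_four_part("chr22:39777823::CAA")
print(insertion)
```

Note that an empty reference field, as in the insertion above, is kept as an empty string rather than rejected.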

Headers

  • Authorization (optional) - To take advantage of your account's benefits you may optionally include your VariantAPI token as a request authorization header. Example: `Authorization: Token <your_token>`

Examples
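A lookup request could be assembled along the following lines. This is a sketch using the Python standard library; the helper name, the `Accept` header, and the choice to leave ':' unescaped in the path are assumptions, while the URL layout, parameter names, and `Authorization: Token` header come from the sections above. Nothing is sent until the request is passed to `urllib.request.urlopen`.

```python
import urllib.request
from urllib.parse import quote, urlencode

LOOKUP_BASE = "https://api.varsome.com/lookup"


def build_lookup_request(query, ref_genome="hg19", params=None, token=None):
    """Prepare (but do not send) a GET request for a single variant lookup."""
    # Keep ':' readable in the path; other reserved characters such as '>'
    # (common in HGVS DNA-level notation) are percent-encoded.
    url = f"{LOOKUP_BASE}/{quote(query, safe=':')}/{ref_genome}"
    if params:
        url += "?" + urlencode(params)
    headers = {"Accept": "application/json"}  # assumption, not documented
    if token:
        headers["Authorization"] = f"Token {token}"
    return urllib.request.Request(url, headers=headers, method="GET")


request = build_lookup_request(
    "BRAF:V600E", "hg38", {"add-ACMG-annotation": 1}, token="<your_token>"
)
print(request.full_url)
```

Passing the query parameters as a dictionary avoids the problem that names such as `add-ACMG-annotation` are not valid Python keyword arguments.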

Batch Lookup for many variants

[POST] [https://api.varsome.com/lookup/batch/{ref_genome}{?add-all-data=1&add-varsome-user-entries=1&expand-pubmed-articles=1&add-region-databases=1&add-source-databases=all}]

Retrieve variant data for multiple variants, passed in the POST request payload, against a given reference genome. This is currently limited to 1000 variants per request.

Parameters

  • ref_genome (optional) - `hg19` or `hg38` Default: `hg19`
  • add-all-data (optional) - Can be 0 or 1. Include all data in the annotation (same as enabling all the parameters below)
  • add-varsome-user-entries (optional) - Can be 0 or 1. Include VarSome's user submitted publications in the response
  • expand-pubmed-articles (optional) - Can be 0 or 1. Include publication information (e.g. authors, abstract) for every PUBMED ID in the annotation result
  • add-region-databases (optional) - Can be 0 or 1. Include region databases data in the response
  • add-source-databases (optional) - 'all' or 'none' or an array of database names to be included in the result

Headers

  • Authorization (required) - To perform a batch query you need to include your VariantAPI token as a request authorization header. Example: `Authorization: Token <your_token>`

Request body

  • variants (array) - an Array of strings containing any of the supported variant lookup notations as shown above. Example: `{"variants": ["rs113", "chr22:39777823::CAA"]}`
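Because a batch request is capped at 1000 variants, a longer list has to be split into several POST requests. The sketch below shows one way to do that with the standard library; the helper name and the `Content-Type` header are assumptions, and the requests are only prepared, not sent.

```python
import json
import urllib.request

BATCH_URL = "https://api.varsome.com/lookup/batch/{ref_genome}"
MAX_PER_REQUEST = 1000  # per-request limit stated in this section


def build_batch_requests(variants, token, ref_genome="hg19"):
    """Split `variants` into chunks and prepare one POST request per chunk."""
    prepared = []
    for start in range(0, len(variants), MAX_PER_REQUEST):
        chunk = variants[start:start + MAX_PER_REQUEST]
        body = json.dumps({"variants": chunk}).encode("utf-8")
        prepared.append(
            urllib.request.Request(
                BATCH_URL.format(ref_genome=ref_genome),
                data=body,
                headers={
                    "Authorization": f"Token {token}",
                    "Content-Type": "application/json",  # assumption
                },
                method="POST",
            )
        )
    return prepared


batch = build_batch_requests(["rs113", "chr22:39777823::CAA"], token="<your_token>")
print(len(batch), batch[0].data.decode("utf-8"))
```

Each prepared request can then be sent with `urllib.request.urlopen` and its JSON body decoded with `json.load`.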

Get variants in a genomic region

[GET] [https://api.varsome.com/region_variants/{ref_genome}/{chromosome_id}/{position}/{length}{?add-all-data=1&add-source-databases=all}]

Retrieve all known variants inside the genomic region described, using the ref_genome, chromosome_id, position and length.

Parameters

  • ref_genome (optional) - `hg19` or `hg38` Default: `hg19`
  • chromosome_id - A number representing the chromosome: 1-22, 23 for X and 24 for Y (example `1`).
  • position - the 1-based chromosomal position of the start of the region.
  • length - the length of the region in base pairs
  • add-all-data (optional) - Can be 0 or 1. Include all databases in the response. Same as add-source-databases=all
  • add-source-databases (optional) - 'all' or 'none' or an array of database names to be included in the result

Headers

  • Authorization (optional) - To take advantage of your account's benefits you may optionally include your VariantAPI token as a request authorization header. Example: `Authorization: Token <your_token>`

Example
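A region query URL could be built as in the sketch below, which also converts chromosome labels to the numeric ids this endpoint expects (23 for X, 24 for Y). Accepting labels such as "X" or "chr7" is a convenience added here for illustration; the helper names are hypothetical.

```python
REGION_BASE = "https://api.varsome.com/region_variants"


def chromosome_id(name):
    """Map a chromosome label to the numeric id: 1-22, 23 for X, 24 for Y."""
    label = str(name).upper()
    if label.startswith("CHR"):
        label = label[3:]  # convenience only; the API itself takes a number
    if label == "X":
        return 23
    if label == "Y":
        return 24
    number = int(label)
    if not 1 <= number <= 24:
        raise ValueError(f"unknown chromosome: {name!r}")
    return number


def region_variants_url(chromosome, position, length, ref_genome="hg19"):
    """Build the region_variants URL for a 1-based start position and length."""
    return f"{REGION_BASE}/{ref_genome}/{chromosome_id(chromosome)}/{position}/{length}"


url = region_variants_url("X", 1000, 50)
print(url)
```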

Gene endpoints

Retrieve gene related data.

Gene response schema

[GET] [https://api.varsome.com/lookup/schema/genes]

Retrieves the schema of a gene response object, containing relevant information for each field included in the gene lookup response.

Example

https://api.varsome.com/lookup/schema/genes

Gene lookup based on gene symbol

[GET] [https://api.varsome.com/lookup/gene/{gene_symbol}/{ref_genome}{?add-all-data=1&add-source-databases=all}]

Retrieve gene data for the given gene_symbol and reference genome id.

Parameters

  • gene_symbol - The gene's symbol
  • ref_genome (optional) - `hg19` or `hg38` Default: `hg19`
  • add-all-data (optional) - Can be 0 or 1. Include all data in the response. Same as enabling all the following parameters
  • expand-pubmed-articles (optional) - Can be 0 or 1. Include publication information (e.g. authors, abstract) for every PUBMED ID in the annotation result
  • add-source-databases (optional) - 'all' or 'none' or an array of database names to be included in the result

Headers

  • Authorization (optional) - To take advantage of your account's benefits you may optionally include your VariantAPI token as a request authorization header. Example: `Authorization: Token <your_token>`

Examples
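As a sketch, a gene lookup URL can be composed from the gene symbol, the reference genome, and any optional query parameters. The helper name is hypothetical; the path layout and parameter names are taken from this section.

```python
from urllib.parse import quote, urlencode

GENE_BASE = "https://api.varsome.com/lookup/gene"


def gene_lookup_url(gene_symbol, ref_genome="hg19", params=None):
    """Build the gene lookup URL, appending optional query parameters."""
    url = f"{GENE_BASE}/{quote(gene_symbol, safe='')}/{ref_genome}"
    if params:
        url += "?" + urlencode(params)
    return url


url = gene_lookup_url("BRAF", "hg38", {"add-source-databases": "all"})
print(url)
```

To authenticate, the `Authorization: Token <your_token>` header described above would be added to whatever HTTP client sends this URL.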

Batch Lookup for many genes

[POST] [https://api.varsome.com/lookup/genes/batch{?add-all-data=1&expand-pubmed-articles=1&add-source-databases=all}]

Retrieve gene data for more than one gene, passed in the POST request payload. This is currently limited to 1000 genes per request.

Parameters

  • add-all-data (optional) - Can be 0 or 1. Include all data in the response. Same as enabling all the following parameters
  • expand-pubmed-articles (optional) - Can be 0 or 1. Include publication information (e.g. authors, abstract) for every PUBMED ID in the annotation result
  • add-source-databases (optional) - 'all' or 'none' or an array of database names to be included in the result

Headers

  • Authorization (required) - To perform a batch query you need to include your VariantAPI token as a request authorization header. Example: `Authorization: Token <your_token>`

Request body

  • genes (array) - an Array of strings containing gene symbols. Example: `{"genes": ["BRAF", "TP53"]}`
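A gene batch request might be prepared as in the sketch below. It assumes the request body is a JSON object with a `genes` key, by analogy with the `variants` body used for variant batches; that body shape, the `Content-Type` header, and the helper name are assumptions. The request is prepared but not sent.

```python
import json
import urllib.request
from urllib.parse import urlencode

GENE_BATCH_URL = "https://api.varsome.com/lookup/genes/batch"


def build_gene_batch_request(genes, token, params=None):
    """Prepare (but do not send) a POST request for several gene symbols."""
    url = GENE_BATCH_URL
    if params:
        url += "?" + urlencode(params)
    # Assumption: the body is a JSON object of the form {"genes": [...]}.
    body = json.dumps({"genes": list(genes)}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Authorization": f"Token {token}",
            "Content-Type": "application/json",  # assumption
        },
        method="POST",
    )


request = build_gene_batch_request(
    ["BRAF", "TP53"], token="<your_token>", params={"add-all-data": 1}
)
print(request.full_url, request.data.decode("utf-8"))
```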

Transcript endpoints

Retrieve transcript related data.

Transcript lookup based on transcript name

[GET] [https://api.varsome.com/lookup/transcript/{transcript_name}/{ref_genome}]

Retrieve transcript data for the given transcript name and reference genome id.

Parameters

  • transcript_name - The transcript name
  • ref_genome (optional) - `hg19` or `hg38` Default: `hg19`

Headers

  • Authorization (optional) - To take advantage of your account's benefits you may optionally include your VariantAPI token as a request authorization header. Example: `Authorization: Token <your_token>`

Examples
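A transcript lookup follows the same pattern as the other GET endpoints. The sketch below prepares the request without sending it; the helper name and the `Accept` header are assumptions, and the transcript name is reused from the HGVS example earlier in this document purely as an illustration.

```python
import urllib.request
from urllib.parse import quote

TRANSCRIPT_BASE = "https://api.varsome.com/lookup/transcript"


def build_transcript_request(transcript_name, ref_genome="hg19", token=None):
    """Prepare (but do not send) a GET request for one transcript."""
    url = f"{TRANSCRIPT_BASE}/{quote(transcript_name, safe='')}/{ref_genome}"
    headers = {"Accept": "application/json"}  # assumption, not documented
    if token:
        headers["Authorization"] = f"Token {token}"
    return urllib.request.Request(url, headers=headers, method="GET")


request = build_transcript_request("NM_001252678", "hg38")
print(request.full_url)
```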