Illumina TruSight Whole Genome Analysis Application User Guide
- July 4, 2024
- illumina
Table of Contents
- Illumina TruSight Whole Genome Analysis Application
- Product Information
- Product Usage Instructions
- Product Documentation
- Overview
- Getting Started
- Settings
- Run Creation
- 200050132).
- Output File Summary
- Variant Call File Information
- Technical Assistance
- References
- Read User Manual Online (PDF format)
- Download This Manual (PDF format)
- 200050132).
- Homozygous is number of homozygous variants in the region.
- Heterozygous is number of heterozygous variants in the region.
Illumina TruSight Whole Genome Analysis Application
Product Information
Specifications
- Product Name: TruSight Whole Genome Analysis Application
- For: In Vitro Diagnostic Use
- Version: Document # 200049931 v00, April 2024
- Analysis Workflow: Demultiplexing, FASTQ generation, read mapping, alignment to GrCh38/hg38 human reference genome, variant calling
- Variant Callers: Small Variant Caller, CNV Caller, Repeat Expansion Detection with Expansion Hunter
Product Usage Instructions
Getting Started
Ensure the TruSight Whole Genome Analysis Application is installed on the
NovaSeq 6000Dx instrument. It can be found on the Applications screen of the
instrument or in Illumina Run Manager on a networked computer.
Contact your local Illumina Field Representative for assistance with installation.
Data Storage Requirements
Refer to the NovaSeq 6000Dx Product Documentation and DRAGEN Server for
NovaSeq 6000Dx Product Documentation for information on data output and
storage.
The application outputs data into a Run Folder and an Analysis Folder in
external storage. Approximate storage requirements are provided based on the
size of data output for different flow cell configurations.
Approximate Analysis Time
The analysis time may vary depending on the specific parameters and
workflow complexity.
FAQ
-
Q: How do I access technical assistance?
A: Please refer to the Technical Assistance section in the user manual for contact information. -
Q: What are the main functions of the TruSight Whole Genome Analysis Application?
A: The main functions include planning sequencing runs, demultiplexing, read mapping, variant calling, quality control, andgenerating reports.
TruSight Whole Genome Analysis Application
Product Documentation
ILLUMINA PROPRIETARY Document # 200049931 v00 April 2024
FOR IN VITRO DIAGNOSTIC USE.
Revision History
Document | Date | Description of Change |
---|---|---|
Document # 200049931 v00 | April 2024 | Initial release. |
This document and its contents are proprietary to Illumina, Inc. and its
affiliates (“Illumina”), and are intended solely for the contractual use of
its customer in connection with the use of the product(s) described herein and
for no other purpose. This document and its contents shall not be used or
distributed for any other purpose and/or otherwise communicated, disclosed, or
reproduced in any way whatsoever without the prior written consent of
Illumina. Illumina does not convey any license under its patent, trademark,
copyright, or common-law rights nor similar rights of any third parties by
this document.
The instructions in this document must be strictly and explicitly followed by
qualified and properly trained personnel in order to ensure the proper and
safe use of the product(s) described herein. All of the contents of this
document must be fully read and understood prior to using such product(s).
FAILURE TO COMPLETELY READ AND EXPLICITLY FOLLOW ALL OF THE INSTRUCTIONS
CONTAINED HEREIN MAY RESULT IN DAMAGE TO THE PRODUCT(S), INJURY TO PERSONS,
INCLUDING TO USERS OR OTHERS, AND DAMAGE TO OTHER PROPERTY, AND WILL VOID ANY
WARRANTY APPLICABLE TO THE PRODUCT(S).
ILLUMINA DOES NOT ASSUME ANY LIABILITY ARISING OUT OF THE IMPROPER USE OF THE
PRODUCT(S) DESCRIBED HEREIN (INCLUDING PARTS THEREOF OR SOFTWARE).
© 2024 Illumina, Inc. All rights reserved.
All trademarks are the property of Illumina, Inc. or their respective owners.
For specific trademark information, refer to
www.illumina.com/company/legal.html.
Overview
The TruSight Whole Genome Analysis Application is used to plan sequencing runs
for TruSight Whole Genome and automatically initiate analysis after the run
completes. Analysis includes demultiplexing, FASTQ generation, read mapping,
alignment to the graph-enabled GrCh38/hg38 human reference genome, and variant
calling using the Illumina DRAGEN Server for NovaSeq 6000Dx.
At different stages of the analysis workflow, the application performs quality
control (QC) according to defined sequencing, FASTQ, and sample library
metrics, and generates reports with the results. For samples that pass all QC
steps, the application generates supporting output files for use in downstream
germline applications.
The TruSight Whole Genome Analysis Application executes DRAGEN variant
callers, including the Small Variant Caller, Copy Number Variant (CNV) Caller,
and Repeat Expansion Detection with ExpansionHunter.
The application also performs annotation of low, intermediate, or high
confidence tier for small variants and includes this annotation in the output
file.
Getting Started
Make sure the TruSight Whole Genome Analysis Application is installed on the NovaSeq 6000Dx instrument that will be used for sequencing as part of TruSight Whole Genome. Installed applications can be found on the Applications screen on the NovaSeq 6000Dx Instrument or in Illumina Run Manager using a browser on a networked computer. For assistance to schedule installation, contact your local Illumina Field Representative.
Data Storage Requirements
Refer to the NovaSeq 6000Dx Product Documentation (document # 200010105) and
DRAGEN Server for NovaSeq 6000Dx Product Documentation (document # 200014171)
for general information on data output and storage.
The TruSight Whole Genome Analysis Application outputs data into a Run Folder
and an Analysis Folder in external storage. The minimum storage requirements
can be approximated from the size of data output into each folder for a single
sequencing run shown below.
Configuration | Run Folder (GB) | Analysis Folder (GB) |
---|---|---|
S2 Flow Cell (6 samples) | ~430 | ~350 |
S4 Flow Cell (16 samples) | ~1110 | ~890 |
Approximate Analysis Time
Analysis begins automatically after a sequencing run is completed and occurs
sequentially on samples within a run. Data output files will be available on
the external storage once analysis is complete for all samples in a run and
copy transfer to the external storage is complete. When starting a sequencing
run on both side A and side B at the same time, sequencing will be performed
concurrently. Analysis of these sequencing runs will be performed sequentially
by the TruSight Whole Genome Analysis Application after sequencing completes.
The run that completes sequencing and transfer first will be analyzed first.
The second sequencing run will be transferred and queued for analysis after
the first analysis completes. Refer to View Run and Results on page 6 for how
to determine status of active or failed runs.
Approximate time until analysis results are available after sequencing
completes is shown below for the situation when side A and side B are loaded
simultaneously with the same configuration.
Configuration | Analysis Run 1 (hours) | Analysis Run 2 (hours) |
---|---|---|
S2 Flow Cell (6 samples) | ~12 | ~24 |
S4 Flow Cell (16 samples) | ~24 | ~48 |
Settings
Select the TruSight Whole Genome Analysis Application on the Applications screen to view current configuration and change permissions.
Configuration
The configuration screen displays the following application settings:
- Application Name
- Application Version
- DRAGEN Version
- RTA Version
- Release Date
- Organization
- Device Identifier
- Production Identifier
- Library Prep Kits— Displays the library prep kit. This setting cannot be changed.
- Index Adapter Kits— Displays the index adapter sets available for use.
- Index Reads
- Read Type
- Index Lengths
- Read Lengths— Read lengths are set by default when the index set is selected. This setting cannot be changed.
Permissions
The designated administrator has Permissions access and can use the checkboxes
on the Permissions screen to manage user access for the TruSight Whole Genome
Analysis Application.
For more information regarding permissions and user management, refer to the
System Configuration section of the NovaSeq 6000Dx Product Documentation
(document # 200010105).
Run Creation
Create new runs in IVD mode either on the instrument or by accessing Illumina
Run Manager (IRM) using a browser on a networked computer. To access the
instrument remotely, use the address and user account information provided by
your Illumina representative. Refer to NovaSeq 6000Dx Product Documentation
(document # 200010105) for more information.
Create Run is the recommended method for run planning. Import Sample Sheet is
not recommended. The sample sheet files output in run and analysis folders are
not suitable for import during run planning.
Create Runs
-
From the Runs screen, select Create Run.
-
Select the TruSight Whole Genome Analysis Application, and then select Next.
-
On the Run Settings screen, enter a run name. The run name identifies the run from sequencing through analysis.
-
[Optional] Enter a run description to further identify the run. Library Prep kit is set by default as TruSight Whole Genome and cannot be changed.
-
Select the desired TruSight Whole Genome index set from the Index Adapter Kit drop-down menu. Read length will be set by default and cannot be changed. (Read 1 and 2 use 151 cycles; Index 1 and 2 use 10 cycles).
-
Enter a Library Tube ID (recommended format as DX1234567-LIB), and then select Next.
If no Library Tube ID is specified at this step, the planned run will need to be selected before loading of sequencing consumables. If the incorrect Library Tube ID is entered at this step, the planned run must be corrected before loading consumables. Refer to Run Revision on page 5 for protocol to correct run when ready to load consumables. -
On the Sample Data and Sample Settings screens, sample information will be entered. Sample data can be entered manually or by importing a sample data file. The sample ID must be unique for each sample and can only contain alphanumeric characters, underscores, and dashes. Do not include spaces. Well Position refers to the well in format A01 to H04 of the index plate. Index sequence information will be populated automatically when index plate Well Position is entered. Sex must be entered as Male, Female, or Unknown. Library Plate ID and Library Well ID (eg, format A01) are required fields.
- To enter sample data manually, add rows (to a total of 6 for S2 or 16 for S4 flow cell) and enter required information into Sample ID and Well Position Fields. Information may also be copied and pasted from Excel. Select Next. On the Sample Settings screen, enter Library Plate ID, Library Well ID, and Sex. Select Next.
- To import a sample data file, select Import Samples and upload the sample data file. Information will be populated into rows automatically. A template (*.csv) is available for download on this screen. Select Next. On the Sample Settings screen, information will be populated into rows automatically from the imported sample data file. Select Next.
-
On the Analysis Settings screen, enter the Batch Name recorded during batch and run planning.
-
[Optional] Select the Flow Cell Type, S2 or S4.
-
Confirm or deselect the checkbox to Generate ORA compressed FASTQs, then select Next.
NOTE The TruSight Whole Genome Analysis Application generates ORA compressed FASTQs by default. Changing this setting will increase size of final data output. -
On the Run Review screen, review the information entered. If no changes are needed, select Save. If changes are needed, select Back as needed to return to the appropriate screen.
CAUTION
TruSight Whole Genome has been validated for 6 samples when using the NovaSeq
6000Dx S2 flow cell, and 16 samples when using the NovaSeq 6000Dx S4 flow
cell. Ensure the correct number of samples are entered for the selected flow
cell configuration.
Run Revision
If changes are required after run creation and before loading consumables for
sequencing, revise runs in IVD mode either on the instrument or by accessing
Illumina Run Manager (IRM) using a browser on a networked computer.
- Select Runs.
- Select the Run name on the Planned Runs tab.
- Select Edit.
- Update the run or sample information as needed. For example, enter or correct the Library Tube ID to match that which was used when completing the workflow.
- Select Next up to Run Review.
- Select Save.
- Select Exit.
Return to Sequencing in IVD mode to repeat loading of consumables. Run should
now be automatically highlighted.
If updating Library Tube ID while loading consumables, return to Run Selection
in Control Software and select Refresh for the associated column, A or B. Run
should now be automatically
Requeue Analysis
Refer to the Troubleshooting section in the TruSight Whole Genome Package
Insert (document # 200050132) to determine which type of Requeue Analysis is
most appropriate.
Requeue analysis with no changes
- Select the Completed Run name to view Run Details.
- Select Requeue Analysis.
- Select Requeue Analysis with no changes.
- Provide details in the Reanalysis Reason field.
- Select Requeue Analysis.
- Exit the page and navigate to the Active Runs page to confirm the requeue is in progress.
Requeue analysis with changes
- Select the Completed Run name to view Run Details.
- Select Requeue Analysis.
- Select Edit run settings and Requeue Analysis.
- Provide details in the Reanalysis Reason field.
- Select Requeue Analysis.
- Confirm or update the Run Settings, then select Next.
- Correct the sample information as necessary by manually updating the fields or select Download Template to create a sampledata.csvfile with current information. Correct information and delete existing rows in the Sample Data tab before using Import Samples to populate the corrected sample data.
- Review the information on the Run Review screen and select Save to start reanalysis.
- Select Exit and navigate to the Active Runs page to confirm the requeue is in progress.
The original run data folder must be present at the external storage location specified in Run Details for reanalysis to complete successfully. If reanalysis fails, make sure the run has not been moved or deleted.
View Run and Results
-
From the Illumina Run Manager main screen in IVD mode, select Runs.
-
From the Completed Runs tab, select the Run name.
This tab will also display runs that have completed due to failure of sequencing, data transfer, or analysis. Active runs and their status are displayed in the Active Runs tab. Refer to NovaSeq 6000Dx Product Documentation (document # 200010105) for more information. -
Select the Run name in the Completed Runs tab to view Run Details and Results for the path to the Analysis Output Folder.
For failed runs, review the Status for each step and then refer to the Troubleshooting section of the TruSight Whole Genome Package Insert (document200050132).
-
Navigate to the analysis folder on your local drive and open the Consolidated Report to review the PASS/FAIL result for each QC step as follows:
- For sequencing run QC, refer to Summary Sequencing QC Result
- For FASTQ QC for each sample in the run, refer to Summary FASTQ QC Result
- For library QC for each sample in the run, refer to Summary Sample Library QC Result
If a FAIL result is observed, note the QC step and refer to the Troubleshooting section of the TruSight Whole Genome Package Insert (document
200050132).
Output File Summary
The TruSight Whole Genome Analysis Application saves the following main output
files. Refer to file information sections below for location of main output
files.
Runs and samples which do not pass validity criteria do not produce CRAM, ROH
bed, or *genome.vcf) files. QC Report Information
The Consolidated Report <
<
The report headers include the following information about the run: the app version, batch name, library pool tube ID, sequencing run name, sequencing run ID, and flow cell type. The following tables describe the information included in the Consolidated Report. The individual Sample Report includes the same information except for the Demultiplex Metrics.
Table 1 Sequencing QC Metrics
Metrics | Spec | Description |
---|---|---|
Non- | N/A | No specification since lower yield runs may result in passing |
Indexed | sample libraries. Expect ≥ 3000 Gbp for S4 and ≥ 1000 GB for S2 | |
Total Yield | flow cell. | |
(GB) | ||
Total % ≥ | ≥ 85 | Measure of base quality at the run level. Minimum |
specification is
Q30| | set since too low %Q30 runs will not pass Q30 bases in Sample
| | Library QC.
Summary| PASS or| For Sequencing QC failure, consult Troubleshooting section
in the
Sequencing| FAIL| TruSight Whole Genome Package Insert (document #
QC Result| | 200050132).
Table 2 Demultiplex Metrics
Table 3 FASTQ QC Metrics
Table 4 Sample Library QC Metrics
Metrics | Spec | Description |
---|---|---|
Average | ≥ 35 | Average coverage across the autosomes. Minimum |
autosomal | specification is set to ensure analytical performance. | |
coverage | ||
Percent of | ≥ 93.94 | Measure of coverage uniformity that detects issues not |
autosomes | necessarily related to GC bias. Minimum specification is set to | |
with coverage | ensure analytical performance. | |
greater than | ||
20X | ||
Normalized | 0.82 ≤ x ≤ | Measure of coverage uniformity that detects GC bias, |
coverage at | 1.13 | specifically a loss of coverage in areas of the genome with |
60% to 79% | higher % GC and lower % AT base composition. Minimum and | |
GC bins | maximum specifications are set to ensure analytical | |
performance. | ||
Normalized | 0.97 ≤ x ≤ | Measure of coverage uniformity that detects GC bias, |
coverage at | 1.06 | specifically a loss of coverage in areas of the genome with |
20% to 39% | lower % GC and higher % AT base composition. Minimum and | |
GC bins | maximum specifications are set to ensure analytical | |
performance. | ||
Average | ≥ 500 | Coverage of the mitochondrial chromosome. Minimum |
mitochondrial | specification is set to ensure mitochondrial SNV limit of | |
coverage | detection. | |
Percent Q30 | ≥ 85 | Measure of base quality. Minimum specification is set to |
ensure
bases| | analytical performance.
Estimated| ≤0.005| Detects contaminating reads from other samples. Maximum
sample| | specification is set to ensure mitochondrial SNV limit of
contamination| | detection (the variant type with the highest sensitivity to
| | contamination).
Summary| PASS or| For Sample Library QC failure, consult Troubleshooting
section
Sample Library| FAIL| in the TruSight Whole Genome Package Insert (document #
QC Result| | 200050132).
Table 5 Ploidy QC Metrics
Table 6 For Information Only Metrics
Variant Call File Information
VCF Files
Variant call format (*.vcf) files contain information about variants found at
specific positions in a
reference genome and can be found in the
The VCF file header includes the VCF file format version and the variant
caller version and lists the annotations used in the remainder of the file.
The last line in the header contains the column headings for the data lines.
Each of the VCF file data lines contains information about a single reference
position.
All VCF files contain a header with descriptions of output columns, and
variant call data in columns labeled as CHROM, POS, ID, REF, ALT, QUAL,
FILTER, INFO, FORMAT, SAMPLE. Definitions of column values can vary among
variant callers.
Small Variant and mSNV VCF
Output is saved under the
A genomic VCF (gVCF) file contains information on variants and positions
determined to be homozygous to the reference genome. For homozygous regions,
the gVCF file includes statistics that indicate how well reads support the
absence of variants or alternative alleles. The gVCF file includes an
artificial
DRAGEN provides post-VCF variant filtering based on annotations present in the
VCF records. Variant hard filtering is described below. However, due to the
nature of DRAGEN’s algorithms, which incorporate the hypothesis of correlated
errors from within the core of variant caller, the pipeline has improved
capabilities in distinguishing the true variants from noise, and therefore the
dependency on post-VCF filtering is substantially reduced.
The TruSight Whole Genome Analysis Application provides annotation of
confidence score and confidence tier for small variants that can be used to
further improve performance. Confidence tier annotation is not a quality
filter and as such is not directly reflected in the quality status of the
variant calls. Therefore, it is possible to see passing variant calls which
are nonetheless annotated as low confidence.
Table 7 VCF File Headings
Table 8 VCF File Annotations
Copy Number Variant VCF
The target counts stage is the first processing stage for the DRAGEN CNV
pipeline, producing
The POS column is the start position of the variant. According to the VCF
specification, if any of the ALT
alleles is a symbolic allele, such as , then the padding base is required
and POS denotes the coordinate of the base preceding the polymorphism. All
coordinates in the VCF are 1-based.
The ID column is used to represent the event. The ID field encodes the event
type and coordinates of the event.
The REF column contains an N for all CNV events.
The ALT column specifies the type of CNV event. Because the VCF contains only
CNV events, only the DEL or DUP entry is used.
The QUAL column contains an estimated quality score for the CNV event, which
is used in hard filtering.
The FILTER column contains PASS if the CNV event passes all filters, otherwise
the column contains the name of the failed filter.
The INFO column contains information representing the event. The REFLEN entry
indicates the length of the event. The SVTYPE entry is always CNV. The END
entry indicates the end position of the event. The FORMAT fields are described
in the header.
- GT—Genotype
- SM—Linear copy ratio of the segment mean
- CN—Estimated copy number
- BC—Number of bins in the region
- PE—Number of improperly paired end reads at start and stop breakpoints
Repeats VCF
ExpansionHunter performs a sequence-graph based realignment of reads that
originate inside and around each target repeat. ExpansionHunter then genotypes
the length of the repeat in each allele based on these graph alignments.
More information and analysis are available in the following ExpansionHunter
papers:
- Dolzhenko et al., Detection of long repeat expansions from PCR-free whole-genome sequence data 2017
- Dolzhenko et al., ExpansionHunter: A sequence-graph based tool to analyze variation in short tandem repeat regions 2019
The TruSight Whole Genome Analysis Application STR variant catalog contains
specifications on disease-causing repeats located in AFF2, AR, ATN1, ATXN1,
ATXN10, ATXN2, ATXN3, ATXN7, ATXN8OS, C9ORF72, CACNA1A, CBL, CNBP, CSTB,
DIP2B, DMPK, FMR1, FXN, GLS, HTT, JPH3, NIPA1, NOP56, NOTCH2NL, PABPN1,
PHOX2B, PPP2R2B, and TBP genes.
The results of repeat genotyping are output as a separate VCF file, which
provides the length of each
allele at each callable repeat defined in the repeat-specification catalog
file. The file name is
Some columns are specific to repeat expansion caller:
Table 9 Core VCF Fields
Field | Description |
---|---|
CHROM | Chromosome identifier |
POS | Position of the first base before the repeat region in the reference |
ID | Always . |
REF | The reference base at position POS |
ALT | List of repeat alleles in format |
units.
QUAL| Always .
FILTER| LowDepth filter is applied when the overall locus depth is below 10x
or the number of
| reads that span one or both breakends is below 5.
Table 10| Additional INFO Fields
Field| Description
END| Position of the last base of the repeat region in the
| reference
REF| Reference copy number
REPID| Variant ID from the variant catalog
RL| Reference length in bp
RU| Repeat unit in the reference orientation
VARID| Variant ID from the variant catalog
Table 11| GENOTYPE (Per Sample) Fields
Field| Description
AD| Allelic depths for the ref and alt alleles in the order listed
ADFL| Number of flanking reads consistent with the allele
ADIR| Number of in-repeat reads consistent with the allele
ADSP| Number of spanning reads consistent with the allele
DST| Results (+ detected, – undetected, ? undetermined) of the test
represented by the variant
GT| Genotype
LC| Locus Coverage
REPCI| Confidence interval for REPCN
The
000344.3:c.840C/T) with Spinal Muscular Atrophy (SMA) status in the following
custom fields.
Table 12 SMA Results in repeats.vcf Output File
ROH BED
Regions of homozygosity (ROH) are detected as part of the small variant
caller. The caller detects and outputs the runs of homozygosity from whole
genome calls on autosomal human chromosomes. Sex chromosomes are ignored
unless the sample sex karyotype is XX, as determined by the Ploidy Estimator.
ROH output allows downstream tools to screen for and predict consanguinity
between the parents of the proband subject.
A region is defined as consecutive variant calls on the chromosome with no
large gap in between these variants. In other words, regions are broken by
chromosome or by large gaps with no SNV calls. The gap size is set to 3
Mbases.
The ROH caller produces an ROH output file named
ID>/Analysisdirectory. Each row represents one region of homozygosity. The bed
file contains the following columns:
Chromosome Start End Score #Homozygous #Heterozygous
Where
- Score is a function of the number of homozygous and heterozygous variants, where each homozygous variant increases the score by 0.025, and each heterozygous variant reduces the score by 0.975.
- Start and end positions are a 0-based, half-open interval.
-
Homozygous is number of homozygous variants in the region.
-
Heterozygous is number of heterozygous variants in the region.
The caller also produces a metrics file named
Ploidy Estimation Metrics
The Ploidy Estimator runs by default. The Ploidy Estimator uses reads from the
mapper/aligner to calculate the sequencing depth of coverage for each autosome
and allosome in the human genome. The sex karyotype of the sample is then
estimated using the ratios of the median sex chromosome coverages to the
median autosomal coverage. XX or XY, and CONCORDANT, DISCORDANT, or ND (Not
Determined) compared to the sample data provided are reported in the
consolidated report. The detailed results, including each normalized per-
contig median coverage, is reported in the <Sample_
ID>.ploidy_estimation_metrics.csvfile.
FASTQ Files
FASTQ (.fastq.gz, .fastq.ora) is a text-based file format containing base
calls and quality values per read. Each file contains the following
information:
- The sample identifier
- The sequence
- A plus sign (+)
- The Phred quality scores in an ASCII + 33 encoded format
The software generates one FASTQ file for every sample, read, and lane. For
example, for each sample in a paired-end run, the software generates two FASTQ
files: one for Read 1 and one for Read 2. In addition to these sample FASTQ
files, the software generates two FASTQ files per lane containing all unknown
samples. FASTQ files for Index Read 1 and Index Read 2 are not generated
because the sequence is included in the header of each FASTQ entry. The file
name format is constructed from fields
specified in the sample sheet and use the file naming format of
The sample identifier is formatted as follows:
@Instrument:RunID:FlowCellID:Lane:Tile:X:Y ReadNum:FilterFlag:0:SampleNumber
Example:
@SIM:1:FCX:1:15:6329:1045 1:N:0:2
TCGCACTCAACGCCCTGCATATGACAAGACAGAATC
- <>;##=><9=AAAAAAAAAA9#:<#<;<<???#=
CRAM Files
Compressed Reference-oriented Alignment Map or CRAM files (*.cram) are stored
in the
ID>/Analysis/
CRAM files contain a header section and an alignment section:
- Header—Contains information about the entire file, such as sample name, sample length, and alignment method. Alignments in the alignments section are associated with specific information in the header section.
- Alignments—Contains read name, read sequence, read quality, alignment information, and custom tags. The read name includes the chromosome, start coordinate, alignment quality, and the match descriptor string.
The alignments section includes the following information for each read or read pair:
- AS: Paired-end alignment quality.
- RG: Read group, which indicates the number of reads for a specific sample.
- BC: Barcode tag, which indicates the demultiplexed sample ID associated with the read.
- SM: Single-end alignment quality.
- XC: Match descriptor string.
- XN: Amplicon name tag, which records the amplicon ID associated with the read
To view alignment records, samtoolscan be used as samtools view –reference
Technical Assistance
For technical assistance, contact Illumina Technical Support.
- Website : www.illumina.com
- Email : techsupport@illumina.com
Safety data sheets (SDSs)—Available on the Illumina website at support.illumina.com/sds.html. Product documentation—Available for download from support.illumina.com.
Illumina, Inc.
5200 Illumina Way
San Diego, California 92122 U.S.A.
+ 1.800.809.ILMN (4566)
+ 1.858.202.4566 (outside North America)
techsupport@illumina.com
www.illumina.com
Australian Sponsor
Illumina Australia Pty Ltd Nursing Association Building Level 3, 535 Elizabeth
Street Melbourne, VIC 3000 Australia
FOR IN VITRO DIAGNOSTIC USE.
© 2024 Illumina, Inc. All rights reserved.
References
- Support Center
- Safety Data Sheets (SDS)
- Illumina | Sequencing and array solutions to fuel genomic discoveries
- Illumina | Sequencing and array solutions to fuel genomic discoveries
- Legal | Legal Information for this website
- Support Center
Read User Manual Online (PDF format)
Read User Manual Online (PDF format) >>