The sequencing reads are analyzed in two stages. First, Cutadapt demultiplexes the FASTQ files into smaller FASTQ files, one set per well barcode. Then, RSEM quantifies the expression of the T-cell or B-cell marker genes for the DriverMap™ scAIR TCR-Mark36 and scAIR BCR-Mark30 Phenotyping Kits.
1. The analysis requires the following files from Cellecta (right-click each link and select Save Link As):
• Barcode FASTA: Cellecta_96well_Barcodes.fasta
• Marker Gene FASTA: scAIR_mark30_BCR.fasta OR scAIR_mark36_TCR.fasta
• Marker Gene GTF: scAIR_mark30_BCR.gtf OR the corresponding scAIR TCR-Mark36 GTF
2. Follow the instructions to install RSEM and Cutadapt.
3. Demultiplex the reads with Cutadapt. The well barcodes are matched at the start of the reverse reads, so $READ_2 is passed first:
cutadapt -e 1 -g ^file:Cellecta_96well_Barcodes.fasta \
-o $OUTPUT_DIR/$NAME-{name}.1.fastq.gz \
-p $OUTPUT_DIR/$NAME-{name}.2.fastq.gz \
$READ_2 \
$READ_1
where:
$READ_1, $READ_2 = paired-end read file names (e.g. sample_a_R1.fastq.gz, sample_a_R2.fastq.gz)
$OUTPUT_DIR = the desired output directory (e.g. ~/Documents/sample_a_fqs)
$NAME = the desired output sample name (e.g. sample_a); $NAME is distinct from {name}, a Cutadapt placeholder that is replaced with each barcode's name from the barcode FASTA and should not be changed in the command
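The demultiplexing command above can be wrapped in a small shell function so that it is easy to reuse across samples. This is a minimal sketch, not part of the Cellecta protocol: the function name demux_sample and the DRY_RUN and BARCODES variables are hypothetical conveniences, and only the Cutadapt options already shown in step 3 are used.

```shell
# Sketch: wrap the step-3 Cutadapt call in a reusable function.
# demux_sample, DRY_RUN and BARCODES are hypothetical helper names,
# not Cutadapt features.
demux_sample() {
    read_1=$1; read_2=$2; output_dir=$3; name=$4
    barcodes=${BARCODES:-Cellecta_96well_Barcodes.fasta}

    # {name} is left literal on purpose: Cutadapt replaces it with each
    # barcode name. The reverse read is listed first, as in step 3.
    set -- cutadapt -e 1 -g "^file:${barcodes}" \
        -o "${output_dir}/${name}-{name}.1.fastq.gz" \
        -p "${output_dir}/${name}-{name}.2.fastq.gz" \
        "$read_2" "$read_1"

    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Print the command instead of running it.
        printf '%s\n' "$*"
    else
        mkdir -p "$output_dir" && "$@"
    fi
}

# Example call (file names are placeholders):
# demux_sample sample_a_R1.fastq.gz sample_a_R2.fastq.gz ~/Documents/sample_a_fqs sample_a
```

Setting DRY_RUN=1 before the call prints the assembled command, which is a quick way to check the paths before starting a long run.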
4. Build a reference index using the scAIR TCR-Mark36 or scAIR BCR-Mark30 reference FASTA and GTF files:
rsem-prepare-reference --gtf $GTF_FILE --bowtie2 $FASTA $OUTPUT_DIR
where:
$FASTA = the reference FASTA file
$GTF_FILE = the reference GTF file
$OUTPUT_DIR = the reference name, which RSEM uses as the path prefix for the index files it writes (e.g. ~/Documents/RSEM_refs/TCR)
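The index build can be scripted in the same way. The sketch below assumes that bowtie2-build is on the PATH, which RSEM needs for the --bowtie2 option. The function name build_reference and the DRY_RUN variable are hypothetical helpers. The parent directory of the reference name is created first in case it does not yet exist.

```shell
# Sketch: build the RSEM/Bowtie 2 index for one marker panel.
# build_reference and DRY_RUN are hypothetical helper names,
# not RSEM features.
build_reference() {
    gtf_file=$1; fasta=$2; ref_prefix=$3

    set -- rsem-prepare-reference --gtf "$gtf_file" --bowtie2 "$fasta" "$ref_prefix"

    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Print the command instead of running it.
        printf '%s\n' "$*"
        return 0
    fi

    # The index files are named after the prefix, so make sure the
    # directory that will hold them exists before calling RSEM.
    mkdir -p "$(dirname "$ref_prefix")" && "$@"
}

# Example call (paths are placeholders):
# build_reference scAIR_mark30_BCR.gtf scAIR_mark30_BCR.fasta ~/Documents/RSEM_refs/BCR
```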
5. Align the forward reads to the reference and quantify marker gene expression:
rsem-calculate-expression --bowtie2 \
$READ1 \
$INDEX \
$OUTPUT_DIR
where:
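$READ1 = the forward-read file for one well after demultiplexing (e.g. well_A01.fastq.gz); because step 3 passes $READ_2 to Cutadapt first, the forward reads are written to the -p output files ($NAME-{name}.2.fastq.gz)
$INDEX = the RSEM reference name built in step 4 (e.g. ~/Documents/RSEM_refs/TCR)
$OUTPUT_DIR = the sample name, which RSEM uses as the path prefix for its output files (e.g. ~/Documents/RSEM/sample_a/well_A01)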
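Because step 5 is run once per well, it is convenient to loop over the demultiplexed files. The sketch below is one possible way to do this and rests on two assumptions. First, the files follow the $NAME-{name}.N.fastq.gz naming from step 3. Second, the forward reads are in the .2.fastq.gz files, since Cutadapt writes its second input to the -p output; pass 1 as the fifth argument if your files are arranged differently. The function names and the DRY_RUN variable are hypothetical. Files named "unknown", where Cutadapt places reads with no matching barcode, are skipped.

```shell
# Sketch: run rsem-calculate-expression for every demultiplexed well.
# well_from_fastq, quantify_wells and DRY_RUN are hypothetical helper names.

# Recover the barcode name that Cutadapt substituted for {name}.
# $1 = FASTQ path, $2 = the $NAME used in step 3.
well_from_fastq() {
    file=$(basename "$1")          # drop the directory
    file=${file%.[12].fastq.gz}    # drop ".1.fastq.gz" or ".2.fastq.gz"
    well=${file#"$2"-}             # drop the "<NAME>-" prefix
    printf '%s\n' "$well"
}

# $1 = demultiplexed FASTQ directory, $2 = $NAME, $3 = RSEM reference name,
# $4 = results directory, $5 = mate number holding forward reads (default 2).
quantify_wells() {
    demux_dir=$1; name=$2; index=$3; out_root=$4; mate=${5:-2}

    for fq in "$demux_dir/$name"-*."$mate".fastq.gz; do
        [ -e "$fq" ] || continue               # no files matched the pattern
        well=$(well_from_fastq "$fq" "$name")
        if [ "$well" = "unknown" ]; then
            continue                           # reads without a barcode match
        fi

        # The last argument is the RSEM sample name (output file prefix).
        set -- rsem-calculate-expression --bowtie2 "$fq" "$index" "$out_root/$well"
        if [ "${DRY_RUN:-0}" = "1" ]; then
            printf '%s\n' "$*"
        else
            mkdir -p "$out_root" && "$@"
        fi
    done
}

# Example call (paths are placeholders):
# quantify_wells ~/Documents/sample_a_fqs sample_a ~/Documents/RSEM_refs/TCR ~/Documents/RSEM/sample_a
```

With this layout, each well's results share the results directory and are distinguished by the barcode name in the file prefix.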