Package vcf
Class VcfRec
- java.lang.Object
  - vcf.VcfRec
- All Implemented Interfaces:
  IntArray, DuplicatesGTRec, GTRec, MarkerContainer
public final class VcfRec extends java.lang.Object implements GTRec
Class VcfRec represents a VCF record. If one allele in a diploid genotype is missing, then both alleles are set to missing.
Instances of class VcfRec are immutable.
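The missing-allele rule stated above can be illustrated with a minimal, self-contained sketch. The class `GtParseSketch` and its methods are hypothetical names introduced for illustration; this is not the parsing code used by `VcfRec`, and it assumes a diploid GT value with a single `|` or `/` separator and `.` as the missing-allele symbol.

```java
/** Illustrative sketch only; not the VcfRec implementation. */
final class GtParseSketch {

    /** Parses one allele token; "." is treated as missing and reported as -1. */
    static int parseAllele(String token) {
        return token.equals(".") ? -1 : Integer.parseInt(token);
    }

    /**
     * Parses a diploid GT value such as "0|1" or "0/." into two allele indices.
     * If either allele is missing, both alleles are reported as -1.
     */
    static int[] parseDiploidGT(String gt) {
        // At most one of the two separators is present, so the larger index is the separator.
        int sep = Math.max(gt.indexOf('|'), gt.indexOf('/'));
        if (sep < 0) {
            throw new IllegalArgumentException("not a diploid genotype: " + gt);
        }
        int a1 = parseAllele(gt.substring(0, sep));
        int a2 = parseAllele(gt.substring(sep + 1));
        if (a1 < 0 || a2 < 0) {
            return new int[] {-1, -1};
        }
        return new int[] {a1, a2};
    }
}
```

With this sketch, `parseDiploidGT("0/.")` yields both alleles as -1, while `parseDiploidGT("1|0")` keeps the alleles 1 and 0.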
Method Summary
Modifier and Type | Method | Description
int | allele1(int sample) | Returns the first allele for the specified sample or -1 if the allele is missing.
int | allele2(int sample) | Returns the second allele for the specified sample or -1 if the allele is missing.
int[] | alleles() | Returns an array of length this.size() whose j-th element is equal to this.get(j).
java.lang.String | filter() | Returns the FILTER field.
java.lang.String | format() | Returns the FORMAT field.
java.lang.String[] | formatData(java.lang.String formatCode) | Returns an array of length this.size() containing the specified FORMAT subfield data for each sample.
int | formatIndex(java.lang.String formatCode) | Returns the index of the specified FORMAT subfield if the specified subfield is defined for this VCF record, and returns -1 otherwise.
java.lang.String | formatSubfield(int subfieldIndex) | Returns the specified FORMAT subfield.
static VcfRec | fromGL(VcfHeader vcfHeader, java.lang.String vcfRecord, float maxLR) | Constructs and returns a new VcfRec instance from a VCF record and its GL or PL format subfield data.
static VcfRec | fromGT(VcfHeader vcfHeader, java.lang.String vcfRecord) | Constructs and returns a new VcfRec instance from a VCF record and its GT format subfield data.
static VcfRec | fromGTGL(VcfHeader vcfHeader, java.lang.String vcfRecord, float maxLR) | Constructs and returns a new VcfRec instance from a VCF record and its GT, GL, and PL format subfield data.
int | get(int hap) | Returns the specified allele for the specified haplotype or -1 if the allele is missing.
float | gl(int sample, int allele1, int allele2) | Returns the probability of the observed data for the specified sample if the specified pair of ordered alleles is the true ordered genotype.
static int | gtIndex(int a1, int a2) | Returns the VCF genotype index for the specified pair of alleles.
boolean | hasFormat(java.lang.String formatCode) | Returns true if the specified FORMAT subfield is present, and returns false otherwise.
java.lang.String | info() | Returns the INFO field.
boolean | isPhased() | Returns true if every genotype for each sample is a phased, non-missing genotype, and returns false otherwise.
boolean | isPhased(int sample) | Returns true if the genotype for the specified sample has non-missing alleles and is either haploid or diploid with a phased allele separator, and returns false otherwise.
Marker | marker() | Returns the marker.
int | nFormatSubfields() | Returns the number of FORMAT subfields.
java.lang.String | qual() | Returns the QUAL field.
java.lang.String | sampleData(int sample) | Returns the data for the specified sample.
java.lang.String | sampleData(int sample, int subfieldIndex) | Returns the specified data for the specified sample.
java.lang.String | sampleData(int sample, java.lang.String formatCode) | Returns the specified data for the specified sample.
Samples | samples() | Returns the list of samples.
int | size() | Returns the number of haplotypes.
java.lang.String | toString() | Returns the VCF record.
VcfHeader | vcfHeader() | Returns the VCF meta-information lines and the VCF header line.
Field Detail
-
GL_FORMAT
public static final java.lang.String GL_FORMAT
The VCF FORMAT code for log-scaled genotype likelihood data: "GL".
- See Also:
  Constant Field Values
-
PL_FORMAT
public static final java.lang.String PL_FORMAT
The VCF FORMAT code for phred-scaled genotype likelihood data: "PL".
- See Also:
  Constant Field Values
-
-
Method Detail
-
gtIndex
public static int gtIndex(int a1, int a2)
Returns the VCF genotype index for the specified pair of alleles.
- Parameters:
  a1 - the first allele
  a2 - the second allele
- Returns:
  the VCF genotype index for the specified pair of alleles
- Throws:
  java.lang.IllegalArgumentException - if a1 < 0 || a2 < 0
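As a rough illustration of what a VCF genotype index is, the sketch below uses the ordering defined in the VCF specification for diploid genotypes, where the unordered pair (j, k) with j <= k has index k(k+1)/2 + j. The class name `GtIndexSketch` is hypothetical, and it is an assumption that `VcfRec.gtIndex` follows exactly this formula.

```java
/** Illustrative sketch only; assumes the VCF-specification genotype ordering. */
final class GtIndexSketch {

    /** Returns the index of the unordered diploid genotype {a1, a2}. */
    static int gtIndex(int a1, int a2) {
        if (a1 < 0 || a2 < 0) {
            throw new IllegalArgumentException("negative allele index: " + a1 + ", " + a2);
        }
        int lo = Math.min(a1, a2);
        int hi = Math.max(a1, a2);
        // Genotypes are ordered 0/0, 0/1, 1/1, 0/2, 1/2, 2/2, ...
        return hi * (hi + 1) / 2 + lo;
    }
}
```

Under this ordering the index does not depend on allele order, so `gtIndex(0, 1)` and `gtIndex(1, 0)` both give 1, and `gtIndex(2, 2)` gives 5.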
-
fromGT
public static VcfRec fromGT(VcfHeader vcfHeader, java.lang.String vcfRecord)
Constructs and returns a new VcfRec instance from a VCF record and its GT format subfield data.
- Parameters:
  vcfHeader - meta-information lines and header line for the specified VCF record
  vcfRecord - a VCF record with a GT format field corresponding to the specified vcfHeader object
- Returns:
  a new VcfRec instance
- Throws:
  java.lang.IllegalArgumentException - if the VCF record does not have a GT format field
  java.lang.IllegalArgumentException - if a VCF record format error is detected
  java.lang.IllegalArgumentException - if there are not vcfHeader.nHeaderFields() tab-delimited fields in the specified VCF record
  java.lang.NullPointerException - if vcfHeader == null || vcfRecord == null
-
fromGL
public static VcfRec fromGL(VcfHeader vcfHeader, java.lang.String vcfRecord, float maxLR)
Constructs and returns a new VcfRec instance from a VCF record and its GL or PL format subfield data. If both GL and PL format subfields are present, the GL format field will be used. If the maximum normalized genotype likelihood is 1.0 for a sample, then any other genotype likelihood for the sample that is less than lrThreshold is set to 0.
- Parameters:
  vcfHeader - meta-information lines and header line for the specified VCF record
  vcfRecord - a VCF record with a GL format field corresponding to the specified vcfHeader object
  maxLR - the maximum likelihood ratio
- Returns:
  a new VcfRec instance
- Throws:
  java.lang.IllegalArgumentException - if the VCF record does not have a GL format field
  java.lang.IllegalArgumentException - if a VCF record format error is detected
  java.lang.IllegalArgumentException - if there are not vcfHeader.nHeaderFields() tab-delimited fields in the specified VCF record
  java.lang.NullPointerException - if vcfHeader == null || vcfRecord == null
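The normalization and thresholding described above might look roughly like the following sketch. It assumes the standard VCF meanings of the two subfields (GL is a log10-scaled likelihood, PL is -10 times the log10 likelihood) and scales likelihoods so that the largest is 1.0. The class `GlNormalizeSketch` and its methods are hypothetical, and the sketch takes the threshold as an explicit `lrThreshold` argument because the documentation does not state how `lrThreshold` is derived from `maxLR`.

```java
/** Illustrative sketch only; not the VcfRec implementation. */
final class GlNormalizeSketch {

    /** Converts log10-scaled GL values to likelihoods scaled so that the maximum is 1.0. */
    static float[] fromGL(float[] log10Likelihoods) {
        float max = Float.NEGATIVE_INFINITY;
        for (float v : log10Likelihoods) {
            max = Math.max(max, v);
        }
        float[] out = new float[log10Likelihoods.length];
        for (int j = 0; j < out.length; ++j) {
            out[j] = (float) Math.pow(10.0, log10Likelihoods[j] - max);
        }
        return out;
    }

    /** Converts phred-scaled PL values (PL = -10 * log10 likelihood) to normalized likelihoods. */
    static float[] fromPL(int[] pl) {
        float[] log10Likelihoods = new float[pl.length];
        for (int j = 0; j < pl.length; ++j) {
            log10Likelihoods[j] = -pl[j] / 10.0f;
        }
        return fromGL(log10Likelihoods);
    }

    /** Sets every normalized likelihood below lrThreshold to 0 (threshold supplied by the caller). */
    static void applyThreshold(float[] normalized, float lrThreshold) {
        for (int j = 0; j < normalized.length; ++j) {
            if (normalized[j] < lrThreshold) {
                normalized[j] = 0.0f;
            }
        }
    }
}
```

For example, PL values 0, 10, and 20 normalize to approximately 1.0, 0.1, and 0.01, and a caller-supplied threshold of 0.05 would then zero out the last value.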
-
fromGTGL
public static VcfRec fromGTGL(VcfHeader vcfHeader, java.lang.String vcfRecord, float maxLR)
Constructs and returns a new VcfRec instance from a VCF record and its GT, GL, and PL format subfield data. If the GT format subfield is present and non-missing, the GT format subfield is used to determine genotype likelihoods. Otherwise the GL or PL format subfield is used to determine genotype likelihoods. If both the GL and PL format subfields are present, only the GL format subfield will be used. If the maximum normalized genotype likelihood is 1.0 for a sample, then any other genotype likelihood for the sample that is less than lrThreshold is set to 0.
- Parameters:
  vcfHeader - meta-information lines and header line for the specified VCF record
  vcfRecord - a VCF record with a GT, a GL or a PL format field corresponding to the specified vcfHeader object
  maxLR - the maximum likelihood ratio
- Returns:
  a new VcfRec instance
- Throws:
  java.lang.IllegalArgumentException - if the VCF record does not have a GT, GL, or PL format field
  java.lang.IllegalArgumentException - if a VCF record format error is detected
  java.lang.IllegalArgumentException - if there are not vcfHeader.nHeaderFields() tab-delimited fields in the specified VCF record
  java.lang.NullPointerException - if vcfHeader == null || vcfRecord == null
-
qual
public java.lang.String qual()
Returns the QUAL field.
- Returns:
  the QUAL field
-
filter
public java.lang.String filter()
Returns the FILTER field.
- Returns:
  the FILTER field
-
info
public java.lang.String info()
Returns the INFO field.
- Returns:
  the INFO field
-
format
public java.lang.String format()
Returns the FORMAT field. Returns the empty string ("") if the FORMAT field is missing.
- Returns:
  the FORMAT field
-
nFormatSubfields
public int nFormatSubfields()
Returns the number of FORMAT subfields.
- Returns:
  the number of FORMAT subfields
-
formatSubfield
public java.lang.String formatSubfield(int subfieldIndex)
Returns the specified FORMAT subfield.
- Parameters:
  subfieldIndex - a FORMAT subfield index
- Returns:
  the specified FORMAT subfield
- Throws:
  java.lang.IndexOutOfBoundsException - if subfieldIndex < 0 || subfieldIndex >= this.nFormatSubfields()
-
hasFormat
public boolean hasFormat(java.lang.String formatCode)
Returns true if the specified FORMAT subfield is present, and returns false otherwise.
- Parameters:
  formatCode - a FORMAT subfield code
- Returns:
  true if the specified FORMAT subfield is present
-
formatIndex
public int formatIndex(java.lang.String formatCode)
Returns the index of the specified FORMAT subfield if the specified subfield is defined for this VCF record, and returns -1 otherwise.
- Parameters:
  formatCode - the FORMAT subfield code
- Returns:
  the index of the specified FORMAT subfield if the specified subfield is defined for this VCF record, and -1 otherwise
-
sampleData
public java.lang.String sampleData(int sample)
Returns the data for the specified sample.
- Parameters:
  sample - a sample index
- Returns:
  the data for the specified sample
- Throws:
  java.lang.IndexOutOfBoundsException - if sample < 0 || sample >= this.size()
-
sampleData
public java.lang.String sampleData(int sample, java.lang.String formatCode)
Returns the specified data for the specified sample.
- Parameters:
  sample - a sample index
  formatCode - a FORMAT subfield code
- Returns:
  the specified data for the specified sample
- Throws:
  java.lang.IllegalArgumentException - if this.hasFormat(formatCode) == false
  java.lang.IndexOutOfBoundsException - if sample < 0 || sample >= this.size()
-
sampleData
public java.lang.String sampleData(int sample, int subfieldIndex)
Returns the specified data for the specified sample.
- Parameters:
  sample - a sample index
  subfieldIndex - a FORMAT subfield index
- Returns:
  the specified data for the specified sample
- Throws:
  java.lang.IndexOutOfBoundsException - if subfieldIndex < 0 || subfieldIndex >= this.nFormatSubfields()
  java.lang.IndexOutOfBoundsException - if sample < 0 || sample >= this.size()
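The FORMAT lookups described above (formatIndex and the sampleData overloads) can be pictured with a small string-based sketch. It relies only on the VCF convention that FORMAT codes and per-sample values are colon-delimited and positionally aligned. The class `FormatLookupSketch` is a hypothetical helper operating on raw strings, not on a `VcfRec`, and throwing when a sample field has fewer values than the FORMAT field is a choice made for this sketch.

```java
/** Illustrative sketch only; operates on raw VCF strings, not on VcfRec. */
final class FormatLookupSketch {

    /** Returns the index of formatCode in a colon-delimited FORMAT field, or -1 if absent. */
    static int formatIndex(String format, String formatCode) {
        String[] codes = format.split(":");
        for (int j = 0; j < codes.length; ++j) {
            if (codes[j].equals(formatCode)) {
                return j;
            }
        }
        return -1;
    }

    /** Returns the value of formatCode within one sample's colon-delimited field. */
    static String sampleData(String format, String sampleField, String formatCode) {
        int index = formatIndex(format, formatCode);
        if (index < 0) {
            throw new IllegalArgumentException("missing FORMAT subfield: " + formatCode);
        }
        // A limit of -1 keeps trailing empty values so positions stay aligned.
        String[] values = sampleField.split(":", -1);
        if (index >= values.length) {
            throw new IllegalArgumentException("sample field has no value for: " + formatCode);
        }
        return values[index];
    }
}
```

For a FORMAT field `GT:DP:GL` and a sample field `0|1:12:-0.1,-1.0,-5.0`, the sketch returns index 1 for `DP` and the value `12`.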
-
formatData
public java.lang.String[] formatData(java.lang.String formatCode)
Returns an array of length this.size() containing the specified FORMAT subfield data for each sample. The k-th element of the array is the specified FORMAT subfield data for the k-th sample.
- Parameters:
  formatCode - a FORMAT subfield code
- Returns:
  an array of length this.size() containing the specified FORMAT subfield data for each sample
- Throws:
  java.lang.IllegalArgumentException - if this.hasFormat(formatCode) == false
-
samples
public Samples samples()
Description copied from interface: GTRec
Returns the list of samples.
-
vcfHeader
public VcfHeader vcfHeader()
Returns the VCF meta-information lines and the VCF header line.
- Returns:
  the VCF meta-information lines and the VCF header line
-
marker
public Marker marker()
Description copied from interface: MarkerContainer
Returns the marker.
- Specified by:
  marker in interface MarkerContainer
- Returns:
  the marker
-
allele1
public int allele1(int sample)
Description copied from interface: DuplicatesGTRec
Returns the first allele for the specified sample or -1 if the allele is missing. The two alleles for a sample are arbitrarily ordered if this.unphased(marker, sample) == false.
- Specified by:
  allele1 in interface DuplicatesGTRec
- Parameters:
  sample - a sample index
- Returns:
  the first allele for the specified sample
-
allele2
public int allele2(int sample)
Description copied from interface: DuplicatesGTRec
Returns the second allele for the specified sample or -1 if the allele is missing. The two alleles for a sample are arbitrarily ordered if this.unphased(marker, sample) == false.
- Specified by:
  allele2 in interface DuplicatesGTRec
- Parameters:
  sample - a sample index
- Returns:
  the second allele for the specified sample
-
get
public int get(int hap)
Description copied from interface: DuplicatesGTRec
Returns the specified allele for the specified haplotype or -1 if the allele is missing. The two alleles for a sample at a marker are arbitrarily ordered if this.unphased(marker, hap/2) == false.
- Specified by:
  get in interface DuplicatesGTRec
- Specified by:
  get in interface IntArray
- Parameters:
  hap - a haplotype index
- Returns:
  the specified allele for the specified haplotype
-
alleles
public int[] alleles()
Description copied from interface: DuplicatesGTRec
Returns an array of length this.size() whose j-th element is equal to this.get(j).
- Specified by:
  alleles in interface DuplicatesGTRec
- Returns:
  an array of length this.size() whose j-th element is equal to this.get(j)
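The relationship between sample-indexed alleles and haplotype-indexed alleles can be sketched as follows. The description of get(int hap) refers to hap/2 as the sample for a haplotype; the sketch additionally assumes that even haplotype indices map to the first allele and odd indices to the second, which this page does not state. `HapIndexSketch` and its array arguments are hypothetical.

```java
/** Illustrative sketch only; the even/odd haplotype convention is an assumption. */
final class HapIndexSketch {

    /** Returns the number of haplotypes: two per diploid sample. */
    static int size(int[] allele1) {
        return 2 * allele1.length;
    }

    /** Returns the allele carried by haplotype hap, where hap / 2 is the sample index. */
    static int get(int[] allele1, int[] allele2, int hap) {
        if (hap < 0 || hap >= size(allele1)) {
            throw new IndexOutOfBoundsException("hap: " + hap);
        }
        int sample = hap / 2;
        return (hap % 2 == 0) ? allele1[sample] : allele2[sample];
    }

    /** Returns an array whose j-th element is the allele on haplotype j. */
    static int[] alleles(int[] allele1, int[] allele2) {
        int[] out = new int[size(allele1)];
        for (int j = 0; j < out.length; ++j) {
            out[j] = get(allele1, allele2, j);
        }
        return out;
    }
}
```

With first alleles {0, 1} and second alleles {1, 0} for two samples, the sketch produces four haplotype alleles in the order 0, 1, 1, 0.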
-
isPhased
public boolean isPhased(int sample)
Description copied from interface: DuplicatesGTRec
Returns true if the genotype for the specified sample has non-missing alleles and is either haploid or diploid with a phased allele separator, and returns false otherwise.
- Specified by:
  isPhased in interface DuplicatesGTRec
- Parameters:
  sample - a sample index
- Returns:
  true if the genotype for the specified sample is a phased, non-missing genotype
-
isPhased
public boolean isPhased()
Description copied from interface: DuplicatesGTRec
Returns true if every genotype for each sample is a phased, non-missing genotype, and returns false otherwise.
- Specified by:
  isPhased in interface DuplicatesGTRec
- Returns:
  true if the genotype for each sample is a phased, non-missing genotype
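The two isPhased conditions can be expressed on raw GT strings as in the sketch below. It assumes the VCF conventions that `|` separates phased alleles, `/` separates unphased alleles, and `.` marks a missing allele; `PhaseCheckSketch` is a hypothetical helper, not part of the vcf package.

```java
/** Illustrative sketch only; works on raw GT strings rather than on VcfRec. */
final class PhaseCheckSketch {

    /**
     * Returns true if the GT value has no missing allele and is either
     * haploid (no separator) or diploid with the phased separator '|'.
     */
    static boolean isPhased(String gt) {
        if (gt.isEmpty() || gt.indexOf('.') >= 0) {
            return false;   // empty or missing allele
        }
        return gt.indexOf('/') < 0;   // '/' marks an unphased genotype
    }

    /** Returns true only if every GT value is phased and non-missing. */
    static boolean allPhased(String[] gts) {
        for (String gt : gts) {
            if (!isPhased(gt)) {
                return false;
            }
        }
        return true;
    }
}
```

Here `0|1` and the haploid value `1` count as phased, while `0/1` and `.|.` do not.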
-
gl
public float gl(int sample, int allele1, int allele2)
Returns the probability of the observed data for the specified sample if the specified pair of ordered alleles is the true ordered genotype. Returns 1.0f if the corresponding genotype determined by the isPhased(), allele1(), and allele2() methods is consistent with the specified ordered genotype, and returns 0.0f otherwise.
- Parameters:
  sample - the sample index
  allele1 - the first allele index
  allele2 - the second allele index
- Returns:
  the probability of the observed data for the specified sample if the specified pair of ordered alleles is the true ordered genotype
- Throws:
  java.lang.IndexOutOfBoundsException - if sample < 0 || sample >= this.size()
  java.lang.IndexOutOfBoundsException - if allele1 < 0 || allele1 >= this.marker().nAlleles()
  java.lang.IndexOutOfBoundsException - if allele2 < 0 || allele2 >= this.marker().nAlleles()
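One plausible reading of "consistent with the specified ordered genotype" is sketched below: a phased genotype matches only in the stored order, an unphased genotype matches in either order, and a missing genotype is consistent with every ordered genotype. The treatment of missing genotypes is an assumption, and `GtLikelihoodSketch` is a hypothetical helper that takes the stored genotype as explicit arguments rather than reading it from a `VcfRec`.

```java
/** Illustrative sketch only; the handling of missing genotypes is an assumption. */
final class GtLikelihoodSketch {

    /**
     * Returns 1.0f if the stored genotype (a1, a2, phased) is consistent with the
     * ordered genotype (allele1, allele2), and 0.0f otherwise. A stored allele of -1
     * denotes a missing allele.
     */
    static float gl(boolean phased, int a1, int a2, int allele1, int allele2) {
        if (a1 < 0 || a2 < 0) {
            return 1.0f;   // assumption: a missing genotype carries no information
        }
        if (a1 == allele1 && a2 == allele2) {
            return 1.0f;   // same order always matches
        }
        if (!phased && a1 == allele2 && a2 == allele1) {
            return 1.0f;   // unphased genotypes also match in swapped order
        }
        return 0.0f;
    }
}
```

For a stored unphased genotype 0/1, both ordered genotypes (0, 1) and (1, 0) receive 1.0f; for a stored phased genotype 0|1, only (0, 1) does.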
-
size
public int size()
Description copied from interface: DuplicatesGTRec
Returns the number of haplotypes.
- Specified by:
  size in interface DuplicatesGTRec
- Specified by:
  size in interface IntArray
- Returns:
  the number of haplotypes
-
toString
public java.lang.String toString()
Returns the VCF record.
- Overrides:
  toString in class java.lang.Object
- Returns:
  the VCF record