AnnotStruct = ilmnbslookup(AnnotationFile, ID)
AnnotStruct = ilmnbslookup(AnnotationFile, ID, 'LookUpField', LookUpFieldValue)
| Argument | Description |
|---|---|
| AnnotationFile | String specifying a file name or a path and file name of an Illumina® annotation file (CSV, BGX, or TXT format). If you specify only a file name, that file must be on the MATLAB search path or in the current directory. |
| ID | String or cell array of strings representing the unique identifiers for one or more targets (probes) on an Illumina microarray. |
| LookUpFieldValue | Field in AnnotationFile where ilmnbslookup looks for the specified ID. Default is the Search_key field. |
| AnnotStruct | Structure containing the probe sequence and annotation information for the one or more targets (probes) specified by ID, as found in AnnotationFile, an Illumina annotation file. AnnotStruct contains the same fields as AnnotationFile. The fields are described in the following two tables. |
AnnotStruct = ilmnbslookup(AnnotationFile, ID) returns AnnotStruct, a structure containing probe sequence and annotation information for the one or more targets (probes) specified by ID, as found in AnnotationFile, an Illumina annotation file (CSV, BGX, or TXT format).
AnnotStruct contains the same fields as AnnotationFile. The fields are described in the following two tables.
Structure Created from Illumina CSV Annotation File
| Field | Description |
|---|---|
| Search_key | Internal identifier for the target, useful for custom design arrays |
| Target | Unique identifier for the target |
| ProbeId | Illumina probe identifier |
| Gid | GenBank identifier for the gene |
| Transcript | Illumina internal transcript identifier |
| Accession | GenBank accession number for the gene |
| Symbol | Typically, the gene symbol |
| Type | Probe type |
| Start | Starting position of the probe sequence in the GenBank record |
| Probe_Sequence | Sequence of the probe |
| Definition | Definition field from the GenBank record |
| Ontology | Gene Ontology terms associated with the gene |
| Synonym | Synonyms for the gene (from the GenBank record) |
Structure Created from a BGX or TXT Annotation File
| Field | Description |
|---|---|
| Accession | GenBank accession number for the gene |
| Array_Address_Id | Decoder identifier |
| Chromosome | Chromosome on which the gene is located |
| Cytoband | Cytogenetic banding region of the chromosome on which the gene associated with the target is located |
| Definition | Definition field from the GenBank record |
| Entrez_Gene_ID | Entrez Gene database identifier for the gene |
| GI | GenBank identifier for the gene |
| ILMN_Gene | Illumina internal gene symbol |
| Obsolete_Probe_Id | Probe identifier before BGX annotation files |
| Ontology_Component | Gene Ontology cellular components associated with the gene |
| Ontology_Function | Gene Ontology molecular functions associated with the gene |
| Ontology_Process | Gene Ontology biological processes associated with the gene |
| Probe_Chr_Orientation | Orientation of the probe on the NCBI genome build |
| Probe_Coordinates | Genomic position of the probe on the NCBI genome build |
| Probe_Id | Illumina probe identifier |
| Probe_Sequence | Sequence of the probe |
| Probe_Start | Start position of the probe relative to the 5' end of the source transcript sequence |
| Probe_Type | Information about what the probe is targeting |
| Protein_Product | NCBI protein accession number |
| RefSeq_ID | Identifier from the NCBI RefSeq database |
| Reporter_Composite_map | Information associated with control probes |
| Reporter_Group_Name | Information associated with control probes |
| Reporter_Group_id | Information associated with control probes |
| Search_Key | Internal identifier for the target, useful for custom design arrays |
| Source | Source from which the transcript sequence was obtained |
| Source_Reference_ID | Source's identifier |
| Species | Species associated with the gene |
| Symbol | Typically, the gene symbol |
| Synonyms | Synonyms for the gene (from the GenBank record) |
| Transcript | Illumina internal transcript identifier |
| Unigene_ID | Identifier from the NCBI UniGene database |
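
The example output further down this page shows values such as Probe_Start and Probe_Coordinates as character arrays rather than numbers. The following is a minimal sketch of converting them for numeric work. It assumes that Probe_Coordinates holds a single 'start-end' range, and it uses a hand-built stand-in structure that copies three values from that example output instead of calling ilmnbslookup.

```matlab
% Stand-in for a single-target result; the three values are copied from
% the example output on this page (this is not a call to ilmnbslookup).
annotation = struct();
annotation.Probe_Start = '4889';
annotation.Probe_Coordinates = '8920412-8920461';
annotation.Probe_Sequence = 'TGTAATCGCAGCCCCTTGGAAGGCCAAGGCAGGAGAATCGCCTCAACACT';

% Convert the transcript-relative start position from text to a number.
probeStart = str2double(annotation.Probe_Start);

% Parse the genomic range; assumes a single 'start-end' pair.
coords = sscanf(annotation.Probe_Coordinates, '%d-%d');   % [start; end]

% Length of the genomic span, counting both end points.
spanLength = coords(2) - coords(1) + 1;

% Check the span against the length of the probe sequence.
matchesSequence = (spanLength == numel(annotation.Probe_Sequence));
```

For these values, spanLength is 50, which equals the length of the probe sequence.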
AnnotStruct = ilmnbslookup(AnnotationFile, ID, 'LookUpField', LookUpFieldValue) looks for ID in the annotation file in the field specified by LookUpFieldValue. Default is the Search_key field.
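
As a sketch of this syntax with a field other than the default, the following looks up a target by gene symbol. The file name and the symbol A2ML1 are taken from the examples on this page. The availability check and the variable names canRun and bySymbol are assumptions added for illustration, because the annotation file is not provided with the toolbox.

```matlab
% Annotation file name from the examples on this page (not shipped with
% the toolbox, so it might not be present on your system).
annotFile = 'HumanRef-8_V3_0_R0_11282963_A.bgx';

% Run the lookup only when both the function and the file can be found.
canRun = ~isempty(which('ilmnbslookup')) && (exist(annotFile, 'file') == 2);

if canRun
    % Search the Symbol field instead of the default Search_key field.
    bySymbol = ilmnbslookup(annotFile, 'A2ML1', 'LookUpField', 'Symbol');
    disp(bySymbol.Probe_Id)
else
    disp('ilmnbslookup or the annotation file is not available; skipping lookup.')
end
```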
Note: The gene expression file, TumorAdjacent-probe-raw.txt, and the annotation file, HumanRef-8_V3_0_R0_11282963_A.bgx, used in the following examples are not provided with the Bioinformatics Toolbox software.
Look Up Annotation Information for a Single Target (Probe)
1. Read the contents of a tab-delimited file exported from the Illumina BeadStudio™ software into a MATLAB structure.
ilmnStruct = ilmnbsread('TumorAdjacent-probe-raw.txt')
ilmnStruct =
Header: [1x1 struct]
TargetID: {22184x1 cell}
ColumnNames: {1x37 cell}
Data: [22184x37 double]
TextColumnNames: {1x23 cell}
TextData: {22184x23 cell}
2. Find the number of the Search_key column in the TextColumnNames cell array, which is returned in the ilmnStruct structure by the ilmnbsread function.
srchCol = find(strcmpi('Search_Key',ilmnStruct.TextColumnNames))
srchCol =
1
3. Use the output from step 2 to take the Search_key value of the 10th entry in ilmnStruct and look up its probe sequence and annotation information in the annotation file, HumanRef-8_V3_0_R0_11282963_A.bgx.
annotation = ilmnbslookup('HumanRef-8_V3_0_R0_11282963_A.bgx',...
ilmnStruct.TextData{10,srchCol})
annotation =
Accession: 'NM_144670.2'
Array_Address_Id: '0004050154'
Chromosome: '12'
Cytoband: '12p13.31b'
Definition: 'Homo sapiens alpha-2-macroglobulin-like 1 (A2ML1), mRNA.'
Entrez_Gene_ID: '144568'
GI: '74271844'
ILMN_Gene: 'A2ML1'
Obsolete_Probe_Id: ''
Ontology_Component: ''
Ontology_Function: 'endopeptidase inhibitor activity [goid 4866] [evidence IEA]'
Ontology_Process: ''
Probe_Chr_Orientation: '+'
Probe_Coordinates: '8920412-8920461'
Probe_Id: 'ILMN_2136495'
Probe_Sequence: 'TGTAATCGCAGCCCCTTGGAAGGCCAAGGCAGGAGAATCGCCTCAACACT'
Probe_Start: '4889'
Probe_Type: 'S'
Protein_Product: 'NP_653271.2'
RefSeq_ID: 'NM_144670.2'
Reporter_Composite_map: ''
Reporter_Group_Name: ''
Reporter_Group_id: ''
Search_Key: 'ILMN_17375'
Source: 'RefSeq'
Source_Reference_ID: 'NM_144670.2'
Species: 'Homo sapiens'
Symbol: 'A2ML1'
Synonyms: [1x141 char]
Transcript: 'ILMN_17375'
Unigene_ID: ''
Look Up Annotation Information for a Subset of Targets (Probes)
Use the ilmnbslookup function with the 'LookUpField' property to look up the annotation information for all targets located on chromosome 12 in the annotation file, HumanRef-8_V3_0_R0_11282963_A.bgx.
chr12annotation = ilmnbslookup('HumanRef-8_V3_0_R0_11282963_A.bgx',...
'12','LookUpField','Chromosome')
chr12annotation =
Accession: {1x1186 cell}
Array_Address_Id: {1x1186 cell}
Chromosome: {1x1186 cell}
Cytoband: {1x1186 cell}
Definition: {1x1186 cell}
Entrez_Gene_ID: {1x1186 cell}
GI: {1x1186 cell}
ILMN_Gene: {1x1186 cell}
Obsolete_Probe_Id: {1x1186 cell}
Ontology_Component: {1x1186 cell}
Ontology_Function: {1x1186 cell}
Ontology_Process: {1x1186 cell}
Probe_Chr_Orientation: {1x1186 cell}
Probe_Coordinates: {1x1186 cell}
Probe_Id: {1x1186 cell}
Probe_Sequence: {1x1186 cell}
Probe_Start: {1x1186 cell}
Probe_Type: {1x1186 cell}
Protein_Product: {1x1186 cell}
RefSeq_ID: {1x1186 cell}
Reporter_Composite_map: ''
Reporter_Group_Name: ''
Reporter_Group_id: ''
Search_Key: {1x1186 cell}
Source: {1x1186 cell}
Source_Reference_ID: {1x1186 cell}
Species: {1x1186 cell}
Symbol: {1x1186 cell}
Synonyms: {1x1186 cell}
Transcript: {1x1186 cell}
Unigene_ID: {1x1186 cell}
The output structure indicates that there are 1,186 targets located on chromosome 12.
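
A multi-target result like this one stores each populated field as a cell array with one entry per target, while the Reporter fields appear as empty strings. The following is a minimal sketch of narrowing such a result to the probes on the '+' strand. It uses a hypothetical three-target stand-in structure: the symbols GENE_B and GENE_C and the orientation values are invented for illustration and do not come from the annotation file.

```matlab
% Hypothetical miniature of a multi-target result (illustrative values).
result = struct();
result.Symbol = {'A2ML1', 'GENE_B', 'GENE_C'};
result.Probe_Chr_Orientation = {'+', '-', '+'};
result.Reporter_Group_Name = '';   % empty string, as in the output above

% Logical mask that selects the targets on the '+' strand.
keep = strcmp(result.Probe_Chr_Orientation, '+');

% Apply the mask to every per-target cell array field and leave
% other fields (such as the empty Reporter fields) unchanged.
subset = result;
names = fieldnames(result);
for k = 1:numel(names)
    value = result.(names{k});
    if iscell(value) && numel(value) == numel(keep)
        subset.(names{k}) = value(keep);
    end
end
```

With this stand-in, subset.Symbol contains 'A2ML1' and 'GENE_C'. The same loop should apply to chr12annotation, provided its cell array fields all have one entry per target as the output above suggests.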
Bioinformatics Toolbox function: ilmnbsread
© 1984-2009 The MathWorks, Inc.